Sum of column by group wise


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting Sum of column by group wise
# 1  
Old 02-21-2010
Sum of column by group wise

Hello All ,

I have a problem with summing of column by group

Input File -

Code:
COL_1,COL_2,COL_3,COL_4,COL_5,COL_6,COL_7,COL_8,COL_9,COL_10,COL_11
3010,21,1923D ,6,0,0.26,0,0.26,-0.26,1,200807
3010,21,192BI ,6,24558.97,1943.94,0,1943.94,22615.03,1,200807
3010,21,192BI ,12,1156.91,62.68,0,62.68,1094.23,1,200807
3010,21,192MED,6,-1062.7,558.79,0,558.79,-1621.49,1,200807
3010,21,192MED,12,101.6,23.11,0,23.11,78.49,1,200807
3010,21,192PD ,6,-26516.61,1295.86,0,1295.86,-27812.47,1,200807
3010,21,211CCD,12,187.07,62.14,0,62.14,124.93,2,200807
3010,21,211COL,6,-4584.9,1531.98,0,1531.98,-6116.88,2,200807
3010,21,211COL,12,118.11,64.02,0,64.02,54.09,2,200807
3010,21,211F  ,6,0,0,0,0,0,2,200807
3010,21,211TOW,6,-64.86,74.12,0,74.12,-138.98,2,200807
3010,21,211TOW,12,19.14,2.66,0,2.66,16.48,2,200807
3010,63,192BI ,6,1043.51,147.32,0,147.32,896.19,3,200807
3010,63,192MED,6,16.25,15.46,0,15.46,0.78,3,200807
3010,63,192PD ,6,-1192.37,98.2,0,98.2,-1290.57,3,200807
3010,63,192UMB,6,18.7,20.6,0,20.6,-1.9,3,200807
3010,63,192UMP,6,-1.03,1.5,0,1.5,-2.53,3,200807

Out Put needed :

The column COL_10 contain the group -- it has value of 1 /2/3/4 -etc
as long as it it group 1 it will list the records and the moment category finished it will print the sum of column col_5,col_6,col_7,col_8,col_9
and then the header again and then category 2 start and so on ..

So i need the category wise sum of mentioned column


Code:
COL_1,COL_2,COL_3,COL_4,COL_5,COL_6,COL_7,COL_8,COL_9,COL_10,COL_11
3010,21,1923D ,6,0,0.26,0,0.26,-0.26,1,200807
3010,21,192BI ,6,24558.97,1943.94,0,1943.94,22615.03,1,200807
3010,21,192BI ,12,1156.91,62.68,0,62.68,1094.23,1,200807
3010,21,192MED,6,-1062.7,558.79,0,558.79,-1621.49,1,200807
3010,21,192MED,12,101.6,23.11,0,23.11,78.49,1,200807
3010,21,192PD ,6,-26516.61,1295.86,0,1295.86,-27812.47,1,200807

*SUM -This record should show the sum of the column col_5 through col_9

Code:
3010,21,211CCD,12,187.07,62.14,0,62.14,124.93,2,200807
3010,21,211COL,6,-4584.9,1531.98,0,1531.98,-6116.88,2,200807
3010,21,211COL,12,118.11,64.02,0,64.02,54.09,2,200807
3010,21,211F  ,6,0,0,0,0,0,2,200807
3010,21,211TOW,6,-64.86,74.12,0,74.12,-138.98,2,200807
3010,21,211TOW,12,19.14,2.66,0,2.66,16.48,2,200807

*SUM -This record should show the sum of the column col_5 through col_9


Code:
3010,63,192BI ,6,1043.51,147.32,0,147.32,896.19,3,200807
3010,63,192MED,6,16.25,15.46,0,15.46,0.78,3,200807
3010,63,192PD ,6,-1192.37,98.2,0,98.2,-1290.57,3,200807
3010,63,192UMB,6,18.7,20.6,0,20.6,-1.9,3,200807
3010,63,192UMP,6,-1.03,1.5,0,1.5,-2.53,3,200807

*SUM -This record should show the sum of the column col_5 through col_9

awk -solution would be disarable

Last edited by Scott; 02-21-2010 at 07:08 AM.. Reason: Code tags please...
# 2  
Old 02-21-2010
Ccolumn 10 should be in ascending order...
Code:
awk -F, 'NR==1{print;next}{
if($10==col10){sum+=sum + $5 + $6 + $7 + $8 + $9}
else{print "\nSum: " sum "\n";sum=0}}
{col10=$10;print $0}
END{print "\nSum: " sum "\n"}' infile

# 3  
Old 02-21-2010
Hi,
I have edited the scripts ....

Code:
perl sum_column.pl test.txt

Code:
#!/usr/bin/perl

$filename=$ARGV[0];
`sort -t',' -nk10 $filename -o $filename`;
open (FH,$filename);
$current=0;
$sum_col4=0;
$sum_col5=0;
$sum_col6=0;
$sum_col7=0;
$sum_col8=0;
while (<FH>) {
chomp;
if ($. > 1 ) {
@array=split /,/;

if ($current != $array[9] ) {
        if ($. > 2 ) {
                print ",,,,$sum_col4,$sum_col5,$sum_col6,$sum_col7,$sum_col8,,,\n";
                $sum_col4=0;
                $sum_col5=0;
                $sum_col6=0;
                $sum_col7=0;
                $sum_col8=0;
        }
        print "Group $array[9] started:-- \n\n";
        print $_,"\n";
        $sum_col4= $array[4] + $sum_col4 ;
        $sum_col5= $array[5] + $sum_col5 ;
        $sum_col6= $array[6] + $sum_col6;
        $sum_col7= $array[7] + $sum_col7;
        $sum_col8= $array[8] + $sum_col8;
      }
else {
        print $_,"\n";
        $sum_col4= $array[4] + $sum_col4 ;
        $sum_col5= $array[5] + $sum_col5;
        $sum_col6= $array[6] + $sum_col6;
        $sum_col7= $array[7] + $sum_col7;
        $sum_col8= $array[8] + $sum_col8;
     }
}
$current = $array[9];
}
print ",,,,$sum_col4,$sum_col5,$sum_col6,$sum_col7,$sum_col8,,,\n";


Last edited by pravin27; 02-21-2010 at 10:18 AM..
# 4  
Old 02-21-2010
Code:
nawk -F"," 'NR>1{a[$10]+=a[$10]+$5+$6+$7+$8+$9 ; b[$10]=b[$10]"\n"$0 ; next}
END{for (i in a) print b[i]"\nSum "i" =\t"a[i]}' infile.txt

BR

SmilieSmilieSmilie
# 5  
Old 02-21-2010
Clarify more on the requirement

Thank you all for responding me - however the requirement was different .

let me clarify it in more simpler -

Input file -
Code:
col_1,col_2,col_3,col_4,col_5,col_6,col_7,col_8,col_9,col_10,col_11
3010,21,1923D ,0,5,10,15,20,25,1,200807
3010,21,192BI ,0,60,65,70,75,80,1,200807
3011,22,192xx ,0,20,30,40,50,60,2,200903
3011,22,192xx ,0,25,30,40,50,60,2,200903
3011,22,192xx ,0,50,55,60,65,70,3,200903
3011,22,192xx ,0,25,30,40,50,60,3,200903

Column 10 contain category 1 , 2, 3
Out put would be-- At the end of each category it would print the
sum of column 5,6,7,8,9 and rest are just blank

Example :
Code:
col_1,col_2,col_3,col_4,col_5,col_6,col_7,col_8,col_9,col_10,col_11
3010,21,1923D ,0,5,10,15,20,25,1,200807
3010,21,192BI ,0,60,65,70,75,80,1,200807
,,,,65,75,85,95,105,,,  
col_1,col_2,col_3,col_4,col_5,col_6,col_7,col_8,col_9,col_10,col_11
3011,22,192xx ,0,20,30,40,50,60,2,200903
3011,22,192xx ,0,25,30,40,50,60,2,200903
,,,,45,60,80,100,120,,,
col_1,col_2,col_3,col_4,col_5,col_6,col_7,col_8,col_9,col_10,col_11
3011,22,192xx ,0,50,55,60,65,70,3,200903
3011,22,192xx ,0,25,30,40,50,60,3,200903
,,,,75,85,100,115,130,,

After end of each category it is printing the sum of col4 ,sum of col 5 ,sum of 6,sum of col7,sum of col 8,sum of col9 and then it start the header again then then same process for each category

Hope this helps to formulate an solution

Last edited by Scott; 02-21-2010 at 10:43 AM.. Reason: Code tags, PLEASE!
# 6  
Old 02-21-2010
Code:
awk -F, 'END { 
  printf "%s\n%s\n,,,,%s,%s,%s,%s,%s,,,\n", h, r, \
  _5, _6, _7, _8, _9 
   }
NR == 1 { h = $0; next }   
!c[$10]++ && NR > 2 {
  printf "%s\n%s\n,,,,%s,%s,%s,%s,%s,,,\n", h, r, \
  _5, _6, _7, _8, _9
  _5 = _6 = _7 = _8 = _9 = r = x
  }
{  
  _5 += $5; _6 += $6; _7 += $7; _9 += $9
  _8 += $8; r = r ? r RS $0 : $0 
  }' infile


Last edited by radoulov; 02-21-2010 at 10:44 AM.. Reason: The header ... :)
# 7  
Old 02-21-2010
Assuming column 10 is in ascending order:

Code:
awk -F, 'NR==1{h=$0;print;next}
g && g != $10 {
  print ",,,,"c5, c6, c7, c8, c9 ",,,"
  c5=c6=c7=c8=c9=""
  print h
}
{c5+=$5; c6+=$6; c7+=$7; c8+=$8; c9+=$9; g=$10}
END{print ",,,," c5, c6, c7, c8, c9 ",,,"}
1' OFS="," file

Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Proper Column wise matching

My below code works fine if none of the columns has pipe as its content in it, If it has the pipe in any of the content then the value moves to the next column. I wanted my code to work fine even if the column has pipe in it apart from the delimiter. NOTE : If there is a pipe in the content... (6 Replies)
Discussion started by: nikhil jain
6 Replies

2. Shell Programming and Scripting

Generate sum of a particular column date wise

Hi All, I have a file with below content 01/22/2014,23:43:00,1742.8, 01/22/2014,23:43:00,1742.8, 01/22/2014,23:44:00,1749.06666666667, 01/25/2014,23:45:00,2046.45, 01/25/2014,23:43:00,1742.8, 01/25/2014,23:44:00,1749.06666666667, 01/25/2014,23:45:00,2046.45, 01/25/2014,23:43:00,1742.8,... (4 Replies)
Discussion started by: villain41
4 Replies

3. Shell Programming and Scripting

Column wise text adding

Hi I have pasted sample data as below:- in data.txt Please suggest any way out: as the 3rd field is cat data.txt 22:37:34 STARTING abc 22:37:40 FAILURE sadn 00:06:42 STARTING asd 00:06:51 FAILURE ad 02:06:38 STARTING acs 02:06:46 FAILURE cz 04:06:35 STARTING xzc... (1 Reply)
Discussion started by: Gaurav198
1 Replies

4. Shell Programming and Scripting

Sum up the column values group by using some field

12-11-2012,PNL,158406 12-11-2012,RISK,4564 12-11-2012,VAR_1D,310101 12-11-2012,VAR_10D,310101 12-11-2012,CB,866 12-11-2012,STR_VAR_1D,298494 12-11-2012,STR_VAR_10D,309623 09-11-2012,PNL,1024106 09-11-2012,RISK,4565 09-11-2012,VAR_1D,317211 09-11-2012,VAR_10D,317211 09-11-2012,CB,985... (7 Replies)
Discussion started by: manas_ranjan
7 Replies

5. Shell Programming and Scripting

Column sum group by uniq records

Dear All, I want to get help for below case. I have a file like this. saman 1 gihan 2 saman 4 ravi 1 ravi 2 so i want to get the result, saman 5 gihan 2 ravi 3 like this. Pls help me. (17 Replies)
Discussion started by: Nayanajith
17 Replies

6. UNIX for Advanced & Expert Users

How to compare two text files in column wise?

Hi All, I have two txt files like this File1: no name ---------- 12 aaaa 23 bbbb 55 cccc File2 dname dno ------------ civil 33 mech 55 arch 66 Now i want to compare col1 from File and col2 from File2, if its match i want fetch all columns from... (3 Replies)
Discussion started by: psiva_arul
3 Replies

7. Solaris

column wise substitution in a file

Hi, I have two files. Want to make an addition of the fifth column of from both the files and redirect it to a third file. Both files have same records except fifth field and same record should be inserted into new file having fifth field as addition of fifth fields of both files. for... (2 Replies)
Discussion started by: sanjay1979
2 Replies

8. Shell Programming and Scripting

Column wise file parsing.

Shell script for the below operation : File "A" contains : SEQ++1' MOA+9:000,00:ABC' RFF+AIK:000000007' FII+PH+0170++AA' NAD+PL+++XXXXXXXXXXX XXXXXXX XX++XXX XXXX XXXX X.X. XXXXXXXXX+++NL' SEQ++2' MOA+9:389,47:ABC' RFF+AIK:02110300000008' FII+PH+0PSTBNL2A:25:5+BB'... (5 Replies)
Discussion started by: navojit dutta
5 Replies

9. Shell Programming and Scripting

o/p column wise by nawk

hi i have file which hav following entries 1501,AAA,2.00 1525,AAA,2.00 1501,AAA,2.00 1525,AAA,2.00 1501,AAA,3.00 1525,AAA,3.00 1525,AAA,3.00 1501,AAA,3.00 1501,AAA,3.00 i want to have a o/p coloum wise like 1501,AAA,13 1525,AAA,10 here 13 comes as a sum of last colum value... (6 Replies)
Discussion started by: aaysa123
6 Replies

10. Shell Programming and Scripting

processing matrix column wise

I have a m X n matrix written out to file, say like this: 1,2,3,4,5,6 2,6,3,10,34,67 1,45,6,7,8,8 I want to calculate the column averages in the MINIMUM amount of code or processing possible. I would have liked to use my favorite tool, "AWK" but since it processes rowwise, getting the... (5 Replies)
Discussion started by: Abhishek Ghose
5 Replies
Login or Register to Ask a Question