print unique values of a column and sum up the corresponding values in next column


 
# 1  
Old 12-18-2009

Hi All,
I have a file with 3 columns (string string integer):
a b 1
x y 2
p k 5
y y 4
.....
.....

Question:
I want to get the unique values of column 2, sorted on column 2, together with the sum of column 3 over the corresponding rows. For example, the file above should produce:
b 1
k 5
y 6

Please help.
-amigarus
# 2  
Old 12-18-2009
Code:
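gawk '{
    # Accumulate column 3 into a[], keyed by the column 2 value
    a[$2]=a[$2]+$3
}END{
    # Alternative: for(i in a) print i,a[i] | "sort"
    # asorti() (gawk only) puts the sorted indices of a into c and returns their count
    b=asorti(a,c)
    for(o=1;o<=b;o++){
        # c[o] is the o-th key in sorted order; a[c[o]] is its sum
        print c[o],a[c[o]]
    }
}' file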

# 3  
Old 12-18-2009
Thanks, ichigo.
Can you explain the program?
Is there a way to do it without awk, or maybe without sed? Just a thought.

Last edited by amigarus; 12-18-2009 at 10:46 AM..
# 4  
Old 12-18-2009
Quote:
Originally Posted by amigarus
...
Is there a way to do it without awk, or maybe without sed?
Yes, you can use Perl -

Code:
$
$ cat f3
a b 1
x y 2
p k 5
y y 4
$
$ ##
$ sort -k2,2 f3 |
> perl -F'\s+' -lane 'if ($x ne $F[1] && $x ne "") {print "$x $y"; $y=0};
>                     $x=$F[1]; $y+=$F[2]; END {print "$x $y"}'
b 1
k 5
y 6
$
$
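
The same sort-then-accumulate idea can also be sketched in plain POSIX shell, with no awk, sed, or Perl. This is only a rough sketch: the function name sum_sorted and the temporary file are made up for illustration, and it assumes whitespace-separated columns with an integer in column 3.

```shell
# Recreate the sample data in a temporary file (the path is arbitrary).
sample_f3=$(mktemp)
cat > "$sample_f3" <<'EOF'
a b 1
x y 2
p k 5
y y 4
EOF

# sum_sorted FILE: sort on column 2 so equal keys become adjacent, then keep
# a running total that is printed whenever the key changes.
sum_sorted() {
    LC_ALL=C sort -k2,2 "$1" | {
        prev=
        total=0
        while read -r first key val; do
            [ -n "$key" ] || continue              # skip blank or short lines
            if [ -n "$prev" ] && [ "$key" != "$prev" ]; then
                printf '%s %s\n' "$prev" "$total"  # key changed: emit previous group
                total=0
            fi
            prev=$key
            total=$((total + ${val:-0}))
        done
        # Emit the last group, if there was any input at all.
        if [ -n "$prev" ]; then
            printf '%s %s\n' "$prev" "$total"
        fi
    }
}

sum_sorted "$sample_f3"
```

Like the Perl one-liner, this relies on the input being sorted on column 2 first, so only one running total has to be kept at a time.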

tyler_durden
# 5  
Old 12-21-2009
Quote:
Originally Posted by ichigo
Code:
gawk '{
    a[$2]=a[$2]+$3
}END{
    # for(i in a) print i,a[i] | "sort"
    b=asorti(a,c)
    for(o=1;o<=b;o++){
        print c[o],a[c[o]]
    }
}' file


Can anyone please explain this code? Being new to Unix, I don't understand it.
Thanks
# 6  
Old 12-21-2009
Code:
~/unix.com$ cat file
x y 2
a b 1
p k 5
y y 4
b a 9
c b 2
~/unix.com$ awk '{a[$2]+=$3}END{for (i in a) print i,a[i]}' file
a 9
b 3
k 5
y 6

This script does the same summing as the one above but has no sorting step. The output looks sorted here, but awk does not guarantee any particular order for for (i in a), so pipe the result through sort if the order matters.
The first {...} block builds an array a indexed by the second field $2, adding $3 to that element for each line.
With my file above, the first time 'b' appears in column 2 you get a["b"]=1; the second time (on the last line), a["b"]+=2 gives 3, since a["b"] was already 1. The same goes for every other value in column 2.
The second {...} block (END) runs after all input has been read: it takes each index of the array a, stores it in i, then prints i and the corresponding a[i].
Is it clear enough?
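
If the sorted order has to be guaranteed with any awk (not just gawk), one option is to let sort do the ordering. A minimal sketch, assuming the same sample data as above; the function name sum_by_col2 and the temporary file are only for illustration:

```shell
# Recreate the sample file from this post in a temporary location.
sample=$(mktemp)
cat > "$sample" <<'EOF'
x y 2
a b 1
p k 5
y y 4
b a 9
c b 2
EOF

# sum_by_col2 FILE: sum column 3 per distinct value of column 2.
# The for-in loop may visit keys in any order, so the explicit sort
# is what makes the output order reliable.
sum_by_col2() {
    awk '{ a[$2] += $3 } END { for (i in a) print i, a[i] }' "$1" | LC_ALL=C sort -k1,1
}

sum_by_col2 "$sample"
```

With the sample data this prints a 9, b 3, k 5 and y 6, one pair per line, the same result as the gawk asorti() version.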

Last edited by tukuyomi; 12-21-2009 at 09:26 AM..
# 7  
Old 12-21-2009
Yeah, it's very clear.

Thanks a lot, Tukuyomi.