AWK : Add Fields of lines with matching field


 
# 1  
Old 01-16-2011

Dear All,

I would like to add up the values of one field for all lines that match in another field, and then divide that sum by the number of matching lines, so that every line shows the average of its group.

Input:
Code:
Test1 5
Test1 10
Test2 2
Test2 5
Test2 13
Test3 4

Output:
Code:
Test1 7.5
Test1 7.5
Test2 6.667
Test2 6.667
Test2 6.667
Test3 4

Any help is much appreciated!
# 2  
Old 01-16-2011
The awk script:
Code:
{ l[NR] = $1; n[$1]++; s[$1] += $2; }
END {
  for (i in n) { a[i] = s[i] / n[i]; }
  for (i = 1; i <= NR; i++) { print l[i], a[l[i]]; }
}

Results in:
Code:
Test1 7.5
Test1 7.5
Test2 6.66667
Test2 6.66667
Test2 6.66667
Test3 4

You can adjust the print to get the numeric precision you need.
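
For instance, swapping print for printf should reproduce the output asked for in post #1. This is only a sketch: the scratch file and the `%.4g` format are my assumptions, and the average is computed directly in the printf rather than stored in a[] first.

```shell
# Sample data from post #1, written to a scratch file (the file is an assumption).
input=$(mktemp)
cat > "$input" <<'EOF'
Test1 5
Test1 10
Test2 2
Test2 5
Test2 13
Test3 4
EOF

# Same bookkeeping as the script above: remember each key by line number,
# count and sum per key, then print one line per input line in END.
# %.4g keeps four significant digits and drops trailing zeros.
result=$(awk '
{ l[NR] = $1; n[$1]++; s[$1] += $2 }
END {
  for (i = 1; i <= NR; i++)
    printf "%s %.4g\n", l[i], s[l[i]] / n[l[i]]
}' "$input")

echo "$result"
rm -f "$input"
```

Note that %.4g limits significant digits, not decimal places, so a large average such as 123.456 would print as 123.5. Use something like %.3f if you want a fixed number of decimals instead.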
This User Gave Thanks to m.d.ludwig For This Post:
# 3  
Old 01-16-2011
Try this,
Code:
awk '{if(!($1 in b))k[++n]=$1;a[$1]+=$2;b[$1]++} END{for(i=1;i<=n;i++)for(l=1;l<=b[k[i]];l++)print k[i],a[k[i]]/b[k[i]]}' inputfile

This User Gave Thanks to pravin27 For This Post:
# 4  
Old 01-16-2011
DerSeb -- will the input data be ordered? Or is something like:
Code:
Test1 5
Test2 13
Test2 2
Test3 4
Test1 10
Test2 5

possible?
# 5  
Old 01-16-2011
Code:
awk 'NR==FNR{A[$1]++;B[$1]+=$2;next}{$2=B[$1]/A[$1]}1' infile infile
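
Spelled out, the one-liner reads the same file twice: while NR==FNR (the first file argument) it collects a count and a sum per key, and on the second read it replaces $2 with the average. Nothing in it depends on line order, so it should also handle the unsorted sample from post #4. A sketch of the same idea with longer names (the scratch file and the names count and sum are assumptions):

```shell
# Unsorted sample from post #4, written to a scratch file.
input=$(mktemp)
cat > "$input" <<'EOF'
Test1 5
Test2 13
Test2 2
Test3 4
Test1 10
Test2 5
EOF

result=$(awk '
NR == FNR {              # true only while reading the first file argument
  count[$1]++            # number of lines per key
  sum[$1] += $2          # running total per key
  next
}
{                        # second read of the same file
  $2 = sum[$1] / count[$1]
  print
}' "$input" "$input")

echo "$result"
rm -f "$input"
```

Because the file is named twice on the command line, this cannot read from a pipe, but it keeps the original line order without storing every line in memory.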

This User Gave Thanks to Scrutinizer For This Post:
# 6  
Old 01-16-2011
Wow, you were all really fast.

Yes, the file is sorted.

Thx all, the scripts work great!
# 7  
Old 01-16-2011
As the file is already sorted, you can do it in one pass:

Code:
awk 'function p(){for(I=C;I;I--)print R" "T/C} $1!=R{C=T=p()} {R=$1;C++;T+=$2} END{p()}' infile
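
The same one-pass idea, written out with longer names in case the golfed version is hard to follow. It is a sketch rather than a drop-in replacement: the function name emit and the variables key, count and total are assumptions, and it only works when equal keys sit on adjacent lines.

```shell
# Sorted sample from post #1, written to a scratch file (the file is an assumption).
input=$(mktemp)
cat > "$input" <<'EOF'
Test1 5
Test1 10
Test2 2
Test2 5
Test2 13
Test3 4
EOF

result=$(awk '
# Print the finished group: one line per member, each carrying the group average.
function emit(   i) {
  for (i = 0; i < count; i++) print key, total / count
}
# A new key closes the previous group and starts a fresh one.
$1 != key { emit(); key = $1; count = 0; total = 0 }
{ count++; total += $2 }
END { emit() }           # do not forget the last group
' "$input")

echo "$result"
rm -f "$input"
```

Only one group is held at a time, so memory use stays small even for big files. For unsorted input you could pipe through sort -k1,1 first, at the cost of losing the original line order.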

This User Gave Thanks to Chubler_XL For This Post: