Match first two columns and average third from multiple files


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting Match first two columns and average third from multiple files
# 8  
Old 02-02-2015
Quote:
Originally Posted by Scrutinizer
Try this modification:

Any awk:
Code:
awk '{v=$NF; sub(/ +[^ ]*$/,x); A[$0]+=v; C[$0]++} END{for(i in A) printf "%s%7.1f\n",i,A[i]/C[i]}' file1 file2

I was hoping to modify your script to subtract a field (3rd column) in File A from another field (3rd column) in File B, but its not working as I had thought. I am not familiar enough with Awk to make it work.

Any help is appreciated!

File A
Code:
24.01  -81.01    1.0
 24.02  -81.02    5.0
 24.03  -81.03    0.0

File B
Code:
24.01  -81.01    2.0
 24.02  -81.02    3.0
 24.03  -81.03    0.0

Result Output
Code:
24.01  -81.01   -1.0
 24.02  -81.02   2.0
 24.03  -81.03   0.0

# 9  
Old 02-02-2015
If you have just two files, and if you want to subtract file2's values from file1's,and if you accept some format change, this might help you:
Code:
awk 'FNR==NR {A[$1,$2]=$NF; next} {A[$1,$2]-=$NF} END{for(i in A) printf "%s%7.1f\n",i,A[i]}' SUBSEP=" " file1 file2
24.01 -81.01   -1.0
24.02 -81.02    2.0
24.03 -81.03    0.0

This User Gave Thanks to RudiC For This Post:
# 10  
Old 02-04-2015
Thanks to all for the help.

I have 1 final question.

I need to perform the following calculation on column 3.

Code:
awk '{$3 = ($3*9)/5; print}'

Input
Code:
46.0625 -99.1042    4.9
35.4792 -79.6042   -0.6
42.5208 -71.4792   -1.7

Expected Output
Code:
46.0625 -99.1042  8.8
35.4792 -79.6042 -1.1
42.5208 -71.4792 -3.1

I need to preserve the 4 sig digs in columns 1 and 2, but round column 3 to 1 sig dig.

I am close with this code but it wraps the 3rd column to the next line.
Code:
awk '{print $1, $2; printf("%.1f\n"), $3 = ($3 * 9)/5}'

# 11  
Old 02-04-2015
Take advantage of awk's string/number dualism, and use the %s format specifier:
Code:
awk '{printf "%s %s %.1f\n", $1, $2,  ($3 * 9)/5}' file
46.0625 -99.1042 8.8
35.4792 -79.6042 -1.1
42.5208 -71.4792 -3.1

This User Gave Thanks to RudiC For This Post:
# 12  
Old 02-12-2015
I now have the need to match the first two columns and when they match, calculate the percentage of average for the third columns. I cant seem to borrow from the same awk script and output the expected result.

percent of average = diff/avg * 100

Ill need to ignore/remove the minus/negative symbols if their are any in the output file.

Diff File
Code:
46.0625 -99.1042 8.8
35.4792 -79.6042 -2.1
42.5208 -71.4792 -3.1

Avg File
Code:
46.0625 -99.1042    4.9
35.4792 -79.6042   3.3
42.5208 -71.4792   10.0

expected output
Code:
46.0625 -99.1042    180
35.4792 -79.6042     63
42.5208 -71.4792     31

Any suggestions?
# 13  
Old 02-12-2015
Please open a new thread for a new request!

Try
Code:
awk 'NR==FNR {T[$1,$2]=$3; next} $1,$2 in T {P=T[$1,$2]/$3*100; printf "%s %s %.0f\n", $1, $2, (P>=0)?P:-P}' file1 file2
46.0625 -99.1042 180
35.4792 -79.6042 64
42.5208 -71.4792 31

This User Gave Thanks to RudiC For This Post:
# 14  
Old 02-13-2015
This has been resolved. Thanks for the help RudiC!

Last edited by ncwxpanther; 02-13-2015 at 02:46 PM..
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. UNIX for Beginners Questions & Answers

Data match 2 files based on first 2 columns matching only and join if match

Hi, i have 2 files , the data i need to match is in masterfile and i need to pull out column 3 from master if column 1 and 2 match and output entire row to new file I have tried with join and awk and i keep getting blank outputs or same file is there an easier way than what i am... (4 Replies)
Discussion started by: axis88
4 Replies

2. Shell Programming and Scripting

Average of a columns from three files

hello, I have three files in the following order ==> File1 <== 1 20977000 20977000 A C 1.00 0,15 15 45 1 115829313 115829313 G A 0.500 6,7 13 99 ==> File2 <== 1 20977000 20977000 A C 1.00 0,13 13 39 1 115829313 ... (5 Replies)
Discussion started by: nans
5 Replies

3. Shell Programming and Scripting

Average across multiple columns - awk

Hi forum members, I'm trying to get an average of multiple columns in a csv file using awk. A small example of my input data is as follows: cu,u3o8,au,ag -9,20,-9,3.6 0.005,30,-9,-9 0.005,50,10,3.44 0.021,-9,8,3.35 The following code seems to do most of what I want gawk -F","... (6 Replies)
Discussion started by: theflamingmoe
6 Replies

4. Shell Programming and Scripting

Match first two columns and calculate percent of average in third column

I have the need to match the first two columns and when they match, calculate the percent of average for the third columns. The following awk script does not give me the expected results. awk 'NR==FNR {T=$3; next} $1,$2 in T {P=T/$3*100; printf "%s %s %.0f\n", $1, $2, (P>=0)?P:-P}' diff.file... (1 Reply)
Discussion started by: ncwxpanther
1 Replies

5. Shell Programming and Scripting

Average across multiple columns group by

Hi experts, I want to group by average, for multiple columns starting column $7 until NF, group by ($1-$5), please help For just 7th column, I can do awk ' NR>1{ arr += $7 count += 1 } END{ for (a in arr) { print a, arr/count ... (10 Replies)
Discussion started by: ritakadm
10 Replies

6. Shell Programming and Scripting

Join two files combining multiple columns and produce mix and match output

I would like to join two files when two columns in each file matches with each other and then produce an output when taking multiple columns. Like I have file A 1234,ABCD,23,JOHN,NJ,USA 2345,ABCD,24,SAM,NY,USA 5678,GHIJ,24,TOM,NY,USA 5678,WXYZ,27,MAT,NJ,USA and file B ... (2 Replies)
Discussion started by: mady135
2 Replies

7. Shell Programming and Scripting

Average of a column in multiple files

I have several sequential files with name stat.1000, stat.1001....to stat.1020 with a format like this 0.01 1 3822 4.97379915032e-14 4.96982253992e-09 0 0.01 3822 1 4.97379915032e-14 4.96982253992e-09 0 0.01 2 502 0.00993165137406 993.165137406 0 0.01 502 2 0.00993165137406 993.165137406 0... (6 Replies)
Discussion started by: kayak
6 Replies

8. Shell Programming and Scripting

Awk match multiple columns in multiple lines in single file

Hi, Input 7488 7389 chr1.fa chr1.fa 3546 9887 chr5.fa chr9.fa 7387 7898 chrX.fa chr3.fa 7488 7389 chr21.fa chr3.fa 7488 7389 chr1.fa chr1.fa 3546 9887 chr9.fa chr5.fa 7898 7387 chrX.fa chr3.fa Desired Output 7488 7389 chr1.fa chr1.fa 2 3546 9887 chr5.fa chr9.fa 2... (2 Replies)
Discussion started by: jacobs.smith
2 Replies

9. Shell Programming and Scripting

Computing average values from multiple text files

Hi, first, I have searched in the forum for this, but I could not find the right answer. (There were some similar threads, but I was not sure how to adapt the ideas.) Anyway, I have a quite natural problem: Given are several text files. All files contain the same number of lines and the same... (3 Replies)
Discussion started by: rbredereck
3 Replies

10. Shell Programming and Scripting

Average of elements throught multiple files

Hi, I got a lot of files looking like this: 1 0.5 6 All together there are ard 1'000'000 lines in each of the ard 100 files. I want to build the average for every line, and write the result to a new file. The averaging should start at a specific line, here for example at line... (10 Replies)
Discussion started by: chillmaster
10 Replies
Login or Register to Ask a Question