Sponsored Content
Top Forums Shell Programming and Scripting Average of a columns from three files Post 303003114 by RavinderSingh13 on Thursday 7th of September 2017 02:43:01 PM
Old 09-07-2017
Quote:
Originally Posted by nans
Revisiting this post. Can we condition the command to say,
go through first two columns in each of the three files and if it matches (assuming the files are not in the same order) or even present in a single file (out of three), print the average of the 8th columns in the first file ?
Code:
==> File1 <==
1    115829313    115829313    G    A    0.500    6,7    13    99
1    20977000    20977000    A    C    1.00    0,17    17   45

==> File2 <==
1    20977000    20977000    A    C    1.00    0,15    15    45
1    115829313    115829313    G    A    0.500    7,7    14    99

==> File3 <==
1    115829313    115829313    G    A    0.500    7,10    17    99
1    209897000    209897000    T    C    1.00    0,16    16    48

Code:
==> Results <==
1    115829313    115829313    G    A    0.500    6,7    13    99   14.66
1    20977000    20977000    A    C    1.00    0,17    17   45   16
1    209897000    209897000    T    C    1.00    0,16    16    48   16

Thank you
Hello nans,

Could you please try following and let me know if this helps you.
Code:
awk '
FNR==NR && FNR==1{
   first_file=FILENAME
}
FNR==1 && val{
   close(val)
}
FNR==NR{
   val=FILENAME;
   c[$2,$3]=$0;
   d[$2,$3]++;
   a[$2,$3]+=$8;
   b[$2,$3]++;
   next
}
!b[$2,$3]++{
   c[$2,$3]=$0
}
{
   d[$2,$3]++;
   a[$2,$3]+=$8;
   val=FILENAME
}
END{
   for(i in c){
       print c[i],(a[i]/d[i])
}
}
' SUBSEP=" " File1 File2  File3

Above will only print the output on terminal, if you want to put the output into first Input_file1 named File1 then you could change print c[i],(a[i]/d[i]) to print c[i],(a[i]/d[i]) > first_file.

Thanks,
R. Singh

Last edited by RavinderSingh13; 09-07-2017 at 03:45 PM.. Reason: Added statement for adding output into first File1.
This User Gave Thanks to RavinderSingh13 For This Post:
 

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

Taking the average of two columns and printing it on a new column

Hi, I have a space delimited text file that looks like the following: Aa 100 200 Bb 300 100 Cc X 500 Dd 600 X Basically, I want to take the average of columns 2 and 3 and print it in column 4. However if there is an X in either column 2 or 3, I want to print the non-X value. Therefore... (11 Replies)
Discussion started by: evelibertine
11 Replies

2. Shell Programming and Scripting

Average of columns with values of other column with same name

I have a lot of input files that have the following form: Sample Cq Sample Cq Sample Cq Sample Cq Sample Cq 1WBIN 23.45 1WBIN 23.45 1CVSIN 23.96 1CVSIN 23.14 S1 31.37 1WBIN 23.53 1WBIN 23.53 1CVSIN 23.81 1CVSIN 23.24 S1 31.49 1WBIN 24.55 1WBIN 24.55 1CVSIN 23.86 1CVSIN 23.24 S1 31.74 ... (3 Replies)
Discussion started by: isildur1234
3 Replies

3. Shell Programming and Scripting

How to calculate average of two columns and copy into another file?

Hi, I need help with the awk command. I have a folder with aprox 500 files each one with two columns and I want to print in a new file, the average of column 1 and average of column 2 and the name of each file. Input files are: File-1: 100 99 20 99 50 99 50 99 File-2: 200 85... (3 Replies)
Discussion started by: Lokaps
3 Replies

4. UNIX for Dummies Questions & Answers

Writing a script to take the average of two columns every 3 rows

I have a dataset with 120 columns. I would like to write a script, that takes the average of every two columns, starting from columns 2 and 3, and moving consecutively in frames of 3 columns, all the way until the last column. The first column in the output file would be the averages of columns... (1 Reply)
Discussion started by: evelibertine
1 Replies

5. Shell Programming and Scripting

Average across multiple columns group by

Hi experts, I want to group by average, for multiple columns starting column $7 until NF, group by ($1-$5), please help For just 7th column, I can do awk ' NR>1{ arr += $7 count += 1 } END{ for (a in arr) { print a, arr/count ... (10 Replies)
Discussion started by: ritakadm
10 Replies

6. Shell Programming and Scripting

Match first two columns and calculate percent of average in third column

I have the need to match the first two columns and when they match, calculate the percent of average for the third columns. The following awk script does not give me the expected results. awk 'NR==FNR {T=$3; next} $1,$2 in T {P=T/$3*100; printf "%s %s %.0f\n", $1, $2, (P>=0)?P:-P}' diff.file... (1 Reply)
Discussion started by: ncwxpanther
1 Replies

7. Shell Programming and Scripting

Match first two columns and average third from multiple files

I have the following format of input from multiple files File 1 24.01 -81.01 1.0 24.02 -81.02 5.0 24.03 -81.03 0.0 File 2 24.01 -81.01 2.0 24.02 -81.02 -5.0 24.03 -81.03 10.0 I need to scan through the files and when the first 2 columns match I... (18 Replies)
Discussion started by: ncwxpanther
18 Replies

8. Emergency UNIX and Linux Support

Average columns based on header name

Hi Friends, I have files with columns like this. This sample input below is partial. Please check below for main file link. Each file will have only two rows. ... (8 Replies)
Discussion started by: jacobs.smith
8 Replies

9. Shell Programming and Scripting

Average across multiple columns - awk

Hi forum members, I'm trying to get an average of multiple columns in a csv file using awk. A small example of my input data is as follows: cu,u3o8,au,ag -9,20,-9,3.6 0.005,30,-9,-9 0.005,50,10,3.44 0.021,-9,8,3.35 The following code seems to do most of what I want gawk -F","... (6 Replies)
Discussion started by: theflamingmoe
6 Replies

10. UNIX for Beginners Questions & Answers

Average of columns

I have files that have the following columns chr pos ref alt sample 1 sample 2 sample 3 chr2 179644035 G A 1,107 0,1 58,67 chr7 151945167 G T 142,101 100,200 500,700 chr13 31789169 CTT CT,C 6,37,8 0,0,0 15,46,89 chr22 ... (3 Replies)
Discussion started by: nans
3 Replies
All times are GMT -4. The time now is 01:15 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy