Average select rows


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting Average select rows
# 1  
Old 12-07-2017
Average select rows

I have no idea how to even get started with this script.

I need to average field 3 for each of the unique identifiers found in field 1. However, I only want to average these rows when field 2 is equal to 1506 - 2000 (note that i replaced the values field 2 for security reasons, but the real data are sequential).

Code:
001001 1500 62.6283
001001 1501 65.3417
001001 1502 65.1475
001001 1503 63.805
001001 1504 63.9375
001001 1505 64.1858
001001 1506 62.29
001001 1507 64.5242
001001 1508 62.8933
001001 1509 64.0667
001001 1510 63.6867
.....

001003 1995 66.7417
001003 1996 66.1092
001003 1997 66.5117
001003 1998 68.6833
001003 1999 67.715
001003 2000 67.5867
001003 2001 66.9783
001003 2002 67.3217
001003 2003 66.7042
001003 2004 67.1367
001003 2005 67.3717
.....

I need the results appended to each respective row.

Code:
001001  1500   62.6283	63.49
001001  1501   65.3417	63.49
001001  1502   65.1475	63.49
001001  1503   63.805	63.49
001001  1504   63.9375	63.49
001001  1505   64.1858	63.49
001001  1506   62.29    63.49
001001  1507   64.5242	63.49
001001  1508   62.8933	63.49
001001  1509   64.0667	63.49
001001  1510   63.6867	63.49
.....			
	
001003  1995	66.7417	67.22
001003  1996	66.1092	67.22
001003  1997	66.5117	67.22
001003  1998	68.6833	67.22
001003  1999	67.715	67.22
001003  2000	67.5867	67.22
001003  2001	66.9783	67.22
001003  2002	67.3217	67.22
001003  2003	66.7042	67.22
001003  2004	67.1367	67.22
001003  2005	67.3717	67.22
.....

Any help is appreciated.
# 2  
Old 12-07-2017
Try
Code:
awk 'NR == FNR {if ($2 >=1506 && $2 <=2000) {SUM[$1] += $3; CNT[$1]++}; next} 
{print $0, SUM[$1]/CNT[$1]}' OFS="\t" OFMT="%.2f" file file

# 3  
Old 12-08-2017
Quote:
Originally Posted by RudiC
Try
Code:
awk 'NR == FNR {if ($2 >=1506 && $2 <=2000) {SUM[$1] += $3; CNT[$1]++}; next} 
{print $0, SUM[$1]/CNT[$1]}' OFS="\t" OFMT="%.2f" file file

I have played around with this but could not get anything to output.
# 4  
Old 12-08-2017
Difficult to believe as what I got was EXACTLY what you posted as desired output. It's not in vain that posters in here are requested to show at least their OS and shell versions, if not the tools used versions.
# 5  
Old 12-08-2017
Quote:
Originally Posted by RudiC
Difficult to believe as what I got was EXACTLY what you posted as desired output. It's not in vain that posters in here are requested to show at least their OS and shell versions, if not the tools used versions.
My apologies for the lack of information.

GNU Awk 3.1.7
Linux CentOS release 6.9 (Final)

Would the format of the input cause the issue? There is a single space between each field.
# 6  
Old 12-08-2017
I don't think so, as the FS default ((multiple) whitespace) is used.
Did you supply the input file TWICE? It needs two passes.

Last edited by RudiC; 12-08-2017 at 12:34 PM.. Reason: typo
# 7  
Old 12-08-2017
Oh I see. Thats the first time I have had to enter the input file twice. At first glance it appears to work as you said. Thanks for the help with this one!
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Average across rows with a condition

Hi Friends, My input file Gene1 10 20 0 Gene2 5 0 15 Gene3 10 10 10 Gene4 5 0 0 If there is a zero for any gene in any column, I don't want that column to be considered which reduces the denominator value during average. Here is my output Gene1 10 20 0 10 Gene2 5 0 15 10 Gene3... (5 Replies)
Discussion started by: jacobs.smith
5 Replies

2. Shell Programming and Scripting

Average values of duplicate rows

I have this file input.txt. I want to take average column-wise for the rows having duplicate gene names. Gene Sample_1 Sample_2 Sample_3 gene_A 2 4 5 gene_B 1 2 3 gene_A 0 5 7 gene_B 4 5 6 gene_A 11 12 13 gene_C 2 3 4 Desired output: gene_A 4.3 7 8.3 gene_B 2.5 3.5 4.5 gene_C 2 3 4... (6 Replies)
Discussion started by: Sanchari
6 Replies

3. UNIX for Dummies Questions & Answers

Writing a script to take the average of two columns every 3 rows

I have a dataset with 120 columns. I would like to write a script, that takes the average of every two columns, starting from columns 2 and 3, and moving consecutively in frames of 3 columns, all the way until the last column. The first column in the output file would be the averages of columns... (1 Reply)
Discussion started by: evelibertine
1 Replies

4. Shell Programming and Scripting

Calculate average of rows between two specific patterns

Hi, I have a file like this: variableStep chrom=chrX span=1 92328 0 92329 0 92330 0 92331 0 92332 0 92333 0 ................ ................ ................ variableStep chrom=chrX span=1 45649610 -0.00386 45649611 1.56 45649612 -2.23 45649613 ... (2 Replies)
Discussion started by: Ranajit_Das
2 Replies

5. Shell Programming and Scripting

average of rows with same value in the first column

Dear All, I have this file tab delimited A 1 12 22 B 3 34 33 C 55 9 32 A 12 81 71 D 11 1 66 E 455 4 2 B 89 4 3 I would like to make the average every column where the first column is the same, for example, A 6,5 46,5 46,5 B 46,0 19,0 18,0 C 55,0 9,0 32,0 D 11,0 1,0 66,0... (8 Replies)
Discussion started by: paolo.kunder
8 Replies

6. UNIX Desktop Questions & Answers

Calculate average for rows in a text file

Dear Gurus, I have tab-delimited text files with matrix containing values. The first column is a identifier and other columns have the corresponding values. I would like to calculate the average value (total number/number of entries) for all entries from 2nd column to the last column in row... (3 Replies)
Discussion started by: Unilearn
3 Replies

7. Shell Programming and Scripting

Average calculation based on number of rows

Dear users, I need your support, I have a file like this: 272134.548 6680572.715 272134.545 6680572.711 272134.546 6680572.713 272134.548 6680572.706 272134.545 6680572.721 272134.543 6680572.710 272134.544 6680572.715 272134.543 6680572.705 272134.540 6680572.720 272134.544... (10 Replies)
Discussion started by: Gery
10 Replies

8. Shell Programming and Scripting

Script to find the average of a given column and also for specified number of rows?

Hi Friends, In continuation to my earlier post https://www.unix.com/shell-programming-scripting/99166-script-find-average-given-column-also-specified-number-rows.html I am extending my problem as follows. Input: Column1 Column2 MAS 1 MAS 4 ... (2 Replies)
Discussion started by: ks_reddy
2 Replies

9. Shell Programming and Scripting

Script to find the average of a given column and also for specified number of rows??

Hi friends I have 100 files in my directory. Each file look like this.. Temp1 Temp2 Temp3 MAS 1 2 3 MAS 4 5 6 MAS 7 8 9 Delhi 10 11 12 Delhi 13 14 15 Delhi 16 17 ... (4 Replies)
Discussion started by: ks_reddy
4 Replies

10. UNIX for Dummies Questions & Answers

Calculating the Number of Rows and Average

Hi All I like to know how can we calculate the number of rows and the average of the values present in the file. I will not know what will be the rowcount, which will be dynamic in nature of the file. eg. 29 33 48 30 28 (6 Replies)
Discussion started by: pk_eee
6 Replies
Login or Register to Ask a Question