Get column average using ID


 
# 1  
Old 09-10-2013

I have a file that looks like this:


Code:
id window BV
1 1 0.5
1 2 0.2
1 3 0.1
2 1 0.5
2 2 0.1
2 3 0.2
3 1 0.4
3 2 0.6
3 3 0.8


Using awk, how would I get the average BV for each window? The output should look like this:

Code:
window avgBV
1 0.47
2 0.23
3 0.37


Last edited by jwbucha; 09-10-2013 at 01:40 PM..
# 2  
Old 09-10-2013
How does your output correspond to your input?
# 3  
Old 09-10-2013
The output shows the average BV for each window (1, 2, etc.).
# 4  
Old 09-10-2013
So you are grouping by the second column? Your suggested output does not seem to correspond to averages.
The average for window 1, for example, is 0.466667: (0.5 + 0.5 + 0.4) / 3.
# 5  
Old 09-10-2013
fixed
# 6  
Old 09-10-2013
Not entirely. Is this homework?
# 7  
Old 09-10-2013
Not homework, but a research problem. I am a PhD student in quantitative animal genomics. I am a geneticist, but I am working on learning Unix programming. I have large output files (500 MB) from analyzing the effects of 50,000 DNA markers in 2,000 animals. The program divides the genome into 1,000,000 base pair segments, each called a 'window'. I need to extract the breeding value (BV) for each window, averaged across the 2,000 animals. The table above is an example of what my output looks like. I apologize if I am not meeting the formatting requirements for this forum; it is my first post. I have been unable to find an awk solution anywhere else, hence my post.

edit: I should clarify that even though I am a student this is not for a class, but real data that is part of my research.
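
For reference, here is a minimal sketch of one way to do this in awk. It is not an answer taken from the thread. It assumes the file is whitespace-delimited, has exactly one header line, and keeps the window in column 2 and the BV in column 3, as in the sample above. The function name window_avg is made up for illustration. The file is read once and only one running sum and one count per window are kept in memory, so a 500 MB input should not be a problem.

```shell
# window_avg FILE
# Print the mean BV (column 3) for each window (column 2).
# Assumptions: whitespace-delimited input with a single header line.
window_avg() {
    echo "window avgBV"
    # Skip the header (NR > 1) and any short or blank lines (NF >= 3),
    # then accumulate a sum and a row count per window.
    LC_ALL=C awk '
        NR > 1 && NF >= 3 {
            sum[$2] += $3
            n[$2]++
        }
        END {
            # The order of "for (w in sum)" is unspecified in awk,
            # so the result is sorted numerically afterwards.
            for (w in sum)
                printf "%s %.2f\n", w, sum[w] / n[w]
        }
    ' "$1" | LC_ALL=C sort -k1,1n
}
```

Called as window_avg results.txt (a hypothetical file name) on the sample data, this prints 0.47, 0.30 and 0.37 for windows 1, 2 and 3. Note that window 2 comes out as (0.2 + 0.1 + 0.6) / 3 = 0.30, not the 0.23 shown in the desired output above.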