Get column average using ID

# 8  
Old 09-10-2013
OK, try something like:
awk 'NR==1; NR>1{A[$2]+=$3; C[$2]++} END{for (i in A) print i,A[i]/C[i]}' OFMT='%.2f'  file

These 2 Users Gave Thanks to Scrutinizer For This Post:
# 9  
Old 09-10-2013
OK that worked beautifully. Thank you very much.

Now, I am legitimately working on learning to use awk on the fly without poring over forums to get what I need. You seem to have mastered this already. Would you be willing to break down the code you just wrote and tell me what each piece is doing? This is not something I see done very often, and most of the resources on Unix programming do not go into enough detail on what is actually being issued. I am assuming others would benefit from this as well. Thank you in advance if you are willing to do that.
# 10  
Old 09-10-2013
Sure, no problem:
awk '
  NR==1                                # On the first line (the header) no action is given, so the default action runs, i.e. print the line { print $0 }
  NR>1{                                # For every subsequent line:
    A[$2]+=$3                          # Add the third field to the element of associative array A indexed by the second field (running sum per ID)
    C[$2]++                            # Increment the counter in associative array C for the same index (number of lines per ID)
  }
  END{                                 # When all lines have been processed:
    for (i in A) print i,A[i]/C[i]     # Loop through the array indices and print each index with its sum divided by its count, using OFMT for the number format
  }
' OFMT='%.2f' file                     # Set the output format to 2 decimals and specify the file name
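
To see the pieces above working together, here is a small run on made-up data. The file contents and the column names (name, id, value) are assumptions for illustration only; the thread does not show the original input. The awk program is the one-liner from post #8, unchanged.

```shell
# Hypothetical sample input: a header line, then name / ID / value columns.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
name id value
a 10 1
b 10 2
c 20 5
d 20 6
e 20 8
EOF

# Same program as above: print the header, then sum and count $3 per $2,
# and print the per-ID average in the END block.
result=$(awk 'NR==1; NR>1{A[$2]+=$3; C[$2]++} END{for (i in A) print i,A[i]/C[i]}' OFMT='%.2f' "$tmp")
printf '%s\n' "$result"

rm -f "$tmp"
```

With this sample, ID 10 averages to 1.50 and ID 20 to 6.33. The header comes out first, but awk does not guarantee the order in which for (i in A) visits the indices, so the two average lines may appear in either order; pipe them through sort if a fixed order matters.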

These 2 Users Gave Thanks to Scrutinizer For This Post:
# 11  
Old 09-10-2013
Amazing. Thank you!

