How to average the third column for each value of the first column?


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting How to average the third column for each value of the first column?
# 8  
Old 11-12-2015
Quote:
Originally Posted by RudiC
You weren't too far off ... compare this to what you posted, and you'll find your mistakes:
Code:
awk '
        {arr[$1] += $3
         count[$1] ++
        }
END     {for (a in arr) {print "id avg " a " = " arr[a] / count[a]
                        }
        }
' FS="," file
id avg 32.498567 = 0.0309062
id avg 30.498001 = 0.020214 <--- it should be close to 0.0288807
id avg 34.758781 = 0.031875

Only that the input is not consistent and line 3 has an extra field, producing the wrong average.

Perhaps,
Code:
awk '
        {arr[$1] += $NF
         count[$1] ++
        }
END     {for (a in arr) {print "id avg " a " = " arr[a] / count[a]
                        }
        }
' FS="," pmfcg.file
id avg 32.498567 = 0.0309062
id avg 30.498001 = 0.0288807
id avg 34.758781 = 0.031875

A Perl alternative:
Code:
perl -nle '
    /^([\d\.-]+,[\d\.-]+).*?([\d\.-]+)$/ and @{$u{$1}}[0] += $2 and @{$u{$1}}[1]++;
    END{for $k (keys %u){print "$k,", @{$u{$k}}[0]/@{$u{$k}}[1]}}
' pmfcg.file

output:
Code:
34.758781,-87.650562,0.031875
32.498567,-86.136587,0.03090625
30.498001,-87.881412,0.0288806666666667

# 9  
Old 11-18-2015
Ok, so I tried this
Code:
awk '
        {arr[$1] += $3
         count[$1] ++
        }
END     {for (a in arr) {print "id avg “ a " = " arr[a] / count[a]
                        }
        }
' FS="," file

This is exactly what I need! Thanks so much for the pointers!

However, I also need to print it in the following format:
Code:
Lat,Long,avg

As of now, it gives me this:
Code:
id avg 32.498567 = 0.0309062
id avg 30.498001 = 0.0288807
id avg 34.758781 = 0.031875

How could I ammend the above code to give me the LONG too? Otherwise I would have to manually pair the LAT with the LONg.... yuck...

---------- Post updated at 05:05 PM ---------- Previous update was at 04:59 PM ----------

Alright, Aia's Perl solution did exactly what I needed to do. Thanks!!!!!
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Average each numeric column

Hi all, Does anyone know of an efficient unix script to average each numeric column of a multi-column tab delimited file (with header) with some character columns. Here is an example input file: CHR RS_ID ALLELE POP1 POP2 POP3 POP4 POP5 POP6 POP7 POP8... (7 Replies)
Discussion started by: Geneanalyst
7 Replies

2. Shell Programming and Scripting

Check first column - average second column based on a condition

Hi, My input file Gene1 1 Gene1 2 Gene1 3 Gene1 0 Gene2 0 Gene2 0 Gene2 4 Gene2 8 Gene3 9 Gene3 9 Gene4 0 Condition: If the first column matches, then look in the second column. If there is a value of zero in the second column, then don't consider that record while averaging. ... (5 Replies)
Discussion started by: jacobs.smith
5 Replies

3. Shell Programming and Scripting

Get column average using ID

I have a file that looks like this: id window BV 1 1 0.5 1 2 0.2 1 3 0.1 2 1 0.5 2 2 0.1 2 3 0.2 3 1 0.4 3 2 0.6 3 3 0.8 Using awk, how would I get the average BV for window 1? Output like this: window avgBV 1 0.47 2 0.23 (10 Replies)
Discussion started by: jwbucha
10 Replies

4. UNIX for Dummies Questions & Answers

Average by specific column value, awk

Hi, I am searching for an awk-script that computes the mean values for the $2 column, but addicted to the values in the $1 column. It also should delete the unnecessary lines after computing... An example (for some reason I cant use the code tag button): cat list.txt 1 10 1 30 1 20... (2 Replies)
Discussion started by: bjoern456
2 Replies

5. Shell Programming and Scripting

Calculate the average of a column based on the value of another column

Hi, I would like to calculate the average of column 'y' based on the value of column 'pos'. For example, here is file1 id pos y c 11 1 220 aa 11 4333 207 f 11 5333 112 ee 11 11116 305 e 11 11117 310 r 11 22228 781 gg 11 ... (2 Replies)
Discussion started by: jackken007
2 Replies

6. Shell Programming and Scripting

average of rows with same value in the first column

Dear All, I have this file tab delimited A 1 12 22 B 3 34 33 C 55 9 32 A 12 81 71 D 11 1 66 E 455 4 2 B 89 4 3 I would like to make the average every column where the first column is the same, for example, A 6,5 46,5 46,5 B 46,0 19,0 18,0 C 55,0 9,0 32,0 D 11,0 1,0 66,0... (8 Replies)
Discussion started by: paolo.kunder
8 Replies

7. Shell Programming and Scripting

average each column in a file

Hi, I tried to do this in excel but there is a limit to how many rows it can handle. All I need to do is average each column in a file and get the final value. My file looks something like this (obviously a lot larger): Joe HHR + 1 2 3 4 5 6 7 8 Jor HHR - 1 2 3 4 5 6 7 8 the output... (1 Reply)
Discussion started by: kylle345
1 Replies

8. UNIX for Dummies Questions & Answers

average of a column in a table

Hello, Is there a quick way to compute the average of a column data in a numerical tab delimeted file? Thanks, Gussi (2 Replies)
Discussion started by: Gussifinknottle
2 Replies

9. UNIX for Dummies Questions & Answers

Use awk to calculate average of column 3

Suppose I have 500 files in a directory and I need to Use awk to calculate average of column 3 for each of the file, how would I do that? (6 Replies)
Discussion started by: grossgermany
6 Replies

10. UNIX for Dummies Questions & Answers

calculate average of column 2

Hi I have fakebook.csv as following: F1(current date) F2(popularity) F3(name of book) F4(release date of book) 2006-06-21,6860,"Harry Potter",2006-12-31 2006-06-22,,"Harry Potter",2006-12-31 2006-06-23,7120,"Harry Potter",2006-12-31 2006-06-24,,"Harry Potter",2006-12-31... (0 Replies)
Discussion started by: onthetopo
0 Replies
Login or Register to Ask a Question