averaging column values with awk


 
# 1  
Old 01-25-2009

Hello. I'm just starting to learn awk, so bear with me. I have a large text file in which everything is in a single column, formatted like this:

ID001
value 1
value 2
value....n
ID002
value 1
value 2
value... n

I want to calculate the average of the values for each ID across the whole column. In total there are 25000 IDs, and n ranges anywhere from 100 to 2000.

Thanks in advance.
John
# 2  
Old 01-26-2009
Assuming the same file format as described above:

Code:
$ cat file
ID001
1
2
3
4
5
ID002
22
33
44
55
66
77
ID003
6
7
19
1


Code:
awk  'NR==1{id=$0; next} /^ID[0-9]+$/{print id, s/n; s=n=0; id=$0}
      /^[0-9]+$/{s+=$0; n++} END{print id, s/n}' file


Output:

Code:
ID001 3
ID002 49.5
ID003 8.25

# 3  
Old 01-26-2009
Thanks Rubin. I think I left out some info, though. Your script works great, but not on my data. I think the reasons are:

1.) My IDs are not sequential (i.e. not ID001, ID002, but in random order: ID005, ID001, ID999)

2.) My IDs do not look like ID001; they are BC followed by six random digits, such as BC000601, BC015656, etc.

3.) All the data is stored in a file called data.txt,
in this format:
BC001061
56.66
51.1
12.1223
68
..n
BC567123
1
15.6
12.111
..n
etc etc

Also, I am using Cygwin and presumably GNU awk.

Thanks again. I hope that helps.
Cheers,
John
# 4  
Old 01-26-2009
Code:
awk  'NR==1{id=$0; next} /^BC[0-9]+$/{print id, s/n; s=n=0; id=$0}
      /^[0-9]+$/{s+=$0; n++} END{print id, s/n}' data.txt

# 5  
Old 01-26-2009
Yes, I tried it, but I get a fatal error: division by zero attempted, FNR=288.

Line 288 happens to be the line with the ID for the second set of data in the column.
Any ideas?
Thanks again in advance.
-J
# 6  
Old 01-26-2009
Code:
awk  'NR==1{id=$0; next} /^BC[0-9]+$/{print id, (n) ? s/n : "NA"; s=n=0; id=$0}
      /^[0-9]+$/{s+=$0; n++} END{print id, (n) ? s/n : "NA"}' data.txt

# 7  
Old 01-26-2009
Thanks, that removed the error, but I still need to figure out how to get the average for each ID.
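
A possible explanation, offered as an assumption since the thread ends here: the value pattern /^[0-9]+$/ matches only whole numbers, so decimal values such as 56.66 are skipped. For a group whose values all contain a decimal point, n stays at 0, which would explain both the division by zero and the NA results. The sketch below widens the pattern to accept an optional decimal point, and it also strips a trailing carriage return in case the file has Windows line endings under Cygwin (a second assumption). The function name avg_by_id, the file sample.txt, and the sample values are hypothetical, chosen to mirror the format shown in post #3.

```shell
# avg_by_id is a hypothetical helper name; it prints "ID average" for each group.
avg_by_id() {
    awk '
    { sub(/\r$/, "") }                     # strip a trailing CR (assumes possible Windows line endings)
    /^BC[0-9]+$/ {                         # ID line: report the finished group, then start a new one
        if (id != "") print id, (n ? s / n : "NA")
        id = $0; s = 0; n = 0
        next
    }
    /^[0-9]*\.?[0-9]+$/ { s += $0; n++ }   # integer or decimal value, e.g. 68 or 56.66
    END { if (id != "") print id, (n ? s / n : "NA") }
    ' "$1"
}

# Made-up sample in the format from post #3.
printf '%s\n' BC001061 56.66 51.1 12.1223 68 BC567123 1 15.6 12.111 > sample.txt
avg_by_id sample.txt
```

With awk's default number formatting this should print 46.9706 for BC001061 and 9.57033 for BC567123. The value pattern deliberately rejects anything that is not a plain non-negative number, so negative values or exponent notation would need a wider pattern.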