Average, min and max in file with header, using awk


 
# 1  
Old 02-20-2013

Hi,
I have a file which looks like this:
Code:
     FID        IID MISS_PHENO   N_MISS   N_GENO   F_MISS
  12AB43131   12AB43131          N    17774   906341  0.01961
  65HJ87451   65HJ87451          N    10149   906341   0.0112
  43JJ21345   43JJ21345          N     2826   906341 0.003118

I would like an awk script that extracts the average, min and max values from the last column. The following code prints the minimum value:
Code:
awk 'min=="" || $6 < min {min=$6} END{print min}' file

Putting > instead of < naturally gives me the header instead. Putting NR>1 gives me the last number of the column, here 0.003118. If I remove the header, it works fine, but I would like to skip that step. The following code, however, works fine for finding the average:
Code:
awk 'NR>1{sum+=$6}END{print "Average missing = ",sum/(NR-1)}' missing_200213.imiss

More info on the file: it could have as many as 1 million lines in total. The values are usually decimal numbers between 0 and 1. After the header, the column in question is purely numeric. There will always be a number, and it may be exactly 0 or 1.

Thank you!
# 2  
Old 02-20-2013
Well, you're on the right track, you just need to get your ducks in a row. Look at this and adapt it to your needs:
Code:
awk     'NR==1  {max=0;min=1}
         NR>1   {sum+=$6
                 if (min>$6) min=$6
                 if (max<$6) max=$6
                 cnt = NR
                }
         END    {print  sum/(cnt-1), min, max}
        ' file
0.0113093 0.003118 0.01961

Some further reading on awk might be really helpful.
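
The script above seeds max=0 and min=1, which relies on every value lying between 0 and 1, as post #1 describes. A minimal sketch that drops that assumption by seeding both from the first data row might look like this. The function name col_stats and its optional column argument are made up for illustration and are not part of the thread:

```shell
# col_stats FILE [COLUMN]: print average, min and max of a numeric column,
# skipping a one-line header. COLUMN defaults to 6 (F_MISS in post #1).
# Hypothetical helper name, not from the thread.
col_stats() {
    awk -v col="${2:-6}" '
        NR == 1 { next }                        # skip the header line
        {
            v = $col + 0                        # force a numeric value
            if (n == 0) { min = max = v }       # seed from the first data row
            if (v < min) min = v
            if (v > max) max = v
            sum += v
            n++                                 # count data rows explicitly
        }
        END { if (n) print sum / n, min, max }  # print nothing if no data rows
    ' "$1"
}
```

Called as col_stats file on the sample from post #1, this should print the same three numbers as above. Counting rows in n rather than using NR-1 keeps the average correct if further filters are added later.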
# 3  
Old 02-20-2013
Code:
$  awk 'NR>1{a+=$NF;
  max=max>$NF?max:$NF;
  min=min>$NF||!min?$NF:min}
  END{print a/(NR-1),max,min}' file

0.0113093 0.01961 0.003118
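
One caveat, given that post #1 says the column may contain 0: the `!min` test above cannot tell "not set yet" from a genuine minimum of 0, so a 0 would be overwritten by the next row. A sketch in the same ternary style that seeds from the first data row (NR == 2) instead, run on made-up inline data rather than the real file:

```shell
# Made-up input: a header plus three rows whose last column includes a real 0.
result=$(printf '%s\n' 'ID F_MISS' 'a 0' 'b 0.5' 'c 0.2' |
    awk 'NR > 1 {
             v   = $NF + 0                        # compare as numbers
             sum += v
             max = (NR == 2 || v > max) ? v : max # first data row seeds max
             min = (NR == 2 || v < min) ? v : min # ... and min, even if it is 0
         }
         END { if (NR > 1) print sum / (NR - 1), max, min }')
echo "$result"    # prints: 0.233333 0.5 0
```

With the `!min` version the same input would report 0.2 as the minimum, because the 0 from row a is treated as unset when row b arrives.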

# 4  
Old 02-20-2013
Problem solved!

Thank you both,
both scripts work, though as a newbie I understand more of the first one. Does either of you know why my version kept giving me the last line? What in the code I was using was telling awk to do that?
# 5  
Old 02-21-2013
Not sure I understand. You were computing the minimum, and the last line's entry is the minimum - so outputting that line's value is exactly what you were asking for.
# 6  
Old 02-21-2013
Ah, I should specify:
When I ran the script I posted in the first post, it kept giving me the last value in the actual huge file (not the example I posted), whether or not it was the smallest number. So something in the script is apparently telling it to print the last line, but I can't see what it is. Not a big deal, I just want to understand awk a bit more.
# 7  
Old 02-21-2013
Hmmm, feeling uneasy - how do you expect us to give you a meaningful answer if you are supplying incomplete | insufficient | malphrased | meaningless | wrong input data?
We don't even know which of the two scripts you posted in post #1 you were talking about. And you ran it on data we have never seen.
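
The thread never establishes what went wrong, and the exact command that used NR>1 was not posted. As a sketch only, here is a header-skipping form of the min one-liner from post #1 that rules out two usual suspects: it parenthesises the original condition so that NR > 1 applies to all of it, and it adds 0 to the field so the comparison is numeric rather than string-based. The function name min_skip_header is hypothetical:

```shell
# min_skip_header FILE: smallest value in column 6, ignoring the header.
# Hypothetical helper name; column 6 is the F_MISS column from post #1.
min_skip_header() {
    awk 'NR > 1 && (min == "" || $6 + 0 < min) { min = $6 + 0 }
         END { print min }' "$1"
}
```

If this prints the true minimum on the large file while the original attempt did not, the difference lies in how NR>1 was combined with the pattern or in how the values were compared. That is an assumption to test, not a diagnosis.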