Average, min and max in file with header, using awk


 
# 8  
Old 02-21-2013
1. I cannot post an example with 1 million lines here. The data has exactly the same style of content; it just goes on for another 999,997 lines.
2. I stated in the first post that writing NR>1 gave me the last number of the column. I'm sorry, I didn't realize that in my example this number was in fact the smallest one. In my original file it clearly is not, as I can see using the head command. Using tail, I accidentally discovered that the value printed was the last entry in the column.
3. As stated, both scripts work. I ran them on a data set of about 300 rows, which I can check in Excel. I can't do that with the largest data set, as Excel will most likely crash.
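
One way to sanity-check the minimum and maximum on a file too large for Excel is to sort the last column numerically and look at both ends with head and tail. The sketch below is only an illustration: it uses the five sample rows shown in this thread, the temporary file names are made up, and it assumes plain decimal values (values in scientific notation would need GNU sort's -g instead of -n).

```shell
# Hypothetical temp files; replace "$data" with the real file in practice.
data=$(mktemp)
sorted=$(mktemp)

# Sample rows taken from the thread (one header line, whitespace-separated).
cat > "$data" <<'EOF'
   FID        IID MISS_PHENO   N_MISS   N_GENO   F_MISS
  12AB43131   12AB43131          N    17774   906341  0.01961
  65HJ87451   65HJ87451          N    10149   906341   0.0112
  43JJ21345   43JJ21345          N     2826   906341 0.003118
  43JJ21345   43JJ21345          N     2826   906341 0.3119
  43JJ21345   43JJ21345          N     2826   906341 0.3118
EOF

# Print the last field of every non-empty data line (header skipped),
# then sort numerically. LC_ALL=C keeps "." as the decimal separator.
LC_ALL=C awk 'NR > 1 && NF { print $NF }' "$data" | LC_ALL=C sort -n > "$sorted"

min_check=$(head -n 1 "$sorted")   # smallest value is the first line
max_check=$(tail -n 1 "$sorted")   # largest value is the last line
echo "min=$min_check max=$max_check"

rm -f "$data" "$sorted"
```

If these two numbers agree with what the awk script prints, the script is very likely picking the true extremes rather than the last entry in the column.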
# 9  
Old 02-21-2013
Have you tried my solution?

Please check below.

Code:
$ cat file
   FID        IID MISS_PHENO   N_MISS   N_GENO   F_MISS
  12AB43131   12AB43131          N    17774   906341  0.01961
  65HJ87451   65HJ87451          N    10149   906341   0.0112
  43JJ21345   43JJ21345          N     2826   906341 0.003118
  43JJ21345   43JJ21345          N     2826   906341 0.3119
  43JJ21345   43JJ21345          N     2826   906341 0.3118

$ awk 'NR>1{a+=$NF;
  max=max>$NF?max:$NF;
  min=min>$NF||!min?$NF:min}
  END{print a/(NR-1),max,min}' file
0.131526 0.3119 0.003118

I got the same result using RudiC's script.
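
A note on the one-liner above: it relies on max and min starting out empty, so a genuine minimum of 0 can be mistaken for "not set yet" by the !min test, max stays empty if every value is negative, and dividing by NR-1 also counts any blank lines. The following is a more defensive sketch, not the script from either poster; the variable names (v, n, sum) and the temporary file are assumptions made for illustration.

```shell
# Hypothetical temp file holding the sample rows from this thread.
data=$(mktemp)
cat > "$data" <<'EOF'
   FID        IID MISS_PHENO   N_MISS   N_GENO   F_MISS
  12AB43131   12AB43131          N    17774   906341  0.01961
  65HJ87451   65HJ87451          N    10149   906341   0.0112
  43JJ21345   43JJ21345          N     2826   906341 0.003118
  43JJ21345   43JJ21345          N     2826   906341 0.3119
  43JJ21345   43JJ21345          N     2826   906341 0.3118
EOF

result=$(awk '
  NR == 1 { next }                      # skip the header line
  NF == 0 { next }                      # skip blank lines so they are not counted
  {
    v = $NF + 0                         # force a numeric value from the last field
    sum += v
    n++                                 # count only the rows actually used
    if (n == 1 || v > max) max = v      # first data row seeds max
    if (n == 1 || v < min) min = v      # first data row seeds min
  }
  END { if (n) print sum / n, max, min }  # average, max, min (same order as above)
' "$data")

echo "$result"
rm -f "$data"
```

On the five sample rows this prints the same three numbers as the one-liner. Seeding min and max from the first data row means zeros and negative values are handled like any other value.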
# 10  
Old 02-21-2013
Yes, pamu, yours worked as well. I tested both on the 300-row file, and they gave exactly the same output, which I verified in Excel.

I hope to be able to understand more of the content of your script some day, but right now I have a deadline, so I will have to get back to it later. Looks really interesting though!

I made some minor changes to RudiC's script to fix line breaks etc. Is that what you were referring to, pamu?
# 11  
Old 02-21-2013
Quote:
Originally Posted by kayakj
I hope to be able to understand more of the content of your script some day, but right now I have a deadline, so I will have to get back to it later. Looks really interesting though!

I made some minor changes to RudiC's script to fix line breaks etc. Is that what you were referring to, pamu?
Yeah, sure.

You can use either script to get your desired output.

And I think you can use awk for large files. Unlike Excel, it won't crash (probably).
# 12  
Old 02-21-2013
Yes, this is why I posted the question in the first place. I did my entire analysis pipeline (which includes much more than just this step) in Excel, but it kept crashing and saving only a fraction of the file.

I'm redoing it with awk now, which runs really smoothly in comparison. I just need to get all the scripts right.

Thank you so much for all your help!
