Need awk to extract lines and sort
Hi,
My data looks like this.
Code:
CHR SNP BP A1 TEST NMISS OR STAT P
0 SNP_A-8282315 0 2 ADD 1530 1.074 0.7707 0.4409
0 SNP_A-8282315 0 2 COV1 1530 1.771e+04 4.764 1.898e-06
0 SNP_A-8282315 0 2 COV2 1530 1.513e+04 4.645 3.402e-06
0 SNP_A-8282315 0 2 COV3 1530 14.16 1.306 0.1915
0 SNP_A-8282315 0 2 COV4 1530 1.264 0.1139 0.9093
0 SNP_A-8282315 0 2 COV5 1530 2.389 0.4268 0.6695
0 SNP_A-8338258 0 4 ADD 1528 0.9498 -0.6824 0.495
0 SNP_A-8338258 0 4 COV1 1528 1.846e+04 4.777 1.783e-06
0 SNP_A-8338258 0 4 COV2 1528 1.374e+04 4.6 4.224e-06
0 SNP_A-8338258 0 4 COV3 1528 14.82 1.33 0.1836
0 SNP_A-8338258 0 4 COV4 1528 1.251 0.1087 0.9134
0 SNP_A-8338258 0 4 COV5 1528 2.431 0.4354 0.6633
.
.
- I want to extract only the lines with "ADD" in column 5 (TEST).
- I want to sort ascending on column 9 (P). As shown above, this column mixes plain decimal and scientific notation.
- How can I keep the header line of the file intact in the outfile?
What I tried:
Code:
gawk '{if(NR>1 && $5="ADD") {print $0}}' infile>outfile
Result: every data line is printed, and column 5 of each one is changed to ADD.
Last edited by genehunter; 10-11-2009 at 02:33 AM..
Reason: spelling ALL to ADD
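A possible approach, offered as a sketch rather than a definitive answer: in awk a single = assigns, so $5="ADD" overwrites field 5 and is always true, while == compares. The example below copies the header line first, then filters on $5 == "ADD" and sorts on column 9. It assumes a sort that supports the g (general numeric) key modifier, which is a GNU coreutils extension and is what lets values such as 1.898e-06 sort correctly. The sample infile reuses rows from the post plus one made-up row (SNP_X-0000001) added only to show the scientific-notation ordering.

```shell
# Small sample standing in for the real infile.
# The last row (SNP_X-0000001) is made up for illustration only.
cat > infile <<'EOF'
CHR SNP BP A1 TEST NMISS OR STAT P
0 SNP_A-8282315 0 2 ADD 1530 1.074 0.7707 0.4409
0 SNP_A-8282315 0 2 COV1 1530 1.771e+04 4.764 1.898e-06
0 SNP_A-8338258 0 4 ADD 1528 0.9498 -0.6824 0.495
0 SNP_A-8338258 0 4 COV1 1528 1.846e+04 4.777 1.783e-06
0 SNP_X-0000001 0 1 ADD 1500 1.2 3.1 2.5e-03
EOF

{
  # Keep the header line as the first line of the output.
  head -n 1 infile
  # Skip the header, keep rows whose 5th field equals ADD (== compares),
  # then sort ascending on field 9 as a general numeric value.
  # LC_ALL=C makes sort read "." as the decimal point.
  awk 'NR > 1 && $5 == "ADD"' infile | LC_ALL=C sort -k9,9g
} > outfile

cat outfile
```

The braces group the two commands so both write to the same outfile in order, which keeps the header out of the sort.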