Sorting with header and mixed numerals (scientific and decimal) | awk


 
# 1  
Old 01-04-2011

Code:
Assoc.txt
 CHR         SNP         BP   A1       TEST    NMISS         OR         STAT            P 
   1   rs2980319     766985    A        ADD     4154      1.024       0.1623       0.8711
   1   rs2980319     766985    A     AGECAT     4154      1.371        6.806    1.003e-11
   1   rs2980319     766985    A        EV1     4154   1.68e-30       -17.51      1.2e-68
   1   rs2980319     766985    A        EV2     4154   3.42e-12       -7.966    1.645e-15
   1   rs2980319     766985    A        EV3     4154    0.03088      -0.5361       0.5919
   1   rs2980319     766985    A        EV4     4154      17.18       0.2873       0.7739
   1  rs11260595    1028961    T        ADD     4158      1.004      0.01069       0.9915
   1  rs11260595    1028961    T     AGECAT     4158      1.364        6.671     2.54e-11
   1  rs11260595    1028961    T        EV1     4158  1.318e-30       -17.52    1.043e-68

I want to keep only the lines where TEST (column 5) has the value "ADD", and then sort them by P (column 9) in ascending order.
The column header must stay as the first line and must not be sorted along with the data.

Using awk, I can do the following:
Code:
awk '{if($5=="ADD") {print $0}}' OFS='\t'  Assoc.txt |sort -nrk9

This, however, removes the header row.
It also does not sort column 9 properly, because the values are a mix of scientific notation (3.457e-05) and plain decimals (0.0001029).
# 2  
Old 01-04-2011
For the header...

1st command, something like
Code:
head -1 <Assoc.txt >Assoc.srt

2nd command, like
Code:
blah blah >>Assoc.srt

Sorry about the "blah blah"; I will have to think about the scientific notation.
You might be able to have awk select the rows where $5=="ADD" or $5=="TEST", so that the header passes through as well, but that might create some weirdness in your later sorting. That is why I suggested two steps.
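
The two steps could be filled in roughly as follows. This is only a sketch: the sample rows are copied from this thread into a scratch file, the `workdir` name is made up for the example, and the numeric sort assumes GNU sort's `-g` (general numeric) option, which is not part of the suggestion above.

```shell
# Scratch copy of a few rows from this thread (workdir is a hypothetical name)
workdir=$(mktemp -d)
cat > "$workdir/Assoc.txt" <<'EOF'
 CHR         SNP         BP   A1       TEST    NMISS         OR         STAT            P
   1   rs2980319     766985    A        ADD     4154      1.024       0.1623       0.8711
   1   rs2980319     766985    A     AGECAT     4154      1.371        6.806    1.003e-11
   1  rs11260595    1028961    T        ADD     4158      1.004      0.01069       0.9915
   6   rs9469220   32766288    C        ADD     4164     0.5633       -5.237    1.628e-07
   6   rs4530903   32689867    A        ADD     4151      2.084        5.538    3.067e-08
EOF

# Step 1: copy the header line on its own
head -n 1 "$workdir/Assoc.txt" > "$workdir/Assoc.srt"

# Step 2: append the ADD rows, sorted ascending on column 9
# (-g is assumed to be available, i.e. GNU sort)
awk '$5 == "ADD"' "$workdir/Assoc.txt" | LC_ALL=C sort -g -k9,9 >> "$workdir/Assoc.srt"

cat "$workdir/Assoc.srt"
```

Because the header is written before the data rows are appended, it never passes through sort and cannot be reordered.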

---------- Post updated at 01:48 PM ---------- Previous update was at 01:46 PM ----------

A couple of posts on scientific notation:

https://www.unix.com/shell-programmin...sion-unix.html
https://www.unix.com/shell-programmin...on-normal.html

Last edited by joeyg; 01-04-2011 at 02:49 PM.. Reason: Added a couple post references.
# 3  
Old 01-04-2011
try this:
Code:
nawk 'FNR==1{print | "cat 1>&2";next}$5=="ADD" {print $0}' OFS='\t'  Assoc.txt |sort -nrk9

# 4  
Old 01-04-2011
Quote:
Originally Posted by vgersh99
try this:
Code:
nawk 'FNR==1{print | "cat 1>&2";next}$5=="ADD" {print $0}' OFS='\t'  Assoc.txt |sort -nrk9

Code:
awk 'FNR==1{print | "cat 1>&2";next}$5=="ADD" {print $0}' OFS='\t'  Assoc.txt  |sort -gk9 >Assoc_sorted.TXT

Code:
CHR         SNP         BP   A1       TEST    NMISS         OR         STAT            P 
FL_NoOutlr :~>head Foll_LRM_Sorted.TXT
   6   rs4530903   32689867    A        ADD     4151      2.084        5.538    3.067e-08
   6   rs9268853   32537621    C        ADD     4182      1.749        5.335    9.552e-08
   6   rs9469220   32766288    C        ADD     4164     0.5633       -5.237    1.628e-07

This prints the header to the terminal (through stderr), while the remaining lines are piped through sort into the output file.
I also found that -gk9 handles scientific notation; -nrk9 did not work as expected.
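
The difference between the two flags can be checked on a few P values from the table in the first post. With GNU sort (an assumption here), `-n` reads only a plain decimal number and stops at the `e`, so `1.2e-68` is compared as 1.2, whereas `-g` converts the whole field to a floating-point value. A minimal check, with variable names chosen only for this example:

```shell
# A few P values from the first post, mixing decimal and scientific notation
pvals='0.8711
1.003e-11
1.2e-68
0.5919'

# -n stops reading at the "e", so 1.2e-68 is treated as 1.2
sorted_n=$(printf '%s\n' "$pvals" | LC_ALL=C sort -n)

# -g parses the exponent as well (GNU sort assumed)
sorted_g=$(printf '%s\n' "$pvals" | LC_ALL=C sort -g)

printf 'sort -n:\n%s\n\nsort -g:\n%s\n' "$sorted_n" "$sorted_g"
```

With `-n` the smallest value, 1.2e-68, ends up last; with `-g` it comes first.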
# 5  
Old 01-04-2011
Code:
nawk 'FNR==1{print >out;next}$5=="ADD" {print $0}' OFS='\t' out=Assoc_sorted.TXT  Assoc.txt |sort -gk9 >>Assoc_sorted.TXT

This User Gave Thanks to vgersh99 For This Post:
# 6  
Old 01-04-2011
Or:
Code:
awk 'NR==1{print;next} $5=="ADD"{print | "sort -grk9"}' OFS="\t" file

This User Gave Thanks to Franklin52 For This Post:
# 7  
Old 01-04-2011
Quote:
Originally Posted by Franklin52
Or:
Code:
awk 'NR==1{print;next} $5=="ADD"{print | "sort -grk9"}' OFS="\t" file

This gives descending order. Otherwise it is a good solution, and slightly faster.
Thank you, Franklin52 and vgersh99.
To get ascending order, I changed sort -grk9 to sort -gk9.
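
Put together, the ascending version could look roughly like the sketch below. The scratch file and the `workdir` and `sorter` names are made up for the example, and the rows are copied from this thread. The `fflush()` and `close()` calls are an addition that does not appear in the thread: they are meant to make sure the header reaches the output before sort writes its result when the output is redirected to a file. GNU sort is assumed for `-g`.

```shell
# Scratch copy of a few rows from this thread (workdir is a hypothetical name)
workdir=$(mktemp -d)
cat > "$workdir/Assoc.txt" <<'EOF'
 CHR         SNP         BP   A1       TEST    NMISS         OR         STAT            P
   1   rs2980319     766985    A        ADD     4154      1.024       0.1623       0.8711
   1   rs2980319     766985    A     AGECAT     4154      1.371        6.806    1.003e-11
   1  rs11260595    1028961    T        ADD     4158      1.004      0.01069       0.9915
   6   rs9469220   32766288    C        ADD     4164     0.5633       -5.237    1.628e-07
   6   rs4530903   32689867    A        ADD     4151      2.084        5.538    3.067e-08
EOF

awk -v sorter='LC_ALL=C sort -g -k9,9' '
    NR == 1     { print; fflush(); next }  # header: write it out right away, unsorted
    $5 == "ADD" { print | sorter }         # ADD rows: hand them to sort
    END         { close(sorter) }          # wait until sort has written its output
' "$workdir/Assoc.txt" > "$workdir/Assoc_sorted.TXT"

cat "$workdir/Assoc_sorted.TXT"
```

The sort command is passed in as a variable so that the same string is used for both the pipe and the `close()` call.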
