Print a row with the max number in a column


 
# 1  
Old 02-13-2020

Hello,


I have this table:

Code:
chr1_16857_17742         -        chr1    17369   17436   "ENST00000619216.1";    "MIR6859-1";    -        67
chr1_16857_17742         -        chr1    14404   29570   "ENST00000488147.1";    "WASH7P";       -        885
chr1_16857_18061        -        chr1    17369   17436   "ENST00000619216.1";    "MIR6859-1";    -        67
chr1_16857_18061        -        chr1    14404   29570   "ENST00000488147.1";    "WASH7P";       -        1204
chr1_16857_18061        -        chr1    17369   17436   "ENST00000619216.1";    "MIR6859-1";    -        67
chr1_16857_18061        -        chr1    14404   29570   "ENST00000488147.1";    "WASH7P";       -        1204
chr1_17232_18061        -        chr1    17369   17436   "ENST00000619216.1";    "MIR6859-1";    -        67
chr1_17232_18061        -        chr1    14404   29570   "ENST00000488147.1";    "WASH7P";       -        829
chr1_17914_24891        -        chr1    14404   29570   "ENST00000488147.1";    "WASH7P";       -        6977
  chr1_18267_29570        -        chr1    14404   29570   "ENST00000488147.1";    "WASH7P";       -        11303

For each distinct value in the first column, I need to go through all the rows that share it and print only the row whose last column holds the maximum value.



Desired output:


Code:
chr1_16857_17742         -       chr1    14404   29570   "ENST00000488147.1";    "WASH7P";       -       885
chr1_16857_18061        -       chr1    14404   29570   "ENST00000488147.1";    "WASH7P";       -       1204
chr1_17232_18061        -        chr1    14404   29570   "ENST00000488147.1";    "WASH7P";       -        829
chr1_17914_24891        -       chr1    14404   29570   "ENST00000488147.1";    "WASH7P";       -       6977
chr1_18267_29570        -       chr1    14404   29570   "ENST00000488147.1";    "WASH7P";       -       11303

I tried this:



Code:
awk ' ! ( $1 in A_max ) { A_max[$1] = $NF } {A_max[$1] = ( A_max[$1] > $NF ? A_max[$1] : $NF )} END { for ( k in A_max ) print A_max[k], k }'

but I'm stuck on how to print out the whole row.


Would appreciate any help!

Last edited by coppuca; 02-13-2020 at 03:05 PM..
# 2  
Old 02-13-2020
try:

Code:
sort -k9 -n -r file | awk '!a[$1]++'

In output shown above, what happened to the max line for chr1_17232_18061?

Last edited by rdrtx1; 02-18-2020 at 07:19 PM..
# 3  
Old 02-13-2020
Thank you! It works. The missing chr1_17232_18061 line was an accident.
Could you please explain how
Code:
awk '!a[$1]++'

works? Does it collapse the rows that share the same value in the given column?
# 4  
Old 02-13-2020
It prints only the first line seen for each distinct value of column 1 ($1). a[$1]++ is a post-increment: it returns the value the array element held before the increment, and that is 0 (false) the first time a key is seen, because the element does not exist yet. The ! negates it, so the expression is true on the first occurrence of a key and false on every later one. The default action for a true pattern in awk is to print the line. In other words: if the value of column 1 has not been seen before, print the line. It could have been written as: '{if (! a[$1]++) print $0}'
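
To watch the post-increment at work, here is a small sketch on made-up input. The k1/k2 rows and the names demo_input, first_seen and trace are only for illustration and are not from the thread.

```shell
# Hypothetical sample: the key in column 1 repeats.
demo_input='k1 10
k1 20
k2 5
k1 30
k2 7'

# a[$1]++ yields the old count (0 on first sight), so !0 is true and the line prints.
first_seen=$(printf '%s\n' "$demo_input" | awk '!a[$1]++')
printf '%s\n' "$first_seen"
# k1 10
# k2 5

# Print the old count and its negation next to each line.
trace=$(printf '%s\n' "$demo_input" | awk '{ old = a[$1]++; print old, (!old), $0 }')
printf '%s\n' "$trace"
# 0 1 k1 10
# 1 0 k1 20
# 0 1 k2 5
# 2 0 k1 30
# 1 0 k2 7
```

Only the rows where the second column of the trace is 1 are the ones that awk '!a[$1]++' prints.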

Last edited by RavinderSingh13; 02-28-2020 at 01:42 AM..
# 5  
Old 02-13-2020
Try as well
Code:
sort -k1,1 -k9nr  file | uniq -w16
chr1_16857_17742        -        chr1    14404   29570   "ENST00000488147.1";    "WASH7P";       -        885
chr1_16857_18061        -        chr1    14404   29570   "ENST00000488147.1";    "WASH7P";       -        1204
chr1_17232_18061        -        chr1    14404   29570   "ENST00000488147.1";    "WASH7P";       -        829
chr1_17914_24891        -        chr1    14404   29570   "ENST00000488147.1";    "WASH7P";       -        6977
chr1_18267_29570        -        chr1    14404   29570   "ENST00000488147.1";    "WASH7P";       -        11303
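
Note that uniq -w is not specified by POSIX, and -w16 relies on every key being exactly 16 characters wide. If either is a concern, a possible variant is to keep the two-key sort and let awk pick the first line per key, as in post #2. This is only a sketch; the function name top_per_key and the array name seen are made up for illustration.

```shell
# Sort by key (column 1), then by column 9 numerically descending,
# and keep only the first line for each key.
top_per_key() {
  sort -k1,1 -k9,9nr "$@" | awk '!seen[$1]++'
}
```

It would be called as top_per_key file, or fed from a pipe with no argument.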

# 6  
Old 02-14-2020
Regarding your initial attempt, it is quite close.
First, let me make a simplified version:
Code:
awk '! ($1 in A_Max) || $NF > A_Max[$1] { A_Max[$1]=$NF } END { for (a in A_Max) print A_Max[a],a }' file

Now, the final step is to store the lines in another array, alongside A_Max[]:
Code:
awk '! ($1 in A_Max) || $NF > A_Max[$1] { A_Max[$1]=$NF; L_Max[$1]=$0 } END { for (a in A_Max) print L_Max[a] }' file

The two arrays take some memory; the key field $1 is stored in both arrays.
It would be possible to only use a line array ...
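
One possible way to do that, offered as a sketch and not necessarily what was meant above: keep only the best line per key and recover its last field with split() whenever a new candidate arrives. The names max_rows, L and f are made up for illustration. The for (k in L) loop visits keys in an unspecified order, so pipe the result through sort if the order matters.

```shell
max_rows() {
  awk '
    {
      # Split the stored line for this key (empty on first sight, so n is 0).
      n = split(L[$1], f)
      # Keep the current line if the key is new or its last field is larger.
      if (n == 0 || $NF + 0 > f[n] + 0) L[$1] = $0
    }
    END { for (k in L) print L[k] }
  ' "$@"
}
```

Usage would be max_rows file | sort. The trade-off is one split() call per input line in exchange for dropping the second array.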