Identify max value in diff columns for same row


 
# 1  
Old 01-21-2014

Hi,

I have a file with 1M records:

Code:
ABC 200 400 2.4 5.6
ABC 410 299 12  1.5
XYZ 4 5 6 7
MNO 22 40 30 70
MNO 47 55 80 150

Where the first column is duplicated, I want a single row that holds the maximum value of each column across those duplicate rows.

Expected output:
Code:
ABC 410 400 12 5.6
XYZ 4 5 6 7
MNO 47 55 80 150

How can I do this in awk/unix?

Thanks,
# 2  
Old 01-21-2014
If it is OK that the order is not preserved:
Code:
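awk '
        !( $1 in A ) {          # first record for this key: store it as-is
                A[$1] = $0
                next
        }
        # Key seen before: keep the larger value in each column.
        {
                n = split ( A[$1], R )
                s = $1          # start from the key, so a value of 0 is never treated as empty
                for ( i = 2; i <= n; i++ )
                {
                        R[i] = R[i] > $i ? R[i] : $i
                        s = s FS R[i]
                }
                A[$1] = s
        }
        END {
                for ( k in A )
                        print A[k]
        }
' file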
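
If the order ever does matter, a variant along these lines might work. It is only a sketch: the function name max_per_key is made up for illustration, and it assumes whitespace-separated input with numeric values in every column after the first. It remembers the order in which keys first appear and keeps one maximum per key and column.

```shell
# Hypothetical helper: reads records on stdin and prints one line per key,
# in first-seen order, with the per-column maximum.
max_per_key() {
    awk '
        NF == 0 { next }                      # skip blank lines
        {
            if (!($1 in nf))
                order[++n] = $1               # remember first-seen order of keys
            if (NF > nf[$1])
                nf[$1] = NF                   # widest record seen for this key
            for (i = 2; i <= NF; i++)
                # "+ 0" forces a numeric comparison; the original text is stored
                if (!(($1, i) in max) || $i + 0 > max[$1, i] + 0)
                    max[$1, i] = $i
        }
        END {
            for (j = 1; j <= n; j++) {
                key = order[j]
                line = key
                for (i = 2; i <= nf[key]; i++)
                    line = line OFS max[key, i]
                print line
            }
        }
    '
}
```

Usage would be something like max_per_key < file. For tab-separated input, the awk call inside the function would need -F'\t' -v OFS='\t' added.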

# 3  
Old 01-21-2014
Thanks,

The order does not matter. I tried with my data and it did not work. Here is my original data

Code:


Last edited by Diya123; 01-21-2014 at 07:09 PM..
# 4  
Old 01-21-2014
It looks like your original data is tab-separated. Apply the change below to the code and retry:
Code:
awk -F'\t' '
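
To confirm the separator before retrying, one option is to count how many tab-separated fields each line has. This is a rough sketch and the helper name count_tab_fields is hypothetical. If nearly every line reports the same field count greater than 1, the file is very likely tab-separated.

```shell
# Hypothetical helper: reads stdin and prints "<field count> <number of lines>"
# for every distinct number of tab-separated fields found.
count_tab_fields() {
    awk -F'\t' '
        { count[NF]++ }                       # tally lines by number of fields
        END { for (nf in count) print nf, count[nf] }
    '
}
```

Usage would be something like count_tab_fields < file.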

# 5  
Old 01-21-2014
I tried that.

It gives the following error:

TN_rpkm: line 919: linc-TMEM183B-1: command not found
TN_rpkm: line 920: linc-PRELP: command not found
# 6  
Old 01-21-2014
The code that I posted has just 21 lines.

But the error that you posted reports a problem at lines 919 and 920!
# 7  
Old 01-21-2014
Sorry, I messed something up at my end. I checked a few rows and it worked perfectly fine.

Thanks,