Sort with UNIX


 
# 1  
Old 10-18-2010

For each unique value in column 2, I want to keep the line that has the maximum value in column 7. Here is my file and the desired output.


I have a file like this:


Code:
AD008 AD0081010180947 101018 0947 0950 1010180947 1010180950
AD008 AD0081010180947 101018 0947 0956 1010180947 1010180956
AD008 AD0081010180947 101018 0947 1014 1010180947 1010181014
AD008 AD0081010180947 101018 0947 0956 1010180947 1010180956
ED006 ED0061010180949 101018 0949 0956 1010180949 1010180956
ED006 ED0061010180949 101018 0949 0956 1010180949 1010180956
ED007 ED0071010180949 101018 0949 0956 1010180949 1010180956
ED011 ED0111010180949 101018 0949 0956 1010180949 1010180956
ED011 ED0111010180949 101018 0949 1201 1010180949 1010181201
ED011 ED0111010180949 101018 0949 0956 1010180949 1010180956
ED011 ED0111010180949 101018 0949 0956 1010180949 1010180956
ED011 ED0111010180949 101018 0949 0956 1010180949 1010180956
ED011 ED0111010180949 101018 0949 0956 1010180949 1010180956
KW005 KW0051010180950 101018 0950 0950 1010180950 1010180950
KW005 KW0051010180950 101018 0950 0956 1010180950 1010180956
OG010 OG0101010181002 101018 1002 0956 1010181002 1010180956
OG010 OG0101010181002 101018 1002 1123 1010181002 1010181123
OG010 OG0101010181002 101018 1002 0956 1010181002 1010180956

to give:

Code:
ED011 ED01171010180949 101018 0949 1201 1010180949 1010181201
OG010 OG01001010181002 101018 1002 1123 1010181002 1010181123
AD008 ED00801010180947 101018 0947 1014 1010180947 1010181014
ED006 ED00631010180949 101018 0949 0956 1010180949 1010180956
ED007 ED00741010180949 101018 0949 0956 1010180949 1010180956
KW005 KW00521010180950 101018 0950 0956 1010180950 1010180956

# 2  
Old 10-18-2010
Like this?
Code:
$ awk '$7>M[$1]{M[$1]=$7;A[$1]=$0}END{for(i in A)print A[i]}' infile427
KW005 KW0051010180950 101018 0950 0956 1010180950 1010180956
OG010 OG0101010181002 101018 1002 1123 1010181002 1010181123
ED011 ED0111010180949 101018 0949 1201 1010180949 1010181201
ED006 ED0061010180949 101018 0949 0956 1010180949 1010180956
AD008 AD0081010180947 101018 0947 1014 1010180947 1010181014
ED007 ED0071010180949 101018 0949 0956 1010180949 1010180956


Last edited by Scrutinizer; 10-18-2010 at 09:08 AM..
# 3  
Old 10-18-2010
Quote:
Originally Posted by Scrutinizer
Like this?
Code:
$ awk '$7>M[$1]{M[$1]=$7;A[$1]=$0}END{for(i in A)print A[i]}' infile427
KW005 KW0051010180950 101018 0950 0956 1010180950 1010180956
OG010 OG0101010181002 101018 1002 1123 1010181002 1010181123
ED011 ED0111010180949 101018 0949 1201 1010180949 1010181201
ED006 ED0061010180949 101018 0949 0956 1010180949 1010180956
AD008 AD0081010180947 101018 0947 1014 1010180947 1010181014
ED007 ED0071010180949 101018 0949 0956 1010180949 1010180956

OMG, awk is darn powerful... I really should get deeper into it.

Could someone kindly comment the code, please?
1. Does awk execute M[$1]=$7 before checking the condition $7>M[$1]?
2. In for(i in A)print A[i], doesn't i need to be initialized? Does it scan the A array automatically?
3. How does it know which [$1] index values to use when scanning this array?

Thanks in advance for enlightening me.

Last edited by ctsgnb; 10-18-2010 at 09:52 AM..
# 4  
Old 10-18-2010
Quote:
Originally Posted by Scrutinizer
Like this?
Code:
$ awk '$7>M[$1]{M[$1]=$7;A[$1]=$0}END{for(i in A)print A[i]}' infile427

It should compare on $2:

Code:
awk '$7>M[$2]{M[$2]=$7;A[$2]=$0}END{for(i in A)print A[i]}' infile
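For anyone who wants the one-liner spelled out, here is a commented sketch of the same idea. The wrapper name max_per_key is made up for illustration, and the +0 (to force a numeric comparison) and the trailing sort are additions of mine, not part of the posted command. The sort is there because for..in returns the lines in no particular order, while the desired output in post #1 seems to be ordered by column 7, highest first.

```shell
# Sketch: for each distinct $2 keep the line with the largest $7,
# then list the kept lines by column 7, highest first.
# max_per_key is a hypothetical helper name, not from the thread.
max_per_key() {
    awk '
        # M[key] is the largest $7 seen so far for that key.
        # An unset M[key] counts as 0, so the first line for a key always wins.
        $7 + 0 > M[$2] + 0 { M[$2] = $7; A[$2] = $0 }
        # for..in visits the stored keys in an unspecified order.
        END { for (k in A) print A[k] }
    ' "$1" | sort -k7,7nr
}
```

Usage would be something like max_per_key infile, assuming the data from post #1 is saved as infile.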

# 5  
Old 10-18-2010
Quote:
Originally Posted by ctsgnb
OMG awk is darn powerful ... i really should get deeper into it

Could someone kindly comment the code, please?
1. Does awk execute M[$2]=$7 before checking the condition $7>M[$2]?
2. In for(i in A)print A[i], doesn't i need to be initialized? Does it scan the A array automatically?
3. How does it know which [$2] index values to use when scanning this array?

Thanks in advance for enlightening me.
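I adjusted $1 to $2 in your question.
1. No, the condition is checked first. M[$2] does not exist when $7>M[$2] is evaluated the first time for a given $2, but in awk this is no problem: an unset element compares as 0 (or as an empty string), so $7 is the bigger value and the action runs.
2. Yes: i needs no initialization, and for..in scans the array automatically.
3. The arrays in awk are associative. The for..in construct iterates over the indices of the elements present, in no particular order.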
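The three answers above can be checked with a couple of throwaway commands. This is only a small sketch on made-up two-column input (not the data from post #1), and the variable names first_compare and keys are just for illustration.

```shell
# Point 1: the condition is tested before the action, and an element that
# was never assigned compares as 0, so a positive value is "bigger".
first_compare=$(echo 'k 5' |
    awk '{ if ($2 > M[$1]) print "bigger"; else print "not bigger" }')
echo "$first_compare"

# Points 2 and 3: i is not initialized anywhere; for..in hands it each
# stored index in turn. The order is unspecified, hence the sort.
keys=$(printf 'b 1\na 2\nb 3\n' |
    awk '{ A[$1] = $2 } END { for (i in A) print i, A[i] }' | sort)
echo "$keys"
```

The second command stores two indices (a and b), because the later "b 3" line overwrites the element created by "b 1".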

Last edited by Scrutinizer; 10-18-2010 at 10:16 AM..
# 6  
Old 10-18-2010
Quote:
Originally Posted by Scrutinizer
I adjusted $1 to $2 in your question.
1. M[$2] does not exist when $7>M[$2] gets executed the first time. But in awk this is no problem and it evaluates to $7 being bigger.
2. yes
3. The arrays in awk are associative. The for..in construct iterates over the elements present.
awk arrays are associative?

Does that mean we don't care about the index value [$2]?
I mean: this $2 value is not actually the index itself, but is mapped to an internal counter (**) for the current A[i] element?
(**) This internal counter would be the real thing the for construct iterates over.

Is that correct, or am I missing something?
# 7  
Old 10-18-2010
Hi, no. In the END section, the array element indices are equal to the $2 fields of the input records.
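A quick way to convince yourself of this, as a rough sketch using two lines from the sample data (the variable name index_demo is only for illustration): the literal $2 text can be looked up with the in operator, while a position such as 1 is not an index of the array.

```shell
index_demo=$(printf '%s\n' \
    'ED006 ED0061010180949 101018 0949 0956 1010180949 1010180956' \
    'KW005 KW0051010180950 101018 0950 0956 1010180950 1010180956' |
  awk '{ A[$2] = $0 }
       END {
           # The subscript is the literal $2 text, so it can be tested directly.
           if ("KW0051010180950" in A) print "found KW0051010180950"
           # A position such as 1 is not a subscript of this array.
           if (!(1 in A)) print "no element 1"
           # Count the stored subscripts.
           for (i in A) n++
           print n, "elements"
       }')
echo "$index_demo"
```

So inside for(i in A), i takes the values ED0061010180949 and KW0051010180950 themselves, not 1 and 2.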