Sort based on Multiple Columns in UNIX


 
# 1  
Old 07-16-2013

Hi,

I would like to sort a list in different ways:

1> Keep one line per unique Field 1, choosing the line with the highest Field 4

For instance, input:
Code:
1678923450;11112222333344;11-1x;2_File.xml
1678923450;11112222333344;11-1x;5_File.xml
1234567890;11113333222244;11-1x;3_File.xml

Output:
Code:
1234567890;11113333222244;11-1x;3_File.xml
1678923450;11112222333344;11-1x;5_File.xml

2> Unique on Field 2 and then Field 1, keeping the line with the highest Field 4

For instance, input:
Code:
1234567890;11113333222244;11-1x;3_File.xml
1234567890;11112222333344;11-1x;1_File.xml
1678923450;11112222333344;11-1x;2_File.xml

Output:
Code:
1234567890;11112222333344;11-1x;1_File.xml
1678923450;11112222333344;11-1x;2_File.xml
1234567890;11113333222244;11-1x;3_File.xml


# 2  
Old 07-16-2013
1>
Code:
sort -t\; -k1,1 -k4,4d infile | awk -F\; '!($1 in A) { print $0; A[$1] }'

2>
Code:
sort -t\; -k1,1 -k2,2 -k4,4d infile | awk -F\; '!($1$2 in A) { print $0; A[$1$2] }'

# 3  
Old 07-16-2013
Thanks for the reply.

1> Sorry, but I got this output:
Code:
1234567890;11113333222244;11-1x;3_File.xml
1678923450;11112222333344;11-1x;2_File.xml

2> Output is as expected.

Can you please explain the awk logic you've used, so I can modify it further for my reports?
# 4  
Old 07-16-2013
Sorry, I should have checked my sort parameters. The 4th field was supposed to be sorted in descending order, and for that sort needs the r flag (not d). So your corrected commands are:
Code:
sort -t\; -k1,1 -k4,4r infile | awk -F\; '!($1 in A) { print $0; A[$1] }'
sort -t\; -k1,1 -k2,2 -k4,4r infile | awk -F\; '!($1$2 in A) { print $0; A[$1$2] }'
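
One caveat worth checking against your real data: the r flag on its own compares field 4 as text, so a file number with two digits would rank below one with a single digit. Here is a small sketch of the difference, assuming a made-up 10_File.xml line that is not in your sample; adding n to the key (-k4,4nr) compares the leading number instead.

```shell
# Hypothetical data (not from the thread): one key with file numbers 5 and 10.
data='1678923450;11112222333344;11-1x;5_File.xml
1678923450;11112222333344;11-1x;10_File.xml'

# Reversed text comparison: "5_File.xml" sorts ahead of "10_File.xml".
text_first=$(printf '%s\n' "$data" | sort -t';' -k1,1 -k4,4r | head -n 1)

# Reversed numeric comparison: 10 sorts ahead of 5.
num_first=$(printf '%s\n' "$data" | sort -t';' -k1,1 -k4,4nr | head -n 1)

printf 'text:    %s\nnumeric: %s\n' "$text_first" "$num_first"
```

If your file numbers never go past a single digit, the plain r flag is enough.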

For the awk logic:
-F\; uses a semicolon as the field separator, which lets awk split each line into the fields $1, $2 ... $NF

!($1$2 in A) means only process lines where the value of field 1 concatenated with field 2 (your unique key) is not already an index in the array A

print $0 prints the entire line

A[$1$2] adds field 1 concatenated with field 2 as an index of the array A (just referencing the element creates it)

So it prints only the first line in which each unique key value occurs.
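
To see the pieces working together, here is a minimal sketch that runs the first command on the sample input from the question. The temporary file and the variable names (infile, result) are only for illustration.

```shell
# Sample input from the first question, written to a temporary file.
infile=$(mktemp)
cat > "$infile" <<'EOF'
1678923450;11112222333344;11-1x;2_File.xml
1678923450;11112222333344;11-1x;5_File.xml
1234567890;11113333222244;11-1x;3_File.xml
EOF

# Sort by field 1, then by field 4 in reverse, so the highest field 4
# comes first within each field-1 group. awk then prints a line only
# when its key is not yet in A, and records the key afterwards.
result=$(sort -t';' -k1,1 -k4,4r "$infile" |
  awk -F';' '!($1 in A) { print $0; A[$1] }')

printf '%s\n' "$result"
rm -f "$infile"
```

On this input it should keep the 3_File.xml line for 1234567890 and the 5_File.xml line for 1678923450. The same filter is often written in the shorter form awk -F\; '!A[$1]++', which tests and records the key in one step.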
# 5  
Old 07-17-2013
Thanks.

Is there a way I can add a counter field at the end of each line? The counter should keep track of each occurrence of the 1st field.

For instance:
Code:
1234567890;11113333222244;11-1x;3_File.xml;Try-1
1234567890;11112222333344;11-1x;1_File.xml;Try-2
1678923450;11112222333344;11-1x;2_File.xml;Try-1

# 6  
Old 07-17-2013
Use something like this, using an array named C[] to keep track of your counter.

Code:
sort -t\; -k1,1 -k2,2r -k4,4r infile | awk -F\; '!($1$2 in A) { print $0 ";Try-" ++C[$1]; A[$1$2] }'
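
For reference, a sketch of the same idea run against the three sample lines from the question, with longer array names (seen, count) and an explicit key variable. Those names are my own choice; the logic is unchanged apart from joining the two fields with the separator, which is an optional safeguard against two different field pairs concatenating to the same string.

```shell
# Sample input from the second question, written to a temporary file.
infile=$(mktemp)
cat > "$infile" <<'EOF'
1234567890;11113333222244;11-1x;3_File.xml
1234567890;11112222333344;11-1x;1_File.xml
1678923450;11112222333344;11-1x;2_File.xml
EOF

counted=$(sort -t';' -k1,1 -k2,2r -k4,4r "$infile" |
  awk -F';' '
    { key = $1 FS $2 }                       # unique key: field 1 + field 2
    !(key in seen) {
      print $0 ";Try-" (++count[$1])         # per-field-1 occurrence counter
      seen[key]                              # referencing the element records the key
    }')

printf '%s\n' "$counted"
rm -f "$infile"
```

Each first field gets its own counter, so 1234567890 ends in Try-1 and Try-2 while 1678923450 starts again at Try-1.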

# 7  
Old 07-18-2013
That worked perfectly. Thanks a lot!