Sort based on certain value in a column


 
# 1  
Old 05-02-2014
Maybe I didn't understand what you're trying to do. I thought the Col_n was supposed to cause the following lines up to the next Col_n line to be sorted in increasing alphanumeric order based on the nth character in the line.

Unfortunately, with the sample data given, there is no way to tell if you're trying to sort on the whole line or on the nth character since the results would be the same. If you are trying to sort on the nth character, rdrtx1's script won't do that. Assuming that there aren't any spaces or tabs in your input file (at least not before the character position that is to be sorted), the following might work:
Code:
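awk '
# Close the pipe to the current sort command (if any) so that sort
# writes out the group it has collected.
function finish() {
	if(sc != "") close(sc)
}

/^Col_/ {
	# Group header: finish the previous group, then print the header.
	finish()
	print
	# The text after "Col_" is the character position to sort on.
	col = substr($0, 5)
	# Build a sort command whose key is that one character of field 1.
	sc = sprintf("sort -k1.%d,1.%d", col, col)
	next
}
{	# Data line: send it to the sort command for the current group.
	print | sc
}
END {	# Finish the last group.
	finish()
}' File1.txt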
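
To see what the sort key built by that script does on its own, here is a minimal sketch that runs the same `-k1.N,1.N` key directly on one group. The function name `sort_on_char` and the `LC_ALL=C` setting are my additions for illustration (the script above uses whatever locale is in effect):

```shell
# Sort standard input on a single character position.
# $1 is the character position; -k1.N,1.N makes the key start and end
# at character N of field 1.
# LC_ALL=C is an assumption added here so comparison is by byte value.
sort_on_char() {
	LC_ALL=C sort -k"1.$1,1.$1"
}

# The Col_4 group from the sample data, sorted on character 4.
printf '%s\n' Abc4123 Cde3234 Bcd2345 Def1234 | sort_on_char 4
```

A line that is shorter than N characters has an empty key and so sorts first, which is why SW_MH3_ST leads the Col_10 group.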

If your input file contains:
Code:
Col_1
SW_MH2_ST
ST_F72_9S
SW_MH3_S6
Col_10
SW_MH3_AS7
ST_S15_9CH
SW_MH3_AS8
SW_MH3_ST
Col_5
ST_M93_SZ
ST_C16_TC
Col_4
Abc4123
Cde3234
Bcd2345
Def1234

rdrtx1's script will produce:
Code:
Col_1
ST_F72_9S
SW_MH2_ST
SW_MH3_S6
Col_10
ST_S15_9CH
SW_MH3_AS7
SW_MH3_AS8
SW_MH3_ST
Col_5
ST_C16_TC
ST_M93_SZ
Col_4
Abc4123
Bcd2345
Cde3234
Def1234

while the script above will produce:
Code:
Col_1
ST_F72_9S
SW_MH2_ST
SW_MH3_S6
Col_10
SW_MH3_ST
SW_MH3_AS7
SW_MH3_AS8
ST_S15_9CH
Col_5
ST_C16_TC
ST_M93_SZ
Col_4
Def1234
Bcd2345
Cde3234
Abc4123

If there are spaces in your input file, you need to specify a field separator in the sort command naming a character that can never appear in your input file. If there are tabs in the input file and you want to sort based on output line positions (rather than input character counts), you would have to expand input tabs to a variable number of spaces depending on where in the input line the tab(s) appear. And if you want to sort on output print positions and there are backspace characters in the input, you will need to give a much clearer explanation of what is supposed to happen.
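
As a rough sketch of the field separator idea, assuming `|` is a character that never appears in the data (that choice, the function name `sort_on_char_sep`, and `LC_ALL=C` are assumptions for illustration only):

```shell
# Sort standard input on character position $1, counting from the
# start of the line even when the line contains spaces.
# -t names a separator that is assumed never to occur in the input,
# so the whole line is treated as field 1.
sort_on_char_sep() {
	LC_ALL=C sort -t '|' -k"1.$1,1.$1"
}

# Character 4 of these lines is 1, 3 and 2 respectively.
printf '%s\n' 'zz 1x' 'aa 3y' 'mm 2z' | sort_on_char_sep 4
```

For input containing tabs, the standard expand utility replaces each tab with the number of spaces needed to reach the next tab stop, so piping the data through expand first would give print positions. Note that the sorted output would then contain spaces where the input had tabs.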

If you want to try the above awk script on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk, /usr/xpg6/bin/awk, or nawk.
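
If the same script has to run on several systems, one possible approach is to pick the awk at run time. This is only a sketch: it assumes a POSIX shell (for `command -v` and `$(...)`), and the function name `pick_awk` and the order of the candidates are my own choices:

```shell
# Print the first awk from the list that this shell can find.
pick_awk() {
	for candidate in /usr/xpg4/bin/awk /usr/xpg6/bin/awk nawk awk; do
		if command -v "$candidate" >/dev/null 2>&1; then
			printf '%s\n' "$candidate"
			return 0
		fi
	done
	return 1
}

AWK=$(pick_awk) || { echo "no awk found" >&2; exit 1; }
# Quick check that the chosen awk runs.
"$AWK" 'BEGIN { print "ok" }'
```

The awk scripts in this thread could then be started with "$AWK" in place of awk.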
# 2  
Old 05-02-2014
Hi Don Cragun,

Thanks so much for the kind explanation. I really appreciate it.
For my files right now, the Col_n lines are already in alphanumeric order and I just need to sort the members of each Col_n group in ascending order, which is why rdrtx1's script works perfectly. As for your code, I tried to understand it, as it would be a big help if I need to sort on the nth character in the future. I tried your code, but I am wondering why "Col_5" comes before "Col_4" and why the Col_4 members are sorted in descending order. I hope you can help explain this. Thanks.
# 3  
Old 05-02-2014
Your description of what to do was vague. You said:
Quote:
the output that i want is to sort based on "Col_X" (X is the number)
I thought that meant you were trying to sort lines following lines of the form Col_n in increasing alphanumeric order based on the nth column (i.e., input character position) in those lines. So, the output produced by my script is sorted on the character position noted in the comment on each Col_n line:
Code:
Col_1   # following lines are sorted on character 1
ST_F72_9S
SW_MH2_ST
SW_MH3_S6
Col_10  # following lines are sorted on character 10
SW_MH3_ST
SW_MH3_AS7
SW_MH3_AS8
ST_S15_9CH
Col_5   # following lines are sorted on character 5
ST_C16_TC
ST_M93_SZ
Col_4   # following lines are sorted on character 4
Def1234
Bcd2345
Cde3234
Abc4123
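
To make the sort key visible, here is a small sketch that prints the nth character in front of each line. The function name `show_key` is hypothetical and it assumes an awk that accepts -v:

```shell
# Prefix each input line with the character at position $1,
# i.e. the character that the sort key selects.
show_key() {
	awk -v n="$1" '{ print substr($0, n, 1), $0 }'
}

# The Col_4 group in the order my script printed it.
printf '%s\n' Def1234 Bcd2345 Cde3234 Abc4123 | show_key 4
```

The first column of the output reads 1, 2, 3, 4, so the group is in increasing order on character 4 even though the whole lines look like they are in descending order.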

# 4  
Old 05-02-2014
Hi,

OK, now I get what you meant. Sorry for the confusion. Actually, "Col_X" here just represents groups. I just want to sort the members of each group in ascending order. Thanks.
# 5  
Old 05-02-2014
Quote:
Originally Posted by redse171
Hi,

OK, now I get what you meant. Sorry for the confusion. Actually, "Col_X" here just represents groups. I just want to sort the members of each group in ascending order. Thanks.
OK. You asked how rdrtx1's script works. Basically, it reads the data from your input file into two arrays: one containing the separator lines, and the other containing the data for each group. After it has read all of the data, it prints the group name line and uses the sort utility to sort the data it accumulated for each group. A slightly simpler awk script with comments is:
Code:
awk '
/^Col_/ {
	# Group separator found.
	# Finish sorting the previous group, if there was a previous group.
	if(NR != 1) close("sort")
	# Print separator for next group.
	print
	# Skip to next line of input.
	next
}
{	# Send data for current group to sort command...
	print | "sort"
}
END {	# Finish sorting the last group.
	close("sort")
}' File1.txt

This will not work if the 1st line in your input file does not start with Col_ and will probably produce an error message if your input file is an empty file, but I assume neither of these is a problem. In practice, the END clause could be left off, because awk closes any pipelines that are still open when it exits, but it is good practice to explicitly close() any pipelines you open. The close() in the Col_ clause has to stay, since it is what makes each group get sorted separately.
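
If either of those cases ever does matter, a variant along the following lines might sidestep them by remembering whether a pipe is actually open. This is only a sketch: the function name `group_sort` and the variable `open` are my own, and treating any lines before the first Col_ line as an unnamed group is an assumption about what you would want:

```shell
# Sort the members of each Col_ group; reads the named files, or
# standard input if no file is given.
group_sort() {
	awk '
	/^Col_/ {
		# Close the pipe only if a previous group opened one.
		if (open) { close("sort"); open = 0 }
		print
		next
	}
	{	# Send data for the current group to sort and note that
		# the pipe is now open.
		print | "sort"
		open = 1
	}
	END {	# Finish the last group, if there was one.
		if (open) close("sort")
	}' "$@"
}

printf '%s\n' Col_1 b a Col_2 d c | group_sort
```

With an empty input file nothing is opened and nothing is closed, so there is nothing for awk to complain about.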
# 6  
Old 05-02-2014
Hi Don Cragun,

Many thanks!!! Really appreciate that.