UNIX multiple column sort


 
# 1  
Old 08-25-2015

Hi,

I have a text file with data like this:

Code:
chr1    156106712       156106819       LMNA    8       +       1       147
chr1    156106712       156106819       LMNA    8       +       2       147
chr1    156106712       156106819       LMNA    8       +       3       148
chr1    156106712       156106819       LMNA    8       +       4       151
chr1    156106712       156106819       LMNA    8       +       5       151
chr1    156106712       156106819       LMNA    8       +       6       155
chr1    156106712       156106819       LMNA    8       +       7       155
chr1    10436603        10436645        KIF1B   46      +       1       207
chr1    10436603        10436645        KIF1B   46      +       2       207
chr1    10436603        10436645        KIF1B   46      +       3       211
chr1    10436603        10436645        KIF1B   46      +       4       212
chr1    10436603        10436645        KIF1B   46      +       5       212
chr1    10436603        10436645        KIF1B   46      +       6       212
chr1    10436603        10436645        KIF1B   46      +       7       208
chr1    10436603        10436645        KIF1B   46      +       8       207
chr1    10436603        10436645        KIF1B   46      +       9       207
chr1    10436603        10436645        KIF1B   46      +       10      207
chr1    10436603        10436645        KIF1B   46      +       11      205
chr1    10436603        10436645        KIF1B   46      +       12      202
chr1    10436603        10436645        KIF1B   46      +       13      199
chr1    10436603        10436645        KIF1B   46      +       14      199
chr1    10436603        10436645        KIF1B   46      +       15      198
chr1    10436603        10436645        KIF1B   46      +       16      198
chr1    10436603        10436645        KIF1B   46      +       17      198

I want the output sorted by chromosome position, then by columns 5 and 7:

Code:
chr1    10436603        10436645        KIF1B   46      +       2       207
chr1    10436603        10436645        KIF1B   46      +       3       211
chr1    10436603        10436645        KIF1B   46      +       4       212
chr1    10436603        10436645        KIF1B   46      +       5       212
chr1    10436603        10436645        KIF1B   46      +       6       212
chr1    10436603        10436645        KIF1B   46      +       7       208
chr1    10436603        10436645        KIF1B   46      +       8       207
chr1    10436603        10436645        KIF1B   46      +       9       207
chr1    10436603        10436645        KIF1B   46      +       10      207
chr1    10436603        10436645        KIF1B   46      +       11      205
chr1    10436603        10436645        KIF1B   46      +       12      202
chr1    10436603        10436645        KIF1B   46      +       13      199
chr1    10436603        10436645        KIF1B   46      +       14      199
chr1    10436603        10436645        KIF1B   46      +       15      198
chr1    10436603        10436645        KIF1B   46      +       16      198
chr1    10436603        10436645        KIF1B   46      +       17      198
chr1    156106712       156106819       LMNA    8       +       1       147
chr1    156106712       156106819       LMNA    8       +       2       147
chr1    156106712       156106819       LMNA    8       +       3       148
chr1    156106712       156106819       LMNA    8       +       4       151
chr1    156106712       156106819       LMNA    8       +       5       151
chr1    156106712       156106819       LMNA    8       +       6       155
chr1    156106712       156106819       LMNA    8       +       7       155

How can I achieve this?

I tried sort -k1,1 -k2,2n but it did not work. Thanks in advance.

Last edited by Don Cragun; 08-25-2015 at 06:09 PM.. Reason: Add CODE and ICODE tags.
# 2  
Old 08-25-2015
Quote:
Originally Posted by mitt
I tried sort -k1,1 -k2,2n but it did not work. Thanks in advance.

Assuming that you don't really want to throw away the KIF1B input line with 1 in column 7 (it is missing from your desired output), and assuming that the chromosome position you want to sort on is in the 2nd column, it would seem that the command you want is:
Code:
sort -k2,2n -k5,5n -k7,7n file

where file is the name of the file you want to sort.
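
As a rough illustration of how these keys behave: each -kN,Nn limits a key to field N alone and compares it numerically, and a later key only matters when all earlier keys tie. The sketch below is a minimal example, assuming whitespace-separated fields; the four rows are taken from the data in the question, and the variable name sorted is made up for the demo.

```shell
# Four rows from the question, deliberately out of order.
# "sorted" is a hypothetical variable name used only for this demo.
sorted=$(sort -k2,2n -k5,5n -k7,7n <<'EOF'
chr1 156106712 156106819 LMNA 8 + 2 147
chr1 10436603 10436645 KIF1B 46 + 10 207
chr1 10436603 10436645 KIF1B 46 + 9 207
chr1 156106712 156106819 LMNA 8 + 1 147
EOF
)

# Key 1 (field 2, numeric) puts the KIF1B rows first;
# key 3 (field 7, numeric) then orders 9 before 10 and 1 before 2.
printf '%s\n' "$sorted"
```

With only -k1,1 -k2,2n, as in the original attempt, rows that tie on both keys are most likely ordered by comparing the whole line as text, so a 10 in column 7 can land before a 2. Naming columns 5 and 7 as numeric keys avoids that.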
# 3  
Old 08-25-2015
Try with the right columns. What is the chromosome position? Column 4?
Code:
sort -k4,4 -k5,5n -k7,7n

# 4  
Old 08-25-2015
Quote:
Originally Posted by Don Cragun
Assuming that you don't really want to throw away the KIF1B input line with 1 in column 7 (it is missing from your desired output), and assuming that the chromosome position you want to sort on is in the 2nd column, it would seem that the command you want is:
Code:
sort -k2,2n -k5,5n -k7,7n file

where file is the name of the file you want to sort.
Thanks for this. It works, but I need to sort on the first column too. As it stands, a line for chromosome 12 with a lower start position than a line for chromosome 1 would appear first. I need to sort by chromosome and then by position.
# 5  
Old 08-25-2015
Assuming that the first column is ALWAYS "chr" followed by the chromosome number and that you want to perform a numeric sort on the chromosome number rather than an alphanumeric sort on the entire 1st field:
Code:
sort -k1.4,1n -k2,2n -k5,5n -k7,7n file
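
To see what the 1.4 character offset changes, here is a small sketch comparing a text sort on the whole first field with a numeric sort that starts at the 4th character. The chr12 and chr2 rows (and the names GENEA and GENEB) are invented for the demo and are not from the original data; LC_ALL=C is set only to make the text ordering predictable.

```shell
# Hypothetical rows: only the KIF1B line comes from the question.
rows='chr12 500 600 GENEA 3 + 1 10
chr2 900 950 GENEB 7 + 1 20
chr1 10436603 10436645 KIF1B 46 + 1 207'

# Text comparison of the whole first field: "chr12" sorts before "chr2".
text_order=$(printf '%s\n' "$rows" | LC_ALL=C sort -k1,1 -k2,2n |
    cut -d' ' -f1 | paste -sd' ' -)

# Numeric comparison starting at character 4 of field 1 (the digits after "chr").
num_order=$(printf '%s\n' "$rows" | LC_ALL=C sort -k1.4,1n -k2,2n -k5,5n -k7,7n |
    cut -d' ' -f1 | paste -sd' ' -)

echo "text:    $text_order"   # chr1 chr12 chr2
echo "numeric: $num_order"    # chr1 chr2 chr12
```

This relies on the stated assumption that the first column is always "chr" plus a number. A non-numeric name such as chrX would have no digits in the key, so it would probably not sort where you expect.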
