Sorting by pairs


 
# 1  
Old 07-13-2015
Sorting by pairs

Can sort sort by pairs of lines?
infile:
Code:
ID 15
GNJSMKSNS
ID 25
GNJSMKSNS
ID 1
GNJSMKSNS

outfile:
Code:
ID 1
GNJSMKSNS
ID 15
GNJSMKSNS
ID 25
GNJSMKSNS

sort -grk 2 obviously does not work
# 2  
Old 07-13-2015
No. The sort utility sorts lines, not groups of lines. You can, however, join each pair of lines into a single line, sort the result, and then split the sorted output back into pairs of lines. Assuming that there aren't any tab characters in your input file, a simple way to do it would be:
Code:
awk '{printf("%s%s",$0,(NR%2)?"\t":"\n")}' infile|sort -k2,2n|tr '\t' '\n'

which, with the following infile contents (modified so we can verify that each even-numbered line in the output still follows the same line it followed in the input file):
Code:
ID 15
GNJSMKSNS 1
ID 25
GNJSMKSNS 2
ID 1
GNJSMKSNS 3

produces the output:
Code:
ID 1
GNJSMKSNS 3
ID 15
GNJSMKSNS 1
ID 25
GNJSMKSNS 2

Of course, if all of the even-numbered lines in your input file are identical (as in your sample), you could delete the even-numbered lines, sort the remaining lines, and reinsert the deleted lines.
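
The delete, sort, and reinsert idea could look roughly like the sketch below. It is only an illustration, not a command from this thread: it assumes every even-numbered line is identical, that the shared line contains no backslashes (awk -v would interpret them), and it recreates the original sample as a file named infile in the current directory.

```shell
# Recreate the sample from post #1 (all even-numbered lines identical).
printf '%s\n' 'ID 15' 'GNJSMKSNS' 'ID 25' 'GNJSMKSNS' 'ID 1' 'GNJSMKSNS' > infile

# Remember the line that is shared by every pair (assumption: line 2
# is representative of all even-numbered lines).
second=$(sed -n '2p' infile)

# Keep only the odd-numbered lines, sort them numerically on field 2,
# then print the remembered line after each sorted line.
result=$(awk 'NR % 2' infile | sort -k2,2n | awk -v s="$second" '{ print; print s }')

printf '%s\n' "$result"
```

If the even-numbered lines can differ, this shortcut loses information, and the join, sort, and split pipeline is the safer choice.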
This User Gave Thanks to Don Cragun For This Post:
# 3  
Old 07-14-2015
This is what I was looking for
Code:
 awk '{printf("%s%s",$0,(NR%2)?"\t":"\n")}' input|sort -rk 3|tr '\t' '\n'

Could you please explain the following parts of the code to me:
"%s%s"
?
:
# 4  
Old 07-14-2015
Quote:
Originally Posted by Xterra
This is what I was looking for
Code:
 awk '{printf("%s%s",$0,(NR%2)?"\t":"\n")}' input|sort -rk 3|tr '\t' '\n'

Could you please explain the following parts of the code to me:
"%s%s"
?
:
awk: Use awk
': with the script:
printf(: print
"%s%s": two strings
,$0: where the 1st string is the current input line (without the trailing <newline> character)
,(NR%2)?"\t":"\n"): and the 2nd string is a <tab> character if the current input line is an odd-numbered line, or a <newline> character if it is an even-numbered line
': ending the awk script
input: naming the file to be processed by awk
|sort -rk 3: sorting the output from awk in decreasing alphanumeric order, using a sort key that starts at the 3rd field and runs to the end of the line (so lines that match on the 3rd field are ordered by the remaining fields after it); lines that match on the whole key are then compared in decreasing alphanumeric order on field 1 and, if they still match, on field 2
|tr '\t' '\n': and, finally, converting the <tab> characters in the joined lines sorted by sort back into <newline> characters (thereby splitting the lines apart again).

Note: I originally said that fields 1 and 2 would be sorted in increasing order (which would have been true with sort -k 3r), but, with sort -rk 3, all fields will be sorted in decreasing order.
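
To make the ?: part of the explanation above concrete, here is a small sketch (an illustration, not code posted in the thread) that runs the joining step on its own, next to an equivalent if/else version. The file name input and the variable names joined, joined_long, and sep are chosen for the example; the sample data is the modified infile from post #2.

```shell
# Recreate the sample data used earlier in the thread.
printf '%s\n' 'ID 15' 'GNJSMKSNS 1' 'ID 25' 'GNJSMKSNS 2' 'ID 1' 'GNJSMKSNS 3' > input

# Ternary form: condition ? value-if-true : value-if-false.
# NR % 2 is 1 (true) on odd-numbered lines and 0 (false) on even ones.
joined=$(awk '{ printf("%s%s", $0, (NR % 2) ? "\t" : "\n") }' input)

# The same logic spelled out with if/else.
joined_long=$(awk '{
    if (NR % 2)
        sep = "\t"    # odd line: stay on the same output line
    else
        sep = "\n"    # even line: finish the output line
    printf("%s%s", $0, sep)
}' input)

# Show the joined lines with each <tab> made visible as "|".
printf '%s\n' "$joined" | tr '\t' '|'
```

Both forms should produce the same three joined lines, one per pair, which is what sort then reorders.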

Last edited by Don Cragun; 07-14-2015 at 03:25 PM.. Reason: Fix sort behavior description.
These 3 Users Gave Thanks to Don Cragun For This Post:
# 5  
Old 07-14-2015
Alternative to awk

Hi.

Some versions of paste can do the stitching job:
Code:
$ paste - - < input
ID 15	GNJSMKSNS 1
ID 25	GNJSMKSNS 2
ID 1	GNJSMKSNS 3

Then sort the result and sever the super-lines again. See man paste for details.

This was run on the following system:
Code:
OS, ker|rel, machine: Linux, 2.6.26-2-amd64, x86_64
Distribution        : Debian 5.0.8 (lenny, workstation) 
paste (GNU coreutils) 6.10
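
Putting the paste step together with the sort and tr steps from post #2, the whole pipeline might look like the sketch below. This combination is not spelled out in the thread; it assumes the input contains no <tab> characters and an even number of lines, and it recreates the sample data as a file named input.

```shell
# Recreate the sample data from post #2.
printf '%s\n' 'ID 15' 'GNJSMKSNS 1' 'ID 25' 'GNJSMKSNS 2' 'ID 1' 'GNJSMKSNS 3' > input

# paste - - reads two lines from standard input per output line and
# joins them with a <tab>; sort orders the joined lines numerically
# on field 2; tr splits each joined line back into a pair.
sorted=$(paste - - < input | sort -k2,2n | tr '\t' '\n')

printf '%s\n' "$sorted"
```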

Best wishes ... cheers, drl
This User Gave Thanks to drl For This Post:
 