Sponsored Content
Top Forums Shell Programming and Scripting Sort and join multiple columns using awk Post 302763927 by Yoda on Wednesday 30th of January 2013 04:52:53 PM
Old 01-30-2013
Here is another dirty approach using join command.

Since join command cannot accept more than 2 files at a time, we have to split the orignal file and perform the join operation:
Code:
awk -F'\t' '{ print $1,$2 > "file1"; print $3,$4 > "file2"; print $5,$6 > "file3"; print $7,$8 > "file4"; print $9,$10 > "file5"; }' filename
join -a1 -1 1 -2 1 -o 1.1 1.2 2.2 -e "0" file1 file2 > j1
join -a1 -1 1 -2 1 -o 1.1 1.2 1.3 2.2 -e "0" j1 file3 > j2
join -a1 -1 1 -2 1 -o 1.1 1.2 1.3 1.4 2.2 -e "0" j2 file4 > j3
join -a1 -1 1 -2 1 -o 1.1 1.2 1.3 1.4 1.5 2.2 -e "0" j3 file5


Last edited by Yoda; 01-30-2013 at 07:05 PM.. Reason: set field separator to tab spaces
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

awk or python to join alternating columns

Hi, here is my input TEST-SAM-A6ZZ SM-S6KV 0 0 2 1 2 3 3 3 2 2 2 2 1 1 2 2 1 3 1 3 1361 SM-HA4Q 0 0 2 1 3 3 3 3 2 2 2 2 1 1 2 2 1 3 3 3 4 2 4 2 2 276217 SM-H9ZG 0 0 2 2 3 3 3 3 2 2 2 2 1 1 2 2 3 3 1 3 2 2 2 2 GC15458 SM-HAQX 0 0 2 1 2 3 1 3 2 2 2 2 1 1 2 2 1 3 1 3 2 2 2 18331 SM-HA5E 0 0 2 2... (2 Replies)
Discussion started by: genehunter
2 Replies

2. UNIX for Dummies Questions & Answers

Join 2 files with multiple columns: awk/grep/join?

Hello, My apologies if this has been posted elsewhere, I have had a look at several threads but I am still confused how to use these functions. I have two files, each with 5 columns: File A: (tab-delimited) PDB CHAIN Start End Fragment 1avq A 171 176 awyfan 1avq A 172 177 wyfany 1c7k A 2 7... (3 Replies)
Discussion started by: InfoSeeker
3 Replies

3. Shell Programming and Scripting

sort by based on multiple columns

Hi, Is there any way to sort a file in cshell by sort command, sorting it by multiple fields, like to sort it first by the second column and then by the first column. Thanks forhead (1 Reply)
Discussion started by: Takeeshe
1 Replies

4. Shell Programming and Scripting

sort on multiple columns

Howdy! Need to sort a large .txt file containing the following, using sort. First based on the 1st column, and then on the 2nd column: Group01.01 1000500 31 0.913 -1.522974494 Group01.01 1001500 16 0.684 -0.967496041 Group01.01 36500 19 0.476 na Group01.02 365500 15 0.400 na... (1 Reply)
Discussion started by: sramirez
1 Replies

5. Shell Programming and Scripting

sort on multiple columns

hi all, i have a file , having few columns. i wanted to sort it based on 2nd column and then based on 1st column. But i have some problem in first column. first column have characters and numbers, but problem is number of characters are not same in all rows. Few rows have 13 characters and then... (3 Replies)
Discussion started by: deepakiniimt
3 Replies

6. Programming

Ls -ltr Sort multiple columns

Hi All, I have one requirement, where I need to have output of ls -l command sorted on 1) first on filename 2) last modified time ( descending ) - latest change first. I am not able to figure out how to do it.. Also I dont have a way to change Date display for ls -ltr command.. I am... (1 Reply)
Discussion started by: freakabhi
1 Replies

7. Shell Programming and Scripting

Join two files combining multiple columns and produce mix and match output

I would like to join two files when two columns in each file matches with each other and then produce an output when taking multiple columns. Like I have file A 1234,ABCD,23,JOHN,NJ,USA 2345,ABCD,24,SAM,NY,USA 5678,GHIJ,24,TOM,NY,USA 5678,WXYZ,27,MAT,NJ,USA and file B ... (2 Replies)
Discussion started by: mady135
2 Replies

8. Shell Programming and Scripting

Join columns across multiple lines in a Text based on common column using BASH

Hello, I have a file with 2 columns ( tableName , ColumnName) delimited by a Pipe like below . File is sorted by ColumnName. Table1|Column1 Table2|Column1 Table5|Column1 Table3|Column2 Table2|Column2 Table4|Column3 Table2|Column3 Table2|Column4 Table5|Column4 Table2|Column5 From... (6 Replies)
Discussion started by: nv186000
6 Replies

9. Shell Programming and Scripting

Join and merge multiple files with duplicate key and fill void columns

Join and merge multiple files with duplicate key and fill void columns Hi guys, I have many files that I want to merge: file1.csv: 1|abc 1|def 2|ghi 2|jkl 3|mno 3|pqr file2.csv: (5 Replies)
Discussion started by: yjacknewton
5 Replies

10. Shell Programming and Scripting

Join, merge, fill NULL the void columns of multiples files like sql "LEFT JOIN" by using awk

Hello, This post is already here but want to do this with another way Merge multiples files with multiples duplicates keys by filling "NULL" the void columns for anothers joinning files file1.csv: 1|abc 1|def 2|ghi 2|jkl 3|mno 3|pqr file2.csv: 1|123|jojo 1|NULL|bibi... (2 Replies)
Discussion started by: yjacknewton
2 Replies
JOIN(1) 						      General Commands Manual							   JOIN(1)

NAME
join - relational database operator SYNOPSIS
join [-an] [-e s] [-o list] [-tc] file1 file2 DESCRIPTION
Join forms, on the standard output, a join of the two relations specified by the lines of file1 and file2. If file1 is `-', the standard input is used. File1 and file2 must be sorted in increasing ASCII collating sequence on the fields on which they are to be joined, normally the first in each line. There is one line in the output for each pair of lines in file1 and file2 that have identical join fields. The output line normally con- sists of the common field, then the rest of the line from file1, then the rest of the line from file2. Fields are normally separated by blank, tab or newline. In this case, multiple separators count as one, and leading separators are dis- carded. These options are recognized: -an In addition to the normal output, produce a line for each unpairable line in file n, where n is 1 or 2. -e s Replace empty output fields by string s. -o list Each output line comprises the fields specified in list, each element of which has the form n.m, where n is a file number and m is a field number. -tc Use character c as a separator (tab character). Every appearance of c in a line is significant. SEE ALSO
sort(1), comm(1), awk(1). BUGS
With default field separation, the collating sequence is that of sort -b; with -t, the sequence is that of a plain sort. The conventions of join, sort, comm, uniq, look and awk(1) are wildly incongruous. 7th Edition April 29, 1985 JOIN(1)
All times are GMT -4. The time now is 01:59 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy