Sort and extract based on two files


 
# 1  
Old 03-09-2011
Sort and extract based on two files

Hi,

I am having trouble sorting one file based on another file. I tried grep -f, but it did not give me what I need. Basically, I have two files that look like this:

File 1 (the list)

Code:
gh
aba
for
hmm

File 2 (the file that needs to be sorted)

Code:
aba  2  4  6  7
for   2  4  7  4
hmm  1  2  7  4
gh  2  5  7  9

So file 1 is a list that has names in a particular order and I want to sort file 2 according to that order while also extracting the other columns.

So the end output would look like this.

Final file

Code:
gh  2  5  7  9
aba  2  4  6  7
for   2  4  7  4
hmm  1  2  7  4

Thanks

Phil

---------- Post updated at 03:30 PM ---------- Previous update was at 03:29 PM ----------

The file is tab-separated.
# 2  
Old 03-09-2011
Code:
awk 'FILENAME=="file2"  {arr[$1]=$0}
       FILENAME=="file1"  {print arr[$1]} '  file2 file1

There has to be a one-to-one correspondence between file1 and file2, i.e., if file1 is missing one of the keys that is in file2, that line will not print at all.
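The two-pass behaviour described above can be reproduced on the sample data from the first post. This is only a sketch: the scratch directory and the variable name `sorted` are illustrative, and it assumes `mktemp -d` is available (it is common on Linux and the BSDs but not required by POSIX).

```shell
#!/bin/sh
# Recreate the thread's sample files in a scratch directory (illustrative path).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# file1: the list that defines the order.
printf '%s\n' gh aba for hmm > file1

# file2: tab-separated data; printf reuses the format for every five arguments.
printf '%s\t%s\t%s\t%s\t%s\n' \
    aba 2 4 6 7 \
    for 2 4 7 4 \
    hmm 1 2 7 4 \
    gh  2 5 7 9 > file2

# Pass 1 (file2): remember each whole line under its first field.
# Pass 2 (file1): print the remembered line for each key, in list order.
sorted=$(awk 'FILENAME=="file2" {arr[$1]=$0}
              FILENAME=="file1" {print arr[$1]}' file2 file1)
printf '%s\n' "$sorted"
```

Note that file2 has to come first on the command line so that the array is filled before file1 is read.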
# 3  
Old 03-09-2011
Hi.

Here is a script that uses a non-standard sort utility that admits alternate collating sequences, msort:
Code:
#!/usr/bin/env bash

# @(#) s1	Demonstrate alternate collating sequence.
# msort-home http://freshmeat.net/projects/msort

# Section 1, setup, pre-solution.
# Infrastructure details, environment, commands for forum posts. 
# Uncomment export command to test script as external user.
# export PATH="/usr/local/bin:/usr/bin:/bin"
set +o nounset
pe() { for i;do printf "%s" "$i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
C=$HOME/bin/context && [ -f $C ] && . $C specimen msort
set -o nounset
pe

FILE=${1-data1}
shift
CS=${1-data2}

# Section 2, display input file and collating sequence file.
# Display sample of data file, with head & tail as a last resort.
pe " || start [ first:middle:last ]"
specimen $FILE $CS \
|| { pe "(head/tail)"; head -n 5 $FILE; pe " ||"; tail -n 5 $FILE; }
pe " || end"

# Section 3, solution.
pl " Results:"
msort -q -n 1,1 -u n -l -c lexicographic -s $CS -1 $FILE

exit 0

producing:
Code:
% ./s1

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 2.6.26-2-amd64, x86_64
Distribution        : Debian GNU/Linux 5.0.7 (lenny) 
GNU bash 3.2.39
specimen (local) 1.17
msort - ( /usr/bin/msort Apr 24 2008 )

 || start [ first:middle:last ]
Whole: 5:0:5 of 4 lines in file "data1"
aba  2  4  6  7
for   2  4  7  4
hmm  1  2  7  4
gh  2  5  7  9

Whole: 5:0:5 of 4 lines in file "data2"
gh
aba
for
hmm
 || end

-----
 Results:
gh  2  5  7  9
aba  2  4  6  7
for   2  4  7  4
hmm  1  2  7  4

If you are using Debian GNU/Linux, msort is in the repository for lenny and squeeze, but not in wheezy yet. The freshmeat site has links to a number of packages for other OSs.
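Where msort is not available, a similar result can probably be had with standard tools by a decorate-sort-undecorate pipeline. This is a sketch that was not posted in the thread: it assumes tab-separated data as the original poster stated, and the names `rank` and `ranked` are made up for illustration.

```shell
#!/bin/sh
# Recreate the thread's sample files in a scratch directory (illustrative path).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' gh aba for hmm > file1
printf '%s\t%s\t%s\t%s\t%s\n' \
    aba 2 4 6 7 \
    for 2 4 7 4 \
    hmm 1 2 7 4 \
    gh  2 5 7 9 > file2

# Decorate:   prefix each file2 line with its key's line number in file1.
# Sort:       order numerically by that prefix.
# Undecorate: cut the prefix column off again.
ranked=$(awk -F'\t' 'NR==FNR {rank[$1]=NR; next}
                     ($1 in rank) {print rank[$1] "\t" $0}' file1 file2 |
         sort -n -k1,1 | cut -f2-)
printf '%s\n' "$ranked"
```

Unlike the array approach, this keeps every file2 line whose key appears in file1, even when a key occurs more than once in file2; lines whose key is not in the list are dropped.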

Good luck ... cheers, drl
# 4  
Old 03-09-2011
Quote:
Originally Posted by jim mcnamara
Code:
awk 'FILENAME=="file2"  {arr[$1]=$0}
       FILENAME=="file1"  {print arr[$1]} '  file2 file1

There has to be a one-to-one correspondence between file1 and file2, i.e., if file1 is missing one of the keys that is in file2, that line will not print at all.

And if file1 contains a key that is missing from file2, it will print a blank line (this can be addressed with a slight change in the awk script).
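One possible form of that slight change, as a sketch rather than the poster's own version, is to print only when the key was actually seen in file2. The sample data here is made up to include a key (`xyz`) that file2 lacks, and the variable name `guarded` is illustrative.

```shell
#!/bin/sh
# Scratch directory for the demonstration (illustrative path).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# file1 lists a key ("xyz") that file2 does not contain.
printf '%s\n' gh xyz aba > file1
printf '%s\t%s\n' aba 2 gh 5 > file2

# NR==FNR holds only while the first file argument (file2) is being read.
# "$1 in arr" tests membership without creating an empty array entry,
# so keys missing from file2 produce no output line at all.
guarded=$(awk 'NR==FNR {arr[$1]=$0; next}
               ($1 in arr) {print arr[$1]}' file2 file1)
printf '%s\n' "$guarded"
```

The `NR==FNR` test replaces the hard-coded file names, so the command works with any two paths; it does assume that the first file is not empty.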