Match strings in two files and compare columns of both


 
# 1  
Old 08-26-2009

Good Morning,


I was wondering if anybody could tell me how to achieve the following, preferably with a little commenting for understanding.


I have 2 files, each with multiple rows with multiple columns.

I need to find each row where the value in column 1 of file 1 matches column 1 in file 2. I then need to compare the 4th column of the two matching rows and, if the value in file 2 is greater than the value in file 1, output to a third file: the matching column 1 value, the 2nd column value, the 4th column value from file 1, the 4th column value from file 2, and the difference between the two.

An example should make the above clearer.

File 1

123 john blah 2
456 tony blah 7
789 michelle blah 9
111 james blah 3


File 2

135 gary blah 6
456 tony blah 13
789 michelle blah 4
111 james blah 19


So the below should be output to file 3:

456 tony 7 13 6
111 james 3 19 16


Many thanks in advance for any answers
# 2  
Old 08-26-2009
Try...

Code:
 
awk 'NR==FNR{arr[$1]=$2","$4}NR!=FNR{for(i in arr){split(arr[i],ss,",");
if(i==$1 && ss[2]<$4) print i,ss[1],ss[2],$4,$4-ss[2]}}' file1 file2
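
Since the question asked for some commenting, here is a minimal annotated sketch of the same idea. It is not the one-liner above: instead of looping over every stored key for each line of file2, it looks the key up directly with `$1 in val`. It assumes whitespace-separated fields, that column 1 is unique within file1, and that the sample data from the question is saved as file1 and file2 (recreated here so the snippet runs on its own).

```shell
# Recreate the sample input from the question.
cat > file1 <<'EOF'
123 john blah 2
456 tony blah 7
789 michelle blah 9
111 james blah 3
EOF
cat > file2 <<'EOF'
135 gary blah 6
456 tony blah 13
789 michelle blah 4
111 james blah 19
EOF

awk '
    # NR==FNR is true only while reading the first file on the command line.
    # Store column 2 and column 4, keyed on column 1, then skip to the next line.
    NR == FNR { name[$1] = $2; val[$1] = $4; next }

    # Second file: act only when the key was seen in file1 and the
    # 4th column is numerically greater than the stored value.
    ($1 in val) && ($4 + 0 > val[$1] + 0) {
        print $1, name[$1], val[$1], $4, $4 - val[$1]
    }
' file1 file2 > file3

cat file3
```

Adding 0 to both sides forces a numeric comparison, so 13 is compared with 7 as numbers rather than as strings. Output follows the row order of file2.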

# 3  
Old 08-26-2009
Worked 100% perfectly.

You're an absolute star, mate.

Not sure how the bits award thing works, but did it (hopefully correctly).

Last edited by GarciasMuffin; 08-26-2009 at 08:32 AM..
# 4  
Old 08-26-2009
Code:
awk 'NR==FNR{arr[$1]=$2","$4}NR!=FNR{for(i in arr){split(arr[i],ss,",");
if(i==$1 && ss[2]<$4) print i,ss[1],ss[2],$4,$4-ss[2]}}' file1 file2

I was trying to understand how this command works...

I get this error when I try to run the command. Please help me debug it.

awk: arr is not an array
record number 1
# 5  
Old 08-26-2009
I don't get that error when I run the code. You may want to try nawk or gawk in place of awk.
# 6  
Old 08-26-2009
Wow! nawk works for me.
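
For scripts that have to run on more than one machine, one option is to pick whichever of the suggested interpreters is installed instead of hard-coding `awk`. This is only a sketch: the variable name `AWK` is made up for illustration, and the preference order gawk, nawk, awk is an assumption, not something stated in this thread.

```shell
# Use the first of gawk, nawk, awk that exists on PATH.
AWK=
for candidate in gawk nawk awk; do
    if command -v "$candidate" >/dev/null 2>&1; then
        AWK=$candidate
        break
    fi
done

# Abort with a message if none was found.
echo "using: ${AWK:?no awk implementation found on PATH}"

# Quick check that the chosen interpreter handles associative arrays.
"$AWK" 'BEGIN { arr["k"] = 1; if ("k" in arr) print "arrays work" }'
```

The two-file command from post #2 can then be run as `"$AWK" '...' file1 file2`.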
# 7  
Old 02-07-2010
This example works, but I have a slightly more complicated comparison operation. I have an index of IDs. These IDs occur as substrings within a file. I want to see which IDs are in the file and which ones are not. Here are a few lines from the file.

23873_ChemDiv_000A-0001
AVtclcactv01291016372D 0 0.00000 0.00000 1

31 34 0 0 0 0 0 0 0 0999 V2000

The ID is the first 5 characters of this string: 23873_ChemDiv_000A-0001
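
One possible approach, using the same two-file awk pattern as earlier in the thread, is sketched below. Several things here are assumptions, since the post does not give them: the file names `ids.txt` and `compounds.sdf` are hypothetical, the index is assumed to hold one ID per line, and an ID line in the data file is assumed to start with exactly five digits followed by an underscore. The ID 99999 is made-up sample data to show the "missing" case.

```shell
# Hypothetical index of IDs, one per line.
cat > ids.txt <<'EOF'
23873
99999
EOF

# Sample data lines taken from the post.
cat > compounds.sdf <<'EOF'
23873_ChemDiv_000A-0001
AVtclcactv01291016372D 0 0.00000 0.00000 1

31 34 0 0 0 0 0 0 0 0999 V2000
EOF

awk '
    # First file: register every wanted ID as not yet seen.
    NR == FNR { if (NF) want[$1] = 0; next }

    # Second file: on lines that start with five digits and an underscore,
    # take the first 5 characters as the ID and mark it if it is wanted.
    /^[0-9][0-9][0-9][0-9][0-9]_/ {
        id = substr($0, 1, 5)
        if (id in want) want[id] = 1
    }

    # After both files: report each indexed ID.
    END { for (id in want) print id, (want[id] ? "found" : "missing") }
' ids.txt compounds.sdf | sort > id_report.txt

cat id_report.txt
```

The output is piped through `sort` because awk does not guarantee any particular order for `for (id in want)`.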