Compare two files and add new information


 
# 1  
Old 07-09-2011

Hi,
I want to compare two fields in two different files and add a corresponding field in a third output file. Something similar to vlookup.
Please see the attached input files and the example output file.
I want to compare each entry in column 1 of file1 with column 5 of file2. If the entries match, I want to append the entry found in column 3 of file1 as a new column in file2 and write the result to a new file. The output file should look like the attached outfile.txt.

Thank you in advance.
# 2  
Old 07-09-2011
Code:
mute@goflex:~/test$ awk -v OFS='\t' 'NR == FNR { a[$1] = $3 } NR != FNR && ($5 in a) { print $0 OFS a[$5] }' file1.txt file2.txt
1       7565748 A       C       LRG_7   Nonsyn  2396    cgt     CACNA1A
3       9475755 T       N       LRG_83  Nonsyn  46      act     MLPH
7       3527492 A       T       LRG_88  fs      396     gat     NCF2
14      9858493 C       G       LRG_7   nmd     9396    cgt     CACNA1A

---------- Post updated at 08:30 PM ---------- Previous update was at 08:25 PM ----------

If there are sometimes non-matching lines that you need to keep (I noticed a typo, LRG_82 vs. LRG_85, between file2 and outfile):

Code:
mute@goflex:~/test$ awk -v OFS='\t' 'NR == FNR { a[$1] = $3; next } $5 in a { print $0 OFS a[$5]; next } 1' file1.txt file2.txt
1       7565748 A       C       LRG_7   Nonsyn  2396    cgt     CACNA1A
1       8576859 G       -       LRG_82  syn     8576    cgat
3       9475755 T       N       LRG_83  Nonsyn  46      act     MLPH
7       3527492 A       T       LRG_88  fs      396     gat     NCF2
14      9858493 C       G       LRG_7   nmd     9396    cgt     CACNA1A
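
For anyone reading later, here is a rough, self-contained sketch of how the two-file idiom works. It assumes both files are tab-separated; the demo/ directory and the "x" placeholder in column 2 of file1 are made up for illustration, since the real attachments are not shown in the thread. While awk reads the first file, NR == FNR is true, so each line only fills the lookup array. For the second file, `$5 in a` tests whether the key exists, which also matches when the stored value happens to be empty or 0.

```shell
# Hypothetical sample data (the thread's attachments are not reproduced here).
mkdir -p demo
printf 'LRG_7\tx\tCACNA1A\nLRG_83\tx\tMLPH\n' > demo/file1.txt
printf '1\t7565748\tA\tC\tLRG_7\tNonsyn\t2396\tcgt\n' > demo/file2.txt
printf '1\t8576859\tG\t-\tLRG_82\tsyn\t8576\tcgat\n' >> demo/file2.txt

awk -F'\t' -v OFS='\t' '
    NR == FNR { a[$1] = $3; next }      # first file: remember column 3 under the column 1 key
    $5 in a   { print $0, a[$5]; next } # second file: key found, append the stored value
    { print }                           # second file: no match, keep the line unchanged
' demo/file1.txt demo/file2.txt > demo/outfile.txt

cat demo/outfile.txt
```

Dropping the last rule gives the behaviour of the first command, where only matching lines are printed.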

# 3  
Old 07-10-2011
Thank you so much, neutronscott! It works perfectly. I really like the second solution.
# 4  
Old 07-11-2011
I need some more help with the same problem. The above solution works perfectly for printing one field based on a match. In the same scenario, how can I print the data from columns 3, 4 and 5 of file1.txt (instead of only column 3)?
Thank you again for all the help.
# 5  
Old 07-11-2011
accidental double-post
# 6  
Old 07-11-2011
Maybe just replace $3 with fields 3, 4 and 5, joined by OFS. "OFS" is awk's Output Field Separator, which we set to tab. So change

Code:
a[$1] = $3

to

Code:
a[$1] = $3 OFS $4 OFS $5
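
Put together, the full command with that change might look like the sketch below. It again assumes tab-separated input; the demo3/ directory and the c4a/c5a values in file1 are invented placeholders, not data from the thread.

```shell
# Hypothetical sample data: file1 now carries extra columns 4 and 5.
mkdir -p demo3
printf 'LRG_7\tx\tCACNA1A\tc4a\tc5a\n' > demo3/file1.txt
printf '1\t7565748\tA\tC\tLRG_7\tNonsyn\t2396\tcgt\n' > demo3/file2.txt
printf '1\t8576859\tG\t-\tLRG_82\tsyn\t8576\tcgat\n' >> demo3/file2.txt

awk -F'\t' -v OFS='\t' '
    NR == FNR { a[$1] = $3 OFS $4 OFS $5; next } # store columns 3-5 as one tab-joined string
    $5 in a   { print $0, a[$5]; next }          # match: append all three stored columns
    { print }                                    # no match: keep the line unchanged
' demo3/file1.txt demo3/file2.txt > demo3/outfile.txt

cat demo3/outfile.txt
```

Note that non-matching lines keep their original 8 columns, so the output has rows of different widths.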

# 7  
Old 07-11-2011
Thank you so much again!! You are awesome!