Merge two files based on a 3rd key file


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting Merge two files based on a 3rd key file
# 1  
Old 01-06-2011
Merge two files based on a 3rd key file

Hi,
I want to merge the two files based on the key file's columns.

The key file:
Code:
 
DATE~DATE 
HOUSE~IN_HOUSE 
CUST~IN_CUST 
PRODUCT~PRODUCT 
ADDRESS~CUST_ADDR 
BASIS_POINTS~BASIS_POINTS 
...

The other 2 files are From_file & To_file -
The From_file:
Code:
 
DATE|date/time|29|9 
HOUSE|integer|15|0 
CUST|integer|15|0 
PRODUCT|integer|15|0 
ADDRESS|string|256|0 
BASIS_POINTS|number|4|5 
....

The To_file:
Code:
DATE|date/time|29|9 
IN_HOUSE|integer|15|0 
IN_CUST|integer|15|0 
PRODUCT|integer|15|0 
CUST_ADDR|string|256|0 
BASIS_POINTS|number|4|5  
...

The o/p file has to be created by matching the left column in the key file to the first column in the from_file and substituting the entire line from the from_file. Similarly for the right column in the key file........to that of the first column in the to_file.

Hence the o/p would look something like this:
OUTPUT:
Code:
 
DATE|date/time|29|9~DATE|date/time|29|9 
HOUSE|integer|15|0~IN_HOUSE|integer|15|0
CUST|integer|15|0~IN_CUST|integer|15|0 
PRODUCT|integer|15|0~PRODUCT|integer|15|0 
ADDRESS|string|256|0~CUST_ADDR|string|256|0 
BASIS_POINTS|number|4|5~BASIS_POINTS|number|4|5 
...

I came accross a link https://www.unix.com/shell-programmin...tch-rules.html in this forum about the matching strings in 2 files......but I got really confused Smilie!! Please can someone suggest?

-dips
# 2  
Old 01-06-2011
Hi,
Try this,


Code:
 awk -F"~" 'NR==FNR{a[$1]++;b[$2]++;next} {split($0,c,"|");if(a[c[1]]){printf $0"~";getline tmp < "to_file"; split(tmp,d,"|");if(b[d[1]]){print tmp}}}' key_file from_file

This User Gave Thanks to pravin27 For This Post:
# 3  
Old 01-06-2011
Code:
awk -F\| 'FILENAME=="from_file"{a[$1]=$0;next;}FILENAME=="to_file"{b[$1]=$0;next;}{FS="~";if(a[$1]) $1=a[$1];if(b[$2]) $2=b[$2];}1' OFS="~" from_file to_file key_file


Last edited by anurag.singh; 01-07-2011 at 07:43 AM..
# 4  
Old 01-06-2011
Small variation:
Code:
awk '{sub(/ *$/,x)} NR==FNR{A[$1]=$0;m++;next} NR-FNR==m{B[$1]=$0;next} {print A[$1],B[$2]}' FS=\| OFS=\~ fromfile tofile FS=\~ keyfile

This User Gave Thanks to Scrutinizer For This Post:
# 5  
Old 01-06-2011
@Scrutinizer, given that every keyfile entry has reference in from and to files. If so, null check if a[$1] and b[$2] can be removed from my post (Also FILENAME=="key_file" check is not needed).
# 6  
Old 01-06-2011
@anurag, there is almost no difference with your solution, that is why I called it a small variation. I removed the trailing spaces which seemed to be present in the input files through the sub statement and used a different way to vary field separators and determining the file.. I think the checks are not needed, since it will produce an empty reference, which is preferable IMO.
# 7  
Old 01-06-2011
Seems a bit fluffy if they have serious amounts of data. I suspect 3-4 sorts, 2 joins and 1 intermediate file would do it in bulk: sort two files, join them, possibly resort the result, sort the third file, join it and the result.

With the right SQL flat file tools, you can just do it as a three way join.
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Join and merge multiple files with duplicate key and fill void columns

Join and merge multiple files with duplicate key and fill void columns Hi guys, I have many files that I want to merge: file1.csv: 1|abc 1|def 2|ghi 2|jkl 3|mno 3|pqr file2.csv: (5 Replies)
Discussion started by: yjacknewton
5 Replies

2. Shell Programming and Scripting

awk - Merge two files based on one key

Hi, I am struggling with the an awk command to merge two files based on a common key. I want to append the value from File2 ($2) onto the end of File1 where $1 from each file matches - If no match then nothing is apended File1 COL1|COL2|COL3|COL4|COL5|COL6|COL7... (3 Replies)
Discussion started by: Ads89
3 Replies

3. UNIX for Dummies Questions & Answers

Merge selective columns from files based on common key

Hi, I am trying to selectively merge two files based on keys reported in the 1st column. File1: #file1-header1 file1-header2 111 qwe rtz uio 198 asd fgh jkl 165 yxc 789 poi uzt rew 89 lkj File2: #file2-header2 file2-header2 165 ghz nko2 ... (2 Replies)
Discussion started by: dovah
2 Replies

4. Shell Programming and Scripting

Merge files based on columns

011111123444 1234 1 20000 011111123444 1235 1 30000 011111123446 1234 3 40000 011111123447 1234 4 50000 011111123448 1234 3 50000 File2: 011111123444,Rsttponrfgtrgtrkrfrgtrgrer 011111123446,Rsttponrfgtrgtr 011111123447,Rsttponrfgtrguii 011111123448,Rsttponrfgtrgtjiiu I have 2 files... (4 Replies)
Discussion started by: vinus
4 Replies

5. Shell Programming and Scripting

Merge files based on the column value

Hi Friends, I have a file file1.txt 1|ABC|3|jul|dhj 2|NHU|4|kil|eu 3|hjd|34|hfd|43 file2.txt 1||3|KING|dhj 2|NHU||k| 3|hjd|34|hd|43 i want to merge file1.txt file2.txt based on the column null values in file2.txif there are any nulls in column values , (5 Replies)
Discussion started by: i150371485
5 Replies

6. Shell Programming and Scripting

Merge multiple lines in same file with common key using awk

I've been a Unix admin for nearly 30 years and never learned AWK. I've seen several similar posts here, but haven't been able to adapt the answers to my situation. AWK is so damn cryptic! ;) I have a single file with ~900 lines (CSV list). Each line starts with an ID, but with different stuff... (6 Replies)
Discussion started by: protosd
6 Replies

7. Shell Programming and Scripting

Gawk / Awk Merge Lines based on Key

Hi Guys, After windows died on my netbook I installed Lubuntu and discovered Gawk about a month ago. After using Excel for 10+ years I'm amazed how quick and easily Gawk can process data but I'm stuck with a little problem merging data from multiple lines. I'm an SEO Consultant and provide... (9 Replies)
Discussion started by: Jamesfirst
9 Replies

8. Shell Programming and Scripting

"Join" or "Merge" more than 2 files into single output based on common key (column)

Hi All, I have working (Perl) code to combine 2 input files into a single output file using the join function that works to a point, but has the following limitations: 1. I am restrained to 2 input files only. 2. Only the "matched" fields are written out to the "matched" output file and... (1 Reply)
Discussion started by: Katabatic
1 Replies

9. Shell Programming and Scripting

merge two two txt files into one file based on one column

Hi, I have file1.txt and file2.txt and would like to create file3.txt based on one column in UNIX Eg: file1.txt 17328756,0000786623.pdf,0000786623 20115537,0000793892.pdf,0000793892 file2.txt 12521_74_4.zip,0000786623.pdf 12521_15_5.zip,0000793892.pdf Desired Output ... (5 Replies)
Discussion started by: techmoris
5 Replies

10. Shell Programming and Scripting

Merge files based on key

Hi Friends, Can any one help me with merging these file based on two columns : File1: A|123|99|SAMS B|456|95|GEORGE D|789|85|HOVARD File2: S|123|99|NANcY|6357 S|123|99|GREGRO|83748 A|456|95|HARRY|827|somers S|456|95|ANTONY|546841|RUDOLPH|7263 B|456|95|SMITH|827|BOISE STATE|834... (3 Replies)
Discussion started by: sbasetty
3 Replies
Login or Register to Ask a Question