How to match in 2 files and generate 3rd file?


 
# 1  
Old 07-17-2014

Hello,

I have two tables, shown below. The first file is colon-separated and the second is comma-separated.

Please note that the matching key (a kind of primary key) is a number and is NOT unique. It is the 2nd column in table1 and the 4th column in table2.

Code:
# cat table1
vgbpjdata1:80
vgbpjdata2:50
vgbpjdata3:50
vgbpjdata4:80

Code:
# cat table2
vpath50,c54t4d2,52428800,50
vpath51,c40t4d3,83886080,80
vpath52,c40t4d0,83886080,80
vpath56,c36t4d1,52428800,50

The third file I want to generate should look like this:

Code:
# cat output_file
MY_CMD vgbpjdata1 /dev/dsk/c40t4d3
MY_CMD vgbpjdata2 /dev/dsk/c54t4d2
MY_CMD vgbpjdata3 /dev/dsk/c36t4d1
MY_CMD vgbpjdata4 /dev/dsk/c40t4d0

Please help, thanks!
# 2  
Old 07-17-2014
You could use awk like this:

Code:
awk '
FS==","{ k[$4]=($4 in k? k[$4]"," : "") "/dev/dsk/" $2; next}
$2 in k {
  dev=k[$2]
  if (sub(",.*",x,dev)) sub(dev",",x,k[$2])
  else delete k[$2]
  print "MY_CMD",$1,dev
} ' FS=, table2 FS=: table1

# 3  
Old 07-17-2014
Brilliant, worked great. Thanks a million.
# 4  
Old 07-17-2014
Depending on the awk version, this may not work. With mawk on my Linux system the result is:
Code:
MY_CMD vgbpjdata1 
MY_CMD vgbpjdata2 
MY_CMD vgbpjdata3 /dev/dsk/c54t4d2
MY_CMD vgbpjdata4 /dev/dsk/c40t4d3

And, how can you tell which data set will be selected if you don't have a unique key to refer to it? Should that be in order of appearance (as in Chubler_XL's proposal) or is that sheer coincidence?
# 5  
Old 07-17-2014
Thanks RudiC, I was cringing a bit when I wrote that assignment, but it seemed to work OK.

This should be a more portable variation:

Code:
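awk '
# table2 (FS is ","): collect the devices for each key as a comma-separated list, in file order
FS==","{ if($4 in k) k[$4]=k[$4]","; k[$4]=k[$4] "/dev/dsk/" $2; next}
# table1 (FS is ":"): hand out the first device still listed for this key
$2 in k {
  dev=k[$2]
  # keep only the first list entry in dev, then drop it from the stored list;
  # x is never set, so it acts as the empty string
  if (sub(",.*",x,dev)) sub(dev",",x,k[$2])
  else delete k[$2]   # that was the last device for this key
  print "MY_CMD",$1,dev
} ' FS=, table2 FS=: table1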

or even this:

Code:
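awk '
  # table2: number the devices of each key 1..k[key] in order of appearance
  FS==","{ m[$4,++k[$4]]="/dev/dsk/" $2; next}
  # table1: n[key] counts devices already handed out, so they go out first-in, first-out
  n[$2]<k[$2] {print "MY_CMD",$1, m[$2,++n[$2]]} ' FS=, table2 FS=: table1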
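
To check the comma-list variation against the sample data from post #1, something like the following could be used. It is only a sketch: the scratch directory and the `workdir` variable are assumptions made for the example, while the file names and contents are copied from the first post.

```shell
#!/bin/sh
# Work in a throwaway directory (workdir is a made-up name for this example)
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample input copied from post #1
cat > table1 <<'EOF'
vgbpjdata1:80
vgbpjdata2:50
vgbpjdata3:50
vgbpjdata4:80
EOF

cat > table2 <<'EOF'
vpath50,c54t4d2,52428800,50
vpath51,c40t4d3,83886080,80
vpath52,c40t4d0,83886080,80
vpath56,c36t4d1,52428800,50
EOF

# The comma-list variation: build one device list per key from table2,
# then give each table1 line the first device left on the list for its key
awk '
FS==","{ if($4 in k) k[$4]=k[$4]","; k[$4]=k[$4] "/dev/dsk/" $2; next}
$2 in k {
  dev=k[$2]
  if (sub(",.*",x,dev)) sub(dev",",x,k[$2])
  else delete k[$2]
  print "MY_CMD",$1,dev
} ' FS=, table2 FS=: table1 > output_file

cat output_file
```

Comparing `output_file` with the desired output from post #1 (for example with `diff`) is a quick way to confirm that devices are handed out in order of appearance.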


Last edited by Chubler_XL; 07-17-2014 at 05:32 PM.. Reason: Another shorter solution came to mind
# 6  
Old 07-17-2014
Thanks Chubler_XL and RudiC

It is NOT coincidence but the order of appearance (as shown in my sample output).

All three of Chubler_XL's solutions worked for me. I'm using the standard awk on HP-UX. I tested with a few samples and all worked fine. Could you please confirm which is the best and most likely to work on "all" samples?

Thanks again!
# 7  
Old 07-17-2014
I'd avoid solution #1: as far as I know, no standard defines the order in which the parts of an assignment statement are evaluated, so some awk implementations (and possibly future updates to awk) could cause it to fail, as RudiC saw with mawk. Both #2 and #3 are fine and it's your preference, whichever you find easier to understand.
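
A small probe along these lines may help show which way a given awk behaves. It is a sketch, not something posted in the thread: the key `"a"`, the strings `present` and `absent`, and the `result` variable are made up for the illustration.

```shell
#!/bin/sh
# Assign to k["a"] an expression that itself asks whether k["a"] exists yet.
# An awk that evaluates the right-hand side first should print "absent";
# one that creates the left-hand element first should print "present".
result=$(awk 'BEGIN { k["a"] = (("a" in k) ? "present" : "absent"); print k["a"] }')
echo "this awk reports: $result"
```

Going by the output in post #4, mawk would be expected to report `present`, which is what put the stray leading comma into the device lists of solution #1. Solution #2 sidesteps the question by doing the `in` test in its own statement before the assignment.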