Match and append - awk


 
# 1  
Old 04-26-2015

ALL,

Please help with this ...


File1
Code:
 
000433,ds00d11,tdev,ds00d11_view,0CD3
000433,ds00d12_34,tdev,ds00d12_view,132D

File2
Code:
CG01_ds00d11_drs,rs1_ds00d11_0CD3_114D,000433,0CD3                            
CG01_ds00d11_drs,rs1_ds00d11_0CD3_114D,000566,114D                          
 CG02_ds00d12_34_drs,rs2_ds00d12_132D_114F,000433,132D
CG02_ds00d12_34_drs,rs2_ds00d12_132D_114F,000566,114F

$5 in line1(file1) matches to $4 in line1(file2) , get value of $2 from line1(file2)
$2 in line1(file2) matches to $2 line2(file2) , get $1,$3,$4 from that line and append after $5 line1(file1)
Same logic to be used for rest
There will be always 2 occurrences where $2 in line2 matches

Output
Code:
000433,ds00d11,tdev,ds00d11_view,0CD3,CG01_ds00d11_drs,000566,114D
000433,ds00d12_34,tdev,ds00d12_view,132D,CG02_ds00d12_34_drs,000566,114F

Thanks
# 2  
Old 04-26-2015
What have you tried to solve this problem?
# 3  
Old 04-26-2015
I was using a regular shell loop, but I wanted a simpler awk version that can handle a bigger file.

Code:
 
 > /tmp/output
 for i in `cat /tmp/file1`
 do
 VAR1=`echo $i |awk -F, '{print $NF}'`
 OUT=`cat /tmp/file2 | awk -F, -v VAR2=$VAR1 '$NF ~ VAR2{print $2}'`
 APPEND=`cat /tmp/file2 |grep $OUT |grep -v $VAR1$ |awk -F, '{print $3","$4}'`
 echo "$i,$APPEND" >> /tmp/output
 done

# 4  
Old 04-27-2015
There seem to be a few problems in your sample input files.

You imply that both of your input files use a comma as the field separator, but the 0CD3 in File1 is not the same as 0CD3 followed by more than 20 spaces in File2.

Is there really supposed to be a space at the start of line 3 in File2?
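One way to check for this kind of stray whitespace is sketched below. It is only an illustration: the two sample lines are copied from File2 with a shortened run of trailing spaces, and the temporary directory and the `cleaned` variable are names made up for this example.

```shell
# Hypothetical two-line sample reproducing the stray blanks seen in File2.
tmp=$(mktemp -d)
printf '%s\n' \
  'CG01_ds00d11_drs,rs1_ds00d11_0CD3_114D,000433,0CD3     ' \
  ' CG02_ds00d12_34_drs,rs2_ds00d12_132D_114F,000433,132D' > "$tmp/File2"

# The sed "l" command prints each line unambiguously and marks the end of
# the line with "$", so leading and trailing blanks become visible.
sed -n 'l' "$tmp/File2"

# Strip leading and trailing spaces from every line before further processing.
cleaned=$(sed 's/^ *//; s/ *$//' "$tmp/File2")
printf '%s\n' "$cleaned"

rm -rf "$tmp"
```

If the input is cleaned up this way first, a plain comma can be used as the field separator in whatever script reads the file afterwards.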

The code that you said works (but that you want to replace with something better for larger files) produces the output:
Code:
000433,ds00d11,tdev,ds00d11_view,0CD3,000433,0CD3                            
000566,114D                          
000433,ds00d12_34,tdev,ds00d12_view,132D,000566,114F

for the sample input you provided, which is not what you said you wanted:
Code:
000433,ds00d11,tdev,ds00d11_view,0CD3,CG01_ds00d11_drs,000566,114D
000433,ds00d12_34,tdev,ds00d12_view,132D,CG02_ds00d12_34_drs,000566,114F

The following awk script seems to do what you want:
Code:
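awk '
BEGIN {	FS = ",| *$"	# split on commas; trailing spaces also act as a separator
	OFS = ","
}
NR == FNR {	# first file on the command line (File2)
	# lf[$4] remembers the $2 group for each $4;
	# data[] keeps the last line seen for each $2 group
	data[lf[$4] = $2] = $1 OFS $3 OFS $4
#printf("in: %s\ndata[lf[%s](%s)]: %s\n", $0, $4, lf[$4], data[lf[$4]])
	next
}
{	# second file (File1): append the data saved for the group matching $5
	print $0, data[lf[$5]]
}' File2 File1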

If you want to try this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk. With your sample input files, the above code produces the output:
Code:
000433,ds00d11,tdev,ds00d11_view,0CD3,CG01_ds00d11_drs,000566,114D
000433,ds00d12_34,tdev,ds00d12_view,132D,CG02_ds00d12_34_drs,000566,114F

which seems to be an exact match for what you said you wanted.
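For readers who find the chained assignment hard to follow, here is a rough equivalent with the two arrays written out separately. It is a sketch, not the script above: it assumes the stray spaces have already been removed from File2 so that a plain comma separator is enough, and the names `group_of`, `last_in`, `tmp`, and `result` are invented for illustration.

```shell
# Sample data from post #1, with the stray spaces in File2 removed (assumption).
tmp=$(mktemp -d)
cat > "$tmp/File1" <<'EOF'
000433,ds00d11,tdev,ds00d11_view,0CD3
000433,ds00d12_34,tdev,ds00d12_view,132D
EOF
cat > "$tmp/File2" <<'EOF'
CG01_ds00d11_drs,rs1_ds00d11_0CD3_114D,000433,0CD3
CG01_ds00d11_drs,rs1_ds00d11_0CD3_114D,000566,114D
CG02_ds00d12_34_drs,rs2_ds00d12_132D_114F,000433,132D
CG02_ds00d12_34_drs,rs2_ds00d12_132D_114F,000566,114F
EOF

result=$(awk -F, -v OFS=, '
NR == FNR {				# first file on the command line: File2
	group_of[$4] = $2		# which $2 group each $4 value belongs to
	last_in[$2] = $1 OFS $3 OFS $4	# later lines overwrite earlier ones
	next
}
{	# File1: append the last File2 line seen for the group that $5 belongs to
	print $0, last_in[group_of[$5]]
}' "$tmp/File2" "$tmp/File1")
printf '%s\n' "$result"

rm -rf "$tmp"
```

Because `last_in[]` is overwritten by each new line with the same $2, the lookup always returns the last line of the group, which is why the result depends on the order of the lines in File2.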
# 5  
Old 04-27-2015
Thanks Don,
it works fine for the above scenario ... but there could be a case in file2 like this

Code:
 
 CG01_ds00d11_drs,rs1_ds00d11_0CD3_114D,000566,114D
 CG01_ds00d11_drs,rs1_ds00d11_0CD3_114D,000433,0CD3
 CG02_ds00d12_34_drs,rs2_ds00d12_132D_114F,000433,132D
 CG02_ds00d12_34_drs,rs2_ds00d12_132D_114F,000566,114F

We used "next" in the awk script on the assumption that the second matching line would come after the first one, but in the example above it comes before it, and the line marked in red is not getting matched. There will still be only 2 matching lines; they just may not be in the same order.

thx
# 6  
Old 04-28-2015
No. I thought your directions were quite clear:
Quote:
$5 in line1(file1) matches to $4 in line1(file2) , get value of $2 from line1(file2)
$2 in line1(file2) matches to $2 line2(file2) , get $1,$3,$4 from that line and append after $5 line1(file1)
Same logic to be used for rest
There will be always 2 occurrences where $2 in line2 matches
That said to me that if $5 in File1 matches $4 in File2, get and print the contents of the 2nd line in File2 that matches $2. I didn't see anything there that made me think you were trying to get "the other line" that matched $2 rather than "the 2nd line" that matched $2.

If I correctly understand your new requirements, try:
Code:
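awk '
BEGIN {	FS = ",| *$"
	OFS = ","
}
NR == FNR {
	# The 1st line seen for a $2 group is stored under index 2
	# and the 2nd line under index 1 ...
	data[$2 SUBSEP (2 - c[$2]++)] = $1 OFS $3 OFS $4
	# ... while lf[$4] records the position of this line itself (1 or 2),
	# so looking it up in data[] yields the other line of the pair.
	lf[$4] = $2 SUBSEP c[$2]
	next
}
{	print $0, data[lf[$5]]
}' File2 File1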

which, with File1 containing:
Code:
000433,ds00d11,tdev,ds00d11_view,0CD3
000433,ds00d11,tdev,ds00d11_view,114D
000433,ds00d12_34,tdev,ds00d12_view,114F
000433,ds00d12_34,tdev,ds00d12_view,132D

and File2 from post#1 in this thread:
Code:
CG01_ds00d11_drs,rs1_ds00d11_0CD3_114D,000433,0CD3                            
CG01_ds00d11_drs,rs1_ds00d11_0CD3_114D,000566,114D                          
 CG02_ds00d12_34_drs,rs2_ds00d12_132D_114F,000433,132D
CG02_ds00d12_34_drs,rs2_ds00d12_132D_114F,000566,114F

produces the output:
Code:
000433,ds00d11,tdev,ds00d11_view,0CD3,CG01_ds00d11_drs,000566,114D
000433,ds00d11,tdev,ds00d11_view,114D,CG01_ds00d11_drs,000433,0CD3
000433,ds00d12_34,tdev,ds00d12_view,114F, CG02_ds00d12_34_drs,000433,132D
000433,ds00d12_34,tdev,ds00d12_view,132D,CG02_ds00d12_34_drs,000566,114F

Is that what you want?
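The same "other line of the pair" lookup can also be written with explicit positions, which some people may find easier to read. This is only a sketch under the assumption stated in post #1 that every $2 value occurs on exactly two lines of File2. It uses File1 from post #1 and the reordered File2 from post #5 (leading spaces included), and the names `count`, `row`, `group_of`, `pos_of`, `tmp`, and `result` are invented for illustration.

```shell
tmp=$(mktemp -d)
cat > "$tmp/File1" <<'EOF'
000433,ds00d11,tdev,ds00d11_view,0CD3
000433,ds00d12_34,tdev,ds00d12_view,132D
EOF
cat > "$tmp/File2" <<'EOF'
 CG01_ds00d11_drs,rs1_ds00d11_0CD3_114D,000566,114D
 CG01_ds00d11_drs,rs1_ds00d11_0CD3_114D,000433,0CD3
 CG02_ds00d12_34_drs,rs2_ds00d12_132D_114F,000433,132D
 CG02_ds00d12_34_drs,rs2_ds00d12_132D_114F,000566,114F
EOF

result=$(awk -F, -v OFS=, '
{	# trim leading and trailing spaces; changing $0 re-splits the fields
	sub(/^ +/, ""); sub(/ +$/, "")
}
NR == FNR {				# File2
	n = ++count[$2]			# 1 for the first line of a group, 2 for the second
	row[$2, n] = $1 OFS $3 OFS $4	# save the fields to append, by group and position
	group_of[$4] = $2		# group that this $4 value belongs to
	pos_of[$4] = n			# position of this $4 value within its group
	next
}
{	# File1: 3 - position turns 1 into 2 and 2 into 1, selecting the other line
	print $0, row[group_of[$5], 3 - pos_of[$5]]
}' "$tmp/File2" "$tmp/File1")
printf '%s\n' "$result"

rm -rf "$tmp"
```

With this layout the order of the two lines in File2 does not matter. A File1 line whose $5 is not found in File2 is printed with an empty appended field.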
# 7  
Old 04-28-2015
Hi Don ,

Thanks for all the effort. It looks like it's working fine.

I am getting this output with the modified File2, which is what I wanted:

Code:
 
 000433,ds00d11,tdev,ds00d11_view,0CD3,CG01_ds00d11_drs,rs1_ds00d11_0CD3_114D,000566,114D
 000433,ds00d12_34,tdev,ds00d12_view,132D,CG02_ds00d12_34_drs,rs2_ds00d12_132D_114F,000566,114F


Last edited by greycells; 04-28-2015 at 08:33 AM.. Reason: Verified again