Matching string on two files based on match rules.


 
# 8  
Old 12-13-2010

Thanks again for the detailed explanation!!

One point I missed in the initial post: a pattern in the pattern file (file2) can have at most 2 trailing spaces, represented by ##. I tried replacing sub(/#/," ") with gsub(/#/," "), but it is not working.

file1
PO BOX,A
PO BOX,A

file2
PO##,y

With the above files, I am getting the following output:

PO BOX,A,matchind
PO BOX,A,Y

I should also write the Active flag (from file2) to the output, so the output columns are:
Adr,Cat,Matchind,Activeflg

Please help me out with this. Thanks.
# 9  
Old 12-13-2010
Strange, I changed sub(/#/," ") to gsub(/#/," ") and it seems to work:
Code:
$ grep -E 'PO##|PO  ' file[21]
file2:PO##,y
file1:PO  BOX,A

Code:
$ awk -F, 'NR==FNR{gsub(/#/," ");if($2=="y")A[$1];next}{$3="N"}FNR==1{$3="matchind"}$2=="A"{for(i in A)if(" "$1~" "i)$3="Y"}1' OFS=, file2 file1
Adrfld,category,matchind
PO  BOX,A,Y
POST,A,N
avenue,A,Y
business,X,N
bus terminus,A,Y
first cross,A,Y
firstcross,A,N

Ah, I noticed that your test input has no header line, so its first data line is treated as the header (hence the "matchind" on your first output line). It also does not contain 2 spaces, and there is no last field with Y.
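The difference between the two calls can be checked in isolation. This is a minimal sketch using a made-up pattern line and two made-up address values rather than the real files; the variable names with_sub, with_gsub and matches are only for illustration. sub() replaces the first # only, gsub() replaces every #, and a pattern with two trailing spaces matches only an address that really contains two spaces after PO.

Code:
```shell
# Hypothetical pattern line; the brackets make trailing spaces visible.
with_sub=$(printf 'PO##,y\n' | awk -F, '{sub(/#/," "); print "[" $1 "]"}')
with_gsub=$(printf 'PO##,y\n' | awk -F, '{gsub(/#/," "); print "[" $1 "]"}')
echo "sub:  $with_sub"
echo "gsub: $with_gsub"

# Test the two-space pattern "PO  " against one-space and two-space
# addresses, using the same leading-space trick as the one-liner above.
matches=$(printf 'PO BOX\nPO  BOX\n' |
    awk '{r = (" " $0 ~ " PO  ") ? "match" : "no match"; print $0 ": " r}')
echo "$matches"
```

With sub() the field becomes "PO #", with gsub() it becomes "PO" followed by two spaces, and only the two-space address reports a match.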

---------- Post updated at 12:40 ---------- Previous update was at 12:06 ----------

The active flag is always "y", no? Otherwise it does not get printed.
Code:
awk -F, 'NR==FNR{gsub(/#/," ");if($2=="y")A[$1];next}{$3="N,y"}FNR==1{$3="matchind,Activeflg"}$2=="A"{for(i in A)if(" "$1~" "i)$3="Y,y"}1' OFS=, file2 file1

# 10  
Old 12-13-2010

Yes, the Active flag will always be 'Y', but file2 has other columns (apart from Adrptrn and Activeflg) that have to be carried over to the output as-is for downstream processing. Is there a way to include these other columns from file2 in the output?

Many thanks.
# 11  
Old 12-13-2010
Yes, can you provide sample input files and the desired output?
# 12  
Old 12-13-2010

Here are the input files and expected output

file1
Code:
Adrfld,category
PO BOX,A
POST,A
avenue,A
business,X
bus terminus,A
PO  BOX,A

file2
Code:
Adrptrn,active,blankrule,cntry
ave,y,n,usa
PO#,y,y,usa
bus,y,n,usa
cross,y,n,usa
PO##,y,d,usa

output (file3); for unmatched records, active, blankrule and cntry are set to spaces

Code:
Adrfld,category,matchind,active,blankrule,cntry
PO BOX,A,y,y,y,usa
POST,A,N, , , 
avenue,A,y,y,n,usa
business,X,N, , , 
bus terminus,A,y,y,n,usa
PO  BOX,A,y,y,d,usa

Thanks in advance.

Moderator's comment: Please use code tags when posting data and code samples!

Last edited by vgersh99; 12-13-2010 at 09:19 AM.. Reason: code tags, please!
# 13  
Old 12-13-2010
Code:
awk -F, 'NR==FNR{gsub(/#/," ");if(FNR==1)H=$2FS$3FS$4;if($2=="y")A[$1]=$2FS$3FS$4;next}
         FNR==1 {$3="matchind,"H}
         $2=="A"{for(i in A)if(" "$1~" "i)$3="y,"A[i]}
         !$3    {$3="N, , , "}1' OFS=, file2 file1
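One caveat with the sample data from post #12: once each # is replaced by a space, both "PO " and "PO  " match "PO  BOX", and awk does not define the order in which for (i in A) visits the keys, so that row can come out with blankrule y or d. The following is a sketch, not Scrutinizer's code, that keeps the longest matching pattern instead. It reads the header names from the first line of file2 and writes spaces into the three extra columns for unmatched rows, as the expected output in post #12 asks. The names H, best and dir, and the scratch directory itself, are assumptions made for the example.

Code:
```shell
# Recreate the sample files from post #12 in a scratch directory.
dir=$(mktemp -d)
printf '%s\n' 'Adrfld,category' 'PO BOX,A' 'POST,A' 'avenue,A' \
    'business,X' 'bus terminus,A' 'PO  BOX,A' > "$dir/file1"
printf '%s\n' 'Adrptrn,active,blankrule,cntry' 'ave,y,n,usa' 'PO#,y,y,usa' \
    'bus,y,n,usa' 'cross,y,n,usa' 'PO##,y,d,usa' > "$dir/file2"

awk -F, -v OFS=, '
NR==FNR {                              # first file on the command line: file2
    gsub(/#/, " ")                     # each # stands for one space
    if (FNR == 1) H = $2 FS $3 FS $4   # header names of the extra columns
    else if ($2 == "y") A[$1] = $2 FS $3 FS $4
    next
}
FNR == 1 { print $0, "matchind", H; next }
{
    best = ""
    if ($2 == "A")
        for (i in A)                   # keep the longest pattern that matches
            if ((" " $1) ~ (" " i) && length(i) > length(best)) best = i
    if (best != "") print $0, "y", A[best]
    else            print $0, "N", " ", " ", " "
}' "$dir/file2" "$dir/file1" > "$dir/file3"
cat "$dir/file3"
```

Preferring the longest pattern is one possible tie-break rule; if the real rule is different (for example, first pattern in file order), the comparison on length(i) is the only part that needs to change.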

This User Gave Thanks to Scrutinizer For This Post:
# 14  
Old 12-14-2010

Awesome!! Working perfectly. Thank you very much for your help.