awk base lookup of best match strings


 
# 1  
Old 01-31-2017

Hi,

I'm new to scripting and can't find a way to do the task below. I'd appreciate help working out how to accomplish it.

File 1 contains numbers/strings that I need to look up in File 2, fetching the best match for each into the output. If there is no best match in File 2, NA should be populated.

Thanks in advance.

Input File1

Code:
555
555555
5558
5558888
55558
5555877
445
445555
6665
6665555
6665ttt

Input File2

Code:
555,1
5558,2
55558,3
445,a
6665,b

Output
Code:
555,555,1
555555,555,1
5558,5558,2
5558888,5558,2
55558,55558,3
5555877,55558,3
445,445,a
445555,445,a
6665,6665,b
6665555,6665,b
6665ttt,6665,b
66667,NA,NA

Moderator's Comments:
Mod Comment Please use CODE tags when displaying sample input, sample output, and code segments (as required by the forum rules you agreed to when you joined).

Last edited by Don Cragun; 01-31-2017 at 03:29 AM.. Reason: Add CODE tags.
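
One possible reading of the samples above is that "best match" means the longest File2 key that is a leading substring (prefix) of the File1 value. The poster has not confirmed this, so the following is only a minimal sketch under that assumption. The function name `best_match` and the argument order are made up for illustration.

```shell
# Assumption: "best match" = longest File2 key that is a prefix of the File1 value.
# best_match LOOKUP_FILE VALUES_FILE   (hypothetical helper, not from the thread)
best_match() {
    awk -F, -v OFS=, '
        NR == FNR { val[$1] = $2; next }   # first file: remember key -> value
        {
            hit = "NA"; res = "NA"
            # try the whole string first, then drop one trailing character at a time
            for (n = length($1); n > 0; n--) {
                p = substr($1, 1, n)
                if (p in val) { hit = p; res = val[p]; break }
            }
            print $1, hit, res
        }
    ' "$1" "$2"
}
```

Called as `best_match file2 file1`, this prints one `value,key,result` line per File1 entry. An entry with no matching prefix, such as the 66667 shown in the sample output, gets `NA,NA`.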
# 2  
Old 01-31-2017
Quote:
Originally Posted by suraj016
[...]
Is this a homework assignment? Homework and coursework questions can only be posted in the Homework & Coursework Questions forum under special homework rules.

If you did not post homework, please explain the company you work for and the nature of the problem you are working on. And, define what "fetch the best match" means. Clearly there is no match in File1 for any line in File2 (since there aren't any commas in File1 and there is a comma in every line in File2).
If you did post homework in the main forums, please review the guidelines for posting homework and repost.
# 3  
Old 01-31-2017
Hi Don Cragun,

I'm a telecom professional, and this is not for any homework or coursework. I need to reconcile two sets of comma-delimited data. I gave only a sample format earlier because the actual data has more than 100 fields.

I've also added remarks to the sample output file for reference.

Input Files
Code:
File A

Digits,Charged,Duration
555,60,60
555555,10,10
5558,6,3
5558888,12,6
55558,3,1
5555877,6,2
445,40,10
445555,44,11
6665,50,10
6665555,10,2
66665,20,10


File B

Digits,Rate,Pulse
555,1,1
5558,2,1
55558,3,1
445,4,1
6665,5,1


Output File

Digits,Charged,Duration,Digits_B,Rate_B,Pulse_B,Remarks
555,60,60,555,1,1,Digits 555 in FileA exactly matches with 555 in FileB
555555,10,10,555,1,1,Digits 555555 in FileA best matches with 555 in FileB
5558,6,3,5558,2,1,Digits 5558 in FileA exactly matches with 5558 in FileB
5558888,12,6,5558,2,1,Digits 5558888 in FileA best matches with 5558 in FileB
55558,3,1,55558,3,1,Digits 55558 in FileA exactly matches with 55558 in FileB
5555877,6,2,55558,3,1,Digits 5555877 in FileA best matches with 55558 in FileB
445,40,10,445,4,1,--exact match--
445555,44,11,445,4,1,----best match---
6665,50,10,6665,5,1,--exact match--
6665555,10,2,6665,5,1,----best match---
66665,20,10,NA,NA,NA,Not present in File B
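
The sample output above is consistent with a longest-prefix lookup on the Digits column, although the thread never states that rule explicitly. The sketch below assumes it, and also assumes that each file is a plain CSV whose first line is the header (without the "File A" / "File B" labels or blank lines). The function name `reconcile` is hypothetical.

```shell
# Assumption: join File A to File B on the longest File B "Digits" value
# that is a prefix of the File A "Digits" value.
# reconcile FILE_B FILE_A   (hypothetical helper, not from the thread)
reconcile() {
    awk -F, -v OFS=, '
        NR == FNR {
            if (FNR == 1) {                    # File B header: suffix each name with _B
                for (i = 1; i <= NF; i++) hdr = hdr OFS $i "_B"
                nfb = NF
            } else
                row[$1] = $0                   # key on Digits, keep the whole line
            next
        }
        FNR == 1 { print $0 hdr, "Remarks"; next }   # File A header
        {
            for (n = length($1); n > 0; n--) { # longest candidate prefix first
                p = substr($1, 1, n)
                if (p in row) break
            }
            if (n > 0) {
                how = (n == length($1)) ? "exactly" : "best"
                print $0, row[p], "Digits " $1 " in FileA " how " matches with " p " in FileB"
            } else {
                na = ""                        # one NA per File B column
                for (i = 1; i <= nfb; i++) na = na OFS "NA"
                print $0 na, "Not present in File B"
            }
        }
    ' "$1" "$2"
}
```

Because each File B line is stored whole, the same sketch should carry over to files with many more fields, as long as the key stays in the first column.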

# 4  
Old 02-01-2017
What have you tried to solve this on your own?

I repeat:
Quote:
And, define what "fetch the best match" means.
Why is there not supposed to be a match for the last line in File B in post #3? The 6665 in the 1st field on that line is a subset of the 1st field of the last line in File A and the 3rd field on both of those lines are both 1. Why isn't that an exact match on the 3rd field and a "best match" on the 1st field??? Why isn't the 5 in the 2nd field of the last line in File B a "best match" for every line except the header line in File A (each of which contains at least one 5 in the 1st field)?

Until we have a clear English description of what you are trying to do, making wild guesses from two inconsistent examples wastes a lot of time that could instead go toward teaching you what you need to know to do it yourself.
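
The ambiguity raised here can be made concrete: 6665 occurs inside 66665, but not at its start, so the answer depends on whether "match" means prefix or substring. The sketch below only classifies a key/value pair under those two readings; it does not settle which one the poster intends. The function name `match_kind` is made up for illustration.

```shell
# match_kind KEY VALUE   (hypothetical helper, not from the thread)
# Reports how KEY relates to VALUE: as a prefix, as a substring elsewhere, or not at all.
match_kind() {
    awk -v key="$1" -v val="$2" 'BEGIN {
        pos = index(val, key)          # 1-based position of key inside val, 0 if absent
        if (pos == 1)     print "prefix"
        else if (pos > 1) print "substring"
        else              print "none"
    }'
}
```

For example, `match_kind 6665 6665555` reports a prefix, while `match_kind 6665 66665` reports only a substring. That is the distinction the expected output in post #3 appears to rely on.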