Better way to match a list in awk


 
# 1  
Old 07-18-2014

Suppose I have a list of strings in a file called stringlist...

string1
string2
...
stringn

Suppose also that I have another file, or stdin, or whatever, and I want to use awk to see if some field in each record matches any string in stringlist. What I've been doing is using each string on the list as an index of an array isOnList, and the value of the array element is a unique string, like YUPPERS. This is in my BEGIN block:

Code:
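# Compare with > 0: getline returns -1 on error, which a bare test treats as true.
while (("cat stringlist" | getline) > 0)
     {
     isOnList[$1] = "YUPPERS"
     }
close("cat stringlist")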

Suppose I'm interested in finding out whether, say, field 7 matches anything on the list. I have the list read into the array, so to do the testing in the body of my script, I simply do something like this:

Code:
isOnList[$7] == "YUPPERS"   {
     do some stuff...
     }

It works, and it seems reasonably efficient, but I wonder if there's a faster, less expensive way to do this. Looping through the array on each line can be pretty costly, I imagine. How would you all go about solving this sort of problem? Thanks.
# 2  
Old 07-18-2014
Awk arrays are supposed to be efficient when you use them like this. They're not straight arrays, they're associative arrays built on some sort of hash or tree structure (implementation dependent), so looking up isOnList[$7] doesn't loop through every element.

You can get rid of the cat in BEGIN though -- in fact you can get rid of the BEGIN, just check for NR==FNR (line number in file equals total line number), which will only be true inside the first file awk reads...

And instead of checking for YUPPERS, test "X in A" -- this is more efficient because merely referencing A[X], as the comparison does, creates a new empty element for every key that isn't already in the array, while "in" creates nothing.

Code:
awk 'NR==FNR { A[$1]=1 ; next }
$7 in A { some stuff }' stringlist datafile
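To try the idiom end to end, here is a minimal sketch. The sample files, their contents, and the print action are made up purely for illustration (the original just says "some stuff"); only the NR==FNR / "in" pattern comes from the post above.

```shell
# Hypothetical sample data, created only for this demonstration.
tmp=$(mktemp -d)
printf '%s\n' apple cherry > "$tmp/stringlist"
printf '%s\n' \
  'a b c d e f apple 1' \
  'a b c d e f banana 2' \
  'a b c d e f cherry 3' > "$tmp/datafile"

# First file (NR==FNR): store each string as an array key, then skip to the next line.
# Second file: act only on records whose 7th field is one of those keys.
out=$(awk 'NR==FNR { A[$1]=1 ; next }
$7 in A { print $8, $7 }' "$tmp/stringlist" "$tmp/datafile")

printf '%s\n' "$out"
rm -rf "$tmp"
```

Only the apple and cherry records should come through. One thing to watch: if the list file is empty, NR==FNR stays true while awk reads the second file, so the data gets treated as the list.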

This User Gave Thanks to Corona688 For This Post:
# 3  
Old 07-18-2014
And for the same reason, A[$1] alone defines an (empty) element; there's no need to assign it a value with =1.
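If you want to see the side effect for yourself, here's a rough sketch that counts the array elements left behind by each approach. The three-line input (x, y, z) and the variable names are invented for the demo, and the array starts out empty in every case.

```shell
data=$(printf '%s\n' x y z)

# Comparing A[$1] to a value references the element, which creates it.
by_ref=$(printf '%s\n' "$data" | awk '
  A[$1] == "YUPPERS" { hits++ }
  END { n = 0; for (k in A) n++; print n }')

# The "in" operator only tests whether the key exists; nothing is created.
by_in=$(printf '%s\n' "$data" | awk '
  $1 in A { hits++ }
  END { n = 0; for (k in A) n++; print n }')

# A bare reference with no assignment is enough to define the key.
bare=$(printf '%s\n' "$data" | awk '
  { A[$1] }
  END { n = 0; for (k in A) n++; print n }')

echo "elements after A[\$1] == \"YUPPERS\": $by_ref"
echo "elements after \$1 in A: $by_in"
echo "elements after bare A[\$1]: $bare"
```

The comparison and the bare reference each leave one element per distinct input key, while the "in" test leaves the array empty. On a big data file with many non-matching values, that's a lot of memory spent on keys you never wanted.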
This User Gave Thanks to MadeInGermany For This Post:
# 4  
Old 07-18-2014
Hmm... say, that is pretty nice. And it's straightforward and easy. I like the use of FNR == NR. Thank you very much.