Using awk to read one file and search in another file


 
# 1  
Old 11-26-2012

Hi Forum.

I searched Google for what I'm trying to do, but I cannot get my code to work correctly. I have two very large files, and I want to read the keys from file1 and search for them in file2; if a key is present, keep the matching records from file2.

I've tried fgrep -f file1 file2 but it is too slow.
Code:
File1:
00000014|

File2:
00000014|XSELL_ASP||Y|
00000014|XSELL_RSP||Y|
00000014|XSELL_ISA||Y|
00000014|XSELL_THRIVE||Y|
00000132|XSELL_ISA||Y|
00000132|XSELL_RSP|0.0014960810|N|
00000132|XSELL_THRIVE|0.0078523404|N|

Output File3 should be:
Code:
00000014|XSELL_ASP||Y|
00000014|XSELL_RSP||Y|
00000014|XSELL_ISA||Y|
00000014|XSELL_THRIVE||Y|

Here's my code so far (based on examples found on forum):

Code:
awk -F"|" '
  FNR==NR {f1[$1];next}
  (($1 SUBSEP) in f1)
' file1 file2

Can someone help out?

Thanks.

Last edited by vbe; 11-26-2012 at 12:52 PM.. Reason: completed code tags
# 2  
Old 11-26-2012
Try this (and don't forget the code tags in your posts):
Code:
awk -F\| 'NR==FNR{a[$1]++;next}a[$1]'  file1 file2

# 3  
Old 11-26-2012
Can you try this?
Code:
awk -F\| 'NR == FNR {a1[$1]; next} $1 in a1' file1 file2

# 4  
Old 11-26-2012
PLEASE use code tags.

A small modification to your own code will make it do what you expect:
Code:
awk -F"|" '
           FNR==NR {f1[$1];next}
           ($1 in f1)
          ' file1 file2

However, I don't think it will be much faster than grep. Please report back the time differences!
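For anyone wondering why the original test never matched: `($1 SUBSEP)` concatenates the first field with awk's SUBSEP character ("\034" by default), so the key being looked up is never equal to the plain `$1` that was stored while reading file1. Here is a minimal sketch that counts the matches for both tests, assuming a scratch directory from mktemp and a trimmed copy of the sample data; the counter names with_subsep and plain are made up for this demo.

```shell
# Scratch copies of the sample data (trimmed to two records in file2).
tmp=$(mktemp -d)
printf '%s\n' '00000014|' > "$tmp/file1"
printf '%s\n' '00000014|XSELL_ASP||Y|' '00000132|XSELL_ISA||Y|' > "$tmp/file2"

counts=$(awk -F'|' '
  FNR == NR { f1[$1]; next }               # file1: store the plain key
  (($1 SUBSEP) in f1) { with_subsep++ }    # looks up key followed by \034
  ($1 in f1)          { plain++ }          # looks up the plain key
  END { print with_subsep + 0, plain + 0 } # +0 forces a numeric 0 if unset
' "$tmp/file1" "$tmp/file2")

echo "$counts"   # prints: 0 1
```

The first number is how many file2 records the original test selects, the second how many the corrected test selects.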
# 5  
Old 11-26-2012
Quote:
Originally Posted by Klashxx
Try this (and don't forget the code tags in your posts):
Code:
awk -F\| 'NR==FNR{a[$1]++;next}a[$1]'  file1 file2

Wow - it's blazing FAST!!!!

Done within seconds.

Can you explain the code a little bit?

I can understand that file1 is being stored in an array but I cannot understand the rest.

Thanks.

---------- Post updated at 11:44 AM ---------- Previous update was at 11:43 AM ----------

Quote:
Originally Posted by RudiC
PLEASE use code tags.

A small modification to your own code will make it do what you expect:
Code:
awk -F"|" '
           FNR==NR {f1[$1];next}
           ($1 in f1)
          ' file1 file2

However, I don't think it will be much faster than grep. Please report back the time differences!
Not even comparable - fgrep was very slow but awk came back within seconds.
# 6  
Old 11-26-2012
OK. While awk reads the first file, a[$1]++ sets a flag in an array for each value of the first field. In fact we could just use a[$1]=1, because we only need to know that the value was seen (the amount of memory/resources needed is very low). Then we use next to stop processing the current record and go on to the next one.


The rest of the flow is quite simple: when NR!=FNR (second file), awk skips the first block and goes straight to the a[$1] pattern. If the flag is set (>= 1), the pattern is true and, since no action is given, awk prints the whole line.
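The same one-liner, spread over several lines with comments, might look like the sketch below. It assumes the sample data from the first post is recreated in a scratch directory; the variable names tmp and result are only for illustration.

```shell
# Recreate the sample data from the first post in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' '00000014|' > "$tmp/file1"
printf '%s\n' \
  '00000014|XSELL_ASP||Y|' \
  '00000014|XSELL_RSP||Y|' \
  '00000014|XSELL_ISA||Y|' \
  '00000014|XSELL_THRIVE||Y|' \
  '00000132|XSELL_ISA||Y|' \
  '00000132|XSELL_RSP|0.0014960810|N|' \
  '00000132|XSELL_THRIVE|0.0078523404|N|' > "$tmp/file2"

result=$(awk -F'|' '
  # NR counts records across all files, FNR restarts at 1 for each file,
  # so NR == FNR holds only while the first file is being read.
  NR == FNR { a[$1] = 1; next }   # flag the key, skip the pattern below
  # Only records of the second file get here. A pattern with no action
  # prints the record when the pattern is true (flag is 1).
  a[$1]
' "$tmp/file1" "$tmp/file2")

printf '%s\n' "$result"   # the four 00000014 records
```

One thing to keep in mind: merely referencing a[$1] for a key that was never flagged creates an empty element for it, so on a very large file2 the array keeps growing. Testing with `$1 in a` instead selects the same records without creating new elements.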

Hope this helps.

Regards.
# 7  
Old 11-26-2012
Quote:
Originally Posted by pchang
. . .
Not even comparable - fgrep was very slow but awk came back within seconds.
Yes, fgrep is slower, but don't forget the influence of I/O buffering when comparing the two. Also, please consider using grep in lieu of fgrep. I ran a small test on a somewhat bigger file, sending stdout to /dev/null to take it out of the measurement and allowing for the effect of I/O buffering:
Code:
$ time grep -f file1 file2 >/dev/null
real    0m0.022s
user    0m0.008s
sys     0m0.012s
$ time fgrep -f file1 file2 >/dev/null
real    0m0.092s
user    0m0.088s
sys     0m0.004s
$ time awk -F"|" '
FNR==NR {f1[$1];next}
($1 in f1)
' file1 file2 >/dev/null
real    0m0.090s
user    0m0.084s
sys     0m0.004s
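Before reading too much into timings like these, it may be worth checking that the three commands really select the same records on your data. The sketch below does that on generated test data; the file sizes, the record layout and the out.* file names are assumptions made for the example, not the files used in the test above, and `grep -F` stands in for fgrep.

```shell
tmp=$(mktemp -d)
# Hypothetical test data: 20000 records in file2, every 100th key in file1.
awk 'BEGIN { for (i = 1; i <= 20000; i++) printf "%08d|XSELL_ISA||Y|\n", i }' > "$tmp/file2"
awk 'BEGIN { for (i = 1; i <= 20000; i += 100) printf "%08d|\n", i }' > "$tmp/file1"

# Run the three candidates and keep their output for comparison.
grep -f "$tmp/file1" "$tmp/file2" > "$tmp/out.grep"
grep -F -f "$tmp/file1" "$tmp/file2" > "$tmp/out.fgrep"
awk -F'|' 'FNR == NR { f1[$1]; next } ($1 in f1)' "$tmp/file1" "$tmp/file2" > "$tmp/out.awk"

# cmp is silent and returns 0 when two files are identical.
cmp "$tmp/out.grep" "$tmp/out.awk" && cmp "$tmp/out.fgrep" "$tmp/out.awk" && echo "same output"
wc -l < "$tmp/out.awk"   # number of selected records
```

To time them, prefix each of the three commands with time and run them more than once, so that file caching affects all of them equally. Note that both grep variants match the pattern anywhere in the line, while the awk version compares only the first field; they agree here only because the key cannot occur elsewhere in these records.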
