Using awk to read one file and search in another file | Unix Linux Forums | Shell Programming and Scripting

#1  pchang, 11-26-2012

Hi Forum.

I searched Google for what I'm trying to do, but I cannot get my code to work correctly. I have 2 very large files, and I want to read each key from file1 and look it up in file2; if the key is present, keep the matching records from file2.

I've tried fgrep -f file1 file2 but it is too slow.

Code:
File1:
00000014|

File2:
00000014|XSELL_ASP||Y|
00000014|XSELL_RSP||Y|
00000014|XSELL_ISA||Y|
00000014|XSELL_THRIVE||Y|
00000132|XSELL_ISA||Y|
00000132|XSELL_RSP|0.0014960810|N|
00000132|XSELL_THRIVE|0.0078523404|N|

Output File3 should be:
Code:
00000014|XSELL_ASP||Y|
00000014|XSELL_RSP||Y|
00000014|XSELL_ISA||Y|
00000014|XSELL_THRIVE||Y|

Here's my code so far (based on examples found on forum):


Code:
awk -F"|" '
  FNR==NR {f1[$1];next}
  (($1 SUBSEP) in f1)
' file1 file2

Can someone help out?

Thanks.

Last edited by vbe; 11-26-2012 at 11:52 AM.. Reason: completed code tags
#2  Klashxx, 11-26-2012
Try this (and don't forget the code tags in your posts):

Code:
awk -F\| 'NR==FNR{a[$1]++;next}a[$1]'  file1 file2

#3  clx, 11-26-2012
Can you try this?

Code:
awk -F\| 'NR == FNR {a1[$1]; next} $1 in a1' file1 file2

#4  RudiC, 11-26-2012
PLEASE use code tags.

A small modification to your own code will make it do what you expect:
Code:
awk -F"|" '
           FNR==NR {f1[$1];next}
           ($1 in f1)
          ' file1 file2

However, I don't think it will be much faster than grep. Please report back the time differences!
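To see why the original test never matched, here is a minimal sketch using two records from the sample data. The temporary directory and the variable names `broken` and `fixed` are only for illustration. The first pass stores the key as `f1[$1]`, but the original lookup asks for `$1 SUBSEP`, which is a different string (the key with the subscript separator appended), so it is never found.

```shell
#!/bin/sh
# Sketch only: temp files and variable names are assumptions for this demo.
tmp=$(mktemp -d)
printf '%s\n' '00000014|' > "$tmp/file1"
printf '%s\n' '00000014|XSELL_ASP||Y|' '00000132|XSELL_ISA||Y|' > "$tmp/file2"

# Original test: looks up "00000014" followed by SUBSEP, which was never stored.
broken=$(awk -F'|' 'FNR==NR {f1[$1]; next} (($1 SUBSEP) in f1)' "$tmp/file1" "$tmp/file2")

# Corrected test: looks up exactly the key that was stored in the first pass.
fixed=$(awk -F'|' 'FNR==NR {f1[$1]; next} ($1 in f1)' "$tmp/file1" "$tmp/file2")

printf 'broken: [%s]\nfixed:  [%s]\n' "$broken" "$fixed"
rm -rf "$tmp"
```

SUBSEP would only be needed if the array had been filled with a two-part subscript such as `f1[$1,$2]`.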
#5  pchang, 11-26-2012
Quote:
Originally Posted by Klashxx
Try this (and don't forget the code tags in your posts):

Code:
awk -F\| 'NR==FNR{a[$1]++;next}a[$1]'  file1 file2

Wow - it's blazing FAST!!!!

Done within seconds.

Can you explain the code a little bit?

I can understand that file1 is being stored in an array but I cannot understand the rest.

Thanks.

---------- Post updated at 11:44 AM ---------- Previous update was at 11:43 AM ----------

Quote:
Originally Posted by RudiC
PLEASE use code tags.

A small modification to your own code will make it do what you expect:
Code:
awk -F"|" '
           FNR==NR {f1[$1];next}
           ($1 in f1)
          ' file1 file2

However, I don't think it will be much faster than grep. Please report back the time differences!
Not even comparable - fgrep was very slow but awk came back within seconds.
#6  Klashxx, 11-26-2012
OK. While awk reads the first file, a[$1]++ sets a flag in an array for each value of the first field. In fact a[$1]=1 would do just as well, because we only need to know that the value was seen (the memory needed is very low). Then next stops processing the current record and moves on to the next one.

The rest of the flow is quite simple: when NR!=FNR (second file), awk goes straight to the a[$1] pattern. If the flag is set (>= 1), the pattern is true and awk prints the whole line.
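The same one-liner can be spread over several lines with comments, as a sketch. The sample records are taken from the first post; the temporary directory and the `result` variable are assumptions added for the demo.

```shell
#!/bin/sh
# Sketch only: temp files and the "result" variable are assumptions for this demo.
tmp=$(mktemp -d)
printf '%s\n' '00000014|' > "$tmp/file1"
printf '%s\n' \
  '00000014|XSELL_ASP||Y|' \
  '00000014|XSELL_RSP||Y|' \
  '00000132|XSELL_ISA||Y|' > "$tmp/file2"

result=$(awk -F'|' '
  # NR counts records across all input files, FNR restarts at 1 for each file,
  # so NR == FNR is true only while the first file is being read.
  NR == FNR { a[$1]++; next }   # flag the key, then skip to the next record

  # Reached only for records of the second file. A pattern without an action
  # prints the record when the pattern is true (flag is non-zero).
  a[$1]
' "$tmp/file1" "$tmp/file2")

printf '%s\n' "$result"
rm -rf "$tmp"
```

One design note: evaluating a[$1] for a key that was never stored creates an empty array element for it, whereas the `$1 in a` form used elsewhere in this thread tests membership without creating anything. With many distinct keys in file2, the `in` form should use less memory.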

Hope this helps.

Regards.
#7  RudiC, 11-26-2012
Quote:
Originally Posted by pchang
. . .
Not even comparable - fgrep was very slow but awk came back within seconds.
Yes, fgrep is slower, but don't forget the influence of I/O buffering when comparing the two. Please also consider using grep instead of fgrep. I ran a small test on a somewhat bigger file, sending stdout to /dev/null and taking I/O buffering into account:
Code:
$ time grep -f file1 file2 >/dev/null
real    0m0.022s
user    0m0.008s
sys     0m0.012s
$ time fgrep -f file1 file2 >/dev/null
real    0m0.092s
user    0m0.088s
sys     0m0.004s
$ time awk -F"|" '
FNR==NR {f1[$1];next}
($1 in f1)
' file1 file2 >/dev/null
real    0m0.090s
user    0m0.084s
sys     0m0.004s
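One caveat when comparing results as well as timings: grep -f looks for each pattern anywhere in the line, while the awk version compares only the first field. The sketch below uses a made-up key (14960810) chosen so that it appears inside the third field of one sample record; the temporary files and variable names are likewise assumptions.

```shell
#!/bin/sh
# Sketch only: the key 14960810 is hypothetical, picked to show the difference.
tmp=$(mktemp -d)
printf '%s\n' '14960810|' > "$tmp/file1"
printf '%s\n' '00000132|XSELL_RSP|0.0014960810|N|' > "$tmp/file2"

# Fixed-string grep finds "14960810|" inside the third field, so the line matches.
grep_count=$(grep -c -F -f "$tmp/file1" "$tmp/file2")

# awk compares the whole first field (00000132) with the stored key, so no match.
awk_count=$(awk -F'|' 'FNR==NR {f1[$1]; next} $1 in f1 {n++} END {print n+0}' \
  "$tmp/file1" "$tmp/file2")

echo "grep matches: $grep_count, awk matches: $awk_count"
rm -rf "$tmp"
```

If the keys can also occur in other fields, the two commands are not interchangeable, whatever their speed.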
