Grep Line with Matching Fields


 
# 1  
Old 08-09-2007

Below is the scenario. Help is appreciated.

File1 (500,000 lines): three comma-delimited fields, not sorted

1234FAA,435612,88975
1224FAB,12345,212356


File2 (4,000,000 lines): six comma-delimited fields (the last 3 fields should match the 3 fields of File1), not sorted

0123456abcd,12345,abcdef,1234FAA,435612,88975
0123456wxyz,11234,lmnopq,1224FAB,12345,212356

I need to grab all six fields from file2 whenever the 3 fields of a file1 line match the last 3 fields of a file2 line.

I wrote a small script, but it seems like it might take days to complete :-)

#!/bin/ksh

# Scans all of file2 once per line of file1, i.e. 500,000 full passes.
while IFS= read -r record
do
    grep -- "$record" file2 >> final.list
done < file1

Can someone help me with a faster solution?

Thanks in advance.
# 2  
Old 08-09-2007
Code:
egrep -f file1 file2
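A possible variation on the one-liner above, offered as a sketch rather than a measured result: the keys in file1 are literal strings, so `grep -F` (fixed-string matching) avoids compiling 500,000 lines as regular expressions, which is usually cheaper. The temporary directory and the third, non-matching file2 line are made up for illustration.

```shell
# Throwaway sample data mirroring the two-line example from post #1.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' '1234FAA,435612,88975' '1224FAB,12345,212356' > file1
printf '%s\n' \
  '0123456abcd,12345,abcdef,1234FAA,435612,88975' \
  '0123456wxyz,11234,lmnopq,1224FAB,12345,212356' \
  '0123456none,11234,lmnopq,9999ZZZ,12345,212356' > file2

# -F        treat every pattern as a literal string, not a regex
# -f file1  read the patterns, one per line, from file1
grep -F -f file1 file2 > final.list

cat final.list
```

Like the egrep version, this matches a key anywhere in the line, so a key that happens to appear in the first three fields of a file2 record would also be reported.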

# 3  
Old 08-09-2007
nawk -F',' -f hem.awk file1 file2

hem.awk:
Code:
FNR==NR { f1[$0]; next}
( $(NF-2) FS $(NF-1) FS $(NF) ) in f1
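For anyone wondering what the two lines of hem.awk do, here is a commented sketch run against made-up sample data (plain `awk` is used here on the assumption that nawk is not installed; the temporary directory and the last file2 line are invented for the demo). The point is that file1 is read once into an in-memory array and file2 is read once, instead of file2 being rescanned for every line of file1.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' '1234FAA,435612,88975' '1224FAB,12345,212356' > file1
printf '%s\n' \
  '0123456abcd,12345,abcdef,1234FAA,435612,88975' \
  '0123456wxyz,11234,lmnopq,1224FAB,12345,212356' \
  '1234FAA,435612,88975,x,y,z' > file2

cat > hem.awk <<'EOF'
# While reading the first file (file1), FNR (per-file line number) equals
# NR (overall line number). Store the whole line as an array key, then
# skip to the next record.
FNR == NR { f1[$0]; next }

# For file2: glue the last three fields back together with the field
# separator and print the record if that string is a key in f1.
($(NF-2) FS $(NF-1) FS $NF) in f1
EOF

awk -F',' -f hem.awk file1 file2 > final.list

cat final.list
```

The third file2 line contains a file1 key in its first three fields only, so it is not printed; only the last three fields are compared.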

# 4  
Old 08-09-2007
vgersh99,

Linux box does not have nawk.
Has awk and gawk.


# 5  
Old 08-09-2007
Quote:
Originally Posted by hemangjani
vgersh99,

Linux box does not have nawk.
Has awk and gawk.


'gawk' is your new friend.
# 6  
Old 08-09-2007
vgersh99,

The following ran for about 45-50 minutes and completed, but the output file was empty.
Could you please give me more insight on what is happening below?

Thanks

gawk -F',' -f hem.awk file1 file2 > final.list

hem.awk:
FNR==NR { f1[$0]; next}
( $(NF-2) FS $(NF-1) FS $(NF) ) in file1

ShellLife,

The egrep command ran for over an hour before I killed it.
I have kicked it off again to see how long it runs.

Thanks
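One thing worth checking, assuming the script was run exactly as pasted in this post: the first rule stores its keys in the array `f1`, but the second rule tests `in file1`. Inside the awk program, `file1` is just another variable name (it has nothing to do with the input file), so it is treated as a separate, empty array and the test is never true. A small sketch with made-up sample files that compares the two spellings:

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' '1234FAA,435612,88975' '1224FAB,12345,212356' > file1
printf '%s\n' \
  '0123456abcd,12345,abcdef,1234FAA,435612,88975' \
  '0123456wxyz,11234,lmnopq,1224FAB,12345,212356' > file2

# As pasted in post #6: the lookup uses "file1", an array nothing was stored in.
awk -F',' 'FNR==NR { f1[$0]; next }
( $(NF-2) FS $(NF-1) FS $(NF) ) in file1' file1 file2 > as_pasted.list

# As proposed in post #3: the lookup uses "f1", the array that was filled.
awk -F',' 'FNR==NR { f1[$0]; next }
( $(NF-2) FS $(NF-1) FS $(NF) ) in f1' file1 file2 > as_proposed.list

# Line counts of the two result files.
wc -l < as_pasted.list
wc -l < as_proposed.list
```

With this sample data the first list comes out empty and the second holds both file2 records. If the real hem.awk already says `in f1`, this is not the cause and the input data is the next thing to look at.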
# 7  
Old 08-09-2007
hemangjani,
based on your sample input files and the proposed awk script, the output came out as expected. I've also modified 'file2' to add records that do not match anything in file1, and the result was as expected.

I believe your actual files are not the same as the ones you've quoted above: there might be inconsistent spaces/tabs between the fields and/or some other anomalies you're not paying attention to that result in the 'empty' output.
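The whitespace theory above can be checked without posting the files. The sketch below is only an illustration: the sample data deliberately contains a DOS line ending and stray blanks, and `norm.awk` with its `clean()` function is a hypothetical helper, not something from the thread. `sed -n l` prints lines unambiguously, so a carriage return shows up as `\r` and trailing blanks become visible before the `$` end-of-line marker.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# file1 with a CR/LF line ending on line 1 and stray blanks on line 2.
printf '1234FAA,435612,88975\r\n1224FAB, 12345,212356 \n' > file1
printf '%s\n' \
  '0123456abcd,12345,abcdef,1234FAA,435612,88975' \
  '0123456wxyz,11234,lmnopq,1224FAB,12345,212356' \
  '0123456none,11234,lmnopq,9999ZZZ,12345,212356' > file2

# Inspect the first few lines for hidden characters.
sed -n l file1 | head -5

cat > norm.awk <<'EOF'
# Hypothetical helper: remove carriage returns, blanks around commas,
# and leading/trailing blanks before any comparison is made.
function clean(s) {
    gsub(/\r/, "", s)
    gsub(/[ \t]*,[ \t]*/, ",", s)
    gsub(/^[ \t]+|[ \t]+$/, "", s)
    return s
}

# file1: store the cleaned line as the key.
FNR == NR { f1[clean($0)]; next }

# file2: clean the line, split it on commas, compare the last three fields.
{
    line = clean($0)
    n = split(line, f, ",")
    if (n >= 3 && (f[n-2] "," f[n-1] "," f[n]) in f1)
        print line
}
EOF

awk -f norm.awk file1 file2 > final.list

cat final.list
```

If the normalized run finds matches where the plain script found none, the difference is in the data rather than in the awk logic.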

I'd suggest copy/pasting part of the content of file1 and file2 here using the vB code tags so that the formatting does not 'get lost in translation' [pun intended].