Look up between 2 files and print matching lines


 
# 1  
Old 02-02-2012

Hi,

I have 2 large log files in .gz format.

file 1 contains

abcde
12345
23456
.
.
.
.
.
.
.
.
09123


file 2
abcde,1,2,3,4,5,6,7
09123,3,4,5,6,7,7,8
23456,9,6,5,4,3,2,1
....
...
...
...

I am looking for a script that reads file1 line by line, checks whether each string occurs in file2, and redirects the matching file2 lines to a third file, file3. The match should not depend on the position at which the string occurs in file2.

Since I have limited disk space, can the script work on the files in compressed format only?

A Perl script would be preferable, as I find Perl considerably faster than awk.
# 2  
Old 02-02-2012
Not Perl, but give it a try:
Code:
gunzip -c file1.dat.gz | while IFS= read -r pat
do
   # -F treats each line of file1 as a fixed string, not as a regex
   gunzip -c file2.dat.gz | grep -F -e "$pat"
done > file3

If you want to compress the destination file:
Code:
gunzip -c file1.dat.gz | while IFS= read -r pat
do
   # -F treats each line of file1 as a fixed string, not as a regex
   gunzip -c file2.dat.gz | grep -F -e "$pat"
done | gzip -c -9 > file3.dat.gz


Last edited by Klashxx; 02-02-2012 at 05:53 AM..
# 3  
Old 02-02-2012
Thanks, but file 1 has 10 million lines and file 2 has 60 million lines. Can the above script run any faster?
# 4  
Old 02-02-2012
Yep, it is always going to be slow (because of the requirements): 10 M patterns means 10 M full scans of file2, each reading 60 M lines, which is about 6 x 10^14 line comparisons ...
# 5  
Old 02-02-2012
Actually, it can get considerably faster. The above solution uses a shell loop, which is very slow. Have a look at this very similar thread:
https://www.unix.com/shell-programmin...hing-perl.html
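
For reference, here is a minimal single-pass sketch (not necessarily what the linked thread does). It lets grep load every line of file1 as a fixed string and then scans file2 only once. It assumes bash (for the `<( )` process substitution), that all patterns fit in memory, and the file names from post #2. The function name lookup_gz is made up for illustration.

```shell
#!/bin/bash
# Sketch: scan file2 once instead of once per pattern.
# lookup_gz is a hypothetical helper: $1 = gzipped pattern file, $2 = gzipped data file.
lookup_gz() {
    # -F: treat every pattern as a fixed string, not as a regex
    # -f: read the patterns, one per line, from the decompressed stream
    # grep -v '^$' drops blank pattern lines, which would otherwise match every line
    gunzip -c "$2" | grep -F -f <(gunzip -c "$1" | grep -v '^$')
}

# Example call with the file names used earlier in this thread:
#   lookup_gz file1.dat.gz file2.dat.gz | gzip -c > file3.dat.gz
```

Nothing is decompressed to disk here, but grep has to hold all 10 million patterns in memory, so it is worth trying on a sample first.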
# 6  
Old 02-02-2012
Yes, if you have plenty of memory.
# 7  
Old 02-02-2012
True, memory can become an issue. But I still think it can be done faster by using a double loop in a scripting/programming language rather than in the shell. My Perl skills are not very developed, but you can try this out:

Code:
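#!/usr/bin/perl

use strict;
use warnings;

# Each line read from standard input (the decompressed file1) is one search string.
while (my $pattern = <>) {
    chomp $pattern;

    # Re-read the compressed file2 through a pipe for every search string.
    open(my $file2, "-|", "gunzip -c file2.gz") or die $!;
    while (my $line = <$file2>) {
        # index() does a literal substring search, so regex metacharacters in file1 are harmless.
        print $line if index($line, $pattern) >= 0;
    }
    close $file2;
}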

Save it as grep2.pl, make it executable:
Code:
chmod u+x grep2.pl

and run it as:
Code:
gunzip -c file1.gz | ./grep2.pl

or
Code:
gunzip -c file1.gz | ./grep2.pl | gzip -c > output.gz

once you have tested it and want the output gzipped.
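
If it turns out that the string to look up is always the first comma-separated field of file2, as in the sample data (this is an assumption, as the original post asks for a match at any position), an exact hash lookup avoids rescanning file2 altogether. The sketch below uses awk from bash; the function name lookup_first_field is made up for illustration, and the keys of file1 still have to fit in memory.

```shell
#!/bin/bash
# Hypothetical helper: $1 = gzipped key file (file1), $2 = gzipped CSV file (file2).
lookup_first_field() {
    # First input (decompressed file1): store every line as a hash key.
    # Second input ("-", the decompressed file2 on stdin): print the lines
    # whose first comma-separated field is one of the stored keys.
    gunzip -c "$2" | awk -F, 'NR == FNR { keys[$0]; next } $1 in keys' <(gunzip -c "$1") -
}

# Example call with the file names used in this post:
#   lookup_first_field file1.gz file2.gz | gzip -c > output.gz
```

Each file is read exactly once, and the compressed inputs are never unpacked to disk.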