Compare multiple fields in file1 to file2 and print line and next line


 
# 1  
Old 03-13-2009

Hello,

I have two files that I need to compare, printing each line from file2 whose first 6 fields match the first 6 fields of a line in file1. Complicating this are the following restrictions:

1. file1 is only a few thousand lines at most and file2 is more than 2 million lines
2. I need to match the first 6 fields (in order) of each line in file1 to the first 6 fields (in order) in a line in file2 and print the matched line from file2 along with the next line in file2.

Example files

file1:

...
0.54 3.2 0.45 32.9 4 0.02 9.0 4.0 (line 364)
0.6 4.0 3.99 2.0 0.85 7.0 3.84 0.05 (line 365)
...

file2:

93 28 04 73 95 11 0.4 7.9 2.30 4.05 (100(f18.3)) (line 30046)
70.1 99.4 0.35 9.943 6.1 0.27 0.654 (line 30047)
0.54 3.2 0.45 32.9 4 0.02 9.0 4.0 (54(f18.3) (line 628450)
44.8 33.2 90.3 45.2 66.3 (line 628451)

The needed result matches line 364 of file1 to line 628450 of file2 and prints lines 628450 and 628451. It then moves on to line 365 of file1 and searches file2 for a match, printing the matching line and the line after it.

Example partial output matching file1 with file2

0.54 3.2 0.45 32.9 4 0.02 9.0 4.0 (54(f18.3)
44.8 33.2 90.3 45.2 66.3

I don't really care what I use, awk, sed, perl, etc. I just need it to work.

Hopefully this makes sense.

Thanks

Chris
# 2  
Old 03-13-2009
Quote:
Originally Posted by gillesc_mac
I have two files that I need to compare and print out the line from file2 that has the first 6 fields matching the first 6 fields in file1. Complicating this are the following restrictions

1. file1 is only a few thousand lines at most and file2 is greater than 2 million
2. I need to match the first 6 fields (in order) of each line in file1 to the first 6 fields (in order) in a line in file2 and print the matched line from file2 along with the next line in file2.

You really need GNU grep for this.

Put the fields you want to search for from file1 in another file, and use the -f and -A options to grep:

Code:
cut -d ' ' -f1-6 file1 > file3
grep -f file3 -A1 file2
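
A small sketch of the same idea on the sample lines from the first post, assuming the fields are separated by single spaces and GNU grep is available. Because grep -f treats every line of file3 as a regular expression, the sed step here (my addition, not part of the reply above) escapes the dots and anchors each pattern at the start of the line, so a pattern cannot match in the middle of a line. The temporary directory and variable names are only for illustration.

```shell
# Sample data from the first post (assumed to be space-separated).
dir=$(mktemp -d)
printf '%s\n' \
  '0.54 3.2 0.45 32.9 4 0.02 9.0 4.0' \
  '0.6 4.0 3.99 2.0 0.85 7.0 3.84 0.05' > "$dir/file1"
printf '%s\n' \
  '93 28 04 73 95 11 0.4 7.9 2.30 4.05 (100(f18.3))' \
  '70.1 99.4 0.35 9.943 6.1 0.27 0.654' \
  '0.54 3.2 0.45 32.9 4 0.02 9.0 4.0 (54(f18.3)' \
  '44.8 33.2 90.3 45.2 66.3' > "$dir/file2"

# Turn the first six fields of each file1 line into an anchored pattern:
# escape the dots, put ^ in front, and add a trailing space so that
# field 6 has to match as a whole field.
cut -d ' ' -f1-6 "$dir/file1" |
  sed -e 's/\./\\./g' -e 's/^/^/' -e 's/$/ /' > "$dir/file3"

# -A1 prints each matching line plus the one line that follows it.
grep_out=$(grep -A1 -f "$dir/file3" "$dir/file2")
printf '%s\n' "$grep_out"
rm -rf "$dir"
```

When the matches are not next to each other in file2, GNU grep prints a line containing -- between the groups, so that may need to be filtered out afterwards.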

# 3  
Old 03-13-2009
something along these lines.

nawk -f gil.awk file1 file2

gil.awk:
Code:
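# Build an array index from the first 6 fields of the current line.
function buildIDX(   i, idx) {
    for(i=1; i<=6;i++) idx=(i==1) ? $i : idx SUBSEP $i
    return idx
}
# First file (file1): remember each 6-field index.
FNR==NR {
    f1[buildIDX()]
    next
}
# Second file (file2): print the line that follows a match.
found && found--
# Print a matching line and flag the next line for printing.
{
   if (buildIDX() in f1) {
      print
      found=1
   }
}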
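
For illustration, here is a self-contained run of the same approach on the sample lines from the first post. It is a sketch rather than the script above: it calls plain awk instead of nawk, the names key, want, hit and after are made up for the example, and the print logic is arranged so that a line is printed only once even when the line after a match is itself a match.

```shell
# Sample data from the first post (assumed to be whitespace-separated).
dir=$(mktemp -d)
printf '%s\n' \
  '0.54 3.2 0.45 32.9 4 0.02 9.0 4.0' \
  '0.6 4.0 3.99 2.0 0.85 7.0 3.84 0.05' > "$dir/file1"
printf '%s\n' \
  '93 28 04 73 95 11 0.4 7.9 2.30 4.05 (100(f18.3))' \
  '70.1 99.4 0.35 9.943 6.1 0.27 0.654' \
  '0.54 3.2 0.45 32.9 4 0.02 9.0 4.0 (54(f18.3)' \
  '44.8 33.2 90.3 45.2 66.3' > "$dir/file2"

awk_out=$(awk '
# Join fields 1..6 of the current line into one array key.
function key(   i, k) {
    k = $1
    for (i = 2; i <= 6; i++) k = k SUBSEP $i
    return k
}
FNR == NR { want[key()]; next }   # file1: remember every 6-field key
{ hit = (key() in want) }         # file2: does this line match?
hit || after { print }            # print a match, or the line after one
{ after = hit }                   # flag the next line for printing
' "$dir/file1" "$dir/file2")
printf '%s\n' "$awk_out"
rm -rf "$dir"
```

Only the small file1 is held in memory; the 2 million lines of file2 are read once and never stored.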


Last edited by vgersh99; 03-13-2009 at 05:13 PM..
# 4  
Old 03-13-2009
Thank you, that was helpful...

Now I have another somewhat similar scenario

I have file1 with a field 8 that I need to match to field 1 in file2 and print the file2 line along with the next line in file2, so I was thinking of generating a file that contained the matched file2 line then doing the grep recommendation above to get both lines from file2.

I am unsure how to compare different fields in different files (note that these are floating point numbers, so they do not necessarily have the same string values but do have the same numerical values, e.g. 8.54 in file1 and 8.54000 in file2).

Thanks again
# 5  
Old 03-13-2009
Quote:
Originally Posted by gillesc_mac
Thank you, that was helpful...

Now I have another somewhat similar scenario

I have file1 with a field 8 that I need to match to field 1 in file2 and print the file2 line along with the next line in file2, so I was thinking of generating a file that contained the matched file2 line then doing the grep recommendation above to get both lines from file2.

I am unsure how to compare different fields in different files (note these are floating point numbers not necessarily the same string values but same numerical values, i.e. 8.54 for file1 and 8.54000 for file2)

Thanks again
Assuming the floating point precision is 2 - not tested:
Code:
nawk 'FNR==NR { f1[sprintf("%.2f", $8)]; next } sprintf("%.2f", $1) in f1' file1 file2
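
A hedged sketch of how this could be extended to print the following line as well, with both sides rounded to two decimals so that 8.54 and 8.54000 produce the same key. The sample lines and the names want, hit and after are invented for the example, plain awk is used instead of nawk, and the precision of two decimals is the same assumption as above.

```shell
# Invented sample data: field 8 of file1 should match field 1 of file2.
dir=$(mktemp -d)
printf '%s\n' \
  '1 2 3 4 5 6 7 8.54' \
  '1 2 3 4 5 6 7 4.0' > "$dir/file1"
printf '%s\n' \
  '3.30 a' \
  '8.54000 first' \
  'second' \
  '7.1 other' > "$dir/file2"

float_out=$(awk '
# file1: store field 8 rounded to two decimals.
FNR == NR { want[sprintf("%.2f", $8)]; next }
# file2: round field 1 the same way before the lookup.
{ hit = (sprintf("%.2f", $1) in want) }
hit || after { print }   # the matching line, or the line after a match
{ after = hit }
' "$dir/file1" "$dir/file2")
printf '%s\n' "$float_out"
rm -rf "$dir"
```

Note that a non-numeric first field is formatted as 0.00, so it would count as a match if file1 ever holds a zero in field 8.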

# 6  
Old 03-13-2009
Thank you again, but I neglected to mention another restriction. I need to match multiple fields, for example:

File1 File2
$9 = $1
$1 = $3
$2 = $4
$3 = $5
$4 = $6
$5 = $7
$6 = $8
$7 = $9

But again, each field does not necessarily have the same precision. I tried extending your script, but I am just beginning to learn.

Thank you
# 7  
Old 03-13-2009
how many requirements DO you have?
Not tested.
Code:
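BEGIN {
   # fld1 lists the file1 fields; fld2 lists the matching file2 fields.
   fld1="9 1 2 3 4 5 6 7"
   fld2="1 3 4 5 6 7 8 9"

   split(fld1, fld1A)
   split(fld2, fld2A)
}
# Build an index from the fields listed in fldA, each rounded to 2 decimals.
function buildIDX(fldA,   i, idx) {
    for(i=1; i in fldA ;i++) idx=(i==1) ? sprintf("%.2f",$(fldA[i])) : idx SUBSEP sprintf("%.2f",$(fldA[i]))
    return idx
}
# First file (file1): remember each index.
FNR==NR {
    f1[buildIDX(fld1A)]
    next
}
# Second file (file2): print the line that follows a match.
found && found--

{
   if (buildIDX(fld2A) in f1) {
      print
      found=1
   }
}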
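
To make the field mapping concrete, here is a self-contained sketch on invented sample lines. The map1 and map2 variables, the function name key and the other names are hypothetical, plain awk is used instead of nawk, and rounding to two decimals is an assumption carried over from the earlier reply.

```shell
# Invented sample data: file1 fields 9,1..7 correspond to file2 fields 1,3..9.
dir=$(mktemp -d)
printf '%s\n' \
  '1.5 2.25 3 4.10 5 6.0 7.75 skip 9.9' > "$dir/file1"
printf '%s\n' \
  '0 0 0 0 0 0 0 0 0 nomatch' \
  '9.900 x 1.50 2.250 3.0 4.1 5.00 6 7.750 tail' \
  'following line' \
  '1 2 3' > "$dir/file2"

map_out=$(awk -v map1="9 1 2 3 4 5 6 7" -v map2="1 3 4 5 6 7 8 9" '
BEGIN { n = split(map1, m1, " "); split(map2, m2, " ") }
# Build a key from the fields listed in m, each rounded to two decimals.
function key(m,   i, k) {
    k = sprintf("%.2f", $(m[1]))
    for (i = 2; i <= n; i++) k = k SUBSEP sprintf("%.2f", $(m[i]))
    return k
}
FNR == NR { want[key(m1)]; next }   # file1: store keys in map1 order
{ hit = (key(m2) in want) }         # file2: same key built via map2
hit || after { print }              # a match, or the line after a match
{ after = hit }
' "$dir/file1" "$dir/file2")
printf '%s\n' "$map_out"
rm -rf "$dir"
```

Passing the two field lists with -v keeps the mapping in one place, so changing which fields are compared does not require touching the awk program itself.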


Last edited by vgersh99; 03-13-2009 at 07:22 PM..