Awk+Grep Input file needs to match a column and print the entire line


 
# 1  
Old 03-13-2009

I've been stuck on this for a few days and can't get it to work with a simple awk+grep script (or any other way).

For example, I have an input file, inputfile1.txt:

cat inputfile1.txt

218299910417
1172051195
1172070231
1172073514
1183135117
1183135118
1183135119
1281440202


I need to match these numbers against two specific columns of another file, for example columns $3 and $4, using the pipe as the delimiter:

cat inputfile2.txt

AAAAA|DISTHOR1_U2|6981258207|218299910417|END
BBBBB|DISTHOR1_U2|6981118022|6981259131|END
FARFAR|DISTHOR1_U2|6981119404|1172070231|END
CCCCC|DISTHOR1_U2|1172073514|6981258793|END
BBBBB|DISTHOR1_U2|698515487|489498131|END

The expected result is an output file containing the lines of the second file whose third or fourth column matches an element from the first file. In this case, the output file will be:

cat outputfile1.txt

AAAAA|DISTHOR1_U2|6981258207|218299910417|END
FARFAR|DISTHOR1_U2|6981119404|1172070231|END
CCCCC|DISTHOR1_U2|1172073514|6981258793|END

I was able to do this with the following command, but it searches the whole line, not a specific column:

grep -f inputfile1.txt inputfile2.txt > outputfile1.txt

This command also takes over an hour, because inputfile1.txt has over 1600 records and inputfile2.txt has over one million records, each line 190 characters long and divided into 43 columns.

Can someone help me with this?

Thanks
# 2  
Old 03-13-2009
Code:
nawk -F'|' -v OFS='|' 'FNR==NR {f1[$0]; next} $3 in f1 || $4 in f1' inputfile1.txt inputfile2.txt
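
The one-liner above loads every line of inputfile1.txt as an array key, then prints only the lines of inputfile2.txt whose third or fourth field is exactly one of those keys. As a rough illustration of how that differs from the whole-line search done by grep -f, here is a small Python sketch. It uses the sample data from the first post plus one hypothetical record (DDDDD, not in the original data) in which a key appears only as part of a longer number. The function names are made up for this example.

```python
# Keys and records copied from the sample files in the first post.
keys = {
    "218299910417", "1172051195", "1172070231", "1172073514",
    "1183135117", "1183135118", "1183135119", "1281440202",
}
records = [
    "AAAAA|DISTHOR1_U2|6981258207|218299910417|END",
    "BBBBB|DISTHOR1_U2|6981118022|6981259131|END",
    "FARFAR|DISTHOR1_U2|6981119404|1172070231|END",
    "CCCCC|DISTHOR1_U2|1172073514|6981258793|END",
    "BBBBB|DISTHOR1_U2|698515487|489498131|END",
    # Hypothetical record (an assumption, not from the thread): the key
    # 1172051195 only occurs inside the longer number in column 4.
    "DDDDD|DISTHOR1_U2|6981000000|91172051195|END",
]


def substring_match(line, keys):
    """Roughly what grep -f does: any key found anywhere in the line."""
    return any(key in line for key in keys)


def column_match(line, keys):
    """What the awk one-liner does: exact match on column 3 or 4."""
    fields = line.split("|")
    return fields[2] in keys or fields[3] in keys


by_substring = [r for r in records if substring_match(r, keys)]
by_column = [r for r in records if column_match(r, keys)]

print("substring search:", len(by_substring), "lines")
print("column lookup:   ", len(by_column), "lines")
```

The hypothetical DDDDD record is reported by the substring search but not by the column lookup, which returns only the three lines listed as the expected output in the first post.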

# 3  
Old 03-14-2009
If performance is key, you may want to use Python
Code:
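#!/usr/bin/env python3

import sys

if len(sys.argv) != 3:
        print("Usage: %s <input1> <input2>" % sys.argv[0])
        sys.exit(1)

inputfile1 = sys.argv[1]
inputfile2 = sys.argv[2]

# Store the keys in a set so each membership test is a hash lookup
# instead of a scan over the whole key list
keys = set()
with open(inputfile1) as f:
        for line in f:
                key = line.strip()
                if key:
                        keys.add(key)

# Print every line whose third or fourth pipe-delimited field is a key
with open(inputfile2) as f:
        for line in f:
                line = line.strip()
                fields = line.split("|")
                if len(fields) > 3 and (fields[2] in keys or fields[3] in keys):
                        print(line)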


Please post the run time for your data set. I would also like to compare this with the AWK version (or nawk, the new AWK on Solaris).
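
For anyone who wants a rough feel for the lookup cost before running against the real files, the sketch below times membership tests against a Python list and a Python set. The data is synthetic and hypothetical: about 1600 keys as described in the first post, but only a few thousand probe values rather than a million lines, so the absolute numbers will not reflect the real data set. A list is scanned element by element on each test, while a set uses hashing, so the gap is expected to widen as the number of keys grows.

```python
import timeit

# Hypothetical synthetic data (an assumption, not the poster's files):
# 1600 ten-digit keys and 5000 values to look up.
keys_list = [str(1172000000 + i) for i in range(1600)]
keys_set = set(keys_list)
probes = [str(1172000000 + i * 7) for i in range(5000)]


def count_hits(keys):
    """Count how many probe values are present in keys."""
    return sum(1 for p in probes if p in keys)


# Both containers must give the same answer; only the cost differs.
list_hits = count_hits(keys_list)
set_hits = count_hits(keys_set)

list_time = timeit.timeit(lambda: count_hits(keys_list), number=1)
set_time = timeit.timeit(lambda: count_hits(keys_set), number=1)

print("hits: list=%d set=%d" % (list_hits, set_hits))
print("time: list=%.4fs set=%.4fs" % (list_time, set_time))
```

For the real comparison, timing each complete script with the shell's time command on the actual files is still the more meaningful measurement.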
# 4  
Old 03-16-2009
Thank you, vgersh99. Your tip works fine: the script used to take more than an hour to run, and now it takes less than 5 minutes.