Big pattern file matching within another pattern file in awk or shell


 
# 1  
Old 11-19-2015

Hi

I need to do a pattern match between files.
I am new to shell scripting and have come up with this so far. It takes 50 seconds to process files of 2 MB. I need to tune this code because the files will be around 50 MB and I need to save time.
The main issue is that I need to search File1 for the patterns taken from Keys, and the matching File1 lines then become the patterns for two other files (File2 and File3).

Is there any other way to do it?

File content looks like below.

**File1**

Code:
    
    20150816,ab311914,ab,abc040,2
    20150817,ab311914,ab,abc040,3
    20150818,ab311914,ab,abc040,4
    20150819,ab311914,ab,abc040,5
    20150820,ab311914,ab,abc040,6
    20150821,ab311914,ab,abc040,7
    20150822,ab311914,ab,abc040,8
    20150823,ab311914,ab,abc040,9
    20150824,ab311914,ab,abc040,10
    20150825,ab311914,ab,abc040,11

**File2**

Code:
    
    20150816,ab311914,ab,abc040,1
    20150817,ab311914,ab,abc040,2
    20150818,ab311914,ab,abc040,3
    20150819,ab311914,ab,abc040,5
    20150820,ab311914,ab,abc040,6
    20150821,ab311914,ab,abc040,7
    20150822,ab311914,ab,abc040,8
    20150823,ab311914,ab,abc040,9
    20150824,ab311914,ab,abc040,10
    20150825,ab311914,ab,abc040,1

**File3**
Code:
  
    20150816,ab,0
    20150817,ab,1
    20150818,ab,2
    20150819,ab,3
    20150820,ab,4
    20150821,ab,5
    20150822,ab,6
    20150823,ab,7
    20150824,ab,8

**Keys**
Code:
 ab311914,1

**Sample output**
Code:
  
     20150816,ab311914,ab,abv040,61
     20150817,ab311914,ab,abv040,62
     20150818,ab311914,ab,abv040,63
     20150819,ab311914,ab,abv040,64
     20150820,ab311914,ab,abv040,65
     20150821,ab311914,ab,abv040,66
     20150822,ab311914,ab,abv040,67
     20150823,ab311914,ab,abv040,68
     20150824,ab311914,ab,abv040,69
     20150825,ab311914,ab,abv040,70

**Shell script code so far**
Code:
                awk -F "," keys.txt '{print $1}'|while read key_
		do
		echo "$key_" 
		grep $key_ file1.txt |grep -v ",-1$"|while read line; 
		do 
		patterna1=`echo $line|awk -F "," '{print $2 "," $3 "," $4 "," $5 "$"}' `
		patterna2=`echo $line| awk -F "," '{print $1 "," $2 "," $3 "," $4}'`
		patternb1=`grep $patterna1 file2.txt|head -1|awk -F "," '{print $1}'`
		patternb2=`grep $patternb1 file3.txt|awk -F "," '{print $3}'`
		echo $patterna2,$patternb2 
		done  >> final.txt
		done


Last edited by nitin_daharwal; 11-19-2015 at 10:01 PM..
# 2  
Old 11-19-2015
I am totally confused.

There is nothing in your code (which you imply is working but is running too slow), that explains why the output has abv040,61 through abv040,70 when abv040 does not appear anywhere in any of the input files and the values 61 through 70 do not appear anywhere in any of the input files.

Furthermore, your code seems to only output two fields; not five.

And, there is no line in File3 containing the string (or date) 20150825; so why is there a line in the output containing that string?

Please explain more clearly what you are trying to do.
# 3  
Old 11-20-2015
Some comments on top of what Don Cragun said:
awk -F "," keys.txt '{print $1}' can't possibly work (the arguments are in the wrong order: the awk program must come before the file name) and is superfluous anyway - you could simply use while IFS="," read key_ REST; do ...; done < keys.txt
grep -v ",-1$" is pointless as (at least in the samples given) there's no line ending in "-1"
And, for each line in keys.txt times each matching line in file1.txt, you run 10 processes to extract a few fields - no surprise that is slow.
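
To illustrate the point about process count, here is a rough sketch of how the same lookups might be done in a single awk process that reads each file once. It is not a confirmed solution, since the expected output is still unclear. Assumptions, none of them confirmed in this thread: the key is compared with field 2 of file1.txt, file2.txt is matched on fields 2 to 5 (first match wins, like head -1), file3.txt is matched on its date in field 1, and lines without a match are skipped. The sample files created at the top are a trimmed copy of the posted data, only there so the sketch can be run as-is.

```shell
#!/bin/sh
# Sketch only: one awk pass instead of nested grep/awk pipelines.
# Assumed (not confirmed in the thread): key = field 2 of file1.txt,
# file2.txt matched on fields 2-5, file3.txt matched on field 1.

# Small sample inputs in a scratch directory (subset of the posted data).
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' 'ab311914,1' > keys.txt
printf '%s\n' '20150816,ab311914,ab,abc040,2' \
              '20150817,ab311914,ab,abc040,3' \
              '20150818,ab311914,ab,abc040,4' > file1.txt
printf '%s\n' '20150816,ab311914,ab,abc040,1' \
              '20150817,ab311914,ab,abc040,2' \
              '20150818,ab311914,ab,abc040,3' > file2.txt
printf '%s\n' '20150816,ab,0' '20150817,ab,1' '20150818,ab,2' > file3.txt

awk -F, '
# keys.txt: remember every key (field 1).
FILENAME == ARGV[1] { keys[$1]; next }

# file3.txt: map date -> value (a later duplicate date overwrites an earlier one).
FILENAME == ARGV[2] { val[$1] = $3; next }

# file2.txt: map fields 2-5 -> date of the FIRST line carrying them.
FILENAME == ARGV[3] {
    k = $2 FS $3 FS $4 FS $5
    if (!(k in first)) first[k] = $1
    next
}

# file1.txt: for wanted keys, skip lines ending in -1, then chain the lookups.
($2 in keys) && $5 != "-1" {
    k = $2 FS $3 FS $4 FS $5
    if ((k in first) && (first[k] in val))
        print $1 FS $2 FS $3 FS $4 FS val[first[k]]
}
' keys.txt file3.txt file2.txt file1.txt > final.txt

cat final.txt
```

With the trimmed sample this prints 20150816,ab311914,ab,abc040,1 and 20150817,ab311914,ab,abc040,2; the third file1.txt line is dropped because no file2.txt line ends in ab311914,ab,abc040,4. The lookups are array accesses held in memory, so the run time should grow roughly with the total input size rather than with the number of file1.txt lines multiplied by the size of file2.txt and file3.txt.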