Read a file and search a value in another file create third file using AWK


 
# 8  
Old 06-18-2009
I hope the Perl script below helps.

Code:
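use strict;
use warnings;

# Map each key to the list of its values, in input order.
my %values;
while (<DATA>) {
	chomp;
	my ($key, $value) = split /,/;
	push @{ $values{$key} }, $value;
}
open my $fh, '<', 'a.txt' or die "Cannot open a.txt: $!";
while (<$fh>) {
	chomp;
	if (/KEY=([0-9]+)/) {
		# Take the next unused value for this key (empty if none remain).
		my $value = shift @{ $values{$1} };
		print $_, "<RESULT>", $value // '', "\n";
	}
}
close $fh;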
__DATA__
000000000160191837,00140000637006925269
000000000160191837,00140000637006925270
000000000160191838,00140000637006925271
000000000160191840,00140000637006925272

# 9  
Old 06-18-2009
Or better yet, to handle a mismatched number of keys in either of the files:
Code:
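# First file: queue each key's values in one SUBSEP-joined string.
FNR==NR{f1[$1]=($1 in f1)? f1[$1] SUBSEP $2 : $2;default_num=$2;next}
# Second file: a known key takes the value at the head of its queue.
$3 in f1 {
   n=split(f1[$3], a, SUBSEP)
   printf("%s<RESULT>%s\n", $0, a[1])
   if (n==1) next;   # a single remaining value is kept and reused
   delete f1[$3]
   # Rebuild the queue without its first element.
   for(i=2;i<=n;i++)
      f1[$3]=(i==2)?a[i]:f1[$3] SUBSEP a[i]
   next
}
# Unknown key: fall back to a value derived from the last number in the first file.
{
  print $0 "<RESULT>" substr(default_num, 1, 11) "000000000"
}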

# 10  
Old 06-18-2009
Thanks Cherry! I don't know Perl, so this is out of scope for me, but I've heard Perl is very fast. I'll have to learn it in the future.

The awk code below runs fine for a small number of records, but now I'm running it on 200K records and it's taking a long time: after 25 minutes it has written just 50 records to the output file. I'm not sure how much longer it will take to complete.

Is there anything wrong with the code that makes it run so long?
Generally awk is very fast, right?

Actually this code is already available in C++, and I'm trying to rewrite it in awk because of performance issues, as awk is faster. :)

awk -f king.awk FS=, file1 FS='#KEY=' file2

Code:
 
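# First file: queue each key's values in one SUBSEP-joined string.
FNR==NR{f1[$1]=($1 in f1)? f1[$1] SUBSEP $2 : $2; default_num=$2;next}
# Second file: the key is the text of $2 up to the first ">".
{
    x=index($2,">")
    key=substr($2,1,x-1)
}
key in f1 {
    n=split(f1[key], a, SUBSEP)
    delete f1[key]
    printf("%s<RESULT>%s\n", $0, a[1])
    # Rebuild the queue without its first element.
    for(i=2;i<=n;i++)
        f1[key]=(i==2)?a[i]:f1[key] SUBSEP a[i]
    next
}
{
    print $0 "<RESULT>" substr(default_num, 1, 11) "000000000"
}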



---------- Post updated at 05:00 PM ---------- Previous update was at 11:37 AM ----------

My bad. I used the wrong data: it had only 7 unique records and all the rest were duplicates, which will not happen in the real world. So I'm good.

With 200K unique records and 1 duplicate for each record, it ran in ~3 minutes.

Thanks for all your support!

Last edited by King Kalyan; 06-18-2009 at 12:43 PM..
# 11  
Old 06-18-2009
I'm not sure why you changed the invocation:
Code:
nawk -f king.awk FS=, file1 FS='(#KEY=|>)' file2

to
Code:
awk -f king.awk FS=, file1 FS='#KEY=' file2

and now do the 'index/substr' for every record/line in file2. That definitely adds to the execution time.
Also, if you take my last version, it should be a little faster, as I don't rebuild the array when it holds just 1 entry (probably the case for the majority of your records in file2).
You could probably think of a different implementation that doesn't require rebuilding the array altogether. This is left as an exercise for the OP :)
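
One possible shape for such an implementation, as a rough sketch rather than a tested solution for the real data: store every value in its own array slot, indexed by key and sequence number, and keep a per-key read position so nothing is ever split or rebuilt. The function name `join_queue`, the arrays `val`, `cnt` and `head`, and the sample records are all made up for illustration. The sketch assumes records shaped like `text#KEY=digits>text`, so that the key lands in `$2` under `FS='(#KEY=|>)'`. As in the version above, the last value of a key is reused once its queue runs out.

```shell
#!/bin/sh
# join_queue FILE1 FILE2 (hypothetical helper name)
# FILE1: key,value pairs. FILE2: records assumed to look like text#KEY=digits>text.
join_queue() {
    awk '
    # First file: one array slot per value, plus a per-key count and read position.
    FNR == NR { val[$1, ++cnt[$1]] = $2; head[$1] = 1; default_num = $2; next }
    # Second file, known key: print the value at the read position.
    $2 in cnt {
        printf("%s<RESULT>%s\n", $0, val[$2, head[$2]])
        if (head[$2] < cnt[$2]) head[$2]++   # advance unless this is the last value
        next
    }
    # Second file, unknown key: same fallback as in the thread.
    { print $0 "<RESULT>" substr(default_num, 1, 11) "000000000" }
    ' FS=, "$1" FS='(#KEY=|>)' "$2"
}

# Made-up sample data, only to show the behaviour.
dir=$(mktemp -d)
printf '%s\n' '100,A1' '100,A2' '200,B1' > "$dir/file1"
printf '%s\n' 'r1#KEY=100>x' 'r2#KEY=100>x' 'r3#KEY=100>x' \
              'r4#KEY=200>x' 'r5#KEY=999>x' > "$dir/file2"
join_queue "$dir/file1" "$dir/file2"
rm -rf "$dir"
# Prints:
# r1#KEY=100>x<RESULT>A1
# r2#KEY=100>x<RESULT>A2
# r3#KEY=100>x<RESULT>A2
# r4#KEY=200>x<RESULT>B1
# r5#KEY=999>x<RESULT>B1000000000
```

Each lookup is then a single array access, whatever the number of duplicates per key. The `FS=...` assignments on the command line take effect just before the file that follows them is opened, which is how the two files get different field separators.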
# 12  
Old 06-19-2009
Thanks for giving me a new direction!

I changed the invocation because the <#KEY> field can be anywhere in the record; it's not always in the second position. Sorry I did not mention that in my first post.
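
For reference, a key that can sit anywhere in the record can also be pulled out with `match()`, independent of the field separator. This is only a sketch: the helper name `extract_key` and the sample lines are made up, and it assumes the key is a run of digits, as in the Perl pattern `KEY=([0-9]+)` earlier in the thread.

```shell
#!/bin/sh
# extract_key [FILE...] (hypothetical helper name)
# Prints the digits that follow the first "#KEY=" on each line that has one.
extract_key() {
    awk '
    match($0, /#KEY=[0-9]+/) {
        # RSTART points at "#"; skip the 5 characters of "#KEY=" to reach the digits.
        print substr($0, RSTART + 5, RLENGTH - 5)
    }
    ' "$@"
}

# Made-up sample lines with the key at different positions.
printf '%s\n' 'a>b#KEY=123>c' '#KEY=7>z' 'no key here' | extract_key
# Prints:
# 123
# 7
```

Lines without a key print nothing here; in the join script they would instead fall through to the default branch.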

Yes, I saw your last code but forgot to include it in mine; I have now included it (skipping the array rebuild when there is only one entry).

After your suggestion, I thought of a different implementation, and here it is. It is faster:

Code:
 
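# First file: queue each key's values in one SUBSEP-joined string.
FNR==NR{f1[$1]=($1 in f1)? f1[$1] SUBSEP $2 : $2; default_num=$2;next}
# Second file: the key is the text of $2 up to the first ">".
{
    x=index($2,">")
    key=substr($2,1,x-1)
}
key in f1 {
    n=split(f1[key], a, SUBSEP)
    printf("%s<RESULT>%s\n", $0, a[1])
    if (n==1) {next}   # a single remaining value is kept and reused
    # Drop the first element by cutting the string after the first SUBSEP.
    y=index(f1[key],SUBSEP)
    f1[key]=substr(f1[key],y+1)
    next
}
{
    print $0 "<RESULT>" substr(default_num, 1, 11) "000000000"
}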
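
The index/substr step above can be tried in isolation. The sketch below uses a made-up three-element queue and a hypothetical function name `pop_demo`; it takes elements off the front of a SUBSEP-joined string one at a time, without `split()` or a rebuild loop.

```shell
#!/bin/sh
# pop_demo (hypothetical name): dequeue from a SUBSEP-joined string.
pop_demo() {
    awk 'BEGIN {
        q = "A1" SUBSEP "A2" SUBSEP "A3"     # queue stored as one joined string
        while (1) {
            y = index(q, SUBSEP)             # position of the first separator, 0 if none
            if (y == 0) { print q; break }   # last element: nothing left to cut
            print substr(q, 1, y - 1)        # element at the head of the queue
            q = substr(q, y + 1)             # drop the head and its separator
        }
    }'
}
pop_demo
# Prints:
# A1
# A2
# A3
```

The `y + 1` offset relies on SUBSEP being a single character, which is its default; with a longer separator it would need to be `y + length(SUBSEP)`. The same `index()` result would also give the head element directly, so the `split()` call in the script above may not be needed at all, though that is untested against the real data.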
