Read a file and search a value in another file create third file using AWK

06-17-2009

Registered User

12, 0

Join Date: Sep 2008

Last Activity: 15 July 2012, 11:15 AM EDT

Posts: 12

Thanks Given: 0

Thanked 0 Times in 0 Posts

Hi,

I have two files in the formats shown below. For each line of file 1, I need to take the first field (the value before the comma), find the record in file 2 that has the same value in its "KEY=" field, and write that complete file 2 record, followed by the corresponding field 2 from file 1, to a result file.

File 1:

000000000160191837,00140000637006925269
000000000160191837,00140000637006925270
000000000160191838,00140000637006925271
000000000160191840,00140000637006925272

File 2:

<DATA1><#KEY=000000000160191837><DATA2>
<DATA3><#KEY=000000000160191837><DATA4>
<DATA5><#KEY=000000000160191838><DATA6>
<DATA6><#KEY=000000000160191840><DATA8>

Result File:

<DATA1><#KEY=000000000160191837><DATA2><RESULT>00140000637006925269
<DATA3><#KEY=000000000160191837><DATA4><RESULT>00140000637006925270
<DATA5><#KEY=000000000160191838><DATA6><RESULT>00140000637006925271
<DATA6><#KEY=000000000160191840><DATA8><RESULT>00140000637006925272

I wrote an awk command for this, but my code doesn't handle duplicate records. Please look at the first two records of File 1 in the example above: field 1 is the same but field 2 is different. In the same way, File 2 will have two entries with the same KEY value, and I need to assign a different value to each of them.
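A minimal sketch of the problem, using a made-up key and made-up values rather than the real data: a plain array indexed by field 1 keeps only the last value stored under a repeated key. The names `overwrite_demo` and `seen` are hypothetical.

```shell
# Two sample lines sharing one key; the values are made up for illustration.
overwrite_demo=$(printf '%s\n' 'K1,first' 'K1,second' |
    awk -F, '
        { seen[$1] = $2 }    # the second line overwrites the first
        END { for (k in seen) print k, seen[k] }
    ')
echo "$overwrite_demo"    # prints: K1 second
```

Only one element exists for K1 afterwards, so the value "first" is no longer available when file 2 is read.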

My code:

Code:

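awk '
# First file: remember field 2 under the key in field 1.
# A later line with the same key overwrites the earlier value,
# which is the duplicate problem described above.
FNR == NR {
    split($0, pair, ",")
    sample_array[pair[1]] = pair[2]
    next
}
# Second file: extract the text between "KEY=" and the next ">".
{
    start = index($0, "KEY=")
    if (start == 0) next
    rest = substr($0, start + 4)
    sample_num = substr(rest, 1, index(rest, ">") - 1)
    if (sample_num in sample_array)
        print $0 "<RESULT>" sample_array[sample_num]
}' file1 file2 > result_file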

Thanks in advance!

King Kalyan

06-17-2009

Moderator

8,825, 1,112

Join Date: Feb 2005

Last Activity: 23 August 2021, 11:26 AM EDT

Location: Foxborough, MA

Posts: 8,825

Thanks Given: 579

Thanked 1,112 Times in 1,003 Posts

nawk -f king.awk FS=, file1 FS='(#KEY=|>)' file2

king.awk:

Code:

FNR==NR{f1[$1];next}
$3 in f1 {out[$3]=($3 in out)?$0:out[$3] $0}
END {
  for (i in out)
    print out[i]
}
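As a rough illustration of why the script tests `$3`: with `FS='(#KEY=|>)'` each file2 record is split on every `#KEY=` and every `>`, which leaves the key in the third field. The sketch below prints the fields of the first sample record; the variable name `fields` is hypothetical, and plain `awk` is assumed to behave like `nawk` here.

```shell
# Split one sample record from file2 and list its fields.
fields=$(printf '%s\n' '<DATA1><#KEY=000000000160191837><DATA2>' |
    awk -F'(#KEY=|>)' '{ for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i }')
echo "$fields"    # the line for field 3 shows the key
```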

vgersh99

06-17-2009

Thanks for the quick response!

The code suppresses duplicates, and it doesn't put the corresponding field 2 of file 1 in the output. I need all records in the output, with a different field 2 value for each duplicate, as shown in my example.

I'm just asking: does this require a multi-dimensional array to store the different values for duplicates? I'm not sure, as I'm not good at using multi-dimensional arrays.

King Kalyan

06-17-2009

Sorry, I misread the data samples.

This assumes that each key occurs the same number of times in file1 and in file2.

king.awk:

Code:

FNR==NR{f1[$1]=($1 in f1)? f1[$1] SUBSEP $2 : $2;next}
$3 in f1 {
   n=split(f1[$3], a, SUBSEP)
   delete f1[$3]
   printf("%s<RESULT>%s\n", $0, a[1])
   for(i=2;i<=n;i++)
    f1[$3]=(i==2)?a[i]:f1[$3] SUBSEP a[i]
}
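On the question of multi-dimensional arrays: a possible alternative, sketched here under the same assumption about matching key counts, is to use awk's comma subscripts and keep a counter per key instead of packing the values into one SUBSEP-joined string. The names `val`, `stored`, `used`, `paired` and `workdir` are hypothetical, the data is a shortened copy of the samples from the first post, and plain `awk` is assumed in place of `nawk`.

```shell
workdir=$(mktemp -d)
cat > "$workdir/file1" <<'EOF'
000000000160191837,00140000637006925269
000000000160191837,00140000637006925270
000000000160191838,00140000637006925271
EOF
cat > "$workdir/file2" <<'EOF'
<DATA1><#KEY=000000000160191837><DATA2>
<DATA3><#KEY=000000000160191837><DATA4>
<DATA5><#KEY=000000000160191838><DATA6>
EOF

paired=$(awk '
    # file1: store each value under (key, n), counting values per key
    FNR == NR { val[$1, ++stored[$1]] = $2; next }
    # file2: hand out the next unused value for this key, in file1 order
    ($3, used[$3] + 1) in val { print $0 "<RESULT>" val[$3, ++used[$3]] }
' FS=, "$workdir/file1" FS='(#KEY=|>)' "$workdir/file2")
echo "$paired"
rm -rf "$workdir"
```

Nothing is deleted in this variant, so the stored values stay available after file2 has been read; whether that matters depends on what else the script needs to do.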

vgersh99

06-17-2009

Perfect!! Thanks a lot!!!
It works great!! I never thought of it from that angle.

I added one more part; please check it and let me know if I did it right.
If a record in file 2 has no match in file 1, I need to take the first 11 digits of any value from file 1, append zeros to it, and output the record with that value.

It was working fine before, but now it isn't, and I'm not sure where I went wrong.

Addition:

FNR==NR{f1[$1]=($1 in f1)? f1[$1] SUBSEP $2 : $2;next}
$3 in f1 {
n=split(f1[$3], a, SUBSEP)
delete f1[$3]
printf("%s<RESULT>%s\n", $0, a[1])
for(i=2;i<=n;i++)
f1[$3]=(i==2)?a[i]:f1[$3] SUBSEP a[i] ; next}
for ( temp in f1) {
tmp_value=substr(f1[temp],1,11)
print $0 "<RESULT>" tmp_value "000000000"
}

King Kalyan

06-17-2009

Code:

FNR==NR{f1[$1]=($1 in f1)? f1[$1] SUBSEP $2 : $2;next}
$3 in f1 {
   n=split(f1[$3], a, SUBSEP)
   delete f1[$3]
   printf("%s<RESULT>%s\n", $0, a[1])
   for(i=2;i<=n;i++)
      f1[$3]=(i==2)?a[i]:f1[$3] SUBSEP a[i]
   next
}
{
   for( i in f1) {
      print $0 "<RESULT>" substr(f1[i], 1, 11) "000000000"
      break
  }
}

vgersh99

06-18-2009

Thanks!! You are the best!!
BTW, thanks for naming the awk script king.awk.

This does not give the desired result if the unmatched record is the last one in file 2. I figured out why: we delete an array element every time a record matches, so by the time we reach the last record every element has been deleted and the loop has nothing left to print.

I changed the code a little bit and it's working fine now.
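The effect can be reproduced with a cut-down sketch of the previous script (duplicate handling left out for brevity). The second file2 record uses a made-up key, 000000000160199999, that is not in file1; by the time it is read the only array element has been deleted, so the `for (i in f1)` body never runs. The names `lines` and `workdir` are hypothetical.

```shell
workdir=$(mktemp -d)
printf '%s\n' '000000000160191837,00140000637006925269' > "$workdir/file1"
printf '%s\n' '<DATA1><#KEY=000000000160191837><DATA2>' \
              '<DATA9><#KEY=000000000160199999><DATA0>' > "$workdir/file2"

lines=$(awk '
    FNR == NR { f1[$1] = $2; next }
    # a match consumes the stored value
    $3 in f1  { print $0 "<RESULT>" f1[$3]; delete f1[$3]; next }
    # fallback: needs at least one element left in f1
    { for (i in f1) { print $0 "<RESULT>" substr(f1[i], 1, 11) "000000000"; break } }
' FS=, "$workdir/file1" FS='(#KEY=|>)' "$workdir/file2" | wc -l)
echo "$lines"    # one output line for two input records
rm -rf "$workdir"
```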

Code:

FNR==NR{f1[$1]=($1 in f1)? f1[$1] SUBSEP $2 : $2; default_num=$2;next}
$3 in f1 {
   n=split(f1[$3], a, SUBSEP)
   delete f1[$3]
   printf("%s<RESULT>%s\n", $0, a[1])
   for(i=2;i<=n;i++)
      f1[$3]=(i==2)?a[i]:f1[$3] SUBSEP a[i]
   next
}
{
   print $0 "<RESULT>" substr(default_num, 1, 11) "000000000"
}

This is my first post to this forum and I'm really astonished by the quality and speed of the responses.

King Kalyan
