Hi,
I have two files in the format shown below. I need to read the first field (the value before the comma) from file 1, find the record in file 2 that has the same value in its "KEY=" field, and write that complete file 2 record, followed by the corresponding second field from file 1, to a result file.

File 1:
000000000160191837,00140000637006925269
000000000160191837,00140000637006925270
000000000160191838,00140000637006925271
000000000160191840,00140000637006925272

File 2:
<DATA1><#KEY=000000000160191837><DATA2>
<DATA3><#KEY=000000000160191837><DATA4>
<DATA5><#KEY=000000000160191838><DATA6>
<DATA6><#KEY=000000000160191840><DATA8>

Result File:
<DATA1><#KEY=000000000160191837><DATA2><RESULT>00140000637006925269
<DATA3><#KEY=000000000160191837><DATA4><RESULT>00140000637006925270
<DATA5><#KEY=000000000160191838><DATA6><RESULT>00140000637006925271
<DATA6><#KEY=000000000160191840><DATA8><RESULT>00140000637006925272

I wrote an awk command for it, but my code doesn't handle duplicate records. Look at the first two records of File 1 in the example above: field 1 is the same but field 2 is different. In the same way, File 2 will have two entries with the same KEY value, and I need to assign a different value to each.

My code:
Code:
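awk '
FNR == NR {                       # first file: key,value pairs
    split($0, f, ",")             # split here; changing FS mid-record would not re-split $0
    sample_array[f[1]] = f[2]     # a repeated key overwrites the earlier value
    next
}
{                                 # second file: extract the number after "KEY="
    split($0, p, "KEY=")
    x = index(p[2], ">")
    sample_num = substr(p[2], 1, x - 1)
    if (sample_num in sample_array)
        print $0 "<RESULT>" sample_array[sample_num]   # awk names are case-sensitive
}' file1 file2 > result_file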
|
|
||||
|
Thanks for the quick response!
The code suppresses duplicates, and it does not give the corresponding field 2 of file 1 in the output. I need every record in the output, with a different field 2 value for each duplicate, as shown in the example. Does this require a multi-dimensional array to store the different values for duplicates? I'm not sure, as I'm not good at using multi-dimensional arrays.
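On the multi-dimensional array question: awk only simulates them by joining subscripts with SUBSEP, but that is enough to keep every value for a repeated key. The following is a minimal sketch, not the approach posted in this thread. The array names `val`, `seen`, and `used` are made up for illustration, and the `-F'[,=>]'` separator is an assumption chosen so that the key is `$1` in file 1 and `$3` in file 2.

```shell
#!/bin/sh
# Sample data copied from the first post.
cat > file1 <<'EOF'
000000000160191837,00140000637006925269
000000000160191837,00140000637006925270
000000000160191838,00140000637006925271
000000000160191840,00140000637006925272
EOF
cat > file2 <<'EOF'
<DATA1><#KEY=000000000160191837><DATA2>
<DATA3><#KEY=000000000160191837><DATA4>
<DATA5><#KEY=000000000160191838><DATA6>
<DATA6><#KEY=000000000160191840><DATA8>
EOF

# Assumption: splitting on "," "=" and ">" puts the key in $1 (file1) and $3 (file2).
awk -F'[,=>]' '
FNR == NR {                        # file1: remember the nth value seen for each key
    val[$1, ++seen[$1]] = $2
    next
}
($3, used[$3] + 1) in val {        # file2: hand the values out in the order stored
    print $0 "<RESULT>" val[$3, ++used[$3]]
}' file1 file2 > result_file
```

Each key gets its own counter, so the first file 2 record with a given KEY receives the first value stored for it, the second record receives the second value, and so on. Records whose key has no value left are simply skipped in this sketch.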
|
||||
|
Perfect!! Thanks a lot!!!
It works great!! I never thought of it from that angle. I added one more part; please check it and let me know if I did it right. If a value in file 2 has no match, I need to take the first 11 digits from any value, append zeros to it, and output the record. It was working fine before, but now it's not, and I'm not sure where I went wrong.

Addition:
FNR==NR{f1[$1]=($1 in f1)? f1[$1] SUBSEP $2 : $2;next}
$3 in f1 {
n=split(f1[$3], a, SUBSEP)
delete f1[$3]
printf("%s<RESULT>%s\n", $0, a[1])
for(i=2;i<=n;i++) f1[$3]=(i==2)?a[i]:f1[$3] SUBSEP a[i] ; next}
for ( temp in f1) {
tmp_value=substr(f1[temp],1,11)
print $0 "<RESULT>" tmp_value "000000000"
}
|
|||||
|
Code:
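# $1 and $3 depend on the field separator given when awk is invoked; the command line is not shown in this thread
# file1: collect every value for a key, joined by SUBSEP, in input order
FNR==NR { f1[$1] = ($1 in f1) ? f1[$1] SUBSEP $2 : $2; next }
# file2, key found: emit the first queued value and keep the rest for later duplicates
$3 in f1 {
    n = split(f1[$3], a, SUBSEP)
    delete f1[$3]
    printf("%s<RESULT>%s\n", $0, a[1])
    for (i = 2; i <= n; i++)
        f1[$3] = (i == 2) ? a[i] : f1[$3] SUBSEP a[i]
    next
}
# file2, key not found: pad the first 11 characters of any remaining value with zeros
{
    for (i in f1) {
        print $0 "<RESULT>" substr(f1[i], 1, 11) "000000000"
        break
    }
}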
|
|
||||
|
Thanks!! You are the best!!
BTW, thanks for naming the awk code king.awk. This does not give the desired results if the missing record is the last one in file 2. I figured it out: we delete the array element every time, so by the time we reach the last record we have deleted all the array elements, and the last record is not printed. I changed the code a little bit and it's working fine now.

FNR==NR{f1[$1]=($1 in f1)? f1[$1] SUBSEP $2 : $2; default_num=$2;next}
$3 in f1 {
n=split(f1[$3], a, SUBSEP)
delete f1[$3]
printf("%s<RESULT>%s\n", $0, a[1])
for(i=2;i<=n;i++) f1[$3]=(i==2)?a[i]:f1[$3] SUBSEP a[i]
next
}
{ print $0 "<RESULT>" substr(default_num, 1, 11) "000000000" }

This is my first post to this forum, and I'm really astonished by the quality and speed of the responses.