The UNIX and Linux Forums  


#4
04-21-2009
vgersh99 (Forum Staff, Moderator)
Quote:
Originally Posted by zenith
vgersh99

I have two files
$ head file1
zip,FirstName,Lastname
07777,abc,def
22584,dec,dlo
25487,xyz,jkl
25488,dim,kio

$ head file2
aim server database
SSN,Firstname,LastName
123456789,abc,def
123456789,dec,dlo
123456789,xyz,jkl
123456789,dim,kio
wanted Output:
SSN,zip,FirstName,LastName


Code:
nawk -F"," 'NR==FNR {a[$2,$3]=$1;next} ($2 SUBSEP $3) in a {print a[$2,$3],$1,$2,$3}'  OFS=,  " file2 file1
40 Matches


Code:
nawk -F"," 'NR==FNR {a[$2,$3]=$1;next} ($2 SUBSEP $3) in a {print a[$2,$3],$1,$2,$3}'  OFS=,  " file1 file2
140 matches
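The awk program in the above 2 invocations is exactly the same; only the order of the input files differs (file2 file1 vs. file1 file2). The file read first fills the array and each line of the second file is tested against it, so the two counts can differ if a name pair repeats in one of the files.
Also, why is there a trailing double quote after OFS=, in both cases?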
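One thing that does differ between the two command lines is the order of file1 and file2. Here is a minimal sketch of how that alone could change the count. The data is made up (the name pair abc,def appears once in f1 and twice in f2), the temporary files f1 and f2 are hypothetical, and it uses awk where the post uses nawk, assuming both behave the same here:

Code:
```shell
tmp=$(mktemp -d)
# Made-up data: abc,def occurs once in f1 and twice in f2
printf '%s\n' '07777,abc,def' '22584,dec,dlo' > "$tmp/f1"
printf '%s\n' '111111111,abc,def' '222222222,abc,def' '333333333,dec,dlo' > "$tmp/f2"

# Same matching logic as in the post, but counting matches instead of printing them
prog='NR==FNR {a[$2,$3]=$1; next} ($2 SUBSEP $3) in a {n++} END {print n+0}'

# Array built from f2, lines of f1 tested against it
count_21=$(awk -F, "$prog" "$tmp/f2" "$tmp/f1")
# Array built from f1, lines of f2 tested against it
count_12=$(awk -F, "$prog" "$tmp/f1" "$tmp/f2")

echo "f2 f1: $count_21 matches; f1 f2: $count_12 matches"
rm -r "$tmp"
```

With this sample the first order counts 2 matches and the second counts 3, because every line of the second file is counted once if its name pair exists in the first file.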
Quote:
Originally Posted by zenith
140 matches is correct, I know, but both should give 140; I don't know why they differ.

Can you please explain this part: ($2 SUBSEP $3)
In a[$2,$3], are we using "," because the input file is comma-separated, or is it a general rule?
If I don't use ",", I still get the same result.
No, it's not because your file is comma-separated. You can build your array index just by concatenating the strings ($2$3) or (which is better for further processing) by doing this:

Code:
a[$2,$3]

When building an array index, awk replaces the "," with the value of its built-in variable SUBSEP. If you later need to split the index into its parts, you can split on SUBSEP. If you simply concatenate the strings, you cannot reconstruct the original parts from the index.
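To illustrate the difference, here is a small sketch with two made-up rows whose 2nd and 3rd fields concatenate to the same string. The array names joined and keyed are hypothetical, and it uses awk in place of nawk:

Code:
```shell
demo=$(printf '%s\n' '111,ab,c' '222,a,bc' |
awk -F, '
    {
        joined[$2 $3] = $1    # concatenation: both rows collapse to the key "abc"
        keyed[$2, $3] = $1    # comma form: key is $2 SUBSEP $3, rows stay distinct
    }
    END {
        nj = 0; for (k in joined) nj++
        nk = 0; for (k in keyed) {
            nk++
            split(k, part, SUBSEP)   # recover the original fields from the key
            if (part[1] == "ab" && part[2] == "c") found = 1
        }
        print nj, nk, found
    }')
echo "$demo"
```

This prints 1 2 1: one key in the concatenated array, two keys in the SUBSEP-based array, and a 1 showing that the first row's fields were recovered by splitting its key on SUBSEP.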

The originally posted solution should give you the desired result.
Given file1:

Code:
zip,FirstName,Lastname
07777,abc,def
22584,dec,dlo
25487,xyz,jkl
25488,dim,kio

and file2:

Code:
SSN,Firstname,LastName
123456789,abc,def
123456789,dec,dlo
123456789,xyz,jkl
123456789,dim,kio

running:

Code:
nawk -F, 'NR==FNR {a[$2,$3]=$1;next} ($2 SUBSEP $3) in a {print a[$2,$3],$1,$2,$3}'  OFS=, file2 file1

Results in:

Code:
123456789,07777,abc,def
123456789,22584,dec,dlo
123456789,25487,xyz,jkl
123456789,25488,dim,kio

Check your file1 and file2 for discrepancies and/or embedded spaces.
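One possible way to look for such problems is sketched below. The three sample lines are made up (the second has a trailing space, the third repeats the name pair of the first), and awk stands in for nawk; to check a real file, pass file1 or file2 to awk instead of the printf pipe:

Code:
```shell
report=$(printf '%s\n' '07777,abc,def' '22584,dec,dlo ' '99999,abc,def' |
awk -F, '
    /[ \t\r]/      { print "whitespace on line " FNR }     # space, tab or carriage return anywhere
    seen[$2,$3]++  { print "repeated key on line " FNR }   # name pair already seen on an earlier line
')
echo "$report"
```

For the sample this reports whitespace on line 2 and a repeated key on line 3.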

Also, this is NOT one of your first forum posts and you've been asked in the past: please use BB Code tags when posting data or code samples.