The UNIX and Linux Forums  
Hello and Welcome from United States to the UNIX and Linux Forums! Thank You for Visiting and Joining Our Global Community.

Go Back   The UNIX and Linux Forums > Top Forums > Shell Programming and Scripting
.
google unix.com



Shell Programming and Scripting Post questions about KSH, CSH, SH, BASH, PERL, PHP, SED, AWK and OTHER shell scripts and shell scripting languages here.

More UNIX and Linux Forum Topics You Might Find Helpful
Thread Thread Starter Forum Replies Last Post
Script extracting the incorrect data from text file jermaine4ever Shell Programming and Scripting 6 03-16-2009 12:18 PM
Incorrect login NIS? Juterassee SUN Solaris 5 10-30-2008 11:08 AM
login incorrect espace1000 UNIX for Dummies Questions & Answers 2 08-22-2008 06:48 AM
Login Incorrect sydney2008 Red Hat 6 08-22-2008 04:57 AM
Incorrect Directory Name jand102821 UNIX for Dummies Questions & Answers 1 06-19-2002 04:35 PM

Closed Thread
English Japanese Spanish French German Portuguese Italian Dutch Swedish Russian Norwegian Hungarian Hebrew Danish Powered by Powered by Google
 
LinkBack Thread Tools Search this Thread Rate Thread Display Modes
  #1 (permalink)  
Old 04-21-2009
pinnacle pinnacle is offline
Registered User
  
 

Join Date: Apr 2009
Posts: 182
Awk incorrect data.

I am using the following command:

Code:
nawk -F"," 'NR==FNR {a[$2$3]=$1;next} a[$2$3] {print a[$2$3],$1,$2,$3}'  file1 file2
I am getting 40 records output.
But when i import file1 and file2 in MS Access i get 140 records.
And i know 140 is correct count.

Appreciate your help on correcting the above script
  #2 (permalink)  
Old 04-21-2009
vgersh99's Avatar
vgersh99 vgersh99 is online now Forum Staff  
Moderator
  
 

Join Date: Feb 2005
Location: Boston, MA
Posts: 5,119
It would definitely help if you provided a more detailed description of what you're trying to achieve with the sample data files and the expected output.

My crystal ball is a bit fuzzy, but:
Code:
nawk -F"," 'NR==FNR {a[$2,$3]=$1;next} ($2 SUBSEP $3) in a {print a[$2,$3],$1,$2,$3}'  OFS=, file1 file2
  #3 (permalink)  
Old 04-21-2009
pinnacle pinnacle is offline
Registered User
  
 

Join Date: Apr 2009
Posts: 182
Quote:
Originally Posted by vgersh99 View Post
It would definitely help if you provided a more detailed description of what you're trying to achieve with the sample data files and the expected output.

My crystal ball is a bit fuzzy, but:
Code:
nawk -F"," 'NR==FNR {a[$2,$3]=$1;next} ($2 SUBSEP $3) in a {print a[$2,$3],$1,$2,$3}'  OFS=, file1 file2
vgersh99

I have two files
$ head file1
zip,FirstName,Lastname
07777,abc,def
22584,dec,dlo
25487,xyz,jkl
25488,dim,kio

$ head file2
aim server database
SSN,Firstname,LastName
123456789,abc,def
123456789,dec,dlo
123456789,xyz,jkl
123456789,dim,kio
wanted Output:
SSN,zip,FirstName,LastName

Code:
nawk -F"," 'NR==FNR {a[$2,$3]=$1;next} ($2 SUBSEP $3) in a {print a[$2,$3],$1,$2,$3}'  OFS=,  " file2 file1
40 Matches
Code:
nawk -F"," 'NR==FNR {a[$2,$3]=$1;next} ($2 SUBSEP $3) in a {print a[$2,$3],$1,$2,$3}'  OFS=,  " file1 file2
140 matches
140 matches is correct i know but both should give 140 i dont know why its giving difference.

Can you please explain this part ($2 SUBSEP $3)
a[$2,$3] we are using , here because its is comma seperated inputfile or is it general rule
If i dont use , then also i am getting same result
  #4 (permalink)  
Old 04-21-2009
vgersh99's Avatar
vgersh99 vgersh99 is online now Forum Staff  
Moderator
  
 

Join Date: Feb 2005
Location: Boston, MA
Posts: 5,119
Quote:
Originally Posted by zenith View Post
vgersh99

I have two files
$ head file1
zip,FirstName,Lastname
07777,abc,def
22584,dec,dlo
25487,xyz,jkl
25488,dim,kio

$ head file2
aim server database
SSN,Firstname,LastName
123456789,abc,def
123456789,dec,dlo
123456789,xyz,jkl
123456789,dim,kio
wanted Output:
SSN,zip,FirstName,LastName

Code:
nawk -F"," 'NR==FNR {a[$2,$3]=$1;next} ($2 SUBSEP $3) in a {print a[$2,$3],$1,$2,$3}'  OFS=,  " file2 file1
40 Matches
Code:
nawk -F"," 'NR==FNR {a[$2,$3]=$1;next} ($2 SUBSEP $3) in a {print a[$2,$3],$1,$2,$3}'  OFS=,  " file1 file2
140 matches
The above 2 invocations are exactly the same. I don't understand why you're getting different results.
Also I don't understand why you have a trailing double quote (in red) in both case?
Quote:
Originally Posted by zenith
140 matches is correct i know but both should give 140 i dont know why its giving difference.

Can you please explain this part ($2 SUBSEP $3)
a[$2,$3] we are using , here because its is comma seperated inputfile or is it general rule
If i dont use , then also i am getting same result
No, it's not because your file is comma-separated. You can build your array index just by concatenating the strings ($2$3) or (which is better for further processing) by doing this:
Code:
a[$2,$3]
In the context of the array index building, the "," is substituted by the awk's internal variable SUBSEP. If later on you decide to "split" the index (to find it parts) you can split by SUBSEP. If you simply concatenate the string, you cannot reconstruct the index to its original parts.

The originally posted solution should give you the desired result.
Given file1:
Code:
zip,FirstName,Lastname
07777,abc,def
22584,dec,dlo
25487,xyz,jkl
25488,dim,kio
and file2:
Code:
SSN,Firstname,LastName
123456789,abc,def
123456789,dec,dlo
123456789,xyz,jkl
123456789,dim,kio
running:
Code:
nawk -F, 'NR==FNR {a[$2,$3]=$1;next} ($2 SUBSEP $3) in a {print a[$2,$3],$1,$2,$3}'  OFS=, file2 file1
Results in:
Code:
123456789,07777,abc,def
123456789,22584,dec,dlo
123456789,25487,xyz,jkl
123456789,25488,dim,kio
Check your file1 and file2 - see if there're any discrepancies and/or embedded spaces.

Also, this is NOT one of your first forum posts and you've been asked in the past: please use BB Code tags when posting data or code samples.
  #5 (permalink)  
Old 04-21-2009
pinnacle pinnacle is offline
Registered User
  
 

Join Date: Apr 2009
Posts: 182
Quote:
Originally Posted by vgersh99 View Post
The above 2 invocations are exactly the same. I don't understand why you're getting different results.
Also I don't understand why you have a trailing double quote (in red) in both case?

No, it's not because your file is comma-separated. You can build your array index just by concatenating the strings ($2$3) or (which is better for further processing) by doing this:
Code:
a[$2,$3]
In the context of the array index building, the "," is substituted by the awk's internal variable SUBSEP. If later on you decide to "split" the index (to find it parts) you can split by SUBSEP. If you simply concatenate the string, you cannot reconstruct the index to its original parts.

The originally posted solution should give you the desired result.
Given file1:
Code:
zip,FirstName,Lastname
07777,abc,def
22584,dec,dlo
25487,xyz,jkl
25488,dim,kio
and file2:
Code:
SSN,Firstname,LastName
123456789,abc,def
123456789,dec,dlo
123456789,xyz,jkl
123456789,dim,kio
running:
Code:
nawk -F, 'NR==FNR {a[$2,$3]=$1;next} ($2 SUBSEP $3) in a {print a[$2,$3],$1,$2,$3}'  OFS=, file2 file1
Results in:
Code:
123456789,07777,abc,def
123456789,22584,dec,dlo
123456789,25487,xyz,jkl
123456789,25488,dim,kio
Check your file1 and file2 - see if there're any discrepancies and/or embedded spaces.

Also, this is NOT one of your first forum posts and you've been asked in the past: please use BB Code tags when posting data or code samples.
Code:
nawk -F"," 'NR==FNR {a[$2,$3]=$1;next} ($2 SUBSEP $3) in a {print a[$2,$3],$1,$2,$3}'  OFS=, file1 file2
In the above code if i switch the file1 and fie2 position then i get different results.
I cannot post the files due to data sensitivity.
I visually checked the files and i see no special characters or anything.
Is there a special command to verify this.

Appreciate your response.
  #6 (permalink)  
Old 04-22-2009
vgersh99's Avatar
vgersh99 vgersh99 is online now Forum Staff  
Moderator
  
 

Join Date: Feb 2005
Location: Boston, MA
Posts: 5,119
Quote:
Originally Posted by zenith View Post
Code:
nawk -F"," 'NR==FNR {a[$2,$3]=$1;next} ($2 SUBSEP $3) in a {print a[$2,$3],$1,$2,$3}'  OFS=, file1 file2
In the above code if i switch the file1 and fie2 position then i get different results.
I cannot post the files due to data sensitivity.
I visually checked the files and i see no special characters or anything.
Is there a special command to verify this.

Appreciate your response.
Patient: Doc, it really hurts when I do that!
Doctor: Then don't do that!

The positions of the files on the command line is important for mapping the fields from one file to the other. Look at your data files' fields - try to see the difference and look at your original posting for the mapping logic.
Good luck.
Closed Thread

Bookmarks

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes Rate This Thread
Rate This Thread:

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On




All times are GMT -4. The time now is 05:06 AM.


Powered by: vBulletin, Copyright ©2000 - 2006, Jelsoft Enterprises Limited. Language Translations Powered by .
vBCredits v1.4 Copyright ©2007 - 2008, PixelFX Studios
The UNIX and Linux Forums Content Copyright ©1993-2009. All Rights Reserved.Ad Management by RedTyger

Content Relevant URLs by vBSEO 3.2.0