The UNIX and Linux Forums  
#1  RacerX  12-20-2007
Awk Compare Files w/Multiline Records

I'm trying to compare the first-column values in two files that use a numerical value as the key, and to output the more meaningful value from the second column of file1 in front of each matching line in file2. My problem is that file2 has multiple records per key. For example, given:
FILE1
1:A A CABBS:B:G:1988:9:3:3:1:7,060
2:A A CARSON:B:M:1990:21:1:0:3:9,500
3:A A DUPONT:B:M:1978:21:3:2:4:13,500
4:AAHGH TINIAN:B:G:2001:31:5:5:4:10,100
5:AAHON YVES:B:G:1994:41:5:4:3:18,795

FILE2
1:98-02-20:LAX:40:SL
1:98-02-27:LAX:40:GD
1:98-03-8:LAX:36:SL
1:98-03-13:LAX:31:GD
1:98-03-27:LAX:60:FT
1:98-04-3:LAX:45:FT
2:98-05-29:LLG:71:FT
2:98-06-6:LLG:57:FT
2:98-06-12:LLG:71:FT
3:98-05-23:LLG:62:FT
3:98-06-6:LLG:55:FT
4:98-01-6:BOS:58:GD
5:98-01-5:CHI:58:FT
5:98-01-12:CHI:39:FT
5:98-01-19:CHI:30:GD
5:98-01-28:CHI:39:FT

Desired OUTPUT
A A CABBS:1:98-02-20:LAX:40:SL
A A CABBS:1:98-02-27:LAX:40:GD
A A CABBS:1:98-03-8:LAX:36:SL
A A CABBS:1:98-03-13:LAX:31:GD
A A CABBS:1:98-03-27:LAX:60:FT
A A CABBS:1:98-04-3:LAX:45:FT
A A CARSON:2:98-05-29:LLG:71:FT
A A CARSON:2:98-06-6:LLG:57:FT
A A CARSON:2:98-06-12:LLG:71:FT
A A DUPONT:3:98-05-23:LLG:62:FT
A A DUPONT:3:98-06-6:LLG:55:FT
AAHGH TINIAN:4:98-01-6:BOS:58:GD
AAHON YVES:5:98-01-5:CHI:58:FT
AAHON YVES:5:98-01-12:CHI:39:FT
AAHON YVES:5:98-01-19:CHI:30:GD
AAHON YVES:5:98-01-28:CHI:39:FT

I have come up with the following awk program:
Code:
BEGIN {
   FS = OFS = ":";
   while (getline < ARGV[1]) {
      field1 = $1;
      field2 = $2;
      while (getline < ARGV[2]) {
         if ($1==field1) {
            print field2, $0;
         }
      }
   }
}
#awk -f ~/Desktop/alt.awk ~/Desktop/file1.txt ~/Desktop/file2.txt > ~/Desktop/Output.txt
However, it only returns what I want for the first record and then stops. I know I'm missing something but don't know what: an array, a loop, or both? Any suggestions or help would be appreciated, as my real files have 39,000 records and I've been going nowhere with this database project for over a week.
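
A note on why the program above stops after the first key: the inner `getline < ARGV[2]` loop reads file2 all the way to end-of-file while the first line of file1 is being processed. The file is never closed, so for every later line of file1 the inner getline returns 0 at once and nothing more is printed. The smallest change that keeps the nested-loop shape is to call close() after the inner loop, so that the next getline reopens file2 from the top. The following is only a sketch under assumptions: the temporary file names and the shortened sample rows are stand-ins, and the loops test `> 0` so that a read error (-1) cannot spin forever.

```shell
# Shortened stand-in data; the real files have more columns and rows.
dir=$(mktemp -d)
printf '%s\n' '1:A A CABBS:B:G' '2:A A CARSON:B:M' > "$dir/file1.txt"
printf '%s\n' '1:98-02-20:LAX:40:SL' '1:98-02-27:LAX:40:GD' \
              '2:98-05-29:LLG:71:FT' > "$dir/file2.txt"

out=$(awk 'BEGIN {
    FS = OFS = ":"
    while ((getline < ARGV[1]) > 0) {
        field1 = $1
        field2 = $2
        while ((getline < ARGV[2]) > 0) {
            if ($1 == field1)
                print field2, $0
        }
        close(ARGV[2])   # without this, file2 stays at EOF for the next key
    }
}' "$dir/file1.txt" "$dir/file2.txt")
printf '%s\n' "$out"
rm -rf "$dir"
```

This rescans file2 once for every line of file1, so with 39,000 records it is likely to be slow; reading file1 into an array and passing over file2 once avoids the rescans.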
#2  shamrock (Forum Advisor)  12-20-2007
This looks like a job for join, provided both FILE1 and FILE2 are sorted on the first field...

Code:
join -t":" -1 1 -2 1 -o 1.2 2.1 2.2 2.3 2.4 2.5 FILE1 FILE2
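
One caveat with numeric keys: join expects both inputs in text (collation) order on the join field, not numeric order, so a file that runs 1, 2, ... 10 is not sorted as far as join is concerned, because "10" collates before "2". A hedged sketch of sorting first and then joining is below. The key 10 and the name `B B EXAMPLE` are made-up rows added only to show the ordering issue, the `.sorted` file names are arbitrary, and the `-o` list is written in its comma-separated form.

```shell
dir=$(mktemp -d)
printf '%s\n' '2:A A CARSON:B:M' '10:B B EXAMPLE:B:G' > "$dir/FILE1"
printf '%s\n' '2:98-05-29:LLG:71:FT' '10:98-01-06:BOS:58:GD' \
              '10:98-01-13:BOS:60:FT' > "$dir/FILE2"

# Sort both files on field 1 as text, the order join expects.
LC_ALL=C sort -t: -k1,1 "$dir/FILE1" > "$dir/FILE1.sorted"
LC_ALL=C sort -t: -k1,1 "$dir/FILE2" > "$dir/FILE2.sorted"

out=$(LC_ALL=C join -t: -1 1 -2 1 -o 1.2,2.1,2.2,2.3,2.4,2.5 \
      "$dir/FILE1.sorted" "$dir/FILE2.sorted")
printf '%s\n' "$out"
rm -rf "$dir"
```

The joined lines come out in text order of the key (10 before 2 here), so a final numeric sort on the second output field may be wanted if the original order matters.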
#3  vgersh99 (Moderator)  12-20-2007
nawk -f racer.awk FILE1 FILE2
racer.awk:
Code:
BEGIN {
   FS=OFS=":"
}
FNR==NR { arr[$1]=$2; next}
$1 in arr { print arr[$1], $0 }
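
For readers new to the idiom: NR counts records across all input files while FNR restarts at 1 for each file, so `FNR==NR` is true only while the first file is being read. That pass fills the array, and `next` skips the second rule; every later line comes from the second file and is printed only if its key was stored. A small self-contained run is sketched below, using `awk` in place of `nawk`, an array called `name`, and a made-up key 9 line in FILE2 to show that unmatched keys are dropped; those three choices are assumptions for illustration.

```shell
dir=$(mktemp -d)
printf '%s\n' '1:A A CABBS:B:G:1988' '4:AAHGH TINIAN:B:G:2001' > "$dir/FILE1"
printf '%s\n' '1:98-02-20:LAX:40:SL' '4:98-01-6:BOS:58:GD' \
              '9:98-01-5:CHI:58:FT' > "$dir/FILE2"

out=$(awk '
BEGIN { FS = OFS = ":" }
# First file only: the per-file count equals the overall count.
FNR == NR { name[$1] = $2; next }
# Second file: print only lines whose key was seen in the first file.
$1 in name { print name[$1], $0 }
' "$dir/FILE1" "$dir/FILE2")
printf '%s\n' "$out"
rm -rf "$dir"
```

Each file is read once, which is why this scales to tens of thousands of records where a nested loop does not. The test misfires if the first file is empty, since FNR==NR then also holds for the second file.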
  #4 (permalink)  
Old 12-20-2007
RacerX's Avatar
RacerX RacerX is offline
Registered User
  
 

Join Date: Oct 2007
Posts: 38
Thanks for the replies. I decided to give vgersh99's version a try because I am more comfortable with awk code, and it worked to perfection on my files.

You gurus are great but always make me feel like such a buffoon, as it probably took you less than five minutes to solve it while I was banging my head on the wall for over a week.

Oh well, I guess we all have to learn at our own pace. Thanks again for the help!
#5  summer_cherry (Forum Advisor)  12-21-2007
awk

Hi,

Just for your reference, this one should also work for you.

Code:
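nawk 'BEGIN {
    FS = ":"
    OFS = ":"
}
{
    if (NR == FNR)
        a[$1] = $2          # first file: remember the name for each key
    else
        print a[$1], $0     # second file: name first, then the whole line, key kept
}' file1 file2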
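
One difference from the `$1 in arr` version earlier in the thread: this if/else form prints every line of file2, so a key that is missing from file1 produces a line with an empty name rather than being skipped. If such lines should stay visible but be easy to spot, a placeholder can be substituted. The sketch below is an illustration only: the `UNKNOWN` label and the key 9 row are made up, `awk` stands in for `nawk`, and the key is kept in the output as in the desired output from the first post.

```shell
dir=$(mktemp -d)
printf '%s\n' '1:A A CABBS:B:G:1988' > "$dir/file1"
printf '%s\n' '1:98-02-20:LAX:40:SL' '9:98-01-5:CHI:58:FT' > "$dir/file2"

out=$(awk '
BEGIN { FS = OFS = ":" }
NR == FNR { a[$1] = $2; next }
{
    # Use the stored name when the key is known, a placeholder otherwise.
    label = ($1 in a) ? a[$1] : "UNKNOWN"
    print label, $0
}' "$dir/file1" "$dir/file2")
printf '%s\n' "$out"
rm -rf "$dir"
```

Filtering the result for lines starting with `UNKNOWN:` then gives a quick list of file2 keys that have no entry in file1.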