Awk Compare Files w/Multiline Records


 
# 1  
Old 12-20-2007
Awk Compare Files w/Multiline Records

I'm trying to match the first-column values of two files, which use a numeric value as the key, and print the more meaningful value from the second column of file1 in front of each matching line in file2. My problem is that file2 has multiple records per key. For example, given:
FILE1
1:A A CABBS:B:G:1988:9:3:3:1:7,060
2:A A CARSON:B:M:1990:21:1:0:3:9,500
3:A A DUPONT:B:M:1978:21:3:2:4:13,500
4:AAHGH TINIAN:B:G:2001:31:5:5:4:10,100
5:AAHON YVES:B:G:1994:41:5:4:3:18,795

FILE2
1:98-02-20:LAX:40:SL
1:98-02-27:LAX:40:GD
1:98-03-8:LAX:36:SL
1:98-03-13:LAX:31:GD
1:98-03-27:LAX:60:FT
1:98-04-3:LAX:45:FT
2:98-05-29:LLG:71:FT
2:98-06-6:LLG:57:FT
2:98-06-12:LLG:71:FT
3:98-05-23:LLG:62:FT
3:98-06-6:LLG:55:FT
4:98-01-6:BOS:58:GD
5:98-01-5:CHI:58:FT
5:98-01-12:CHI:39:FT
5:98-01-19:CHI:30:GD
5:98-01-28:CHI:39:FT

Desired OUTPUT
A A CABBS:1:98-02-20:LAX:40:SL
A A CABBS:1:98-02-27:LAX:40:GD
A A CABBS:1:98-03-8:LAX:36:SL
A A CABBS:1:98-03-13:LAX:31:GD
A A CABBS:1:98-03-27:LAX:60:FT
A A CABBS:1:98-04-3:LAX:45:FT
A A CARSON:2:98-05-29:LLG:71:FT
A A CARSON:2:98-06-6:LLG:57:FT
A A CARSON:2:98-06-12:LLG:71:FT
A A DUPONT:3:98-05-23:LLG:62:FT
A A DUPONT:3:98-06-6:LLG:55:FT
AAHGH TINIAN:4:98-01-6:BOS:58:GD
AAHON YVES:5:98-01-5:CHI:58:FT
AAHON YVES:5:98-01-12:CHI:39:FT
AAHON YVES:5:98-01-19:CHI:30:GD
AAHON YVES:5:98-01-28:CHI:39:FT

I have come up with the following awk program:
Code:
BEGIN {
FS = OFS = ":";
while (getline < ARGV[1]) {
   field1 = $1;
   field2 = $2;
   while (getline < ARGV[2]) {
      if ($1==field1) {
         print field2, $0;
      }
	}
}
 
}
#awk -f ~/Desktop/alt.awk ~/Desktop/file1.txt ~/Desktop/file2.txt > ~/Desktop/Output.txt

However, it only returns what I want for the first record and then stops. I know I'm missing something but don't know what: an array, a loop, or both? Any suggestions or help would be appreciated, as my real files have 39,000 records and I've been going nowhere with this database project for over a week.
# 2  
Old 12-20-2007
Quote:
Originally Posted by RacerX
I'm trying to compare the first column values in two different files that use a numerical value as the key and output the more meaningful value found in the second column of file1 in front of the matching line(s) in file2. [...]

This looks like a job for join, provided both FILE1 and FILE2 are sorted on the key field:

Code:
join -t":" -1 1 -2 1 -o 1.2,2.1,2.2,2.3,2.4,2.5 FILE1 FILE2
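
A note on the "sorted" condition: join compares keys as text, so the files need to be sorted as strings on field 1, not numerically. With keys running up to 39,000, a numerically ordered file is not in the order join expects. A minimal sketch, assuming a POSIX sort and join; the temporary directory and the shortened, deliberately shuffled excerpts of the sample files are only there to make the example self-contained:

```shell
# Work in a scratch directory so the real files are untouched.
tmp=$(mktemp -d)

# Two records from the FILE1 sample.
cat > "$tmp/FILE1" <<'EOF'
1:A A CABBS:B:G:1988:9:3:3:1:7,060
2:A A CARSON:B:M:1990:21:1:0:3:9,500
EOF

# Three records from the FILE2 sample, out of order on purpose.
cat > "$tmp/FILE2" <<'EOF'
2:98-05-29:LLG:71:FT
1:98-02-20:LAX:40:SL
1:98-02-27:LAX:40:GD
EOF

# Sort both files as text on field 1, using the same locale join will use.
LC_ALL=C sort -t: -k1,1 "$tmp/FILE1" > "$tmp/f1.sorted"
LC_ALL=C sort -t: -k1,1 "$tmp/FILE2" > "$tmp/f2.sorted"

# Field 2 of the first file, then every field of the second file.
joined=$(LC_ALL=C join -t: -o 1.2,2.1,2.2,2.3,2.4,2.5 "$tmp/f1.sorted" "$tmp/f2.sorted")
printf '%s\n' "$joined"

rm -rf "$tmp"
```

The output lines come back in text order of the key, so if the original numeric order matters, the result would need a further numeric sort on the second field.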

# 3  
Old 12-20-2007
nawk -f racer.awk FILE1 FILE2
racer.awk:
Code:
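BEGIN {
   FS=OFS=":"
}
# First file (FILE1): remember the name for each key, then skip to the next record.
FNR==NR { arr[$1]=$2; next }
# Second file (FILE2): prefix every line whose key was seen in FILE1 with that name.
$1 in arr { print arr[$1], $0 }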
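
For comparison, the nested getline attempt in the first post stops after the first record because the inner loop reads FILE2 to the end once and the file is never closed, so every later pass hits end-of-file immediately. The sketch below shows what seems to be the smallest repair: call close() after the inner loop so the next getline reopens the file from the start. It also tests getline against 0, since getline returns -1 on an unreadable file and a bare while (getline < file) would then loop forever. The variable names key, name, out and tmp, and the shortened sample files, are placeholders for this example only.

```shell
tmp=$(mktemp -d)

# Shortened excerpts of the sample files from the first post.
cat > "$tmp/FILE1" <<'EOF'
1:A A CABBS:B:G:1988:9:3:3:1:7,060
2:A A CARSON:B:M:1990:21:1:0:3:9,500
EOF
cat > "$tmp/FILE2" <<'EOF'
1:98-02-20:LAX:40:SL
1:98-02-27:LAX:40:GD
2:98-05-29:LLG:71:FT
EOF

out=$(awk 'BEGIN {
    FS = OFS = ":"
    # Outer loop: one pass over the first file.
    while ((getline < ARGV[1]) > 0) {
        key = $1
        name = $2
        # Inner loop: scan the whole second file for this key.
        while ((getline < ARGV[2]) > 0)
            if ($1 == key)
                print name, $0
        # Without this close, the inner loop sees end-of-file on every later pass.
        close(ARGV[2])
    }
}' "$tmp/FILE1" "$tmp/FILE2")
printf '%s\n' "$out"

rm -rf "$tmp"
```

This rereads the second file once per record of the first, so on 39,000 records it does far more work than the array version above, which reads each file exactly once.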

# 4  
Old 12-20-2007
Thanks for the replies. I decided to give vgersh99's version a try because I am more comfortable with awk code, and it worked perfectly on my files.

You gurus are great but always make me feel like such a buffoon :) It probably took you less than five minutes to solve, while I was banging my head against the wall for over a week.

Oh well, I guess we all have to learn at our own pace. Thanks again for the help!
# 5  
Old 12-21-2007
awk

Hi,

Just for reference, this one should also work for you:

Code:
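nawk 'BEGIN{
FS=":"
OFS=":"
}
{
if (NR==FNR)
	# first file: map each key to its name
	a[$1]=$2
else if ($1 in a)
{
	# second file: print the name in front of the whole line, keeping the key
	# (assigning $1=a[$1] would drop the key that the desired output shows)
	print a[$1], $0
}
}' file1 file2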
