Post 302152638 by RacerX on Thursday 20th of December 2007 03:51:34 PM
Awk Compare Files w/Multiline Records

I'm trying to compare the first-column values of two files that share a numeric key, and to print the more meaningful value from the second column of file1 in front of each matching line in file2. My problem is that file2 has multiple records for each key. For example, given:
FILE1
1:A A CABBS:B:G:1988:9:3:3:1:7,060
2:A A CARSON:B:M:1990:21:1:0:3:9,500
3:A A DUPONT:B:M:1978:21:3:2:4:13,500
4:AAHGH TINIAN:B:G:2001:31:5:5:4:10,100
5:AAHON YVES:B:G:1994:41:5:4:3:18,795

FILE2
1:98-02-20:LAX:40:SL
1:98-02-27:LAX:40:GD
1:98-03-8:LAX:36:SL
1:98-03-13:LAX:31:GD
1:98-03-27:LAX:60:FT
1:98-04-3:LAX:45:FT
2:98-05-29:LLG:71:FT
2:98-06-6:LLG:57:FT
2:98-06-12:LLG:71:FT
3:98-05-23:LLG:62:FT
3:98-06-6:LLG:55:FT
4:98-01-6:BOS:58:GD
5:98-01-5:CHI:58:FT
5:98-01-12:CHI:39:FT
5:98-01-19:CHI:30:GD
5:98-01-28:CHI:39:FT

Desired OUTPUT
A A CABBS:1:98-02-20:LAX:40:SL
A A CABBS:1:98-02-27:LAX:40:GD
A A CABBS:1:98-03-8:LAX:36:SL
A A CABBS:1:98-03-13:LAX:31:GD
A A CABBS:1:98-03-27:LAX:60:FT
A A CABBS:1:98-04-3:LAX:45:FT
A A CARSON:2:98-05-29:LLG:71:FT
A A CARSON:2:98-06-6:LLG:57:FT
A A CARSON:2:98-06-12:LLG:71:FT
A A DUPONT:3:98-05-23:LLG:62:FT
A A DUPONT:3:98-06-6:LLG:55:FT
AAHGH TINIAN:4:98-01-6:BOS:58:GD
AAHON YVES:5:98-01-5:CHI:58:FT
AAHON YVES:5:98-01-12:CHI:39:FT
AAHON YVES:5:98-01-19:CHI:30:GD
AAHON YVES:5:98-01-28:CHI:39:FT

I have come up with the following awk program:
Code:
BEGIN {
   FS = OFS = ":";
   while (getline < ARGV[1]) {
      field1 = $1;
      field2 = $2;
      while (getline < ARGV[2]) {
         if ($1==field1) {
            print field2, $0;
         }
      }
   }
}
#awk -f ~/Desktop/alt.awk ~/Desktop/file1.txt ~/Desktop/file2.txt > ~/Desktop/Output.txt

However, it only returns what I want for the first record and then stops. I know I'm missing something but don't know what: an array, a loop, or both? Any suggestions or help would be appreciated, as my real files have 39,000 records and I've been going nowhere with this database project for over a week.
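
One likely explanation for the behavior described: the inner `while (getline < ARGV[2])` loop reads file2 all the way to end-of-file during the first pass of the outer loop. Since the file is never closed, every later `getline < ARGV[2]` returns 0 immediately, so only key 1 ever gets matched. Calling `close(ARGV[2])` after the inner loop would make it work, but it would reread file2 once per file1 record. A common alternative is to use an array: load file1 into it keyed on column 1, then stream through file2 once. The sketch below assumes every key in file1 is unique and that file1 is not empty. The array name `name` and the temporary file names are made up for illustration, and the sample data is trimmed from the post.

```shell
# Sample data copied from the post (first two keys only, to keep this short).
workdir=$(mktemp -d)
cat > "$workdir/file1.txt" <<'EOF'
1:A A CABBS:B:G:1988:9:3:3:1:7,060
2:A A CARSON:B:M:1990:21:1:0:3:9,500
EOF
cat > "$workdir/file2.txt" <<'EOF'
1:98-02-20:LAX:40:SL
1:98-02-27:LAX:40:GD
2:98-05-29:LLG:71:FT
EOF

# First rule: NR == FNR holds only while awk reads the first file named on
#   the command line, so store column 2 (the name) under the column 1 key.
# Second rule: for each file2 line whose key was stored, print the name
#   followed by the whole line, joined with ":" via OFS.
result=$(awk -F: -v OFS=: '
    NR == FNR  { name[$1] = $2; next }
    $1 in name { print name[$1], $0 }
' "$workdir/file1.txt" "$workdir/file2.txt")

printf '%s\n' "$result"
rm -rf "$workdir"
```

Run against the real files, the same two rules can go in alt.awk without the BEGIN block and be invoked as before, with file1 first and file2 second. Lines in file2 whose key is missing from file1 are silently skipped here. If those should be kept, the second rule would need a fallback branch.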
 
