Awk Compare Files w/Multiline Records
Post 302152642 by shamrock on Thursday 20th of December 2007 04:25:49 PM
Quote:
Originally Posted by RacerX
I'm trying to compare the first-column values in two different files that use a numerical value as the key, and to output the more meaningful value found in the second column of file1 in front of the matching line(s) in file2. My problem is that file2 has multiple records per key. For example, given:
FILE1
1:A A CABBS:B:G:1988:9:3:3:1:7,060
2:A A CARSON:B:M:1990:21:1:0:3:9,500
3:A A DUPONT:B:M:1978:21:3:2:4:13,500
4:AAHGH TINIAN:B:G:2001:31:5:5:4:10,100
5:AAHON YVES:B:G:1994:41:5:4:3:18,795

FILE2
1:98-02-20:LAX:40:SL
1:98-02-27:LAX:40:GD
1:98-03-8:LAX:36:SL
1:98-03-13:LAX:31:GD
1:98-03-27:LAX:60:FT
1:98-04-3:LAX:45:FT
2:98-05-29:LLG:71:FT
2:98-06-6:LLG:57:FT
2:98-06-12:LLG:71:FT
3:98-05-23:LLG:62:FT
3:98-06-6:LLG:55:FT
4:98-01-6:BOS:58:GD
5:98-01-5:CHI:58:FT
5:98-01-12:CHI:39:FT
5:98-01-19:CHI:30:GD
5:98-01-28:CHI:39:FT

Desired OUTPUT
A A CABBS:1:98-02-20:LAX:40:SL
A A CABBS:1:98-02-27:LAX:40:GD
A A CABBS:1:98-03-8:LAX:36:SL
A A CABBS:1:98-03-13:LAX:31:GD
A A CABBS:1:98-03-27:LAX:60:FT
A A CABBS:1:98-04-3:LAX:45:FT
A A CARSON:2:98-05-29:LLG:71:FT
A A CARSON:2:98-06-6:LLG:57:FT
A A CARSON:2:98-06-12:LLG:71:FT
A A DUPONT:3:98-05-23:LLG:62:FT
A A DUPONT:3:98-06-6:LLG:55:FT
AAHGH TINIAN:4:98-01-6:BOS:58:GD
AAHON YVES:5:98-01-5:CHI:58:FT
AAHON YVES:5:98-01-12:CHI:39:FT
AAHON YVES:5:98-01-19:CHI:30:GD
AAHON YVES:5:98-01-28:CHI:39:FT

I have come up with the following awk program:
Code:
BEGIN {
   FS = OFS = ":";
   while (getline < ARGV[1]) {
      field1 = $1;
      field2 = $2;
      while (getline < ARGV[2]) {
         if ($1==field1) {
            print field2, $0;
         }
      }
   }
}
#awk -f ~/Desktop/alt.awk ~/Desktop/file1.txt ~/Desktop/file2.txt > ~/Desktop/Output.txt

However, it only returns what I want for the first record and then stops. I know I'm missing something but don't know what: an array, a loop, or both? Any suggestions or help would be appreciated, as my real files have 39,000 records and I've been going nowhere with this database project for over a week.
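On the "array or loop or both?" question: the inner while (getline < ARGV[2]) reads file2 all the way to end-of-file while the first file1 record is being handled. The file is never closed, so for every later file1 record the inner getline returns 0 immediately and nothing more is printed. Calling close(ARGV[2]) after the inner loop would get it working, but it would re-read file2 once per file1 record. A minimal sketch of the usual array approach follows, assuming the keys in file1 are unique and file1 is not empty; the temporary directory and the cut-down sample files are only for illustration.

```shell
# Sample data: a cut-down copy of FILE1 and FILE2 from the post.
dir=$(mktemp -d)
printf '%s\n' '1:A A CABBS:B:G:1988:9:3:3:1:7,060' \
              '2:A A CARSON:B:M:1990:21:1:0:3:9,500' > "$dir/file1.txt"
printf '%s\n' '1:98-02-20:LAX:40:SL' \
              '1:98-02-27:LAX:40:GD' \
              '2:98-05-29:LLG:71:FT' > "$dir/file2.txt"

# While reading the first file (NR == FNR), remember each name by its key.
# While reading the second file, print the stored name in front of every
# line whose key was seen in the first file.
awk -F: -v OFS=: '
    NR == FNR  { name[$1] = $2; next }
    $1 in name { print name[$1], $0 }
' "$dir/file1.txt" "$dir/file2.txt" > "$dir/output.txt"

cat "$dir/output.txt"
```

Each file is read once, and neither file has to be sorted, so this should cope with 39,000 records without trouble.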
This looks like a job for join, provided both FILE1 and FILE2 are sorted on the key field:

Code:
join -t":" -1 1 -2 1 -o 1.2,2.1,2.2,2.3,2.4,2.5 FILE1 FILE2
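One caveat on "sorted": join expects the key field in lexical (collation) order, not numeric order, so a file whose keys run 1, 2, ... 10 in numeric order is not sorted as far as join is concerned, because "10" sorts before "2". A rough sketch of sorting both files first is below. Records 1 and 2 are taken from the post; the key 10 records are made up purely to show the ordering issue, and the temporary directory and the .sorted file names are assumptions too.

```shell
dir=$(mktemp -d)
# Keys 1 and 2 come from the post; the key 10 lines are hypothetical records
# added only to demonstrate lexical versus numeric ordering.
printf '%s\n' '1:A A CABBS:B:G:1988:9:3:3:1:7,060' \
              '2:A A CARSON:B:M:1990:21:1:0:3:9,500' \
              '10:EXAMPLE NAME:B:G:2000:1:1:1:1:1,000' > "$dir/FILE1"
printf '%s\n' '1:98-02-20:LAX:40:SL' \
              '2:98-05-29:LLG:71:FT' \
              '10:98-01-01:XXX:50:FT' > "$dir/FILE2"

# Sort both files on field 1, using the same locale that join will run in.
LC_ALL=C sort -t: -k1,1 "$dir/FILE1" > "$dir/FILE1.sorted"
LC_ALL=C sort -t: -k1,1 "$dir/FILE2" > "$dir/FILE2.sorted"

# Join on field 1 of each file; print the name, then the FILE2 fields.
LC_ALL=C join -t: -1 1 -2 1 -o 1.2,2.1,2.2,2.3,2.4,2.5 \
    "$dir/FILE1.sorted" "$dir/FILE2.sorted" > "$dir/joined.txt"

cat "$dir/joined.txt"
```

The joined lines come out in lexical key order (1, 10, 2), and within one key sort falls back to comparing whole lines, so the original record order inside a key may change.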

 
