awk script required for finding records in one file that correspond to records in another file


 
# 1  
Old 07-04-2008

Hi,

I have a .txt file (uniqfields.txt) with 3 fields separated by "|" (the pipe symbol). Each combination of these 3 fields, taken together, is unique in this file. There are about 40,000 SORTED records (rows) in this file. Sample records are given below.

Code:
1TVAO|OVEPT|VO
1TVAO|OVPDM|VO
6NFXE|17CLP|DH
6NFXE|NRZO4|EQ
6NFXE|SMOSA|EQ
ACA15|11X1W|DX
ACA15|1LN88|DX
ACA15|1LNSK|DX
ACA15|1LNVX|DX
ACA15|1LNVX|FD

Now, there is another file (mainfile.txt) with 23 fields (columns), which contains the above fields as its 7th, 13th and 14th fields respectively. mainfile.txt is also sorted and is separated by the pipe symbol. It contains about 77,000 records. Its 7th, 13th and 14th columns only take values from the combinations above, but some records (rows) are repeated with respect to these 3 fields (columns); the other fields (columns) may or may not be the same.

What I need to do is take each record (row) of uniqfields.txt, starting with the first, and fetch the first record (row) of mainfile.txt in which all 3 of the above fields match. That is, the 1st field of uniqfields.txt should match the 7th column of mainfile.txt AND the 2nd field of uniqfields.txt the 13th column of mainfile.txt AND the 3rd field of uniqfields.txt the 14th column of mainfile.txt.

Why is an awk script required?

1) As I'm new to Unix, I'm just catching up with awk and I'm not able to find a solution myself.
2) I tried sort -t\| -u +6 -7 +12 -14 mainfile.txt > uniqmainfile.txt, but it works fine on SunOS and not on NCR MP-RAS.
We are migrating these from a server running SunOS to a server running NCR MP-RAS. On SunOS, the sort command above fetches the first record of each set of duplicates, whereas on NCR MP-RAS it fetches the last one.
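
For reference, +6 -7 +12 -14 is the obsolete key syntax of sort. The sketch below shows what should be the equivalent -k form, run on a hypothetical 14-field sample (the r1..r3 labels and the x fillers are invented for illustration; only fields 7, 13 and 14 matter). As far as POSIX goes, -u only promises to keep one line from each set with equal keys and does not say which one, so this form is likely to show the same first-versus-last difference between systems; treat that reading as an assumption, not a confirmed diagnosis.

```shell
#!/bin/sh
# Hypothetical sample: 14 pipe-separated fields, key = fields 7, 13 and 14.
demo=$(mktemp -d)
cat > "$demo/mainfile.txt" <<'EOF'
r1|x|x|x|x|x|1TVAO|x|x|x|x|x|OVEPT|VO
r2|x|x|x|x|x|1TVAO|x|x|x|x|x|OVEPT|VO
r3|x|x|x|x|x|6NFXE|x|x|x|x|x|17CLP|DH
EOF

# -k7,7 replaces "+6 -7" (field 7 only);
# -k13,14 replaces "+12 -14" (fields 13 through 14).
sort_out=$(sort -t'|' -u -k7,7 -k13,14 "$demo/mainfile.txt")
printf '%s\n' "$sort_out"
rm -rf "$demo"
```

Two lines come out, one per distinct key. Whether the 1TVAO line is r1 or r2 depends on the sort implementation.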

What difference does it make whether it takes the first or the last of the similar records?

Since we check for uniqueness only in the 7th, 13th and 14th fields (columns), the other fields (columns) in the reports produced on MP-RAS do not match those produced on SunOS.

I've tried -r and the uniq command as well, but in vain, and I have concluded that the only solution is to use awk.
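
A minimal sketch of that awk route, assuming (as stated above) that every 7th/13th/14th combination in mainfile.txt already appears in uniqfields.txt, so that keeping the first record per combination gives the wanted result without reading uniqfields.txt at all. The sample rows and the array name seen are hypothetical.

```shell
#!/bin/sh
# Hypothetical sample: the first two rows share the same key, the third differs.
demo=$(mktemp -d)
cat > "$demo/mainfile.txt" <<'EOF'
first|x|x|x|x|x|ACA15|x|x|x|x|x|1LNVX|DX
second|x|x|x|x|x|ACA15|x|x|x|x|x|1LNVX|DX
third|x|x|x|x|x|ACA15|x|x|x|x|x|1LNVX|FD
EOF

# seen[key]++ yields 0 the first time a key appears, so !seen[key]++ is
# true only for the first record carrying each 7th/13th/14th combination.
dedup_out=$(awk 'BEGIN { FS = "|" } !seen[$7 FS $13 FS $14]++' "$demo/mainfile.txt")
printf '%s\n' "$dedup_out"
rm -rf "$demo"
```

This prints the rows labelled first and third. Because awk reads the file in order, the choice of the first record does not depend on how sort treats duplicates.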

Please help me in this regard.

Thanks,
RRVARMA
# 2  
Old 07-06-2008
How about this:

Code:
awk -F '|' '
        # load uniqfields.txt into array
        NR==FNR { uniq[$1,$2,$3]=1; next }
        # print record if index is present in uniq array, change value to prevent
        # printing duplicate records
        uniq[$7,$13,$14] == 1 { print; uniq[$7,$13,$14]=0 }
' uniqfields.txt mainfile.txt

I'm not sure how well it will perform, since it loads the whole of uniqfields.txt into an array and so might use a significant amount of memory.
# 3  
Old 07-18-2008
Hi Annihilannic,

It's showing this error:

Code:
awk: can't open |

As I said, I'm not that good at awk. Could you please suggest some other way?

Thanks Annihilannic,
RRVARMA
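
One possible cause of that error (an assumption; the thread does not confirm it) is an older awk that does not accept a space between -F and the separator and so ends up treating | as a file name. The sketch below avoids -F altogether by setting FS in a BEGIN block, and builds the key by plain concatenation instead of the comma subscript. The sample rows and the array name want are hypothetical. It still relies on FNR, so on a system whose default awk is the legacy one, running it with nawk (if installed) may be necessary.

```shell
#!/bin/sh
# Hypothetical sample data in the layout described in the first post.
demo=$(mktemp -d)
cat > "$demo/uniqfields.txt" <<'EOF'
1TVAO|OVEPT|VO
6NFXE|17CLP|DH
EOF
cat > "$demo/mainfile.txt" <<'EOF'
r1|x|x|x|x|x|1TVAO|x|x|x|x|x|OVEPT|VO
r2|x|x|x|x|x|1TVAO|x|x|x|x|x|OVEPT|VO
r3|x|x|x|x|x|6NFXE|x|x|x|x|x|17CLP|DH
EOF

match_out=$(awk '
    BEGIN { FS = "|" }                      # separator set here, no -F needed
    # first file: remember every wanted 3-field combination
    NR == FNR { want[$1 FS $2 FS $3] = 1; next }
    # second file: print the first record for each combination, then
    # clear the flag so later duplicates are skipped
    want[$7 FS $13 FS $14] == 1 { print; want[$7 FS $13 FS $14] = 0 }
' "$demo/uniqfields.txt" "$demo/mainfile.txt")
printf '%s\n' "$match_out"
rm -rf "$demo"
```

On this sample the rows labelled r1 and r3 are printed, and r2 is skipped as a duplicate of r1.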