Post 302214450 by Ranjani in Shell Programming and Scripting, Monday 14 July 2008, 06:25 AM
Compare 2 huge files wrt to a key using awk

Hi Folks,

I need to compare two very large files (each containing at least 70k records) using awk or sed. The comparison needs to be done with respect to a 'key'. For example:

File1
**********
1234|TONY|Y75634|20/07/2008
1235|TINA|XCVB56|30/07/2009
43456|PATS|U74454|12/04/2009
23456|DAPS|R4576|15/03/2008

File2
******
1235|TINA|XCVB56|30/07/2009
1234|TONY|Y75634|20/07/2008
23456|DAPS|R4576|15/03/2008

In this case, taking '|' as the delimiter and the value in column 3 as the 'key', I need to look up each key from file1 in the second file and, once it is found, compare the values in column 2 of the corresponding records in both files.

Also, I need to report a message in case the key is not present in file2.

PS: I have a Perl script that does this, but it takes far too long to run. Any suggestions for an awk script that would perform this comparison much faster would be really appreciated.
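I was thinking of something roughly along these lines in awk, but I am not sure whether it is correct or the fastest approach (just a sketch, assuming the column-3 key is unique within file2):

# Hold file2 in memory keyed on column 3, then make a single pass over file1.
awk -F'|' '
    NR == FNR { name2[$3] = $2; next }                 # first file read: file2
    {
        if ($3 in name2) {
            if (name2[$3] != $2)
                print "MISMATCH for key " $3 ": file1=" $2 " file2=" name2[$3]
        } else {
            print "Key " $3 " not found in file2"
        }
    }
' file2 file1

The idea is to load file2 once into an array and scan file1 once, instead of re-reading file2 for every record.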

Thanks in advance
Ranjani
 

10 More Discussions You Might Find Interesting

1. Solaris

compare huge file

Hi, I have files with 40,00,000 & 39,00,000 records and I want to find out the content 1. which exists in file1 and not in file2, 2. which exists in file2 and not in file1. The format of the file will be like 404ABCDEFGHIJK|CDEFGHIJK|1234567890|1. If it's a smaller one I... (1 Reply)
Discussion started by: salaathi
1 Replies
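One rough way to attack that kind of two-way comparison (a sketch only; it compares whole lines, holds one file in memory as an awk array, and the output file names are placeholders):

# lines present in file1 but not in file2
awk 'NR == FNR { seen[$0]; next } !($0 in seen)' file2 file1 > only_in_file1
# lines present in file2 but not in file1
awk 'NR == FNR { seen[$0]; next } !($0 in seen)' file1 file2 > only_in_file2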

2. Shell Programming and Scripting

Compare 2 folders to find several missing files among huge amounts of files.

Hi, all: I've got two folders, say, "folder1" and "folder2". Under each, there are thousands of files. It's quite obvious that there are some files missing in each, and I just would like to find them. I believe this can be done with the "diff" command. However, if I change the above question a... (1 Reply)
Discussion started by: jiapei100
1 Replies
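For the two-folder case, something as simple as this may be enough (a sketch; it assumes the two directory trees are comparable by relative path):

# report files that exist in only one of the two trees
diff -rq folder1 folder2 | grep '^Only in'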

3. Shell Programming and Scripting

Compare Fields from two text files using key columns

Hi All, I have two files to compare. Each has 10 columns, with the first 4 columns together forming the key index. The rest of the columns have monetary values. Using Perl, I want to read one file into a hash; check whether the key is available in file 2; then compare the values in the rest of the 6... (2 Replies)
Discussion started by: Sangtha
2 Replies

4. Shell Programming and Scripting

Format & Compare two huge CSV files

I have two csv files with 90K records each, and each row has around 50 columns. Let's say the file names are FILE1 and FILE2. I have to compare both files and generate a new file that has the rows from FILE2 that differ. FILE1 ----- 2001,"John",25,19901130,21211.41,Unix Forum... (3 Replies)
Discussion started by: Sheel
3 Replies
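A possible starting point for that CSV comparison (a sketch only; it treats two rows as equal only when every column matches, i.e. a whole-row comparison):

# print the rows of FILE2 that do not appear verbatim in FILE1
awk 'NR == FNR { seen[$0]; next } !($0 in seen)' FILE1 FILE2 > file2_differences.csv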

5. Shell Programming and Scripting

awk command to compare a file with set of files in a directory using 'awk'

Hi, I have a situation where I need to compare one file, say file1.txt, with a set of files in a directory. The directory contains more than 100 files. To be more precise, the requirement is to compare the first field of file1.txt with the first field in all the files in the directory. The files in the... (10 Replies)
Discussion started by: anandek
10 Replies
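One way to sketch that first-field comparison (the directory name dir is a placeholder; awk is simply given all the files on one command line):

# collect field 1 of file1.txt, then report every line in the other files whose field 1 matches
awk 'NR == FNR { keys[$1]; next } $1 in keys { print FILENAME ": " $0 }' file1.txt dir/*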

6. Shell Programming and Scripting

match two key columns in two files and print output (awk)

I have two files... file1 and file2. Where columns 1 and 2 of file1 match columns 1 and 2 of file2, I want to create a new file that is all of file1 plus columns 3 and 4 of file2. Many thanks if you know how to do this.... file1 31-101 106 0 92 31-101 106 29 ... (2 Replies)
Discussion started by: pelhabuan
2 Replies
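A rough sketch of that two-column join (assumes whitespace-separated fields and that the column 1+2 pair is unique in file2; lines of file1 with no match are printed unchanged, and the output name merged is a placeholder):

awk 'NR == FNR { extra[$1,$2] = $3 " " $4; next }
     {
         if (($1,$2) in extra)
             print $0, extra[$1,$2]
         else
             print $0
     }' file2 file1 > merged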

7. Shell Programming and Scripting

Fetching record based on Uniq Key from huge file.

Hi, I want to fetch 100k records from a file which looks like the one below. XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX ... (17 Replies)
Discussion started by: lathigara
17 Replies

8. Shell Programming and Scripting

awk to parse huge files

Hello All, I have a situation as below: (1) Read a source file (a single file with 1.2 million rows in it). (2) Read the destination files one by one and replace the content (a few fields in it) with the corresponding matching field from the source file. I tried as below: (please note I am not... (4 Replies)
Discussion started by: panyam
4 Replies

9. Shell Programming and Scripting

awk - Merge two files based on one key

Hi, I am struggling with an awk command to merge two files based on a common key. I want to append the value from File2 ($2) onto the end of File1 where $1 from each file matches; if there is no match then nothing is appended. File1 COL1|COL2|COL3|COL4|COL5|COL6|COL7... (3 Replies)
Discussion started by: Ads89
3 Replies
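The pipe-delimited merge described there might look like this (a sketch; assumes COL1 is unique in File2, and the output name merged_file is a placeholder):

awk -F'|' 'NR == FNR { val[$1] = $2; next }
           {
               if ($1 in val)
                   print $0 "|" val[$1]
               else
                   print $0
           }' File2 File1 > merged_file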

10. Shell Programming and Scripting

Files summary using awk based on index key

Hello, I have several files which look similar to: file01.txt keyA001 350 X string001 value001 keyA001 450 X string002 value007 keyA001 454 X string002 value004 keyA001 500 X string003 value005 keyA001 255 X string004 value006 keyA001 388 X string005 value008 keyA001 1278 X... (4 Replies)
Discussion started by: alex2005
4 Replies
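Assuming the 'summary' wanted is a per-key record count and a total of column 2 (both of which are assumptions here), a sketch might be:

# count the records and sum column 2 for every key in column 1, across all the files
awk '{ cnt[$1]++; sum[$1] += $2 } END { for (k in cnt) print k, cnt[k], sum[k] }' file*.txt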