Compare specific columns between two files having different layouts


 
# 1  
Old 09-08-2010

Hi,

I need to compare two files that have different layouts.

For example, the first file will have 15 columns and the second file will have just 10 columns.

Example:

File1 :
Code:
abcd,abrd,fun,D000,$15,$236,$217,$200,$200,$200
dear,dare,tun,D000,$12.00405,$234.08976,$212.09876,$200,$200,$200

File2 :
Code:
dear,dar2e,tun,D00210,12.00405,2134.08976
abcd,awred,fuwn,qD0qw00,15,236

The first column can be treated as the key.

Based on the key(s), the columns present in the second file should be compared with the corresponding columns in the first file; the remaining columns should not be compared.

Also, the rows might not be in the same order in both files.
How can I achieve this using Perl?

Moderator's Comments:
Use code tags please, thanks.

Last edited by zaxxon; 09-08-2010 at 08:49 AM..
# 2  
Old 09-08-2010
Here is a simple way:

Code:
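use strict;
use warnings;

# Three-argument open with lexical filehandles
open(my $firFile, '<', 'file1.txt') or die "Unable to open file1.txt: [$!]";
open(my $secFile, '<', 'file2.txt') or die "Unable to open file2.txt: [$!]";

# Start with empty hashes
my %firHash = ();
my %secHash = ();

# Fill the first hash: the key is the first column, the value is the whole line
while (my $line = <$firFile>)
{
	chomp $line;
	my @fileColumns = split(/,/, $line);
	my $ident = $fileColumns[0];

	$firHash{$ident} = $line;
}
close($firFile);

# Fill the second hash
while (my $line = <$secFile>)
{
	chomp $line;
	my @fileColumns = split(/,/, $line);
	my $ident = $fileColumns[0];

	$secHash{$ident} = $line;
}
close($secFile);

# "each" needs a while loop; a C-style for would not iterate over the pairs
while (my ($secHashKey, $secHashValue) = each(%secHash))
{
	my $firHashValue = $firHash{$secHashKey};
	next unless defined $firHashValue;	# key is not present in file1

	# $firHashValue
	# $secHashValue
	# Here you can compare the values you need!
}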
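
The comparison step left open in the loop above could look something like the sketch below. It is only one possible reading of the requirement: it compares file2's columns with the columns at the same positions in file1, and it strips a leading "$" before comparing, which is my assumption based on the sample data. The name diff_columns is made up for this example, and the hashes are filled with the sample lines directly so that it runs without the input files.

```perl
use strict;
use warnings;

# Sample lines from the first post, keyed by the first column
my %firHash = (
    abcd => 'abcd,abrd,fun,D000,$15,$236,$217,$200,$200,$200',
    dear => 'dear,dare,tun,D000,$12.00405,$234.08976,$212.09876,$200,$200,$200',
);
my %secHash = (
    dear => 'dear,dar2e,tun,D00210,12.00405,2134.08976',
    abcd => 'abcd,awred,fuwn,qD0qw00,15,236',
);

# Hypothetical helper: compare two lines column by column, but only for the
# columns that exist in the second line. Returns the 0-based positions that differ.
sub diff_columns {
    my ($firLine, $secLine) = @_;
    my @fir = split /,/, $firLine;
    my @sec = split /,/, $secLine;
    my @diff;
    for my $i (0 .. $#sec) {
        my $left  = defined $fir[$i] ? $fir[$i] : '';
        my $right = $sec[$i];
        # Assumption: a leading "$" is formatting only and should be ignored
        s/^\$// for $left, $right;
        push @diff, $i if $left ne $right;
    }
    return @diff;
}

for my $key (sort keys %secHash) {
    next unless exists $firHash{$key};    # key is missing from file1
    my @diff = diff_columns($firHash{$key}, $secHash{$key});
    print "$key: ", (@diff ? "columns @diff differ" : "all compared columns match"), "\n";
}
```

The values are compared as strings with "ne"; if 15 and 15.0 should count as equal, a numeric comparison would be needed instead.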

Regards.
# 3  
Old 09-08-2010
@felipe.vinturin

Hi,

Thanks for the reply.

This looks like a good starting point for me.

Actually, my requirement is a little more complex.

The two files will have varying layouts. A given column might not be at the same position in both delimited files.

For example, column1 in file1 might be present as column3 in file2.

We need a parameter file that gives the locations of the columns (the primary key columns as well as the columns to be compared) in both files.

Any suggestions on how to proceed would help greatly.

Since I am new to this forum, I do not know how to add proper tags to the thread. Any help with that would also be greatly appreciated.

Thanks.
# 4  
Old 09-08-2010
For reading a configuration file, you can check this link: How to read a configuration file with Perl | devdaily.com

That approach is fine for personal use, but it is not a good idea for production because it relies on "eval".
CPAN has a good module for this: AppConfig - AppConfig::File - search.cpan.org
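
If you would rather avoid both "eval" and an extra module, a small hand-rolled parser is another option. The sketch below is not AppConfig and not the code from the link; the "name = value" format, the parameter names (separator, file1_key, file2_key, compare) and the function read_config are all made up for illustration. The configuration is read from an in-memory string so that the example runs on its own.

```perl
use strict;
use warnings;

# Hypothetical helper: parse "name = value" lines from an open filehandle.
# Blank lines and lines starting with "#" are skipped.
sub read_config {
    my ($fh) = @_;
    my %config;
    while (my $line = <$fh>) {
        chomp $line;
        next if $line =~ /^\s*(#|$)/;
        my ($name, $value) = $line =~ /^\s*([^=\s]+)\s*=\s*(.*?)\s*$/
            or die "Bad config line $.: $line\n";
        $config{$name} = $value;
    }
    return %config;
}

# Made-up parameter file: column positions are 0-based,
# and "compare" lists file1:file2 column pairs.
my $text = <<'END';
# layout description
separator = ,
file1_key = 0
file2_key = 0
compare   = 4:4 5:5
END

open(my $fh, '<', \$text) or die "Cannot open in-memory config: $!";
my %config = read_config($fh);
close($fh);

# Turn "4:4 5:5" into a list of [file1 column, file2 column] pairs
my @pairs;
for my $pair (split ' ', $config{compare}) {
    my ($col1, $col2) = split /:/, $pair;
    push @pairs, [$col1, $col2];
}

printf "file1 column %d <-> file2 column %d\n", @$_ for @pairs;
```

For a real parameter file you would open it by name instead of using the in-memory string; the rest stays the same.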

In your case you can create:
1. A function to which you pass the separator, the position of the key column in the file, and a reference to the hash to fill.

With the code above, the resulting hash will look like this:
Code:
Key: abcd
Value: abcd,abrd,fun,D000,$15,$236,$217,$200,$200,$200

Key: dear
Value: dear,dare,tun,D000,$12.00405,$234.08976,$212.09876,$200,$200,$200

Check this link: Perl Hash Howto

2. A function to which you pass the separator, the column position in the first hash's value, and the column position in the second hash's value that you want to compare.
Inside this function you split the values and compare the columns at the positions given as arguments. Don't forget that split returns a list whose first index is zero.
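
The two functions described above could be sketched roughly as follows. The names load_hash and columns_match are made up, and I pass an open filehandle and the two values as extra arguments, which goes slightly beyond the parameter lists given above. The sample data is hypothetical too: the key is column 0 in the first input and column 2 in the second, to show a layout where the columns are in different places.

```perl
use strict;
use warnings;

# Function 1: read delimited lines from a filehandle into the hash referenced
# by $hashRef, keyed by the column at 0-based position $keyPos.
sub load_hash {
    my ($fh, $separator, $keyPos, $hashRef) = @_;
    while (my $line = <$fh>) {
        chomp $line;
        next if $line eq '';
        my @columns = split /\Q$separator\E/, $line, -1;
        $hashRef->{ $columns[$keyPos] } = $line;
    }
    return;
}

# Function 2: split both values and compare one column from each.
# Positions are 0-based because split returns a list indexed from zero.
sub columns_match {
    my ($separator, $firValue, $firPos, $secValue, $secPos) = @_;
    my @fir = split /\Q$separator\E/, $firValue, -1;
    my @sec = split /\Q$separator\E/, $secValue, -1;
    return 0 unless defined $fir[$firPos] && defined $sec[$secPos];
    return $fir[$firPos] eq $sec[$secPos] ? 1 : 0;
}

# Hypothetical data standing in for the two files
my $file1 = "abcd,abrd,fun\ndear,dare,tun\n";
my $file2 = "fun,x,abcd\nsun,y,dear\n";

open(my $fh1, '<', \$file1) or die "Cannot open first input: $!";
open(my $fh2, '<', \$file2) or die "Cannot open second input: $!";

my (%firHash, %secHash);
load_hash($fh1, ',', 0, \%firHash);    # key is column 0 in file1
load_hash($fh2, ',', 2, \%secHash);    # key is column 2 in file2

# Compare column 2 of file1 with column 0 of file2 for every shared key
for my $key (sort keys %secHash) {
    next unless exists $firHash{$key};
    my $same = columns_match(',', $firHash{$key}, 2, $secHash{$key}, 0);
    print "$key: ", ($same ? "match" : "differ"), "\n";
}
```

The separator is wrapped in \Q...\E so that a delimiter such as "|" is treated literally rather than as a regular expression.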

I hope it helps. =o)

Regards.