Plz Help. Compare 2 files field by field and get the output in another file.


 
# 1  
Old 07-25-2012

Hi Friends,
I have two files: source.txt and target.txt. I want to use source.txt as the baseline and compare target.txt against it. The data in both files and the expected output are shown below.

source.txt
Code:
1|HYD|NAG|TRA|34.5|1234
2|CHE|ESW|DES|36.5|134
3|BAN|MEH|TRA|33.5|234
4|PUN|ABHI|TA|38.5|123
5|KIN|NAV|PRA|31.5|135

target.txt
Code:
1|HYD|NAG|TRA|34.5|1234
2|CHE|EW|DES|33.5|134
5|KIN|NV|PRA|31.5|136

The expected output is:

Number of Extra records in Source file : 2
Code:
3|BAN|MEH|TRA|33.5|234
4|PUN|ABHI|TA|38.5|123

Number of mismatches:
Code:
KEYFIELD|COLUMN_NUMBER|SOURCE_VALUE|TARGET_VALUE
2 |3 |ESW |EW
2 |5 |36.5 |33.5
5 |3 |NAV |NV

Please help. I am very new to Unix shell scripting.

Last edited by Franklin52; 07-28-2012 at 03:25 PM.. Reason: Please use code tags for data and code samples, thank you
# 2  
Old 07-25-2012
I guess this has already been discussed! I saw almost the same examples, too.
# 3  
Old 07-25-2012
Hi Pikka45, yes, we have done this with the following code:

Code:
paste -d '|' File1.txt File2.txt | awk -F '|' '{c=NF/2;for(i=1;i<=c;i++)if($i!=$(i+c))printf "line %-5s field %s\n",NR,i}'

The above code works only under the following conditions:

1) Both files have the same number of records, and every key field (assume the first field) is present in both source.txt and target.txt.

2) source.txt has fewer records than target.txt, and every key field (again the first field) is present in both source.txt and target.txt.

3) It does not work when source.txt has more records than target.txt and there are mismatches.

I hope the cases above are clear. Is there any way to cover the third scenario?
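To illustrate the third case, here is a minimal sketch using the sample data from the first post. It assumes bash (for process substitution) and a scratch directory created with mktemp; the variable name pairs is only for illustration. It pastes just the key fields side by side, which shows that paste pairs records by line position, not by key:

```shell
#!/bin/bash
# Sample data from the first post, written to a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' '1|HYD|NAG|TRA|34.5|1234' '2|CHE|ESW|DES|36.5|134' \
    '3|BAN|MEH|TRA|33.5|234' '4|PUN|ABHI|TA|38.5|123' \
    '5|KIN|NAV|PRA|31.5|135' >source.txt
printf '%s\n' '1|HYD|NAG|TRA|34.5|1234' '2|CHE|EW|DES|33.5|134' \
    '5|KIN|NV|PRA|31.5|136' >target.txt

# Put the key (field 1) of line N in source.txt next to the key of
# line N in target.txt, exactly as paste lines up the full records.
pairs=$(paste -d ' ' <(cut -d '|' -f 1 source.txt) <(cut -d '|' -f 1 target.txt))
printf '%s\n' "$pairs"
```

On the third line, source key 3 is paired with target key 5, and source keys 4 and 5 have no partner at all. Once a key is missing from target.txt, every later record is compared against the wrong row, so the comparison has to look records up by key.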

---------- Post updated at 12:56 PM ---------- Previous update was at 12:10 PM ----------

Hi Friends, can anyone look into this thread? Please help.
# 4  
Old 07-25-2012
Hi, check this

Code:
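#!/bin/bash

# Start with empty output files.
: >extra.txt
: >mismatch.txt

while IFS= read -r sLine; do
    # Split the source record on "|" into an array.
    IFS='|' read -r -a sTab <<<"$sLine"
    # Find the target record with the same key (field 1). Anchoring on
    # "key|" keeps key 1 from also matching keys 10, 11, and so on.
    tLine=$(grep -m1 "^${sTab[0]}|" target.txt)
    if [ -z "$tLine" ]; then
        # No such key in target.txt: this is an extra source record.
        echo "$sLine" >>extra.txt
        continue
    fi
    IFS='|' read -r -a tTab <<<"$tLine"
    for (( i = 1; i < ${#sTab[@]}; i++ )); do
        # Report the 1-based column number, as in the expected output.
        [ "${sTab[i]}" = "${tTab[i]}" ] ||
            echo "${sTab[0]}|$((i + 1))|${sTab[i]}|${tTab[i]}" >>mismatch.txt
    done
done <source.txt

echo "Number of Extra records in Source file : $(wc -l <extra.txt)"
cat extra.txt

echo "Number of mismatches : $(wc -l <mismatch.txt)"
cat mismatch.txt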

# 5  
Old 07-25-2012
@Chirel: Thank you so much, it worked as expected. I have one question: when I run it on about 50,000 records with 50 columns, it takes a long time. Is there any way to reduce the run time and improve performance? Please help.
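One likely reason for the slowdown is that the loop in post #4 starts a separate grep for every source record, so target.txt is rescanned once per line. A possible alternative, sketched below under a few assumptions, is to let awk read each file once and look records up by key in an array. The sketch assumes the first field is a unique key, that target.txt is not empty, and it recreates the sample files in a scratch directory so it can be run as-is:

```shell
#!/bin/bash
# Sample data from the first post, written to a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' '1|HYD|NAG|TRA|34.5|1234' '2|CHE|ESW|DES|36.5|134' \
    '3|BAN|MEH|TRA|33.5|234' '4|PUN|ABHI|TA|38.5|123' \
    '5|KIN|NAV|PRA|31.5|135' >source.txt
printf '%s\n' '1|HYD|NAG|TRA|34.5|1234' '2|CHE|EW|DES|33.5|134' \
    '5|KIN|NV|PRA|31.5|136' >target.txt

# Make sure both report files exist even if nothing is written to them.
: >extra.txt
: >mismatch.txt

awk -F '|' '
    # First file (target.txt): remember each whole record by its key.
    NR == FNR { target[$1] = $0; next }
    # Second file (source.txt): a key absent from target is an extra record.
    !($1 in target) { print > "extra.txt"; next }
    {
        split(target[$1], t, "|")
        for (i = 2; i <= NF; i++)
            # Appending "" forces a string comparison, so 36.5 and 36.50
            # would be reported as different.
            if (($i "") != (t[i] ""))
                print $1 "|" i "|" $i "|" t[i] > "mismatch.txt"
    }
' target.txt source.txt

echo "Number of Extra records in Source file : $(wc -l <extra.txt)"
cat extra.txt
echo "Number of mismatches : $(wc -l <mismatch.txt)"
cat mismatch.txt
```

On the sample data this reports two extra records (keys 3 and 4) and four mismatches, since the last field of key 5 also differs (135 versus 136). The whole target file is held in memory, which should be fine for 50,000 records but is worth keeping in mind for much larger inputs.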
# 6  
Old 07-26-2012
perl

Code:
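use strict;
use warnings;

# Index the target file (a.txt) by key: key => [remaining fields].
my %hash;
open my $fh, '<', 'a.txt' or die "Cannot open a.txt: $!";
while(<$fh>){
	chomp;	# drop the newline so the last field compares cleanly
	my @tmp = split("[|]",$_);
	my @t = @tmp[1..$#tmp];
	$hash{$tmp[0]} = \@t;
}
close $fh;

# The source records follow __DATA__.
while(<DATA>){
	chomp;
	my @tmp = split("[|]",$_);
	if(not exists $hash{$tmp[0]}){
		# Key missing from the target: print the extra source record.
		print "$_\n";
		next;
	}
	my @t = @{$hash{$tmp[0]}};
	my @diff;
	for(my $i=0;$i<=$#t;$i++){
		# Column number (1-based, key is column 1), source value, target value.
		push @diff, ($i+2,$tmp[$i+1],$t[$i]) if $t[$i] ne $tmp[$i+1];
	}
	# Print a line only when this record actually differs.
	print join("|", ($tmp[0],@diff)), "\n" if @diff;
}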
__DATA__
1|HYD|NAG|TRA|34.5|1234
2|CHE|ESW|DES|36.5|134
3|BAN|MEH|TRA|33.5|234
4|PUN|ABHI|TA|38.5|123
5|KIN|NAV|PRA|31.5|135

a.txt
Code:
1|HYD|NAG|TRA|34.5|1234
2|CHE|EW|DES|33.5|134
5|KIN|NV|PRA|31.5|136
