Comparing the files


 
# 1  
Old 08-07-2012

Hi Friends,

I have file1.txt

Quote:
29123973Ç2012-05-29Ç35310124Ç00000000000469744762Ç00010Ç20Ç390ÇÇÇÇF
29123974Ç2012-05-29Ç35310125Ç00000000000469744770Ç00010Ç20Ç390ÇÇÇÇF
29123975Ç2012-05-29Ç35310126Ç00000000000469744804Ç00010Ç20Ç390ÇÇÇÇF
29123976Ç2012-05-29Ç35310127Ç00000000000469744820Ç00010Ç20Ç390ÇÇÇÇF
29123977Ç2012-05-29Ç35310128Ç00000000000469744846Ç00010Ç20Ç390ÇÇÇÇF
29123978Ç2012-05-29Ç35310129Ç00000000000469744895Ç00010Ç20Ç390ÇÇÇÇF
29123979Ç2012-05-29Ç35310130Ç00000000000469744903Ç00010Ç20Ç401ÇÇÇÇF
29123980Ç2012-05-29Ç35310131Ç00000000000469744911Ç00010Ç20Ç390ÇÇÇÇF
29123981Ç2012-05-29Ç35310132Ç00000000000469744929Ç00010Ç20Ç390ÇÇÇÇF
29123982Ç2012-05-29Ç35310133Ç00000000000469744952Ç00010Ç20Ç390ÇÇÇÇF
file2.txt

Quote:
29123973Ç2012-05-29Ç35310124Ç00000000000469744762Ç00010Ç20Ç390ÇÇÇÇF
29123974Ç2012-05-29Ç35310125Ç00000000000469744770Ç00010Ç20Ç390ÇÇÇÇF
2923975Ç2012-05-29Ç35310126Ç00000000000469744804Ç00010Ç20Ç390ÇÇÇÇF
29123976Ç2012-05-29Ç35310127Ç00000000000469744820Ç00010Ç20Ç390ÇÇÇÇF
29123977Ç2012-05-29Ç35310128Ç00000000000469744846Ç00010Ç20Ç390ÇÇÇÇF
29123978Ç2012-05-29Ç35310129Ç00000000000469744895Ç00010Ç20Ç390ÇÇÇÇF
29123979Ç2012-05-29Ç35310130Ç00000000000469744903Ç00010Ç20Ç401ÇÇÇÇF
29123980Ç2012-05-29Ç35310131Ç00000000000469744911Ç00010Ç20Ç390ÇÇÇÇF
29123981Ç2012-05-29Ç35310132Ç00000000000469744929Ç00010Ç20Ç390ÇÇÇÇF
29123982Ç2012-05-29Ç35310133Ç00000000000469744952Ç00010Ç20Ç390ÇÇÇÇF

I tried using diff and comm, but did not get the expected output.

I want to know exactly where the mismatch occurs, ideally down to the field, in this format:

Sourcevalue|Targetvalue|Linenumber|field
29123975|2923975|3|1

Please help.

I tried diff, but the output I got is:
Quote:
3c3
< 29123975Ç2012-05-29Ç35310126Ç00000000000469744804Ç00010Ç20Ç390ÇÇÇÇF
---
> 2923975Ç2012-05-29Ç35310126Ç00000000000469744804Ç00010Ç20Ç390ÇÇÇÇF



Code:
#!/bin/bash
 
>extra.txt
>mismatch.txt
while read sLine; do
    OFS="$IFS"
    IFS="Ç"
    sTab= ${sLine} ;
    tLine="${egrep "^"${sTab[0]} file2.txt}"
    if [ -z "$tLine" ]; then echo "$sLine" >>extra.txt; IFS="$OFS"; continue; fi
    tTab= ${tLine} ;
    for (( i = 1 ; i < ${#sTab[@]} ; i++ )); do
        [ "${sTab[$i]}" = "${tTab[$i]}" ] || echo "${sTab[0]}|$i|${sTab[$i]}|${tTab[$i]}" >>mismatch.txt
    done
    IFS="$OFS"
done <file1.txt
echo "Number of Extra records in Source file : $(cat extra.txt|wc -l)"
cat extra.txt
echo "Number of mismatches : $(cat mismatch.txt|wc -l)"
cat mismatch.txt

I got these errors:

Quote:
line 14: 29123973: command not found
line 15: ${egrep "^"${sTab[0]} file2.txt}: bad substitution
Number of Extra records in Source file : 0
Number of mismatches : 0
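
Both errors come from shell syntax rather than from the comparison logic. `sTab= ${sLine}` assigns an empty string to sTab and then runs the first word of the line as a command, and `${egrep ...}` is parameter-expansion syntax where command substitution `$(...)` was intended. What follows is a minimal corrected sketch, not a definitive fix. It assumes bash, a unique key in field 1, keys free of regex metacharacters, and no unit-separator byte (0x1f) in the data. The scratch directory, the abbreviated rows, and the 401 difference are invented for illustration.

```sh
# Scratch setup (illustrative): rows abbreviated from the sample in this thread,
# with one invented difference (401) so that a field mismatch shows up.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' '29123974Ç2012-05-29Ç390ÇÇF' \
              '29123975Ç2012-05-29Ç390ÇÇF' \
              '29123976Ç2012-05-29Ç390ÇÇF' >file1.txt
printf '%s\n' '29123974Ç2012-05-29Ç390ÇÇF' \
              '2923975Ç2012-05-29Ç390ÇÇF' \
              '29123976Ç2012-05-29Ç401ÇÇF' >file2.txt

cat >compare.sh <<'EOF'
#!/bin/bash
sep=$'\x1f'   # single-byte stand-in for the delimiter, so splitting is locale-independent
: >extra.txt
: >mismatch.txt
while IFS= read -r sLine; do
    # read -a fills the array; the original "sTab= ${sLine}" ran the line as a command.
    IFS=$sep read -r -a sTab <<<"${sLine//Ç/$sep}"
    # Command substitution is $(...), not ${...}.
    tLine=$(grep -m1 "^${sTab[0]}Ç" file2.txt)
    if [ -z "$tLine" ]; then echo "$sLine" >>extra.txt; continue; fi
    IFS=$sep read -r -a tTab <<<"${tLine//Ç/$sep}"
    for (( i = 1 ; i < ${#sTab[@]} ; i++ )); do
        [ "${sTab[$i]}" = "${tTab[$i]}" ] || echo "${sTab[0]}|$i|${sTab[$i]}|${tTab[$i]}" >>mismatch.txt
    done
done <file1.txt
echo "Number of extra records in source file: $(wc -l <extra.txt)"
cat extra.txt
echo "Number of mismatches: $(wc -l <mismatch.txt)"
cat mismatch.txt
EOF
bash compare.sh
```

With these rows, extra.txt holds the 29123975 record, because its key has no counterpart in file2.txt, and mismatch.txt holds 29123976|2|390|401, where 2 is the zero-based field index. Because records are matched by key, a damaged key is reported as an extra record and not as a field-1 mismatch.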

Last edited by i150371485; 08-07-2012 at 04:51 AM.. Reason: Included code
# 2  
Old 08-07-2012
Do you need a shell script solution? If not, you might want to try an awk command:
Code:
awk 'BEGIN {FS="Ç";OFS="|"}
    {getline lf1 < "file1"
      if (lf1!=$0) {b=split(lf1,a)
         for (i=1;i<=b;i++) if (a[i]!=$i) print a[i],$i,NR,i}
    }' file2

yielding
Code:
29123975|2923975|3|1

For each input line of file2 (held in $0), the script uses getline to read the corresponding line of file1 into the variable lf1. If the two lines differ, they are compared field by field and every mismatch is printed. Error checking and similar safeguards still need to be added.
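
One way the missing error checking might look, offered as a sketch and not as RudiC's own code: test the return value of getline so that a shorter file1 is reported and not silently compared as empty lines, and report leftover file1 lines in an END block. The helper name compare_by_line, the message wording, and the abbreviated sample rows are illustrative assumptions.

```sh
# compare_by_line FILE1 FILE2 (hypothetical helper name)
compare_by_line() {
    awk -v f1="$1" -v f2="$2" 'BEGIN { FS = "Ç"; OFS = "|" }
    {
        # getline returns 1 on success, 0 at end of file, -1 on a read error.
        if ((getline lf1 < f1) <= 0) {
            print ("line " NR ": missing in " f1) > "/dev/stderr"
            next
        }
        if (lf1 != $0) {
            b = split(lf1, a)
            n = (NF > b) ? NF : b            # walk the longer of the two records
            # Appending "" forces a string comparison, so 010 and 10 differ.
            for (i = 1; i <= n; i++)
                if ((a[i] "") != ($i "")) print a[i], $i, NR, i
        }
    }
    END {
        # Lines still unread in file1 have no partner line in file2.
        n = NR
        while ((getline lf1 < f1) > 0)
            print ("line " (++n) ": missing in " f2) > "/dev/stderr"
    }' "$2"
}

# Illustrative run on three abbreviated rows in a scratch directory.
dir=$(mktemp -d)
printf '%s\n' '29123973Ç2012-05-29Ç390' '29123974Ç2012-05-29Ç390' '29123975Ç2012-05-29Ç390' >"$dir/file1"
printf '%s\n' '29123973Ç2012-05-29Ç390' '29123974Ç2012-05-29Ç390' '2923975Ç2012-05-29Ç390'  >"$dir/file2"
out=$(compare_by_line "$dir/file1" "$dir/file2")
echo "$out"
```

Mismatches go to standard output in the Sourcevalue|Targetvalue|Linenumber|field layout, while length problems go to standard error so the two can be separated. The comparison is still positional, so a missing line in the middle of either file shifts every later pair.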
# 3  
Old 08-08-2012
@RudiC: Thanks for the reply and the explanation of the awk command. I will run it today, check the results, and let you know.

---------- Post updated at 01:07 PM ---------- Previous update was at 12:16 PM ----------

@RudiC, I have a couple of questions. Would you mind explaining? I have run a couple of scenarios. If file1 and file2 have the same number of records, it works fine. Below is the case where it does not work.

Quote:
1) If a record in file1 is not present in file2, I should be able to catch that line and report it as an extra record. The comparison should then still work for the other lines, which are present in both file1 and file2.
I ran the awk command above after removing the first line from file2, but every record was reported as a mismatch.

Please help. I am using ksh.
# 4  
Old 08-08-2012
As mentioned in my post, error checking was omitted, and it was based on the assumption of files of equal length.
diff is great at finding missing lines:
Code:
diff file1 file2
3c3
< 29123975Ç2012-05-29Ç35310126Ç00000000000469744804Ç00010Ç20Ç390ÇÇÇÇF
---
> 2923975Ç2012-05-29Ç35310126Ç00000000000469744804Ç00010Ç20Ç390ÇÇÇÇF
7d6
< 29123979Ç2012-05-29Ç35310130Ç00000000000469744903Ç00010Ç20Ç401ÇÇÇÇF
8a8
> 29123981Ç2012-05-29Ç35310132Ç00000000000469744929Ç00010Ç20Ç390ÇÇÇÇF

Read this as: line 7 of file1 does not exist in file2, and line 8 of file2 does not exist in file1.
So you could run diff first to find the line differences between the files, and then run the awk script to find the field differences. To me it seems unnecessary to duplicate existing diff functionality in awk.
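
If the first field is a unique record key (an assumption the original bash attempt also makes), a keyed comparison is one alternative to running diff and awk in sequence. The sketch below is a different technique from the positional getline script: it indexes file2 by key, flags file1 records whose key is absent, and compares the remaining records field by field. The helper name compare_by_key, the "extra" label, and the abbreviated rows with an invented 401 difference are illustrative assumptions.

```sh
# compare_by_key FILE1 FILE2 (hypothetical helper name)
compare_by_key() {
    awk 'BEGIN { FS = "Ç"; OFS = "|" }
    NR == FNR { target[$1] = $0; next }            # first pass: index file2 by key
    !($1 in target) { print "extra", $0; next }    # key of this file1 record is absent from file2
    {
        n = split(target[$1], t)
        if (NF > n) n = NF                          # walk the longer of the two records
        # Appending "" forces a string comparison of the field values.
        for (i = 2; i <= n; i++)
            if (($i "") != (t[i] "")) print $i, t[i], FNR, i
    }' "$2" "$1"
}

# Illustrative run: file2 lacks the first record and differs in one field.
dir=$(mktemp -d)
printf '%s\n' '29123973Ç2012-05-29Ç390' '29123974Ç2012-05-29Ç390' '29123975Ç2012-05-29Ç390' >"$dir/file1"
printf '%s\n' '29123974Ç2012-05-29Ç390' '29123975Ç2012-05-29Ç401' >"$dir/file2"
out=$(compare_by_key "$dir/file1" "$dir/file2")
echo "$out"
```

Here the 29123973 record is printed with the extra label, and the third line of file1 yields 390|401|3|3. The sketch does not report records that exist only in file2, and the NR == FNR idiom assumes file2 is not empty.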

---------- Post updated at 10:30 AM ---------- Previous update was at 10:12 AM ----------

Actually, you could do something like
Code:
(diff -e file1 file2; echo w)|ed file1

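to add the missing lines to file1. You would first need to remove the change commands (diff -e writes them in the form 3c), for example by piping the diff output through sed before it reaches ed: |sed '/.c/,+1d' (the ,+1 address form is a GNU sed extension).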

Last edited by RudiC; 08-08-2012 at 05:38 AM..
# 5  
Old 08-08-2012
Hi RudiC, could you please explain the significance of "echo w" and "ed" in (diff -e file1 file2; echo w)|ed file1?
# 6  
Old 08-08-2012
ed is a text file editor (see man ed), and diff -e is designed to output ed-compatible commands, but without the final w (write) command, so that the file being worked on is not overwritten. So we build the command list (diff -e plus echo w) and pipe it into ed, which executes it and turns file1 into file2. If you need to see the deleted records, you might want to run diff twice: once to list the records, and once to apply the changes with ed.
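
The two steps can be tried safely on throwaway copies. The sketch below uses invented four-line sample files in a scratch directory; it first prints the ed script that diff -e produces and then applies it. It assumes diff is available and skips the apply step when ed is not installed.

```sh
dir=$(mktemp -d)
printf '%s\n' 'one' 'two' 'three' >"$dir/file1"
printf '%s\n' 'one' 'TWO' 'three' 'four' >"$dir/file2"

# Step 1: show the generated ed script (there is no trailing w command).
# diff exits with status 1 when the files differ; that is not an error here.
diff -e "$dir/file1" "$dir/file2" || :

# Step 2: append w and let ed apply the script; -s suppresses the byte counts.
if command -v ed >/dev/null 2>&1; then
    (diff -e "$dir/file1" "$dir/file2"; echo w) | ed -s "$dir/file1"
    cmp -s "$dir/file1" "$dir/file2" && echo "file1 now matches file2"
else
    echo "ed is not installed; skipping the apply step" >&2
fi
```

Because ed rewrites file1 in place once w is sent, working on a copy is the cautious choice when the original still matters.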
# 7  
Old 08-10-2012
@RudiC, thanks very much.