Compare two files line by line


 
# 1  
Old 08-06-2010

Query:
There are two files:
/home/rgupta/input/file.txt.arch (source file)
/home/rgupta/output/file.txt (destination file)

Sample file contents are attached for reference.

Scenario:
The file /home/rgupta/input/file.txt.arch is picked up by an application, xyz. The application runs some validation checks on the file and then writes the result to /home/rgupta/output/file.txt.
While writing the file to /home/rgupta/output, application xyz may do either of the following:
1) Eliminate some records
2) Modify some records

There are many such files, each with a huge number of records made up of comma/semicolon-separated fields.
I need to write a shell script that compares the source file with the destination file line by line and creates another file with the following data:
1. If a record is missing from the destination file:
Source file name, destination file name, missing record, record number of the missing record in the source file
2. If a record was modified while being transferred from source to destination:
- Source file name, data record from the source file [first line in new file]
- Destination file name, data record from the destination file [second line in new file]
- Modified record number, exact field that was modified [second line in new file]

I tried the following command:
sdiff -s /home/rgupta/input/file.txt.arch /home/rgupta/input/file.txt
The output is:
6015a6016
> 29429293,1,,,01387262543,N,01387262543,N,0,01387262543,N
10000c10001
COUNT=9999 | COUNT=10000;


The output of this command is close to what I expect, but the missing record is shown truncated.

Could someone please provide me the solution?

file1.txt:
Code:
029429284,1,,,02077232651,N,02077232651,N,0,02077232651,N,0,0,1877,02072496953,U,N,,02072496953,U,20100726111641,20100726111641,0,20100726111714,00000330,16,16,10,N;
029429285,1,,,02087488096,N,02087488096,N,0,02087488096,N,0,0,1877,02083325000,U,N,,02083325000,U,20100726111611,20100726111611,0,20100726111715,00001040,16,16,10,N;
029429286,1,,,02072432251,N,02072432251,N,0,02072432251,N,0,0,1363,02076241470,U,N,,02076241470,U,20100726111257,20100726111257,0,20100726111701,00004040,16,16,10,N;
029429287,1,,,02087447700,N,02087447700,N,0,02087447700,N,0,0,1363,07826378298,U,N,,07826378298,U,20100726111650,20100726111650,0,20100726111701,00000110,16,16,10,N;
029429288,1,,,02087400376,N,02087400376,N,0,02087400376,N,0,0,8010,08000151736,U,N,,08000151736,U,20100726111237,20100726111237,0,20100726111651,00004140,16,16,10,N;
029429289,1,,,02088736705,N,02088736705,N,0,02088736705,N,0,0,6160,003531437509,U,N,,003531437509,U,20100726111644,20100726111644,0,20100726111651,00000070,16,16,10,N;
029429290,1,,,02083995696,N,02083995696,N,0,02083995696,N,0,0,1363,07803049166,U,N,,07803049166,U,20100726111700,20100726111700,0,20100726111704,00000040,16,16,10,N;
029429291,1,,,01228631982,N,01228631982,N,0,01228631982,N,0,0,1363,07866387607,U,N,,07866387607,U,20100726121123,20100726121123,0,20100726122312,00011490,16,16,10,N;
029429292,1,,,01620844940,N,01620844940,N,0,01620844940,N,0,0,1877,01312402295,U,N,,01312402295,U,20100726121700,20100726121700,0,20100726122251,00005510,16,16,10,N;
029429294,1,,,01463783624,N,01463783624,N,0,01463783624,N,0,0,1363,01463811679,U,N,,01463811679,U,20100726122255,20100726122255,0,20100726122334,00000390,16,16,10,N;
029429295,1,,,01315537520,N,01315537520,N,0,01315537520,N,0,0,1363,08449999999,U,N,,08449999999,U,20100726121407,20100726121407,0,20100726122315,00009080,16,16,10,N;
029429296,1,,,01624844675,N,01624844675,N,0,01624844675,N,0,0,6160,02890614024,U,N,,02890614024,U,20100726121837,20100726121837,0,20100726122235,00003580,16,16,10,N;
029429297,1,,,01539723145,N,01539723145,N,0,01539723145,N,0,0,1363,09041619999,U,N,,09041619999,U,20100726122221,20100726122221,0,20100726122300,00000390,16,16,10,N;
029429298,1,,,01361884551,N,01361884551,N,0,01361884551,N,0,0,1363,01890818065,U,N,,01890818065,U,20100726121903,20100726121903,0,20100726122244,00003410,16,16,10,N;
029429299,1,,,02089944041,N,02089944041,N,0,COUNT=9999

file1.txt.arch:
Code:
029429284,1,,,02077232651,N,02077232651,N,0,02077232651,N,0,0,1877,02072496953,U,N,,02072496953,U,20100726111641,20100726111641,0,20100726111714,00000330,16,16,10,N;
029429285,1,,,02087488096,N,02087488096,N,0,02087488096,N,0,0,1877,02083325000,U,N,,02083325000,U,20100726111611,20100726111611,0,20100726111715,00001040,16,16,10,N;
029429286,1,,,02072432251,N,02072432251,N,0,02072432251,N,0,0,1363,02076241470,U,N,,02076241470,U,20100726111257,20100726111257,0,20100726111701,00004040,16,16,10,N;
029429287,1,,,02087447700,N,02087447700,N,0,02087447700,N,0,0,1363,07826378298,U,N,,07826378298,U,20100726111650,20100726111650,0,20100726111701,00000110,16,16,10,N;
029429288,1,,,02087400376,N,02087400376,N,0,02087400376,N,0,0,8010,08000151736,U,N,,08000151736,U,20100726111237,20100726111237,0,20100726111651,00004140,16,16,10,N;
029429289,1,,,02088736705,N,02088736705,N,0,02088736705,N,0,0,6160,003531437509,U,N,,003531437509,U,20100726111644,20100726111644,0,20100726111651,00000070,16,16,10,N;
029429290,1,,,02083995696,N,02083995696,N,0,02083995696,N,0,0,1363,07803049166,U,N,,07803049166,U,20100726111700,20100726111700,0,20100726111704,00000040,16,16,10,N;
029429291,1,,,01228631982,N,01228631982,N,0,01228631982,N,0,0,1363,07866387607,U,N,,07866387607,U,20100726121123,20100726121123,0,20100726122312,00011490,16,16,10,N;
029429292,1,,,01620844940,N,01620844940,N,0,01620844940,N,0,0,1877,01312402295,U,N,,01312402295,U,20100726121700,20100726121700,0,20100726122251,00005510,16,16,10,N;
029429293,1,,,01387262543,N,01387262543,N,0,01387262543,N,0,0,8091,01877242558,U,N,,01877242558,U,20100726122241,20100726122241,0,20100726122257,00000160,16,16,10,N;
029429294,1,,,01463783624,N,01463783624,N,0,01463783624,N,0,0,1363,01463811679,U,N,,01463811679,U,20100726122255,20100726122255,0,20100726122334,00000390,16,16,10,N;
029429295,1,,,01315537520,N,01315537520,N,0,01315537520,N,0,0,1363,08449999999,U,N,,08449999999,U,20100726121407,20100726121407,0,20100726122315,00009080,16,16,10,N;
029429296,1,,,01624844675,N,01624844675,N,0,01624844675,N,0,0,6160,02890614024,U,N,,02890614024,U,20100726121837,20100726121837,0,20100726122235,00003580,16,16,10,N;
029429297,1,,,01539723145,N,01539723145,N,0,01539723145,N,0,0,1363,09041619999,U,N,,09041619999,U,20100726122221,20100726122221,0,20100726122300,00000390,16,16,10,N;
029429298,1,,,01361884551,N,01361884551,N,0,01361884551,N,0,0,1363,01890818065,U,N,,01890818065,U,20100726121903,20100726121903,0,20100726122244,00003410,16,16,10,N;
029429299,1,,,02089944041,N,02089944041,N,0,02089944041,N,0,0,1877,02085601193,U,N,,02085601193,U,20100726111138,20100726111138,0,20100726112927,00017490,16,16,10,N;
COUNT=1000;

The highlighted record (029429293) is the extra record in the source file.
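
The report described in the query could be sketched roughly as follows. This is only a sketch, and it rests on an assumption the thread does not confirm: that the first comma-separated field is a unique record ID, so records can be matched by key instead of by position. The function name compare_records, the MISSING/MODIFIED tags, the | separator in the report, and the short sample records are all made up for illustration.

```shell
#!/bin/sh
# Sketch: report records that are missing from, or modified in, the destination.
# Assumption (not from the thread): field 1 is a unique record ID.
compare_records() {
    src=$1 dst=$2
    awk -F',' -v src="$src" -v dst="$dst" '
        # Destination file is read first: remember each record by its ID.
        FILENAME == ARGV[1] { seen[$1] = $0; next }

        # Source record whose ID never appeared in the destination.
        !($1 in seen) {
            printf "MISSING|%s|%s|%s|%d\n", src, dst, $0, FNR
            next
        }

        # Same ID, different content. Appending "" forces a string comparison,
        # so values such as 007 and 7 are not treated as equal.
        (seen[$1] "") != ($0 "") {
            n = split(seen[$1], d, ",")
            max = (NF > n) ? NF : n
            changed = ""
            for (i = 1; i <= max; i++)
                if (($i "") != (d[i] ""))
                    changed = changed (changed == "" ? "" : " ") i
            printf "%s|%s\n", src, $0
            printf "%s|%s\n", dst, seen[$1]
            printf "MODIFIED|record %d|field(s) %s\n", FNR, changed
        }
    ' "$dst" "$src"
}

# Hypothetical sample data: record 102 is dropped, record 103 has field 3 changed.
tmp=$(mktemp -d)
printf '%s\n' '101,1,A,N;' '102,1,B,N;' '103,1,C,N;' > "$tmp/file.txt.arch"
printf '%s\n' '101,1,A,N;' '103,1,X,N;' > "$tmp/file.txt"

compare_records "$tmp/file.txt.arch" "$tmp/file.txt" > "$tmp/report.txt"
cat "$tmp/report.txt"
```

Record numbers in the report are line numbers in the source file. The whole destination file is held in memory, and trailer lines such as COUNT=... are treated like any other record, so they may need to be filtered out first.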

Last edited by Scott; 08-07-2010 at 05:00 AM.. Reason: Added the file data to the post
# 2  
Old 08-07-2010
Hi.
Code:
       -w NUM  --width=NUM
              Output at most NUM (default 130) columns per line.  

-- excerpt from man sdiff

for the sdiff on my system:
Code:
sdiff (GNU diffutils) 2.8.1

So by default, only the first 60 or so characters of each line in a differing pair are displayed.

Expanding the line width with sdiff might make the output even more unreadable, so if you need the entire line(s) displayed, perhaps the regular diff would be better ... cheers, drl
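
To illustrate the point about line width, here is a small sketch that uses two stand-in files rather than the real ones. The temporary files and the width of 400 are arbitrary choices for this example; the long record is the one that was cut off in the sdiff output earlier in the thread.

```shell
#!/bin/sh
# Two short stand-in files; the second one lacks the long record.
src=$(mktemp)
dst=$(mktemp)
long='029429293,1,,,01387262543,N,01387262543,N,0,01387262543,N,0,0,8091,01877242558,U,N,,01877242558,U,20100726122241,20100726122241,0,20100726122257,00000160,16,16,10,N;'
printf '%s\n' 'first,record;' "$long" 'last,record;' > "$src"
printf '%s\n' 'first,record;' 'last,record;' > "$dst"

# Plain diff prints each differing line in full, whatever its length.
# diff exits with status 1 when the files differ, hence the "|| true".
diff_out=$(diff "$src" "$dst" || true)
printf '%s\n' "$diff_out"

# sdiff can be widened with -w so that both columns fit the long record.
sdiff -s -w 400 "$src" "$dst" || true
```

With plain diff the removed record appears complete on a line starting with "<", which is usually easier to post-process than a very wide side-by-side listing.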

Last edited by drl; 08-07-2010 at 07:57 AM..
# 3  
Old 08-07-2010
I am unable to do this.


Please help me write a shell script that compares the two files above and generates three files with the following data:

1) The first file should contain the records that are missing from file1.txt but present in file1.txt.arch (the red-colored record).

2) The second file should contain the records that have been modified (blue-colored).

3) The third file should contain the records common to both files.

Please help me with a solution for this.

Thanks very much in advance.

Ravi
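
One possible way to produce the three files is sketched below. It assumes, without confirmation from the thread, that the first comma-separated field uniquely identifies a record. The function name split_records, the output file names missing.txt, modified.txt and common.txt, and the sample records are hypothetical.

```shell
#!/bin/sh
# Sketch: split the source records into missing / modified / common files.
# Assumption (not from the thread): field 1 is a unique record ID.
split_records() {
    src=$1 dst=$2 outdir=$3
    awk -F',' -v out="$outdir" '
        # Destination file is read first: index its records by ID.
        FILENAME == ARGV[1] { seen[$1] = $0; next }

        # ID not present in the destination at all.
        !($1 in seen) { print > (out "/missing.txt"); next }

        # Same ID but different content (string comparison forced with "").
        (seen[$1] "") != ($0 "") { print > (out "/modified.txt"); next }

        # Identical in both files.
        { print > (out "/common.txt") }
    ' "$dst" "$src"
}

# Hypothetical sample data: 102 is dropped, 103 is changed, 101 is untouched.
work=$(mktemp -d)
printf '%s\n' '101,1,A,N;' '102,1,B,N;' '103,1,C,N;' > "$work/file1.txt.arch"
printf '%s\n' '101,1,A,N;' '103,1,X,N;' > "$work/file1.txt"

split_records "$work/file1.txt.arch" "$work/file1.txt" "$work"
```

The modified file holds the source version of each changed record. An output file is only created when at least one record falls into its category.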
