Finding first difference between two files


 
# 1  
Old 02-09-2013

Hi!
I'd like to know whether there is a command that finds the first difference between two large files, outputs that line from both files, and then stops, so that it doesn't keep reading and waste computation time.

I don't think looping through the files will be efficient enough, but that's the only thing I can think of...

Even better: besides the differing line, it would be nice if it could output the preceding line too.

Thanks!!!!!
# 2  
Old 02-09-2013
Did you try diff?
Code:
diff file1 file2 | head -n 2

This shows where the first difference is and prints the first differing line.
diff has lots of options to try.
# 3  
Old 02-09-2013
Hmm, I see, but it will still go through the whole file, which is a 1GB text file...
Thanks, though.

---------- Post updated at 04:15 AM ---------- Previous update was at 04:14 AM ----------

I think cmp will do; I'll just get the line info and grep some text out.
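
For what it's worth, here is a minimal sketch of that cmp idea. It assumes cmp's usual one-line report, which ends in the line number of the first difference. The function name first_diff is made up for this example, and sed is used rather than grep to pull out the differing line and the one before it. It is not a polished tool: sed re-reads each file from the top, but it quits at the reported line, so nothing past the first difference is read.

```shell
#!/bin/sh
# first_diff FILE1 FILE2   (hypothetical helper name, for illustration only)
# Print the first differing line from each file, plus the line before it.
first_diff() {
    # cmp stops at the first differing byte. Its report ends in the line
    # number, e.g. "file1 file2 differ: byte 29, line 3"
    # (some implementations print "char" instead of "byte").
    report=$(cmp "$1" "$2" 2>/dev/null)
    if [ -z "$report" ]; then
        # Empty report: the files are identical, one file is a prefix of the
        # other, or a file is unreadable (cmp's stderr is discarded above).
        echo "no differing line found"
        return 0
    fi
    line=${report##* }        # last space-separated field is the line number
    start=$line
    if [ "$line" -gt 1 ]; then
        start=$((line - 1))   # also show the preceding line when there is one
    fi
    for f in "$1" "$2"; do
        echo "== $f, lines $start-$line"
        # Print lines start..line, then quit without reading the rest.
        sed -n "${start},${line}p;${line}q" "$f"
    done
}
```

It would be called as first_diff file1 file2.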
# 4  
Old 02-09-2013
I am 100% sure you can do this easily with awk, but I have not worked much with arrays.
There you can add an exit as soon as a difference is found.
This User Gave Thanks to Jotne For This Post:
# 5  
Old 02-09-2013
A quick start would be:
Code:
awk 'getline p<f && p!=$0 {print "Line " NR ":" RS $0 RS p; exit}' f=file2 file1

But you would need to add provisions for the case where file1 has more lines than file2, or vice versa.
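
One possible way to add those provisions is sketched below, assuming that a missing or extra line should simply be reported as the first difference. The function name first_diff_awk, the found flag and the "missing in" / "extra in" wording are made up for this example. The getline call is parenthesized so that its return value (1 for a line, 0 for end of file, -1 for an error) can be tested explicitly.

```shell
#!/bin/sh
# first_diff_awk FILE1 FILE2   (hypothetical name, for illustration only)
first_diff_awk() {
    awk '
    {
        # Read the matching line of file2; <= 0 means EOF or a read error.
        if ((getline p < f) <= 0) {
            print "Line " NR ": missing in " f RS $0
            found = 1
            exit
        }
        if (p != $0) {
            print "Line " NR ":" RS $0 RS p
            found = 1
            exit
        }
    }
    END {
        # file1 is exhausted without a difference: a line left in file2 is extra.
        if (!found && (getline p < f) > 0)
            print "Line " (NR + 1) ": extra in " f RS p
    }
    ' f="$2" "$1"
}
```

Called as first_diff_awk file1 file2, it prints nothing when the two files are identical.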
This User Gave Thanks to Scrutinizer For This Post:
# 6  
Old 02-09-2013
I have a new issue, actually.
What I want to do is compare two files such as:

Code:
file1.txt
A B C D E F G
X 34234 324234
A B C D E F Z
A B C D E F Z
X 34234 0
...

Code:
file2.txt
A B C D E F Z
X 34234 324234
A B C D E F Z
A B C D E F E
X 34234 1
...

I want it to ignore differences in lines starting with A and compare only lines starting with X, for example.
I know that I can filter out all the A lines, but I need to keep them in the files, because I have to look back at the A line that preceded the X line with the difference.
So the output should say that the two files differ at line 5, not at line 1.

I was thinking of something like

Code:
cmp file1 file2 and ignore line starting with pattern e

Thanks!!
# 7  
Old 02-09-2013
With a quick minor adaptation:

Code:
awk 'getline p<f && /^X/ && p!=$0 {print "Line " NR ":" RS $0 RS p; exit}' f=file2 file1

But now there are more exceptions to consider. For example, is the number of A lines allowed to differ, and can there be more X lines in file1 than in file2, or vice versa? So the script would need to be improved.
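
As a rough sketch of one such improvement, the version below also remembers the preceding line of each file so that it can be shown with the difference. It assumes the two files stay aligned line by line (the same number of A lines in both), which may not hold for the real data. The function name first_x_diff, the optional pattern argument and the output layout are made up for this example. It counts a pair of lines as a difference only if at least one of the two matches the pattern, and it reports file2 running out of lines but not extra lines at the end of file2.

```shell
#!/bin/sh
# first_x_diff FILE1 FILE2 [PATTERN]   (hypothetical name; PATTERN defaults to ^X)
first_x_diff() {
    awk -v pat="${3:-^X}" '
    {
        # Read the matching line of file2; <= 0 means EOF or a read error.
        if ((getline p < f) <= 0) {
            print "Line " NR ": missing in " f
            exit
        }
        # Differences count only when at least one of the two lines matches pat.
        if (($0 ~ pat || p ~ pat) && p != $0) {
            print "Line " NR ":"
            print FILENAME ":"
            if (NR > 1) print "  " prev
            print "  " $0
            print f ":"
            if (NR > 1) print "  " pprev
            print "  " p
            exit
        }
        # Remember the preceding line of each file for the report above.
        prev = $0
        pprev = p
    }
    ' f="$2" "$1"
}
```

With the two sample files above it should report line 5, together with the A line that precedes it in each file.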
This User Gave Thanks to Scrutinizer For This Post: