Finding first difference between two files | Unix Linux Forums | Shell Programming and Scripting

#1  02-09-2013  Mojing

Hi!
I'd like to know whether there is a command that finds the first difference between two large files, prints that line from both files, and then stops, so that no time is wasted reading the rest.

I don't think looping through the files line by line will be efficient enough, but that's the only thing I can think of...

Better still: besides printing the differing line, it would be nice if it could print the preceding line too.

Thanks!!!!!
#2  02-09-2013  Jotne
Did you try diff?

Code:
diff file1 file2 | head -n 2

This shows where the first difference is and prints it.
diff has lots of options to experiment with.
#3  02-09-2013  Mojing
Hmm, I see, but it will still go through the whole file, which is something like a 1 GB text file...
Thanks though.

---------- Post updated at 04:15 AM ---------- Previous update was at 04:14 AM ----------

I think cmp will do. I'll just take the line number it reports and grep the text out.
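The cmp idea above can be sketched roughly as follows. This is only an illustration, not something posted in the thread: the function name first_diff_cmp is made up, and it assumes cmp reports a difference in the usual "file1 file2 differ: byte N, line M" form (POSIX says "char" instead of "byte"), so that the line number is the last word of the output.

```shell
#!/bin/sh
# first_diff_cmp FILE1 FILE2   (hypothetical helper name)
# Prints the line number of the first difference, then the preceding line
# and the differing line from each file.
first_diff_cmp() {
    # cmp stops reading at the first differing byte.
    # LC_ALL=C keeps the message in its untranslated form.
    out=$(LC_ALL=C cmp "$1" "$2" 2>/dev/null)
    case $? in
        0) echo "files are identical"; return 0 ;;
        1) ;;                                   # files differ, carry on
        *) echo "cmp failed" >&2; return 2 ;;
    esac

    n=${out##* }                                # last word = line number
    case $n in
        ''|*[!0-9]*)
            # cmp wrote only an EOF message to stderr
            echo "one file is a prefix of the other"
            return 1 ;;
    esac

    if [ "$n" -gt 1 ]; then start=$((n - 1)); else start=1; fi

    echo "first difference at line $n"
    for file in "$1" "$2"; do
        echo "== $file"
        # print lines start..n, then quit so the rest of the file is not read
        sed -n "${start},${n}p;${n}q" "$file"
    done
    return 1
}
```

Called as `first_diff_cmp file1 file2`. Each file is read only up to the first difference: once by cmp and once by each sed.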
#4  02-09-2013  Jotne
I am 100% sure you can do this easily with awk, but I have not worked much with arrays.
There you can add an exit as soon as a difference is found.
#5  02-09-2013  Scrutinizer (Moderator)
A quick start would be:

Code:
awk 'getline p<f && p!=$0 {print "Line " NR ":" RS $0 RS p; exit}' f=file2 file1

But you would need to add provisions for the case where file1 has more lines than file2, or vice versa.
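One possible way to add those provisions, and to print the preceding line as asked in the first post, is sketched below. It is an assumption-laden illustration rather than the one-liner's official extension: the function name first_diff_awk and the variable names are invented here, and the second file name is passed with -v, so it must not contain backslashes.

```shell
#!/bin/sh
# first_diff_awk FILE1 FILE2   (hypothetical helper name)
first_diff_awk() {
    awk -v f="$2" '
        {
            # Read the same-numbered line of the second file.
            # A result <= 0 means that file has ended (or cannot be read).
            if ((getline other < f) <= 0) {
                print "Line " NR ": only in " FILENAME
                print $0
                done = 1
                exit
            }
            if (other != $0) {
                print "Line " NR ":"
                # All earlier lines matched, so the preceding line
                # is the same in both files and is printed once.
                if (NR > 1) print prev
                print $0
                print other
                done = 1
                exit
            }
            prev = $0
        }
        END {
            # exit also runs END, hence the done flag.
            # Here the first file ended; check for extra lines in the second.
            if (!done && (getline other < f) > 0) {
                print "Line " (NR + 1) ": only in " f
                print other
            }
        }
    ' "$1"
}
```

Nothing is printed when the files are identical. As in the one-liner, awk stops at the first difference, so the rest of both files is never read.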
#6  02-09-2013  Mojing
I have a new issue, actually.
What I want to do is compare two files such as:


Code:
file1.txt
A B C D E F G
X 34234 324234
A B C D E F Z
A B C D E F Z
X 34234 0
...


Code:
file2.txt
A B C D E F Z
X 34234 324234
A B C D E F Z
A B C D E F E
X 34234 1
...

I want it to ignore differences in lines starting with A and compare only lines starting with X, for example.
I know that I could filter out all the A lines, but I need to keep them in the files because I have to look back at the A line that precedes the differing X line.
So the output should say that the two files differ at line 5, not at line 1.

I was thinking of something like


Code:
cmp file1 file2 and ignore line starting with pattern e

Thanks!!
#7  02-09-2013  Scrutinizer (Moderator)
With a quick minor adaptation:


Code:
awk 'getline p<f && /^X/ && p!=$0 {print "Line " NR ":" RS $0 RS p; exit}' f=file2 file1

But now there are more exceptions to consider. For example, is the number of A lines allowed to differ between the files, and can there be more X lines in file1 than in file2, or vice versa? So the script would need to be improved.
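A small step in that direction, still under the simplest assumption that both files have the same layout (an X line in file1 always sits on the same line number as its counterpart in file2), might look like the sketch below. It remembers the most recent non-X line from each file so that it can be shown next to the differing X line. The function name first_x_diff and the "<" / ">" output markers are choices made for this example, not something from the thread, and differing numbers of A lines are not handled.

```shell
#!/bin/sh
# first_x_diff FILE1 FILE2   (hypothetical helper name)
first_x_diff() {
    awk -v f="$2" '
        # Read the same-numbered line of the second file; stop if it has run out.
        (getline other < f) <= 0 {
            print "Line " NR ": " f " has no more lines"
            exit
        }
        # Lines that do not start with X are never compared, only remembered.
        !/^X/ { ctx1 = $0; ctx2 = other; have_ctx = 1; next }
        # An X line that differs: report it with the remembered context and stop.
        other != $0 {
            print "Line " NR ":"
            if (have_ctx) print "< " ctx1
            print "< " $0
            if (have_ctx) print "> " ctx2
            print "> " other
            exit
        }
    ' "$1"
}
```

With the five sample lines of file1.txt and file2.txt from post #6, this reports line 5 and shows the preceding A line from each file ("A B C D E F Z" and "A B C D E F E") next to the two X lines.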