Simple awk command to compare two files and print first difference

04-26-2017

Hello,

I have two text files, each with a single column,
file 1:

Code:

file 2:

Code:

I am trying to identify the value in red above, which is the first value in file 1 that doesn't match the value on the same line of file 2. I need to print that value and exit.

At first I tried diff,

diff file1 file2 | head -n 2

This gives what I want, but the output spans multiple lines, so it takes extra steps to get the value into a bash variable, which is what I need.

I then tried awk,

awk ' NR==FNR { a[NR]=$0; next } !($0 in a){ print $1; exit } ' file2 file1

Note that the order of input files is reversed because I want the first line of file1 that does not match file2. This just prints the first line of file1. Even if it did work, I think that this just tells me that the value is, or is not, in the file, not if the lines match.
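A minimal sketch of why that attempt prints the first line of file1, using made-up sample data rather than the files from the post (the temporary directory and the values 111, 222, 333, 999 are assumptions for illustration). The `in` operator tests array indices, not stored values, and `a[NR]=$0` uses line numbers as indices:

```shell
# Hypothetical sample data, not the files from the post.
tmp=$(mktemp -d)
printf '%s\n' 111 222 333 > "$tmp/file1"
printf '%s\n' 111 222 999 > "$tmp/file2"

# a[NR]=$0 stores the lines of file2 as values under the indices 1, 2, 3.
# "$0 in a" asks whether the text of the current line is an index of a,
# so "111" is not found even though both files start with it.
by_index=$(awk 'NR==FNR { a[NR]=$0; next } !($0 in a) { print $1; exit }' "$tmp/file2" "$tmp/file1")
echo "$by_index"   # prints 111, the first line of file1

rm -r "$tmp"
```

With these files the first real mismatch is on line 3, yet the command stops at line 1, which matches the behavior described above.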

Code:

awk ' NR==FNR { a[NR]=$0; next } $0 != a[FNR] { print a[FNR]; exit } file1 file2

I am sure I could do a loop with read, but that would be slow.

This seems like a very simple task. Are there any suggestions?

LMHmedchem


04-26-2017


How about

Code:

diff -y -b --suppress-common-lines file1 file2 | cut -f1 | head -1
123476854

Or, slightly adapting your own awk proposal:

Code:

awk ' NR==FNR { a[$0]; next } !($0 in a){ print $1; exit } ' file2 file1
123476854


RudiC


04-26-2017

It seems something like this would be correct,

awk ' NR==FNR { a[$0]; next } $0 != a[FNR] { print a[FNR]; exit } file1 file2'

but that doesn't do anything at all. Am I right that evaluating !($0 in a) looks for $0 anywhere in a[]? I am checking that the files match, so it matters that the value appears on the same line in both files, not that it appears anywhere.
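To illustrate the distinction being asked about, here is a small sketch on hypothetical data (the values aaa, bbb, ccc and the temporary directory are assumptions). `a[$0]` only records that a line occurs somewhere, so `!($0 in a)` is a membership test; comparing by position needs the line number as the index. It also appears that in the command above the closing quote comes after the file names, so awk most likely receives no input files and waits on standard input, which would explain why nothing happens.

```shell
tmp=$(mktemp -d)
# Hypothetical data: same values, but lines 1 and 2 are swapped.
printf '%s\n' aaa bbb ccc > "$tmp/file1"
printf '%s\n' bbb aaa ccc > "$tmp/file2"

# Membership test: every line of file1 occurs somewhere in file2,
# so nothing is printed.
anywhere=$(awk 'NR==FNR { a[$0]; next } !($0 in a) { print; exit }' "$tmp/file2" "$tmp/file1")

# Positional test: store file2 by line number, then compare line N of
# file1 with a[N].
same_line=$(awk 'NR==FNR { a[NR]=$0; next } $0 != a[FNR] { print; exit }' "$tmp/file2" "$tmp/file1")

echo "membership: '$anywhere'"
echo "positional: '$same_line'"
rm -r "$tmp"
```

The membership version reports no difference, while the positional version prints aaa, the first line of file1 that differs from the same line of file2.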

LMHmedchem


04-26-2017


Code:

paste -d" " file1 file2 | awk '$1 != $2 {print $1; exit;}'
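A possible variation, sketched on hypothetical data: the command above assumes that no line contains a space, because awk then splits the pasted line into more than two fields. If the lines may contain spaces, one option is to keep paste's default tab delimiter and tell awk to split on tabs only (this in turn assumes the data contains no tabs).

```shell
tmp=$(mktemp -d)
# Hypothetical data where the lines themselves contain spaces.
printf '%s\n' 'a b' 'c d' > "$tmp/file1"
printf '%s\n' 'a b' 'c x' > "$tmp/file2"

# paste joins corresponding lines with a tab by default;
# -F'\t' makes awk split the joined line on tabs only.
first_diff=$(paste "$tmp/file1" "$tmp/file2" | awk -F'\t' '$1 != $2 { print $1; exit }')
echo "$first_diff"   # prints: c d

rm -r "$tmp"
```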


rdrtx1


04-26-2017


@OP, your second suggestion seems to work all right, but you forgot the closing quote:

Code:

awk ' NR==FNR { a[NR]=$0; next } $0 != a[FNR] { print a[FNR]; exit }' file1 file2

However, it would read the whole of file1 first and put it in memory.

Another approach you could try:

Code:

awk '{getline s<f} $0!=s{print; exit}' f=file2 file1
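The same idea spelled out with comments, as a sketch on hypothetical data (the sample values and temporary directory are assumptions). For each line of file1, `getline s < f` reads the next line of the file named in `f`, so only one line of each file is held at a time. The return-value check is an addition that is not in the one-liner: it clears `s` once the second file runs out, so a leftover value is not compared again.

```shell
tmp=$(mktemp -d)
printf '%s\n' 111 222 333 > "$tmp/file1"
printf '%s\n' 111 229 333 > "$tmp/file2"

first_diff=$(awk '
    {
        # Read the next line of the second file into s.
        # getline returns 1 on success, 0 at end of file, -1 on error.
        if ((getline s < f) <= 0) s = ""
    }
    $0 != s { print; exit }   # print the line from file1 and stop
' f="$tmp/file2" "$tmp/file1")
echo "$first_diff"   # prints 222

rm -r "$tmp"
```

The `f=...` operand is an awk variable assignment given on the command line; it is processed before file1 is opened because it comes first in the argument list.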


Scrutinizer


04-26-2017


In the end, I did this based on the code posted by Scrutinizer,

error_record=$(awk '{getline s<f} $0!=s{print; exit}' f=file2 file1)

It seems like it will work well enough and was the fastest of the methods that worked.

This suggestion of RudiC also worked but was marginally slower.

error_record=$(diff -y -b --suppress-common-lines file1 file2 | cut -f1 | head -1)

By slower I mean 0m0.391s as opposed to 0m0.156s with the first method. Not enough difference to bother with but I guess you need some reason to pick a method.

The method suggested by rdrtx1 also worked but again was a bit slower,

error_record=$(paste -d" " file1 file2 | awk '$1 != $2 {print $1; exit;}')

My guess is that the two slower methods both made calls to more than one program and this is the origin of the difference.

I was not able to get any output from this, even though it looks correct,

awk ' NR==FNR { a[NR]=$0; next } $0 != a[FNR] { print a[FNR]; exit }' file1 file2

Don't know what the issue is there.
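The thread does not establish why that command printed nothing. As a sanity check on made-up data (sample values and file names are assumptions), the sketch below shows the command printing the file1 value at the first mismatch, along with one known way it can stay silent: if the first file is empty, `NR==FNR` remains true for every line of the second file, so no comparison ever runs.

```shell
tmp=$(mktemp -d)
printf '%s\n' 111 222 333 > "$tmp/file1"
printf '%s\n' 111 229 333 > "$tmp/file2"

# file1 is stored by line number; each line of file2 is compared with the
# stored line of the same number, and the file1 value is printed.
out=$(awk 'NR==FNR { a[NR]=$0; next } $0 != a[FNR] { print a[FNR]; exit }' "$tmp/file1" "$tmp/file2")
echo "$out"   # prints 222

# With an empty first file, NR==FNR holds for the whole second file,
# so every line is stored and nothing is printed.
: > "$tmp/empty"
none=$(awk 'NR==FNR { a[NR]=$0; next } $0 != a[FNR] { print a[FNR]; exit }' "$tmp/empty" "$tmp/file2")
echo "empty first file: '$none'"

rm -r "$tmp"
```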

LMHmedchem
