The UNIX and Linux Forums  

  #1 (permalink)  
Old 02-16-2007
Registered User
 

Join Date: Feb 2007
Posts: 4
Comparing Columns of Two Files

Dear all,

I have two files on UNIX, File1 and File2, as below:

File1:
1,1234,.,67.897,,0
1,4134,.,87.97,,4
0,1564,.,97.8,,1

File2:
2,8798,.,67.897,,0
2,8879,.,77.97,,4
0,1564,.,97.8,,1

I want to do the following:
(1) Make sure that both files have the same number of columns, and error out if they do not.
(2) If the column counts match, compare each corresponding field of the two files line by line, and for every difference print the two differing values along with the line number and column number.

I know that 'diff -w' is useful here because it ignores leading, trailing, and in-between spaces when comparing, which helps because all my columns are numeric.

Can I use, say, 'awk' and its NF and NR variables along with diff? Or is there another way? Please advise. Thank you for your help.

GG
  #2 (permalink)  
Old 02-17-2007
Registered User
 

Join Date: Dec 2006
Posts: 29
Will all lines in a file have the same number of columns?
  #3 (permalink)  
Old 02-18-2007
Registered User
 

Join Date: Feb 2007
Posts: 4
Yes. All lines have the same number of columns; it is like a matrix. Thanks.
  #4 (permalink)  
Old 02-18-2007
vgersh99
Moderator
 

Join Date: Feb 2005
Location: Boston, MA
Posts: 3,002
Here's something to start with:

nawk -F',' -f gg.awk File1 File2

gg.awk:
Code:
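# First file (File1): remember every line, keyed by its line number.
FNR==NR{
  arr[FNR]=$0
  next
}
# Second file (File2): compare each line with the File1 line of the same number.
{
  f1num=split(arr[FNR], f1A, FS)
  if ( f1num != NF ) {
     printf("error: [%d] - unequal number of fields\n", FNR)
     next
  }
  # %s instead of %d, so values such as "." or 67.897 print as written
  for(i=1; i<=NF; i++)
    if (f1A[i] != $i)
     printf("error: [%d]::[%d] - non-equal fields: [%s != %s]\n", FNR, i, f1A[i], $i)
  # empty the array before the next line
  split("", f1A)
}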
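
For readers unfamiliar with the FNR==NR pattern used above, here is a minimal sketch of just that idiom, stripped of the comparison logic. The file names first and second and the helper pair_lines are made up for illustration, and plain awk is assumed to be available in place of nawk.

```shell
#!/bin/sh
# Two tiny throwaway files (hypothetical names and contents).
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT
printf 'a\nb\n' > "$dir/first"
printf 'x\ny\n' > "$dir/second"

pair_lines() {
  # $1 = first file, $2 = second file
  awk '
    # NR counts lines across all files; FNR restarts at 1 for each file.
    # The two are equal only while the first file is being read.
    FNR == NR { saved[FNR] = $0; next }
    # Reached only for the second file: pair line N with saved line N.
    { print FNR ": " saved[FNR] " <-> " $0 }
  ' "$1" "$2"
}

pair_lines "$dir/first" "$dir/second"
# prints:
# 1: a <-> x
# 2: b <-> y
```

The gg.awk script follows the same shape: the first block stores File1, and the second block runs once per File2 line, where split() breaks the stored File1 line into fields so they can be compared with $1..$NF. The final split("", f1A) is a portable way to empty the array.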
  #5 (permalink)  
Old 02-21-2007
Registered User
 

Join Date: Feb 2007
Posts: 4
The next step

Thank you so much for your reply. It seems to work, but I would like to extend it a little further by printing labels in the output along with all the fields currently displayed, i.e.:

File1:
1,1234,.,67.897,,0
1,4134,.,87.97,,4
0,1564,.,97.8,,1

File2:
2,8798,.,67.897,,0
2,8879,.,77.97,,4
0,1564,.,97.8,,1

File3:
Label1, Label2, ...LabelN

Here File1 and File2 are the same as before, and File3 contains a single row of labels corresponding to the fields (the labels are the same for File1 and File2). Can I produce the same output that your code (gg.awk) generates, but with the corresponding label printed on the far right whenever two fields differ, so that the output looks like:

error: [1]::[1] - non-equal fields: [1 != 2] "label1"

error: [1]::[2] - non-equal fields: [1234 != 8798] "label2"

etc...

with the labels coming from File3.

Also, could you explain what this code is doing? I got a rough idea while exploring it, but some features still seem a little advanced to me. Any information would be very useful.

I thank you immensely for all this help,

GG
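
One possible way to get the labelled output requested above is sketched here; it is an illustration under stated assumptions, not a reply from the thread. It assumes File3 is passed first on the command line, that its single row is comma-separated (spaces around the commas are stripped), and that plain awk stands in for nawk. The label names label1..label6 and the helper compare_with_labels are hypothetical.

```shell
#!/bin/sh
# Sample data copied from the post; File3 holds one row of made-up labels.
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT
cat > "$dir/File1" <<'EOF'
1,1234,.,67.897,,0
1,4134,.,87.97,,4
0,1564,.,97.8,,1
EOF
cat > "$dir/File2" <<'EOF'
2,8798,.,67.897,,0
2,8879,.,77.97,,4
0,1564,.,97.8,,1
EOF
echo 'label1, label2, label3, label4, label5, label6' > "$dir/File3"

compare_with_labels() {
  # $1 = labels file, $2 = first data file, $3 = second data file
  awk -F',' '
    # Labels file: drop spaces around the commas, then store label[1..N].
    FILENAME == ARGV[1] { gsub(/[ \t]*,[ \t]*/, ","); split($0, label, FS); next }
    # First data file: remember each line by its line number.
    FILENAME == ARGV[2] { arr[FNR] = $0; next }
    # Second data file: compare field by field with the remembered line.
    {
      n = split(arr[FNR], f1, FS)
      if (n != NF) {
        printf("error: [%d] - unequal number of fields\n", FNR)
        next
      }
      for (i = 1; i <= NF; i++)
        if (f1[i] != $i)
          printf("error: [%d]::[%d] - non-equal fields: [%s != %s] \"%s\"\n", FNR, i, f1[i], $i, label[i])
    }
  ' "$1" "$2" "$3"
}

compare_with_labels "$dir/File3" "$dir/File1" "$dir/File2"
```

With the sample data this reports five differences, the first being error: [1]::[1] - non-equal fields: [1 != 2] "label1". Testing FILENAME against ARGV[1] and ARGV[2] replaces the FNR==NR test because there are now three input files, and it relies on the three file names being distinct.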