Comparing two files


 
# 1  
Old 04-21-2008
Comparing two files

Hi,

I have two files in the format shown below; each contains statistics for a set of tables. I need to compare the two files and, wherever a table's statistics do not match, display that table's block (the lines between the separator lines) from both files.

Table Name: AAA
Row Count:96 SUM(F1): 3739 MAX(F1):77 MIN(F1): 0 AVG(F1): 38.9479167 LENGTH(LINE): 2260

------------------------------------------------------------------------------------------------------------------------------

Table Name: AQ$_FT_Q_BECMD_G
Row Count: 0 SUM(SUBSCRIBER#):MAX(SUBSCRIBER#): MIN(SUBSCRIBER#): AVG(SUBSCRIBER#):LENGTH(NAME):SUM(ADDRESS#): MAX(ADDRESS#):MIN(ADDRESS#):AVG(ADDRESS#):

------------------------------------------------------------------------------------------------------------------------------

Table Name: AQ$_FT_Q_BECMD_H
Row Count: 0 SUM(SUBSCRIBER#):MAX(SUBSCRIBER#): MIN(SUBSCRIBER#): AVG(SUBSCRIBER#):LENGTH(NAME):SUM(ADDRESS#): MAX(ADDRESS#):MIN(ADDRESS#):AVG(ADDRESS#): LENGTH(TRANSACTION_ID): SUM(DEQUEUE_USER): MAX(DEQUEUE_USER): LENGTH(DEQUEUE_USER): 50
------------------------------------------------------------------------------------------------------------------------------

Can anyone help me on this?

Last edited by ragavhere; 05-12-2008 at 01:27 AM..
# 2  
Old 04-21-2008
One possible way, using awk:
Code:
awk '
/^Table Name/  { table = $3 ; tables[table]++                }
! NF || /^-+/  {                                        next }
NR == FNR      { stats1[table] = stats1[table] $0 ORS ; next }
               { stats2[table] = stats2[table] $0 ORS ; next }
END {
   for (t in tables) {
      if (stats1[t] != stats2[t]) {
         out = "==========================================" ORS
         out = out stats1[t] ORS stats2[t]
         print out
      }
   }
}
' stats1.dat stats2.dat

Input file 1:
Code:
Table Name: AAA
Row Count:96 SUM(F1): 3739 MAX(F1):77 MIN(F1): 0 AVG(F1): 38.9479167 LENGTH(LINE): 2260

------------------------------------------------------------------------------------------------------------------------------
Table Name: AQ$_FT_Q_BECMD_G
Row Count: 0 SUM(SUBSCRIBER#):MAX(SUBSCRIBER#): MIN(SUBSCRIBER#): AVG(SUBSCRIBER#):LENGTH(NAME):SUM(ADDRESS#): MAX(ADDRESS#):MIN(ADDRESS#):AVG(ADDRESS#):

------------------------------------------------------------------------------------------------------------------------------

Table Name: AQ$_FT_Q_BECMD_H
Row Count: 0 SUM(SUBSCRIBER#):MAX(SUBSCRIBER#): MIN(SUBSCRIBER#): AVG(SUBSCRIBER#):LENGTH(NAME):SUM(ADDRESS#): MAX(ADDRESS#):MIN(ADDRESS#):AVG(ADDRESS#): LENGTH(TRANSACTION_ID): SUM(DEQUEUE_USER): MAX(DEQUEUE_USER):
------------------------------------------------------------------------------------------------------------------------------

Input file 2:
Code:
Table Name: AAA
Row Count:96 SUM(F1): 3739 MAX(F1):77 MIN(F1): 0 AVG(F1): 38.9479167 LENGTH(LINE): 2260

------------------------------------------------------------------------------------------------------------------------------
Table Name: AQ$_FT_Q_BECMD_G
Row Count: 0 SUM(SUBSCRIBER#):MAX(SUBSCRIBER#): MIN(SUBSCRIBER#): AVG(SUBSCRIBER#):LENGTH(NAME):SUM(ADDRESS#): MAX(ADDRESS#):MIN(ADDRESS#):AVG(ADDRESS#):

------------------------------------------------------------------------------------------------------------------------------
Table Name: AQ$_FT_Q_BECMD_K
Row Count: 0 SUM(SUBSCRIBER#):MAX(SUBSCRIBER#): MIN(SUBSCRIBER#): AVG(SUBSCRIBER#):LENGTH(NAME):SUM(ADDRESS#): MAX(ADDRESS#):MIN(ADDRESS#):AVG(ADDRESS#):

------------------------------------------------------------------------------------------------------------------------------

Table Name: AQ$_FT_Q_BECMD_H
Row Count: 1 SUM(SUBSCRIBER#): 11 MAX(SUBSCRIBER#): 11 MIN(SUBSCRIBER#): 11 AVG(SUBSCRIBER#): 11 LENGTH(NAME) 24 :SUM(ADDRESS#): MAX(ADDRESS#): 0 MIN(ADDRESS#): 0 AVG(ADDRESS#):  0 LENGTH(TRANSACTION_ID): 4 SUM(DEQUEUE_USER): 0 MAX(DEQUEUE_USER): 0
------------------------------------------------------------------------------------------------------------------------------

Output:
Code:
==========================================
Table Name: AQ$_FT_Q_BECMD_H
Row Count: 0 SUM(SUBSCRIBER#):MAX(SUBSCRIBER#): MIN(SUBSCRIBER#): AVG(SUBSCRIBER#):LENGTH(NAME):SUM(ADDRESS#): MAX(ADDRESS#):MIN(ADDRESS#):AVG(ADDRESS#): LENGTH(TRANSACTION_ID): SUM(DEQUEUE_USER): MAX(DEQUEUE_USER):

Table Name: AQ$_FT_Q_BECMD_H
Row Count: 1 SUM(SUBSCRIBER#): 11 MAX(SUBSCRIBER#): 11 MIN(SUBSCRIBER#): 11 AVG(SUBSCRIBER#): 11 LENGTH(NAME) 24 :SUM(ADDRESS#): MAX(ADDRESS#): 0 MIN(ADDRESS#): 0 AVG(ADDRESS#):  0 LENGTH(TRANSACTION_ID): 4 SUM(DEQUEUE_USER): 0 MAX(DEQUEUE_USER): 0

==========================================

Table Name: AQ$_FT_Q_BECMD_K
Row Count: 0 SUM(SUBSCRIBER#):MAX(SUBSCRIBER#): MIN(SUBSCRIBER#): AVG(SUBSCRIBER#):LENGTH(NAME):SUM(ADDRESS#): MAX(ADDRESS#):MIN(ADDRESS#):AVG(ADDRESS#):
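
Note that awk does not specify the order in which `for (t in tables)` visits the tables, so mismatches are not guaranteed to come out in file order. A minimal sketch of one way to keep first-seen order, assuming the same file layout as above; the function name `compare_stats_ordered` and the `seen`, `order` and `n` variables are illustrative and not part of the original script:

```shell
# compare_stats_ordered FILE1 FILE2  (hypothetical name, not from the thread)
compare_stats_ordered() {
    awk '
    /^Table Name/  { table = $3
                     # remember each table once, in the order it first appears
                     if (!(table in seen)) { seen[table] = 1 ; order[++n] = table } }
    ! NF || /^-+/  { next }
    NR == FNR      { stats1[table] = stats1[table] $0 ORS ; next }
                   { stats2[table] = stats2[table] $0 ORS }
    END {
       for (i = 1; i <= n; i++) {          # walk the tables in first-seen order
          t = order[i]
          if (stats1[t] != stats2[t]) {
             out = "==========================================" ORS
             out = out stats1[t] ORS stats2[t]
             print out
          }
       }
    }
    ' "$1" "$2"
}
```

It would be called as `compare_stats_ordered stats1.dat stats2.dat`.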

Jean-Pierre.
# 3  
Old 04-22-2008
Comparing two files

Hi,

The code works fine. Could you please explain each line of it?
# 4  
Old 04-22-2008
Code:
awk '

/^Table Name/ {                             # Lines starting with "Table Name"
   table = $3;                              # Remember the current table name
   tables[table]++                          #   and record it in the tables array
}                                           # (no next: the line is also stored below)

! NF || /^-+/ {                             # Empty lines and separator lines
   next                                     # Skip them, go to the next line
}                                           #

NR == FNR {                                 # Lines from the first input file
   stats1[table] = stats1[table] $0 ORS;    # Append the line to the stats of this table
   next                                     # Go to the next line
}

{                                           # Remaining lines come from the second input file
   stats2[table] = stats2[table] $0 ORS;    # Append the line to the stats of this table
   next                                     # Go to the next line
}                                           #

END {                                       # All files have been read
   for (t in tables) {                      # For every table seen (order unspecified)
      if (stats1[t] != stats2[t]) {         #    If the stats differ
         out = "==========================================" ORS;
         out = out stats1[t] ORS stats2[t]; #       Separator line, then both blocks
         print out                          #       Print them
      }                                     #
   }                                        #
}                                           #

' stats1.dat stats2.dat
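
The `NR == FNR` test is what separates the two inputs: `NR` counts records across all files, while `FNR` restarts at 1 for each file, so the two are equal only while the first file is being read. A small sketch to see this in isolation; the `tag_lines` function name is made up for illustration:

```shell
# tag_lines FILE1 FILE2  (hypothetical helper): prefix each line with its source file
tag_lines() {
    awk '
    NR == FNR { print "first:  " $0 ; next }   # NR and FNR agree only in the first file
              { print "second: " $0 }          # every other line is from the second file
    ' "$1" "$2"
}
```

Running `tag_lines stats1.dat stats2.dat` labels every line with the file it came from. One caveat: if the first file is empty, the lines of the second file also satisfy `NR == FNR`.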

Jean-Pierre.
# 5  
Old 04-22-2008
Comparing two files

Hi aigles,

Thanks a lot.
# 6  
Old 04-26-2008
Comparing two files

Hi Jean-Pierre,

Do stats1 and stats2 refer to the two input files? Will the code work if I give full path names such as /home/frk/ragav/stats1 and /home/frk/ragav/stats2 instead of the file names? If so, how should the code be modified?

If I assign a path such as /home/frk/ragav/stats1 to a variable, how can I use that variable in the code?

When I assigned the file names to variables like
a=stats1.txt
b=stats2.txt
and changed the code to

nawk '
/^Table Name/ { table = $3 ; tables[table]++ }
! NF || /^-+/ { next }
NR == FNR { $e[table] = $e[table] $0 ORS ; next }
{ $f[table] = $f[table] $0 ORS ; next }
END {
for (t in tables) {
if ($e[t] != $f[t]) {
out = "-----------------------------------------------------------------" ORS
out = out $e[t] ORS $f[t]
print out
}
}
}
' $e $f >> result.out


I get this error:

nawk: illegal field $()
input record number 1, file startendcut1.txt
source line number 4

Can you please help with both of these ways of modifying the code?

Last edited by ragavhere; 04-26-2008 at 06:31 AM.. Reason: Error reason included
# 7  
Old 04-26-2008
In my script, stats1 and stats2 inside the awk code are arrays;
stats1.dat and stats2.dat are the input files.

The input files can be specified with their full paths.
Don't use $e and $f as array names in your awk code; use fixed names (stats1 and stats2, or anything you like, such as first_array and second_array). The shell variables holding the file names belong only on the command line, after the closing quote.
Code:
a=stats1.txt
b=stats2.txt

nawk '
/^Table Name/ { table = $3 ; tables[table]++ }
! NF || /^-+/ { next }
NR == FNR     { stats1[table] = stats1[table] $0 ORS ; next }
              { stats2[table] = stats2[table] $0 ORS ; next }
END {
   for (t in tables) {
      if (stats1[t] != stats2[t]) {
         out = "-----------------------------------------------------------------" ORS
         out = out stats1[t] ORS stats2[t]
         print out
      }
   }
}
' "$a" "$b" >> result.out
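
The `illegal field $()` error arises because the shell expands nothing inside the single-quoted program, so awk itself sees `$e[table]` and treats `$` as its field-reference operator applied to `e[table]`, which is empty. Shell variables either go outside the quotes as file arguments or are handed to awk with `-v`. A minimal sketch, assuming the paths arrive as function arguments; the names `compare_files`, `first`, `second` and `sep` are illustrative only:

```shell
# compare_files FILE1 FILE2  (hypothetical wrapper around the script above)
compare_files() {
    first=$1      # may be a bare file name or a full path
    second=$2
    # -v copies a shell value into the awk variable sep;
    # the quoted "$first" "$second" are expanded by the shell, not by awk.
    awk -v sep="-----------------------------------------------------------------" '
    /^Table Name/ { table = $3 ; tables[table]++ }
    ! NF || /^-+/ { next }
    NR == FNR     { stats1[table] = stats1[table] $0 ORS ; next }
                  { stats2[table] = stats2[table] $0 ORS }
    END {
       for (t in tables) {
          if (stats1[t] != stats2[t]) {
             out = sep ORS
             out = out stats1[t] ORS stats2[t]
             print out
          }
       }
    }
    ' "$first" "$second"
}
```

A call such as `compare_files /home/frk/ragav/stats1 /home/frk/ragav/stats2 >> result.out` would then work with full paths. Quoting the variables keeps paths containing spaces intact.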

Jean-Pierre.