Differences between 2 Flat Files and process the differences


 
# 1  
Old 07-17-2010

Hi
Hope you are having a great weekend! I have a question and need your expertise:

I have two files, File1 and File2, with the same structure, which I need to compare on some columns. I need to find the rows that are in File2 but not in File1 and put the differences in another file, which I then need to process.

File 1
Code:
Field1 Field2 Field3 Field4 Field5 Field6
1      2      3      4      5      6
1      23     3      43     12     15

File 2
Code:
Field1 Field2 Field3 Field4  Field5 Field6
1      2      3      4       5      6
1      23     3      43      5      6 (not in diff: only fields 5 and 6 have changed, and they are not part of the comparison)
1      23     3      12      5      6
31     43     54     5       7      8

Diff File
Code:
Field1 Field2 Field3  Field4  Field5 Field6
31     43     54      5       7      8 (in diff, as it is in File2 and not in File1)
1      23     3       12      5      6 (in diff, as field 4 has changed, so this row is in File2 and not in File1)

So I need to compare only the first 4 fields (i.e. if any of those 4 fields differ, the row goes into the diff file to be processed).


Any help will be greatly appreciated!

Thanks
J

Last edited by Scott; 07-17-2010 at 01:51 PM.. Reason: Code tags, please...
# 2  
Old 07-17-2010
Hi

Code:
awk 'NR==FNR{a[i++]=$1" "$2" "$3" "$4;next;}{x=$1" "$2" "$3" "$4; for (j in a){if (a[j] == x)next;}}1' i=1 j=1 file1 file2 > diff_file
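
To try the one-liner against the sample data from the first post, something like the following sketch can be used. It assumes a scratch directory created with mktemp and leaves out the parenthetical notes from the sample rows; the file names file1, file2 and diff_file are the ones used in the command above.

```shell
# Recreate the sample files from the first post in a scratch directory.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

cat > file1 <<'EOF'
Field1 Field2 Field3 Field4 Field5 Field6
1      2      3      4      5      6
1      23     3      43     12     15
EOF

cat > file2 <<'EOF'
Field1 Field2 Field3 Field4  Field5 Field6
1      2      3      4       5      6
1      23     3      43      5      6
1      23     3      12      5      6
31     43     54     5       7      8
EOF

# The one-liner from this post, unchanged: remember the first four fields of
# every file1 row, then print each file2 row whose first four fields match none.
awk 'NR==FNR{a[i++]=$1" "$2" "$3" "$4;next;}{x=$1" "$2" "$3" "$4; for (j in a){if (a[j] == x)next;}}1' i=1 j=1 file1 file2 > diff_file

cat diff_file
```

With this data diff_file should contain the two rows starting "1 23 3 12" and "31 43 54 5", each with all six columns. The header row is not printed, because its first four fields are identical in both files.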

Guru.
# 3  
Old 07-17-2010

Hi Guru
Thanks for the quick answer. I'm not well versed in awk, so I wanted to confirm my understanding.

Basically, when you say:
Code:
'NR==FNR{a[i++]=$1" "$2" "$3" "$4;next;}

you are taking the first file and putting all its data into the array a[i], so a[1] will hold the first row of the first file, and so on until the whole of the first file is in a.

Now, when you say:
Code:
{x=$1" "$2" "$3" "$4; for (j in a){if (a[j] == x)next;}}1'

this is the part where NR is not equal to FNR, i.e. you have started reading the second file. You assign x the first row of the second file, then use the for loop to compare each row of the first file with that row. If they are equal, you continue to the next row; otherwise the entry goes into the diff file. You do this for each row in the second file.
Code:
i=1 j=1 file1 file2 > diff_file

In this part you have initialized i and j.


It would be great if you could confirm.


Thanks

Last edited by Scott; 07-17-2010 at 01:52 PM.. Reason: Code tags and formatting
# 4  
Old 07-17-2010
Hi
You got it right except for the first part. In the array we are not storing the entire row, only the first 4 fields of every row, since you are interested in comparing only those 4. The same holds for x as well.
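
A small trace can make both points visible: how NR and FNR behave across the two files, and what actually ends up in the array. This is only a sketch with two made-up rows per file; the output file name trace is an assumption for illustration.

```shell
workdir=$(mktemp -d) && cd "$workdir" || exit 1

printf '%s\n' '1 2 3 4 5 6' '1 23 3 43 12 15' > file1
printf '%s\n' '1 2 3 4 5 6' '31 43 54 5 7 8'  > file2

# While NR == FNR awk is still reading file1. Once file2 starts, FNR restarts
# at 1 but NR keeps counting, so the first block no longer fires.
awk 'NR==FNR { a[i++] = $1" "$2" "$3" "$4; next }
     { print "file2 line", FNR, "(NR=" NR ") key:", $1" "$2" "$3" "$4 }
     END { for (k = 1; k < i; k++) print "a[" k "] =", a[k] }' i=1 file1 file2 > trace

cat trace
```

The last two lines of the trace should read "a[1] = 1 2 3 4" and "a[2] = 1 23 3 43": fields 5 and 6 of file1 never reach the array.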

Guru.
# 5  
Old 07-17-2010

Hey Guru

Sorry for the confusion, but what I meant to ask was this:
Code:
'NR==FNR{a[i++]=$1" "$2" "$3" "$4;next;}

With this you store the first 4 columns of every row of the first file,

and then you move on to the next part:
Code:
{x=$1" "$2" "$3" "$4; for (j in a){if (a[j] == x)next;}}1'

where you store the first 4 columns of each row of the second file in x and compare them against all rows of the first file (first 4 columns only), one by one.

What does the trailing 1 mean? And after running this awk script, will the final file have only 4 columns or the whole structure of the file? (It should compare only the first 4 columns, but the final file should have all 6 columns.)


Thanks
J

Moderator's Comments: Please use code tags.

Last edited by Scott; 07-17-2010 at 01:53 PM.. Reason: Code tags
# 6  
Old 07-17-2010
Something like this?
Code:
awk 'NR==FNR{a[$1,$2,$3,$4]=$0; next}   # comma subscript joins the fields with SUBSEP, so 1 23 and 12 3 stay distinct
!(($1,$2,$3,$4) in a)' file1 file2 > diff_file
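
The lookup approach can be checked against the sample data from the first post. The sketch below assumes a comma-separated subscript, which awk joins with SUBSEP, so that keys such as "1 23 3 43" and "12 3 3 43" cannot run together into the same string. The array name seen and the file copy_of_file2 are made up for illustration. It also touches the earlier question about the trailing 1: a pattern with no action prints the whole current line, and 1 is simply a pattern that is always true.

```shell
workdir=$(mktemp -d) && cd "$workdir" || exit 1

cat > file1 <<'EOF'
Field1 Field2 Field3 Field4 Field5 Field6
1      2      3      4      5      6
1      23     3      43     12     15
EOF

cat > file2 <<'EOF'
Field1 Field2 Field3 Field4  Field5 Field6
1      2      3      4       5      6
1      23     3      43      5      6
1      23     3      12      5      6
31     43     54     5       7      8
EOF

# Pass 1 (file1): record each four-field key. Pass 2 (file2): the second rule
# is a bare pattern, so every line it matches is printed in full.
awk 'NR==FNR { seen[$1,$2,$3,$4]; next }
     !(($1,$2,$3,$4) in seen)' file1 file2 > diff_file

# A bare 1 is an always-true pattern with the default action: print the line.
awk '1' file2 > copy_of_file2
cmp -s file2 copy_of_file2 && echo "awk 1 copied file2 unchanged"

cat diff_file
```

Only the key is built from four fields; the printed line is the untouched input record, so diff_file keeps all six columns.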
