Using AWK to compare 2 files


 
# 1  
Old 04-06-2010

Hi

How can I use awk to compare specific columns in 2 files and print the differences?

I currently have this:


Code:
BEGIN {
    OFS = FS = ","
}
NR == FNR {
    b[$2] = $3
    next
}
{
    e = ""
    for (x in b) {
        if (match($1, x)) {
            if (RSTART == 1 && RLENGTH > length(e)) {
                e = x
            }
        }
    }
    print $0, e, b[e]
}



It compares all fields against each other. What I need is to compare column 1 of file1 against column 1 of file2, column 2 against column 2, and so forth, but column 4 must be excluded from the comparison.

Thank you for your help.

Last edited by pludi; 04-06-2010 at 07:50 AM.. Reason: code tags, please...
# 2  
Old 04-06-2010
To paraphrase:
You have two files, each with the same number of columns, say n.
You want to compare file1 and file2 row by row and column by column, ignoring column 4.
Then I think you want to print the differences, but in what format?

Could you give us a dummy "file1" and "file2" along with the desired output?
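
The comparison paraphrased above could be sketched roughly as follows. This is only a sketch under assumptions: both files are comma-separated (as in the BEGIN block of the first post), they have the same rows in the same order, and the output format is invented for illustration because the desired output has not been stated. The function name compare_cols and the skip variable are hypothetical.

```shell
#!/bin/sh
# compare_cols FILE1 FILE2 [SKIP]
# Hypothetical helper: reports fields that differ between the same rows of two
# comma-separated files, ignoring column SKIP (default 4).
compare_cols() {
    awk -F, -v skip="${3:-4}" '
        # First file: remember every line under its line number.
        NR == FNR { line[FNR] = $0; next }
        # Second file: split the remembered line and compare field by field.
        {
            n = split(line[FNR], old, FS)
            max = (n > NF) ? n : NF
            for (i = 1; i <= max; i++) {
                if (i == skip)
                    continue
                # Appending "" forces a string comparison, so 1 and 1.0 differ.
                if ((old[i] "") != ($i ""))
                    printf "line %d, column %d: %s -> %s\n", FNR, i, old[i], $i
            }
        }
    ' "$1" "$2"
}
```

It would be called as compare_cols file1 file2, or compare_cols file1 file2 7 to skip a different column. Holding the first file in memory keeps the script short; for very large files a different approach may be preferable.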
# 3  
Old 04-07-2010
File1:

Code:
User Parameter;Entity Name;Cell ID;Type SubCell / TX / FHSY / DRI / TDMA;Instance CHGR / TX / FHSY / DRI / TDMA;CellR ID;Vendor Parameter;Planned Value;Translated Value;Network Value;Override Level;Override Node;Override Value;Planned Date;Network Date;Vendor;Technology;Version
Access Grant Blocks Reserved;CELL;1A;;;;AGBLK;5;5;1;CELL;1A;;4/6/2010 9:01:17 AM;1/12/2010 5:15:15 AM;ERICSSON;GSM;07B
Access Grant Blocks Reserved;CELL;1B;;;;AGBLK;2;2;1;CELL;1B;;4/6/2010 9:01:17 AM;1/12/2010 5:15:15 AM;ERICSSON;GSM;07B
Access Grant Blocks Reserved;CELL;1C;;;;AGBLK;3;3;1;CELL;1C;;4/6/2010 9:01:17 AM;1/12/2010 5:15:15 AM;ERICSSON;GSM;07B
Access Grant Blocks Reserved;CELL;2A;;;;AGBLK;7;7;1;CELL;2A;;4/6/2010 9:01:17 AM;1/12/2010 5:15:15 AM;ERICSSON;GSM;08B
Access Grant Blocks Reserved;CELL;2B;;;;AGBLK;4;4;1;CELL;2B;;4/6/2010 9:01:17 AM;1/12/2010 5:15:15 AM;ERICSSON;GSM;08B
Access Grant Blocks Reserved;CELL;2C;;;;AGBLK;0;0;1;CELL;2C;;4/6/2010 9:01:17 AM;1/12/2010 5:15:15 AM;ERICSSON;GSM;08B
Access Grant Blocks Reserved;CELL;3A;;;;AGBLK;1;1;1;CELL;3A;;4/6/2010 9:01:17 AM;1/12/2010 5:15:15 AM;ERICSSON;GSM;08B
Access Grant Blocks Reserved;CELL;3B;;;;AGBLK;6;6;1;CELL;3B;;4/6/2010 9:01:17 AM;1/12/2010 5:15:15 AM;ERICSSON;GSM;08B
Access Grant Blocks Reserved;CELL;3C;;;;AGBLK;4;4;1;CELL;3C;;4/6/2010 9:01:17 AM;1/12/2010 5:15:15 AM;ERICSSON;GSM;08B


File2:

Code:
User Parameter;Entity Name;Cell ID;Type SubCell / TX / FHSY / DRI / TDMA;Instance CHGR / TX / FHSY / DRI / TDMA;CellR ID;Vendor Parameter;Planned Value;Translated Value;Network Value;Override Level;Override Node;Override Value;Planned Date;Network Date;Vendor;Technology;Version
Access Grant Blocks Reserved;CELL;1A;;;;AGBLK;0;0;1;CELL;1A;;4/6/2010 11:15:44 AM;1/12/2010 5:15:15 AM;ERICSSON;GSM;07B
Access Grant Blocks Reserved;CELL;1B;;;;AGBLK;2;2;1;CELL;1B;;4/6/2010 11:15:44 AM;1/12/2010 5:15:15 AM;ERICSSON;GSM;07B
Access Grant Blocks Reserved;CELL;1C;;;;AGBLK;3;3;1;CELL;1C;;4/6/2010 11:15:44 AM;1/12/2010 5:15:15 AM;ERICSSON;GSM;07B
Access Grant Blocks Reserved;CELL;2A;;;;AGBLK;7;7;1;CELL;2A;;4/6/2010 11:15:44 AM;1/12/2010 5:15:15 AM;ERICSSON;GSM;08B
Access Grant Blocks Reserved;CELL;2B;;;;AGBLK;4;4;1;CELL;2B;;4/6/2010 11:15:44 AM;1/12/2010 5:15:15 AM;ERICSSON;GSM;08B
Access Grant Blocks Reserved;CELL;2C;;;;AGBLK;0;0;1;CELL;2C;;4/6/2010 11:15:44 AM;1/12/2010 5:15:15 AM;ERICSSON;GSM;08B
Access Grant Blocks Reserved;CELL;3A;;;;AGBLK;1;1;1;CELL;3A;;4/6/2010 11:15:44 AM;1/12/2010 5:15:15 AM;ERICSSON;GSM;08B
Access Grant Blocks Reserved;CELL;3B;;;;AGBLK;6;6;1;CELL;3B;;4/6/2010 11:15:44 AM;1/12/2010 5:15:15 AM;ERICSSON;GSM;08B
Access Grant Blocks Reserved;CELL;3C;;;;AGBLK;4;4;1;CELL;3C;;4/6/2010 11:15:44 AM;1/12/2010 5:15:15 AM;ERICSSON;GSM;08B


They're basically the same file with the same columns, except that the datetime has changed and I've changed one other value to check whether the diff picks it up.

Here's what I have done so far:

I removed the datetime columns in this way:

Code:
#!/bin/bash
#---------------------------------------------------------------------
# This script blanks the datetime columns of the old file ($1) and
# writes the result, comma-separated, to a new file ($2).
# In the sample header, fields 14 and 15 are Planned Date and
# Network Date; field 13 is Override Value.
#---------------------------------------------------------------------
awk 'BEGIN { FS = ";"; OFS = "," } { $13 = $14 = $15 = "" } 1' "$1" > "$2"

I then intend to call the script from the command line in this way:

Code:
./script.sh oldfile1.csv newfile1.csv
./script.sh oldfile2.csv newfile2.csv

I do this for both files.

Then I use this command to compare the two files (this must also be in a script):

Code:
diff newfile1.csv newfile2.csv > resultfile

I want to compare the two files excluding the columns that generate the date time stamp.

I only need to see the changes made between file1 and file2. The timestamp columns must be left out, because they always change and every row would otherwise be reported as different.

I need all these different commands in scripts.

Can you please help me?

Last edited by Franklin52; 04-07-2010 at 03:10 AM.. Reason: Please use code tags!
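
One possible way to get a diff that ignores the timestamps is to cut the date fields out of both files and run plain diff on what is left. A minimal sketch, assuming the semicolon-separated layout of the samples above, where fields 14 and 15 are Planned Date and Network Date. The function name diff_without_dates is made up for illustration, and mktemp is assumed to be available on the system.

```shell
#!/bin/sh
# diff_without_dates OLD NEW
# Hypothetical helper: drops fields 14 and 15 (the two date columns in the
# sample header) from both files, then runs diff on the stripped copies.
diff_without_dates() {
    a=$(mktemp) && b=$(mktemp) || return 2
    # Keep fields 1-13 and 16 onward; the input files are left untouched.
    cut -d';' -f1-13,16- "$1" > "$a"
    cut -d';' -f1-13,16- "$2" > "$b"
    diff "$a" "$b"
    status=$?          # 0 = no differences, 1 = differences found
    rm -f "$a" "$b"
    return "$status"
}
```

It could be called as diff_without_dates oldfile.csv newfile.csv > resultfile. With the two sample files above, only the 1A row should be reported, since it is the only row whose non-date fields differ (Planned Value and Translated Value 5 in File1, 0 in File2).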
 