File Difference Problem


 
# 1  
Old 07-11-2011

Hey all,

I have a scenario where I have two files.

File 3rdjuly: this has 3 records (3 rows)

Code:
~||~172.16.44.102~||~-~||~Unauthenticated~||~[17/Jun/2011:09:02:03 +0100]~||~GET /NestWeb/faces/public/MUA/pages/loginPage.xhtml HTTP/1.1~||~200~||~48583~||~-~||~Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET4.0C)~||~-~||~0~||~258828~||~
~||~172.16.44.102~||~-~||~Unauthenticated~||~[17/Jun/2011:09:02:03 +0100]~||~GET /NestWeb/includes/common/css/print.css HTTP/1.1~||~200~||~-~||~https://tcsppdsw/schemeweb/NestWeb/faces/public/MUA/pages/loginPage.xhtml~||~Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET4.0C)~||~org.apache.myfaces.shared_impl.context.flash.FlashImpl.POSTBACKMAP.KEY=-mmdxm8rb8; JSESSIONID=Qw5nN7JSLBgnW2LbGHJd72nMPBT1TNzP8bH13WJwpQvS4QggNm64!-63489622~||~0~||~553~||~
~||~172.16.44.102~||~-~||~Unauthenticated~||~[17/Jun/2011:09:02:03 +0100]~||~GET /NestWeb/includes/common/css/main.css HTTP/1.1~||~200~||~180945~||~https://tcsppdsw/schemeweb/NestWeb/faces/public/MUA/pages/loginPage.xhtml~||~Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET4.0C)~||~org.apache.myfaces.shared_impl.context.flash.FlashImpl.POSTBACKMAP.KEY=-mmdxm8rb8; JSESSIONID=Qw5nN7JSLBgnW2LbGHJd72nMPBT1TNzP8bH13WJwpQvS4QggNm64!-63489622~||~0~||~782~||~


File 4thjuly: this has 5 records (5 rows)

Code:
~||~172.16.44.102~||~-~||~Unauthenticated~||~[17/Jun/2011:09:02:03 +0100]~||~GET /NestWeb/faces/public/MUA/pages/loginPage.xhtml HTTP/1.1~||~200~||~48583~||~-~||~Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET4.0C)~||~-~||~0~||~258828~||~
~||~172.16.44.102~||~-~||~Unauthenticated~||~[17/Jun/2011:09:02:03 +0100]~||~GET /NestWeb/includes/common/css/print.css HTTP/1.1~||~200~||~-~||~https://tcsppdsw/schemeweb/NestWeb/faces/public/MUA/pages/loginPage.xhtml~||~Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET4.0C)~||~org.apache.myfaces.shared_impl.context.flash.FlashImpl.POSTBACKMAP.KEY=-mmdxm8rb8; JSESSIONID=Qw5nN7JSLBgnW2LbGHJd72nMPBT1TNzP8bH13WJwpQvS4QggNm64!-63489622~||~0~||~553~||~
~||~172.16.44.102~||~-~||~Unauthenticated~||~[17/Jun/2011:09:02:03 +0100]~||~GET /NestWeb/includes/common/css/main.css HTTP/1.1~||~200~||~180945~||~https://tcsppdsw/schemeweb/NestWeb/faces/public/MUA/pages/loginPage.xhtml~||~Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET4.0C)~||~org.apache.myfaces.shared_impl.context.flash.FlashImpl.POSTBACKMAP.KEY=-mmdxm8rb8; JSESSIONID=Qw5nN7JSLBgnW2LbGHJd72nMPBT1TNzP8bH13WJwpQvS4QggNm64!-63489622~||~0~||~782~||~
~||~172.16.44.102~||~-~||~Unauthenticated~||~[17/Jun/2011:09:02:04 +0100]~||~GET /NestWeb/includes/MUA/css/IE.css HTTP/1.1~||~200~||~44~||~https://tcsppdsw/schemeweb/NestWeb/faces/public/MUA/pages/loginPage.xhtml~||~Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET4.0C)~||~org.apache.myfaces.shared_impl.context.flash.FlashImpl.POSTBACKMAP.KEY=-mmdxm8rb8; JSESSIONID=Qw5nN7JSLBgnW2LbGHJd72nMPBT1TNzP8bH13WJwpQvS4QggNm64!-63489622~||~0~||~603~||~
~||~172.16.44.102~||~-~||~Unauthenticated~||~[17/Jun/2011:09:02:04 +0100]~||~GET /NestWeb/includes/common/js/jquery-1.4.min.js HTTP/1.1~||~200~||~69838~||~https://tcsppdsw/schemeweb/NestWeb/faces/public/MUA/pages/loginPage.xhtml~||~Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET4.0C)~||~org.apache.myfaces.shared_impl.context.flash.FlashImpl.POSTBACKMAP.KEY=-mmdxm8rb8; JSESSIONID=Qw5nN7JSLBgnW2LbGHJd72nMPBT1TNzP8bH13WJwpQvS4QggNm64!-63489622~||~0~||~525~||~

-----------------------------

The 4thjuly file contains all of the 3rdjuly records plus some new records. These are the flat files I am currently using to load my warehouse.

I want a program that compares the 4thjuly and 3rdjuly files and extracts only the new records from 4thjuly (those not present in 3rdjuly) into a new file, say 'NEWFILE'.

Please help me out with this.

Last edited by pludi; 07-11-2011 at 08:25 AM..
# 2  
Old 07-11-2011
Try the code below:


Code:
rm -f newfile.txt
while read line
do
  line1=`grep -ie "${line}" 3rdjuly.txt`
  if [ $? -ne 0 ] ; then
    echo "$line" >> newfile.txt
  fi
done < 4thjuly.txt


You can also try the command below. It writes the differences between 3rdjuly.txt and 4thjuly.txt into newfile.txt. Since 4thjuly.txt contains the 3rdjuly.txt records as well as a few new records, the new records will end up in newfile.txt.

Code:
diff 3rdjuly.txt 4thjuly.txt > newfile.txt


Cheers
Harish

Last edited by harish612; 07-11-2011 at 08:45 AM..
# 3  
Old 07-11-2011
Question

Hi Harish,

After executing the code block you gave me, I got the first 4 rows from the 4thjuly file. This wasn't what I was expecting.

Dear sir, I need the records from the 4thjuly file which aren't part of the 3rdjuly file.

The output of the code you gave was:

Code:
 
~||~172.16.44.102~||~-~||~Unauthenticated~||~[17/Jun/2011:09:02:03 +0100]~||~GET /NestWeb/faces/public/MUA/pages/loginPage.xhtml HTTP/1.1~||~200~||~48583~||~-~||~Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET4.0C)~||~-~||~0~||~258828~||~
~||~172.16.44.102~||~-~||~Unauthenticated~||~[17/Jun/2011:09:02:03 +0100]~||~GET /NestWeb/includes/common/css/print.css HTTP/1.1~||~200~||~-~||~https://tcsppdsw/schemeweb/NestWeb/faces/public/MUA/pages/loginPage.xhtml~||~Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET4.0C)~||~org.apache.myfaces.shared_impl.context.flash.FlashImpl.POSTBACKMAP.KEY=-mmdxm8rb8; JSESSIONID=Qw5nN7JSLBgnW2LbGHJd72nMPBT1TNzP8bH13WJwpQvS4QggNm64!-63489622~||~0~||~553~||~
~||~172.16.44.102~||~-~||~Unauthenticated~||~[17/Jun/2011:09:02:03 +0100]~||~GET /NestWeb/includes/common/css/main.css HTTP/1.1~||~200~||~180945~||~https://tcsppdsw/schemeweb/NestWeb/faces/public/MUA/pages/loginPage.xhtml~||~Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET4.0C)~||~org.apache.myfaces.shared_impl.context.flash.FlashImpl.POSTBACKMAP.KEY=-mmdxm8rb8; JSESSIONID=Qw5nN7JSLBgnW2LbGHJd72nMPBT1TNzP8bH13WJwpQvS4QggNm64!-63489622~||~0~||~782~||~
~||~172.16.44.102~||~-~||~Unauthenticated~||~[17/Jun/2011:09:02:04 +0100]~||~GET /NestWeb/includes/MUA/css/IE.css HTTP/1.1~||~200~||~44~||~https://tcsppdsw/schemeweb/NestWeb/faces/public/MUA/pages/loginPage.xhtml~||~Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET4.0C)~||~org.apache.myfaces.shared_impl.context.flash.FlashImpl.POSTBACKMAP.KEY=-mmdxm8rb8; JSESSIONID=Qw5nN7JSLBgnW2LbGHJd72nMPBT1TNzP8bH13WJwpQvS4QggNm64!-63489622~||~0~||~603~||~
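
A likely reason the loop from post #2 prints old records: `grep -e "${line}"` treats each record as a regular expression, and the records contain characters such as `[`, `]` and `.` that are special in a pattern. The timestamp `[17/Jun/2011:09:02:03 +0100]`, for example, is read as a bracket expression matching a single character, so a record does not match its own identical copy. A minimal sketch that compares whole lines as fixed strings instead; the shortened records and the `demo_grep` directory are made-up stand-ins for the real files:

```shell
# Made-up sample data standing in for the real log files (assumption).
mkdir -p demo_grep
printf '%s\n' '~||~[17/Jun/2011:09:02:03 +0100]~||~GET /a.css~||~' \
              '~||~[17/Jun/2011:09:02:03 +0100]~||~GET /b.css~||~' > demo_grep/3rdjuly.txt
printf '%s\n' '~||~[17/Jun/2011:09:02:03 +0100]~||~GET /a.css~||~' \
              '~||~[17/Jun/2011:09:02:03 +0100]~||~GET /b.css~||~' \
              '~||~[17/Jun/2011:09:02:04 +0100]~||~GET /c.js~||~' > demo_grep/4thjuly.txt

# -F: patterns are fixed strings, not regular expressions
# -x: a pattern must match the whole line
# -v: print the lines that match none of the patterns
# -f: read the patterns (the old records) from 3rdjuly.txt
grep -Fxv -f demo_grep/3rdjuly.txt demo_grep/4thjuly.txt > demo_grep/newfile.txt

cat demo_grep/newfile.txt
```

This also avoids starting one `grep` process per record. It only works if old and new copies of a record are byte-for-byte identical, including trailing whitespace.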

# 4  
Old 07-11-2011
Try the command below. It writes the differences between 3rdjuly.txt and 4thjuly.txt into newfile.txt. Since 4thjuly.txt contains the 3rdjuly.txt records as well as a few new records, the new records will end up in newfile.txt.


Code:
diff 3rdjuly.txt 4thjuly.txt > newfile.txt


Cheers
Harish
# 5  
Old 07-11-2011
Sir, I got the output as:

Code:
 
3c3,5
< ~||~172.16.44.102~||~-~||~Unauthenticated~||~[17/Jun/2011:09:02:03 +0100]~||~GET /NestWeb/includes/common/css/main.css HTTP/1.1~||~200~||~180945~||~https://tcsppdsw/schemeweb/NestWeb/faces/public/MUA/pages/loginPage.xhtml~||~Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET4.0C)~||~org.apache.myfaces.shared_impl.context.flash.FlashImpl.POSTBACKMAP.KEY=-mmdxm8rb8; JSESSIONID=Qw5nN7JSLBgnW2LbGHJd72nMPBT1TNzP8bH13WJwpQvS4QggNm64!-63489622~||~0~||~782~||~
---
> ~||~172.16.44.102~||~-~||~Unauthenticated~||~[17/Jun/2011:09:02:03 +0100]~||~GET /NestWeb/includes/common/css/main.css HTTP/1.1~||~200~||~180945~||~https://tcsppdsw/schemeweb/NestWeb/faces/public/MUA/pages/loginPage.xhtml~||~Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET4.0C)~||~org.apache.myfaces.shared_impl.context.flash.FlashImpl.POSTBACKMAP.KEY=-mmdxm8rb8; JSESSIONID=Qw5nN7JSLBgnW2LbGHJd72nMPBT1TNzP8bH13WJwpQvS4QggNm64!-63489622~||~0~||~782~||~
> ~||~172.16.44.102~||~-~||~Unauthenticated~||~[17/Jun/2011:09:02:04 +0100]~||~GET /NestWeb/includes/MUA/css/IE.css HTTP/1.1~||~200~||~44~||~https://tcsppdsw/schemeweb/NestWeb/faces/public/MUA/pages/loginPage.xhtml~||~Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET4.0C)~||~org.apache.myfaces.shared_impl.context.flash.FlashImpl.POSTBACKMAP.KEY=-mmdxm8rb8; JSESSIONID=Qw5nN7JSLBgnW2LbGHJd72nMPBT1TNzP8bH13WJwpQvS4QggNm64!-63489622~||~0~||~603~||~
> ~||~172.16.44.102~||~-~||~Unauthenticated~||~[17/Jun/2011:09:02:04 +0100]~||~GET /NestWeb/includes/common/js/jquery-1.4.min.js HTTP/1.1~||~200~||~69838~||~https://tcsppdsw/schemeweb/NestWeb/faces/public/MUA/pages/loginPage.xhtml~||~Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET4.0C)~||~org.apache.myfaces.shared_impl.context.flash.FlashImpl.POSTBACKMAP.KEY=-mmdxm8rb8; JSESSIONID=Qw5nN7JSLBgnW2LbGHJd72nMPBT1TNzP8bH13WJwpQvS4QggNm64!-63489622~||~0~||~525~||~

I am expecting an output of

Code:
 
~||~172.16.44.102~||~-~||~Unauthenticated~||~[17/Jun/2011:09:02:04 +0100]~||~GET /NestWeb/includes/MUA/css/IE.css HTTP/1.1~||~200~||~44~||~https://tcsppdsw/schemeweb/NestWeb/faces/public/MUA/pages/loginPage.xhtml~||~Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET4.0C)~||~org.apache.myfaces.shared_impl.context.flash.FlashImpl.POSTBACKMAP.KEY=-mmdxm8rb8; JSESSIONID=Qw5nN7JSLBgnW2LbGHJd72nMPBT1TNzP8bH13WJwpQvS4QggNm64!-63489622~||~0~||~603~||~
~||~172.16.44.102~||~-~||~Unauthenticated~||~[17/Jun/2011:09:02:04 +0100]~||~GET /NestWeb/includes/common/js/jquery-1.4.min.js HTTP/1.1~||~200~||~69838~||~https://tcsppdsw/schemeweb/NestWeb/faces/public/MUA/pages/loginPage.xhtml~||~Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET4.0C)~||~org.apache.myfaces.shared_impl.context.flash.FlashImpl.POSTBACKMAP.KEY=-mmdxm8rb8; JSESSIONID=Qw5nN7JSLBgnW2LbGHJd72nMPBT1TNzP8bH13WJwpQvS4QggNm64!-63489622~||~0~||~525~||~
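
In the `diff` output above, the `3c3,5` hunk lists the `main.css` record on both the `<` and the `>` side. That suggests the two copies differ in something invisible, perhaps trailing whitespace or a carriage return (an assumption; the thread does not say). A sketch in awk that strips trailing blanks and carriage returns before comparing, then prints only the records of the second file that were not seen in the first; the sample records and the `demo_awk` directory are made up:

```shell
# Made-up sample data (assumption): the second old record has a trailing space.
mkdir -p demo_awk
printf '%s\n' 'rec1~||~' 'rec2~||~ ' > demo_awk/3rdjuly.txt
printf '%s\n' 'rec1~||~' 'rec2~||~' 'rec3~||~' 'rec4~||~' > demo_awk/4thjuly.txt

awk '
  { key = $0; sub(/[ \t\r]+$/, "", key) }   # normalized copy used for comparison
  NR == FNR { seen[key] = 1; next }         # first file: remember every record
  !(key in seen)                            # second file: print unseen records
' demo_awk/3rdjuly.txt demo_awk/4thjuly.txt > demo_awk/NEWFILE

cat demo_awk/NEWFILE
```

Unlike `diff`, this does not depend on record order and emits no `<`, `>` or hunk markers. One caveat: if the first file is empty, `NR == FNR` also holds for the second file and nothing is printed.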

# 6  
Old 07-11-2011
Quote:
Originally Posted by Amit Gupta
I want a program that compares the 4thjuly and 3rdjuly files and extracts only the new records from 4thjuly (those not present in 3rdjuly) into a new file, say 'NEWFILE'.

Please help me out with this.

If and only if the new records always appear at the end of the file, after all of the old records, you can use:
Code:
tail -n +$(($(wc -l < 3rdjuly)+1)) 4thjuly > NEWFILE
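
Since the `tail` shortcut silently gives wrong results when the assumption does not hold, it may be worth checking first that the new file really starts with the old one. A sketch of that check; the sample records and the `demo_tail` directory are made up for illustration:

```shell
# Made-up sample data (assumption): the new file is the old file plus two records.
mkdir -p demo_tail
printf '%s\n' 'old1' 'old2' 'old3' > demo_tail/3rdjuly
printf '%s\n' 'old1' 'old2' 'old3' 'new1' 'new2' > demo_tail/4thjuly

# Number of old records; the arithmetic expansion strips any padding from wc.
old_count=$(( $(wc -l < demo_tail/3rdjuly) ))

# Only trust the shortcut if the first old_count lines of the new file
# are byte-for-byte the old file.
if head -n "$old_count" demo_tail/4thjuly | cmp -s - demo_tail/3rdjuly; then
  tail -n +"$(( old_count + 1 ))" demo_tail/4thjuly > demo_tail/NEWFILE
else
  echo "4thjuly does not begin with the records of 3rdjuly" >&2
fi

cat demo_tail/NEWFILE
```

When the check fails, a set-based comparison that does not depend on record position is the safer choice.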

Regards,
Alister
# 7  
Old 07-11-2011
If the records are sorted in the same order in both files, you can do:
Code:
comm -13 3rdjuly.txt  4thjuly.txt
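
`comm` expects both inputs to be sorted in the same collating order, which access logs written in arrival order may not be. A sketch that sorts copies of the files first and uses `-13`, which keeps only the lines unique to the second file; the one-letter records, the `demo_comm` directory and the `*.sorted` file names are made up for illustration:

```shell
# Made-up, deliberately unsorted sample data (assumption).
mkdir -p demo_comm
printf '%s\n' 'b' 'a' > demo_comm/3rdjuly.txt
printf '%s\n' 'b' 'a' 'd' 'c' > demo_comm/4thjuly.txt

# Sort both files the same way; LC_ALL=C gives a plain byte-order collation.
LC_ALL=C sort demo_comm/3rdjuly.txt > demo_comm/old.sorted
LC_ALL=C sort demo_comm/4thjuly.txt > demo_comm/new.sorted

# -1 drops lines found only in the old file, -3 drops lines common to both,
# leaving the lines found only in the new file (in sorted order).
LC_ALL=C comm -13 demo_comm/old.sorted demo_comm/new.sorted > demo_comm/NEWFILE

cat demo_comm/NEWFILE
```

The new records come out in sorted order rather than in their original order, which may or may not matter for a warehouse load.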

Jean-Pierre.