Shell script to compare two files


 
# 1  
Old 04-27-2011

I have two files, file A and file B. I need every entry in file A to be compared with file B, line by line. If the entry exists in file B, save it to file C; if not, save it to file D.

Note: all columns of each line in file A need to be compared except the last two (date and time).

file A
Code:
dbclstr-b IXT_Web Memphis_Prod_SQL_Full Memphis-Prod-SQL-Full-Application-Backup 04/23/11 11:24:41
ebs-sql1-b EBSCClaimStore Memphis_Prod_SQL_Diff Memphis-Prod-SQL-Inc-Application-Backup 04/24/11 19:58:22
pmemcfdb001-b ERTL Memphis_Prod_SQL_Full Memphis-Prod-SQL-Full-Application-Backup 04/23/11 11:03:18

file B
Code:
dbclstr-b IXTProd02 Memphis_Prod_SQL_Diff Memphis-Prod-SQL-Inc-Application-Backup 04/24/11 21:49:14
pmemcfdb001-b ERTL Memphis_Prod_SQL_Full Memphis-Prod-SQL-Full-Application-Backup 04/23/11 17:51:12
pmemcfdb001-b ERTL Memphis_Prod_SQL_Full Memphis-Prod-SQL-Full-Application-Backup 04/23/11 17:47:53
qbnawldb021-b AetnaLTC Memphis_Corp_SQL_Full Memphis-Corp-SQL-Full-Application-Backup 04/23/11 17:45:20
ebs-sql1-b EBSCClaimStore Memphis_Prod_SQL_Diff Memphis-Prod-SQL-Inc-Application-Backup 04/23/11 19:58:22

desired output

file C (entries of file A that exist in file B)
Code:
ebs-sql1-b EBSCClaimStore Memphis_Prod_SQL_Diff Memphis-Prod-SQL-Inc-Application-Backup 04/24/11 19:58:22
pmemcfdb001-b ERTL Memphis_Prod_SQL_Full Memphis-Prod-SQL-Full-Application-Backup 04/23/11 11:03:18

file D (entries of file A that do not exist in file B)
Code:
dbclstr-b IXT_Web Memphis_Prod_SQL_Full Memphis-Prod-SQL-Full-Application-Backup 04/23/11 11:24:41

---------- Post updated at 07:06 AM ---------- Previous update was at 07:03 AM ----------

I wrote a script that compares the entries and saves them to fileC if they exist in both files, but I can't work out how to add a condition that saves the entries that do not exist to fileD.

Code:
cat fileB | while read STATUS CLIENT DB POLICY SCHEDULE DATE TIME
do
grep -w "$DB" fileA | grep -w "$CLIENT" | grep -w "$POLICY" | grep -w "$SCHEDULE" | grep -w "$DATE" >> fileC
done


Last edited by pludi; 04-27-2011 at 09:12 AM..
# 2  
Old 04-27-2011
Lazy way but still ...
Code:
awk 'NF>2{NF=NF-2;$1=$1}1' fileA | sort >fileA.s
awk 'NF>2{NF=NF-2;$1=$1}1' fileB | sort >fileB.s
comm -12 fileA.s fileB.s >fileC
comm -23 fileA.s fileB.s >fileD
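
One thing to watch with the comm approach above: fileC and fileD receive the trimmed lines, without the date and time columns shown in the desired output. Below is a minimal awk sketch that keeps the full fileA lines. The function name split_by_key and its argument order are made up for illustration, and it assumes whitespace-separated fields, date and time always in the last two columns, and a non-empty fileB.

```shell
# Hypothetical helper: split_by_key A B C D
# Lines of A whose leading fields (all but the last two) also occur in B
# go to C with their date and time intact; the remaining lines go to D.
split_by_key() {
    : > "$3"; : > "$4"    # make sure both outputs exist, even if empty
    awk -v hit="$3" -v miss="$4" '
        # Join every field except the last two into one comparison key.
        function key(    i, k) {
            k = ""
            for (i = 1; i <= NF - 2; i++) k = k $i " "
            return k
        }
        NR == FNR { seen[key()] = 1; next }    # first file (B): remember its keys
        { id = key(); print > ((id in seen) ? hit : miss) }    # second file (A)
    ' "$2" "$1"
}
```

Usage: split_by_key fileA fileB fileC fileD. Each file is read once and no sorting is needed, so the order of fileA is preserved in both outputs.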

# 3  
Old 04-27-2011
Hi,
I'm not sure about this one, because I get three lines in fileC instead of the two you state, but anyway, it may work as a hint:

Code:
>fileC
>fileD
while read a b c d date time; do  
grep "$a $b $c $d" fileB >> fileC || grep "$a $b $c $d" fileA >> fileD
done < fileA

Meaning: if there's no hit in fileB, run the same grep on fileA and put the result in fileD. I'm sure it could be a lot cleaner, for example without calling grep twice.

Best regards,
Lakris
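
On calling grep only once: here is a sketch that lets grep -q answer yes or no and then writes the line already read from fileA. This would also account for the third line in fileC, since the loop above appends the matching lines of fileB, and fileB holds the pmemcfdb001-b entry twice. The name split_with_grep is hypothetical. The sketch assumes single-space-separated fields and, like the original, matches the four columns anywhere in a fileB line.

```shell
# Hypothetical helper: split_with_grep A B C D
split_with_grep() {
    : > "$3"; : > "$4"    # start with empty output files
    while read -r a b c d date time; do
        # -F: treat the pattern as a literal string; -q: exit status only.
        # The trailing space keeps column 4 from matching a longer value.
        if grep -qF -- "$a $b $c $d " "$2"; then
            printf '%s\n' "$a $b $c $d $date $time" >> "$3"
        else
            printf '%s\n' "$a $b $c $d $date $time" >> "$4"
        fi
    done < "$1"
}
```

Usage: split_with_grep fileA fileB fileC fileD. This still starts one grep per line of fileA, so it suits small files best.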
# 4  
Old 04-27-2011
You could also try this logic, which builds a temp file from fileB without the last two fields and then just uses grep -f. If fileB is large, though, the script may be a little lacking in performance. I've used the shell's built-in parameter expansion to trim each record, rather than some convoluted echo $line through a field counter, subtract two, then echo $line | cut -f -$wanted. That approach spawns several processes for each record and is a lot slower, but I've seen it quite a lot elsewhere, and I probably used it myself before I found a better way:
Code:
#!/bin/ksh


{ cat fileB | while read line
do
outline="${line% * *}"
echo $outline
done } > temp-fileB

grep -f temp-fileB fileA > fileC
grep -vf temp-fileB fileA > fileD

Does this help?

Let us know how you get on

Robin
Liverpool/Blackburn
UK
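
Two caveats with grep -f as used above: each line of temp-fileB is treated as a regular expression, and a pattern can also match where column 4 merely begins with the same text. A hedged variant using fixed strings (-F) and a trailing space as a field boundary is sketched below. The name split_with_patterns is made up, mktemp is assumed to be available, and the fields are assumed to be single-space separated with no blank lines in fileB.

```shell
# Hypothetical helper: split_with_patterns A B C D
split_with_patterns() {
    patterns=$(mktemp) || return 1
    # Drop the last two fields of every B record; the trailing space
    # written by printf marks the end of the last compared column.
    while IFS= read -r line; do
        printf '%s \n' "${line% * *}"
    done < "$2" > "$patterns"
    grep -Ff "$patterns" "$1" > "$3"     # A lines containing some pattern
    grep -vFf "$patterns" "$1" > "$4"    # A lines containing none
    rm -f "$patterns"
}
```

Usage: split_with_patterns fileA fileB fileC fileD. Only two grep processes are started in total, however many records the files hold.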

Last edited by rbatte1; 04-27-2011 at 09:03 PM.. Reason: Spelling correction
# 5  
Old 04-28-2011
I was able to do it another way:
Code:
while read CLIENT DB POLICY SCHEDULE DATE TIME
do
  # Discard grep's output; only its exit status decides the branch.
  if grep -w "$DB" fileB | grep -w "$CLIENT" | grep -w "$POLICY" | grep -w "$SCHEDULE" > /dev/null
  then echo "$CLIENT $DB $POLICY $SCHEDULE $DATE $TIME" >> fileC
  else echo "$CLIENT $DB $POLICY $SCHEDULE $DATE $TIME" >> fileD
  fi
done < fileA

---------- Post updated at 06:37 AM ---------- Previous update was at 06:30 AM ----------

Now the situation has become more complicated; we need to apply two more conditions.

INITIAL SETUP (base condition)
The values in columns 1 to 5 of fileA should match fileB.
If they match, put the line in fileC; if not, in fileD.

NOW
The values in columns 1 to 5 of fileA should match fileB,
and
the values in columns 6 and 7 of fileA should be greater than those in fileB.
If both hold, put the line in fileC; if not, in fileD.

I wrote a script:

Code:
cat fileA | while read STATUS CLIENT DB POLICY SCHEDULE DATE TIME ; do
  if  ( grep -w "$DB" fileB | grep -w "$CLIENT" | grep -w "$POLICY" | grep -w "$SCHEDULE" ) ; if ( $6 < "'$DATE'"  )  ;  if ( $7 < "'$TIME'" )
  then echo $STATUS $DB $CLIENT $POLICY $SCHEDULE $DATE $TIME >> fileC
  else echo $STATUS $DB $CLIENT $POLICY $SCHEDULE $DATE $TIME >> fileD
  fi
done


but it's erroring out as below:

line 14: syntax error near unexpected token `done'

Last edited by Scott; 04-28-2011 at 09:44 AM.. Reason: Please use code tags
# 6  
Old 04-28-2011
For a large fileB, you will be spawning lots of grep processes, 4 for each record, and that will take time.

You are also assuming that the dates can be compared as they stand. You will need to reformat them so they come out as yyyy/mm/dd; otherwise your comparison would treat a date of 15/01/2011 as "newer" than 10/02/2011.

You could call a conversion for each record, but that could get rather complex. I will have a think. I would still recommend against the grep | grep | grep stuff though. It could cripple your system for seriously large files.




Robin
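
A minimal sketch of such a conversion, using only shell parameter expansion so that no external command is run per record. The helper name to_sort_key is hypothetical. It assumes the mm/dd/yy and HH:MM:SS layout seen in the sample files, with all years in the same century.

```shell
# Hypothetical helper: to_sort_key DATE TIME
# Prints yymmddHHMMSS, which orders correctly as an integer or as a string.
to_sort_key() {
    rest=${1#*/}    # dd/yy
    ms=${2#*:}      # MM:SS
    printf '%s%s%s%s%s%s\n' \
        "${rest#*/}" "${1%%/*}" "${rest%%/*}" \
        "${2%%:*}" "${ms%%:*}" "${ms#*:}"
}
```

For example, to_sort_key 04/24/11 19:58:22 prints 110424195822, and two such keys can be compared with [ "$keyA" -gt "$keyB" ].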
# 7  
Old 04-28-2011
The files are not so large. Date-wise, yes, you are correct; I need to split the dates and then compare. But once I have a base script, I can modify the date part later. Any idea why the syntax error occurs?
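
On the syntax error: the line after "do" opens three if statements but supplies only one then and one fi, so the shell is still waiting for a then when it reaches done. Separately, a test written as ( $6 < "'$DATE'" ) compares nothing, because inside a subshell the < is an input redirection. A hedged sketch of the base script with a single if and tests chained with && follows. The name split_base is made up, and the two [ -n ... ] tests are trivial stand-ins marking where real date and time comparisons would go.

```shell
# Hypothetical helper: split_base A B C D
split_base() {
    : > "$3"; : > "$4"    # start with empty output files
    while read -r STATUS CLIENT DB POLICY SCHEDULE DATE TIME; do
        line="$STATUS $CLIENT $DB $POLICY $SCHEDULE $DATE $TIME"
        # One if, one then, one fi. Extra conditions are chained with &&.
        # The last grep uses -q, so matches are not printed to the terminal.
        if grep -w -- "$DB" "$2" | grep -w -- "$CLIENT" |
               grep -w -- "$POLICY" | grep -qw -- "$SCHEDULE" &&
           [ -n "$DATE" ] &&    # stand-in for the date comparison
           [ -n "$TIME" ]       # stand-in for the time comparison
        then printf '%s\n' "$line" >> "$3"
        else printf '%s\n' "$line" >> "$4"
        fi
    done < "$1"
}
```

Usage: split_base fileA fileB fileC fileD.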