Post 302224285 by same1290 on Tuesday 12th of August 2008 06:42:23 PM
Best way to diff two huge directory trees

Hi

I have a job that will be running nightly incremental backups of a large directory tree.

I did the initial backup, and now I want to write a script to verify that all the files were transferred correctly. I tried something like this, which works in principle on small trees:

diff -r -q "$src_dir" "$dst_dir" > diffreport.txt 2>&1

The problem with this is that it is very slow. The directory I am backing up is about 2 TB.

I also tried using find and sum to dump the checksums to two files, one for the source directory and one for the destination, and then comparing them. These are the commands I used:

find "$src_dir" -type f -print0 | xargs -0 sum > src_dir_checksums.txt
find "$dst_dir" -type f -print0 | xargs -0 sum > dst_dir_checksums.txt
diff src_dir_checksums.txt dst_dir_checksums.txt

But for some reason find lists the files in a different order for the two directories, which are on different machines, so the two checksum files do not line up.

Any help would be greatly appreciated.

Thanks in advance,
Sam
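
A possible explanation, offered as a sketch rather than a confirmed diagnosis: find prints entries in whatever order the filesystem returns them, so two machines can list the same files in different orders. The two listings also carry different path prefixes whenever $src_dir and $dst_dir differ. The function below (checksum_tree is a hypothetical name, not from the original post) changes into the tree so that paths are relative, and sorts the output by path. It swaps sum for the POSIX cksum utility, and it assumes file names contain no newlines.

```shell
# checksum_tree DIR
# Print one "checksum size ./relative/path" line per regular file under DIR,
# sorted by path so that listings from different machines are comparable.
# Assumption: file names do not contain newline characters.
checksum_tree() {
    (
        # Work from inside the tree so the paths in the output are relative.
        cd "$1" || exit 1
        # -exec ... {} + batches file names much like xargs does,
        # and runs nothing at all when the tree contains no files.
        find . -type f -exec cksum {} +
    ) | LC_ALL=C sort -k 3   # byte-wise sort on the path field, locale-independent
}
```

Run checksum_tree "$src_dir" > src_dir_checksums.txt on the source machine and checksum_tree "$dst_dir" > dst_dir_checksums.txt on the destination machine, then diff the two files as before. This still reads every byte of both trees, so it will not be quick on 2 TB, but each machine only reads its own local disk.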
 

GENDIFF(1)                    General Commands Manual                    GENDIFF(1)

NAME
       gendiff - utility to aid in error-free diff file generation

SYNOPSIS
       gendiff <directory> <diff-extension>

DESCRIPTION
       gendiff is a rather simple script which aids in generating a diff file from a single directory. It takes a directory name and a "diff-extension" as its only arguments. The diff extension should be a unique sequence of characters added to the end of all original, unmodified files. The output of the program is a diff file which may be applied with the patch program to recreate the changes.

       The usual sequence of events for creating a diff is to create two identical directories, make changes in one directory, and then use the diff utility to create a list of differences between the two. Using gendiff eliminates the need for the extra, original and unmodified directory copy. Instead, only the individual files that are modified need to be saved.

       Before editing a file, copy the file, appending the extension you have chosen to the filename. I.e. if you were going to edit somefile.cpp and have chosen the extension "fix", copy it to somefile.cpp.fix before editing it. Then edit the first copy (somefile.cpp).

       After editing all the files you need to edit in this fashion, enter the directory one level above where your source code resides, and then type

           $ gendiff somedirectory .fix > mydiff-fix.patch

       You should redirect the output to a file (as illustrated) unless you want to see the results on stdout.

SEE ALSO
       diff(1), patch(1)

AUTHOR
       Marc Ewing <marc@redhat.com>

4th Berkeley Distribution           Mon Jan 10 2000                      GENDIFF(1)