
data integrity check needed

# 1  
Old 08-23-2012

Hi friends
I need to copy 100 GB of data to another Solaris server. Could anyone suggest an appropriate way to check data integrity at the source and the destination, so that I can delete the data at the source location? How can I print the cksum of each individual file in each folder and match it against the destination?
Thanks in advance
# 2  
Old 08-23-2012
You can compare hashes, e.g. with sha1sum. I would tar the whole directory and calculate the sha1sum on the fly:

tar czf - /path/to/dir | sha1sum

Then do the same on the other side and verify that the checksums are the same.

On second thought, this may give different checksums if the two machines have different versions of tar or gzip...

Perhaps better to store checksums in a file and check them from there.

find /path/to/dir -type f -exec sha1sum {} \;  > sums.tmp

Copy the file over to the other machine:
scp sums.tmp <user>@<remoteServer>:~

And verify against this file on the remote machine (the paths recorded in sums.tmp must resolve to the copied files there):
sha1sum -c ~/sums.tmp
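
A minimal sketch of the same idea with relative paths, so the list still verifies when the data sits under a different directory on the other machine. The function names make_manifest and verify_manifest are made up for illustration, and this assumes GNU sha1sum is installed on both machines (on Solaris it may need to be installed separately).

```shell
#!/bin/sh
# Sketch only: make_manifest and verify_manifest are hypothetical helper names.
# Assumes GNU sha1sum is available on both machines.

# Write one "HASH  ./relative/path" line per file under directory $1 into file $2.
make_manifest() {
    ( cd "$1" && find . -type f -exec sha1sum {} \; ) > "$2"
}

# Re-hash the files under directory $1 against manifest $2.
# $2 must be an absolute path, because we cd into $1 first.
# Returns non-zero if any listed file is missing or differs.
verify_manifest() {
    ( cd "$1" && sha1sum -c "$2" )
}
```

Usage would be something like make_manifest /path/to/dir /tmp/sums.tmp on the source, copy the list over, then verify_manifest /new/path/to/dir /tmp/sums.tmp on the destination.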

Last edited by mirni; 08-23-2012 at 07:30 AM..
# 3  
Old 08-23-2012
Thanks for the help
but in my situation I can't use archiving, and also there are multiple directories of about 100 GB each
# 4  
Old 08-23-2012
but in my situation I can't use archiving
Why not? It doesn't actually write the archive to a file; it writes it to stdout, only to pipe it into sha1sum. The idea is to compute the checksum of one tarball instead of one per file. But as I said, if you get different checksums, I would check the tar and gzip versions before starting to panic.

also there are multiple directories of about 100 GB each
You could write a for loop to loop through all the root directories and run the above commands.
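
Something along these lines, as a rough sketch. sum_each_dir is a made-up name, and tar and sha1sum are assumed to be available on both machines. I dropped the z flag here: compression is not needed just to hash the stream, and it takes gzip out of the picture as a source of differences. Keep in mind that the tar stream also records owner, mode, mtime and the order in which each directory is read, so a mismatch does not by itself prove that the file data differ.

```shell
#!/bin/sh
# Sketch only: sum_each_dir is a hypothetical helper name.
# Prints "HASH  DIRECTORY" for each directory given as an argument.
sum_each_dir() {
    for dir in "$@"
    do
        if [ ! -d "$dir" ]; then
            echo "skipping $dir: not a directory" >&2
            continue
        fi
        # cd first so the archived member names are relative ("./...") and
        # do not depend on where the tree lives on each machine.
        sum=$( ( cd "$dir" && tar cf - . ) | sha1sum | cut -d' ' -f1 )
        printf '%s  %s\n' "$sum" "$dir"
    done
}
```

Run sum_each_dir /data/dir1 /data/dir2 on each side (those paths are placeholders) and compare the printed hashes per directory.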
# 5  
Old 08-23-2012
The idea here is to ensure that we get lists of files in exactly the same order, find out the checksum for each file, then compare the two lists. We do the compare on the source computer in case we need to generate a list of files to re-copy.
It is important that we sort the output from find, because find returns entries in directory order and that order is not preserved by this sort of copy.

This assumes that you have copied the files and need to check that the files are identical.

# Source computer
cd /source_dir
find . -type f -print | sort | while read filename
do
        cksum "${filename}"
done > /tmp/cksum_source

# Destination computer
cd /destination_dir
find . -type f -print | sort | while read filename
do
        cksum "${filename}"
done > /tmp/cksum_destination

Then copy the destination checksum list (/tmp/cksum_destination) to the source computer and checksum the two lists:

cksum /tmp/cksum_source /tmp/cksum_destination
If the CRC and byte count printed for the two lists are identical, the lists match and we don't even need to run a diff.
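
If they do differ, the list of files to re-copy could be produced roughly like this. This is a sketch, not a tested production script: list_mismatches is a made-up name, and it assumes mktemp, sort, comm and cut are available and that each line of the lists has the usual cksum layout of CRC, size and file name separated by single spaces.

```shell
#!/bin/sh
# Sketch only: list_mismatches is a hypothetical helper name.
# $1 = checksum list from the source, $2 = checksum list from the destination.
# Prints the name of every source file that is missing from the destination
# list or has a different CRC or size there.
list_mismatches() {
    src_sorted=$(mktemp) || return 1
    dst_sorted=$(mktemp) || return 1
    # comm needs both inputs sorted by whole line, in the same collation.
    LC_ALL=C sort "$1" > "$src_sorted"
    LC_ALL=C sort "$2" > "$dst_sorted"
    # Keep lines found only in the source list, then drop the CRC and size.
    LC_ALL=C comm -23 "$src_sorted" "$dst_sorted" | cut -d' ' -f3-
    rm -f "$src_sorted" "$dst_sorted"
}
```

For example: list_mismatches /tmp/cksum_source /tmp/cksum_destination > /tmp/recopy_list. Files that exist only on the destination are not reported by this.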

Footnote: I would strongly advise that any file copy method you use preserves the file permissions and timestamps and preserves directory permissions. This can be more difficult than it sounds unless you ensure that the account UIDs and GIDs match on both computers. It is impossible to preserve the directory timestamps.

Last edited by methyl; 08-23-2012 at 05:59 PM..
# 6  
Old 08-23-2012
Originally Posted by methyl
It is impossible to preserve the directory timestamps.
A directory timestamp (and its mode) can be set accordingly after its files are copied/extracted.

In case you were thinking of a different scenario, I am referring to an archiver unpacking.
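
For a plain copy rather than an archive, the same idea can be sketched with touch -r, run once all files are in place. This assumes both trees are reachable from one host (for example over NFS); copy_dir_mtimes is a made-up name, and it only handles modification times, not modes or ownership.

```shell
#!/bin/sh
# Sketch only: copy_dir_mtimes is a hypothetical helper name.
# $1 = source tree, $2 = destination tree, both visible from this host.
# Copies the timestamps of every directory under $1 onto the directory
# with the same relative path under $2. Run it after all files are copied,
# since creating files in a directory updates that directory's mtime.
copy_dir_mtimes() {
    src=$1
    dst=$2
    ( cd "$src" && find . -type d -print ) | while IFS= read -r dir
    do
        if [ -d "$dst/$dir" ]; then
            touch -r "$src/$dir" "$dst/$dir"
        fi
    done
}
```

Usage would be copy_dir_mtimes /source_dir /destination_dir, with the paths taken from the earlier example.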

# 7  
Old 08-23-2012
I somehow knew that someone would know a way round this issue.

I'd be very interested in bulk file copy products for unix which can preserve the timestamps on both the files AND the directories.
Current best technique for unix uses disc mirroring software.

If anybody knows of a similar product for Windows Server 2003/2008 I'm very interested. Current technique also uses disc mirroring software.

My only interest is the best technique for blunt technology upgrades, i.e. increasing the size of physical discs and swapping the motherboard for a better one.
What we do at the moment is mirror the data discs to larger discs, smash the mirror, fit the new discs into a better cold-build system, and import the data discs. This technique has the advantage that the original computer is preserved intact but it can be a bit time-consuming.
