Compare files in directories with md5sum


 
# 1  
Old 02-12-2014

And not to start. I can compare files, that's easy. The problem is that I need to compare the files in one directory and check whether they exist in another directory, and the file names are not the same. So I have to compare with "md5sum" or something similar. How can I do this?

All this in python.

Thanks!
# 2  
Old 02-12-2014
Quote:
And not to start.
I don't understand this sentence, but ignoring that, the logic seems to be like this for me:-
  • Find all the files in your first directory. For each one, get the checksum and write a line in a work-file.
  • Find all the files in your second directory. For each one, get the checksum and check if it matches one recorded in the work-file.
I could do this in shell script, but I cannot assist with Python.
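
Since the question asked for Python, the two steps above could be sketched roughly as follows. This is only a sketch under some assumptions: the helper names `md5_of`, `index_by_checksum` and `find_matches` are made up for illustration, an in-memory dictionary stands in for the work-file, and every file found is assumed to be a readable regular file.

```python
import hashlib
import os


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def index_by_checksum(directory):
    """Step 1: map each checksum to the files under `directory` that have it."""
    index = {}
    for root, _dirs, names in os.walk(directory):
        for name in names:
            path = os.path.join(root, name)
            index.setdefault(md5_of(path), []).append(path)
    return index


def find_matches(first_dir, second_dir):
    """Step 2: for each file in `second_dir`, list same-content files in `first_dir`."""
    index = index_by_checksum(first_dir)  # plays the role of the work-file
    matches = {}
    for root, _dirs, names in os.walk(second_dir):
        for name in names:
            path = os.path.join(root, name)
            matches[path] = index.get(md5_of(path), [])
    return matches
```

Calling `find_matches("1st-directory", "2nd-directory")` returns a dictionary keyed by the files of the second directory. An empty list for a file means nothing with the same content was found in the first directory, whatever the file names are.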



Robin
# 3  
Old 02-13-2014
Sorry, my English is not good.

I can also use bash.

I have downloaded some files from a url.

Files are downloaded to a directory.

Code:
wget --mirror --no-check-certificate --no-directories --no-host-directories -l1 https://site.com/out/

rm -f index.html

I now need to compare the files (they may be identical without having the same name).

This is where I'm lost; I do not know how to do it.

Last edited by Scott; 02-13-2014 at 08:36 AM.. Reason: Please use code tags
# 4  
Old 02-13-2014
So, from my suggested logic above (if that's acceptable) :-
Code:
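# Run find from inside each directory so that the recorded paths are relative
# and the line for an unchanged file is identical in both lists.
(cd 1st-directory && find . -type f -exec md5sum {} \;) > /tmp/1st_list
(cd 2nd-directory && find . -type f -exec md5sum {} \;) > /tmp/2nd_list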

This will get you two files containing the file-names and the md5-checksums. You can then compare the files with diff but the output can be a bit messy. It's neater to run two commands. The following will get you files in the second list that do not match those in the first list:-
Code:
grep -vFf /tmp/1st_list /tmp/2nd_list

You can reverse this to get those in the first list that are not in the second (i.e. you might not have all the files):-
Code:
grep -vFf /tmp/2nd_list /tmp/1st_list

If the filenames are not important (but I rather think that they are) then you can get just the checksums like this:-
Code:
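# md5sum separates the checksum from the file name with spaces, not a tab,
# so cut has to be told to split on a space.
cut -d' ' -f1 /tmp/1st_list > /tmp/1st_md5_only
cut -d' ' -f1 /tmp/2nd_list > /tmp/2nd_md5_only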

You can then show what files from your second directory are not in the first:-
Code:
grep -vFf /tmp/1st_md5_only /tmp/2nd_list

or reverse it to show what files from the first directory are missing from the second:-
Code:
grep -vFf /tmp/2nd_md5_only /tmp/1st_list


The name grep comes from the ed command g/re/p (globally search for a regular expression and print), so it's a way to select rows of data.
  • The -v flag inverts the selection, so only the lines that do not match are printed.
  • The -F flag treats the patterns as fixed strings; otherwise they are interpreted as regular expressions.
  • The -f flag reads the patterns from the file named next, one pattern per line.
  • The last item is the file to scan.
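
To make the combined effect of those flags concrete, here is a small Python sketch of what `grep -vFf patterns data` does. It is only an illustration, not how grep is implemented: the function name `lines_without_patterns` is made up, and the checksums in the sample data are shortened placeholders rather than real md5sum output.

```python
def lines_without_patterns(pattern_lines, data_lines):
    """Mimic `grep -vFf patterns data`.

    Keep only the data lines that contain none of the patterns,
    where each pattern is treated as a plain (fixed) string.
    """
    patterns = [pattern.rstrip("\n") for pattern in pattern_lines]
    return [
        line
        for line in data_lines
        if not any(pattern in line.rstrip("\n") for pattern in patterns)
    ]


# Placeholder data: checksums known from the first directory ...
first_md5_only = ["1111\n", "2222\n"]
# ... and the full "checksum  name" list of the second directory.
second_list = ["1111  ./a.txt\n", "3333  ./b.txt\n"]

# Only ./b.txt is reported, because its checksum is not in the first list.
for line in lines_without_patterns(first_md5_only, second_list):
    print(line, end="")
```

Matching on substrings is also why the checksum-only pattern file works: a line of the second list is dropped as soon as any known checksum appears anywhere in it.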


I hope that this helps, but if you are still concerned, then let us know your results.

There is a good chance that you will have the same filename in the two lists with different checksums if they are downloaded at different times as fixes are released.
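
If you want to spot that case, something along these lines might do it in Python. It is a sketch with assumptions: the lists are expected to be in md5sum's default text-mode format (checksum, two spaces, path), files are matched by base name only (which fits a flat download made with --no-directories), and the function names `parse_md5sum_lines` and `changed_files` are invented for the example.

```python
import os


def parse_md5sum_lines(lines):
    """Turn md5sum output lines ("<checksum>  <path>") into {file name: checksum}."""
    checksums = {}
    for line in lines:
        checksum, separator, path = line.rstrip("\n").partition("  ")
        if separator:  # ignore lines that are not in the expected format
            checksums[os.path.basename(path)] = checksum
    return checksums


def changed_files(first_lines, second_lines):
    """Return the names present in both lists whose checksums differ."""
    first = parse_md5sum_lines(first_lines)
    second = parse_md5sum_lines(second_lines)
    common_names = first.keys() & second.keys()
    return sorted(name for name in common_names if first[name] != second[name])
```

With the lists from above it could be called as `changed_files(open("/tmp/1st_list"), open("/tmp/2nd_list"))`, which returns the file names that exist on both sides but whose content has changed.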



Robin

Last edited by rbatte1; 02-13-2014 at 10:56 AM.. Reason: Added -F flag
# 5  
Old 02-17-2014
Thank you all for your support. In the end I solved it like this.
Code:
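# RECIBIDOS is assumed to end with a slash, as in the original script.
# Loop over the glob directly instead of parsing the output of ls,
# and quote the variables so names with spaces are handled.
for a in *.TXT; do
   if [ -f "$RECIBIDOS$a" ]
   then
     rm -f "$a"
   fi
done

cp *.TXT "$RECIBIDOS"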

.......

And then it processes the files.

Thanks again for your help.

Last edited by Franklin52; 02-17-2014 at 09:43 AM.. Reason: Please use code tags
# 6  
Old 02-17-2014
You are just going to delete files if the name matches here. I thought that you were not concerned about matching names, but wanted to find duplicate files, hence the md5sum tests.

Oh well, so long as you are happy and you have a working solution.

Robin
# 7  
Old 02-17-2014
Honestly, it's a temporary solution. Once I am able to compare the files using md5sum, I'll change the script.

Of course I prefer to use the md5sum method.
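
For reference, a content-based version of the clean-up in post #5 might look like this in Python. It is a hedged sketch, not the script that was actually used: the function names and the `dry_run` parameter are invented here, only the files directly inside each directory are considered, and the second argument stands for the directory held in `$RECIBIDOS`.

```python
import hashlib
import os


def md5_of(path):
    """Return the hex MD5 digest of a file's contents."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def regular_files(directory):
    """Yield the paths of the regular files directly inside `directory`."""
    for entry in sorted(os.listdir(directory)):
        path = os.path.join(directory, entry)
        if os.path.isfile(path):
            yield path


def remove_known_duplicates(download_dir, received_dir, dry_run=True):
    """Find downloads whose content already exists in `received_dir`.

    Names are ignored; only checksums are compared. The duplicates are
    deleted only when dry_run is False. The list of duplicates is returned.
    """
    known = {md5_of(path) for path in regular_files(received_dir)}
    duplicates = [path for path in regular_files(download_dir) if md5_of(path) in known]
    if not dry_run:
        for path in duplicates:
            os.remove(path)
    return duplicates
```

Running it first with the default `dry_run=True` only reports what would be deleted, which is a cheap way to check the result before removing anything.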
