script that detects duplicate files in directory


 
# 1  
Old 12-04-2006

I need help with a script that accepts one argument, goes through all the files under a directory, and prints a list of possible duplicate files. As its output, it prints zero or more lines, each containing a space-separated list of filenames. All the files listed on one line have the same MD5 hash, i.e., they are believed to be identical.

Optional:
If the -s switch is specified, the script should not print a list of all duplicate files; instead, it should print the number of duplicates (for example, in the example above, there are 4 duplicate copies of 3 files) and how much extra space the duplicates take up. (Note: this summary should only be displayed if the -s switch is present; if it is not, every line in the output should display a set of duplicate files.)
# 2  
Old 12-04-2006
Here is an example with crc32. You can use md5 if you absolutely must, but this works just fine.
Code:
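#!/bin/ksh
# Print each set of files under ./c that share a CRC32 checksum, one set per line.
# Assumes a crc32 command that prints only the checksum of the named file.

find ./c -type f |
while IFS= read -r file
do
    # Tab-separate checksum and path so paths containing spaces stay intact
    printf '%s\t%s\n' "$(crc32 "$file")" "$file"
done | sort |
awk -F '\t' '
      # Same checksum as the previous line: add the file to the current set
      $1 == old_crc { files = files " " $2; n++; next }
      # New checksum: print the finished set if it had more than one file
      { if (n > 1) print files
        old_crc = $1; files = $2; n = 1 }
      END { if (n > 1) print files }'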
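
If you do want MD5 and the optional -s summary from the original question, the same idea could be sketched roughly as follows. This is only a sketch under some assumptions: it relies on an md5sum command that prints "hash, two spaces, path" (as GNU md5sum does), it assumes ordinary file names without newlines, tabs or backslashes, and the function names (hash_files, list_dupes, summarize, find_dupes) and the summary wording are made up for illustration.

```shell
#!/bin/sh
# Sketch only: group files by MD5 hash, with an optional -s summary.
# Assumes an md5sum command with "hash  path" output (e.g. GNU coreutils).

# Print "hash<TAB>path" for every regular file under directory $1, sorted by hash.
hash_files() {
    find "$1" -type f -exec md5sum {} + |
    awk '{ hash = $1; sub(/^[^ ]+  /, ""); print hash "\t" $0 }' |
    sort
}

# Read sorted "hash<TAB>path" lines; print each set of duplicates on one line.
list_dupes() {
    awk -F '\t' '
        $1 == prev { files = files " " $2; n++; next }
        { if (n > 1) print files; prev = $1; files = $2; n = 1 }
        END { if (n > 1) print files }'
}

# Read sorted "hash<TAB>path" lines; count the extra copies and their total size.
summarize() {
    tab=$(printf '\t')
    prev='' copies=0 sets=0 bytes=0 in_set=0
    while IFS=$tab read -r hash path
    do
        if [ "$hash" = "$prev" ]; then
            # Another copy of the previous file: count it and its size
            if [ "$in_set" -eq 0 ]; then sets=$((sets + 1)); in_set=1; fi
            copies=$((copies + 1))
            size=$(wc -c < "$path")
            bytes=$((bytes + size))
        else
            prev=$hash
            in_set=0
        fi
    done
    echo "$copies duplicate copies of $sets files, $bytes bytes of extra space"
}

# Usage: find_dupes [-s] directory
find_dupes() {
    if [ "$1" = "-s" ]; then
        shift
        hash_files "$1" | summarize
    else
        hash_files "$1" | list_dupes
    fi
}

# Run only when arguments are given, e.g.: sh finddupes.sh -s /some/dir
if [ "$#" -gt 0 ]; then find_dupes "$@"; fi
```

Saved under a name of your choice (finddupes.sh here is just a placeholder), it would be called as "sh finddupes.sh /some/dir" for the list or "sh finddupes.sh -s /some/dir" for the summary. Note that a file name containing spaces is still ambiguous in the space-separated output, which is a limitation of the requested output format rather than of the hashing.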
