Post 302098442 by trueman82 on Monday 4th of December 2006 09:31:00 AM
script that detects duplicate files in directory

I need help with a script that accepts one argument, goes through all the files under a directory, and prints a list of possible duplicate files. As its output, it prints zero or more lines, each one containing a space-separated list of filenames. All the files listed on one line have the same MD5 hash; i.e., they are believed to be identical.
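
One possible approach, as a minimal sketch rather than a finished solution: hash every file, sort so equal hashes end up next to each other, and join the names in each group. It assumes GNU md5sum is installed, that the directory is searched recursively, and that filenames contain no whitespace, backslashes, or newlines (otherwise a space-separated list is ambiguous). The function name list_dupes is made up for illustration.

```shell
#!/bin/sh
# list_dupes DIR
# Print one line per set of files under DIR that share an MD5 hash.
# Assumes GNU md5sum and filenames without whitespace, backslashes or newlines.
list_dupes() {
    find "$1" -type f -exec md5sum {} + |
    LC_ALL=C sort |
    awk '
        {
            hash = $1
            name = substr($0, 35)   # skip the 32-char hash and 2 separator chars
            if (hash == prev) {
                line = line " " name
                count++
            } else {
                if (count > 1) print line   # flush the previous group if it had duplicates
                line = name
                count = 1
            }
            prev = hash
        }
        END { if (count > 1) print line }
    '
}
```

Inside a script that takes the directory as its only argument, the function could be called as list_dupes "$1". Files whose hash appears only once are never printed, so a directory without duplicates produces no output at all.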

Others/optional
If the -s switch is specified, the script should not print a list of all duplicate files; instead, it should print the number of duplicates (for example, in the example above, there are 4 duplicate copies of 3 files) and how much extra space the duplicates take up. (Note: this summary information should only be displayed if the -s switch is present; if it is not present, every line in the output should display a set of duplicate files.)
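
The summary mode could be sketched along these lines. The function name dupe_summary and the exact wording of the output line are assumptions, not part of the assignment; the sketch also assumes md5sum is available and that filenames contain no newlines. Every file beyond the first one with a given hash counts as one duplicate copy, and its size counts as extra space.

```shell
#!/bin/sh
# dupe_summary DIR
# Report how many extra copies exist under DIR and how many bytes they occupy.
# Assumes md5sum is available and filenames contain no newlines.
dupe_summary() {
    find "$1" -type f |
    while IFS= read -r file; do
        hash=$(md5sum < "$file" | cut -d' ' -f1)
        size=$(wc -c < "$file")
        printf '%s %s\n' "$hash" "$size"
    done |
    LC_ALL=C sort |
    awk '
        $1 == prev {
            copies++                          # every repeat of a hash is one extra copy
            bytes += $2                       # an extra copy wastes its full size
            if (!seen) { files++; seen = 1 }  # count each duplicated file once
            next
        }
        { prev = $1; seen = 0 }
        END { printf "%d duplicate copies of %d files, %d extra bytes\n", copies, files, bytes }
    '
}
```

A script could then check for -s with getopts and call either this function or the listing code, so that the summary appears only when the switch is present.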
 

FDUPES(1)						      General Commands Manual							 FDUPES(1)

NAME
       fdupes - finds duplicate files in a given set of directories

SYNOPSIS
       fdupes [ options ] DIRECTORY ...

DESCRIPTION
       Searches the given path for duplicate files. Such files are found by comparing file sizes and MD5 signatures, followed by a byte-by-byte comparison.

OPTIONS
       -r --recurse
              include files residing in subdirectories

       -s --symlinks
              follow symlinked directories

       -H --hardlinks
              normally, when two or more files point to the same disk area they are treated as non-duplicates; this option will change this behavior

       -n --noempty
              exclude zero-length files from consideration

       -f --omitfirst
              omit the first file in each set of matches

       -1 --sameline
              list each set of matches on a single line

       -S --size
              show size of duplicate files

       -q --quiet
              hide progress indicator

       -d --delete
              prompt user for files to preserve, deleting all others (see CAVEATS below)

       -v --version
              display fdupes version

       -h --help
              displays help

SEE ALSO
       md5sum(1)

NOTES
       Unless -1 or --sameline is specified, duplicate files are listed together in groups, each file displayed on a separate line. The groups are then separated from each other by blank lines.

       When -1 or --sameline is specified, spaces and backslash characters (\) appearing in a filename are preceded by a backslash character.

CAVEATS
       If fdupes returns with an error message such as

              fdupes: error invoking md5sum

       it means the program has been compiled to use an external program to calculate MD5 signatures (otherwise, fdupes uses internal routines for this purpose), and an error has occurred while attempting to execute it. If this is the case, the specified program should be properly installed prior to running fdupes.

       When using -d or --delete, care should be taken to insure against accidental data loss.

       When used together with options -s or --symlinks, a user could accidentally preserve a symlink while deleting the file it points to.

       Furthermore, when specifying a particular directory more than once, all files within that directory will be listed as their own duplicates, leading to data loss should a user preserve a file without its "duplicate" (the file itself!).

AUTHOR
       Adrian Lopez <adrian2@caribe.net>
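
The order of checks given in DESCRIPTION (size, then MD5 signature, then byte-by-byte comparison) can be illustrated for a single pair of files. This is only a sketch of the idea, not how fdupes is implemented internally; the function name same_content is hypothetical, and md5sum and cmp are assumed to be installed.

```shell
#!/bin/sh
# same_content A B
# Succeed (exit status 0) only if files A and B have identical contents.
# Cheap checks run first, mirroring the order given in DESCRIPTION.
same_content() {
    # 1. Files of different sizes can never be duplicates.
    [ "$(wc -c < "$1")" -eq "$(wc -c < "$2")" ] || return 1
    # 2. Different MD5 signatures rule out a match.
    [ "$(md5sum < "$1")" = "$(md5sum < "$2")" ] || return 1
    # 3. A byte-by-byte comparison guards against hash collisions.
    cmp -s "$1" "$2"
}
```

A call such as same_content file1 file2 && echo duplicate would then report a pair only after all three checks pass. The final cmp step is what separates "believed to be identical" from actually identical.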