Sponsored Content
Top Forums Shell Programming and Scripting Find duplicate files by file size Post 302510135 by Dirk Einecke on Friday 1st of April 2011 03:16:04 PM
Old 04-01-2011
Find duplicate files by file size

Hi!

I want to find duplicate files (criteria: file size) in my download folder.

I try it like this:
Code:
find /Users/frodo/Downloads \! -type d -exec du {} \; | sort > /Users/frodo/Desktop/duplicates_1.txt;
cut -f 1 /Users/frodo/Desktop/duplicates_1.txt | uniq -d | grep -hif - /Users/frodo/Desktop/duplicates_1.txt > /Users/frodo/Desktop/duplicates_2.txt;

But this doesn't work. Can anybody tell my what's wrong or provide a other/better solution? Thanks!

Dirk
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

how to find duplicate files with find ?

hello all I like to make search on files , and the result need to be the files that are duplicated? (8 Replies)
Discussion started by: umen
8 Replies

2. Solaris

command to find out total size of a specific file size (spread over the server)

hi all, in my server there are some specific application files which are spread through out the server... these are spread in folders..sub-folders..chid folders... please help me, how can i find the total size of these specific files in the server... (3 Replies)
Discussion started by: abhinov
3 Replies

3. Shell Programming and Scripting

Find Duplicate files, not by name

I have a directory with images: -rw-r--r-- 1 root root 26216 Mar 19 21:00 020109.210001.jpg -rw-r--r-- 1 root root 21760 Mar 19 21:15 020109.211502.jpg -rw-r--r-- 1 root root 23144 Mar 19 21:30 020109.213002.jpg -rw-r--r-- 1 root root 31350 Mar 20 00:45 020109.004501.jpg -rw-r--r-- 1 root... (2 Replies)
Discussion started by: Ikon
2 Replies

4. Shell Programming and Scripting

Find duplicate files

What utility do you recommend for simply finding all duplicate files among all files? (4 Replies)
Discussion started by: kiasas
4 Replies

5. Shell Programming and Scripting

Find file size difference in two files using awk

Hi, Could anyone help me to solve this problem? I have two files "f1" and "f2" having 2 fields in each, a) file size and b) file name. The data are almost same in both the files except for few and new additional lines. Now, I have to find out and print the output as, the difference in the... (3 Replies)
Discussion started by: royalibrahim
3 Replies

6. Shell Programming and Scripting

Remove duplicate lines from a 50 MB file size

hi, Please help me to write a command to delete duplicate lines from a file. And the size of file is 50 MB. How to remove duplicate lins from such a big file. (6 Replies)
Discussion started by: vsachan
6 Replies

7. Shell Programming and Scripting

find duplicate string in many different files

I have more than 100 files like this: SVEAVLTGPYGYT 2 SVEGNFEETQY 10 SVELGQGYEQY 28 SVERTGTGYT 6 SVGLADYNEQF 21 SVGQGYEQY 32 SVKTVLGYEQF 2 SVNNEQF 12 SVRDGLTNSPLH 3 SVRRDREGLEQF 11 SVRTSGSYEQY 17 SVSVSGSPLQETQY 78 SVVHSTSPEAF 59 SVVPGNGYT 75 (4 Replies)
Discussion started by: xshang
4 Replies

8. Shell Programming and Scripting

Find duplicate files but with different extensions

Hi ! I wonder if anyone can help on this : I have a directory: /xyz that has the following files: chsLog.107.20130603.gz chsLog.115.20130603 chsLog.111.20130603.gz chsLog.107.20130603 chsLog.115.20130603.gz As you ca see there are two files that are the same but only with a minor... (10 Replies)
Discussion started by: fretagi
10 Replies

9. Shell Programming and Scripting

Find duplicate rows between files

Hi champs, I have one of the requirement, where I need to compare two files line by line and ignore duplicates. Note, I hav files in sorted order. I have tried using the comm command, but its not working for my scenario. Input file1 srv1..development..employee..empname,empid,empdesg... (1 Reply)
Discussion started by: Selva_2507
1 Replies

10. Shell Programming and Scripting

List duplicate files based on Name and size

Hello, I have a huge directory (with millions of files) and need to find out duplicates based on BOTH file name and File size. I know fdupes but it calculates MD5 which is very time-consuming and especially it takes forever as I have millions of files. Can anyone please suggest a script or... (7 Replies)
Discussion started by: prvnrk
7 Replies
FDUPES(1)						      General Commands Manual							 FDUPES(1)

NAME
fdupes - finds duplicate files in a given set of directories SYNOPSIS
fdupes [ options ] DIRECTORY ... DESCRIPTION
Searches the given path for duplicate files. Such files are found by comparing file sizes and MD5 signatures, followed by a byte-by-byte comparison. OPTIONS
-r --recurse include files residing in subdirectories -s --symlinks follow symlinked directories -H --hardlinks normally, when two or more files point to the same disk area they are treated as non-duplicates; this option will change this behav- ior -n --noempty exclude zero-length files from consideration -f --omitfirst omit the first file in each set of matches -1 --sameline list each set of matches on a single line -S --size show size of duplicate files -q --quiet hide progress indicator -d --delete prompt user for files to preserve, deleting all others (see CAVEATS below) -v --version display fdupes version -h --help displays help SEE ALSO
md5sum(1) NOTES
Unless -1 or --sameline is specified, duplicate files are listed together in groups, each file displayed on a separate line. The groups are then separated from each other by blank lines. When -1 or --sameline is specified, spaces and backslash characters () appearing in a filename are preceded by a backslash character. CAVEATS
If fdupes returns with an error message such as fdupes: error invoking md5sum it means the program has been compiled to use an external program to calculate MD5 signatures (otherwise, fdupes uses interal routines for this purpose), and an error has occurred while attempting to execute it. If this is the case, the specified program should be properly installed prior to running fdupes. When using -d or --delete, care should be taken to insure against accidental data loss. When used together with options -s or --symlink, a user could accidentally preserve a symlink while deleting the file it points to. Furthermore, when specifying a particular directory more than once, all files within that directory will be listed as their own duplicates, leading to data loss should a user preserve a file without its "duplicate" (the file itself!). AUTHOR
Adrian Lopez <adrian2@caribe.net> FDUPES(1)
All times are GMT -4. The time now is 07:01 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy