Sponsored Content
Top Forums Shell Programming and Scripting finding duplicate files by size and finding pattern matching and its count Post 302098150 by jerome Sukumar on Friday 1st of December 2006 02:55:25 AM
Old 12-01-2006
For finding duplicates not only by size by file naming convention too

Hi All,
sorry for rephrasing.
while finding duplicates I will use file naming convention(substring of files 1,4) and file size too.
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Finding cumulative size of files older than certain days

Hi All, I've got a ton of files in a particular directory. I want to find pdf files older than 30 days in that directory and then the cumulative size of those files. Ex: find /home/jk/a -name "*.pdf" -mtime +30 consider it finds the below 4 files. /home/jk/a/1.pdf /home/jk/a/2.pdf... (1 Reply)
Discussion started by: rohan076
1 Replies

2. Shell Programming and Scripting

Finding Duplicate files

How do you delete and and find duplicate files? (1 Reply)
Discussion started by: Jicom4
1 Replies

3. Shell Programming and Scripting

Finding conserved pattern in different files

Hi power user, For examples, I have three different files: file 1: file2: file 3: AAA CCC ZZZ BBB BBB CCC CCC DDD DDD DDD TTT AAA EEE AAA XXX I... (8 Replies)
Discussion started by: anjas
8 Replies

4. UNIX for Dummies Questions & Answers

finding all files that do not match a certain pattern

I hope I'm asking this the right way -- I've been sending out a lot of resumes and some of them I saw on Craigslist -- so I named the file as 'Craigslist -- (filename)'. Well I noticed that at least one of the files was misspelled as 'Craigslit.' I want to eventually try to write a shell... (5 Replies)
Discussion started by: Straitsfan
5 Replies

5. UNIX for Dummies Questions & Answers

Finding the files and count then

Hi i was trying to find the files which are not older than one day and copy them to other location . but i need to count the number of files and the copy them if the count is matches my number A=`find $SOURCE/* -type f -mtime -1 ` in the code above i need to count the number of file A has... (8 Replies)
Discussion started by: vikatakavi
8 Replies

6. Shell Programming and Scripting

Finding size of files with spaces in their file names

I am running a UNIX script to get unused files and their sizes from the server. The issue is arising due to the spaces present in the filename/folder names.Due to this the du -k command doesn't work properly.But I need to calculate the size of all files including the ones which have spaces in them.... (4 Replies)
Discussion started by: INNSAV1
4 Replies

7. Programming

Finding duplicate files in two base directories

Hello All, I have got some assignment to complete till this Monday and problem statement is as follow :- Problem :- Find duplicate files (especially .c and .cpp) from two project base directories with following requirement :- 1.Should be extendable to search in multiple base... (4 Replies)
Discussion started by: anand.shah
4 Replies

8. Shell Programming and Scripting

Finding all files based on pattern

Hi All, I need to find all files in a directory which are containing specific pattern. Thing is that file name should not consider if pattern is only in commented area. all contents which are under /* */ are commented all lines which are starting with -- or if -- is a part of some sentence... (13 Replies)
Discussion started by: Lakshman_Gupta
13 Replies

9. Shell Programming and Scripting

Finding matching patterns in two files

Hi, I have requirement to find the matching patterns of two files in Unix. One file is the log file and the other is the error list file. If any pattern in the log file matches the list of errors in the error list file, then I would need to find the counts of the match. For example, ... (5 Replies)
Discussion started by: Bobby_2000
5 Replies

10. Shell Programming and Scripting

Finding lines of specific size in files using sed

i am using sed to detect any lines that are not exactly 21. the following gives me the lines that ARE exactly 21. i want the opposite , i want the two lines that are not size 21 (shown in bold) type a.a 000008050110010201NNN 000008060810010201NNN 21212000008070110010201NNN... (5 Replies)
Discussion started by: boncuk
5 Replies
FDUPES(1)						      General Commands Manual							 FDUPES(1)

NAME
fdupes - finds duplicate files in a given set of directories SYNOPSIS
fdupes [ options ] DIRECTORY ... DESCRIPTION
Searches the given path for duplicate files. Such files are found by comparing file sizes and MD5 signatures, followed by a byte-by-byte comparison. OPTIONS
-r --recurse include files residing in subdirectories -s --symlinks follow symlinked directories -H --hardlinks normally, when two or more files point to the same disk area they are treated as non-duplicates; this option will change this behav- ior -n --noempty exclude zero-length files from consideration -f --omitfirst omit the first file in each set of matches -1 --sameline list each set of matches on a single line -S --size show size of duplicate files -q --quiet hide progress indicator -d --delete prompt user for files to preserve, deleting all others (see CAVEATS below) -v --version display fdupes version -h --help displays help SEE ALSO
md5sum(1) NOTES
Unless -1 or --sameline is specified, duplicate files are listed together in groups, each file displayed on a separate line. The groups are then separated from each other by blank lines. When -1 or --sameline is specified, spaces and backslash characters () appearing in a filename are preceded by a backslash character. CAVEATS
If fdupes returns with an error message such as fdupes: error invoking md5sum it means the program has been compiled to use an external program to calculate MD5 signatures (otherwise, fdupes uses interal routines for this purpose), and an error has occurred while attempting to execute it. If this is the case, the specified program should be properly installed prior to running fdupes. When using -d or --delete, care should be taken to insure against accidental data loss. When used together with options -s or --symlink, a user could accidentally preserve a symlink while deleting the file it points to. Furthermore, when specifying a particular directory more than once, all files within that directory will be listed as their own duplicates, leading to data loss should a user preserve a file without its "duplicate" (the file itself!). AUTHOR
Adrian Lopez <adrian2@caribe.net> FDUPES(1)
All times are GMT -4. The time now is 11:21 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy