Find help in shell - that clears away duplicate files


 
# 1  
Old 09-18-2014

I am so frustrated!!!

I want a nice command that clears away duplicate files:

Code:
find . -type f -regex '.*[0-9]{1,3}\..*' | xargs -I## rm -v '##'

should work in my opinion. But it finds nothing even though I have files that have the file name:

Code:
Scooby-Doo-1.txt
Himalaya-2.jpg
Camping 2009-12-12-33.jpg

and so on.

WHY?!?

Last edited by Scrutinizer; 09-18-2014 at 08:08 AM.. Reason: additional code tags
# 2  
Old 09-18-2014
Try:
Code:
find . -type f -regex '.*[0-9]\{1,3\}\..*'

or equivalent:
Code:
find . -type f -regex '[0-9]\{1,3\}\.'

But what do you mean with duplicate files?

-----
EDIT: the latter is invalid since anchors are implicit!
EDIT 2: removed superfluous closing brace in first example

Last edited by Scrutinizer; 09-18-2014 at 10:05 AM..
# 3  
Old 09-18-2014
I think that GNU find -regex has implicit ^ $ anchors, so the .* at both ends is needed.
{ } or \{ \}, if implemented at all, is not needed here.
Code:
find . -type f -regex '.*[0-9]\..*'

Traditional Unix find needs
Code:
find . -type f -name '*[0-9].*'
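To make the implicit-anchor point concrete, here is a minimal sketch. It assumes a find that supports -regex at all (GNU or BSD; it is not in POSIX), and the scratch directory and file names are made up for illustration. The pattern avoids braces on purpose, so the regex flavour should not matter here.

```shell
# Work in a throwaway directory so no real files are involved.
dir=$(mktemp -d)
cd "$dir" || exit 1
touch Himalaya-2.jpg notes.txt

# Bare core pattern: matches nothing, because -regex must match the
# whole path ("./Himalaya-2.jpg"), not just a part of it.
partial=$(find . -type f -regex '[0-9]\.')

# Same core wrapped in .* on both sides: now the whole path can match.
whole=$(find . -type f -regex '.*[0-9]\..*')

printf 'partial: [%s]\nwhole:   [%s]\n' "$partial" "$whole"
```

The first search comes back empty and the second lists the one file with a digit before a dot.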

# 4  
Old 09-18-2014
Neither of the above is working.

Code:
find . -type f -regex '.*[0-9]\{1,3\}}\..*'

This escapes the braces and thus makes them part of the name. I want to use them to specify how many digits should be in the file name, and you have omitted the - sign (that might be a mistake, though).
I.e. it will match:
Code:
Myfile4{1,3}.txt

But not
Code:
Myfile-4.txt

----
Code:
find . -type f -regex '.*[0-9]\..*'

This has also left out the - sign, but again, that might just be a mistake. This regexp does find
Code:
Camping 2014-1.txt

but it will not find
Code:
Himalaya-22.txt

I want it to find 1 or more digits after the - sign.

I want to match anything-1to3digits.anything
The - sign has to be there, after the - sign there have to be 1-3 digits, then a . and after the . anything.
I just cannot get it to work.
# 5  
Old 09-18-2014
The -regex option, with a find that supports it, uses BRE (basic regular expressions), which means that the curly braces need to be escaped with a backslash; otherwise they are literal braces.

There was one closing brace too many in my example (one escaped, one unescaped). I have corrected it in my post.

But that will match any occurrence of 1 or more digits before the dot. To match only 1-3 digits there needs to be another "anchor" before the digits:

Try:
Code:
find . -type f -regex '.*-[0-9]\{1,3\}\..*'

to specify that the number is preceded by a dash, or
Code:
find . -type f -regex '.*[^0-9][0-9]\{1,3\}\..*'

to specify that the number is preceded by a character other than a digit.
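The difference between the two anchors can be checked without touching any files. This is only a sketch: it runs the same BRE patterns through grep -x, which requires the whole line to match (like find's implicit anchors), and the list of names is invented for the demo.

```shell
# Hypothetical paths, one per line, as find would print them.
names='./Scooby-Doo-1.txt
./Himalaya-22.jpg
./a-1234.txt
./Myfile4.txt'

# Digits must follow a dash: rejects a-1234.txt (four digits) and Myfile4.txt (no dash).
dash_matches=$(printf '%s\n' "$names" | grep -x '.*-[0-9]\{1,3\}\..*')

# Digits must follow any non-digit: still rejects a-1234.txt, but accepts Myfile4.txt.
nondigit_matches=$(printf '%s\n' "$names" | grep -x '.*[^0-9][0-9]\{1,3\}\..*')

printf 'dash:\n%s\n\nnon-digit:\n%s\n' "$dash_matches" "$nondigit_matches"
```

So the dash version is the stricter of the two; which one is right depends on whether names like Myfile4.txt count as duplicates.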

Last edited by Scrutinizer; 09-20-2014 at 05:36 AM..
# 6  
Old 09-18-2014
I apologize, Scrutinizer, I didn't know that.
But it still doesn't work?!?
Code:
find . -type f -regex '.*-[0-9]\{1,3\}\..*'

gives me zero results. I am running this in /bin/bash under Cygwin.
I also tried the exact same command in /bin/bash under Ubuntu, same thing - doesn't work.
Tried tcsh under Ubuntu, same thing - doesn't work.
CSH same thing - doesn't work.

I apologize for my rude answer since I really thought the \{ was the culprit and made the expression fail. But it has to be something else.

I have also tried to swap out the 1,3 to 1..3 and 1.3 to check the perl notation inside {} but that doesn't work either.

Frustration!
# 7  
Old 09-18-2014
Strange, I get :
Code:
$ touch Scooby-Doo-1.txt Himalaya-2.jpg Camping 2009-12-12-33.jpg a-1234.txt
$ find . -type f -regex '.*-[0-9]\{1,3\}\..*'
./2009-12-12-33.jpg
./Himalaya-2.jpg
./Scooby-Doo-1.txt
$

Hold on, the above is on OSX with BSD find.

On Linux it does not work (just tested it)...
OK, on Linux it is GNU find, and there the default regex type is emacs.

EDIT: this seems to work on Linux:
Code:
find . -type f -regextype posix-basic -regex '.*-[0-9]\{1,3\}\..*'

Code:
find . -type f -regextype posix-extended -regex '.*-[0-9]{1,3}\..*'

Apparently the default "emacs" syntax does not support the brace repetition operator, so it is probably modeled after an old version of emacs.
So the -regextype option is required to change from the default regex type.

I personally never use regex with find, since it is not standardized.
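Since -regextype is a GNU extension, a script that has to run on both kinds of system could probe for it first. The following is only a sketch under a few assumptions: the scratch directory and file names are invented, the probe relies on -maxdepth (supported by GNU and BSD find, but not POSIX), and the grep -E fallback assumes file names without embedded newlines.

```shell
# Throwaway directory with a few sample names.
dir=$(mktemp -d)
cd "$dir" || exit 1
touch Scooby-Doo-1.txt Himalaya-22.jpg a-1234.txt plain.txt

# POSIX ERE: braces are unescaped.
pattern='.*-[0-9]{1,3}\..*'

# Probe: does this find accept -regextype posix-extended?
if find . -maxdepth 0 -regextype posix-extended -regex '.*' >/dev/null 2>&1; then
    found=$(find . -type f -regextype posix-extended -regex "$pattern" | sort)
else
    # Fallback: let find list everything and filter whole lines with grep -E.
    found=$(find . -type f | grep -Ex -- "$pattern" | sort)
fi

printf '%s\n' "$found"
```

Either branch should list the same two files; a-1234.txt is skipped because four digits follow the dash.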


--
EDIT: With the emacs default of GNU find this works:
Code:
find . -type f -regex '.*-[0-9][0-9]?[0-9]?\..*'

But that does not work with BSD find (unless the -E extended regex option is specified).
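If regex support is the sticking point, the same "dash, 1 to 3 digits, dot" rule can also be spelled out with plain -name globs, which every find understands. This is a sketch with made-up file names, not something tested in this thread. Note that -name looks only at the file name, not the directory part, and that plain rm is used here instead of rm -v. Previewing before deleting is the safer habit.

```shell
# Throwaway directory; one name contains a space on purpose.
dir=$(mktemp -d)
cd "$dir" || exit 1
touch Scooby-Doo-1.txt Himalaya-22.jpg 'Camping 2009-12-12-333.jpg' \
      Archive-1234.txt Scooby-Doo.txt

# One glob per allowed digit count (1, 2 or 3) after the dash,
# stored in the positional parameters so it can be reused.
set -- \( -name '*-[0-9].*' -o -name '*-[0-9][0-9].*' -o -name '*-[0-9][0-9][0-9].*' \)

# Step 1: preview what would be removed.
preview=$(find . -type f "$@" | sort)
printf 'would remove:\n%s\n' "$preview"

# Step 2: delete. -exec ... {} + hands the names to rm intact, spaces included.
find . -type f "$@" -exec rm {} +

remaining=$(find . -type f | sort)
printf 'left over:\n%s\n' "$remaining"
```

Using -exec instead of piping into xargs avoids trouble with names like "Camping 2009-12-12-333.jpg".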

Last edited by Scrutinizer; 09-18-2014 at 11:47 AM..