Deleting extra files with similar filenames


 
# 1  
Old 09-30-2009

Hello,

I have a large number of files under a root directory with several sub-directories, and many of these sub-directories contain near-duplicate files with similar names. I need to clean this up.

The filenames are of the format:

Code:
/path/to/dir/subdir/file name.dat
/path/to/dir/subdir/file name 1.dat

I want to keep only
Code:
/path/to/dir/filename.dat

and remove the other file. I have tried some tools including fslint, but it didn't work because the actual content of the files may vary slightly.

Help in creating a bash script or similar to weed out the unneeded files would be highly appreciated.

Thanks!
# 2  
Old 09-30-2009
If you have only single-level subdirectories, you could use
Code:
rm /path/to/dir/*/filename.dat

to delete all occurrences of "filename.dat" in the subdirectories immediately underneath /path/to/dir/.
Or, if you also want to remove similar files, you could first check what a wider pattern matches, e.g.:
Code:
ls /path/to/dir/*/file*name*.dat

If it lists exactly the files you need to remove, you can then run
Code:
rm /path/to/dir/*/file*name*.dat


Last edited by Scrutinizer; 09-30-2009 at 05:32 PM..
# 3  
Old 10-01-2009
Unfortunately it's not that straightforward.

For one, I don't know the file names beforehand. Also, the sub-directories are nested, up to three levels deep. There's a total of 30,000 files with an estimated 10,000 duplicates.

Let me try and clarify: I want to delete duplicate files. These duplicate files are in the same folder as the original and have a " 1" at the end of the filename. Some originals may also have the " 1" at the end of the filename, but duplicates for them don't exist.
# 4  
Old 10-01-2009
OK, that is quite a different requirement. Assuming the duplicates always have " 1" at the end, followed by the extension ".dat", you could try this ksh/bash code:

Code:
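BASEDIR='/path/to/dir'
ext='.dat'
duplext=' 1'
# Visit every directory under BASEDIR; IFS= and -r keep spaces and backslashes intact
find "$BASEDIR" -type d | while IFS= read -r dir; do
  # Look only at regular files directly inside this directory
  find "$dir" -maxdepth 1 -type f -name "*$ext" | while IFS= read -r file; do
    # "name.dat" -> "name 1.dat"
    duplicate="${file%$ext}$duplext$ext"
    if [[ -f "$duplicate" ]]; then
      rm "$duplicate"
    fi
  done
done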
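With some 10,000 deletions at stake, it may be worth doing a dry run first. The following is only a sketch under the same assumptions as the loop above (duplicates end in " 1" before ".dat" and sit next to their original). The function name list_duplicates is made up for illustration, and it relies on bash plus a find that supports -print0 (GNU and BSD find do). It prints the files that would be removed and deletes nothing:

```shell
#!/usr/bin/env bash
# Dry run: print every "NAME 1.dat" whose sibling "NAME.dat" exists.
# list_duplicates is a hypothetical helper name, not something from the thread.
list_duplicates() {
  local basedir=$1 ext=${2:-.dat} duplext=${3:- 1}
  local candidate original
  # -print0 with read -d '' keeps names containing spaces or newlines intact.
  find "$basedir" -type f -name "*$duplext$ext" -print0 |
    while IFS= read -r -d '' candidate; do
      # Strip the trailing " 1.dat" and put ".dat" back to get the original's name.
      original="${candidate%"$duplext$ext"}$ext"
      # A file ending in " 1.dat" without such a sibling is itself an original: skip it.
      if [[ -f "$original" ]]; then
        printf '%s\n' "$candidate"
      fi
    done
}

# Usage (prints only, deletes nothing):
#   list_duplicates /path/to/dir
```

Starting from the " 1" files and checking for the original is the mirror image of the loop above. It needs a single find pass over the tree, and a file that merely happens to end in " 1" is never listed unless its counterpart exists.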


Last edited by Scrutinizer; 10-01-2009 at 05:36 PM..