Finding files in directory with similar names


 
# 1  
Old 04-17-2014

So, I have a directory tree that has many files named thusly:

X_REVY.PDF

I need to find any files that share the same X portion (which can be nearly anything) with any other file (in any directory) but have a different Y portion (which can be any number from 1 to 99).

I then need it to return all of the lower-numbered duplicate revisions and not display the highest one (basically, we'll be moving all of the lower-numbered revs to a separate folder).

I cannot for the life of me figure out a good way to do this. Any help would be greatly appreciated!

Thanks for taking the time, guys! This is on a Linux system, so I have access to all of the normal stuff.
# 2  
Old 04-17-2014
Code:
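#!/bin/bash
# Requires bash 4+ for associative arrays. Works on the current directory only.

shopt -s nullglob       # an unmatched glob expands to nothing
declare -A max=()

# First pass: find the highest revision for each base name.
for f in *_REV*.PDF; do
        base=${f%_REV*}
        rev=${f##*_REV}
        rev=${rev%.PDF}
        [[ -n $base && $rev =~ ^[0-9]+$ ]] || continue  # skip names that do not fit X_REVY.PDF
        rev=$((10#$rev))        # force base 10 so 08 is not read as octal

        if [[ -z ${max[$base]} ]] || (( rev > ${max[$base]} )); then
                max["$base"]=$rev
        fi
done

# Second pass: print a mv command for every revision below the highest.
# Remove "echo" once the output looks right.
for prefix in "${!max[@]}"; do
        for f in "$prefix"_REV*.PDF; do
                [[ ${f%_REV*} == "$prefix" ]] || continue       # ignore longer base names
                rev=${f##*_REV}
                rev=${rev%.PDF}
                [[ $rev =~ ^[0-9]+$ ]] || continue
                if (( 10#$rev < ${max[$prefix]} )); then
                        echo mv "$f" /someplace/for/old/files
                fi
        done
done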

Code:
mute@thedoctor:~/temp/kamezero$ touch {ONE,TWO,THREE}_REV{1..5}.PDF
mute@thedoctor:~/temp/kamezero$ ./script
mv THREE_REV1.PDF /someplace/for/old/files
mv THREE_REV2.PDF /someplace/for/old/files
mv THREE_REV3.PDF /someplace/for/old/files
mv THREE_REV4.PDF /someplace/for/old/files
mv ONE_REV1.PDF /someplace/for/old/files
mv ONE_REV2.PDF /someplace/for/old/files
mv ONE_REV3.PDF /someplace/for/old/files
mv ONE_REV4.PDF /someplace/for/old/files
mv TWO_REV1.PDF /someplace/for/old/files
mv TWO_REV2.PDF /someplace/for/old/files
mv TWO_REV3.PDF /someplace/for/old/files
mv TWO_REV4.PDF /someplace/for/old/files
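
For anyone following along, the script leans on three parameter expansions to pull the X and Y parts out of each name. Here is a minimal sketch that runs them in isolation; `split_rev` is a hypothetical helper name used only for illustration, not something from the script above.

```shell
#!/bin/bash
# Hypothetical helper (not part of the thread's script): split X_REVY.PDF into X and Y.
split_rev() {
        local f=$1
        local base=${f%_REV*}   # remove the shortest trailing match of "_REV*", leaving X
        local rev=${f##*_REV}   # remove the longest leading match of "*_REV", leaving Y.PDF
        rev=${rev%.PDF}         # remove the ".PDF" suffix, leaving Y
        printf '%s %s\n' "$base" "$rev"
}

split_rev "THREE_REV12.PDF"     # prints: THREE 12
split_rev "A_REV_B_REV3.PDF"    # prints: A_REV_B 3
```

The second call shows why `%` (shortest match) and `##` (longest match) are paired: only the last `_REV` in the name is treated as the revision marker.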

# 3  
Old 04-17-2014
That's excellent, neutronscott. However, I'm looking through our files, and unfortunately it seems the engineers don't pay much attention to case when naming these. So, is there any way to make this case-insensitive, so that REV1.PDF, rev3.pdf, etc. would be seen as duplicates?

Again, thank you very much for your help.
# 4  
Old 04-17-2014
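OK, we'll need to lowercase the names before comparing them and also set the nocaseglob option so the globs match regardless of case. This script requires bash 4 for the ${f,,} expansion (but it kind of already did, because of the associative arrays).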

Code:
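#!/bin/bash
# Requires bash 4+ (associative arrays and the ${var,,} expansion).

shopt -s nocaseglob nullglob    # match case-insensitively; unmatched globs expand to nothing
declare -A max=()

# First pass: find the highest revision for each lowercased base name.
for f in *_rev*.pdf; do
        n=${f,,}                # lowercase copy, used only for parsing
        base=${n%_rev*}
        rev=${n##*_rev}
        rev=${rev%.pdf}
        [[ -n $base && $rev =~ ^[0-9]+$ ]] || continue  # skip names that do not fit X_REVY.PDF
        rev=$((10#$rev))        # force base 10 so 08 is not read as octal

        if [[ -z ${max[$base]} ]] || (( rev > ${max[$base]} )); then
                max["$base"]=$rev
        fi
done

# Second pass: print a mv command for every revision below the highest.
# Remove "echo" once the output looks right.
for prefix in "${!max[@]}"; do
        for f in "$prefix"_rev*.pdf; do
                n=${f,,}        # lowercase copy; $f keeps the real file name
                [[ ${n%_rev*} == "$prefix" ]] || continue       # ignore longer base names
                rev=${n##*_rev}
                rev=${rev%.pdf}
                [[ $rev =~ ^[0-9]+$ ]] || continue
                if (( 10#$rev < ${max[$prefix]} )); then
                        echo mv "$f" /someplace/for/old/files
                fi
        done
done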


Last edited by neutronscott; 04-17-2014 at 07:53 PM..
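
The scripts above only look at the current directory, while the original question mentions files spread over a directory tree ("in any directory"). Below is a rough sketch of how the same two-pass idea might be extended with find. It is not from the thread: the function name `list_old_revs` is hypothetical, and it assumes bash 4+ and a find that supports -iname and -print0 (GNU find does).

```shell
#!/bin/bash
# Sketch only: list superseded revisions anywhere under a directory tree.
# list_old_revs is a hypothetical name; it needs bash 4+ and find with -iname/-print0.
list_old_revs() {
        local root=${1:-.} path name base rev i
        local -A max=()
        local -a files=() bases=() revs=()

        # Pass 1: parse every candidate and remember the highest revision per base name.
        while IFS= read -r -d '' path; do
                name=${path##*/}        # strip the directory part
                name=${name,,}          # compare names case-insensitively
                base=${name%_rev*}
                rev=${name##*_rev}
                rev=${rev%.pdf}
                [[ -n $base && $rev =~ ^[0-9]+$ ]] || continue  # skip names that do not fit X_REVY.PDF
                rev=$((10#$rev))        # force base 10 so 08 is not read as octal
                files+=("$path"); bases+=("$base"); revs+=("$rev")
                if [[ -z ${max[$base]} ]] || (( rev > ${max[$base]} )); then
                        max["$base"]=$rev
                fi
        done < <(find "$root" -type f -iname '*_rev*.pdf' -print0)

        # Pass 2: print each file whose revision is below the highest for its base name.
        for i in "${!files[@]}"; do
                if (( ${revs[i]} < ${max[${bases[i]}]} )); then
                        printf '%s\n' "${files[i]}"
                fi
        done
}
```

Running `list_old_revs .` prints one path per line, so the result can be reviewed before anything is moved. Two caveats: the line-based output is ambiguous for file names that contain newlines, and moving files from different directories into a single folder can overwrite files that share a name.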