Finding duplicates then copying, almost there, maybe?


 
# 1  
Old 12-16-2011

Hi everyone. I'm trying to help my wife with a project. She exported 200 images from many different folders, but there was a problem with the export, and I need to find the master versions so that she doesn't have to go through and select them again.

I need to:

For each image in a folder (/folderA), search elsewhere (/folderB) for a copy of the file. When you find it, copy that file to a location (/folderC).

In the end, /folderC should contain the same file names as /folderA, but with the data from the matching files found in /folderB.

Here's my most recent attempt; I'm stumped:

Code:
find /folderA -name DSC\* -exec find /folderB -name \*{} \; -exec cp {} /folderC \;

# 2  
Old 12-16-2011
Primitive, but works.

bash code:
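#!/bin/bash

# Compare every file name in folderA with every file name in folderB
# (top level only) and copy the folderB version of each match to folderC.
for a in /path/folderA/*
do
    [ ! -f "$a" ] && continue          # skip anything that is not a regular file
    x=${a##*/}                         # bare file name without the directory
    for b in /path/folderB/*
    do
        y=${b##*/}
        [ "$x" == "$y" ] && cp "$b" /path/folderC/
    done
done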
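
For what it's worth, with GNU find the one-liner in the first post fails because {} in both -exec clauses stands for the path found under /folderA. The inner find is therefore asked for a name containing slashes, which -name never matches, and cp then copies the /folderA file itself. Below is a minimal sketch of a find-based variant that also searches subdirectories of folderB. The function name copy_masters and its three arguments are made up for illustration, and it assumes file names without newlines or glob characters (*, ?, [).

```shell
#!/bin/sh
# copy_masters A B C (hypothetical helper): for every DSC* file under A,
# look anywhere under B for a file with the same name and copy it to C.
# Assumption: file names contain no newlines or glob characters (* ? [).
copy_masters() {
    dir_a=$1
    dir_b=$2
    dir_c=$3
    find "$dir_a" -type f -name 'DSC*' | while IFS= read -r path
    do
        name=${path##*/}              # strip the directory part
        # One search of B per exported file; every hit is copied to C.
        find "$dir_b" -type f -name "$name" -exec cp {} "$dir_c"/ \;
    done
}

# Usage with the paths from the original question:
# copy_masters /folderA /folderB /folderC
```

This runs one find over folderB per exported image, which is slow for large trees but should be fine for 200 files. If the same name occurs more than once under folderB, the later copy overwrites the earlier one.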
# 3  
Old 12-16-2011
You didn't mention whether there are subdirectories under either foldera or folderb, so I assumed there are. I also assumed that a file's location under foldera (e.g. foldera/foo/bar/DSC1) need not be mirrored under folderb, i.e. the copy could sit somewhere other than folderb/foo/bar.

From that, two finds generate lists of files under foldera and folderb. The awk then finds matches and prints the copy commands to stdout. If the copy commands look right, then you can pipe them to ksh/bash to actually copy the files to folderc. All files are placed into folderc without trying to mimic any path from the source directory.

Code:
#!/usr/bin/env ksh

(
    find foldera -name "DSC*"
    echo "==="
    find folderb -name "DSC*"
) | awk '
    BEGIN { src = "a"; }
    /===/ { src = "b";  next }
    {
        n = split( $0, tok, "/" );      # $0 rather than $1, so paths containing spaces stay whole
        if( src == "a" )
            a[tok[n]]  = $0;            # save path
        else
            b[tok[n]]  = $0;
    }
    END {
        for( f in a )
            if( b[f] != "" )        # file from a is also somewhere in b
                printf( "cp \"%s\" /folderc/\n", b[f] );
    }
'  # | ksh     # remove first hash to pipe the commands to ksh and execute them

---------- Post updated at 00:45 ---------- Previous update was at 00:28 ----------

Same idea with slightly cleaner code, but it can misclassify files if a subdirectory under folderb is itself named foldera:

Code:
#!/usr/bin/env ksh
    find foldera folderb -name "DSC*" | awk '
    {
        n = split( $0, tok, "/" );      # $0 rather than $1, so paths containing spaces stay whole
        if( index( $0, "foldera/" ) )
            a[tok[n]]  = $0;            # save path
        else
            b[tok[n]]  = $0;
    }
    END {
        for( f in a )
            if( b[f] != "" )        # file from a is also somewhere in b
                printf( "cp \"%s\" /folderc/\n", b[f] );
    }
'  # | ksh
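
One way to sidestep the caveat about a foldera directory below folderb is to treat a path as belonging to the first tree only when it starts with that tree's name. The sketch below is a hypothetical variation, not the code posted above: the function name list_copy_commands and its arguments are assumptions, and it assumes that neither directory lies inside the other and that paths contain no newlines, backslashes or double quotes.

```shell
#!/bin/sh
# list_copy_commands A B C (hypothetical helper): print one cp command for
# each DSC* file name that occurs under both A and B. A path counts as
# belonging to A only when it starts with "A/", so a directory that happens
# to be named like A somewhere below B is still treated as part of B.
list_copy_commands() {
    find "$1" "$2" -type f -name 'DSC*' | awk -v a="$1/" -v c="$3" '
    {
        n = split($0, tok, "/")
        if (index($0, a) == 1)
            in_a[tok[n]] = 1          # name seen under A
        else
            in_b[tok[n]] = $0         # remember the full path under B
    }
    END {
        for (f in in_a)
            if (f in in_b)
                printf("cp \"%s\" \"%s/\"\n", in_b[f], c)
    }'
}

# Usage: inspect the output first, then pipe it to sh to run the copies.
# list_copy_commands foldera folderb /folderc
# list_copy_commands foldera folderb /folderc | sh
```

As with the scripts above, this only prints the cp commands, so nothing is copied until the output is piped to a shell.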


Last edited by agama; 12-16-2011 at 01:32 AM.. Reason: clarification
 