UNIX for Dummies Questions & Answers: Finding duplicates then copying, almost there, maybe?
Post 302582430 by agama on Friday 16th of December 2011 12:45:43 AM
You didn't say whether there are subdirectories under foldera or folderb, so I assumed there are. I also assumed that a file's path under foldera (e.g. foldera/foo/bar/DSC1) need not match its path under folderb; that is, the matching file could sit somewhere other than foo/bar under folderb.

Given those assumptions, two find commands list the files under foldera and folderb, separated by a marker line. The awk script then matches files by base name and, for each name that appears in both trees, prints a cp command for the folderb copy to stdout. If the copy commands look right, pipe them to ksh or bash to actually copy the files into folderc. All files are placed directly in folderc; no attempt is made to mimic the path from the source directory.

Code:
#!/usr/bin/env ksh

(
    find foldera -name "DSC*"
    echo "==="
    find folderb -name "DSC*"
) | awk '
    BEGIN { src = "a"; }
    $0 == "===" { src = "b";  next }    # marker line: switch to the folderb list
    {
        n = split( $0, tok, "/" );      # $0, not $1, so paths with spaces survive
        if( src == "a" )
            a[tok[n]]  = $0;            # save path, keyed by base name
        else
            b[tok[n]]  = $0;
    }
    END {
        for( f in a )
            if( b[f] != "" )        # file from a is also somewhere in b
                printf( "cp \"%s\" /folderc/\n", b[f] );
    }
'  # | ksh     # remove first hash to execute the commands
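
To try the approach before pointing it at real data, a scratch-directory run along these lines may help. This is only a sketch: the directory layout and the file names DSC1, DSC2 and DSC3 are made up for illustration, mktemp -d is assumed to be available, and the cp target is a relative folderc/ (not /folderc/) so that everything stays inside the scratch directory.

```shell
#!/bin/sh
# Hypothetical scratch layout; none of these paths come from the original thread.
work=$(mktemp -d) || exit 1
cd "$work" || exit 1
mkdir -p foldera/foo/bar folderb/other/place folderc
touch foldera/foo/bar/DSC1 foldera/DSC2 folderb/other/place/DSC1 folderb/DSC3

# Same matching logic as the script above, with the generated
# commands written to a file so they can be reviewed first.
(
    find foldera -name "DSC*"
    echo "==="
    find folderb -name "DSC*"
) | awk '
    BEGIN { src = "a" }
    $0 == "===" { src = "b"; next }
    {
        n = split( $0, tok, "/" )
        if( src == "a" ) a[tok[n]] = $0
        else             b[tok[n]] = $0
    }
    END {
        for( f in a )
            if( b[f] != "" )
                printf( "cp \"%s\" folderc/\n", b[f] )
    }
' > copy_cmds.txt

cat copy_cmds.txt        # review the generated commands
sh copy_cmds.txt         # then run them
ls folderc               # DSC1 is the only name present in both trees
```

Writing the commands to a file instead of piping them straight into a shell keeps the review step and the execution step separate, which is the same idea as leaving the final pipe commented out.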

---------- Post updated at 00:45 ---------- Previous update was at 00:28 ----------

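Same idea with slightly cleaner code, but there is room for error: a file can be misclassified if a subdirectory under folderb is itself named foldera, because the test only checks whether the path contains "foldera/":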

Code:
#!/usr/bin/env ksh
find foldera folderb -name "DSC*" | awk '
    {
        n = split( $0, tok, "/" );      # $0, not $1, so paths with spaces survive
        if( index( $0, "foldera/" ) )   # true if the path contains foldera/ anywhere
            a[tok[n]]  = $0;            # save path, keyed by base name
        else
            b[tok[n]]  = $0;
    }
    END {
        for( f in a )
            if( b[f] != "" )        # file from a is also somewhere in b
                printf( "cp \"%s\" /folderc/\n", b[f] );
    }
'  # | ksh
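
One possible way to narrow that room for error is to test whether the path begins with foldera/ instead of merely containing it. The sketch below assumes find is invoked with the relative names foldera and folderb exactly as above, so every path from the first tree starts with foldera/. The function name match_dupes and the sample paths fed in by printf are invented for illustration.

```shell
# match_dupes (hypothetical name): reads paths on stdin and prints a cp
# command for each base name found under both foldera and elsewhere.
match_dupes() {
    awk '
    {
        n = split( $0, tok, "/" )
        if( index( $0, "foldera/" ) == 1 )   # path begins with foldera/
            a[tok[n]] = $0
        else
            b[tok[n]] = $0
    }
    END {
        for( f in a )
            if( b[f] != "" )
                printf( "cp \"%s\" /folderc/\n", b[f] )
    }
    '
}

# Invented sample paths standing in for the output of find.
printf '%s\n' \
    "foldera/x/DSC1" \
    "folderb/foldera/DSC1" \
    "folderb/y/DSC2" | match_dupes
```

With the unanchored test, folderb/foldera/DSC1 would be filed under a and no match would be reported for DSC1; with the anchored test it is treated as a folderb file and matched against foldera/x/DSC1.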


Last edited by agama; 12-16-2011 at 01:32 AM.. Reason: clarification
 
