Find a list of files in directory, move to new, allow duplicates
Post 302912175 by bakunin on Wednesday 6th of August 2014 11:20:05 AM
Quote:
Originally Posted by Clyde Lovett
You guys are great - thank you!

I hear you both re: the pre-planning, but actually I'm like the janitor here - housekeeping some shared directories where product photo is put into Dropbox by multiple users / employees etc. Images are placed in folders based on Vendor and sale name and then shared via the cloud. So these images could be buckshot all over the place & duplicated x-number of times as they might be needed in one folder for one particular sale and then some duplication to cover another sale with a different vendor and so forth. We have thousands of product SKUs and on this occasion these 300 are now sold out and we will not restock them so the task is to remove any photos that have the SKU number in the file name ... hence my conundrum and visit to this forum - which I am glad to have discovered!

I spend my day as the staff photographer but because I have some aptitude for computers I get the IT hat thrown at me on a regular basis. Hopefully this sheds some light on things :)
If this is any help: you have my pity. ;-))


Quote:
Originally Posted by Clyde Lovett
So a couple of things
1. I don't mind waiting for it to run - I can even run something overnight if it's going to take a very long time.
OK, but in this case error reporting is a MUST. Suppose there are some 1000 files to move. While the script works on the 457th of them, something goes wrong. How would you find out? And how would you correct that? Or, another scenario: the 457th and every following move fails (because, say, the disk is full). How are you going to correct that?

As general advice: it doesn't matter that an automated procedure fails from time to time, but it should do so with a traceable, intelligible error message. Compare "could not move /source/fileA to /target/B because disk is full. Aborting..." to "error in line 153. Exit." and ask yourself which one better supports a later attempt at fixing the problem.

If you do not want to program complex reporting features into your program you might want to sit and watch it run so that you can react immediately.
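
A minimal sketch of what such reporting could look like, assuming a POSIX-style shell. The function name "move_or_report" and the log file name are made up for illustration and are not part of the script discussed in this post:

```shell
# Hypothetical helper: move one file and report a readable error on failure.
# LOGFILE is an assumed name; point it wherever you like.
LOGFILE="${LOGFILE:-./move_errors.log}"

move_or_report () {
     # $1 = source file, $2 = target (file or directory)
     if mv -- "$1" "$2" 2>>"$LOGFILE" ; then
          return 0
     fi
     # mv's own message (e.g. "No space left on device") is already in the log
     echo "could not move $1 to $2 (see $LOGFILE for the reason)" >&2
     echo "FAILED: $1 -> $2" >>"$LOGFILE"
     return 1
}
```

Calling it as move_or_report "$MOVEFILE" "/path/to/targetdir" || exit 1 would abort at the first failure, so the last line of the log names the file where things went wrong.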

Quote:
Originally Posted by Clyde Lovett
2. As the script finds and moves files, it would be great if there is no over-write of files with duplicate filenames, but rather if a duplicate filename is found that it just append the duplicate filename with something like copy, copy1, copy2, etc.
My sketch of a script attempts that (see commented version below). I was under the impression that every filename can only be there once per run of the script, therefore only provisions for one additional copy per run are in place. It should be trivial to add additional code to cover for that.


Quote:
Originally Posted by Clyde Lovett
a. what identifies the file that is the list of my "SKUs" (the search criteria)?
My script expects a file "/path/to/list.of.filemasks" (see last line) to contain search criteria, one per line. An asterisk is prepended and appended to every criterion to create a wildcard expression. Possible content might look like:

Code:
foo.jpg
bar.gif
baz

which would process all files matching "*foo.jpg*", then "*bar.gif*", then "*baz*", etc.

Quote:
Originally Posted by Clyde Lovett
b. is my file with the list of SKUs a text file, one SKU per line? (probably yes)
As said above, yes.

Quote:
Originally Posted by Clyde Lovett
c. in what way to I save this code and run it - I gather this is a script - what do I do with it (now I'm really showing my "dummy" status on this, but I learn fast so hang in there with me!).
First, you copy it and save it as a simple text file. The name and extension do not matter, take whatever you like. I suggest you explicitly state which shell is to execute it, so add a line like this as the first line:

Code:
#! /path/to/some/shell

If you do not know which shell to use: issue "echo $SHELL" at the command line and take its output. Here is an example from one of my systems, yours might look different:

Code:
$ echo $SHELL
/usr/bin/ksh

$ cat template.ksh 
#! /usr/bin/ksh
# ----------------------------------------------------------------------
# template.ksh                               template for ksh scripts/functions
# ----------------------------------------------------------------------
...

Notice that everything after "#" is treated as a comment, but the first line (also called "shebang") has to be exactly as it is, i.e. inserting a space before "#!" would make it stop working.

Now you need to make this file executable. Execute:

Code:
chmod 754 /your/filename

This sets read, write and execute rights for you (7), read and execute for members of your group (5) and read only for all other users (4). After this you can execute the file. Notice, though, that the current directory is NOT automatically in the path, unlike in Windoze. To execute a file in your current directory issue "./filename", not "filename".
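
The three steps (save, make executable, run) can be tried out safely in a scratch directory. This is only a sketch: the file name "myscript" is arbitrary, and it assumes that "mktemp" is available and that /bin/sh exists on your system:

```shell
# Work in a scratch directory so nothing real is touched.
WORKDIR=$(mktemp -d)
cd "$WORKDIR" || exit 1

# Save a two-line script; the shebang has to be the very first line.
printf '%s\n' '#! /bin/sh' 'echo "hello from my script"' > myscript

chmod 754 myscript      # rwx for you, r-x for your group, r-- for others
ls -l myscript          # the permission column should start with -rwxr-xr--
./myscript              # note the leading "./"
```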

Here is a commented version of my script. I have put echo statements in place of the processing parts so that you can try it out and see the inner workings:

Code:
while read -r FILEMASK ; do          # -r: take backslashes literally
     echo "$FILEMASK"                # quoted, so the shell does not expand wildcards
done < /path/to/list.of.filemasks

Read the file /path/to/list.of.filemasks and put each line's content into the variable FILEMASK.


Code:
while read -r FILEMASK ; do
     find /path/to/sourcedir -type f -name "*${FILEMASK}*" |\
          while read -r MOVEFILE ; do
               echo "$MOVEFILE"      # quoted, so names with spaces stay intact
          done
done < /path/to/list.of.filemasks

"find" searches a complete directory hierarchy and produces a list of filenames. As you see, it filters for "*FILEMASK*". This is where the content of your list of filemasks comes into play. Every filename found this way is fed to another while-loop and read into a variable "MOVEFILE". The content of this might be "/path/to/sourcedir/sub1/foo.FILEMASK.bar". If the wildcard delivers false positives then tinker with the argument to "-name". Instead of "*${FILEMASK}*" you might want to try "*${FILEMASK}" (this will find "foo.FILEMASK" but not "FILEMASK.bar"), etc.
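
To see how the position of the asterisks changes the result, here is a small sketch you can run as is. The mask "1234" and the three file names are invented for the demonstration, and "mktemp" is assumed to be available:

```shell
# Scratch directory with three made-up file names around the mask "1234".
DEMO=$(mktemp -d)
mkdir "$DEMO/sub1"
touch "$DEMO/sub1/foo.1234" "$DEMO/sub1/1234.bar" "$DEMO/sub1/foo.1234.bar"

FILEMASK="1234"

echo "pattern *MASK* (mask anywhere in the name):"
find "$DEMO" -type f -name "*${FILEMASK}*" | sort

echo "pattern *MASK (name ends with the mask):"
find "$DEMO" -type f -name "*${FILEMASK}" | sort

echo "pattern MASK* (name starts with the mask):"
find "$DEMO" -type f -name "${FILEMASK}*" | sort
```

The first pattern lists all three files, the second only "foo.1234" and the third only "1234.bar".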

Code:
while read -r FILEMASK ; do
     find /path/to/sourcedir -type f -name "*${FILEMASK}*" |\
          while read -r MOVEFILE ; do
               echo "BEFORE: $MOVEFILE"
               FNAME="${MOVEFILE##*/}"      # cut off everything up to the last "/"
               echo "AFTER: $FNAME"
          done
done < /path/to/list.of.filemasks

This part just strips all the path information from the filename and assigns the stripped part to a variable "FNAME", like this:

MOVEFILE: "/path/to/sourcedir/sub1/some.FILEMASK.bla"
FNAME: "some.FILEMASK.bla"
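
The expansion can be tried on its own at the command line. In this sketch the variable "DNAME" is an addition for illustration only (the script above does not use it), and the external commands "basename" and "dirname" are shown just for comparison:

```shell
MOVEFILE="/path/to/sourcedir/sub1/some.FILEMASK.bla"

FNAME="${MOVEFILE##*/}"     # delete the longest match of "*/" from the front
DNAME="${MOVEFILE%/*}"      # delete the shortest match of "/*" from the end

echo "FNAME: $FNAME"        # FNAME: some.FILEMASK.bla
echo "DNAME: $DNAME"        # DNAME: /path/to/sourcedir/sub1

# The external commands give the same result here, at the cost of one process each.
echo "basename: $(basename "$MOVEFILE")"
echo "dirname:  $(dirname "$MOVEFILE")"
```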

Code:
               if [ -f "/path/to/targetdir/$FNAME" ] ; then
                    mv "$MOVEFILE" "/path/to/targetdir/${FNAME}.$$"
               else
                    mv "$MOVEFILE" "/path/to/targetdir"
               fi

This innermost part checks if the filename already exists at the prospective target place. If yes, the file is moved to a name with the current process ID ("$$") appended, else (if no such target exists), the original name is used.

To cover for multiple instances replace this part with the following:

Code:
               if [ -f "/path/to/targetdir/$FNAME" ] ; then
                    (( IDX = 1 ))
                    while [ -f "/path/to/targetdir/${FNAME}.${IDX}" ] ; do
                         (( IDX += 1 ))
                    done
                    mv "$MOVEFILE" "/path/to/targetdir/${FNAME}.${IDX}"
               else
                    mv "$MOVEFILE" "/path/to/targetdir"
               fi

If a file doesn't exist it is simply moved (the else-part). If such a file already exists, a counter is initialized with "1" and incremented each time such a file is found. The "while [ -f ...]" tests for "file.1", then "file.2", etc., until it finds a name that is not taken already. This is then used to move the file.
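
Put together, the pieces might look like the following sketch. It is one possible assembly, not a finished tool: the function name "move_listed" and its three arguments are invented here, it skips empty lines in the list (an empty mask would turn into "**" and match every file), and it assumes the target directory lies outside the source directory:

```shell
# move_listed LIST SRC DST   (hypothetical wrapper around the pieces above)
move_listed () {
     LIST="$1" ; SRC="$2" ; DST="$3"
     while IFS= read -r FILEMASK ; do
          [ -n "$FILEMASK" ] || continue          # skip empty lines in the list
          find "$SRC" -type f -name "*${FILEMASK}*" |
          while IFS= read -r MOVEFILE ; do
               FNAME="${MOVEFILE##*/}"
               TARGET="$DST/$FNAME"
               IDX=1
               while [ -e "$TARGET" ] ; do         # name taken: try .1, .2, ...
                    TARGET="$DST/${FNAME}.${IDX}"
                    IDX=$(( IDX + 1 ))
               done
               mv -- "$MOVEFILE" "$TARGET" ||
                    echo "could not move $MOVEFILE to $TARGET" >&2
          done
     done < "$LIST"
}
```

It would be called as move_listed /path/to/list.of.filemasks /path/to/sourcedir /path/to/targetdir. Trying it on a copy of a few folders first is a cheap way to see whether the masks catch more than intended.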

I hope this helps.

bakunin

Last edited by bakunin; 08-06-2014 at 01:25 PM..
 
