Quote: Originally Posted by pc2001
Hi Don,
The 'Barcode' column of the worksheet contains the name of the files to be copied. These ids are unique. For example, one id could be 'A123'. The file itself may be called 'xyz_A123.jpg'. Since there could be multiple copies of the same file scattered over different directories, I use the cut command to get the filepath of the first (of these identical) files.
The find statement precedes the copy statement, so there is no worry that it will find 'xyz_A123.jpg' in the test/newdata directory. Does that answer the question? Or have I misunderstood?
Is there a better way of doing it?
thanks!
I apologize for taking so long to get back to you.
In the first message in this thread, you said that you had a file in which the 2nd column of each line named the files to be processed. In the 3rd message in this thread, we found out that you had file name fragments (not full filenames) in the 1st column (not the 2nd column) of that file.
We don't know what happens after a file is copied into test/newdata and we don't know how often this script is run. But if you have file name fragments like A123, and you could have filenames such as xyz_A1230.jpg, xyz_A1231.jpg, abc_A123.jpg, etc. that match the pattern you create from the file name fragments in your input file, then find can match a file under test/newdata, any of several files that have already been copied, or any one of multiple files that match the pattern, in addition to the file (or files) you intended to match (and the order in which the returned pathnames are presented may seem random to your script).
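To make that concrete, here is a small sketch you can run in a scratch directory. I don't know the exact pattern your script builds, so the loose pattern "*${id}*", the anchored pattern "*_${id}.jpg", and the directory names src/a and src/b are all assumptions of mine, not something taken from your script:

```shell
# Build a throwaway file hierarchy (hypothetical names, for illustration only).
demo=$(mktemp -d)
mkdir -p "$demo/src/a" "$demo/src/b" "$demo/test/newdata"
touch "$demo/src/a/xyz_A123.jpg" "$demo/src/b/xyz_A1230.jpg" \
      "$demo/src/b/abc_A123.jpg" "$demo/test/newdata/xyz_A123.jpg"

id=A123

# Assumed loose pattern: matches every filename containing the fragment,
# including xyz_A1230.jpg and the copy under test/newdata.
loose_count=$(( $(find "$demo" -type f -name "*${id}*" | wc -l) ))

# Assumed anchored pattern: requires "_", the id, then ".jpg" at the end.
# This drops xyz_A1230.jpg but still matches abc_A123.jpg.
anchored_count=$(( $(find "$demo" -type f -name "*_${id}.jpg" | wc -l) ))

echo "loose: $loose_count, anchored: $anchored_count"
```

With this layout the loose pattern finds four files and the anchored one finds three, so anchoring helps with names like xyz_A1230.jpg but cannot tell xyz_A123.jpg from abc_A123.jpg. Whether that matters depends on your real filenames.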
With the sketchy details you have provided, I have no idea whether or not this is a real problem or just a possible problem depending on the names of files that have been processed and the names of files that you may process at some point in the future.
If you run the script multiple times with the same input data and the same file hierarchy, the second time you run the script, it could easily match a file under test/newdata. Since the file would already have been copied in that case, I assume it wouldn't matter, but, again, I can only guess at what the side effects might be.
If you want to keep find from returning any files under test/newdata, that is fairly easy to do. Only you know enough about what you're doing to know if that is important or not. Only you know enough about the filenames being processed to know whether or not the patterns you're using to look for files will always only find duplicates of the file you want to match; or if a search could find several different files for different images from a single pattern.
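One way to keep find out of test/newdata is the -prune primary. The following is only a sketch under my own assumptions: that the search starts from the directory containing test/newdata, that the pattern is "*${id}*", and that the source file lives in a directory I've called src:

```shell
# Throwaway hierarchy with one original and one already-copied file
# (hypothetical names, for illustration only).
demo=$(mktemp -d)
mkdir -p "$demo/src" "$demo/test/newdata"
touch "$demo/src/xyz_A123.jpg" "$demo/test/newdata/xyz_A123.jpg"
cd "$demo" || exit 1

id=A123

# -path ./test/newdata -prune tells find not to descend into that directory.
# The -o branch handles everything else; -print is given explicitly so the
# pruned directory itself is not printed.
matches=$(find . -path ./test/newdata -prune -o -type f -name "*${id}*" -print)

echo "$matches"
```

Here only ./src/xyz_A123.jpg is reported, even though a file with the same name exists under test/newdata. Note that the -path operand has to match the pathname as find builds it from the starting point, so if you start the search from somewhere other than ".", the pattern needs to change accordingly.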