Quote:
Originally Posted by Corona688
Quote:
Originally Posted by Don Cragun
No. Do not use -exec ... + in cases like this. If there are enough files to trigger an invocation of one of these -exec primaries before find has processed the entire file hierarchy, the list of files processed by each -exec primary is likely to have a different set of operands than the other -exec primaries.
And this is a problem why?
This is a problem because rm may remove a file before it is listed by ls and archived by tar.
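For comparison, here is a minimal sketch of the per-file form (-exec ... \; instead of -exec ... +). The scratch directory, file names and archive path are made up for illustration and are not the original poster's paths. With \; each primary is evaluated once per pathname, and a later -exec is evaluated only if the earlier one exited with status 0, so rm cannot get ahead of ls and tar:

```shell
#!/bin/sh
# Hypothetical scratch layout; every path below is illustrative only.
work=$(mktemp -d)
mkdir "$work/src"
for n in 1 2 3; do
    echo "data $n" > "$work/src/file$n.log"
done

# One invocation of each utility per file, strictly in this order.
# rm runs for a file only after ls and tar succeeded for that same file.
find "$work/src" -type f \
    -exec ls -latr {} \; \
    -exec tar -rf "$work/archive.tar" {} \; \
    -exec rm {} \;

# Count what ended up in the archive and what is left on disk.
archived=$(tar -tf "$work/archive.tar" | wc -l | tr -d ' ')
remaining=$(find "$work/src" -type f | wc -l | tr -d ' ')
echo "archived=$archived remaining=$remaining"
```

The cost is one process per file for each utility, which is slower than batching but keeps the ordering obvious.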
Quote:
Quote:
For example, the 1st invocation of ls might process 100 files, the 1st invocation of tar might process 95 files, and the 1st invocation of rm might process 105 files.
Doesn't seem to work that way, and I can't imagine why it would. Why wouldn't all three execs get the exact same files?
The -exec ... + primary gathers arguments for each invocation of the specified utility with the guarantee that the argument list used will not exceed the system's ARG_MAX limit. It does not pass a fixed number of operands to a utility when it is invoked. The argument list for rm contains only rm before the list of pathname operands, the argument list for ls includes the utility name and the options (ls -latr) before the pathname operands, and the argument list for tar is even longer (tar -rvf /directory/foroutput/archive.tar). So there is a chance that the number of pathnames given to tar will be less than the number given to ls, which may in turn be less than the number given to rm. Therefore, the first invocation of rm may remove one or more files before the second invocation of ls or tar has a chance to process them.
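If you want to see how your own find splits the work, something like the following sketch may help. It only counts pathnames per invocation and removes nothing. The directory, the file names, the 4000-file count and the padding argument are all invented for the test; whether you see several batches, and whether the two primaries get different batch sizes, depends on your find implementation and its ARG_MAX handling:

```shell
#!/bin/sh
# Hypothetical scratch directory filled with enough long names that
# some find implementations need more than one batch per primary.
work=$(mktemp -d)
n=0
while [ "$n" -lt 4000 ]; do
    : > "$work/a_fairly_long_file_name_used_to_fill_the_argument_list_$n"
    n=$((n + 1))
done

echo "ARG_MAX reported by the system: $(getconf ARG_MAX)"

# A long fixed argument (4000 characters) placed before {} in the second
# primary, standing in for the longer "tar -rvf archive" prefix.
pad=$(printf '%04000d' 0)

# Each helper prints how many pathnames that invocation received.
# The log lives outside "$work" so find does not pick it up.
find "$work" -type f \
    -exec sh -c 'echo "short $#"' sh {} + \
    -exec sh -c 'shift; echo "long $#"' sh "$pad" {} + > "$work.log"

cat "$work.log"

# Every pathname reaches both primaries, whatever the batch boundaries are.
short_total=$(awk '$1 == "short" { s += $2 } END { print s + 0 }' "$work.log")
long_total=$(awk '$1 == "long" { s += $2 } END { print s + 0 }' "$work.log")
echo "short_total=$short_total long_total=$long_total"
```

The totals always match the number of files; it is the per-invocation counts, and the order in which the lines appear, that are allowed to differ between the two primaries.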
Quote:
Quote:
If there aren't enough files in the file hierarchy being processed by find to trigger invocations of those three utilities until the entire file hierarchy has been traversed, all three utilities could be run in parallel again allowing rm to remove some or all of the files before they are listed and archived.
Does this actually happen?
find doesn't run things in parallel to my understanding.
I don't know whether the implementation of find on the original poster's system does this. The standards say this about -exec ... +:
Quote:
...
If the primary expression is punctuated by a <plus-sign>, the primary shall always
evaluate as true, and the pathnames for which the primary is evaluated shall be
aggregated into sets. The utility utility_name shall be invoked once for each set of
aggregated pathnames. Each invocation shall begin after the last pathname in the
set is aggregated, and shall be completed before the find utility exits and before the
first pathname in the next set (if any) is aggregated for this primary, but it is
otherwise unspecified whether the invocation occurs before, during, or after the
evaluations of other primaries. If any invocation returns a non-zero value as exit
status, the find utility shall return a non-zero exit status. An argument containing
only the two characters "{}" shall be replaced by the set of aggregated
pathnames, with each pathname passed as a separate argument to the invoked
utility in the same order that it was aggregated. The size of any set of two or more
pathnames shall be limited such that execution of the utility does not cause the
system's {ARG_MAX} limit to be exceeded. If more than one argument containing
the two characters "{}" is present, the behavior is unspecified.
...
The text marked in red above ("it is otherwise unspecified whether the invocation occurs before, during, or after the evaluations of other primaries") clearly allows the utilities in the three -exec primaries to be invoked in any order, and sequentially or in parallel, as long as each utility that needs to be invoked more than once completes processing earlier sets of pathnames for that -exec primary before it is invoked again to process a later set of pathnames for that -exec primary.
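One way to sidestep the unspecified ordering between primaries while still batching is to use a single -exec ... + primary that hands each set of pathnames to a small sh script. This is a different technique from anything proposed earlier in the thread, and the paths are again hypothetical. Within one invocation the shell runs ls, tar and rm in sequence on the same set of operands, and rm is skipped if either earlier step fails:

```shell
#!/bin/sh
# Hypothetical scratch layout; every path below is illustrative only.
work=$(mktemp -d)
mkdir "$work/src"
for n in 1 2 3 4 5; do
    echo "data $n" > "$work/src/file$n.log"
done

# A single primary, so there is no second primary to race against.
# $1 is the archive name; after the shift, "$@" is this batch of pathnames.
find "$work/src" -type f -exec sh -c '
    archive=$1
    shift
    ls -latr "$@" &&
    tar -rf "$archive" "$@" &&
    rm "$@"
' sh "$work/archive.tar" {} +

# Count what ended up in the archive and what is left on disk.
archived=$(tar -tf "$work/archive.tar" | wc -l | tr -d ' ')
remaining=$(find "$work/src" -type f | wc -l | tr -d ' ')
echo "archived=$archived remaining=$remaining"
```

The archive name is passed as an argument rather than pasted into the script text, so odd characters in the path cannot be reinterpreted by the inner shell.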