Chaining together exec within find


 
# 15  
Old 06-22-2017
Quote:
Originally Posted by Corona688
Quote:
Originally Posted by Don Cragun View Post
No. Do not use -exec ... + in cases like this. If there are enough files to trigger an invocation of one of these -exec primaries before find has processed the entire file hierarchy, the list of files processed by each -exec primary is likely to have a different set of operands than the other -exec primaries.
And this is a problem why?
This is a problem because rm may remove a file before it is listed by ls and archived by tar.
Quote:
Quote:
For example, the 1st invocation of ls might process 100 files, the 1st invocation of tar might process 95 files, and the 1st invocation of rm might process 105 files.
Doesn't seem to work that way, and I can't imagine why it would. Why wouldn't all three execs get the exact same files?
The -exec ... + primary gathers arguments for each invocation of the specified utility with the guarantee that the argument list used will not exceed the system's ARG_MAX limit. It does not pass a fixed number of operands to a utility when it is invoked. Since the argument list for rm includes just rm before the list of pathname operands, the argument list for ls includes the utility name and the options (ls -latr) before the pathname operands, and the argument list for tar is even longer (tar -rvf /directory/foroutput/archive.tar), there is a chance that the number of pathnames given to tar may be less than the number of pathnames given to ls, which may in turn be less than the number of pathnames given to rm. Therefore, the first invocation of rm may remove one or more files before the second invocation of ls or tar has a chance to process them.
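The effect of the command prefix on batch size can be simulated without creating enough files to reach ARG_MAX. The sketch below is only an illustration: it uses xargs -s 256 as a stand-in for a tiny argument-size limit, the 200 pathnames are made up, and echo stands in for the real utilities. Exact counts vary between xargs implementations, so none are shown.

```shell
#!/bin/sh
# Simulate a small argument-size limit with xargs -s (a stand-in for ARG_MAX)
# and compare the number of invocations for a short and a long command prefix.

# 200 hypothetical pathnames: file001 ... file200
names=$(i=1; while [ "$i" -le 200 ]; do printf 'file%03d\n' "$i"; i=$((i + 1)); done)

# Short prefix: just "echo" (standing in for "rm").
short_batches=$(printf '%s\n' "$names" |
    xargs -s 256 echo | wc -l | tr -d ' ')

# Long prefix: "echo" followed by the tar arguments used in this thread.
long_batches=$(printf '%s\n' "$names" |
    xargs -s 256 echo tar -rvf /directory/foroutput/archive.tar | wc -l | tr -d ' ')

# Each invocation of echo prints one line, so the line count is the batch count.
echo "short prefix: $short_batches invocations"
echo "long prefix:  $long_batches invocations"
```

The longer prefix leaves less room for pathnames per invocation, so the same 200 names are split into more, smaller sets. The same kind of split is what can leave the sets seen by ls, tar, and rm out of step with each other.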
Quote:
Quote:
If there aren't enough files in the file hierarchy being processed by find to trigger invocations of those three utilities until the entire file hierarchy has been traversed, all three utilities could be run in parallel, again allowing rm to remove some or all of the files before they are listed and archived.
Does this actually happen? find doesn't run things in parallel to my understanding.
I don't know whether the implementation of find on the original poster's system does this. The standards say this about -exec ... +:
Quote:
...

If the primary expression is punctuated by a <plus-sign>, the primary shall always
evaluate as true, and the pathnames for which the primary is evaluated shall be
aggregated into sets. The utility utility_name shall be invoked once for each set of
aggregated pathnames. Each invocation shall begin after the last pathname in the
set is aggregated, and shall be completed before the find utility exits and before the
first pathname in the next set (if any) is aggregated for this primary, but it is
otherwise unspecified whether the invocation occurs before, during, or after the
evaluations of other primaries.
If any invocation returns a non-zero value as exit
status, the find utility shall return a non-zero exit status. An argument containing
only the two characters "{}" shall be replaced by the set of aggregated
pathnames, with each pathname passed as a separate argument to the invoked
utility in the same order that it was aggregated. The size of any set of two or more
pathnames shall be limited such that execution of the utility does not cause the
system's {ARG_MAX} limit to be exceeded. If more than one argument containing
the two characters "{}" is present, the behavior is unspecified.

...
The clause "it is otherwise unspecified whether the invocation occurs before, during, or after the evaluations of other primaries" in the text quoted above clearly allows the three utilities in the three -exec primaries to be invoked in any order, and sequentially or in parallel, as long as each utility that needs to be invoked more than once finishes processing earlier sets of pathnames for its -exec primary before it is invoked again to process a later set of pathnames for that -exec primary.
# 16  
Old 06-22-2017
Probably it should collect them in parallel but execute them from left to right.
I have found different implementations of {} +, and some are buggy. I suspect that AIX find is buggy, too.
--
A method to run an 'embedded' shell script
Code:
find /directory/toscan -type f -exec bash -c '
ls -ltar "$@"
tar -rvf /directory/foroutput/archive.tar "$@"
rm "$@"
' bash {} +


Last edited by Don Cragun; 06-22-2017 at 05:50 PM.. Reason: Remove accidental edit.
# 17  
Old 06-22-2017
Quote:
Originally Posted by MadeInGermany
Probably it should collect them in parallel but execute them from left to right.
I have found different implementations of {} +, and some are buggy. I suspect that AIX find is buggy, too.
--
A method to run an 'embedded' shell script
Code:
find /directory/toscan -type f -exec bash -c '
ls -ltar "$@"
tar -rvf /directory/foroutput/archive.tar "$@"
rm "$@"
' bash {} +

I haven't seen any reports about UNIX-branded implementations (including AIX) of find behaving contrary to the requirements of the standards in the last decade where the given command-line met the requirements stated by the standards. But, old systems and systems that aren't branded (or tested for conformance) do still exist.

On systems where find does meet the standard's requirements, your suggestion above looks like it should work as long as the code marked in red is removed, noting of course that the list of files produced will not be sorted in its entirety if the list of pathnames to be processed is too long to just invoke bash once.

But, if a file can't be archived because tar can't read it, the file may still be removed even though it wasn't archived. If the original poster wants to keep files that couldn't be listed and archived, you would need something more like:
Code:
find /directory/toscan -type f -exec bash -c '
ls -ltr "$@" &&
tar -rvf /directory/foroutput/archive.tar "$@" &&
rm "$@"
' {} +

to keep sets of files where one or more files in the list failed, or one of the two following suggestions:
Code:
find /directory/toscan -type f -exec bash -c '
for path in "$@"
do	ls -ltr "$path" &&
	tar -rvf /directory/foroutput/archive.tar "$path" &&
	rm "$path"
done
' {} +

or:
Code:
find /directory/toscan -type f -exec ls -ltr {} \; -exec tar -rvf /directory/foroutput/archive.tar {} \; -exec rm {} \;

to only keep individual files that weren't successfully archived, but, of course, these will run MUCH slower than the other suggestions and the list of files produced by these will be in the order in which they are found in the searched file hierarchy; not in reverse time order (even in subgroups in the 1st suggestion of these last two).

Note that there is no need for the ls -a option when regular filenames are given as operands (even if their name does start with a <period> character).
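The "keep what could not be archived" behaviour of the && chain can be tried safely in a scratch directory. This is a sketch under stated assumptions, not the original poster's setup: src, archive.tar, and no-such-dir are hypothetical names, the archive path is passed in as the first argument, and the extra sh word fills $0 of the inline script so that no pathname is consumed there. One file is a dot-file to show that ls -l without -a lists it when it is named as an operand.

```shell
#!/bin/sh
# Scratch-directory demonstration; every path below is hypothetical.
tmp=$(mktemp -d) || exit 1
cd "$tmp" || exit 1
mkdir src
echo one > src/a
echo two > src/.hidden    # dot-file: ls -l lists it without -a when named

# Inline script: $1 is the archive, the remaining arguments are pathnames.
script='
archive=$1; shift
for path in "$@"
do
    ls -l "$path" &&
    tar -rf "$archive" "$path" &&
    rm "$path"
done'

# Run 1: the archive directory does not exist, so tar fails and rm is skipped.
find src -type f -exec sh -c "$script" sh no-such-dir/archive.tar {} + 2>/dev/null ||
    echo "find reported a failure; nothing was removed"
left_after_failure=$(find src -type f | wc -l | tr -d ' ')

# Run 2: a writable archive path, so each file is listed, archived, then removed.
find src -type f -exec sh -c "$script" sh archive.tar {} +
left_after_success=$(find src -type f | wc -l | tr -d ' ')
archived=$(tar -tf archive.tar | wc -l | tr -d ' ')

echo "left after failed run: $left_after_failure"
echo "left after good run:   $left_after_success"
echo "archive members:       $archived"

cd / && rm -r "$tmp"
```

Because all three commands run inside one inline script, their order for any given pathname does not depend on how find schedules separate -exec primaries.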
# 18  
Old 06-23-2017
Might be simpler to have a function call? (keeps your code cleaner?) You can use xargs to ensure that they get processed something like this:-
Code:
#!/bin/bash
function process_one_file ()
{
  local path=$1
  ls -l "$path" &&
  tar -rvf /directory/foroutput/archive.tar "$path" &&
  rm "$path"
}

find /directory/toscan -type f | xargs process_one_file

Is that an option? It will ensure your files are sequentially processed but (might) avoid spawning a shell for each file found to run the commands.


I'm happy to be corrected if this has a flaw in it. One concern is how xargs would handle a file with spaces in the name.


Robin
# 19  
Old 06-23-2017
@Don, bash -c takes the first argument after the script text as argv[0] ($0), so the -exec needs a placeholder there, such as bash.
This is certainly true for all Linux/Unix systems - otherwise process names would always be the script interpreter (e.g. bash).
For demonstration:
Code:
$ mkdir newdir
$ touch newdir/file{1,2,3}
$ find newdir -type f -exec bash -c '
echo "$@"
' {} +
 newdir/file2 newdir/file3
$ find newdir -type f -exec bash -c '
echo "$@"
' bash {} +
newdir/file1 newdir/file2 newdir/file3
$

In the first case the process name became newdir/file1.
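The same effect can be reproduced without find. This minimal sketch uses sh -c with three made-up arguments; the only assumption is that the file names are placeholders, not real files.

```shell
#!/bin/sh
# The first word after the script text becomes $0; only the rest populate "$@".
without_placeholder=$(sh -c 'echo "$@"' file1 file2 file3)
with_placeholder=$(sh -c 'echo "$@"' sh file1 file2 file3)
name_seen=$(sh -c 'echo "$0"' file1 file2 file3)

echo "without placeholder: $without_placeholder"
echo "with placeholder:    $with_placeholder"
echo "\$0 without placeholder: $name_seen"
```

Without the placeholder the first pathname is silently dropped from "$@", which in the archiving commands above would mean one file per invocation is never listed, archived, or removed.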
# 20  
Old 06-23-2017
Quote:
Originally Posted by rbatte1
Is that an option? It will ensure you files are sequentially processed but (might) avoid spawning a shell for each file found to run the commands.
Don't think xargs works that way. Not a shell internal, can't run shell functions.
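Since xargs starts a new process for each invocation, one workaround is to hand it an inline sh -c script instead of a function name. The sketch below also addresses the worry about spaces in file names by using find -print0 with xargs -0; both are widely available but were historically extensions, so treat their presence as an assumption. The scratch directory and file names are hypothetical, and printf stands in for the real ls/tar/rm steps.

```shell
#!/bin/sh
tmp=$(mktemp -d) || exit 1
touch "$tmp/plain.txt" "$tmp/with space.txt"

# -print0 and -0 separate pathnames with NUL bytes, so embedded spaces survive.
# The trailing "sh" fills $0 of the inline script.
processed=$(find "$tmp" -type f -print0 |
    xargs -0 sh -c '
for path in "$@"
do
    printf "processing: %s\n" "${path##*/}"
done' sh | sort)

echo "$processed"
rm -r "$tmp"
```

Each pathname reaches the loop as a single argument, including the one with a space in its name.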
# 21  
Old 06-23-2017
Hi.
Quote:
Originally Posted by Corona688
Don't think xargs works that way. Not a shell internal, can't run shell functions.
I agree, but one can with parallel in place of xargs, with the function exported, as in this quickie script:
Code:
#!/usr/bin/env bash

# Utility functions: print-as-echo, print-line-with-visual-space, debug.
# export PATH="/usr/local/bin:/usr/bin:/bin"
LC_ALL=C ; LANG=C ; export LC_ALL LANG
pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
em() { pe "$*" >&2 ; }
db() { ( printf " db, ";for _i;do printf "%s" "$_i";done;printf "\n" ) >&2 ; }
db() { : ; }
C=$HOME/bin/context && [ -f $C ] && $C xargs parallel

p() { echo " Hi from function p, argument :$1:." ; }

pl " Running function p:"
p

pl " Running function p from xargs:"
echo a | xargs -t p 

pl " Running function p from parallel:"
echo a | parallel p

pl " Exporting function p:"
export -f p
export -pf

pl " Running function p exported from xargs:"
echo b | xargs -t p 

pl " Running function p exported from parallel:"
echo c | parallel p

exit $?

producing:
Code:
$ ./z4

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 3.16.0-4-amd64, x86_64
Distribution        : Debian 8.7 (jessie) 
bash GNU bash 4.3.30
xargs (GNU findutils) 4.4.2
parallel GNU parallel 20130922

-----
 Running function p:
 Hi from function p, argument ::.

-----
 Running function p from xargs:
p a 
Unquoted string "a" may clash with future reserved word at -e line 15.
Useless use of a constant ("a") in void context at -e line 15.

-----
 Running function p from parallel:
Unquoted string "a" may clash with future reserved word at -e line 15.
Useless use of a constant ("a") in void context at -e line 15.

-----
 Exporting function p:
p () 
{ 
    echo " Hi from function p, argument :$1:."
}
declare -fx p

-----
 Running function p exported from xargs:
p b 
Unquoted string "b" may clash with future reserved word at -e line 15.
Useless use of a constant ("b") in void context at -e line 15.

-----
 Running function p exported from parallel:
 Hi from function p, argument :c:.

I don't know what the internal difference would be that makes parallel work -- import of the environment perhaps, or possibly something to do with perl.
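A likely explanation, offered as an assumption rather than something confirmed in this thread: export -f places the function definition in the environment, and any bash started afterwards imports it. xargs executes its command directly, so it looks for a program named p, whereas a tool that runs each job through bash would find the function. The sketch below shows the import with plain bash, and shows xargs working once bash is the command it executes.

```shell
#!/usr/bin/env bash
p() { echo " Hi from function p, argument :$1:."; }

# Not exported yet: a child bash has no function named p.
before=$(bash -c 'declare -F p' || echo "no function p")

export -f p

# Exported: a child bash imports p from the environment.
after=$(bash -c 'p "$1"' bash c)

# xargs still cannot call p itself, but it can execute bash, which can.
via_xargs=$(echo b | xargs bash -c 'p "$1"' bash)

echo "before export: $before"
echo "after export: $after"
echo "via xargs:    $via_xargs"
```

This relies on bash-specific behaviour (export -f), so it will not work with a plain POSIX sh.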

Best wishes ... cheers, drl