Scrutinizer and I had a discussion about loops in shell scripts, and you might be interested in joining in and sharing your experiences:
I wrote an example script which basically employed the following logic:
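The original code box did not survive; from the description, the logic was presumably of this shape (file name and loop body are stand-ins):

```shell
# input is named at the top, then piped into the loop
printf 'alpha\nbeta\ngamma\n' > sample.txt    # stand-in input file

cat sample.txt | while read -r line ; do
    printf 'got: %s\n' "$line"                # placeholder for the real work
done

rm -f sample.txt
```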
Scrutinizer said this is a UUOC (useless use of cat). Well, in principle he is of course right. We could write the same this way:
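This code box is empty too; it would be the same loop with the pipe replaced by a redirection at the bottom (again with stand-in names):

```shell
printf 'alpha\nbeta\ngamma\n' > sample.txt    # stand-in input file

while read -r line ; do
    printf 'got: %s\n' "$line"
done < sample.txt                             # the input only shows up here

rm -f sample.txt
```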
But still, I beg to differ: this is not a useless but a very sensible use of cat! Suppose the loop were not as short as in this example, but several screen pages long. To understand what goes into "$var", one would have to scroll down to its end; then, to find out what is done with "$var", scroll back up again.
Is it only me who hates having to scroll up and down repeatedly? I find it a lot easier to read if I "steer my loops from the top" instead of from the (maybe far-away) bottom.
Of course, there is this alluring GNU shellnik startup called bash. In bash, pipelines have some really weird side effects, like variables set inside them being local to the pipeline. The following works in both shells:
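The code box is missing here; a minimal stand-in that runs in both shells, with a line after the loop that exposes the difference described next:

```shell
printf 'one\ntwo\nthree\n' | while read -r line ; do
    : # placeholder: process "$line"
done
# ksh93 runs the last element of a pipeline in the current shell, so
# $line survives the loop; bash runs it in a subshell, so $line is
# empty at this point
echo "after the loop: '$line'"
```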
But while in ksh "$line" would still hold the last entry read after the loop, in bash the variable would be empty! Is it only me, or do you find this counter-intuitive too?
So in bash one probably has to resort to the ugly style of meticulously telling the shell the recipe at length while staying totally silent about the ingredients - until the very end. Could you imagine cookbooks written that way? It would look like:
Quote:
First you take something *) and some other thing **) and mix it together, scramble it, then heat up something ***) in a something ****). Now put the mixed together something *) and something **) into the something ****) and cook for some minutes.
_______________
*) eggs
**) some cheese and ham
***) a drop of butter
****) a pan
But the question remains: do you think this - in strictest terms - UUOC should be avoided even if it has no negative side effects, or do you think the gain in clarity outweighs it?
Pipes are not free, so from an efficiency standpoint I prefer to avoid the cat construct. Further, I prefer my shell loops this way:
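The loop itself is missing from the post; judging from the description (and the later remarks about automatic descriptor assignment), it was presumably something like this ksh93 construct, where the {rfd} syntax lets the shell pick a free file descriptor itself - file name and body are stand-ins:

```shell
printf 'a\nb\n' > data                 # stand-in input file

while read -r -u"$rfd" line ; do
    printf '%s\n' "$line"
    # an ssh inside the loop now reads the real stdin,
    # not our input file
done {rfd}< data

rm -f data
```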
Having the shell open the input to the while loop on a file descriptor other than standard input prevents me, or some future maintainer, from adding an ssh command (or a similar stdin-gobbling binary) and forgetting to redirect its stdin from /dev/null, or to use an alternate mechanism (-n in the case of ssh), to keep the binary from causing odd problems with the loop. Maybe it's just me, but there have been a fair few posts on this forum related to a while loop's stdin being 'eaten' by a process inside the loop.
Letting bash try to run this results in several errors.
A slight twist on the code above allows for the input to be defined at the top of the loop without requiring the extra cat:
Again, it doesn't work in bash. The loop below does, but I don't like having to pick the file descriptor value, and rfd=3; exec ${rfd}<data, which would let me hard-code the constant only once, seems not to work (parsing and expansion order, I believe). IMHO, having the shell automatically assign an available file descriptor value just seems the right thing to do.
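The bash-compatible loop referred to, with the descriptor value picked by hand, would look roughly like this (file name is a stand-in):

```shell
printf 'a\nb\n' > data                 # stand-in input file

exec 3< data                           # the '3' has to be chosen manually
while read -r -u3 line ; do
    printf '%s\n' "$line"
done
exec 3<&-                              # explicit close needed afterwards

rm -f data
```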
I have a preference for whatever is simplest and most intuitive to read.
I agree that specifying data at the end of the loop is a bit of an oddball, but:
To me it is the simplest and cleanest code: there is no need for a cat-and-pipe or for an extra file descriptor.
As noted above, in shells other than ksh it does not send the loop into a subshell, so not only is it more efficient, it also ensures that variables set inside the loop are available outside it. I tend to go with what works in all shells. The way it is done in ksh is great, but it is not specified by POSIX.
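A quick illustration of the variable-scope point, with a hypothetical counter:

```shell
printf 'a\nb\nc\n' > lines.txt         # stand-in input file

count=0
while read -r line ; do
    count=$((count + 1))
done < lines.txt                       # no pipeline, so no subshell in any shell

rm -f lines.txt
echo "$count"                          # the value set inside the loop survives
```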
Whenever I have a loop with more than 20 lines of code, I tend to start thinking about splitting it into functions with mnemonic names.
With regard to redirects, I prefer to use them only in the context in which they are needed. Feeding the input into the loop at the bottom is also ideal, since the file descriptors get closed when the loop ends; with the exec examples you would need an explicit close afterwards.
Last edited by Scrutinizer; 06-14-2012 at 03:35 AM..
I prefer to redirect at the end of the loop too, because I think it is the most portable construct. If the loop is several pages long, I usually put a comment at the beginning that tells what it reads from.
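That comment convention might look like this (hypothetical file name and body):

```shell
# NOTE: this loop reads its input from ./hosts.txt,
#       redirected into 'done' at the very bottom
printf 'web1\nweb2\n' > hosts.txt      # stand-in input file

while read -r host ; do
    printf 'checking %s\n' "$host"
    # ... several pages of processing ...
done < hosts.txt

rm -f hosts.txt
```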
I much prefer top-down flow in any programming language and adopt the modular approach with the main program logic as the simplest control flow possible.
Each time this debate comes up, there is no proof that the shell's input redirection is faster than using cat. I can't see why the POSIX folks don't make cat a shell built-in rather than trying to retire the command.
Have you read the "Useful uses of cat" collection from the excellent Mascheck site: Useful use of cat(1)
That list includes a contribution from a certain Chris F.A. Johnson!
An enhanced version of the "convert file contents into arguments" contribution came up on unix.com yesterday.
I too wish you could do <filename while read LINE ... like you can with simple commands, but you can't, and putting the redirection at the end is the most portable.
If I had a shell loop three pages long, I'd try to reduce it with functions.
@methyl: cat is a very useful command and many of its uses are listed on that site - who could argue with that? But one will find that the particular case that is the subject of this thread is not listed there. Interestingly, speed had not been brought up as an argument here, but since you mentioned it, I thought I'd run a couple of simple tests on a 66 MB file:
test 1:
test 2:
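The two test snippets did not survive; given the topic, they were presumably the two loop styles from above run over the 66 MB file, roughly like this (bigfile here is a tiny stand-in for the real test file):

```shell
printf 'x\ny\nz\n' > bigfile           # stand-in for the 66 MB test file

# test 1: cat piped into the loop
time cat bigfile | while read -r line ; do : ; done

# test 2: redirection at the bottom of the loop
time while read -r line ; do : ; done < bigfile

rm -f bigfile
```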
test    shell   real        user        sys
test1   bash3   1m5.429s    0m43.494s   0m21.982s
test2   bash3   0m43.040s   0m38.658s   0m4.382s
test1   ksh93   0m35.22s    0m14.89s    0m20.36s
test2   ksh93   0m6.87s     0m6.84s     0m0.02s
So it seems there can also be a significant speed difference.
--
bash 3.2.48 / ksh93s+
---
I am not really into UUOC'ing, BTW; it arose as part of a humorous exchange.
Last edited by Scrutinizer; 06-14-2012 at 06:18 PM..