while loops and variables under bash

This is probably going to be very simple, but I came across something I can't quite explain. Here is the situation: I have a list of files which I'd like to process one by one (get the size, run some tests, whatever) and generate some statistics using different variables.

Something similar to this:

u=0
ls|while read f; do
#until [ $u -gt 5 ]; do
#for f in `ls `; do
echo inside u=$u
let "u+=1"
done

echo outside u=$u

And here are the results (there are 3 files in the current directory):

inside u=0
inside u=1
inside u=2
outside u=0
So basically the variable u appears to get re-initialised somehow.

Using until or a classic for loop u keeps its value after the loop is executed, like this:

inside u=0
inside u=1
inside u=2
inside u=3
inside u=4
inside u=5
outside u=6
So what's the deal? That's under bash. Under ksh, the piped while loop keeps the value, just like the other loops do.

It's not the different kinds of loops that are doing it -- it's the pipe that does it. By putting your while loop behind a pipe, you are executing it inside a subshell. Values get changed in the subshell, not the main one, and don't get copied back.

ksh arranges pipelines differently from most other Bourne-style shells: the last command in the pipeline (here, the while loop) runs in the current shell, while the commands before it run in subshells. Bash does not do this by default.
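
Newer versions of bash can opt into the ksh-style arrangement with the lastpipe shell option. A minimal sketch, assuming bash 4.2 or later and a non-interactive script (lastpipe only takes effect when job control is off); the three input words are made up for illustration:

```shell
#!/bin/bash
# Assumes bash 4.2 or later; lastpipe only takes effect when job control
# is off, which is the normal case inside a script.
shopt -s lastpipe

u=0
# Illustrative input: three lines stand in for three file names.
printf '%s\n' one two three | while read -r f; do
    echo "inside u=$u"
    u=$((u + 1))
done

# With lastpipe active, the loop ran in the current shell, so u survives.
echo "outside u=$u"
```

Under those assumptions the last line should report u=3 rather than u=0. On the bash 3.x releases this option does not exist, so it is no help there.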

Depending on your goal, there are various ways to circumvent or just plain avoid this. Your loop is a useless use of ls, for instance -- a situation where you might as well be using the shell's * glob instead of the external ls utility.

for X in *
do
        let "u+=1"
done

The 'for' loop avoids the problem by putting ls in backticks: ls runs first, and its output is substituted into the for statement as the list of words to loop over, so no pipe is involved. But this is not recommended, as it breaks on filenames with spaces in them.
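
When the input really has to come from a command, one common way to keep the while loop in the current shell is to feed it through process substitution instead of a pipe. This is only a sketch: it is bash-specific, and the scratch directory and its three files are made up so the result is predictable.

```shell
#!/bin/bash
# Hypothetical scratch directory with three files, for a predictable count.
dir=$(mktemp -d)
touch "$dir/a" "$dir/b" "$dir/c"

u=0
# The loop is not part of a pipeline, so it runs in the current shell;
# only ls runs in a separate process, connected via < <(...).
while read -r f; do
    echo "inside u=$u"
    u=$((u + 1))
done < <(ls "$dir")

# u was changed in the current shell, so the new value is still here.
echo "outside u=$u"
rm -r "$dir"
```

The loop body is unchanged from the piped version; only the direction of the plumbing differs. It still reads one name per line, so filenames containing newlines would confuse it.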

Aha... it all makes sense now. Thanks for explaining, and for that nice link.
Originally Posted by Corona688
By putting your while loop behind a pipe, you are executing it inside a subshell.
@Corona688: I thought the same, and tried to check it by printing the PID of the shell before entering the while loop and again inside it. The PIDs are the same. If it's invoking a subshell, it should print the PID of the subshell inside the while loop, right?

#! /bin/bash
echo "before while, pid = $$"
ls | while read x
do
    echo "inside while, pid = $$"
done

# bash --version
GNU bash, version 3.1.17(1)-release (i686-redhat-linux-gnu)
# ./test.sh
before while, pid = 6435
inside while, pid = 6435
inside while, pid = 6435
inside while, pid = 6435

Any idea what's happening here?
@balajesuri: from man bash:
Special parameters
$: Expands to the process ID of the shell. In a () subshell, it expands to the process ID of the current shell, not the subshell.
Each command in a pipeline is executed as a separate process (i.e., in a subshell).
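
To actually see the subshell, compare $$ with BASHPID, which reports the process ID of the current bash process. A small sketch, assuming bash 4.0 or later (BASHPID is not available in the 3.1 release shown above):

```shell
#!/bin/bash
# Assumes bash 4.0 or later, where BASHPID is available.
echo "before while: \$\$ = $$, BASHPID = $BASHPID"

# Each part of a pipeline runs in a subshell: $$ still reports the
# main shell, while BASHPID reports the subshell's own process ID.
echo one | while read -r x; do
    echo "inside while: \$\$ = $$, BASHPID = $BASHPID"
done
```

The two $$ values should match, while BASHPID inside the loop should differ from the one printed before it.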
@Scrutinizer: Yes, you're right. How could I miss that!
