To: cygwin AT cygwin DOT com
From: Eric Blake <ebb9 AT byu DOT net>
Subject:  Re: Can't set variables in a while loop that is passed to the rest  of the script.
Date: Fri, 15 Jan 2010 21:06:26 +0000 (UTC)

Brian Wilson <wilson <at> ds.net> writes:

> 
> The pipe is what spawns the sub shell.  In Unix the last process runs in your 
> current shell.  In Linux the first process of the pipe runs in the current 
> shell.  The difference is that when the while statement (which is run in the 
> sub shell) finishes the sub shell dies and any variable changes are lost.  In 
> Unix the variables remain in the current shell.

Don't top-post.  Also, your terminology is incorrect.  The correct formulation 
is that:

The use of a single pipe in a shell historically created TWO subshells, one for 
each side (a pipeline with two | created three subshells, for the three 
commands, etc.).  Then some shells decided to optimize.  The most common 
optimization is that the last command in the pipeline runs in the current 
shell, but POSIX went one step further and said that ANY, or even ALL, of the 
commands in the pipeline can be executed in the current shell.  However, I 
don't know of any shell offhand that optimizes only the first command in 
isolation, nor of any shell that attempts to optimize more than one command 
per pipeline.  In other words, the choice tends to be between last vs. none, 
and not between last vs. first.
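
A quick way to see which camp a given shell falls into is a probe along these 
lines.  This is only a sketch: the variable name probe is made up for 
illustration, and the result depends entirely on which shell runs it.

```shell
# Hypothetical probe: does this shell run the last command of a pipeline
# in the current shell, or in a subshell?
probe=original
echo changed | read probe
if [ "$probe" = changed ]; then
  echo "last pipeline command ran in the current shell"
else
  echo "last pipeline command ran in a subshell"
fi
```

Either answer is permitted by POSIX, so a script should not depend on the 
outcome.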

Furthermore, it is not Unix vs. Linux that decides whether a subshell is 
created for each side of the pipe, but the implementation of the particular 
shell you are running (ksh vs. zsh vs. bash...) and how much they chose to 
optimize.  In all cases, the shell calls pipe(2) (well, actually ksh uses 
socketpair(2), but the effect is the same), and that behaves identically across 
all platforms.  The difference in the number of subshells is thus how many 
fork(2) calls are made after the pipe is created, and not how the pipe(2) call 
behaved.

Bash follows the traditional behavior (no pipeline optimizations whatsoever), 
whereas dash tries to optimize as much as possible.  But the bash behavior of 
no optimization will be the same whether bash is running on Cygwin, on a 
traditional Unix machine like Solaris, or on Linux; likewise, dash optimizes 
whatever it can, regardless of platform.  But if you step back and look at the 
bigger picture, and consider which shell is chosen to implement /bin/sh on the 
various platforms, then you can understand why the behavior differs as you 
change machines (for example, Linux tends to favor bash as /bin/sh, while 
Solaris tends to favor ksh).

Since POSIX allows both behaviors (either the creation of a subshell, or the 
optimization into the current shell), the only portable way to work with 
pipelines is to make no assumption about which shell will be running a command 
within a pipeline.  Thus, any variable assignments made in a pipeline 
(including those made by the read builtin) will affect subsequent commands only 
on those shells that chose to optimize that part of the pipeline into the 
current shell.
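
One way to sidestep the question, sketched here under the assumption that the 
loop's input can be supplied by redirection rather than by a pipe, is to feed 
the while loop from a here-document.  With no pipeline involved, a POSIX shell 
runs the loop in the current shell environment, so the assignment is still 
visible afterward.  The variable names and sample input are invented for the 
example.

```shell
# Hypothetical example: count input lines so that the count is still
# visible after the loop ends.
count=0
while read -r line; do
  count=$((count + 1))
done <<EOF
one
two
three
EOF
echo "count=$count"
```

If the data comes from a command, writing its output to a temporary file and 
redirecting the loop's input from that file achieves the same effect.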

-- 
Eric Blake

