Re: [Libguestfs] [libnbd PATCH v3 09/29] lib/utils: introduce async-signal-safe execvpe()

Tuesday, 21 February 2023

On 2/21/23 20:32, Eric Blake wrote:
...
 On Tue, Feb 21, 2023 at 07:07:38PM +0100, Laszlo Ersek wrote: 
...
> Well, given Daniel's comments meanwhile, it seems like the
original
> execvp() is something we shouldn't fret about. :/

 glibc marks execvp() and exevpe() as 'MT-Safe env', which means it
 does not modify 'environ' and presumably does not use 'malloc'.  If
 the only reason POSIX marks execvp() as thread-unsafe is because of
 its interaction with getenv() for PATH and therefore unsafe to exec a
 child in one thread while another is calling putenv(), then it is no
 worse than our use of getenv() without locking on the grounds that no
 sane app will be doing setenv() after spawning threads.

 But you've gone to a lot of work on this series; I'm still in favor of
 including your work, even if decide our premise behind it is weaker
 than intended.  As I understood it, the premise is that actively
 avoiding as many non-async-safe functions as possible between fork()
 and exec() is always a good idea to avoid the deadlock created when a
 multithreaded process fork()s in one thread while another thread holds
 a mutex on the non-async-safe resource (in the child process, the
 other thread no longer exists and therefore can never release the
 resource).  Knowing WHICH resources are liable to be locked in other
 threads makes it easier to reason about which generically
 non-async-safe functions can be used if we make other limiting
 restrictions (such as glibc's 'MT-Safe env' designation), but that
 requires more thought than blindly avoiding all non-async-safe
 functions. 
It's killing me to see my work turn out as waste, but I don't want it to be
included (and to present maintenance burden) just for "saving" the work. I kind
of expected it to be unwelcome (due to it being opinionated and not fixing acute
symptoms); I didn't expect it to be lost from the specification / premise up. This
early paragraph in the commit message is torpedoed:

    However, execvp() is not async-signal-safe [03], and so we shouldn't call
    it in a child process forked from a multi-threaded parent process [04],
    which the libnbd client application may well be.

That "and so" implication is busted because the glibc manual (from GNU, not the
Linux manual pages!) *implies* that fork(), as opposed to _Fork(), does not restrict the
child process to AS-Safe functions.

Anyway, I think your characterization is right. My approach was, "avoid calling
anything that's not explicitly async-signal-safe". A laxer approach is
"avoid calling anything that might interfere with particular resources".

Unfortunately, this laxer approach, while it may require some "further caution"
in case we want to call further functions in the affected context, quite summarily
obviates this particular execvp() replacement.

Plus, regarding said "further caution" in general, I'm getting the
impression (from RHBZ#906468) that glibc's intent is actually the opposite -- i.e.,
the intent seems to be that any particular API should opt out of, rather than opt in to,
usability after fork().

Considering how much work is still needed to address the review feedback thus far (and the
necessary v4 reviews!), I'm seriously torn if we should just drop the whole thing.
(And there's a really sore lesson for me in this, regarding cleanups for standards
conformance.)

...
> Can you please elaborate on "+s"? (I'd like to
understand your point
> regardless of this patch, too.)

 https://www.austingroupbugs.net/view.php?id=1440 is where it came up
 in the Austin Group.  Basically, having system("+s") be _required_ to
 invoke ["/path/to/sh", "-c", "+s"] is risky -
"+s" is ambiguous
 between being the name of a real script and being a shell option 
So what does the "+s" option do specifically?

"-s" makes the shell read commands from stdin, but "+s" would only
apply to such an "-s" option that were taken by the "set" special
builtin -- it would *disable* that set-option (while "set -s" would enable it).
However, I don't see *any* "-s" under "set", so I don't know
what "set +s" would do.

...
 (POSIX already warns that naming an executable with a leading
"-" is
 unwise because of potential conflicts with options, but 'sh' has the
 special rules that + as well as - can introduce options for historical
 reasons). 
So is "+s" just a synonym for "-s", meaning the shell will read
commands from stdin?

...
 Calling ["/path/to/sh", "-c", "--",
"+s"] is unambiguous
 for a POSIX sh that understands "--" as the end-of-options marker, but
 not all historical sh did that.  It is also possible to use
 ["/path/to/sh", "-c", " +s"] (note the added leading space,
which is
 ignored by the shell), 
ignored why?

...
 but that requires space for injecting the
 leading space, and how do you do that concisely while still
 maintaining async-safety? 
(right, not doing it, just learning about +s)

...
> (... I wanted to say that "there's no replacement for
reading POSIX in
> full", but now I'm wondering if reading POSIX *at all* makes sense...)

 Alas, this rings too close to home...  Standards are only useful if
 they are likely to be followed, which in turn is harder if they are
 hard to read. 
"Hard to read" is a problem, but not the core issue IMO here. The problem is
that for the sake of various technological advances, POSIX is intentionally abandoned, and
that leaves us with *zero* common documentation that might uniformly cover even a small
handful of modern OSes. And we have no alternative to individually testing out every
"standard" API on each platform of interest. I have no problem with prctl()
being Linux-specific; I'm very much irritated by fork() relaxing its standard
restrictions without proper documentation. It's a recipe for wasting work.

Anyway I'll log out now because I'm hardly coherent.

Laszlo

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

Re: [Libguestfs] [libnbd PATCH v3 09/29] lib/utils: introduce async-signal-safe execvpe()