On 2/21/23 17:16, Daniel P. Berrangé wrote:
>> Apps simply shouldn't ever call setenv once threads are created, and
>> malloc() is safe in any impl that is relevant these days.
>
> Yes, the docs are written by language lawyers, but the reality is
> that malloc is safe in glibc for 15+ years and that's not going to
> change.
>
> macOS, Linux glib, Linux musl, FreeBSD are all malloc safe after fork
> with threads. Windows doesn't have fork. I'm fairly certain that I
> validated the OpenBSD/NetBSD malloc impl too, but it was a while ago.
Consider two patterns:
(1) an application calling setenv() in one thread and getenv() in
another thread
(2) a multi-threaded application forking, and then calling malloc() in
the child process
(Assume there are no user-provided libraries here, just the standard C
library, and applications.)
POSIX deems each practice a bug in the application.
"We" (for some uncertain definition of "we") however call (1) a bug in
the application, and (2) a bug in the standard C library.
Put differently, glibc chooses to break under (1), and to protect itself
against (2).
If users report bugs about issue (1), the tickets are routed to the
application's bug tracker; if they report bugs about issue (2), the
tickets are routed to glibc's bug tracker.
Why? More precisely: why this inconsistency between the choices for (1)
and (2)? I'm not even asking why diverge from the published technical
standard -- my question is, why diverge *inconsistently* from the
standard?
A different standard C library implementation might choose the inverse
"armoring".
I can't wrap my brain around the *arbitrariness* of this.
I guess I could entertain an argument like

  issue (1) is easy to fix in the application, while issue (2) is hard
  to fix in an application, so considering total development cost, it's
  best to fix issue (2) in a central location: glibc; i.e., it's best
  to extend POSIX just for issue (2)
*IF* your run-of-the-mill Linux distribution's *own* documentation were
not inconsistent with such an argument. The fact that we have two
divergent, conflicting documentation sets for just that *specific*
programming environment -- the Linux manual pages vs. the glibc info
pages from GNU --, where the former calls malloc() unsafe for (2) but
the latter calls it safe, turns the whole thing into a travesty. No
statement in either manual can then be taken seriously; everything
needs to be tested.
What justification do you see for the different treatment of (1) vs (2)?
Related tickets (both about the same issue):

- https://bugzilla.redhat.com/show_bug.cgi?id=906468
- https://sourceware.org/bugzilla/show_bug.cgi?id=19431
These were fixed in 2016-2017, so *not* 15+ years ago. malloc() may be
"exceptionally safe" in this regard, in practice, but other functions
that POSIX similarly calls unsafe may be just "relatively safe", or not
safe at all, in glibc. The problem is that we now can't make *any
assessment at all* based on the documentation, because POSIX is being
called "overly cautious", and because the Linux manual pages and the
glibc documentation have been proved obsolete and/or contradictory.
With this, it's effectively impossible, by way of code review and
documentation checking, to determine whether a given API usage is safe.
What sense does code review make in the first place, under such
circumstances?
Laszlo