On Wed, Mar 08, 2023 at 04:29:03PM -0600, Eric Blake wrote:
When a highly multi-threaded program such as nbdcopy encounters an
error, there is a race condition in the library which can cause an
assertion failure and thus a core dump:
(1) An error occurs on one of the threads. nbdcopy calls exit(3).
(2) In lib/errors.c, the destructor calls pthread_key_delete.
(3) Another thread which is still running also encounters an error,
and inside libnbd the library calls set_error().
(4) The call to set_error() calls pthread_getspecific. This is
undefined behavior per POSIX, since the key has already been destroyed
in step (2), but glibc handles it by returning NULL with an EINVAL
error (POSIX recommends, but can't mandate, that course of action for
technical reasons). In any case, the error message is lost, and any
subsequent call to nbd_get_error() will return NULL.
(5) We enter the %DEAD state, where there is an assertion that
nbd_get_error() is not NULL. This assertion is supposed to be for
checking that the code called set_error() before entering the %DEAD
state. Although we did call set_error(), the error message was lost
at step (4) and nbd_get_error() will always return NULL. This
assertion failure causes a crash.
There aren't any good ways to fix this. We have proven that the
assertions are too tight (it IS possible for one thread's first use of
libnbd API, which causes the allocation of a thread-local error
context, to occur after another thread has already triggered the
destructor via exit(3)), so remove those. Meanwhile, since POSIX says
any use of a destroyed key is undefined, add some atomic bookkeeping
such that we track a reference count of how many threads have
allocated an error context and subsequently had the per-thread data
destructor called. Only call pthread_key_delete() if all threads had
the chance to exit; leaking small amounts of memory is preferable to
undefined behavior.
Affected tests that now produce the new leak message:
- copy/copy-nbd-error.sh (known issue - we probably want to improve
nbdcopy to gracefully issue NBD_CMD_DISC on error, rather than
outright calling exit(), to work around the assertion failure in
nbdkit 1.32.5)
- various fuse/* (not sure if we can address that; cleaning up FUSE is
tricky)
This reverts a small part of commit 831cbf7ee14c ("generator: Allow
DEAD state actions to run").
Thanks: Richard W.M. Jones <rjones(a)redhat.com>
Signed-off-by: Eric Blake <eblake(a)redhat.com>
...
@@ -87,16 +92,23 @@ free_errors_key (void *vp)
static struct last_error *
allocate_last_error_on_demand (void)
{
- struct last_error *last_error = pthread_getspecific (errors_key);
+ struct last_error *last_error = NULL;
- if (!last_error) {
- last_error = calloc (1, sizeof *last_error);
- if (last_error) {
- int err = pthread_setspecific (errors_key, last_error);
- if (err != 0) {
- /* This is not supposed to happen (XXX). */
- fprintf (stderr, "%s: %s: %s\n", "libnbd",
"pthread_setspecific",
- strerror (err));
+ if (key_use_count) {
+ last_error = pthread_getspecific (errors_key);
+ if (!last_error) {
+ last_error = calloc (1, sizeof *last_error);
+ if (last_error) {
+ int err = pthread_setspecific (errors_key, last_error);
+ key_use_count++;
+ if (err != 0) {
+ /* This is not supposed to happen (XXX). */
+ fprintf (stderr, "%s: %s: %s\n", "libnbd",
"pthread_setspecific",
+ strerror (err));
+ free (last_error);
+ last_error = NULL;
+ key_use_count--;
+ }
}
}
}
I suspect this may not be completely race condition free unless
there's a lock. Otherwise key_use_count could be > 0 when we enter
this code and then another thread could run the destructor at the same
time.
However I don't want to add locking around these functions since
(especially the one setting context) is highly contended and this
could kill performance.
I think the worst that might happen is we would get EINVAL from
pthread_setspecific, print a message, but at least we won't crash
because the assert has been removed.
So soft ACK, but I'm still wondering if there's a better way to do this.
If simply never calling pthread_key_destroy a good idea? The worst is
it leaks slots in glibc's thread-local storage array.
Rich.
--
Richard Jones, Virtualization Group, Red Hat
http://people.redhat.com/~rjones
Read my programming and virtualization blog:
http://rwmj.wordpress.com
libguestfs lets you edit virtual machines. Supports shell scripting,
bindings from many languages.
http://libguestfs.org