On Wed, Mar 08, 2023 at 04:29:03PM -0600, Eric Blake wrote:
There aren't any good ways to fix this. We have proven that the
assertions are too tight (it IS possible for one thread's first use of
libnbd API, which causes the allocation of a thread-local error
context, to occur after another thread has already triggered the
destructor via exit(3)), so remove those. Meanwhile, since POSIX says
any use of a destroyed key is undefined, add some atomic bookkeeping
such that we track a reference count of how many threads have
allocated an error context and subsequently had the per-thread data
destructor called. Only call pthread_key_delete() if all threads had
the chance to exit; leaking small amounts of memory is preferable to
undefined behavior.
Correction: Continue to always call pthread_key_delete() - if the
module is being unloaded, we do NOT want any other thread exiting to
call back into our (now-unloaded) data destructor code. But diagnose
when that unload action is known to trigger a memory leak.
/* Destroy the thread-local key when the library is unloaded. */
@@ -59,16 +62,16 @@ static void errors_key_destroy (void) __attribute__ ((destructor));
static void
errors_key_destroy (void)
{
- struct last_error *last_error = pthread_getspecific (errors_key);
-
/* "No destructor functions shall be invoked by
* pthread_key_delete(). Any destructor function that may have been
* associated with key shall no longer be called upon thread exit."
+ * If any threads were still using the key, the memory they allocated
+ * will be leaked; this is unusual enough to diagnose.
Based on Dan's comments about libvirt, I can see that libvirt never
called pthread_key_delete(), which DOES leave a thread liable to call
into unloaded code if dlclose() does not leave the library resident.
But since libnbd IS trying to delete the key, I'm not yet convinced
that preventing library unloading is necessary to solve this
particular issue (although the broader conclusion that preventing
unload may be wise for other reasons, and these days memory is cheap
enough that preventing unload is unlikely to cause grief, may still
warrant that change in a separate patch).
*/
- if (last_error != NULL) {
- free (last_error->error);
- free (last_error);
- }
+ if (key_use_count > 1)
+ fprintf (stderr, "%s: %s: %d threads leaking thread-local data\n",
+ "libnbd", "errors_key_destroy", key_use_count - 1);
+ key_use_count = 0;
pthread_key_delete (errors_key);
}
In short, it is intentional that this version of the patch
unconditionally calls pthread_key_delete() on module unload.
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization:
qemu.org |
libvirt.org