On 6/22/20 10:49 AM, Richard W.M. Jones wrote:
If you have a plugin which either creates background threads itself
or
uses a library that creates background threads, it turns out you
cannot create these in .get_ready (or earlier). The reason is that
nbdkit forks when either daemonizing itself or using the --run option,
and fork cancels all the background threads in the child process (the
daemonized or captive nbdkit).
Or more precisely, the main thread after the fork cannot interact with
the helper threads stranded in the parent process (fork always creates a
single-threaded process - you have to use Linux-specific clone to copy
threads from parent to child, and it's not worth us trying to go that
low-level).
The only good solution here is to add a new callback which I've called
.after_fork which runs after either daemonization or the --run
function.
Yeah, I'm not thinking of any other viable solutions.
It should be used sparingly because it's not possible to report errors
nicely from this callback. In particular when daemonizing we have
lost the controlling terminal and errors go to syslog - it looks to
the user as if nbdkit “died”.
This adds the callback to the C and OCaml APIs, and also sh and eval
(to make the tests work), but I didn't update the other language APIs
yet.
---
+++ b/docs/nbdkit-filter.pod
+=head2 C<.after_fork>
+
+ int (*after_fork) (nbdkit_next_after_fork *next, void *nxdata);
+
+This intercepts the plugin C<.after_fork> method and can be used by
+the filter to start background threads (although these have limited
+usefulness since they will not be able to access the plugin).
+
+If there is an error, C<.after_fork> should call C<nbdkit_error> with
+an error message and return C<-1>.
Should the backend code guarantee that .after_fork is called for all
layers in the backend chain? The only flexibility I see us gaining by
letting the filter choose when .after_fork is called is fine-grained
control over what threads the filter can start before or after the
plugin's .after_fork, but is that flexibility really needed, or are we
equally safe if the backend just always calls .after_fork and the filter
doesn't have to worry about calling next(nxdata)?
Of course, given that filters are in-tree, we can change this signature
as needed later on, even if we make it void for now we can hook it into
passing .after_fork later if we find a use for that finer-grained call
semantics.
+=item C<.after_fork>
+
+In normal operation, C<.after_fork> is called after the server has
+forked into the background and changed UID and directory. If a plugin
+needs to create background threads (or uses an external library that
+creates threads) it should do so here, because background threads are
+killed by fork.
s/killed/invalidated/ (they are still alive in the parent process, but
after the fork we are no longer in the parent process to interact with them)
+
+Because the server may have forked into the background, error messages
+and failures from C<.after_fork> cannot be seen by the user unless
+they look through syslog. An error in C<.after_fork> can appear to
+the user as if nbdkit “just died”. So in almost all cases it is
+better to use C<.get_ready> instead of this callback, or to do as much
+preparation work as possible in C<.get_ready> and only start
+background threads here.
+
+The server doesn't always fork (eg. if the I<-f> flag is used), but
+even so this callback will be called. If you want to find out if the
+server forked between C<.get_ready> and C<.after_fork> use
+L<getpid(2)>.
Thinking about back-compat:
We promise that plugins compiled against older nbdkit will continue to
run against newer nbdkit, and that is preserved here.
What we do not promise is that plugins compiled against newer nbdkit
will work wehn loaded by older nbdkit. This is one of those cases:
anything you stick in .after_fork will not be called by older nbdkit
which didn't know about the callback.
At present, we silently ignore the tail of 'struct plugin' from any
plugin compiled against newer nbdkit when loaded by older nbdkit.
Should we tweak that to instead output a warning that the plugin might
rely on features present only in newer nbdkit and therefore might not
work in the current nbdkit version? Of course, that doesn't help the
fact that existing older nbdkit is lacking this safety check.
Do we need to instead add some sort of API for plugins to be able to
learn which features are provided by the nbdkit loading the plugin, so
that a plugin that requires the use of 1.22 nbdkit features (such as
.after_fork) can gracefully fail during .config_complete (or similar)
when loaded by a 1.20 nbdkit, rather than mysteriously failing to work
when .after_fork is not called? [In a distro setting, the solution is
version dependencies: you could make it so that installing the
nbdkit-vddk-plugin package forces an upgrade of the nbdkit package to
1.22 - but this is more a question for situations not covered by distro
packaging setups]
+=head2 C<.after_fork>
+
+ int after_ready (void);
Typo in the function name.
+
+This optional callback is called before the server starts serving. It
+is called after the server forks and changes directory. If a plugin
+needs to create background threads (or uses an external library that
+creates threads) it should do so here, because background threads are
+killed by fork. However you should try to do as little as possible
+here because error reporting is difficult. See L</Callback lifecycle>
+above.
+
+If there is an error, C<.after_fork> should call C<nbdkit_error> with
+an error message and return C<-1>.
+
+++ b/server/filters.c
@@ -193,6 +193,28 @@ filter_get_ready (struct backend *b)
b->next->get_ready (b->next);
}
+static int
+next_after_fork (struct backend *b)
+{
+ b->after_fork (b);
+ return 0;
+}
+
+static void
+filter_after_fork (struct backend *b)
+{
+ struct backend_filter *f = container_of (b, struct backend_filter, backend);
+
+ debug ("%s: after_fork", b->name);
+
+ if (f->filter.after_fork) {
+ if (f->filter.after_fork (next_after_fork, b->next) == -1)
+ exit (EXIT_FAILURE);
+ }
+ else
+ b->next->after_fork (b->next);
+}
This code changes if we decide that filters could live with 'void'
instead of next_after_fork, because we mandate the chain traversal
through the backend rather than leaving it up to the filters.
Still a few tweaks needed, but the idea looks sane.
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3226
Virtualization:
qemu.org |
libvirt.org