We noticed while writing various libnbd tests that when the delay
filter is in use, there are scenarios where we had to resort to
SIGKILL to get rid of nbdkit, because it was non-responsive to SIGINT.
I'm still trying to figure out the best way to add testsuite coverage
of this, but I have already confirmed from the command line that the
series works, under two scenarios that both used to cause long delays:
1. Early client death:
nbdkit -U - -vf --filter=delay null size=1m rdelay=10 --run \
  'timeout 1s nbdsh --connect "nbd+unix:///?socket=$unixsocket" -c "h.pread(1,0)"'
Pre-patch, the server detects the death of the client, but the worker
thread stays busy for the remaining 9 seconds before nbdkit can
finally exit. Post-patch, the server exits right after the client.
2. Early server death:
timeout 1s nbdkit -U - -vf --filter=delay null size=1m rdelay=10 --run \
  'nbdsh --connect "nbd+unix:///?socket=$unixsocket" -c "h.pread(1,0)"'
Pre-patch, the server reacts to the signal and kills the client, but
the worker thread stays busy for the remaining 9 seconds before nbdkit
can finally exit. Post-patch, the server is able to finalize right
after the signal.
Using --run in the above tests lets you test everything from one
command line, but it partly hides how long the nbdkit process lives
(you get the shell prompt back when the main thread exits, even though
the detached threads are still around). If you avoid --run, keep
nbdkit in the foreground in one terminal, and run nbdsh in a different
terminal, choosing which terminal gets ^C, the effects are more
apparent.
Patch 3 needs porting to any platform lacking ppoll. I have ideas for
that port, but don't want to spend time on it before knowing for sure
we need the port. And the fact that the pre-patch tests show output
after the shell prompt returns means we still have cases where our
detached threads are executing past the point where the main thread
has tried to unload the plugin, which is never a nice thing. We may
still have more work ahead of us to ensure that we don't unload an
in-use plugin.
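For anyone not familiar with the general technique, here is a rough
sketch of an interruptible sleep built on ppoll and a pipe. This is an
illustration only, not the code from patch 3: the function name
interruptible_sleep, the wake_fd parameter, and the choice of ECANCELED
as the "woken early" errno are all assumptions made for the example.
The idea is that the sleeper polls the read end of a pipe with a
timeout, so that writing to or closing the write end wakes it
immediately instead of leaving it blocked for the full delay.

```c
#define _GNU_SOURCE             /* ppoll is an extension on many platforms */
#include <errno.h>
#include <poll.h>
#include <signal.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper (not an nbdkit API): sleep for sec seconds plus
 * nsec nanoseconds, but return early if wake_fd becomes readable or
 * its peer hangs up.
 *
 * Returns 0 if the full delay elapsed.  Returns -1 with errno set to
 * ECANCELED if woken early through wake_fd (ECANCELED is an arbitrary
 * choice for this sketch), or -1 with the errno left by ppoll (such
 * as EINTR) on failure.
 */
int
interruptible_sleep (int wake_fd, unsigned sec, unsigned nsec)
{
  struct timespec ts = { .tv_sec = sec, .tv_nsec = nsec };
  struct pollfd fds[1] = { { .fd = wake_fd, .events = POLLIN } };
  int r;

  /* NULL sigmask: leave the thread's signal mask unchanged. */
  r = ppoll (fds, 1, &ts, NULL);
  if (r == -1)
    return -1;                  /* errno already set by ppoll */
  if (r == 0)
    return 0;                   /* timed out: the full delay elapsed */

  /* POLLIN, POLLHUP or POLLERR on wake_fd: someone wants us to stop. */
  errno = ECANCELED;
  return -1;
}
```

With the read end of a pipe as wake_fd, a 10 second sleep returns as
soon as another thread writes a byte to (or closes) the write end,
which is the behavior the two scenarios above depend on.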
Eric Blake (3):
  server: Add threadlocal_get_conn
  server: Add pipe for tracking disconnects
  server: Add and use nbdkit_nanosleep
 docs/nbdkit-plugin.pod  | 28 +++++++++++++++++
 configure.ac            |  1 +
 include/nbdkit-common.h |  1 +
 server/internal.h       |  3 ++
 filters/delay/delay.c   | 14 ++-------
 filters/rate/rate.c     | 10 +++----
 server/connections.c    | 66 ++++++++++++++++++++++++++++++++++++++++-
 server/public.c         | 61 +++++++++++++++++++++++++++++++++++++
 server/threadlocal.c    | 22 +++++++++++++-
 server/nbdkit.syms      |  1 +
 10 files changed, 188 insertions(+), 19 deletions(-)
--
2.20.1