On 8/3/19 11:01 AM, Eric Blake wrote:
There are a couple of problems with filters trying to sleep. First,
when it is time to shut down nbdkit, we wait until all pending
transactions have had a chance to wind down. But consider what
happens if one or more of those pending transactions are blocked in a
sleep. POSIX says nanosleep is interrupted with EINTR if the calling
thread handles a signal, but wiring up signal masks just to guarantee
that a specific thread receives the signal is not pretty, and without
that the thread processing SIGINT is NOT the thread blocked in
nanosleep. Worse, even if multiple threads are sleeping, only one
thread needs to process the signal, so the rest continue to sleep.
Thus, even though we know the user wants nbdkit to quit, we are
blocked waiting for a potentially long sleep to complete. This
problem can be solved by realizing we already have a pipe-to-self for
learning about a quit request or the end of a particular connection;
checking for activity on those pipes in parallel with a timeout, via
pselect or ppoll, breaks our wait as soon as we know there is no
reason to continue with the transaction.
+++ b/server/public.c
+#else
+  struct timespec ts;
+  struct connection *conn = threadlocal_get_conn ();
+  struct pollfd fds[2] = {
+    [0].fd = quit_fd,
+    [0].events = POLLIN,
+    [1].fd = conn ? conn->status_pipe[0] : -1,
+    [1].events = POLLIN,
In testing this, the code is responsive when a multi-threaded
connection detects client death on any other thread, but not when a
single-threaded connection must detect client death on its lone
thread. And even with a multi-threaded connection, if every single
thread is tied up in a sleep, we lose responsiveness. So I'm
currently testing an amendment to this patch that uses fds[3],
polling conn->sockout with .events = 0 so that POLLHUP/POLLERR are
still reported.
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3226
Virtualization: qemu.org | libvirt.org