Re: [Libguestfs] [PATCH libnbd v2 3/4] api: Implement concurrent writer.

Tuesday, 4 June 2019

On 6/4/19 6:16 AM, Richard W.M. Jones wrote:
...
 There are several races / deadlocks which I've thought about. 
Let's
 see if I can remember them all ...

 (1) This I experienced: nbd_aio_get_fd deadlocks if there are
 concurrent synchronous APIs going on.  A typical case is where you set
 up the concurrent writer thread before connecting, and then call a
 synchronous connect function such as connect_tcp.  The synchronous
 function grabs h->lock, then writes something, which eventually
 invokes the writer thread which calls nbd_aio_get_fd and deadlocks on
 h->lock.

 -> Probably the writer thread should be forbidden from using
 nbd_handle. 
Or rather, forbidden from using nbd_handle in a locking manner.  We
should basically document a whitelist of functions that are safe to call...

...

 (2) The writer thread calls nbd_aio_get_fd, but the fd returned might
 might be closed before we use it, resulting in either EBADF or worse
 using another fd that happens to be opened around the same time.

 -> I think the solution to this would be to allow the writer callback
 to signal that the socket is about to be closed (eg. add an extra flag
 parameter to the callback), which would kill the writer thread. 
Yes, I think a flag argument to the writer callback can serve a couple
of additional purposes:

Right now, the writer callback is reached exactly once for most commands
(on data from h->request), and exactly twice for NBD_CMD_WRITE
(h->request, then the user's persistent buffer). A flag argument would
let us inform the user whether the buffer is transient (h->request is
volatile; the writer thread MUST memcpy it aside; but it is fixed in
length so the memcpy is not performance sensitive) or persistent (the
NBD_CMD_WRITE buffer was provided by the caller, and may be several
megabytes; memcpy() can negatively impact the cache and is pointless
since the original buffer is not going to be scribbled on); it also lets
us inform the writer thread on whether TCP_CORK would be worthwhile
(corking the outgoing socket for NBD_CMD_WRITE would let the OS buffer
things into a single wire transaction on both the request header and the
payload, rather than more overhead for two separate TCP packets,
particularly since we've disabled Nagle's algorithm).

Or, in pseudo-code:

nbd_aio_pread(,userbuf,) - >
  write_callback(data, h->request, len, 0);
nbd_aio_pwrite(,userbuf,) - >
  write_callback(data, h->request, len, LIBNBD_WRITE_CORK)
  write_callback(data, userbuf, len, LIBNBD_WRITE_PERSISTENT_BUF)
nbd_aio_shutdown() - >
  write_callback(data, NULL, 0, LIBNBD_WRITE_SHUTDOWN)

...

 (3) nbd_concurrent_writer_error could lose errors.  This might happen
 if the socket is closed normally without writing anything, which would
 never check h->writer_error.

 (4) nbd_concurrent_writer_error possibly deadlocks too since it needs
 to grab h->lock.  Basically the same as (1). 
...and this function must be in the documented safe whitelist, but that
means that instead of grabbing h->lock, we have to implement error
reporting by use of atomics.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

Re: [Libguestfs] [PATCH libnbd v2 3/4] api: Implement concurrent writer.