On 1/23/21 12:38 AM, Richard W.M. Jones wrote:
> Sending flush request is more tricky; on imageio side, we have one
> qemu-nbd server, with multiple connections. I'm not sure if sending one
> flush command on one of the connections is good enough to flush all
> commands, so we send flush command on all connections in the flush
> callback.
I know the answer to this! It depends on whether the NBD server
advertises multi-conn or not. With libnbd you can find out by querying
nbd_can_multi_conn on any of the connections (if the server is
behaving, the answer should be identical for any connection with the
same exportname). See:
https://github.com/NetworkBlockDevice/nbd/blob/master/doc/proto.md
"bit 8, NBD_FLAG_CAN_MULTI_CONN: Indicates that the server operates
entirely without cache, or that the cache it uses is shared among
all connections to the given device. In particular, if this flag is
present, then the effects of NBD_CMD_FLUSH and NBD_CMD_FLAG_FUA MUST
be visible across all connections when the server sends its reply to
that command to the client. In the absence of this flag, clients
SHOULD NOT multiplex their commands over more than one connection to
the export."
For unclear reasons qemu-nbd only advertises multi-conn for r/o
connections, assuming my reading of the code is correct. For nbdkit
we went through the plugins a long time ago and made them advertise
(or not) multi-conn as appropriate.
For qemu-nbd, supporting FLAG_CAN_MULTI_CONN on a writable connection
would require work to audit that a flush request on one connection
properly affects all pending writes on other connections, as seen by
all subsequent reads from any of the connections. This is not a
trivial task.
However, that's for the general case (when multiple writers are sending
requests that can overlap one another, and when you have to worry about
read consistency). But when copying an image, you have a special case:
each writer is only touching distinct subsets of the overall image, and
you are not reading from the image. As long as those subsets do not
overlap, and you do not trigger read-modify-write actions, you really
don't care whether flushes are consistent between writers, because there
are no consistency issues in the first place.
So we _might_ be able to optimize nbdcopy for the case of parallel
writes even when the NBD server does not advertise CAN_MULTI_CONN: the
writers can either use FLAG_FUA on every write, or can _each_ write
without flushes and then perform a final NBD_CMD_FLUSH. As long as
EACH writer performs a final flush before disconnecting (rather than
trying to optimize down to just one writer flushing), you are
guaranteed that the entire copied image is consistent, even though you
could not guarantee flush consistency during the parallel writes.
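Purely as an illustration of the second approach (this is NOT
nbdcopy's actual code; the URI, buffer, and copy_region helper are
made up), a per-writer routine could be as simple as:

  #include <stdio.h>
  #include <stdlib.h>
  #include <stdint.h>
  #include <libnbd.h>

  /* Each writer copies one disjoint region of the image over its own
   * connection, then flushes that connection before disconnecting. */
  static void
  copy_region (const char *uri, const void *buf,
               size_t len, uint64_t offset)
  {
    struct nbd_handle *nbd = nbd_create ();
    if (nbd == NULL ||
        nbd_connect_uri (nbd, uri) == -1 ||
        /* Plain write, no FUA: ordering against other writers does
         * not matter because the regions do not overlap. */
        nbd_pwrite (nbd, buf, len, offset, 0) == -1 ||
        /* EACH writer flushes before disconnecting, so the whole
         * image is stable even without CAN_MULTI_CONN. */
        nbd_flush (nbd, 0) == -1 ||
        nbd_shutdown (nbd, 0) == -1) {
      fprintf (stderr, "%s\n", nbd_get_error ());
      exit (EXIT_FAILURE);
    }
    nbd_close (nbd);
  }

The other option mentioned above would be passing LIBNBD_CMD_FLAG_FUA
as the final argument to each nbd_pwrite and dropping the nbd_flush;
either way, no write can be lost once all writers have disconnected.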
-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org