On Thu, May 27, 2021 at 1:11 AM Nir Soffer <nsoffer(a)redhat.com> wrote:
On Thu, May 27, 2021 at 12:36 AM Eric Blake <eblake(a)redhat.com> wrote:
>
> On Wed, May 26, 2021 at 03:15:13PM +0100, Richard W.M. Jones wrote:
> > On Wed, May 26, 2021 at 04:49:50PM +0300, Nir Soffer wrote:
> > > On Wed, May 26, 2021 at 4:03 PM Richard W.M. Jones <rjones(a)redhat.com> wrote:
> > > > In my testing, nbdcopy is a clear 4x faster than qemu-img convert, with
> > > > 4 also happening to be the default number of connections/threads.
> > > > Why use nbdcopy --connections=1? That completely disables threads in
> > > > nbdcopy.
> > >
> > > Because qemu-nbd does not report multi-conn when writing, so practically
> > > you get one nbd handle for writing.
> >
> > Let's see if we can fix that. Crippling nbdcopy because of a missing
> > feature in qemu-nbd isn't right. I wonder what Eric's reasoning is for
> > multi-conn not being safe.
>
> multi-conn implies that connection A writes, connection B flushes, and
> connection C is then guaranteed to read what connection A wrote.
> Furthermore, if client A and B plan on doing overlapping writes, the
> presence of multi-conn means that whoever flushes last is guaranteed
> to have that last write stick. Without multiconn, even if A writes, B
> writes, B flushes, then A flushes, you can end up with A's data
> (rather than B's) as the final contents on disk, because the separate
> connections are allowed to have separate caching regions where the
> order of flushes determines which cache (with potentially stale data)
> gets flushed when. And note that the effect of overlapping writes may
> happen even when your client requests are not overlapping: if client A
> and B both write distinct 512 byte regions within a larger 4k page,
> the server performing RMW caching of that page will behave as though
> there are overlapping writes.
>
> During nbdcopy or qemu-img convert, we aren't reading what we just
> wrote and can easily arrange to avoid overlapping writes, so we don't
> care about the bulk of the semantics of multi-conn (other than it being
> a nice hint that the server accepts multiple clients). So at the end
> of the day, it boils down to:
>
> If the server advertised multi-conn: connect multiple clients, then
> when all of them are done writing, only ONE client has to flush, and
> the flush will be valid for what all of the clients wrote.
>
> If the server did not advertise multi-conn, but still allows multiple
> clients: connect those clients, write to distinct areas (avoid
> overlapping writes, and hopefully your writes are sized large enough
> that you are also avoiding overlapping cache granularities); then when
> all clients are finished writing, ALL of them must call flush
> (ideally, only the first flush takes any real time, and the server can
> optimize the later flushes as having nothing further to flush - but we
> can't guarantee that).
In the nbdcopy case, all writers write 128M segments, except the last
one, which may have a shorter segment, but this cannot overlap with
anything. So I think we cannot have overlapping writes due to partial
blocks.
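
Roughly, the per-worker ranges work out like this (just a sketch;
segment_range and SEGMENT_SIZE are illustrative names here, not
nbdcopy's actual code):

#include <stdint.h>

#define SEGMENT_SIZE (128*1024*1024ULL)

/* Byte range handled by worker i, for a total export size of 'size'.
 * Segment i ends at (i+1)*SEGMENT_SIZE at the latest, which is exactly
 * where segment i+1 starts, so two segments never overlap; only the
 * last segment may be shorter.
 */
static void
segment_range (uint64_t i, uint64_t size,
               uint64_t *offset, uint64_t *count)
{
  *offset = i * SEGMENT_SIZE;
  *count = size - *offset < SEGMENT_SIZE ? size - *offset : SEGMENT_SIZE;
}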
Optimizing flush to do only one flush when the server supports
multi-conn seems like a very low priority to me, so we can do this
later.
Looks like nbd_ops_flush is already doing the right thing, flushing all
open handles.
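
That matches the rule above: without multi-conn every handle must
flush; with multi-conn one flush would be enough, but flushing all of
them is always correct. Something like this (a sketch against the
libnbd C API; flush_all, handles and nr_handles are made-up names, not
the actual nbd-ops.c code):

#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

#include <libnbd.h>

/* Flush every open connection.  Safe whether or not the server
 * advertised multi-conn; if it did, flushing handles[0] alone would
 * already cover the writes made on every connection.
 */
static void
flush_all (struct nbd_handle **handles, size_t nr_handles)
{
  size_t i;

  for (i = 0; i < nr_handles; ++i) {
    if (nbd_flush (handles[i], 0) == -1) {
      fprintf (stderr, "nbd_flush: %s\n", nbd_get_error ());
      exit (EXIT_FAILURE);
    }
  }
}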
So the only change needed seems to be:
diff --git a/copy/main.c b/copy/main.c
index b9dbe1d..924fedb 100644
--- a/copy/main.c
+++ b/copy/main.c
@@ -321,10 +321,6 @@ main (int argc, char *argv[])
     exit (EXIT_FAILURE);
   }
 
-  /* If multi-conn is not supported, force connections to 1. */
-  if (! src->ops->can_multi_conn (src) || ! dst->ops->can_multi_conn (dst))
-    connections = 1;
-
   /* Calculate the number of threads from the number of connections. */
   if (threads == 0) {
     long t;
Missing the next chunk:
@@ -384,10 +380,8 @@ main (int argc, char *argv[])
    */
   if (connections > 1) {
     assert (threads == connections);
-    if (src->ops->can_multi_conn (src))
-      src->ops->start_multi_conn (src);
-    if (dst->ops->can_multi_conn (dst))
-      dst->ops->start_multi_conn (dst);
+    src->ops->start_multi_conn (src);
+    dst->ops->start_multi_conn (dst);
   }
 
   /* If the source is NBD and we couldn't negotiate meta
With this change we always use the requested number of connections,
ignoring can_multi_conn.
What do you think?