On Fri, Mar 25, 2022 at 08:04:42PM +0000, Richard W.M. Jones wrote:
On Fri, Mar 25, 2022 at 07:41:02AM -0500, Eric Blake wrote:
> Our docs were inconsistent on whether a NBD_REPLY_TYPE_BLOCK_STATUS
> reply chunk can exceed the client's requested length, and silent on
> whether the lengths must be consistent whem multiple contexts were
"when"
Oops - will fix ;)
> negotiated. Clarify this to match existing practice as implemented in
> qemu-nbd. Clean up some nearby grammatical errors while at it.
> ---
> doc/proto.md | 38 ++++++++++++++++++++++++--------------
> 1 file changed, 24 insertions(+), 14 deletions(-)
>
> diff --git a/doc/proto.md b/doc/proto.md
> index 8a817d2..8276201 100644
> --- a/doc/proto.md
> +++ b/doc/proto.md
> @@ -882,15 +882,22 @@ The procedure works as follows:
> server supports.
> - During transmission, a client can then indicate interest in metadata
> for a given region by way of the `NBD_CMD_BLOCK_STATUS` command,
> - where *offset* and *length* indicate the area of interest. The
> - server MUST then respond with the requested information, for all
> - contexts which were selected during negotiation. For every metadata
> - context, the server sends one set of extent chunks, where the sizes
> - of the extents MUST be less than or equal to the length as specified
> - in the request. Each extent comes with a *flags* field, the
> - semantics of which are defined by the metadata context.
> -- A server MUST reply to `NBD_CMD_BLOCK_STATUS` with a structured
> - reply of type `NBD_REPLY_TYPE_BLOCK_STATUS`.
> + where *offset* and *length* indicate the area of interest. On
> + success, the server MUST respond with one structured reply chunk of
> + type `NBD_REPLY_TYPE_BLOCK_STATUS` per metadata context selected
> + during negotiation, where each reply chunk is a list of one or more
> + extents for that context.
> +
> +The client's requested *length* is only a hint to the server, so the
> +cumulative size of the extents in each chunk of the server's reply may
> +be shorter or longer the original request; and when more than one
^ than the original request
> +metadata context was negotiated, the cumulative length per context may
Why is the word "cumulative" here?
Suppose we negotiate base:allocation (server replies ID 1) and
qemu:dirty-bitmap:map (server replies ID 2). Then on a successful
NBD_CMD_BLOCK_STATUS for 4096 bytes starting at offset 1024, qemu-nbd
could reply:
REPLY_TYPE_BLOCK_STATUS (length 20 bytes = 2 extents): id=1, extent[0] = { length 2048,
flags 0 }, extent[1] = { length 3072, flags HOLE }
REPLY_TYPE_BLOCK_STATUS (length 12 bytes = 1 extent): id=2, extent[1] = { length 1024,
flags 0 }
There are two lists of extents, but no required correlation between
the two lists; the list for id 1 has a cumulative length of 5k (larger
than the client's request), and the list for id 2 has a cumulative
length of 1k (smaller than the client's request).
Clients negotiating multiple contexts must be prepared for such
impedence mismatch between the different contexts, rather than servers
being required to generate lists with identical number of elements and
length fields. But I'm welcome to ideas on how best to word that.
> +differ within a single block status request. Each extent comes with a
> +*flags* field, the semantics of which are defined by the metadata
> +context. The client may use the `NBD_CMD_FLAG_REQ_ONE` flag to
> +further constrains the server's reply so that each chunk contains
> +exactly one extent whose length does not exceed the client's original
> +*length*.
In these two adjacent sentences you talk about the flags field for
each extent (which is really the type of the extent), and then the
client request flags, and it's kind of confusing. Maybe make that a
bit clearer, eg. "In the request, the client may set the
NBD_CMD_FLAG_REQ_ONE flag ...".
Yes, that is an improvement.
> A client MUST NOT use `NBD_CMD_BLOCK_STATUS` unless it selected a
> nonzero number of metadata contexts during negotiation, and used the
> @@ -1778,8 +1785,8 @@ MUST initiate a hard disconnect.
> *length* MUST be 4 + (a positive integer multiple of 8). This reply
> represents a series of consecutive block descriptors where the sum
> of the length fields within the descriptors is subject to further
> - constraints documented below. This chunk type MUST appear
> - exactly once per metadata ID in a structured reply.
> + constraints documented below. A successful block status request MUST
> + have exactly one status chunk per negotiated metadata context ID.
>
> The payload starts with:
>
> @@ -1801,15 +1808,18 @@ MUST initiate a hard disconnect.
> *length* of the final extent MAY result in a sum larger than the
> original requested length, if the server has that information anyway
> as a side effect of reporting the status of the requested region.
> + When multiple metadata contexts are negotiated, the cumulative
> + lengths in each chunk reply need not be identical.
Again, I'm unclear what is "cumulative" about the lengths.
If a reply chunk has more than one extent for a given context, then
there is a cumulative length in bytes of the amount of information
returned by adding up the length portion of each of the extent
entries.
> Even if the client did not use the `NBD_CMD_FLAG_REQ_ONE` flag in
> its request, the server MAY return fewer descriptors in the reply
> than would be required to fully specify the whole range of requested
> information to the client, if looking up the information would be
> too resource-intensive for the server, so long as at least one
> - extent is returned. Servers should however be aware that most
> - clients implementations will then simply ask for the next extent
> - instead.
> + extent is returned. Servers should however be aware that most
> + client implementations will likely follow up with a request for
> + extent information at the first offset not covered by a
> + reduced-length reply.
Rich.
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization:
qemu.org |
libvirt.org