On 5/30/23 16:56, Wouter Verhelst wrote:
On Tue, May 30, 2023 at 01:50:59PM +0200, Laszlo Ersek wrote:
> On 5/25/23 15:00, Eric Blake wrote:
>> +struct nbd_structured_reply_block_status_ext_hdr {
>> + uint32_t context_id; /* metadata context ID */
>> + uint32_t count; /* length of following array */
>
> (1) I think that "length of following array" is confusing here. Per
> spec, this is "descriptor count" -- that's clearer. "Length"
could be
> mistaken for "number of bytes".
>
> Also, what was the justification for *not* making "count" uint64_t?
> (Just asking.) I do understand a server is permitted to respond with a
> block status reply that covers less than the requested range, so I
> understand a server, if it needs to, *can* truncate its response for the
> sake of fitting "count" into a uint32_t.
This is, as you say, the number of block descriptor messages we are
going to send to the client. Each such message is at 8 bytes long.
16 bytes, isn't it? (uint64_t length, uint64_t status_flags)
That would mean that by the time you overflow a uint32_t with the
number
of extents that are to be sent, you'll be promising to send 2^32 * 8
(i.e., 2^36) bytes of data, or 32 GiB.
(Or 64 GiB if we agree that sizeof(nbd_block_descriptor_ext)=16.)
But... yes, I'm aware this is exorbitant, practically speaking.
My concern was that "practical considerations" must have been what led
to the original 32-bit-only:
struct nbd_block_descriptor {
uint32_t length; /* length of block */
uint32_t status_flags; /* block type (hole etc) */
} NBD_ATTRIBUTE_PACKED;
which is now proving too restrictive (being the basis for this entire
work, IIUC).
A 2^(32+9) B = 2 TB image is not unthinkable. If the image used 512B
sectors, and sectors alternated between being a hole and being
allocated, then 2^32 extended descriptors would be justified.
May not be practical, but that's "policy"; the "mechanism" could
still
exist (if it doesn't come with undesirable costs).
That would be a ridiculously amount of data, and no user is going to
wait for a client to finish parsing that amount of data without a hard
reboot of their client.
Over a 10gbit/s connection, transferring 64GB should take on the order
of a minute.
... *parsing* 4.3 billion entries is a different matter, of course ;)
OK!
The only change that would be reasonable here is to reduce the size
of
this field 16 to bits, rather than increasing it (but then we lose
alignment, so that's also not a good idea)
Putting aside alignment even, I don't understand why reducing "count" to
uint16_t would be reasonable. With the current 32-bit-only block
descriptor, we already need to write loops in libnbd clients, because we
can't cover the entire remote image in one API call [*]. If I understood
Eric right earlier, the 64-bit extensions were supposed to remedy that
-- but as it stands, clients will still need loops ("chunking") around
block status fetching; is that right?
Let me emphasize again that I'm not challenging the spec; also I don't
secretly wish for the patches to do more than required by the spec. I
guess I'd like to understand the intent of the spec, plus the
consequences for client applications.
[*] References (in this order!):
-
https://github.com/libguestfs/virt-v2v/commit/27c056cdc6aa0
-
https://gitlab.com/nbdkit/libnbd/-/commit/0e714a6e06e6
-
https://github.com/libguestfs/virt-v2v/commit/a2afed32d8b1f
To be less cryptic, the first commit added "chunked" block status
fetching to virt-v2v. Because our OCaml language libnbd mappings weren't
proper at the time, that loop could move backwards and get stuck. We
fixed that in the second commit (regarding the bindings) and then
adapted virt-v2v's loop to the fixed bindings in the thirds commit. I've
been hoping that such complexities could be eliminated in the future by
virtue of not having to "chunk" the block status fetching.
(
BTW I'm foreseeing a problem: if the extended block descriptor can
provide an unsigned 64-bit length, we're going to have trouble exposing
that in OCaml, because OCaml only has signed 64-bit integers. So that's
going to reproduce the same issue, only for OCaml callers of the *new* API.
I can see Eric's series includes patches like "ocaml: Add example for
64-bit extents" -- I've not looked at those yet; for now I'm just
wondering what tricks we might need in the bindings generator. The
method seen in the "middle patch" above won't work; we don't have a
native OCaml "i128" type for example that we could use as an escape
hatch, for representing C's uint64_t.
)
Thanks!
Laszlo