I will be following up to this email with four separate threads, each
addressed to the single appropriate list, with proposed changes to:
- the NBD protocol
- qemu: both server and client
- libnbd: client
- nbdkit: server
The feature in question adds a new optional NBD_INFO_ packet to the
NBD_OPT_GO portion of the handshake, adding up to 16 bits of information
that the server can advertise to the client at connection time about any
known initial state of the export. [Review of this series may propose
slight changes, such as using 32 bits; but hopefully, with all four
series posted in tandem, it becomes easier to see whether any such tweaks
are warranted, and to keep such tweaks interoperable before any of the
projects land the series upstream.] For now, only 2 of those 16 bits
are defined: NBD_INIT_SPARSE (the image has at least one hole) and
NBD_INIT_ZERO (the image reads completely as zero); the two bits are
orthogonal and can be set independently, although it is easy enough to
see completely sparse files with both bits set. Also, advertising the
bits is orthogonal to whether the base:allocation metacontext is used,
although a server with all possible extensions is likely to have the two
concepts match one another.
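
As a rough illustration of how a client might interpret the two bits, here is a minimal Python sketch. Only the names NBD_INIT_SPARSE and NBD_INIT_ZERO come from this proposal; the numeric bit positions and the helper describe_init_flags() are placeholders made up for the example, not values taken from the spec patches.

```python
# Hypothetical bit assignments, for illustration only; the real values
# are whatever the protocol series ends up defining.
NBD_INIT_SPARSE = 1 << 0  # the image has at least one hole
NBD_INIT_ZERO = 1 << 1    # the image reads completely as zero


def describe_init_flags(flags):
    """Return the names of the known init-state bits set in a 16-bit field."""
    if not 0 <= flags <= 0xFFFF:
        raise ValueError("init flags must fit in 16 bits")
    names = []
    if flags & NBD_INIT_SPARSE:
        names.append("sparse")
    if flags & NBD_INIT_ZERO:
        names.append("zero")
    return names
```

Because the bits are orthogonal, all four combinations are meaningful; a completely sparse file would report both names.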
The new bits are added as an information chunk rather than as runtime
flags because the intended consumers of this information are
operations like copying a sparse image into an NBD server destination.
Such a client only cares at initialization whether it needs to perform a
pre-zeroing pass or can rely on the destination already reading as
zero. Once the client starts making modifications, burdening the server
with the ability to do a live runtime probe of current reads-as-zero
state does not help the client, and burning per-export flags for
something that goes stale on the first edit was not thought to
be wise. Similarly, adding a new NBD_CMD did not seem worthwhile.
The existing 'qemu-img convert source... nbd://...' is the first command
line example that can benefit from the new information; the goal of
adding a protocol extension was to make this benefit automatic when
possible, without the user having to specify the proposed --target-is-zero.
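
To make the benefit concrete, here is a minimal Python sketch of the decision a copying client faces. It is not how qemu-img is implemented; plan_copy(), the extent tuples, and the bit value for NBD_INIT_ZERO are all assumptions invented for illustration.

```python
NBD_INIT_ZERO = 1 << 1  # hypothetical bit value, for illustration only


def plan_copy(init_flags, source_extents):
    """Plan the operations needed to copy a sparse source to a destination.

    init_flags: flags advertised by the destination server, or None if
                the server did not send the new information.
    source_extents: list of (offset, length, is_data) tuples.
    Returns a list of ("write" | "zero", offset, length) tuples.
    """
    dest_known_zero = init_flags is not None and bool(init_flags & NBD_INIT_ZERO)
    ops = []
    for offset, length, is_data in source_extents:
        if is_data:
            ops.append(("write", offset, length))
        elif not dest_known_zero:
            # Destination contents are unknown, so holes must be zeroed.
            ops.append(("zero", offset, length))
    return ops
```

When the destination is known to read as zero, holes in the source need no traffic at all; otherwise the client falls back to explicit zeroing.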
I have a similar thread pending for qemu which adds similar
known-reads-zero information to qcow2 files:
https://lists.gnu.org/archive/html/qemu-devel/2020-01/msg08075.html
That qemu series is at v1, and based on the review it has received so
far, it will need some interface changes for v2, which means my qemu
series here will need a slight rebasing, but I'm posting this series to
all lists now to at least demonstrate what is possible when we have
better startup information.
Note that with this new bit, it is possible to learn if a destination is
sparse as part of NBD_OPT_GO rather than having to use block-status
commands. With existing block-status commands, you can use an O(n) scan
of block-status to learn if an image reads as all zeroes (or
short-circuit in O(1) time if the first offset is reported as probable
data rather than reading as zero); but with this new bit, the answer is
O(1). So even with Vladimir's recent change to make the spec permit 4G
block-status even when max block size is 32M, or the proposed work to
add 64-bit block-status, you still end up with more on-the-wire traffic
for block-status to learn if an image is all zeroes than if the server
just advertises this bit. But by keeping both extensions orthogonal, a
server can implement either or both reporting methods, whichever it
finds easiest, and a client can work with whatever a server supplies,
with sane fallbacks when the server lacks either extension. Conversely,
block-status tracks live changes to the image, while this bit is only
valid at connection time.
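
The two approaches can be compared with a small Python sketch. The NBD_STATE_* values are the base:allocation status bits from the existing spec; the NBD_INIT_ZERO value, the function names, and the (length, status) extent list are assumptions made up for the example.

```python
NBD_STATE_HOLE = 1 << 0  # base:allocation: extent is a hole
NBD_STATE_ZERO = 1 << 1  # base:allocation: extent reads as zero
NBD_INIT_ZERO = 1 << 1   # hypothetical bit value, for illustration only


def reads_as_zero_by_scan(extents):
    """O(n) scan over (length, status) extents covering the whole image."""
    for _length, status in extents:
        if not status & NBD_STATE_ZERO:
            return False  # short-circuit on the first probable-data extent
    return True


def reads_as_zero(init_flags, extents):
    """Prefer the O(1) handshake bit; fall back to a block-status scan.

    init_flags and extents are each None when the server lacks that
    extension. Returns False when nothing proves the image is zero.
    """
    if init_flags is not None and init_flags & NBD_INIT_ZERO:
        return True
    if extents is not None:
        return reads_as_zero_by_scan(extents)
    return False
```

A clear NBD_INIT_ZERO bit only means the server makes no promise, so the sketch still consults block-status when it is available.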
My repo for each of the four projects contains a tag 'nbd-init-v1':
https://repo.or.cz/nbd/ericb.git/shortlog/refs/tags/nbd-init-v1
https://repo.or.cz/qemu/ericb.git/shortlog/refs/tags/nbd-init-v1
https://repo.or.cz/libnbd/ericb.git/shortlog/refs/tags/nbd-init-v1
https://repo.or.cz/nbdkit/ericb.git/shortlog/refs/tags/nbd-init-v1
For doing interoperability testing, I find it handy to use:
  PATH=/path/to/built/qemu:/path/to/built/nbdkit:$PATH
  /path/to/libnbd/run your command here
to pick up just-built qemu-nbd, nbdsh, and nbdkit that all support the
feature.
For quickly setting flags:
  nbdkit eval init_sparse='exit 0' init_zero='exit 0' ...
For quickly checking flags:
  qemu-nbd --list ... | grep init
  nbdsh -u uri... -c 'print(h.get_init_flags())'
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3226
Virtualization: qemu.org | libvirt.org