On Fri, Mar 22, 2019 at 12:17:59PM -0500, Eric Blake wrote:
On 3/22/19 11:42 AM, Eric Blake wrote:
>
> Hence, it is desirable to have a way for clients to specify that a
> particular write zero request is being attempted for a fast wipe, and
> get an immediate failure if the zero request would otherwise take the
> same time as a write. Conversely, if the client is not performing a
> pre-initialization pass, it is still more efficient in terms of
> networking traffic to send NBD_CMD_WRITE_ZEROES requests where the
> server implements the fallback to the slower write, than it is for the
> client to have to perform the fallback to send NBD_CMD_WRITE with a
> zeroed buffer.
>
> Add a protocol flag and corresponding transmission advertisement flag
> to make it easier for clients to inform the server of their intent. If
Note that this is independent of proposals made on the NBD list in the
past [1] of having a way for the server to advertise that a particular
export starts in an all-zeroes state (faster than a series of 32-bit
NBD_CMD_BLOCK_STATUS would be able to do), although I may _also_ try to
revive proposed documentation and a reference implementation of that
optimization as well (as qemu-img convert can completely skip the
zeroing, whether the bulk wipe or per-hole writing, when it knows the
destination is already zero).

It has to be said that this would be a lot easier to implement, and
for our purposes (optimizing qemu-img convert) it does everything we
need.

However the original proposal you put here seems reasonable. I have
only one comment about it: Should the new error (ENOTSUP) be submitted
as a separate patch to the spec?
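
To make the client side of the original proposal concrete, here is a
rough Python sketch of how I read it: try the bulk wipe with the fast
zero flag set, and treat ENOTSUP as "the server would have to do a slow
write, so skip the wipe". FakeExport, prepare_destination and the flag
value are all made up for illustration; this is not any real client
library's API.

```python
import errno

# Hypothetical flag value, chosen only for this illustration.
NBD_CMD_FLAG_FAST_ZERO = 1 << 4


class FakeExport:
    """Stand-in for an NBD connection (hypothetical, not a real library)."""

    def __init__(self, size, fast_zero_ok):
        self.size = size
        self.fast_zero_ok = fast_zero_ok
        self.log = []  # requests the fake server accepted

    def write_zeroes(self, offset, length, flags=0):
        if flags & NBD_CMD_FLAG_FAST_ZERO and not self.fast_zero_ok:
            # The server fails immediately instead of falling back to a
            # write that would take as long as sending the zeroes.
            raise OSError(errno.ENOTSUP, "fast zero not possible")
        self.log.append(("zero", offset, length, flags))


def prepare_destination(export):
    """Attempt a fast bulk wipe.

    Returns True if the whole export is now known to be zero, False if
    the caller should zero the holes individually while copying.
    """
    try:
        export.write_zeroes(0, export.size, flags=NBD_CMD_FLAG_FAST_ZERO)
        return True
    except OSError as e:
        if e.errno != errno.ENOTSUP:
            raise  # any other error is a real failure
        return False
```

When prepare_destination returns False, the client would still send
plain write zero requests (without the flag) for each hole and let the
server do the fallback, which is the case the proposal describes as
cheaper on the network than sending zeroed buffers.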
[1]
https://lists.debian.org/nbd/2016/12/msg00015.html and following
(doc: Propose NBD_FLAG_INIT_ZEROES extension)
>
> I will not push this without both:
> - a positive review (for example, we may decide that burning another
> NBD_FLAG_* is undesirable, and that we should instead have some sort
> of NBD_OPT_ handshake for determining when the server supports
> NBD_CMD_FLAG_FAST_ZERO)
From an implementation point of view I prefer simple flags over having
to implement a brand new option.
We can always work out how to extend the flags field if we run out of
flags. For example, by implementing NBD_OPT_INFO2 with a much bigger
flags field.
> - a reference client and server implementation (probably both via qemu,
> since it was qemu that raised the problem in the first place)
The last time we mentioned the possibility of advertising an initial
zero state, we debated whether burning one of our 16 NBD_FLAG_*
transmission bits for that purpose was wise [2], but discussion stalled
after I never developed a counterproposal with NBD_OPT_* handshaking and
never produced a reference implementation.
[2]
https://lists.debian.org/nbd/2016/12/msg00048.html
Also, keep in mind that knowing that something started as all zeroes
(which only affects startup; once you do any write, that early status
bit no longer means anything to current operation, so less important to
hand to the kernel during transmission phase, especially if the kernel
can ever learn to utilize NBD_CMD_BLOCK_STATUS) is indeed different from
knowing if probing for fast zeroing is supported (where it is easy to
conceive of the kernel needing to know if it can send
NBD_CMD_FLAG_FAST_ZERO). So we may still want to use NBD_OPT_* to get
the initial zero extension, but NBD_FLAG to advertise the fast zero
extension.
On the other hand, it's also worth thinking about which extensions are
easy for servers to implement - NBD_FLAG_INIT_ZEROES and
NBD_FLAG_SEND_FAST_ZERO are orthogonal enough that I could see a full
2x2 mix of servers (unsupported, either one of the two supported, or
both supported), and where clients may make optimization choices based
on any of those four combinations.
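
As a sketch of what those client choices might look like, assuming both
flags end up as transmission flag bits: the bit positions below are
placeholders rather than assigned values, and copy_strategy and its
return strings are hypothetical names used only for this example.

```python
# Placeholder bit positions; neither value is taken from the spec.
NBD_FLAG_INIT_ZEROES = 1 << 14
NBD_FLAG_SEND_FAST_ZERO = 1 << 15


def copy_strategy(transmission_flags):
    """Pick how a convert-style client prepares the destination."""
    init_zero = bool(transmission_flags & NBD_FLAG_INIT_ZEROES)
    fast_zero = bool(transmission_flags & NBD_FLAG_SEND_FAST_ZERO)
    if init_zero:
        # Destination starts as all zeroes: no wipe and no per-hole
        # zeroing needed, whether or not fast zero is also offered.
        return "skip-zeroing"
    if fast_zero:
        # Probe with a fast bulk wipe; fall back if the server refuses.
        return "try-fast-wipe"
    # Neither extension: zero each hole as it is encountered.
    return "zero-per-hole"
```

With these placeholders, a server advertising neither flag gets
"zero-per-hole", fast zero alone gets "try-fast-wipe", and initial
zeroes alone or both together get "skip-zeroing".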
[and if we're keeping score, other extension proposals that I want to
revisit, in no particular order, include:
- 64-bit operations
- NBD_CMD_RESIZE
- more precision on TRIM/WRITE_ZERO alignment constraints
]
Rich.
--
Richard Jones, Virtualization Group, Red Hat
http://people.redhat.com/~rjones