Re: [Libguestfs] [Qemu-block] v2v: -o rhv-upload: Long time spent zeroing the disk

Tuesday, 10 April 2018

On 04/10/2018 09:40 AM, Richard W.M. Jones wrote:
...
> When the destination is a block device we cannot avoid zeroing since a
> block device may contain junk data (we usually get dirty empty images
> from our local xtremio server).

 (Off topic for qemu-block but ...)  We don't have enough information
 at our end to know about any of this. 
Yep, see my other email about a possible NBD protocol extension to
actually let the client learn up-front if the exported device is known
to start in an all-zero state.

...

>> The problem is that the NBD block driver has max_pwrite_zeroes = 32 MB,
>> so it's not that efficient after all. I'm not sure if there is a real
>> reason for this, but Eric should know.
>>
>
> We support zero with unlimited size without sending any payload to oVirt,
> so there is no reason to limit zero requests by max_pwrite_zeroes. This
> limit may make sense when zero is emulated using pwrite.

 Yes, this seems wrong, but I'd want Eric to comment. 
The 32M cap is currently the fault of qemu-img, not nbdkit (nbdkit is
not further reducing the size of the zero requests it passes on to
oVirt); and I explained in the other email about how qemu 2.13 will fix
things to send larger zero requests (hmm, that means nbdkit really needs
to start supporting NBD_OPT_GO, as that is what qemu will be relying on
to learn the larger limits).
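
To give a rough sense of what the cap costs, here is a minimal sketch of how a
large zero request gets split when the client limits each request. It assumes
the 32M cap means 32 MiB; split_zero_request and the 100 GiB disk size are
hypothetical names and numbers used only for illustration, not code from
qemu-img or nbdkit.

```python
MiB = 1024 * 1024


def split_zero_request(offset, length, max_zeroes=32 * MiB):
    """Yield (offset, length) chunks, each no larger than max_zeroes.

    Hypothetical helper: it only models how a client that caps the size
    of a single zero request would cover a larger range.
    """
    if max_zeroes <= 0:
        raise ValueError("max_zeroes must be positive")
    end = offset + length
    while offset < end:
        chunk = min(max_zeroes, end - offset)
        yield offset, chunk
        offset += chunk


# Zeroing a hypothetical 100 GiB disk with the assumed 32 MiB cap.
requests = list(split_zero_request(0, 100 * 1024 * MiB))
print(len(requests))  # 3200
```

With a larger advertised limit the same range needs far fewer round trips,
which is why the client learning the real limit up front matters here.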

...

>>> However, since you suggest that we could use "trim" request for these
>>> requests, it means that these requests are advisory (since trim is), and
>>> we can just ignore them if the server does not support trim.
>>
>> What qemu-img sends shouldn't be a NBD_CMD_TRIM request (which is indeed
>> advisory), but a NBD_CMD_WRITE_ZEROES request. qemu-img relies on the
>> image actually being zeroed after this.
>>
>
> So it seems that may_trim=1 is wrong, since trim cannot replace zero.

 Note that the current plugin ignores may_trim.  It is not used at all,
 so it's not relevant to this problem.

 However this flag actually corresponds to the inverse of
 NBD_CMD_FLAG_NO_HOLE which is defined by the NBD spec as:

     bit 1, NBD_CMD_FLAG_NO_HOLE; valid during
     NBD_CMD_WRITE_ZEROES. SHOULD be set to 1 if the client wants to
     ensure that the server does not create a hole. The client MAY send
     NBD_CMD_FLAG_NO_HOLE even if NBD_FLAG_SEND_TRIM was not set in the
     transmission flags field. The server MUST support the use of this
     flag if it advertises NBD_FLAG_SEND_WRITE_ZEROES. *

 qemu-img convert uses NBD_CMD_WRITE_ZEROES and does NOT set this flag
 (hence in the plugin we see may_trim=1), and I believe that qemu-img
 is correct because it doesn't want to force preallocation. 
Yes, the flag usage is correct, and you are also correct that the
'may_trim' flag of nbdkit is the inverse bit sense of the
NBD_CMD_FLAG_NO_HOLE of the NBD protocol; it's all a documentation game
in deciding whether having a bit be 0 or 1 in the default state made
more sense.
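
As a small illustration of the inverse bit sense, the following sketch derives
a may_trim value from the command flags of a write-zeroes request. The flag
value follows from the "bit 1" wording in the spec text quoted above;
may_trim_from_flags is a hypothetical helper, not the actual nbdkit code.

```python
# NBD_CMD_FLAG_NO_HOLE is bit 1 of the NBD command flags.
NBD_CMD_FLAG_NO_HOLE = 1 << 1


def may_trim_from_flags(flags):
    """Return True when the server is allowed to create a hole.

    Hypothetical helper: may_trim is the inverse of NBD_CMD_FLAG_NO_HOLE,
    so a request sent without the flag yields may_trim=True.
    """
    return not (flags & NBD_CMD_FLAG_NO_HOLE)


# A request without the flag, as qemu-img convert is described as sending.
print(may_trim_from_flags(0))                     # True
# A client that wants to prevent holes sets the flag.
print(may_trim_from_flags(NBD_CMD_FLAG_NO_HOLE))  # False
```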

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org
