On Tue, Apr 10, 2018 at 4:48 PM Kevin Wolf <kwolf(a)redhat.com> wrote:
On 10.04.2018 at 15:03, Nir Soffer wrote:
> On Tue, Apr 10, 2018 at 1:44 PM Richard W.M. Jones <rjones(a)redhat.com>
> wrote:
>
> > We now have true zeroing support in oVirt imageio, thanks for that.
> >
> > However a problem is that ‘qemu-img convert’ issues zero requests for
> > the whole disk before starting the transfer. It does this using 32 MB
> > requests which take approx. 1 second each to execute on the oVirt side.
>
>
> > Two problems therefore:
> >
> > (1) Zeroing the disk can take a long time (eg. 40 GB is approx.
> > 20 minutes). Furthermore there is no progress indication while this
> > is happening.
> >
>
> > Nothing bad happens: because it is making frequent requests there
> > is no timeout.
> >
> > (2) I suspect that because we don't have trim support that this is
> > actually causing the disk to get fully allocated on the target.
> >
> > The NBD requests are sent with may_trim=1 so we could turn these
> > into trim requests, but obviously cannot do that while there is no
> > trim support.
> >
>
> It sounds like nbdkit is emulating trim with zero instead of noop.
>
> I'm not sure what qemu-img is trying to do; I hope the NBD maintainer on
> the qemu side can explain this.
qemu-img tries to efficiently zero out the whole device at once so that
it doesn't have to use individual small write requests for unallocated
parts of the image later on.
This makes sense if the device is backed by a block device on the oVirt side
and NBD supports efficient zeroing. But in this case the device is backed
by an empty sparse file on NFS, and oVirt does not yet support efficient
zeroing; we just write zeros manually.
I think this should be handled on the virt-v2v plugin side. When zeroing a raw
file image, you can ignore zero requests after the highest write offset, since
the plugin created a new image and we know that the image is empty.
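A rough sketch of this idea, assuming a freshly created sparse raw file as the target. The class and method names are hypothetical and are not taken from the virt-v2v plugin; the sketch only models which zero ranges would still have to be forwarded to the server.

```python
class NewSparseImage:
    """Hypothetical model of a newly created, empty sparse raw file target."""

    def __init__(self, size):
        self.size = size
        self.highest_write = 0  # end offset of the furthest data write so far
        self.zero_calls = []    # (offset, length) ranges that really need zeroing

    def pwrite(self, offset, length):
        # Only track how far data has been written; the payload is irrelevant here.
        self.highest_write = max(self.highest_write, offset + length)

    def zero(self, offset, length):
        # Nothing at or beyond highest_write has ever been written, and a new
        # sparse file reads back as zeros there, so that part can be skipped.
        if offset >= self.highest_write:
            return
        clipped_end = min(offset + length, self.highest_write)
        self.zero_calls.append((offset, clipped_end - offset))


img = NewSparseImage(40 * 1024**3)
img.zero(0, 32 * 1024**2)  # nothing written yet: skipped
img.pwrite(0, 4096)
img.zero(1024, 8192)       # only the part below offset 4096 is forwarded
```

This shortcut is only valid because the image is known to be new and empty; it must not be applied to a block device target.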
When the destination is a block device we cannot avoid zeroing, since a block
device may contain junk data (we usually get dirty empty images from our local
XtremIO server).
The problem is that the NBD block driver has max_pwrite_zeroes = 32 MB,
so it's not that efficient after all. I'm not sure if there is a real
reason for this, but Eric should know.
We support zero requests of unlimited size without sending any payload to
oVirt, so there is no reason to limit zero requests by max_pwrite_zeroes. This
limit may make sense when zero is emulated using pwrite.
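To put numbers on the cost of the limit, here is a small back-of-the-envelope calculation. It assumes the figures reported earlier in the thread (32 MB requests, roughly 1 second each, a 40 GB disk) and treats MB and GB as binary units; the function name is made up for illustration.

```python
def zero_request_count(disk_size, max_request):
    """Number of zero requests needed when each is capped at max_request bytes."""
    return -(-disk_size // max_request)  # ceiling division


MiB = 1024 ** 2
GiB = 1024 ** 3

# Capped at 32 MiB per request, as with the current limit.
capped = zero_request_count(40 * GiB, 32 * MiB)

# Assumed cost of about 1 second per request on the oVirt side.
minutes = capped * 1.0 / 60

# With no cap, the whole disk could be covered by a single request.
uncapped = zero_request_count(40 * GiB, 40 * GiB)
```

With these assumptions the capped case needs 1280 requests, which is consistent with the roughly 20 minutes observed.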
> However, since you suggest that we could use "trim" request for these
> requests, it means that these requests are advisory (since trim is), and
> we can just ignore them if the server does not support trim.
What qemu-img sends shouldn't be a NBD_CMD_TRIM request (which is indeed
advisory), but a NBD_CMD_WRITE_ZEROES request. qemu-img relies on the
image actually being zeroed after this.
So it seems that may_trim=1 is wrong, since trim cannot replace zero.
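The difference can be shown with a toy in-memory model. This is not the NBD protocol or any real server code; the class and method names are hypothetical, and the trim handler is deliberately written as the "do nothing" case that an advisory request permits.

```python
class DirtyDevice:
    """Toy model of a target that initially contains junk data."""

    def __init__(self, data):
        self.data = bytearray(data)

    def trim(self, offset, length):
        # Advisory: a server is allowed to ignore this entirely.
        pass

    def write_zeroes(self, offset, length):
        # Not advisory: the range must read back as zeros afterwards.
        self.data[offset:offset + length] = bytes(length)


dev = DirtyDevice(b"\xaa" * 16)
dev.trim(0, 16)
after_trim = bytes(dev.data)   # junk is still there
dev.write_zeroes(0, 16)
after_zero = bytes(dev.data)   # now guaranteed to be zeros
```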
Nir