On Wed, Mar 14, 2018 at 9:04 PM Richard W.M. Jones <rjones(a)redhat.com>
wrote:
On Wed, Mar 14, 2018 at 06:56:19PM +0000, Nir Soffer wrote:
> I posted documentation for the new API optimized for random I/O:
>
> https://gerrit.ovirt.org/#/c/89022/
Wish I'd had this documentation when I started the patch :-)
Yes, it's much clearer.
> I changed POST to PATCH to match the existing /tickets API, and
> this also seems to be a more standard way to do such operations.
Assuming Python httplib will allow us to put anything in the method
argument of http.putrequest, this doesn't appear to make any
significant difference, so that's fine. Also we can set the "flush"
(i.e. FUA) parameter to match the NBD request.
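
A minimal sketch of sending such a PATCH request through putrequest,
using Python 3's http.client (the successor of httplib). The JSON field
names ("op", "offset", "size", "flush"), the helper names and the
request path are assumptions for illustration only; the actual format
is whatever the documentation linked above defines.

```python
import http.client
import json


def zero_request_body(offset, size, flush=False):
    """Encode a hypothetical "zero" operation as a JSON request body.

    The field names are assumed, not taken from the real API.
    """
    msg = {"op": "zero", "offset": offset, "size": size, "flush": flush}
    return json.dumps(msg).encode("utf-8")


def send_patch(conn: http.client.HTTPConnection, path, body):
    """Send body to path using the PATCH method and return the response."""
    # putrequest() takes the method as a plain string, so PATCH works
    # the same way as PUT or POST.
    conn.putrequest("PATCH", path)
    conn.putheader("Content-Type", "application/json")
    conn.putheader("Content-Length", str(len(body)))
    # endheaders() sends the headers followed by the message body.
    conn.endheaders(body)
    return conn.getresponse()
```

With a real connection this would be called as
send_patch(conn, "/images/<ticket-id>", zero_request_body(0, 4096, flush=True)),
where the path is again only a placeholder.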
> Please check and comment if this makes sense and serves the v2v
> use case or other use cases we missed.
>
> I think we can implement all of this for 4.2.4, but:
>
> - using a simple zero loop, as in
>   https://gerrit.ovirt.org/#/c/88793/
>   (later we can make it more efficient).
> - trim is a noop, maybe we will be able to support it in 4.3
> - flush - may be noop now (all requests will implicitly flush).
I don't think we really need trim or flush. They're only minor
optimizations. Zero is the one which is required.
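
For readers unfamiliar with the term, a "simple zero loop" can be
sketched as below: write a buffer of zero bytes over the requested
range, one chunk at a time. This is only an illustration of the idea,
not the code from the patch linked above; the function name and the
chunk size are assumptions.

```python
import os


def zero_range(fd, offset, size, chunk_size=1024 * 1024):
    """Overwrite size bytes starting at offset with zeros.

    Plain write loop: no hole punching, no efficient zeroing ioctl.
    """
    buf = memoryview(bytes(chunk_size))  # reusable buffer of zero bytes
    pos = offset
    end = offset + size
    while pos < end:
        n = min(chunk_size, end - pos)
        # pwrite() may write fewer bytes than requested, so advance by
        # the number of bytes actually written.
        pos += os.pwrite(fd, buf[:n], pos)
```

A more efficient version could replace the loop with a call that
deallocates or zeroes the range in the storage layer, without changing
the interface seen by the client.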
FWIW NBD allows you to flush ranges or flush the whole disk, in case
that matters (your proposal only allows you to flush the whole disk).
What is the use case for flushing ranges? I guess we will have one
or a few flushes per image.
Looking at sync_file_range(2), it does not seem to be a safe way to
flush:
Warning
This system call is extremely dangerous and should not be used in
portable programs. None of these operations writes out the file's
metadata. Therefore, unless the application is strictly performing
overwrites of already-instantiated disk blocks, there are no
guarantees that the data will be available after a crash. There is
no user interface to know if a write is purely an overwrite. On file
systems using copy-on-write semantics (e.g., btrfs) an overwrite of
existing allocated blocks is impossible. When writing into
preallocated space, many file systems also require calls into the
block allocator, which this system call does not sync out to disk.
This system call does not flush disk write caches and thus does not
provide any data integrity on systems with volatile disk write
caches.
I can support the same size and offset arguments, and treat them
as a hint if we can implement this safely in some future version.
But I think providing only a simple and safe way to flush is good
enough for this context.
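
A minimal sketch of that approach, assuming the image is accessed
through a file descriptor: the flush operation accepts offset and size
but treats them as hints and always flushes the whole file with
fsync(). The function name and signature are hypothetical.

```python
import os


def flush(fd, offset=None, size=None):
    """Flush all data written to fd to the underlying storage.

    offset and size are accepted for API compatibility and ignored;
    a future version could use them if a safe way to flush a range
    becomes available.
    """
    # fsync() writes out both data and the metadata needed to read it
    # back, unlike sync_file_range().
    os.fsync(fd)
```

Clients that pass a range keep working unchanged if ranges are honored
later, since flushing more than requested is always safe.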
> I think we had better have a complete API with a partial or simpler
> implementation now, to minimize the hacks needed in v2v and
> other clients.
Agreed.
Rich.
--
Richard Jones, Virtualization Group, Red Hat
http://people.redhat.com/~rjones
Read my programming and virtualization blog:
http://rwmj.wordpress.com
virt-p2v converts physical machines to virtual machines. Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v