Re: [Libguestfs] [PATCH v7 6/6] v2v: Add -o rhv-upload output mode (RHBZ#1557273).

Monday, 26 March 2018

On Sun, Mar 25, 2018 at 11:05 PM Nir Soffer <nirsof(a)gmail.com&gt; wrote:

...
 On Sun, Mar 25, 2018 at 2:41 PM Richard W.M. Jones
<rjones(a)redhat.com&gt;
 wrote:

> On Sat, Mar 24, 2018 at 10:36:06PM +0000, Nir Soffer wrote:
> > On Thu, Mar 22, 2018 at 5:25 PM Richard W.M. Jones <rjones(a)redhat.com&gt;
> > wrote:
> >
> > > PROBLEMS:
> > >  - -of qcow2 does not work, with multiple problems
> > >     * needs to set NBD size to something larger than virtual size
> > >
> >
> > How is this related to the actual file size specified when you create a
> > disk?
> >
> > In block storage, qcow2 image is always on a thin provisioned disk,
> which
> > in oVirt is a regular logical volume, created at the requested
> > "initial_size":
> [...]
> >         initial_size=image_size,
> >         provisioned_size=image_info["virtual-size"],
>
> Can you describe exactly what these two sizes mean?
>

 - provisioned_size is the virtual size of the image.
 - initial_size is used only for thin provisioned disk on block storage.
 This is
 the size of the logical volume we created for this disk.

 On thin disk on block storage you will not be able to write or zero more
 then initial_size bytes.

 When a vm is running, vdsm monitor the allocated space and ask the SPM
 host to allocated more space when the highest write offset is too high, so
 the disk looks like a thin provisioned disk to the vm. This mechanism is
 not available for storage operations, and all of them specify initial_size
 when converting images to qcow2 format.

>
> Remember that virt-v2v works by streaming.  At the point where we are
> calling this API virt-v2v only knows the virtual size.  We never know
> the size of the final qcow2 file.
>

 If you don't have a way to estimate the final size you need to allocated
 the entire image when you create a disk.

 But I don't understand why you don't have access to the image. I understood
 that the flow is:

 image file -> qemu-img convert -> nbdkit -> ovirt-imageio -> ovirt disk

 In this flow you can estimate the size using the image file before starting
 the streaming process.

 If there is no way to estimate the size and you must allocate the entire
 image, we can add a shrink step in upload finalization, to shrink the image
 size to optimal size. We are already doing this in several flows that
 cannot estimate the final disk size and do over allocation.

 [snipped]

> > > +        log = logging.getLogger(),
>
 > > +        insecure = params['insecure'],
> > >
> >
> > If ca_file cannot be None, then insecure is not needed, based
> > on Ondra review from earlier version.
>
> Is this really true?  My reading of the code is that the insecure flag
> verifies the server to the client, whereas the certificate bundle is
> for verifying the client to the server.
>

 Maybe, I'm not familiar with the SDK.

 What is the motivation for disabling this flag?

> > [snipped]
> >
> > > +    # Create the disk.
> > > +    disks_service = system_service.disks_service()
> > > +    if params['disk_format'] == "raw":
> > > +        disk_format = types.DiskFormat.RAW
> > > +    else:
> > > +        disk_format = types.DiskFormat.COW
> > > +    disk = disks_service.add(
> > > +        disk = types.Disk(
> > > +            name = params['disk_name'],
> > > +            description = "Uploaded by virt-v2v",
> > > +            format = disk_format,
> > > +            provisioned_size = params['disk_size'],
> > >
> >
> > This must be the virtual size.
> >
> > You don't specify initial_size - in this case you get 1G, and most
> > images will fail to upload.
>
> This works for me, and I'm using a guest which is larger than 1G.  As
> above can you explain what these numbers are supposed to be, and note
> that we only know the virtual size (params['disk_size']).  We cannot
> know any other information because we're streaming the data, so
> anything that requires us to know the final size of the image is a
> non-starter.
>

 Discussed above.

> > > +            sparse = params['output_sparse'],
> > >
> > The user cannot configure that. This must be based on the image
> > format. The current coded may create images in unsupported
> > combinations, e.g. raw/sparse on block storage, or fail validation
> > in engine.
>
> In virt-v2v this can be configured using ‘-oa’.  What are the possible
> combinations?
>

 I could not find documentation for this in the ovirt engine sdk.
 There is http://ovirt.github.io/ovirt-engine-sdk/master/  but it is
 not very useful for anything.

 The ovirt-engine REST API has real documentation here:
 http://ovirt.github.io/ovirt-engine-api-model/master

 But I could not find any documentation about creating disks there, so
 the above is based on current system behavior.

 When creating disks from the UI, the user select the storage domain
 (implicitly selecting the file/block), and the allocation policy
 (thin/preallocated). The user cannot select the image format when
 creating disk, ovirt will choose the image format and sparse for you.

 storage type  allocation   | image format  sparse  creating

---------------------------|-----------------------------------------------------------------
 file          thin         | raw           true    sparse file of
 provisioned_size bytes
 file          preallocated | raw           false   preallocated file of
 provisioned_size bytes
 block         thin         | qcow2         true    logical volume of
 initial_size bytes
 block         preallocated | raw           false   logical volume of
 provisioned_size bytes

 When uploading disk, the user select a file, implicitly selecting the
 image format (raw/qcow2), and the storage domain, implicitly selecting
 the storage type (file/block).  oVirt selects the allocation and sparse
 value for you.

 storage type  image format | allocation    sparse   creating

---------------------------|---------------------------------------------------------------
 file          qcow2        | thin          true     sparse file of image
 size bytes
 file          raw          | thin          true     sparse file of
 provisioned_size bytes
 block         qcow2        | thin          true     logical volume of
 initial_size bytes
 block         raw          | preallocated  false    logical volume of
 provisioned_size bytes

 Notes:
 - file,qcow2 cannot be created from the UI.
 - no way to create preallocated uploading from the UI (looks like a bug to
 me)

In the UI, upload dialog doesn't expose allocation selection.
Instead, the UI selects it according to the selected image format.
I.e. cow -> thin, raw -> file: thin, block: preallocated

...

 When using the sdk, you can select both the image format and sparse, so you
 can create invalid combinations. oVirt selects the allocation for you.
...

 storage type  image format  sparse | allocation      creating

-----------------------------------|--------------------------------------------------------
 file          qcow2         true   | thin           sparse file of image
 size bytes
 file          raw           false  | preallocated   preallocated file of
 provisioned_size bytes
 file          raw           true   | thin           sparse file of provisioned_size
 bytes
 file          qcow2         false  | -              unsupported
 block         qcow2         true   | thin           logical volume of
 initial_size bytes
 block         raw           false  | preallocated   logical volume of
 provisioned_size bytes
 block         qcow2         false  | -              unsupported
 block         raw           true   | -              unsupported

 The only case when selecting the sparse value is useful is raw file
 on file based storage domain, allowing creation of sparse or preallocated
 disk.

 I guess you can check if a storage domain is file based or block basedu
 using the SDK, but I'm not familiar with it.

 Daniel, do we expose this info?

You can get it from '.storage.type' on the StorageDomain object.
E.g.
sd = connection.system_service().storage_domains_service().list()[0]
sd.storage.type -> nfs/iscsi/etc

...

 > [snipped]
> >
> > > +# Can we issue zero, trim or flush requests?
> > >
> > +def get_options(h):
> > > +    if h['got_options']:
> > > +        return
> > > +    h['got_options'] = True
> > > +
> > > +    http = h['http']
> > > +    transfer=h['transfer']
> > > +
> > > +    http.putrequest("OPTIONS", h['path'])
> > > +    http.putheader("Authorization", transfer.signed_ticket)
> > > +    http.endheaders()
> > > +
> > > +    r = http.getresponse()
> > > +    if r.status == 200:
> > > +        j = json.loads(r.read())

 > > +        h['can_zero'] = "zero" in j['features']
> > > +        h['can_trim'] = "trim" in j['features']
> > > +        h['can_flush'] = "flush" in
j['features']
> > > +
> > > +        # If can_zero is true then it means we're using the new
> imageio
> > > +        # which never needs the Authorization header.
> > > +        if h['can_zero']:
> > > +            h['needs_auth'] = False
>

 In older proxy, we do support OPTIONS, used for allowing access
 from enigne UI, but it does not return any content. Fortunately, we
 return httplib.NO_CONTENT so it is easy to detect.

 So we need to handle gracefully r.status = 204

> > >
> >
> > If we got here, we are working with new daemon or proxy, and both
> > of them do not need auth, so we can set 'needs_auth' to False
> > if OPTIONS returned 200 OK.
>
> Which is what this code does, unless I'm misunderstanding things.
>

 Accessing readonly ticket will return empty features list, and
 h['needs_auth']
 will be True, while this server does not need auth. Yes, it cannot happen
 with this code but it is not the right check.

> > [snipped]
> >
> > +def pwrite(h, buf, offset):
> > > +    http = h['http']
> > > +    transfer=h['transfer']
> > > +    transfer_service=h['transfer_service']
> > > +
> > > +    count = len(buf)
> > > +    h['highestwrite'] = max(h['highestwrite'],
offset+count)
> > > +
> > > +    http.putrequest("PUT", h['path'] +
"?flush=n")
> > >
> >
> > "?flush=n" has effect only if h["can_flush"] is True. Older
daemon/proxy
> > do not know support disabling flushing. The docs mention that this
> > query string will be added in 1.3, we are now at 1.2.
>
> But it doesn't seem to break the old imageio?
>

 Old imageio just ignores this, so it is safe to use as is, the issue is
 again
 confusing the reader that this has some effect on the server.

 [snipped]

> > > +    # Construct the JSON request for zeroing.

> > > +    buf = json.dumps({'op': "zero",
> > > +                      'offset': offset,
> > > +                      'size': count,
> > > +                      'flush': False})
> > > +
> > > +    http.putrequest("PATCH", h['path'])
> > > +    http.putheader("Content-Type",
"application/json")
> > > +    if h['needs_auth']:
> > > +        http.putheader("Authorization", transfer.signed_ticket)
> > >
> >
> > Only GET and PUT on a server that does not implement OPTIONS need auth.
> >
> > This will work if h['needs_auth'] is set correctly, but I think it is
> > better to include
> > this check only in pread() and pwrite(), and add a comment there that
> this
> > is
> > need to support legacy versions.
>
> So I think you mean that we can remove the if h['needs_auth']
> and following line completely?
>

 Yes, probably can be removed.

 [snipped]

> > > +    buf = json.dumps({'op': "flush",
> > > +                      'offset': 0,
> > > +                      'size': 0,
> > > +                      'flush': True})
> > >
> >
> > Should be (discussed in v6)
> >
> >     buf = json.dumps({"op": "flush"})
>
> This contradicts the documentation, but I can change the code.
>

 Agree, I'll fix the docs.

 Nir

2026

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

Re: [Libguestfs] [PATCH v7 6/6] v2v: Add -o rhv-upload output mode (RHBZ#1557273).