On Sun, Mar 25, 2018 at 11:05 PM Nir Soffer <nirsof@gmail.com> wrote:

On Sun, Mar 25, 2018 at 2:41 PM Richard W.M. Jones <rjones@redhat.com> wrote:
On Sat, Mar 24, 2018 at 10:36:06PM +0000, Nir Soffer wrote:
> On Thu, Mar 22, 2018 at 5:25 PM Richard W.M. Jones <rjones@redhat.com>
> wrote:
>
> > PROBLEMS:
> > - -of qcow2 does not work, with multiple problems
> > * needs to set NBD size to something larger than virtual size
> >
>
> How is this related to the actual file size specified when you create a
> disk?
>
> In block storage, qcow2 image is always on a thin provisioned disk, which
> in oVirt is a regular logical volume, created at the requested
> "initial_size":
[...]
> initial_size=image_size,
> provisioned_size=image_info["virtual-size"],

Can you describe exactly what these two sizes mean?

- provisioned_size is the virtual size of the image.
- initial_size is used only for thin provisioned disk on block storage. This is
the size of the logical volume we created for this disk.

On thin disk on block storage you will not be able to write or zero more
then initial_size bytes.

When a vm is running, vdsm monitor the allocated space and ask the SPM
host to allocated more space when the highest write offset is too high, so
the disk looks like a thin provisioned disk to the vm. This mechanism is
not available for storage operations, and all of them specify initial_size
when converting images to qcow2 format.

Remember that virt-v2v works by streaming. At the point where we are
calling this API virt-v2v only knows the virtual size. We never know
the size of the final qcow2 file.

If you don't have a way to estimate the final size you need to allocated
the entire image when you create a disk.

But I don't understand why you don't have access to the image. I understood
that the flow is:

image file -> qemu-img convert -> nbdkit -> ovirt-imageio -> ovirt disk

In this flow you can estimate the size using the image file before starting
the streaming process.

If there is no way to estimate the size and you must allocate the entire
image, we can add a shrink step in upload finalization, to shrink the image
size to optimal size. We are already doing this in several flows that
cannot estimate the final disk size and do over allocation.

[snipped]
> > + log = logging.getLogger(),
> > + insecure = params['insecure'],
> >
>
> If ca_file cannot be None, then insecure is not needed, based
> on Ondra review from earlier version.

Is this really true? My reading of the code is that the insecure flag
verifies the server to the client, whereas the certificate bundle is
for verifying the client to the server.

Maybe, I'm not familiar with the SDK.

What is the motivation for disabling this flag?

> [snipped]
>
> > + # Create the disk.
> > + disks_service = system_service.disks_service()
> > + if params['disk_format'] == "raw":
> > + disk_format = types.DiskFormat.RAW
> > + else:
> > + disk_format = types.DiskFormat.COW
> > + disk = disks_service.add(
> > + disk = types.Disk(
> > + name = params['disk_name'],
> > + description = "Uploaded by virt-v2v",
> > + format = disk_format,
> > + provisioned_size = params['disk_size'],
> >
>
> This must be the virtual size.
>
> You don't specify initial_size - in this case you get 1G, and most
> images will fail to upload.

This works for me, and I'm using a guest which is larger than 1G. As
above can you explain what these numbers are supposed to be, and note
that we only know the virtual size (params['disk_size']). We cannot
know any other information because we're streaming the data, so
anything that requires us to know the final size of the image is a
non-starter.

Discussed above.

> > + sparse = params['output_sparse'],
> >
> The user cannot configure that. This must be based on the image
> format. The current coded may create images in unsupported
> combinations, e.g. raw/sparse on block storage, or fail validation
> in engine.

In virt-v2v this can be configured using ‘-oa’. What are the possible
combinations?

I could not find documentation for this in the ovirt engine sdk.
There is http://ovirt.github.io/ovirt-engine-sdk/master/ but it is
not very useful for anything.

The ovirt-engine REST API has real documentation here:
http://ovirt.github.io/ovirt-engine-api-model/master

But I could not find any documentation about creating disks there, so
the above is based on current system behavior.

When creating disks from the UI, the user select the storage domain
(implicitly selecting the file/block), and the allocation policy
(thin/preallocated). The user cannot select the image format when
creating disk, ovirt will choose the image format and sparse for you.

storage type allocation | image format sparse creating
---------------------------|-----------------------------------------------------------------
file thin | raw true sparse file of provisioned_size bytes
file preallocated | raw false preallocated file of provisioned_size bytes
block thin | qcow2 true logical volume of initial_size bytes
block preallocated | raw false logical volume of provisioned_size bytes

When uploading disk, the user select a file, implicitly selecting the
image format (raw/qcow2), and the storage domain, implicitly selecting
the storage type (file/block). oVirt selects the allocation and sparse
value for you.

storage type image format | allocation sparse creating
---------------------------|---------------------------------------------------------------
file qcow2 | thin true sparse file of image size bytes
file raw | thin true sparse file of provisioned_size bytes
block qcow2 | thin true logical volume of initial_size bytes
block raw | preallocated false logical volume of provisioned_size bytes

Notes:
- file,qcow2 cannot be created from the UI.
- no way to create preallocated uploading from the UI (looks like a bug to me)

In the UI, upload dialog doesn't expose allocation selection.

Instead, the UI selects it according to the selected image format.

I.e. cow -> thin, raw -> file: thin, block: preallocated

When using the sdk, you can select both the image format and sparse, so you

can create invalid combinations. oVirt selects the allocation for you.

storage type image format sparse | allocation creating
-----------------------------------|--------------------------------------------------------
file qcow2 true | thin sparse file of image size bytes
file raw false | preallocated preallocated file of provisioned_size bytes
file raw true | thin sparse file of provisioned_size bytes
file qcow2 false | - unsupported
block qcow2 true | thin logical volume of initial_size bytes
block raw false | preallocated logical volume of provisioned_size bytes
block qcow2 false | - unsupported
block raw true | - unsupported

The only case when selecting the sparse value is useful is raw file
on file based storage domain, allowing creation of sparse or preallocated
disk.

I guess you can check if a storage domain is file based or block basedu
using the SDK, but I'm not familiar with it.

Daniel, do we expose this info?

You can get it from '.storage.type' on the StorageDomain object.

E.g.

sd = connection.system_service().storage_domains_service().list()[0]

sd.storage.type -> nfs/iscsi/etc

> [snipped]
>
> > +# Can we issue zero, trim or flush requests?
> >
> +def get_options(h):
> > + if h['got_options']:
> > + return
> > + h['got_options'] = True
> > +
> > + http = h['http']
> > + transfer=h['transfer']
> > +
> > + http.putrequest("OPTIONS", h['path'])
> > + http.putheader("Authorization", transfer.signed_ticket)
> > + http.endheaders()
> > +
> > + r = http.getresponse()
> > + if r.status == 200:
> > + j = json.loads(r.read())
> > + h['can_zero'] = "zero" in j['features']
> > + h['can_trim'] = "trim" in j['features']
> > + h['can_flush'] = "flush" in j['features']
> > +
> > + # If can_zero is true then it means we're using the new imageio
> > + # which never needs the Authorization header.
> > + if h['can_zero']:
> > + h['needs_auth'] = False

In older proxy, we do support OPTIONS, used for allowing access
from enigne UI, but it does not return any content. Fortunately, we
return httplib.NO_CONTENT so it is easy to detect.

So we need to handle gracefully r.status = 204

> >
>
> If we got here, we are working with new daemon or proxy, and both
> of them do not need auth, so we can set 'needs_auth' to False
> if OPTIONS returned 200 OK.

Which is what this code does, unless I'm misunderstanding things.

Accessing readonly ticket will return empty features list, and h['needs_auth']
will be True, while this server does not need auth. Yes, it cannot happen
with this code but it is not the right check.

> [snipped]
>
> +def pwrite(h, buf, offset):
> > + http = h['http']
> > + transfer=h['transfer']
> > + transfer_service=h['transfer_service']
> > +
> > + count = len(buf)
> > + h['highestwrite'] = max(h['highestwrite'], offset+count)
> > +
> > + http.putrequest("PUT", h['path'] + "?flush=n")
> >
>
> "?flush=n" has effect only if h["can_flush"] is True. Older daemon/proxy
> do not know support disabling flushing. The docs mention that this
> query string will be added in 1.3, we are now at 1.2.

But it doesn't seem to break the old imageio?

Old imageio just ignores this, so it is safe to use as is, the issue is again
confusing the reader that this has some effect on the server.

[snipped]
> > + # Construct the JSON request for zeroing.

> > + buf = json.dumps({'op': "zero",
> > + 'offset': offset,
> > + 'size': count,
> > + 'flush': False})
> > +
> > + http.putrequest("PATCH", h['path'])
> > + http.putheader("Content-Type", "application/json")
> > + if h['needs_auth']:
> > + http.putheader("Authorization", transfer.signed_ticket)
> >
>
> Only GET and PUT on a server that does not implement OPTIONS need auth.
>
> This will work if h['needs_auth'] is set correctly, but I think it is
> better to include
> this check only in pread() and pwrite(), and add a comment there that this
> is
> need to support legacy versions.

So I think you mean that we can remove the if h['needs_auth']
and following line completely?

Yes, probably can be removed.

[snipped]
> > + buf = json.dumps({'op': "flush",
> > + 'offset': 0,
> > + 'size': 0,
> > + 'flush': True})
> >
>
> Should be (discussed in v6)
>
> buf = json.dumps({"op": "flush"})

This contradicts the documentation, but I can change the code.

Agree, I'll fix the docs.

Nir