On Thu, Mar 22, 2018 at 5:25 PM Richard W.M. Jones <rjones@redhat.com> wrote:
PROBLEMS:
 - -of qcow2 does not work, with multiple problems
    * needs to set NBD size to something larger than virtual size

How is this related to the actual file size specified when you create a disk?

In block storage, a qcow2 image is always on a thin provisioned disk, which
in oVirt is a regular logical volume, created at the requested "initial_size":

From:
https://github.com/oVirt/ovirt-engine-sdk/blob/aba2a83ec94ecac1594adfff62d3cfcfdbdef416/sdk/examples/upload_disk.py#L113

if image_info["format"] == "qcow2":
    disk_format = types.DiskFormat.COW
else:
    disk_format = types.DiskFormat.RAW

disks_service = connection.system_service().disks_service()
disk = disks_service.add(
    disk=types.Disk(
        name=os.path.basename(image_path),
        description='Uploaded disk',
        format=disk_format,
        initial_size=image_size,
        provisioned_size=image_info["virtual-size"],
        sparse=disk_format == types.DiskFormat.COW,
        storage_domains=[
            types.StorageDomain(
                name='mydata'
            )
        ]
    )
)
Internally we allocate 10% more than the requested initial_size
to leave room for qcow2 metadata.

See:
https://github.com/oVirt/vdsm/blob/3de6f326d7e338b064b11d2a2269500a3857098b/lib/vdsm/storage/blockVolume.py#L328



https://github.com/oVirt/vdsm/blob/3de6f326d7e338b064b11d2a2269500a3857098b/lib/vdsm/storage/volume.py#L666
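For what it's worth, the client can estimate a sane initial_size with
"qemu-img measure" plus the same ~10% metadata reserve vdsm uses. A rough
sketch (estimate_initial_size and metadata_overhead are hypothetical helper
names, not part of any SDK):

```python
import json
import subprocess

def metadata_overhead(required):
    # Same reserve as vdsm's blockVolume: 10% on top of the measured
    # size, to leave room for qcow2 metadata growth.
    return required * 110 // 100

def estimate_initial_size(image_path):
    # "required" is qemu's lower bound on the bytes a qcow2 copy of
    # this image needs (data + metadata).
    out = subprocess.check_output(
        ["qemu-img", "measure", "--output=json", "-O", "qcow2", image_path])
    return metadata_overhead(json.loads(out)["required"])
```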




Do we have other problems?
 
 - Cannot choose the datacenter.

The storage domain belongs to some data center, so I don't think you
need to select a data center.

[snipped]

 diff --git a/v2v/rhv-upload-plugin.py b/v2v/rhv-upload-plugin.py
new file mode 100644
index 000000000..979cbd63b
--- /dev/null
+++ b/v2v/rhv-upload-plugin.py
@@ -0,0 +1,414 @@
+# -*- python -*-
+# oVirt or RHV upload nbdkit plugin used by ‘virt-v2v -o rhv-upload’
+# Copyright (C) 2018 Red Hat Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License along
+# with this program; if not, write to the Free Software Foundation, Inc.,
+# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
+
+import builtins
+import json
+import logging
+import ovirtsdk4 as sdk
+import ovirtsdk4.types as types

It is good practice to separate stdlib imports from 3rd party imports
like ovirtsdk; it helps to understand a module's dependencies.
 
[snipped] 
+from http.client import HTTPSConnection
+from urllib.parse import urlparse

These will not work in python 2, but you can use six.moves to have
code that works on both 2 and 3.
 
[snipped] 
+    # Connect to the server.
+    connection = sdk.Connection(
+        url = params['output_conn'],
+        username = username,
+        password = password,
+        ca_file = params['rhv_cafile'], 

Can this be None?
 
+        log = logging.getLogger(),
+        insecure = params['insecure'],

If ca_file cannot be None, then insecure is not needed, based
on Ondra's review of an earlier version.
 
[snipped] 
+    # Create the disk.
+    disks_service = system_service.disks_service()
+    if params['disk_format'] == "raw":
+        disk_format = types.DiskFormat.RAW
+    else:
+        disk_format = types.DiskFormat.COW
+    disk = disks_service.add(
+        disk = types.Disk(
+            name = params['disk_name'],
+            description = "Uploaded by virt-v2v",
+            format = disk_format,
+            provisioned_size = params['disk_size'],

This must be the virtual size.

You don't specify initial_size - in this case you get 1G, and most
images will fail to upload.
 
+            sparse = params['output_sparse'],

The user cannot configure that. This must be based on the image
format. The current code may create images in unsupported
combinations, e.g. raw/sparse on block storage, or fail validation
in the engine.
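In other words, something along these lines - a sketch that only builds the
keyword arguments (disk_create_args is a hypothetical helper; the real code
would pass these to types.Disk/disks_service.add, and the sparse rule here
assumes block storage):

```python
def disk_create_args(params, initial_size):
    # provisioned_size is the virtual size; initial_size is what gets
    # allocated up front (without it oVirt gives you 1G and most
    # uploads fail).  sparse must follow the image format, not a user
    # option: qcow2 -> sparse, raw -> preallocated on block storage.
    sparse = params["disk_format"] != "raw"
    return dict(
        name=params["disk_name"],
        description="Uploaded by virt-v2v",
        format=params["disk_format"],
        provisioned_size=params["disk_size"],
        initial_size=initial_size,
        sparse=sparse,
    )
```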
 
[snipped]
+# Can we issue zero, trim or flush requests?
+def get_options(h):
+    if h['got_options']:
+        return
+    h['got_options'] = True
+
+    http = h['http']
+    transfer=h['transfer']
+
+    http.putrequest("OPTIONS", h['path'])
+    http.putheader("Authorization", transfer.signed_ticket)
+    http.endheaders()
+
+    r = http.getresponse()
+    if r.status == 200:
+        j = json.loads(r.read())
+        h['can_zero'] = "zero" in j['features']
+        h['can_trim'] = "trim" in j['features']
+        h['can_flush'] = "flush" in j['features']
+
+        # If can_zero is true then it means we're using the new imageio
+        # which never needs the Authorization header.
+        if h['can_zero']:
+            h['needs_auth'] = False

If we got here, we are working with a new daemon or proxy, and
neither of them needs auth, so we can set 'needs_auth' to False
if OPTIONS returned 200 OK.
 
+    # Old imageio servers returned 405 Method Not Allowed.  If
+    # we see that we just say that no operations are allowed and
+    # we will emulate them.
+    elif r.status == 405:
+        h['can_zero'] = False
+        h['can_trim'] = False
+        h['can_flush'] = False

I would set all the defaults when creating the state dict. This ensures
that we don't forget needs_auth or other keys and crash later with
KeyError, and makes it easier to understand what state is managed
by the plugin.
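Something like this in open(), before the transfer starts (a sketch;
DEFAULT_STATE and make_state are hypothetical names):

```python
# All state keys the plugin ever touches, with safe defaults assuming
# the oldest daemon/proxy; get_options() then upgrades them.
DEFAULT_STATE = {
    "got_options": False,
    "can_zero": False,
    "can_trim": False,
    "can_flush": False,
    "needs_auth": True,   # old daemon/proxy: auth needed on GET/PUT
    "failed": False,
    "highestwrite": 0,
}

def make_state(**handles):
    # handles: http, transfer, transfer_service, disk, disk_service, ...
    h = dict(DEFAULT_STATE)
    h.update(handles)
    return h
```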
 
[snipped]

+def pwrite(h, buf, offset):
+    http = h['http']
+    transfer=h['transfer']
+    transfer_service=h['transfer_service']
+
+    count = len(buf)
+    h['highestwrite'] = max(h['highestwrite'], offset+count)
+
+    http.putrequest("PUT", h['path'] + "?flush=n")

"?flush=n" has an effect only if h["can_flush"] is True. Older daemon/proxy
versions do not support disabling flushing. The docs mention that this
query string will be added in 1.3; we are now at 1.2.
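One way to keep old servers working (a sketch; write_path is a
hypothetical helper):

```python
def write_path(h, path):
    # Older daemon/proxy (imageio < 1.3) do not understand the
    # flush=n query string, so append it only when OPTIONS told us
    # the server supports per-request flush control.
    if h["can_flush"]:
        return path + "?flush=n"
    return path
```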
 
+    if h['needs_auth']:
+        http.putheader("Authorization", transfer.signed_ticket) 
+    # The oVirt server only uses the first part of the range, and the
+    # content-length.
 
[snipped]
 
+def zero(h, count, offset, may_trim):
+    http = h['http']
+    transfer=h['transfer']
+    transfer_service=h['transfer_service']
+
+    # Unlike the trim and flush calls, there is no 'can_zero' method
+    # so nbdkit could call this even if the server doesn't support
+    # zeroing.  If this is the case we must emulate.
+    if not h['can_zero']:
+        emulate_zero(h, count, offset)
+        return
+
+    # Construct the JSON request for zeroing.
+    buf = json.dumps({'op': "zero",
+                      'offset': offset,
+                      'size': count,
+                      'flush': False})
+
+    http.putrequest("PATCH", h['path'])
+    http.putheader("Content-Type", "application/json")
+    if h['needs_auth']:
+        http.putheader("Authorization", transfer.signed_ticket)

Only GET and PUT on a server that does not implement OPTIONS need auth.

This will work if h['needs_auth'] is set correctly, but I think it is better to include
this check only in pread() and pwrite(), and add a comment there that this is
needed to support legacy versions.
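e.g. a small helper used only from pread() and pwrite() - a sketch that
just computes the headers (auth_header_if_needed is a hypothetical name;
the real code would feed the result to http.putheader):

```python
def auth_header_if_needed(h, transfer):
    # Only legacy daemon/proxy (no OPTIONS support) require auth, and
    # only on GET and PUT - so call this from pread()/pwrite() only.
    if h["needs_auth"]:
        return {"Authorization": transfer.signed_ticket}
    return {}
```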
 
+    http.putheader("Content-Length", len(buf))
+    http.endheaders()
+    http.send(buf)
+
+    r = http.getresponse()
+    if r.status != 200:
+        transfer_service.pause()
+        h['failed'] = True
+        raise RuntimeError("could not zero sector (%d, %d): %d: %s" %
+                           (offset, count, r.status, r.reason))
+
+# qemu-img convert starts by trying to zero/trim the whole device.
+# Since we've just created a new disk it's safe to ignore these
+# requests as long as they are smaller than the highest write seen.
+# After that we must emulate them with writes.

I think this comment is not related to this code. Maybe it belongs in
pwrite(), where we compute highestwrite?
 
+def emulate_zero(h, count, offset):
+    if offset+count < h['highestwrite']:
+        http.putrequest("PUT", h['path'] + "?flush=n")

This is used only on an old daemon/proxy, where the flush query has no effect.
 
[snipped]

+def trim(h, count, offset):
+    http = h['http']
+    transfer=h['transfer']
+    transfer_service=h['transfer_service']
+
+    # Construct the JSON request for trimming.
+    buf = json.dumps({'op': "trim",
+                      'offset': offset,
+                      'size': count,
+                      'flush': False})
+
+    http.putrequest("PATCH", h['path'])
+    http.putheader("Content-Type", "application/json")
+    if h['needs_auth']:
+        http.putheader("Authorization", transfer.signed_ticket)

Never needed.
 
[snipped]
+def flush(h):
+    http = h['http']
+    transfer=h['transfer']
+    transfer_service=h['transfer_service']
+
+    # Construct the JSON request for flushing.  Note the offset
+    # and count must be present but are ignored by imageio.
+    buf = json.dumps({'op': "flush",
+                      'offset': 0,
+                      'size': 0,
+                      'flush': True})

Should be (discussed in v6)

    buf = json.dumps({"op": "flush"})
 
+
+    http.putrequest("PATCH", h['path'])
+    http.putheader("Content-Type", "application/json")
+    if h['needs_auth']:
+        http.putheader("Authorization", transfer.signed_ticket)

Never needed.
 
[snipped]
+def close(h):
+    http = h['http']
+    connection = h['connection']
+
+    http.close()
+
+    # If we didn't fail, then finalize the transfer.
+    if not h['failed']:
+        disk = h['disk']
+        transfer_service=h['transfer_service']
+
+        transfer_service.finalize()

If this raises, we never clean up
 
[snipped]
+        # Write the disk ID file.  Only do this on successful completion.
+        with builtins.open(params['diskid_file'], 'w') as fp:
+            fp.write(disk.id)

If this raises, we will not remove the disk, or close the connection.
 
+
+    # Otherwise if we did fail then we should delete the disk.
+    else:
+        disk_service = h['disk_service']
+        disk_service.remove()
+
+    connection.close()

[snipped]
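To make the cleanup robust, close() could be structured roughly like this
(a sketch; safe_close and write_diskid are hypothetical names):

```python
def safe_close(h, write_diskid):
    # Whatever happens, the disk is either finalized (with its id
    # recorded) or removed, and the SDK connection is always closed.
    try:
        h["http"].close()
        if h["failed"]:
            h["disk_service"].remove()
        else:
            try:
                h["transfer_service"].finalize()
                write_diskid(h["disk"].id)
            except Exception:
                # finalize or the disk-id write failed: do not leak
                # the half-finished disk.
                h["disk_service"].remove()
                raise
    finally:
        h["connection"].close()
```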

I did not check all the SDK related code, I'm not very familiar with the SDK.

Thanks for creating this, and sorry for the bad documentation on our side :-)

Nir