On Thursday, 22 February 2018 14:57:25 CET Richard W.M. Jones wrote:
 PROBLEMS:
  - Spools to a temporary disk
  - Need to specify direct/indirect upload via flag
  - Location of ca.pem
  - Target cluster
  - Delete disks on failure, or rename disks on success?
  - Handling of sparseness in raw format disks
 
 This adds a new output mode to virt-v2v.  virt-v2v -o rhv-upload
 streams images directly to an oVirt or RHV >= 4 Data Domain using the
 oVirt SDK v4.  It is more efficient than -o rhv because it does not
 need to go via the Export Storage Domain, and is possible for humans
 to use unlike -o vdsm.
 
 The implementation uses the Python SDK by running snippets of Python
 code to interact with the ‘ovirtsdk4’ module.  It requires both Python 3
 and the Python SDK v4 to be installed at run time (these are not,
 however, new dependencies of virt-v2v since most people wouldn't have
 them).
 --- 
Since the patch is still RFC, and I cannot comment on the actual
procedure, I'll just leave few misc comments.
 +    | `RHV_Upload ->
 +      let output_conn =
 +        match output_conn with
 +        | None ->
 +           error (f_"-o rhv-upload: output connection was not specified, use ‘-oc’
to point to the oVirt or RHV server REST API URL") 
More than "output connection", I'd just mention "REST API URL"
directly.
 +# Wait til the disk is up.  The transfer cannot start if the
 +# disk is locked.
 +disk_service = disks_service.disk_service(disk_id)
 +while True:
 +    time.sleep(5)
 +    disk = disk_service.get()
 +    if disk.status == types.DiskStatus.OK:
 +        break 
Is a busy loop the only way to wait for an OK disk?
Regardless, this (and other busy loops in the Python snippets) IMHO
need also a timeout, otherwise any error coming from oVirt will block
v2v indefinitely...
 +# This seems to give the best throughput when uploading from
Yaniv's
 +# machine to a server that drops the data. You may need to tune this
 +# on your setup.
 +BUF_SIZE = 128 * 1024
 +
 +with open(image_path, \"rb\") as disk:
 +    pos = 0
 +    while pos < image_size:
 +        # Send the next chunk to the proxy.
 +        to_read = min(image_size - pos, BUF_SIZE)
 +        chunk = disk.read(to_read)
 +        if not chunk:
 +            transfer_service.pause()
 +            raise RuntimeError(\"Unexpected end of file at pos=%%d\" %% pos)
 +
 +        proxy_connection.send(chunk)
 +        pos += len(chunk) 
No need for 'pos', just use disk.tell() to get the current position
in the file being read.
Also, I'd flip it to the other side: instead of count the read bytes
until the file size, just keep read & send bytes, and error out when
reading more data than the file size.
Maybe something like the untested:
  with open(image_path, "rb") as disk:
      while True:
          # Send the next chunk to the proxy.
          chunk = disk.read(BUF_SIZE)
          if not chunk:
              transfer_service.pause()
              raise RuntimeError("Unexpected end of file at pos=%d" %
file.tell())
          if disk.tell() > image_size:
              transfer_service.pause()
              # maybe even stat() image_path, and show more details in the exception
message
              raise RuntimeError("Read more data than expected: pos=%d vs
size=%d" % file.tell(), image_size)
          proxy_connection.send(chunk)
 +# Get the response
 +response = proxy_connection.getresponse()
 +if response.status != 200:
 +    transfer_service.pause()
 +    print(\"Upload failed: %%s %%s\" %%
 +          (response.status, response.reason))
 +    sys.exit(1) 
sys.exit() vs raise?
 +(* Find the Python 3 binary. *)
 +let find_python3 () =
 +  let rec loop = function
 +    | [] ->
 +       error "could not locate Python 3 binary on the $PATH.  You may have to
install Python 3.  If Python 3 is already installed then you may need to create a
directory containing a binary called ‘python3’ which runs Python 3."
 +    | python :: rest ->
 +       (* Use shell_command first to check the binary exists. *)
 +       let cmd = sprintf "%s --help >/dev/null 2>&1" (quote python)
in 
Std_utils.which here fits better IMHO.  If the binary is broken, then
the run_python after this will fail anyway.
 +(* Parse the -oc URI. *)
 +let parse_output_conn oc =
 +  let uri = Xml.parse_uri oc in
 +  if uri.Xml.uri_scheme <> Some "https" then
 +    error (f_"rhv-upload: -oc: URI must start with https://...");
 +  if uri.Xml.uri_server = None then
 +    error (f_"rhv-upload: -oc: no remote server name in the URI");
 +  if uri.Xml.uri_path = None || uri.Xml.uri_path = Some "/" then
 +    error (f_"rhv-upload: -oc: URI path component looks incorrect");
 +  let username =
 +    match uri.Xml.uri_user with
 +    | None ->
 +       warning (f_"rhv-upload: -oc: username was missing from URI, assuming
‘admin@internal’");
 +       "admin@internal"
 +    | Some user -> user in
 +  (* Reconstruct the URI without the username. *)
 +  let url = sprintf "%s://%s%s"
 +                    (Option.default "https" uri.Xml.uri_scheme)
 +                    (Option.default "localhost" uri.Xml.uri_server)
 +                    (Option.default "" uri.Xml.uri_path) in 
Since the checks above make sure uri.Xml.uri_server is not None, then
I'd use "invalid-hostname" or something like that instead of localhost,
just as safety measure.
 +               error (f_"rhv-upload: -of %s: Only output format
‘raw’ or ‘qcow2’ is supported.  If the input is in a different format then force one of
these output formats by adding either ‘-of raw’ or ‘-of qcow’ on the command line.")
Typo, "qcow" -> "qcow2".
 +    (* Create the metadata. *)
 +    let ovf =
 +      Create_ovf.create_ovf source targets guestcaps inspect
 +                            Sparse
 +                            "domain"
 +                            (List.map (fun t -> "") targets)
 +                            target_disk_ids
 +                            "vm" in 
Shouldn't real UUIDs (as in random UUIDs) be used instead of "domain",
and "vm"?
-- 
Pino Toscano