On Thu, Nov 28, 2019 at 10:58 PM Richard W.M. Jones <rjones(a)redhat.com> wrote:
On Thu, Nov 28, 2019 at 09:34:18PM +0200, Nir Soffer wrote:
> We were not considering failures while initializing the transfer. In
> this case the transfer phase can change to PAUSED_SYSTEM or
> FINISHED_FAILURE, and transfer_url will be None, which failed the
> upload with a misleading error:
>
> RuntimeError: direct upload to host not supported, requires
> ovirt-engine >= 4.2 and only works when virt-v2v is run within the
> oVirt/RHV environment, eg. on an oVirt node
>
> Change the wait loop to consider all cases:
> - Transfer failed and was removed
> - Transfer failed and will be removed soon
> - Transfer paused by the system (cancel required)
> - Unexpected transfer phase (cancel required)
> - Timeout waiting for TRANSFERRING state (cancel required)
>
> Reported-by: Xiaodai Wang
> ---
>
> I could easy simulate the case when the system paused the transfer by
> injecting an error in vdsm, failing transfer initialization.
>
> The import fail with:
>
> nbdkit: python[1]: error:
/home/nsoffer/src/virt-v2v/tmp/rhvupload.1DgXyh/rhv-upload-plugin.py: open: error:
Traceback (most recent call last):
> File
"/home/nsoffer/src/virt-v2v/tmp/rhvupload.1DgXyh/rhv-upload-plugin.py", line
109, in open
> transfer = create_transfer(connection, disk, host)
> File
"/home/nsoffer/src/virt-v2v/tmp/rhvupload.1DgXyh/rhv-upload-plugin.py", line
539, in create_transfer
> "transfer %s was paused by system" % transfer.id)
> RuntimeError: transfer 32b97384-ac8b-40d5-b423-26d31faabe32 was paused by system
>
> I could not simulate the other cases. This probaly requires injecting
> errors in engine.
You might be able to inject errors more easily than that by modifying
the test harness (tests/test-v2v-o-rhv-upload-module/ovirtsdk4/).
Good idea. I'm testing currently with real ovirt setup which is pretty easy
when you have one, but we should improve the automated tests.
I also cannot run the tests, "make check" fail very early with:
SRCDIR=. LAYOUT=partitions ../../run --test ./make-fedora-img.pl
Can't locate Sys/Guestfs.pm in @INC (you may need to install the
Sys::Guestfs module) (@INC contains: /usr/local/lib64/perl5
/usr/local/share/perl5 /usr/lib64/perl5/vendor_perl
/usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5) at
/home/nsoffer/src/virt-v2v/test-data/phony-guests/make-fedora-img.pl
line 27.
BEGIN failed--compilation aborted at
/home/nsoffer/src/virt-v2v/test-data/phony-guests/make-fedora-img.pl
line 27.
/home/nsoffer/src/virt-v2v/run: command failed with exit code 2
Anyway patch looks reasonable, although I didn't test it, so:
ACK
Rich.
> v2v/rhv-upload-plugin.py | 40 +++++++++++++++++++++++++++++++---------
> 1 file changed, 31 insertions(+), 9 deletions(-)
>
> diff --git a/v2v/rhv-upload-plugin.py b/v2v/rhv-upload-plugin.py
> index 75e4f404..3942ec72 100644
> --- a/v2v/rhv-upload-plugin.py
> +++ b/v2v/rhv-upload-plugin.py
> @@ -513,21 +513,43 @@ def create_transfer(connection, disk, host):
> # Get a reference to the created transfer service.
> transfer_service = transfers_service.image_transfer_service(transfer.id)
>
> - # After adding a new transfer for the disk, the transfer's status
> - # will be INITIALIZING. Wait until the init phase is over. The
> - # actual transfer can start when its status is "Transferring".
> + # Wait until transfer's phase change from INITIALIZING to TRANSFERRING. On
> + # errors transfer's phase can change to PAUSED_SYSTEM or FINISHED_FAILURE.
> + # If the transfer was paused, we need to cancel it to remove the disk,
> + # otherwise the system will remove the disk and transfer shortly after.
> +
> endt = time.time() + timeout
> while True:
> - transfer = transfer_service.get()
> - if transfer.phase != types.ImageTransferPhase.INITIALIZING:
> + time.sleep(1)
> + try:
> + transfer = transfer_service.get()
> + except sdk.NotFoundError:
> + # The system has removed the disk and the transfer.
> + raise RuntimeError("transfer %s was removed" % transfer.id)
> +
> + if transfer.phase == types.ImageTransferPhase.FINISHED_FAILURE:
> + # The system will remove the disk and the transfer soon.
> + raise RuntimeError(
> + "transfer %s has failed" % transfer.id)
> +
> + if transfer.phase == types.ImageTransferPhase.PAUSED_SYSTEM:
> + transfer_service.cancel()
> + raise RuntimeError(
> + "transfer %s was paused by system" % transfer.id)
> +
> + if transfer.phase == types.ImageTransferPhase.TRANSFERRING:
> break
> - if time.time() > endt:
> +
> + if transfer.phase != types.ImageTransferPhase.INITIALIZING:
> transfer_service.cancel()
> raise RuntimeError(
> - "timed out waiting for transfer %s status !=
INITIALIZING"
> - % transfer.id)
> + "unexpected transfer %s phase %s"
> + % (transfer.id, transfer.phase))
>
> - time.sleep(1)
> + if time.time() > endt:
> + transfer_service.cancel()
> + raise RuntimeError(
> + "timed out waiting for transfer %s" % transfer.id)
>
> return transfer
>
> --
> 2.21.0
--
Richard Jones, Virtualization Group, Red Hat
http://people.redhat.com/~rjones
Read my programming and virtualization blog:
http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html