On 12/19/21 00:30, Nir Soffer wrote:
> Laszlo, can you explain why we need the number of allocated bytes
> after the disk was already converted to the target storage?
It's all described in commit a2a4f7a09996 ("lib/create_ovf: populate
"actual size" attributes again", 2021-12-10):
commit a2a4f7a09996a5e66d027d0d9692e083eb0a8128
Author: Laszlo Ersek <lersek(a)redhat.com>
Date: Fri Dec 10 12:35:37 2021 +0100
lib/create_ovf: populate "actual size" attributes again
Commit 255722cbf39a ("v2v: Modular virt-v2v", 2021-09-07) removed the
following attributes from the OVF output:
- ovf:Envelope/References/File/@ovf:size
- ovf:Envelope/Section[@xsi:type='ovf:DiskSection_Type']/Disk/@ovf:actual_size
Unfortunately, ovirt-engine considers the second one mandatory; without
it, ovirt-engine refuses to import the OVF.
Restore both attributes, using the utility functions added to the Utils
module previously.
(If we do not have the information necessary to fill in @ovf:actual_size,
we still have to generate the attribute, only with empty contents.
Ovirt-engine does cope with that.)
Fixes: 255722cbf39afc0b012e2ac00d16fa6ba2f8c21f
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2027598
Signed-off-by: Laszlo Ersek <lersek(a)redhat.com>
Message-Id: <20211210113537.10907-4-lersek(a)redhat.com>
Acked-by: Richard W.M. Jones <rjones(a)redhat.com>
Basically if we place an OVF file in the Export Storage Domain directory
of ovirt-engine such that the OVF lacks the above-mentioned
"actual_size" attribute, then ovirt-engine rejects the OVF file with a
parse error.
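
To make this concrete, here is a hand-written skeleton of the part of
the OVF that the two XPath expressions in the commit message refer to
(the structure is inferred from those XPaths; all names and values are
made up, and the empty actual_size shows the degenerate case that
ovirt-engine still accepts):

  <ovf:Envelope ...>
    <References>
      <File ovf:href="..." ovf:id="..." ovf:size="1073741824"/>
    </References>
    <Section xsi:type="ovf:DiskSection_Type">
      <Disk ovf:diskId="..." ovf:actual_size=""/>
    </Section>
  </ovf:Envelope>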
This behavior is actually a regression from ovirt-engine commit
1082d9dec289 ("core: undo recent generalization of ovf processing",
2017-08-09); however, we first triggered this ovirt-engine bug in
virt-v2v commit 255722cbf39a ("v2v: Modular virt-v2v", 2021-09-07).
Therefore restoring the previous virt-v2v behavior makes sense. We could
of course just provide an empty actual_size='' attribute (as explained
above), but that would still not restore the original virt-v2v behavior,
and I cannot tell what the consequences for ovirt-engine would be.
Back to your email:
On 12/19/21 00:30, Nir Soffer wrote:
> Looking at the bug:
> https://bugzilla.redhat.com/2027598
>
> This is a fix for the issue in the old rhv output, which is deprecated and
I made the exact same argument before my analysis / fix, namely in
bullet (1) of <https://bugzilla.redhat.com/show_bug.cgi?id=2027598#c16>.
Please see Rich's answer to that at
<https://bugzilla.redhat.com/show_bug.cgi?id=2027598#c19>. That was the
reason we continued working on the analysis / patch at all.
> should not be used in new code, and it is not compatible with
> rhv-upload which is the modern way to import into RHV.
>
> Also this cannot work for RHV now, so we need a way to disable this
> when importing to RHV.
>
> We can expose the block status in RHV plugin, but to use this we must
> call block status after the disk is converted, but before the output
> is finalized, since when you finalize RHV disconnect the disk from the
> host, and the RHV plugin cannot access it any more.
The code very much depends on being run during finalization -- please
search commit a2a4f7a09996 for the word "finalization".
The disconnection that you describe -- does that happen in the
"rhv-upload-finalize.py" script?
This is what we've got in "output/output_rhv_upload.ml":
  (* Finalize all the transfers. *)
  let json_params =
    let ids = List.map (fun id -> JSON.String id) transfer_ids in
    let json_params = ("transfer_ids", JSON.List ids) :: json_params in
    let ids = List.map (fun uuid -> JSON.String uuid) disk_uuids in
    let json_params = ("disk_uuids", JSON.List ids) :: json_params in
    json_params in
  if Python_script.run_command finalize_script json_params [] <> 0 then
                 <------- is the output disk detached here?
    error (f_"failed to finalize the transfers, see earlier errors");

  (* The storage domain UUID. *)
  let sd_uuid =
    match rhv_storagedomain_uuid with
    | None -> assert false
    | Some uuid -> uuid in

  (* The volume and VM UUIDs are made up. *)
  let vol_uuids = List.map (fun _ -> uuidgen ()) disk_sizes
  and vm_uuid = uuidgen () in

  (* Create the metadata. *)
  let ovf =
    Create_ovf.create_ovf source inspect target_meta disk_sizes
                 <------- output NBD block status collected here
      Sparse output_format sd_uuid disk_uuids vol_uuids dir vm_uuid
      OVirt in
  let ovf = DOM.doc_to_string ovf in
Back to your email:
On 12/19/21 00:30, Nir Soffer wrote:
> The flow for rhv-upload should be:
>
> run nbdcopy
> query block status
> finalize output
> create ovf
Currently, commit a2a4f7a09996 passes the temp dir, where the NBD
sockets live, to "create_ovf". Then "create_ovf" collects the block
status for each disk.
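
For reference, that collection boils down to connecting to each NBD
socket under the temp dir and summing the allocated extents; roughly
like this from-memory sketch using the ocaml-libnbd bindings (names and
the exact callback signature may differ from the real code in
lib/utils.ml):

  (* Sketch only: count the allocated bytes of the output disk behind
   * the NBD Unix socket "<dir>/out<disknr>"; None if the
   * "base:allocation" context is unavailable.
   *)
  let get_disk_allocated ~dir ~disknr =
    let socket = Printf.sprintf "%s/out%d" dir disknr in
    let nbd = NBD.create () in
    NBD.add_meta_context nbd "base:allocation";
    NBD.connect_unix nbd socket;
    let ret =
      if NBD.can_meta_context nbd "base:allocation" then (
        let allocated = ref 0L in
        let size = NBD.get_size nbd in
        (* Simplification: one request for the whole disk.  A real
         * implementation must loop, because the server may describe
         * only a prefix of the requested range.
         *)
        NBD.block_status nbd size 0L (
          fun _metacontext _offset entries _err ->
            (* "entries" is a flat array of (length, flags) pairs;
             * flags bit 0 set means "hole", i.e. unallocated.
             *)
            let i = ref 0 in
            while !i + 1 < Array.length entries do
              if Int32.logand entries.(!i + 1) 1l = 0l then (
                (* extent lengths are unsigned 32-bit values *)
                let len =
                  Int64.logand (Int64.of_int32 entries.(!i)) 0xFFFF_FFFFL in
                allocated := Int64.add !allocated len
              );
              i := !i + 2
            done;
            0
        );
        Some !allocated
      )
      else None in
    NBD.shutdown nbd;
    ret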
One of the other approaches I outlined earlier was this: let each output
plugin collect the list of block statuses (-> "int64 option list")
before calling "create_ovf". Then pass that *list* to "create_ovf".
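
In code, that alternative would look roughly like this inside
"rhv_upload_finalize" (a hypothetical sketch: "get_disk_allocated"
stands for whatever helper actually reads the extent map, and
create_ovf's signature would have to grow the new list parameter in
place of "dir"):

  (* Hypothetical: collect the actual sizes *before* finalization,
   * while the NBD sockets under "dir" are still alive.
   *)
  let disk_actual_sizes : int64 option list =
    List.mapi (fun i _ -> get_disk_allocated ~dir ~disknr:i) disk_sizes in

  (* ... finalize the transfers, detaching the disks ... *)

  (* Pass the precomputed list, rather than "dir", to create_ovf. *)
  let ovf =
    Create_ovf.create_ovf source inspect target_meta disk_sizes
      disk_actual_sizes Sparse output_format sd_uuid disk_uuids
      vol_uuids vm_uuid OVirt in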
Please see here
<https://listman.redhat.com/archives/libguestfs/2021-December/msg00116.html>:
Yet another idea was to collect the actual sizes for all the output
disks in one go, separately, then pass in that list, to create_ovf. Then
in create_ovf, I would replace:
  (* Iterate over the disks, adding them to the OVF document. *)
  List.iteri (
    fun i (size, image_uuid, vol_uuid) ->
      ...
  ) (List.combine3 sizes image_uuids vol_uuids)
with "combine4", adding a fourth component (the actual size) to the
tuple parameter of the nameless iterator function.
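
Spelled out (a sketch -- neither "combine4" nor the actual size list
exists in the tree), that would be:

  (* A combine4 in the style of the existing List.combine3: zip four
   * equal-length lists into a list of 4-tuples.
   *)
  let rec combine4 xs ys zs ws =
    match xs, ys, zs, ws with
    | [], [], [], [] -> []
    | x :: xs, y :: ys, z :: zs, w :: ws ->
       (x, y, z, w) :: combine4 xs ys zs ws
    | _ -> invalid_arg "combine4"

  (* ... with the iteration in create_ovf becoming: *)
  List.iteri (
    fun i (size, actual_size, image_uuid, vol_uuid) ->
      ...
  ) (combine4 sizes actual_sizes image_uuids vol_uuids)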
This complication did not seem justified at that point (see the last
paragraph at
<https://listman.redhat.com/archives/libguestfs/2021-December/msg00119.html>).
I can do this *if*:
- you can please tell me *where exactly* I should collect the block
statuses inside the "rhv_upload_finalize" function
[output/output_rhv_upload.ml] -- that is, if you can confirm where the
disks get detached (because I need to collect the actual sizes some
place before that)
- AND we can confirm if the "actual sizes" are actually useful for *all*
output plugins different from the old rhv output. The whole thing is
being done for the old rhv output's sake -- if neither "rhv-upload"
nor "vdsm" (= the other two OVF-creating output plugins) need the
actual disk sizes, then the actual list construction should be unique
to "output/output_rhv_upload.ml".
Sorry about the regression, my only excuse is that the comment that
commit a2a4f7a09996 removed:
- (* virt-v2v 1.4x used to try to collect the actual size of the
- * sparse disk. It would be possible to get this information
- * accurately now by reading the extent map of the output disk
- * (this function is called during finalization), but we don't
- * yet do that. XXX
- *)
implied that fetching the extent map was safe during finalization. The
rhv-upload output plugin, however, does not conform to that invariant.
Thanks,
Laszlo