On Tue, Aug 3, 2021 at 10:25 PM Richard W.M. Jones <rjones@redhat.com> wrote:
On Tue, Aug 03, 2021 at 09:18:50PM +0300, Nir Soffer wrote:
> On Tue, Aug 3, 2021 at 8:48 PM Richard W.M. Jones <rjones@redhat.com> wrote:
> > Since the pool is a global (there is only one plugin instance, but
> > open() is called 4 times), I believe this actually limits the pool to
> > 4, unless max_readers/writers overrides it.
>
> I see, so if I understand how it works correctly, we have:
>
> after fork:
> - called once
> - create pool of connections
Yes. Note there's only one nbdkit process and one Python interpreter
running even if the client connects multiple times.
> open:
> - called once per nbd connection
> - does nothing since we already opened the connections
Yup.
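In plugin code the lifecycle sketched above looks roughly like this
(a sketch only; create_http_pool is from the quoted patch, and url and
options stand in for the real configuration):

    pool = None

    def after_fork():
        # Runs once per nbdkit process, even if the client opens
        # several NBD connections: create the shared http pool here.
        global pool
        pool = create_http_pool(url, options)

    def open(readonly):
        # Runs once per NBD connection; the pool already exists, so
        # just return an opaque handle.
        return 1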
> write/zero:
> - called once per nbd command
> - pick random connection and send the request
Yes. But crucially, because the thread model is parallel, and because
certain Python http requests block and release the Python GIL, nbdkit
can call into the plugin in parallel even on the same handle.
But when running Python code we are never preempted unless we call
some C extension that blocks and releases the GIL.
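So a data callback is just "borrow a connection, do the request, put
it back", along these lines (a sketch; send_write is a placeholder for
the actual PUT request sent to imageio):

    def pwrite(h, buf, offset, flags):
        # Any NBD connection may use any http connection; Queue.get()
        # blocks only when all connections are busy.
        http = pool.get()
        try:
            send_write(http, buf, offset)  # placeholder helper
        finally:
            pool.put(http)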
> flush:
> - called once per nbd connection
> - flush all connections
Yes. However this is controlled by the client.
It happens that both qemu-img convert and nbdcopy do a flush operation
right at the end. qemu-img convert doesn't support multi-conn, so
there's only one NBD connection and one flush.
nbdcopy happens to call flush on every NBD connection (but that's just
because we are being over-cautious, I think multi-conn guarantees mean
we don't have to do this).
https://gitlab.com/nbdkit/libnbd/-/blob/3d9d23b1b0d1df049782555ad602476c3...
According to Eric, there is no need to do multiple flushes, since the
qemu-nbd and imageio implementations ensure that every flush is
visible to all clients.
So we can simplify the plugin to send one flush on one of the connections
instead of waiting for all connections and sending one flush per http
connection.
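i.e. something like this sketch (send_flush is a placeholder for the
flush request sent to imageio):

    def flush(h, flags):
        # One flush on any http connection is enough: qemu-nbd and
        # imageio make flushed data visible on all connections.
        http = pool.get()
        try:
            send_flush(http)  # placeholder helper
        finally:
            pool.put(http)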
> close:
> - called once per nbd connection
> - close all connections on first call
> - does nothing on later call
I think the patch I posted had a mistake in it where I closed the
whole thread pool in close().  In my own copy the thread pool is
destroyed on cleanup() (when nbdkit exits).  However this requires a
small change to nbdkit:
https://gitlab.com/nbdkit/nbdkit/-/commit/f2fe99e4b0f54467ab8028eaf2d039c...
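With that change the teardown is simple (a sketch, assuming the
patched nbdkit calls a module-level cleanup()):

    def cleanup():
        # Runs once when nbdkit exits: drain the pool and close every
        # http connection.
        while not pool.empty():
            pool.get_nowait().close()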
> So we have some issues:
> - calling flush 16 times instead of 4 times
Yes I think so, but probably the later flushes don't do very much work?
Right, these are just pointless calls, so no harm done.
> Can we disable the parallel threading model with multi_con, or must
> we use it to have concurrent calls?
The client decides how many connections to make, and that happens
after we've made the decision to choose a thread model.
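For reference, the plugin declares the thread model like this (the
nbdkit Python binding exports the THREAD_MODEL_* constants):

    import nbdkit

    def thread_model():
        # Let nbdkit call into the plugin in parallel, even on the
        # same handle; the client independently chooses how many NBD
        # connections to open.
        return nbdkit.THREAD_MODEL_PARALLEL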
> > > > -def create_http_pool(url, host, options):
> > > > +# Connection pool.
> > > > +def create_http_pool(url, options):
> > > > pool = queue.Queue()
> > > >
> > > > count = min(options["max_readers"],
> > > > options["max_writers"],
> > > > MAX_CONNECTIONS)
> > >
> > > Note that this logic must move to virt-v2v now. If imageio does
> > > not support multiple writers, you should not use multi_con.
> > > Since we create the transfer with format="raw", we should always
> > > support 8 writers, but for correctness this must be enforced
> > > using the max_writers option.
> >
> > I don't understand this comment.
>
> In create_transfer we create the image transfer with:
>
> format="raw"
>
> This enables the nbd backend in imageio. The system starts qemu-nbd
> and imageio connects to qemu-nbd instead of using the file backend
> opening the actual disk. The nbd backend supports multiple writers, but
> the file backend does not.
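Roughly like this sketch using the oVirt SDK (the connection and
service lookups are elided; disk_id is illustrative):

    import ovirtsdk4.types as types

    transfer = transfers_service.add(
        types.ImageTransfer(
            disk=types.Disk(id=disk_id),
            direction=types.ImageTransferDirection.UPLOAD,
            # format=raw makes imageio start qemu-nbd and use its nbd
            # backend, which supports multiple writers; the file
            # backend does not.
            format=types.DiskFormat.RAW,
        )
    )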
I see. However I'm not sure I understand why the logic for choosing
"count" has to move to virt-v2v.
I did not understand how things are wired up before. I think the
current way can work, giving an m:n mapping between nbd connections
and http connections.