On Tue, Aug 3, 2021 at 10:25 PM Richard W.M. Jones <rjones@redhat.com> wrote:
On Tue, Aug 03, 2021 at 09:18:50PM +0300, Nir Soffer wrote:
> On Tue, Aug 3, 2021 at 8:48 PM Richard W.M. Jones <rjones@redhat.com> wrote:
> > Since the pool is a global (there is only one plugin instance, but
> > open() is called 4 times), I believe this actually limits the pool to
> > 4, unless max_readers/writers overrides it.
>
> I see, so if I understand how it works correctly, we have:
>
> after fork:
> - called once
> - create pool of connections
Yes. Note there's only one nbdkit process and one Python interpreter
running even if the client connects multiple times.
> open:
> - called once per nbd connection
> - does nothing since we already opened the connections
Yup.
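In plugin code the lifecycle sketched above looks roughly like this
(a sketch only; create_http_pool is from the quoted patch, and url and
options stand in for the real configuration):

    pool = None

    def after_fork():
        # Runs once per nbdkit process, even if the client opens
        # several NBD connections: create the shared http pool here.
        global pool
        pool = create_http_pool(url, options)

    def open(readonly):
        # Runs once per NBD connection; the pool already exists, so
        # just return an opaque handle.
        return 1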
> write/zero:
> - called once per nbd command
> - pick random connection and send the request
Yes. But crucially, because the thread model is parallel, and because
certain Python http requests block and release the Python GIL, nbdkit
can call into the plugin in parallel even on the same handle.
But when running Python code we are never preempted unless we call
some C extension that blocks and releases the GIL.
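So a data callback is just "borrow a connection, do the request, put
it back", along these lines (a sketch; send_write is a placeholder for
the actual PUT request sent to imageio):

    def pwrite(h, buf, offset, flags):
        # Any NBD connection may use any http connection; Queue.get()
        # blocks only when all connections are busy.
        http = pool.get()
        try:
            send_write(http, buf, offset)  # placeholder helper
        finally:
            pool.put(http)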
> flush:
> - called once per nbd connection
> - flush all connections
Yes. However this is controlled by the client.
It happens that both qemu-img convert and nbdcopy do a flush operation
right at the end. qemu-img convert doesn't support multi-conn, so
there's only one NBD connection and one flush.
nbdcopy happens to call flush on every NBD connection (but that's just
because we are being over-cautious, I think multi-conn guarantees mean
we don't have to do this).
https://gitlab.com/nbdkit/libnbd/-/blob/3d9d23b1b0d1df049782555ad602476c3...
According to Eric, there is no need to do multiple flushes, since the
qemu-nbd and imageio implementations ensure that every flush is
visible to all clients.
So we can simplify the plugin to send one flush on one of the connections
instead of waiting for all connections and sending one flush per http
connection.
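i.e. something like this sketch (send_flush is a placeholder for the
flush request sent to imageio):

    def flush(h, flags):
        # One flush on any http connection is enough: qemu-nbd and
        # imageio make flushed data visible on all connections.
        http = pool.get()
        try:
            send_flush(http)  # placeholder helper
        finally:
            pool.put(http)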
> close:
> - called once per nbd connection
> - close all connections on first call
> - does nothing on later call
I think the patch I posted had a mistake in it where I closed the
whole thread pool in close().  In my own copy the thread pool is
destroyed on cleanup() (when nbdkit exits).  However this requires a
small change to nbdkit:
https://gitlab.com/nbdkit/nbdkit/-/commit/f2fe99e4b0f54467ab8028eaf2d039c...
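With that change the teardown is simple (a sketch, assuming the
patched nbdkit calls a module-level cleanup()):

    def cleanup():
        # Runs once when nbdkit exits: drain the pool and close every
        # http connection.
        while not pool.empty():
            pool.get_nowait().close()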
> So we have some issues:
> - calling flush 16 times instead of 4 times
Yes I think so, but probably the later flushes don't do very much work?
Right, these are just pointless calls, so no harm done.
> Can we disable the parallel threading model with multi_con, or must
> we use it to have concurrent calls?
The client decides how many connections to make, and that happens
after we've made the decision to choose a thread model.
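For reference, the plugin declares the thread model like this (the
nbdkit Python binding exports the THREAD_MODEL_* constants):

    import nbdkit

    def thread_model():
        # Let nbdkit call into the plugin in parallel, even on the
        # same handle; the client independently chooses how many NBD
        # connections to open.
        return nbdkit.THREAD_MODEL_PARALLEL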
> > > > -def create_http_pool(url, host, options):
> > > > +# Connection pool.
> > > > +def create_http_pool(url, options):
> > > > pool = queue.Queue()
> > > >
> > > > count = min(options["max_readers"],
> > > > options["max_writers"],
> > > > MAX_CONNECTIONS)
> > >
> > > Note that this logic must move to virt-v2v now. If imageio does
> > > not support multiple writers, you should not use multi_con.
> > > Since we create the transfer with format="raw", we should always
> > > support 8 writers, but for correctness this must be enforced
> > > using the max_writers option.
> >
> > I don't understand this comment.
>
> In create_transfer we create the image transfer with:
>
> format="raw"
>
> This enables the nbd backend in imageio. The system starts qemu-nbd
> and imageio connects to qemu-nbd instead of using the file backend
> opening the actual disk. The nbd backend supports multiple writers, but
> the file backend does not.
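Roughly like this sketch using the oVirt SDK (the connection and
service lookups are elided; disk_id is illustrative):

    import ovirtsdk4.types as types

    transfer = transfers_service.add(
        types.ImageTransfer(
            disk=types.Disk(id=disk_id),
            direction=types.ImageTransferDirection.UPLOAD,
            # format=raw makes imageio start qemu-nbd and use its nbd
            # backend, which supports multiple writers; the file
            # backend does not.
            format=types.DiskFormat.RAW,
        )
    )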
I see. However I'm not sure I understand why the logic for choosing
"count" has to move to virt-v2v.
I did not understand how things are wired up before. I think the
current way can work, giving an m:n mapping between nbd connections
and http connections.