[NB: Adding PUBLIC mailing list because this is upstream discussion]
On Mon, Aug 03, 2020 at 06:27:04PM +0100, Richard W.M. Jones wrote:
> On Mon, Aug 03, 2020 at 06:03:23PM +0300, Nir Soffer wrote:
> > On Mon, Aug 3, 2020 at 5:47 PM Richard W.M. Jones <rjones(a)redhat.com> wrote:
> > All this make sense, but when we upload 10 disks we have 10 connections
> > but still we cannot push data fast enough. Avoiding copies will help,
> > but I don't expect huge difference.
> >
> > My guess is the issue is on the other side - pulling data from vmware.
> I can believe this too. VDDK is really slow, and especially the way
> we use it is probably not optimal either -- but it has a confusing
> threading model and I don't know if we can safely use a more parallel
> thread model:
> https://github.com/libguestfs/nbdkit/blob/89a36b1fab8302ddc370695d386a28a...
> I may have a play around with this tomorrow.
The threading model allowed by VDDK is restrictive. The rules are here:
https://code.vmware.com/docs/11750/virtual-disk-development-kit-programmi...
I did a bit of testing, and it's possible to do better than what we
are doing at the moment. Not sure at present if this will be easy or
will add a lot of complexity. Read on ...
I found through experimentation that it is possible to open multiple
VDDK handles pointing to the same disk. This would allow us to use
SERIALIZE_REQUESTS (instead of SERIALIZE_ALL_REQUESTS) and have
overlapping calls through different handles all pointing back to the
same server/disk. We would have to change all open/close calls to
make the request through a single background thread - see the document
above for why.
Adding a background thread and all the RPC needed to marshal these
calls is the part which would add the complexity.
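
To give a feel for what that marshalling involves, here is a minimal
sketch in C using pthreads. It is not nbdkit's code and does not call
VDDK: the request struct, the queue, and the names worker, call_worker
and demo_roundtrip are all hypothetical, and the "open" just hands out
an integer where a real plugin would call into VDDK. Any thread can
submit an open or close request, but only the one background thread
ever executes it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical request types; a real plugin would carry VDDK arguments. */
enum req_type { REQ_OPEN, REQ_CLOSE, REQ_STOP };

struct request {
  enum req_type type;
  int handle;            /* out for REQ_OPEN, in for REQ_CLOSE */
  pthread_t served_by;   /* which thread executed the request */
  bool done;
  struct request *next;
};

static pthread_mutex_t q_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t q_nonempty = PTHREAD_COND_INITIALIZER;
static pthread_cond_t q_done = PTHREAD_COND_INITIALIZER;
static struct request *q_head, *q_tail;
static int next_handle = 1;

/* Background thread: the only place where open/close are executed. */
static void *worker(void *arg)
{
  (void) arg;
  for (;;) {
    pthread_mutex_lock(&q_lock);
    while (q_head == NULL)
      pthread_cond_wait(&q_nonempty, &q_lock);
    struct request *r = q_head;
    q_head = r->next;
    if (q_head == NULL)
      q_tail = NULL;
    pthread_mutex_unlock(&q_lock);

    /* Save the type: the caller may reuse r once done is set. */
    enum req_type type = r->type;
    if (type == REQ_OPEN)
      r->handle = next_handle++;   /* stand-in for the real open call */
    /* REQ_CLOSE: stand-in for the real close call, nothing to do here. */

    pthread_mutex_lock(&q_lock);
    r->served_by = pthread_self();
    r->done = true;
    pthread_cond_broadcast(&q_done);
    pthread_mutex_unlock(&q_lock);

    if (type == REQ_STOP)
      return NULL;
  }
}

/* Called from any thread: queue the request and wait for the reply. */
static void call_worker(struct request *r)
{
  r->done = false;
  r->next = NULL;
  pthread_mutex_lock(&q_lock);
  if (q_tail != NULL)
    q_tail->next = r;
  else
    q_head = r;
  q_tail = r;
  pthread_cond_signal(&q_nonempty);
  while (!r->done)
    pthread_cond_wait(&q_done, &q_lock);
  pthread_mutex_unlock(&q_lock);
}

/* Open and close one handle via the worker.  Returns 1 if both
 * requests were executed on the worker thread, 0 otherwise. */
static int demo_roundtrip(void)
{
  pthread_t w;
  if (pthread_create(&w, NULL, worker, NULL) != 0)
    return 0;
  struct request open_req = { .type = REQ_OPEN };
  call_worker(&open_req);
  struct request close_req = { .type = REQ_CLOSE, .handle = open_req.handle };
  call_worker(&close_req);
  struct request stop_req = { .type = REQ_STOP };
  call_worker(&stop_req);
  pthread_join(w, NULL);
  return open_req.handle > 0
    && pthread_equal(open_req.served_by, w)
    && pthread_equal(close_req.served_by, w)
    && !pthread_equal(open_req.served_by, pthread_self());
}
```

Even in this toy form the queue, the two condition variables and the
request lifetime rules are all extra moving parts, which is the
complexity referred to above.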
However I suspect we might be able to get away with just adding a
mutex around open/close. The open/close requests would happen on
different threads but would not overlap. This is contrary to the
rules above, but it could be sufficient. This is what I'm testing at
the moment.
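
For comparison, the mutex-only idea can be sketched as below. Again
this is an illustration under assumptions, not the nbdkit VDDK plugin:
fake_vddk_open and fake_vddk_close are hypothetical stubs that only
count how many callers are inside them at once, and locked_open,
locked_close and run_connections are made-up names. Each connection
thread opens and closes its own handle, and a single mutex ensures
that those calls never overlap even though they run on different
threads.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define MAX_CONNS 64

/* Hypothetical stand-in for a VDDK handle. */
struct fake_handle { int id; };

static pthread_mutex_t open_close_lock = PTHREAD_MUTEX_INITIALIZER;
static int in_open_close;       /* callers currently inside open/close */
static int max_in_open_close;   /* highest overlap ever observed */
static int next_id;

/* Stub open: records overlap instead of talking to a server. */
static struct fake_handle *fake_vddk_open(void)
{
  struct fake_handle *h = malloc(sizeof *h);
  if (h == NULL)
    return NULL;
  in_open_close++;
  if (in_open_close > max_in_open_close)
    max_in_open_close = in_open_close;
  h->id = next_id++;
  in_open_close--;
  return h;
}

/* Stub close: same overlap bookkeeping, then frees the handle. */
static void fake_vddk_close(struct fake_handle *h)
{
  in_open_close++;
  if (in_open_close > max_in_open_close)
    max_in_open_close = in_open_close;
  free(h);
  in_open_close--;
}

/* Open on the calling thread, but never concurrently with another
 * open or close. */
static struct fake_handle *locked_open(void)
{
  pthread_mutex_lock(&open_close_lock);
  struct fake_handle *h = fake_vddk_open();
  pthread_mutex_unlock(&open_close_lock);
  return h;
}

static void locked_close(struct fake_handle *h)
{
  pthread_mutex_lock(&open_close_lock);
  fake_vddk_close(h);
  pthread_mutex_unlock(&open_close_lock);
}

/* One connection: its own handle, opened and closed under the lock. */
static void *connection_thread(void *arg)
{
  (void) arg;
  struct fake_handle *h = locked_open();
  if (h != NULL)
    locked_close(h);
  return NULL;
}

/* Run up to n connection threads and return the largest number of
 * callers ever seen inside open/close at the same time. */
static int run_connections(int n)
{
  pthread_t threads[MAX_CONNS];
  int started = 0;
  if (n > MAX_CONNS)
    n = MAX_CONNS;
  for (int i = 0; i < n; i++) {
    if (pthread_create(&threads[started], NULL, connection_thread, NULL) != 0)
      break;
    started++;
  }
  for (int i = 0; i < started; i++)
    pthread_join(threads[i], NULL);
  return max_in_open_close;
}
```

This is far less code than a dedicated background thread, but note
what it does and does not guarantee: the calls are serialized, yet
they still happen on whichever thread owns the connection, so it does
not satisfy a rule that open/close must come from one thread.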
It is definitely *not* possible to move to PARALLEL, since nbdkit would
then make requests in parallel on the same VDDK handle, which is not
allowed. (I did try this to see if the document above was serious,
and it crashed in all kinds of strange ways, so I guess yes, they are
serious.)
Rich.
--
Richard Jones, Virtualization Group, Red Hat
http://people.redhat.com/~rjones
Read my programming and virtualization blog:
http://rwmj.wordpress.com
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW