This is based on top of:
https://github.com/nertpinx/v2v-conversion-host/commit/0bb2efdcacd975a2ca...
The first 4 patches are fairly uncontroversial miscellaneous cleanups.
Patch 5 is the interesting one. (Note it doesn't quite work yet, so
it's for discussion only.)
Patch 5 converts the inner loop to use asynchronous libnbd calls.
Performance improves quite a bit for me: about 13 minutes down to 5
minutes for an "initial" run on a moderate-sized Linux VM.
We do this by changing the read call from nbd.pread to nbd.aio_pread
and moving the writing code into a completion callback which runs when
the NBD_CMD_READ has been executed by nbdkit.
The problem with this patch, which is why I say it's for discussion
only, is that we need to change it to throttle the number of commands
in flight. Issuing large numbers of commands isn't in itself a
problem. However, each command has an associated NBD.Buffer, and
so the effect at the moment is that we need to allocate enough memory
up front to store the whole disk image(!). By throttling the commands
we can control exactly how much memory is used, and indeed control the
trade-off between memory and parallelism.
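
To make the throttling idea concrete, here is a minimal sketch. It is
not the patch itself and does not use libnbd: FakeReader is a
hypothetical stand-in that only mimics the shape of an asynchronous
read with a completion callback, and copy_throttled, chunk and
max_in_flight are names made up for illustration. The point is that
peak buffer memory is bounded by roughly chunk * max_in_flight rather
than by the size of the disk.

```python
from collections import deque


class FakeReader:
    """Hypothetical stand-in for an asynchronous NBD handle.

    This is NOT the libnbd API; it only imitates "issue a read now,
    get a callback later" so the throttling logic can be run anywhere.
    """

    def __init__(self, data):
        self.data = data
        self.pending = deque()   # issued but not yet completed reads
        self.in_flight = 0       # number of commands currently in flight
        self.max_seen = 0        # high-water mark, for inspection only

    def aio_pread(self, buf, offset, completion):
        # Queue the read; nothing is copied until poll() is called.
        self.pending.append((buf, offset, completion))
        self.in_flight += 1
        self.max_seen = max(self.max_seen, self.in_flight)

    def poll(self):
        # Complete the oldest queued read and run its callback.
        buf, offset, completion = self.pending.popleft()
        buf[:] = self.data[offset:offset + len(buf)]
        self.in_flight -= 1
        completion(buf, offset)


def copy_throttled(src, dst, size, chunk=4, max_in_flight=2):
    """Copy `size` bytes from src to dst with a bounded number of reads in flight."""

    def write_done(buf, offset):
        # The "writing code" lives in the completion callback.
        dst[offset:offset + len(buf)] = buf

    offset = 0
    while offset < size or src.in_flight > 0:
        # Issue new reads only while we are under the limit.
        while offset < size and src.in_flight < max_in_flight:
            length = min(chunk, size - offset)
            src.aio_pread(bytearray(length), offset, write_done)
            offset += length
        # Let one command finish, which frees a slot (and its buffer).
        src.poll()
```

Raising max_in_flight buys more parallelism at the cost of more
buffers alive at once; lowering it does the opposite.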
I checked the MD5 of the disk before and after, and it was unchanged
(my VM is turned off).
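
For anyone wanting to repeat a check of this kind, a sketch in Python
follows. The helper name md5_of_file and the chunk size are
assumptions for illustration; running md5sum on the image would do
the same job. It reads in blocks so a large disk image never has to
fit in memory.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns b"" at EOF.
        for block in iter(lambda: f.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()
```

Comparing the value returned before and after a run shows whether the
image contents changed.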
Rich.