On Wed, Nov 25, 2020 at 12:31 PM Richard W.M. Jones <rjones(a)redhat.com> wrote:
For a long time I've wanted to split up virt-v2v into smaller
components to make it easier to consume. It's never been clear how to
do this, but I think I have a workable plan now, described in this email.
----------------------------------------------------------------------
First, the AIMS, which are:
(a) Preserve current functionality, including copying conversion,
in-place conversion, and the virt-v2v command line.
(b) Allow warm migration to use virt-v2v without requiring the
"--debug-overlays hack".
(c) Allow threads, multi-conn, and parallel copying of guest disks, all
for better copying performance.
(d) Allow an alternate supervisor to convert and copy many guests in
parallel, given that the supervisor has a global view of the
system/network (I'm not intending to implement this, only to make
it possible).
(e) Better progress bars.
(f) Better logging.
(g) Reuse as much existing code as possible. This is NOT a rewrite!
Sounds good
----------------------------------------------------------------------
Here's my PLAN:
/usr/bin/virt-v2v still exists, but it's now a supervisor program
(possibly even a shell script) that runs the steps below:
(1) Set up the input side by running "helper-v2v-input-<type>". For
all input types this creates a temporary directory containing:
/tmp/XXXXXX/in1 NBD endpoints overlaying the source disk(s)
/tmp/XXXXXX/in2 (these are actually Unix domain sockets)
/tmp/XXXXXX/in3
/tmp/XXXXXX/metadata.in Metadata parsed from the source.
Currently for most inputs we have a running nbdkit process for
each source disk, and we'd do the same here, except we add
nbdkit-cow-filter on top so that the source disk is protected from
being modified. Another small difference is that for -i disk
(local input) we would need an active nbdkit process on top of the
disk, whereas currently we set the disk as a qcow2 backing file.
(2) Perform the conversion by running "helper-v2v-convert". This does
the conversion and sparsification. It writes directly to the NBD
endpoints (in*) above. The writes are stored in the COW overlay
so the source disk is not modified.
Conversion will also create an output metadata file:
/tmp/XXXXXX/metadata.out Target metadata
Exact format of the metadata files is to be decided, but some kind
of not-quite-libvirt-XML may be suitable. It's also not clear if
the metadata format is an internal detail of virt-v2v, or if we
document it as a stable API.
(3) Set up the output side by running "helper-v2v-output-<type>
setup". This will read the output metadata and do whatever is
needed to set up the empty output disks (perhaps by creating a
guest on the target, but also this could be done in step (5)
below).
This will create:
/tmp/XXXXXX/out1 NBD endpoints overlaying the target disk(s)
/tmp/XXXXXX/out2 (these are actually Unix domain sockets)
/tmp/XXXXXX/out3
For oVirt, this step can create the image transfer.
(4) Do the copy. By default this will run either nbdcopy or
qemu-img convert from in* -> out*.
For oVirt, if we have a qcow2 image with a backing file connected to nbdkit,
which is in turn connected to vddk, we can upload the disk using the imageio
client, which already supports multiple connections. For a local disk
(-i disk) it is about 2 times faster than qemu-img convert + nbdkit +
rhv-upload-plugin.
I think a generic implementation using nbdcopy or "qemu-img convert" is
a good idea, but a specific output should be able to control the copy step.
Copying could be done in parallel; currently it is done serially.
(5) Finalize the output by running "helper-v2v-output-<type> final".
This might create the target guest and whatever else is needed.
For oVirt we would finalize the transfer and create the VM here.
(6) Kill the NBD servers and clean up the temporary directory.
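To make the sequence above concrete, here is a minimal sketch of how a
supervisor might lay out steps (1) to (5) as command lines. It only builds
the commands and does not run them. The helper names come from the plan,
but the helpers do not exist yet, so every argument passed to them here is
an assumption, as are the type names "disk" and "local" and the function
name plan_steps. The nbdcopy invocation assumes it is given NBD URIs
pointing at the Unix domain sockets. Step (6), killing the NBD servers and
removing the directory, is left to the caller.

```python
import shlex


def plan_steps(input_type, output_type, workdir, ndisks):
    """Return the supervisor's steps as a list of (label, argv) pairs.

    The helper arguments are hypothetical; the plan does not define them.
    """
    steps = [
        # (1) input side: creates in1..inN and metadata.in in workdir
        ("input", [f"helper-v2v-input-{input_type}", workdir]),
        # (2) conversion: writes to in*, creates metadata.out
        ("convert", ["helper-v2v-convert", workdir]),
        # (3) output side: creates out1..outN in workdir
        ("output-setup", [f"helper-v2v-output-{output_type}", "setup", workdir]),
    ]
    # (4) one copy per disk, from the in* socket to the out* socket
    for i in range(1, ndisks + 1):
        src = f"nbd+unix:///?socket={workdir}/in{i}"
        dst = f"nbd+unix:///?socket={workdir}/out{i}"
        steps.append((f"copy-disk{i}", ["nbdcopy", src, dst]))
    # (5) finalize the output
    steps.append(
        ("output-final", [f"helper-v2v-output-{output_type}", "final", workdir])
    )
    return steps


if __name__ == "__main__":
    for label, argv in plan_steps("disk", "local", "/tmp/XXXXXX", 3):
        print(f"{label}: {shlex.join(argv)}")
```

Keeping the plan as plain data means an alternate supervisor could reorder
or parallelize the copy entries without touching the helpers.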
----------------------------------------------------------------------
Let's see how this plan matches the aims.
Aim (a):
Copying conversion works as outlined above. In-place conversion
works by placing an NBD server on top of the files you want to
convert and running helper-v2v-convert (virt-v2v --in-place would
also still work for backwards compat).
Aim (b):
Warm migration: Should be fairly clear this can work in the same way
as in-place conversion, but I'll discuss this further with Martin K
and Tomas to make sure I'm not missing anything.
Aims (c), (d):
Threads etc for performance: Although I don't plan to implement
this, it's clear that an alternate supervisor program could improve
performance here, either by copying multiple disks of a single guest
in parallel or, even better, by having a global view of the system
and copying multiple guests' disks in parallel.
This is outside the scope of the virt-v2v project, but in scope for
something like MTV.
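As an illustration of the parallel copy discussed here, the following is a
rough sketch of a supervisor copying several disks at once with a thread
pool. The names copy_all and nbdcopy_disk are made up for this example, and
the default worker count is arbitrary. The copy function is pluggable, so a
specific output could substitute its own copy step instead of nbdcopy.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def nbdcopy_disk(pair):
    """Copy one disk by running nbdcopy on a (source, destination) pair."""
    src, dst = pair
    subprocess.run(["nbdcopy", src, dst], check=True)
    return dst


def copy_all(pairs, copy_one=nbdcopy_disk, max_workers=4):
    """Copy every (source, destination) pair, up to max_workers at a time.

    Results come back in the same order as pairs. If any copy raises,
    the first exception (in input order) is re-raised here.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(copy_one, pairs))
```

Threads are enough here because each worker just waits on an external
process; a supervisor with a global view could pass in pairs belonging to
several guests and tune max_workers to the network.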
Aim (e):
Better progress bars: nbdcopy should have support for
machine-readable progress bars, once I push the changes. It will
mean no more need to parse debug logs.
I don't think this is only about the progress bar, but about the entire
process. The tool should have an option to generate machine-readable output,
for example for a GUI showing what the tool is doing. Typically such an
application needs to get events when the state changes or when progress is
made.
It can be a simple stream of lines:
event: Setting up ...
event: Converting disk ...
event: Copying disk ...
progress: 0
progress: 1
...
progress: 100
event: Creating vm ...
The program consuming this can change the text on each "event:" line and
update the progress bar on each "progress:" line.
This can be enhanced by providing additional info about the process. For
example, in RHV it would be helpful to have the transfer id, disk id, and
vm id when trying to debug failures. Currently we have to keep debug logs
and grep them to get the values, but it would be more useful if the details
were reported by the tool:
info: disk_id=xxxyyy transfer_id=yyyzzz
If the process failed the tool can report:
error: Why it failed...
This is something that every tool doing complex/slow stuff needs to
implement, like qemu-img convert or dd.
Something like https://testanything.org/ but for watching processes: a
"watch anything protocol"?
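A consumer of the line stream suggested above could be as small as the
following sketch. It assumes the four prefixes from the examples ("event",
"progress", "info", "error"), a ": " separator, and space-separated
key=value pairs on "info:" lines. None of this is a defined format, and
parse_line is a name invented for the example.

```python
KNOWN_KINDS = ("event", "progress", "info", "error")


def parse_line(line):
    """Split one line of the stream into a (kind, payload) tuple.

    Lines that do not match the assumed format are returned as
    ("unknown", text) so the consumer can ignore or log them.
    """
    text = line.rstrip("\n")
    kind, sep, rest = text.partition(": ")
    if not sep or kind not in KNOWN_KINDS:
        return ("unknown", text)
    if kind == "progress":
        try:
            return (kind, int(rest))
        except ValueError:
            return ("unknown", text)
    if kind == "info":
        # Keep only well-formed key=value items.
        pairs = (item.split("=", 1) for item in rest.split() if "=" in item)
        return (kind, dict(pairs))
    return (kind, rest)
```

A GUI would then change its status text on "event", move the progress bar
on "progress", remember the ids from "info", and show the message on
"error".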
Aim (f):
Better logging: I hope we can log each step separately.
A custom supervisor program would also be able to tell which
particular step failed (eg. did it fail in conversion? did it fail
copying a disk and which one?)
For logging it would be useful to support logging to a file, so that
calling tools do not have to read the logs and write them to a file
themselves.
This is true for all virt-* tools.
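Until the tools can log to a file themselves, a supervisor could get
per-step logs and know which step failed with something like the sketch
below. The names run_step and StepFailed, and the one-file-per-step layout,
are assumptions made for the example and not part of the plan.

```python
import subprocess
from pathlib import Path


class StepFailed(Exception):
    """Raised when a step exits non-zero; records which step and its log."""

    def __init__(self, step, returncode, logfile):
        super().__init__(
            f"step {step!r} failed with exit code {returncode}, see {logfile}"
        )
        self.step = step
        self.returncode = returncode
        self.logfile = logfile


def run_step(step, argv, logdir):
    """Run one step, sending its stdout and stderr to <logdir>/<step>.log."""
    logfile = Path(logdir) / f"{step}.log"
    with open(logfile, "wb") as log:
        proc = subprocess.run(argv, stdout=log, stderr=subprocess.STDOUT)
    if proc.returncode != 0:
        raise StepFailed(step, proc.returncode, logfile)
    return logfile
```

With one log per step, "did it fail copying a disk and which one?" is
answered by the step name carried in the exception.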
Aim (g):
This works by splitting up the existing v2v code base into separate
binaries. It is already broadly structured (internally) like this.
So it's not a rewrite, it's a big refactoring.
However I'd probably write a new virt-v2v supervisor binary, because
the existing command line parsing code is extremely complex.
Nir