libguestfs recently added support for virtio-scsi and libvirt, and
when these are both available this lets us relatively easily add
hotplugging of drives. This email is about how we would present that
through the libguestfs API.
(a) The current API
Currently you have to add drive(s) via guestfs_add_drive* and then
call guestfs_launch, i.e. your program must look something like this:
  guestfs_h *g = guestfs_create ();
  guestfs_add_drive (g, "/tmp/disk1.img");
  guestfs_add_drive (g, "/tmp/disk2.img");
  guestfs_launch (g);
After guestfs_launch (i.e. once the qemu backend is running) you are
not allowed to add more drives.
The API specifies that you refer to drives in other commands in one of
two ways. Either you're allowed to use names like "/dev/sda",
"/dev/sdb" etc to refer to the first, second etc drive you added, in
the same order that you added them. Or you can call
guestfs_list_devices which returns a list of device names, opaque
strings that you pass to other functions.
In the first case (using "/dev/sdX" names), some magic already happens
translating these to the real names underneath, but currently that
magic is just "/dev/sdX" -> "/dev/vdX" for the virtio case.
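To make the existing translation concrete, here is a minimal sketch of
the "/dev/sdX" -> "/dev/vdX" rewrite. The function name
translate_device_name is made up for illustration and this is not the
actual libguestfs code; it only shows the prefix substitution described
above.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not the libguestfs implementation: map a
 * "/dev/sdX..." name to the "/dev/vdX..." name used in the virtio
 * case.  Names that do not start with "/dev/sd" are copied through
 * unchanged.  Returns 0 on success, -1 if the buffer is too small. */
static int
translate_device_name (const char *name, char *out, size_t outlen)
{
  const char *prefix = "/dev/sd";
  size_t plen = strlen (prefix);
  int r;

  if (strncmp (name, prefix, plen) == 0)
    r = snprintf (out, outlen, "/dev/vd%s", name + plen);
  else
    r = snprintf (out, outlen, "%s", name);

  return (r < 0 || (size_t) r >= outlen) ? -1 : 0;
}
```

Partition suffixes are carried over as-is, so "/dev/sdb1" would become
"/dev/vdb1" under this sketch.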
Note that we cannot change this API or break existing programs.
(b) The hidden appliance drive
The libguestfs appliance has to have its own root drive. Currently
this is added after the user-added drives. For example, if the user
adds two drives, then the appliance will appear as /dev/sdc (or
/dev/vdc or whatever). Some magic in the bootloader causes the last
drive to be used as the root filesystem.
This hidden drive doesn't appear in the API -- for example it is
suppressed when we generate the result of guestfs_list_devices.
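As an illustration of the suppression step, the following sketch
filters the appliance device out of a NULL-terminated device list. The
name hide_appliance is hypothetical and the real daemon code may well
work differently; the point is only that the caller never sees the
appliance entry.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: remove the appliance device from a
 * NULL-terminated list of device names, in place, keeping the order
 * of the remaining entries.  Returns the number of devices left. */
static size_t
hide_appliance (const char **devices, const char *appliance)
{
  size_t i, j = 0;

  for (i = 0; devices[i] != NULL; ++i) {
    if (strcmp (devices[i], appliance) != 0)
      devices[j++] = devices[i];
  }
  devices[j] = NULL;
  return j;
}
```

With two user drives and the appliance at /dev/sdc, the list handed
back to the caller would contain only /dev/sda and /dev/sdb.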
(c) /dev/null drives
It's always been possible to add "/dev/null" as a drive via
guestfs_add_drive*. This is mainly useful for testing, or if you want
to access just those parts of the API which don't require a disk image
(for various reasons we force you to add one drive, so if you don't
have any drive to add, you can use "/dev/null"). Current libguestfs
treats "/dev/null" as a magic string and (because of bugs in qemu)
substitutes a non-zero sized temporary file.
(d) Maximum number of drives
With virtio-scsi, this maximum is pretty large -- currently 255 (256
targets less the hidden appliance), but if we used LUNs or even
multiple controllers then it'd be almost unlimited. We actually test
up to 255, and virt-df will use as many slots as it can.
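With this many drives the names run well past /dev/sdz (the kernel
continues with /dev/sdaa, /dev/sdab and so on). A rough sketch of the
index-to-name mapping is below; drive_name is a name chosen for this
example and is not meant to describe the internal libguestfs function.

```c
#include <stddef.h>
#include <stdio.h>

/* Illustrative helper: convert a 0-based drive index into a
 * "/dev/sdX" style name (0 -> sda, 25 -> sdz, 26 -> sdaa, ...).
 * Assumes index < SIZE_MAX.  Returns 0 on success, -1 if the
 * output buffer is too small. */
static int
drive_name (size_t index, char *out, size_t outlen)
{
  char letters[16];
  size_t n = 0, pos;
  size_t k = index + 1;
  int r;

  /* Bijective base-26: collect letters, least significant first. */
  while (k > 0 && n < sizeof letters) {
    k--;
    letters[n++] = (char) ('a' + k % 26);
    k /= 26;
  }

  r = snprintf (out, outlen, "/dev/sd");
  if (r < 0 || (size_t) r >= outlen)
    return -1;
  pos = (size_t) r;
  if (pos + n >= outlen)
    return -1;

  /* Append the letters in most-significant-first order. */
  while (n > 0)
    out[pos++] = letters[--n];
  out[pos] = '\0';
  return 0;
}
```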
(e) For libguestfs you can assume the latest of everything: qemu,
guest kernel, host kernel, libvirt, tools. Any suggestions based on
very new features are fine, even proposed features provided there's a
working implementation which is likely to go upstream.
- - - -
Here are some ideas about how we might add hotplugging without
breaking existing clients.
(1) The "raw libvirt" option
In this one we'd simply provide thin wrappers around
virDomainAttachDevice and virDomainDetachDevice, and leave it up to
the user to know what they're doing.
The problem with this is the hidden appliance disk. We certainly
don't want the user to accidentally detach that(!) It's also
undesirable for there to be a "hole" in the naming scheme so that
you'd have:
  /dev/sda   <- your normal drives
  /dev/sdb   <-
  [/dev/sdc  # sorry, you can't use this, we won't tell you why]
  /dev/sdd   <- your first hotplugged device
As far as I know, the kernel assigns /dev/sdX names on a first-free
basis, so there's no way to permanently put the appliance at
/dev/sdzzz (if there is, please let me know!)
(2) The "slots" option
In this option you'd have to use null devices to reserve the maximum
number of drive slots that you're going to use in the libguestfs
handle before launch. Then after launching you'd be allowed to
hotplug only those slots.
So for example:
  guestfs_add_drive (g, "/dev/null");  /* reserves /dev/sda */
  guestfs_add_drive (g, "/dev/null");  /* reserves /dev/sdb */
  guestfs_add_drive (g, "/dev/null");  /* reserves /dev/sdc */
  guestfs_launch (g);
  guestfs_hotplug (g, 1, "/tmp/foo");  /* replaces index 1 == /dev/sdb */
  guestfs_hotplug (g, 3, "/tmp/foo");  /* error! */
Although ugly, in some ways this is quite attractive. It maps easily
into guestfish scripts. You have contiguous device naming. You often
know how many drives you'll need in advance, and if you don't then you
can reserve up to max_disks-1.
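The bookkeeping this option implies is small. Below is a toy model of
it, independent of libguestfs: struct slots and the slots_* functions
are invented for this sketch, and the MAX_SLOTS value of 255 is taken
from the virtio-scsi limit mentioned earlier. It only demonstrates the
rule "reserve before launch, hotplug into reserved slots after".

```c
#include <stddef.h>

#define MAX_SLOTS 255   /* assumed limit, from the virtio-scsi case */

/* Toy model of a handle that tracks reserved drive slots. */
struct slots {
  const char *path[MAX_SLOTS];
  size_t nr_reserved;
  int launched;
};

/* Reserve the next slot; only allowed before launch. */
static int
slots_add_drive (struct slots *s, const char *path)
{
  if (s->launched || s->nr_reserved >= MAX_SLOTS)
    return -1;
  s->path[s->nr_reserved++] = path;
  return 0;
}

static void
slots_launch (struct slots *s)
{
  s->launched = 1;
}

/* Replace the drive in an already-reserved slot; only after launch. */
static int
slots_hotplug (struct slots *s, size_t index, const char *path)
{
  if (!s->launched || index >= s->nr_reserved)
    return -1;
  s->path[index] = path;
  return 0;
}
```

In this model, hotplugging into index 3 after reserving three slots
fails for the same reason as in the example above: only indexes 0 to 2
were reserved.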
(3) The "serial numbers" option
This was Dan's suggestion. Hotplugged drives are known only by their
serial number, i.e. we hotplug them via libvirt using the <serial/>
field, and then they are accessed using /dev/disk/by-id/serial.
This is tempting, but unfortunately it doesn't quite work in stock
udev, because the actual name used is:
/dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_SERIAL
We could add a custom udev rule to get the path we wanted.
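Without a custom rule, the library could instead build the stock udev
path itself. A small sketch, assuming that SERIAL in the path above is
simply replaced by the serial number we passed to libvirt (the helper
name serial_to_by_id is made up for this example):

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: build the stock udev by-id path for a
 * hotplugged drive from its serial number.  Assumes the serial is
 * appended verbatim after "scsi-0QEMU_QEMU_HARDDISK_".
 * Returns 0 on success, -1 if the buffer is too small. */
static int
serial_to_by_id (const char *serial, char *out, size_t outlen)
{
  int r = snprintf (out, outlen,
                    "/dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_%s",
                    serial);

  return (r < 0 || (size_t) r >= outlen) ? -1 : 0;
}
```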
(4) The "rewriting device names" option
Since we already have the infrastructure to rewrite device names, we
could do some complicated and hairy device name rewriting to make
names appear contiguous, even though there's a hidden appliance
drive.
This is my least favourite option, mainly because of the complexity,
and complexity is bound to lead to bugs.
(5) Your idea here ...
As usual, comments and suggestions welcome.
Rich.
--
Richard Jones, Virtualization Group, Red Hat
http://people.redhat.com/~rjones