Plans for changing libguestfs appliance building
by Richard W.M. Jones
Well you can all read the thread here for the technical details:
http://lists.gnu.org/archive/html/qemu-devel/2010-08/threads.html#00133
but the bottom line for anyone who wasn't in that discussion or on IRC
last week is that we have to change the way that the appliance is
built in libguestfs so that we don't depend (as much) on the qemu
-initrd option. The bad thing is this is loads of make-work at a
particularly inconvenient time. The good thing is that it should make
libguestfs 'boot' in under 5 seconds, even if your hardware is old and
slow. It will also reduce the memory and I/O requirements for using
libguestfs.
Because this issue is adversely affecting people using kernel 2.6.35
and/or the latest qemu right now (and will affect many people when
Fedora 14 is released which has both), this is my top priority to fix
in the coming week.
This email documents how I will change appliance building. The
patches will follow some time later in the week when I've written and
tested them.
The new boot method will be this sequence:
(a) (Same as now) A suitable kernel to use is located on the host, or
from $libdir/guestfs in the non-supermin case.
(b) A tiny initrd is built (on the fly for supermin). This will
contain just these files:
    - any CD-ROM, IDE, virtio kmods required to read the virtual CD device
    - ext2.ko if required
    - /sbin/modprobe
    - a tiny init script, written in C
(c) An ext2 image is created for the root filesystem. I will say a
lot more about how this is created below.
(d) qemu is invoked with something along these lines:
    qemu -kernel kernel -initrd initrd -drive file=isofile,media=cdrom
where:
    kernel  = kernel found in step (a)
    initrd  = tiny initrd created in step (b)
    isofile = ext2 filesystem image created in step (c)
(e) The boot process proceeds by starting the kernel, which runs the
/init script from the tiny initrd. That script mounts the filesystem
from the CD-ROM device, pivot_root(2)s into it, and runs another /init
script from that filesystem. At this point, boot continues as it does
in the current libguestfs.
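To make step (e) concrete, here is a minimal sketch of what the tiny
init might do once the kernel modules are loaded. It is not the actual
implementation. The function name boot_from_device, the read-only
mount and any device or mountpoint you pass in are assumptions made for
illustration. Note also that the sketch uses chroot(2) rather than the
pivot_root(2) mentioned above, because as far as I know pivot_root is
not usable while the current root is the initramfs itself.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mount.h>
#include <unistd.h>

/* Hypothetical helper: mount the ext2 root image found on `device` at
 * `mountpoint`, make it the root directory and exec its /init.
 * Returns -1 on failure; on success it does not return.
 */
static int
boot_from_device (const char *device, const char *mountpoint)
{
  assert (device != NULL && mountpoint != NULL);

  /* Assumption: the root image is mounted read-only. */
  if (mount (device, mountpoint, "ext2", MS_RDONLY, NULL) == -1) {
    fprintf (stderr, "mount %s on %s: %s\n",
             device, mountpoint, strerror (errno));
    return -1;
  }

  /* Enter the new root.  chroot is used here in place of pivot_root. */
  if (chdir (mountpoint) == -1 || chroot (".") == -1 || chdir ("/") == -1) {
    fprintf (stderr, "entering %s: %s\n", mountpoint, strerror (errno));
    return -1;
  }

  /* Hand over to the second /init, the one inside the appliance. */
  execl ("/init", "init", (char *) NULL);
  fprintf (stderr, "exec /init: %s\n", strerror (errno));
  return -1;
}
```

A real init would call something like boot_from_device ("/dev/sr0",
"/root") from main; both of those paths are guesses, since the device
name depends on which drivers end up in the initrd.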
Since this involves attaching an extra device to the appliance, we
also need to change the daemon to ignore this extra device, adding
somewhat to the complexity of several operations.
We believe this should greatly improve boot speed. The obvious saving
comes from no longer using the (now broken) qemu -initrd support with
a large initrd, but that is purely down to a qemu regression. We also
save because we don't need to unpack the initrd inside the appliance,
and because the root filesystem is read from the device on demand. I
also intend to make some unrelated changes which will improve boot
speeds. We have every reason to believe that 5 seconds will be
achievable, even on relatively old hardware.
Step (c) above involves creating an ext2 filesystem for root. I chose
ext2 because it is considerably less complex than ISO9660, and is the
native Linux filesystem so it supports all Linux features (long
filenames, extended attributes and so on).
Creating an ext2 filesystem is more complex than the initrd that we
currently create using some hand-coded cpio-like code. Of course we
can't use libguestfs to help us! We plan to write some C code to
assemble the ext2 filesystem from scratch (or starting from a mke2fs
blank filesystem).
To further improve boot times, we intend to cache this image, so it is
created the first time it is used and rebuilt only when it needs to be
updated. Caching this means that most times that you use libguestfs
or other tools, no appliance creation will be required at all, and
only the bits of the appliance that you use will be loaded.
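One possible way to decide whether the cached image can be reused is
sketched below. The message does not say how staleness will be
detected, so comparing modification times against a list of input files
is purely an assumption, as is the function name
appliance_needs_rebuild.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>

/* Hypothetical check: return 1 if the cached image must be (re)built,
 * 0 if it can be reused.  `inputs` lists the files the image was built
 * from (assumption: mtime is a good enough staleness test).
 */
static int
appliance_needs_rebuild (const char *cache,
                         const char *const *inputs, size_t n)
{
  struct stat cache_st, in_st;
  size_t i;

  assert (cache != NULL);

  /* No cached image yet: this is the first use. */
  if (stat (cache, &cache_st) == -1)
    return 1;

  for (i = 0; i < n; i++) {
    /* An input we cannot stat: be conservative and rebuild. */
    if (stat (inputs[i], &in_st) == -1)
      return 1;
    /* An input newer than the cache: the cache is stale. */
    if (in_st.st_mtime > cache_st.st_mtime)
      return 1;
  }

  return 0;
}
```

With a check of this kind, the common case costs only a handful of
stat(2) calls before qemu is launched.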
If you have any further questions about this, please follow up on this
mailing list.
Rich.
--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-top is 'top' for virtual machines. Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://et.redhat.com/~rjones/virt-top
14 years, 2 months
[PATCH 1/2] Allow absolute paths in virt-v2v.conf
by Matthew Booth
This patch allows paths in virt-v2v.conf to be either relative or absolute. If
relative, they are relative to software-root.
This allows virt-v2v.conf to use files provided by packages independent of
virt-v2v.
---
lib/Sys/VirtV2V/Config.pm | 10 ++++++----
1 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/lib/Sys/VirtV2V/Config.pm b/lib/Sys/VirtV2V/Config.pm
index f703152..121e774 100644
--- a/lib/Sys/VirtV2V/Config.pm
+++ b/lib/Sys/VirtV2V/Config.pm
@@ -118,14 +118,16 @@ sub get_transfer_iso
foreach my $path ($dom->findnodes('/virt-v2v/app/path/text()')) {
$path = $path->getData();
- # Get the absolute path if iso-root was defined
my $abs;
- if (defined($root)) {
- $abs = File::Spec->catfile($root, $path);
- } else {
+ if (File::Spec->file_name_is_absolute($path) || !defined($root)) {
$abs = $path;
}
+ # Make relative paths relative to iso-root if it was defined
+ else {
+ $abs = File::Spec->catfile($root, $path);
+ }
+
if (-r $abs) {
$path_args{"$path=$abs"} = 1;
$paths{$abs} = 1;
--
1.7.2.1
14 years, 2 months
[PATCH] Install VirtIO storage and network drivers in Windows
by Matthew Booth
Currently when converting a Windows guest we do a minimum installation of the
viostor driver, configure the RHEV guest agent and leave RHEV to properly
install viostor and all remaining drivers. This works well if RHEV is properly
configured and the installation is not interrupted on first boot.
However, if the target of the conversion is not RHEV, RHEV is not properly
configured, or the first boot installation process is interrupted, for
example by the user logging in and interacting with it, this will fail. In this
case, in the absence of a correct driver Windows can mis-detect the VirtIO 'SCSI
Controller' and configure the wrong driver for it. This will lead to the guest
subsequently failing to boot.
This patch complements the RHEV-managed process by additionally copying
installable versions of the VirtIO storage and network drivers to the guest
during conversion, and adding the location of the drivers to the default search
path for drivers. This means that Windows will install correct drivers for
network and storage if the RHEV process fails, or if the conversion target is
not RHEV.
---
lib/Sys/VirtV2V/Converter/Windows.pm | 77 ++++++++++++++++++++++++++++++++++
1 files changed, 77 insertions(+), 0 deletions(-)
diff --git a/lib/Sys/VirtV2V/Converter/Windows.pm b/lib/Sys/VirtV2V/Converter/Windows.pm
index 1d4c526..f5bf399 100644
--- a/lib/Sys/VirtV2V/Converter/Windows.pm
+++ b/lib/Sys/VirtV2V/Converter/Windows.pm
@@ -23,6 +23,7 @@ use warnings;
use Carp qw(carp);
use File::Temp qw(tempdir);
use Data::Dumper;
+use Encode qw(encode decode);
use IO::String;
use XML::DOM;
use XML::DOM::XPath;
@@ -187,6 +188,7 @@ sub _preconvert
_upload_files ($g, $tmpdir, $desc, $devices, $config);
_add_viostor_to_registry ($g, $tmpdir, $desc, $devices, $config);
_add_service_to_registry ($g, $tmpdir, $desc, $devices, $config);
+ _prepare_virtio_drivers ($g, $tmpdir, $desc, $devices, $config);
}
# See http://rwmj.wordpress.com/2010/04/30/tip-install-a-device-driver-in-a-win...
@@ -345,6 +347,81 @@ sub _add_service_to_registry
$g->upload ($tmpdir . "/system", $system_filename);
}
+# We copy the VirtIO drivers to a directory on the guest and add this directory
+# to HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\DevicePath so that Windows
+# will search it when automatically installing drivers.
+sub _prepare_virtio_drivers
+{
+ my $g = shift;
+ my $tmpdir = shift;
+ my $desc = shift;
+ my $devices = shift;
+ my $config = shift;
+
+ # Copy the target VirtIO drivers to the guest
+ my $driverdir = File::Spec->catdir($g->case_sensitive_path("/windows"), "Drivers/VirtIO");
+
+ $g->mkdir_p($driverdir);
+
+ my ($virtio) = $config->match_app ($desc, 'virtio', $desc->{arch});
+ $virtio = $config->get_transfer_path($g, $virtio);
+
+ foreach my $name ($g->ls($virtio)) {
+ # Build the full source and destination paths for each driver file
+ my $src = File::Spec->catfile($virtio, $name);
+ my $dst = File::Spec->catfile($driverdir, $name);
+ $g->cp_a($src, $dst);
+ }
+
+ # Locate and download the SOFTWARE hive
+ my $sw_local = File::Spec->catfile($tmpdir, 'software');
+ my $sw_guest = $g->case_sensitive_path('/windows/system32/config/software');
+
+ $g->download($sw_guest, $sw_local);
+
+ # Open the registry hive.
+ my $h = Win::Hivex->open($sw_local, write => 1)
+ or die "open hive $sw_local: $!";
+
+ # Find the node \Microsoft\Windows\CurrentVersion
+ my $node = $h->root();
+ foreach ('Microsoft', 'Windows', 'CurrentVersion') {
+ $node = $h->node_get_child($node, $_);
+ }
+
+ # Update DevicePath, but leave everything else as is
+ my @new;
+ my $append = ';%SystemRoot%\Drivers\VirtIO';
+ foreach my $v ($h->node_values($node)) {
+ my $key = $h->value_key($v);
+ my ($type, $data) = $h->value_value($v);
+
+ # Decode the string from utf16le to perl native
+ my $value = decode('UTF-16LE', $data);
+
+ # Append the driver location if it's not there already
+ if ($key eq 'DevicePath' && index($value, $append) == -1) {
+ # Remove the explicit trailing NULL
+ chop($value);
+
+ # Append the new path and a new explicit trailing NULL
+ $value .= $append."\0";
+
+ # Re-encode the string back to utf16le
+ $data = encode('UTF-16LE', $value);
+ }
+
+ push (@new, { key => $key, t => $type, value => $data });
+ }
+ $h->node_set_values($node, \@new);
+
+ $h->commit(undef);
+ undef $h;
+
+ # Upload the new registry.
+ $g->upload($sw_local, $sw_guest);
+}
+
sub _upload_files
{
my $g = shift;
--
1.7.2.1
14 years, 2 months
[PATCH] Add debug output to hivex_close
by Matthew Booth
---
lib/hivex.c | 3 +++
1 files changed, 3 insertions(+), 0 deletions(-)
diff --git a/lib/hivex.c b/lib/hivex.c
index 13d7556..8a774de 100644
--- a/lib/hivex.c
+++ b/lib/hivex.c
@@ -543,6 +543,9 @@ hivex_close (hive_h *h)
+ if (h->msglvl >= 1)
+ fprintf (stderr, "hivex_close\n");
+
free (h->filename);
free (h);
return r;
}
--
1.7.2.1
14 years, 2 months
Question about using guestfish --ro as a backup solution
by Richard W.M. Jones
> <monolive> what is the risk if I backup a running VM ? using
> guestfish --ro ? I understand that my open files might be funny but
> the issue should be solve by a fsck ? it won't work for an open DB
It's a bit different from this.
What happens is that the libguestfs appliance / kernel attaches to the
disk, which is in a potentially unclean *and* potentially changing
state. When libguestfs does the mount, the journal recovery is
performed (against a throw-away snapshot of the original disk -- if
you use the --ro option, the original disk is not written to at all).
However the disk is still changing and the libguestfs kernel could
interpret this in all sorts of ways: eg. panicking or silently reading
corrupt data.
The news is: DON'T use guestfish --ro as your backup solution, UNLESS:
(1) you don't care about the integrity of your backups, or
(2) you can prove that nothing is writing to the disk (eg. the VM is
switched off), or
(3) you take a snapshot of the disk first (eg. LVM snapshot), which is
really just a special case of (2), or
(4) the filesystem that you are backing up is frozen[a], which is also
a special case of (2).
Run a backup daemon inside the guest instead. There are plenty of
network backup programs around. Choose your favorite one and install
it in your VM.
Rich.
[a] http://lwn.net/Articles/287435/
--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine. Supports Linux and Windows.
http://et.redhat.com/~rjones/virt-df/
14 years, 2 months
Question about blkdev-setbsz
by Richard W.M. Jones
> <jzheng> Do you have any idea why blockdev-setbsz succeeds, but does
> not affect output of blockdev-getbsz?
I can reproduce the same issue too:
><fs> blockdev-getbsz /dev/vda
send_to_daemon: 0xf62050 g->state = 3, n = 40
recv_from_daemon: 0xf62050 g->state = 3, size_rtn = 0x7ffffbb4f00c, buf_rtn = 0x7ffffbb4f000
blockdev --getbsz /dev/vda
proc 60 (blockdev_getbsz) took 0.01 seconds
4096
><fs> blockdev-setbsz /dev/vda 2048
send_to_daemon: 0xf62050 g->state = 3, n = 44
recv_from_daemon: 0xf62050 g->state = 3, size_rtn = 0x7ffffbb4f01c, buf_rtn = 0x7ffffbb4f010
blockdev --setbsz 2048 /dev/vda
proc 61 (blockdev_setbsz) took 0.01 seconds
><fs> blockdev-getbsz /dev/vda
send_to_daemon: 0xf62050 g->state = 3, n = 40
recv_from_daemon: 0xf62050 g->state = 3, size_rtn = 0x7ffffbb4f00c, buf_rtn = 0x7ffffbb4f000
blockdev --getbsz /dev/vda
proc 60 (blockdev_getbsz) took 0.01 seconds
4096
I checked the source of blockdev, and it is using the BLKBSZSET ioctl.
As far as I can tell if the ioctl returns an error then blockdev
should return an error too (and hence the error should go all the way
out to guestfish).
So it appears that the Linux kernel is not returning an error.
http://lxr.linux.no/#linux+v2.6.35/fs/block_dev.c#L73
The other possibility is that the blockdev --getbsz call is not
returning the block size we just set, but that resolves to this call
inside the kernel:
http://lxr.linux.no/#linux+v2.6.35/include/linux/blkdev.h#L1196
As far as I can see this really ought to work. Perhaps we need to
perform some other operation between the two calls, like a udev settle
or some sort of rereading block devices ioctl.
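One way to narrow this down would be to issue both ioctls on the same
open file descriptor, which takes the open/close between the two
blockdev invocations out of the picture. The following is only a
diagnostic sketch. The function name set_and_get_bsz is made up, and
the idea that the setting might not survive closing the device is a
hypothesis to test, not something I have confirmed.

```c
#include <assert.h>
#include <fcntl.h>
#include <linux/fs.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical test helper: set the block size of `device` and read
 * it back without closing the device in between.  Returns 0 and
 * stores the value read back in *readback, or returns -1 on error.
 */
static int
set_and_get_bsz (const char *device, int new_bsz, int *readback)
{
  int fd, r = -1;

  assert (device != NULL && readback != NULL);

  fd = open (device, O_RDONLY);
  if (fd == -1)
    return -1;

  /* Both ioctls take a pointer to int. */
  if (ioctl (fd, BLKBSZSET, &new_bsz) == 0 &&
      ioctl (fd, BLKBSZGET, readback) == 0)
    r = 0;

  close (fd);
  return r;
}
```

If this reports 2048 on /dev/vda while a separate blockdev --getbsz
afterwards still reports 4096, that would point at the setting being
lost between opens rather than at the ioctl silently failing.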
If this worries you, please file a BZ.
Rich.
--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming blog: http://rwmj.wordpress.com
Fedora now supports 80 OCaml packages (the OPEN alternative to F#)
http://cocan.org/getting_started_with_ocaml_on_red_hat_and_fedora
14 years, 2 months
[PATCH 0/3] Inspection code in C
by Richard W.M. Jones
These three patches (two were previously posted) can do simple
operating system inspection in C.
Example of use:
><fs> add-ro rhel55.img
><fs> run
><fs> inspect-os
/dev/VolGroup00/LogVol00
><fs> inspect-get-type /dev/VolGroup00/LogVol00
linux
><fs> inspect-get-distro /dev/VolGroup00/LogVol00
rhel
><fs> inspect-get-arch /dev/VolGroup00/LogVol00
x86_64
><fs> inspect-get-major-version /dev/VolGroup00/LogVol00
5
><fs> inspect-get-minor-version /dev/VolGroup00/LogVol00
5
><fs> inspect-get-product-name /dev/VolGroup00/LogVol00
Red Hat Enterprise Linux Server release 5.5 (Tikanga)
><fs> inspect-get-filesystems /dev/VolGroup00/LogVol00
/dev/VolGroup00/LogVol00
/dev/vda1
/dev/VolGroup00/LogVol01
><fs> inspect-get-mountpoints /dev/VolGroup00/LogVol00
/dev/VolGroup00/LogVol00: /
/dev/vda1: /boot
I've tested this with all of my available guests. The only ones it
*didn't* work on were a Fedora guest with an encrypted disk, and some
FreeBSD/PCBSD guests. For the encrypted guest, you have to open the
device first. We could probably get the *BSD guests working with some
effort.
Anyway I'm quite happy with this and I intend to push it if no one
objects.
Rich.
--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
New in Fedora 11: Fedora Windows cross-compiler. Compile Windows
programs, test, and build Windows installers. Over 70 libraries supprt'd
http://fedoraproject.org/wiki/MinGW http://www.annexia.org/fedora_mingw
14 years, 3 months