Issue with downloading files whose path contains multi-byte utf-8 characters
by Yonatan Shtarkman
Hey,
When downloading a file whose path contains multi-byte UTF-8 characters,
libguestfs sometimes crashes.
This reproduces with the Python bindings, but not with guestfish.
Code to reproduce:
for i in range(2000):
    g.download ('/xxxó', '/tmp/1')
#0 raise (sig=<optimized out>) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x00007ffff7fac140 in <signal handler called> () at
/lib/x86_64-linux-gnu/libpthread.so.0
#2 0x00007ffff6f77701 in _Py_INCREF (op=<optimized out>) at
/usr/include/python3.9/object.h:408
#3 guestfs_int_py_event_callback_wrapper
(g=<optimized out>, flags=<optimized out>, array_len=0, array=0x0,
buf_len=47, buf=0x113b8a0 "gs=0x0\r\ncommandrvf: udevadm --debug settle -E
\303by", event_handle=0, event=16, callback=0x7ffff2516600) at handle.c:137
#4 guestfs_int_py_event_callback_wrapper
(g=<optimized out>, callback=0x7ffff2516600, event=16, event_handle=0,
flags=<optimized out>, buf=0x113b8a0 "gs=0x0\r\ncommandrvf: udevadm --debug
settle -E \303by", buf_len=47, array=0x0, array_len=0) at handle.c:104
#5 0x00007ffff6e0076a in guestfs_int_call_callbacks_message (g=0xf31290,
event=16, buf=0x113b8a0 "gs=0x0\r\ncommandrvf: udevadm --debug settle -E
\303by", buf_len=47)
at events.c:117
#6 0x00007ffff6e1702e in guestfs_int_log_message_callback
(g=g@entry=0xf31290, buf=0x113b8a0 "gs=0x0\r\ncommandrvf: udevadm
--debug settle -E \303by", len=len@entry=47) at proto.c:145
#7 0x00007ffff6dfb759 in handle_log_message (g=g@entry=0xf31290,
conn=conn@entry=0x110e280) at conn-socket.c:395
#8 0x00007ffff6dfbd63 in read_data (len=4, bufv=<optimized out>,
connv=<optimized out>, g=<optimized out>) at conn-socket.c:179
#9 read_data (g=0xf31290, connv=0x110e280, bufv=<optimized out>, len=4) at
conn-socket.c:142
#10 0x00007ffff6e1764a in recv_from_daemon (buf_rtn=0x7fffffffd858,
size_rtn=0x7fffffffd854, g=0xf31290) at proto.c:545
#11 guestfs_int_recv_from_daemon (g=g@entry=0xf31290,
size_rtn=size_rtn@entry=0x7fffffffd854, buf_rtn=buf_rtn@entry=0x7fffffffd858)
at proto.c:623
#12 0x00007ffff6e17a5a in guestfs_int_recv
(g=g@entry=0xf31290, fn=fn@entry=0x7ffff6e3b3e8 "download",
hdr=hdr@entry=0x7fffffffd920, err=err@entry=0x7fffffffd8f0,
xdrp=xdrp@entry=0x0,
ret=ret@entry=0x0)
at proto.c:668
I debugged this issue and noticed that the appliance log messages
from commandrvf are truncated in the middle of a multi-byte UTF-8
sequence, so decoding them fails (the remaining bytes of the character
are missing):
https://github.com/libguestfs/libguestfs/blob/master/python/handle.c#L134
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x84 in position 0:
invalid start byte
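To illustrate the failure mode described above, here is a minimal Python sketch. It is not the binding's code: decode_chunks_strict and decode_chunks_incremental are hypothetical helpers, and the split point is chosen by hand to land inside the two-byte encoding of 'ó'. Decoding each chunk on its own fails, while an incremental decoder that carries the partial character over to the next chunk recovers the full string.

```python
import codecs

def decode_chunks_strict(chunks):
    """Decode every chunk independently, as if each log message were
    complete UTF-8. Returns a list of decoded strings or the exceptions
    that were raised."""
    results = []
    for chunk in chunks:
        try:
            results.append(chunk.decode("utf-8"))
        except UnicodeDecodeError as exc:
            results.append(exc)
    return results

def decode_chunks_incremental(chunks):
    """Decode chunks with a stateful decoder that buffers an incomplete
    multi-byte sequence until the remaining bytes arrive."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    return "".join(decoder.decode(chunk) for chunk in chunks)

encoded = "/xxxó".encode("utf-8")        # 'ó' is two bytes: 0xC3 0xB3
chunks = [encoded[:-1], encoded[-1:]]    # split in the middle of 'ó'

strict = decode_chunks_strict(chunks)
incremental = decode_chunks_incremental(chunks)
```

Whether buffering, or decoding with a replacement character, is the right fix for the binding is a separate question; the sketch only shows why a truncated buffer cannot be decoded strictly.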
1 year, 9 months
[PATCH] lib: Choose q35 machine type for x86-64
by Richard W.M. Jones
This machine type is more modern than the older 'pc' type and as most
qemu development is now focused there we expect it will perform and
behave better. In almost all respects this change should make no
difference.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2168578
---
lib/guestfs-internal.h | 3 +++
1 file changed, 3 insertions(+)
diff --git a/lib/guestfs-internal.h b/lib/guestfs-internal.h
index 07a2b9f617..4e9a103d78 100644
--- a/lib/guestfs-internal.h
+++ b/lib/guestfs-internal.h
@@ -128,6 +128,9 @@ cleanup_mutex_unlock (pthread_mutex_t **ptr)
#define MAX_WINDOWS_EXPLORER_SIZE (4 * 1000 * 1000)
/* Machine types. */
+#if defined(__x86_64__)
+#define MACHINE_TYPE "q35"
+#endif
#ifdef __arm__
#define MACHINE_TYPE "virt"
#endif
--
2.39.0
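As a rough illustration of what the patch above selects, here is a Python sketch of the same per-architecture choice. The table holds only the two values visible in the diff; the function names, the architecture strings used as keys, and the way the result is turned into qemu arguments are assumptions for illustration, not how libguestfs builds its command line.

```python
# Machine types visible in the diff: "q35" for x86-64, "virt" for 32-bit ARM.
# The keys are hypothetical labels standing in for the C preprocessor checks.
MACHINE_TYPES = {
    "x86_64": "q35",
    "arm": "virt",
}

def machine_type_for(arch):
    """Return the machine type for an architecture, or None if no
    MACHINE_TYPE would be defined for it."""
    return MACHINE_TYPES.get(arch)

def qemu_machine_args(arch):
    """Return the extra qemu arguments for the architecture; empty when
    no machine type is defined, so qemu uses its own default."""
    machine = machine_type_for(arch)
    return ["-machine", machine] if machine is not None else []
```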
1 year, 9 months
[PATCH nbdkit v2 0/4] curl: Use a curl handle pool
by Richard W.M. Jones
NOTE! At least patch 4 should not be applied, and maybe the whole
series is a bust. I am mainly posting this on the list for discussion
and maybe to archive it.
Version 1 was here:
https://listman.redhat.com/archives/libguestfs/2023-February/030610.html
This patch series introduces the concept of a pool of libcurl handles,
instead of always associating one libcurl handle with one NBD
connection. This gives you a bit more flexibility, eg. you can have a
highly concurrent multi-conn NBD connection, but not overwhelm the
remote web server with HTTP connections. Or vice versa.
Compared to the earlier version, I have pushed a couple of simple
patches from the old series upstream. The remaining patches are
tidied up a bit and better tested, but are largely the same.
Patches 1-3 on their own are performance neutral in the cases I tested
where you have approximately the same number of NBD connections as web
server connections (as expected).
Patch 4 kills performance, for reasons discussed here:
https://listman.redhat.com/archives/libguestfs/2023-February/030618.html
I was not able to fix this although I have tried several approaches.
Rich.
1 year, 9 months
[PATCH v2v] -o libvirt: Use cpu='host-model' for gcaps_default_cpu = false
by Richard W.M. Jones
For RHEL >= 9 / x86-64 guests we cannot use the default qemu CPU
(eg. "qemu64"), and so we have a mechanism for conversion to indicate
to the output modes that a more capable CPU is required. We
previously picked cpu='host-passthrough' (ie. the equivalent of qemu's
-cpu host). However this is not live migratable. cpu='host-model' is
a better choice as it is more likely to be migratable.
See also discussion here:
https://listman.redhat.com/archives/libguestfs/2023-February/030625.html
---
output/create_libvirt_xml.ml | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/output/create_libvirt_xml.ml b/output/create_libvirt_xml.ml
index e3dac4d894..60977cf5bb 100644
--- a/output/create_libvirt_xml.ml
+++ b/output/create_libvirt_xml.ml
@@ -193,7 +193,7 @@ let create_libvirt_xml ?pool source inspect
(match source.s_cpu_model with
| None ->
if not guestcaps.gcaps_default_cpu then
- List.push_back cpu_attrs ("mode", "host-passthrough");
+ List.push_back cpu_attrs ("mode", "host-model");
| Some model ->
List.push_back cpu_attrs ("match", "minimum");
if model = "qemu64" then
--
2.39.0
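For readers less familiar with OCaml, the branch touched by this patch can be sketched in Python as follows. This is only an approximation: cpu_element and its parameters are hypothetical names, and the handling that follows the "qemu64" check in the original is cut off in the diff context, so it is deliberately left out here.

```python
import xml.etree.ElementTree as ET

def cpu_element(source_cpu_model, default_cpu_ok):
    """Build a <cpu> element mirroring the branch shown in the diff.

    source_cpu_model: CPU model named by the source, or None.
    default_cpu_ok:   stands in for guestcaps.gcaps_default_cpu.
    Returns None when no attribute would be set.
    """
    attrs = {}
    if source_cpu_model is None:
        if not default_cpu_ok:
            # After the patch: host-model instead of host-passthrough.
            attrs["mode"] = "host-model"
    else:
        attrs["match"] = "minimum"
        # Further handling of the model is elided in the diff context.
    if not attrs:
        return None
    return ET.Element("cpu", attrs)

elem = cpu_element(None, default_cpu_ok=False)
xml_text = ET.tostring(elem, encoding="unicode")
```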
1 year, 9 months
ANNOUNCE: libguestfs 1.50 & guestfs-tools 1.50
by Richard W.M. Jones
I'm pleased to announce the releases of libguestfs 1.50 and
guestfs-tools 1.50. These are a library and a set of tools for
accessing and modifying filesystems inside virtual machines and disk
images, securely and without needing root access.
You can download both projects here:
https://download.libguestfs.org/1.50-stable/
https://download.libguestfs.org/guestfs-tools/1.50-stable/
I've attached the release notes below, or you can read them online
here:
https://libguestfs.org/guestfs-release-notes-1.50.1.html
https://libguestfs.org/guestfs-tools-release-notes-1.50.1.html
Thanks to the many authors who have contributed to these releases.
Rich.
- - -
RELEASE NOTES FOR LIBGUESTFS 1.50
These are the release notes for libguestfs stable release 1.50. This
describes the major changes since 1.48.
Libguestfs 1.50.0 was released on 7 February 2023.
Language bindings
Fix the PHP bindings for PHP8 (Geoff Amey).
Fix various deprecation warnings in the GObject bindings.
We no longer use the deprecated Python distutils library (thanks Miro
Hrončok).
Inspection
When listing the packages in RPM-based guests, the inspection API no
longer checks package signatures. This is because the newer librpm
used by libguestfs does not understand signatures stored in older
guests, such as SHA1 used by RHEL 6 (thanks Panu Matilainen).
"guestfs_inspect_get_hostname" in guestfs(3) can now handle
/etc/hostname files containing comments (thanks Dawid Zamirski).
"guestfs_file_architecture" in guestfs(3) can now parse files using
zstd compression.
"guestfs_inspect_get_osinfo" in guestfs(3) now returns the correct
osinfo field for Windows 11. However because of decisions made by
Microsoft, these guests still return product name and other strings
identifying as Windows 10 (thanks Yaakov Selkowitz, Yongkui Guo).
API
New APIs
"guestfs_device_name" in guestfs(3) is a new API to read the device
name associated with a drive, for example calling this with 0 will
return "/dev/sda".
"guestfs_clevis_luks_unlock" in guestfs(3) is a new API for unlocking
disks using the Clevis/Tang network-based full disk encryption scheme.
Furthermore implement this in guestfish and guestmount (Laszlo Ersek).
"guestfs_inspect_get_build_id" in guestfs(3) is a new API for reading
the build ID from some Linux and Windows guests. It is not widely used
on Linux, but for Windows it is vital for identifying Windows 11.
Other API changes
In the "guestfs_add_drive" in guestfs(3) API, the "name" and "iface"
fields are not used. "name" has not been used since around 2017, but
the documentation has only just been updated to reflect this. "iface"
was never allowed for the libvirt backend and didn't work reliably with
the direct backend (Laszlo Ersek).
"guestfs_readdir" in guestfs(3) is no longer limited to the maximum
message size but can read a directory of any size (Laszlo Ersek).
Build changes
Note that the Augeas bindings for libguestfs are no longer bundled with
libguestfs and must be built separately. See:
https://people.redhat.com/~rjones/augeas/ For RHEL 7+ rebuilding the
Fedora Rawhide package will work.
Note that libguestfs now requires minimum OCaml 4.04. It will not
compile on RHEL 6.
Note that zstd is now a required dependency.
OCaml gettext is no longer a dependency of libguestfs. (Plain gettext
is still optionally used to translate C source files.)
Add support for OCaml 4.14.
Fix build for missing stdlib functions in OCaml 4.04.
Fix "./configure --disable-ocaml". OCaml is still required to build
libguestfs, but this now correctly disables the OCaml bindings of the
API.
Add support for building on Artix, Rocky and Virtuozzo (Halil Tezcan
KARABULUT, Neil Hanlon, Andrey Drobyshev). In addition when working
out the local distro we now look at $ID_LIKE in /etc/os-release before
$ID which helps on Arch (thanks S D Rausty).
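The lookup order described above can be sketched in Python roughly like this. It is an illustration only: parse_os_release and distro_candidates are hypothetical helpers, the sample file content is made up, and the real detection lives in the libguestfs build scripts.

```python
import shlex

def parse_os_release(text):
    """Parse os-release style KEY=value lines into a dict, removing
    shell-style quoting from the values."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = " ".join(shlex.split(value))
    return fields

def distro_candidates(fields):
    """Return distro names to try in order: every $ID_LIKE entry first,
    then $ID."""
    candidates = fields.get("ID_LIKE", "").split()
    if "ID" in fields:
        candidates.append(fields["ID"])
    return candidates

# Made-up sample for an Arch derivative.
sample = 'NAME="Artix Linux"\nID=artix\nID_LIKE="arch"\n'
candidates = distro_candidates(parse_os_release(sample))
```

With this ordering a derivative distro is matched through its parent first, which is the effect the note above describes for Arch.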
Add preliminary support for compiling libguestfs on macOS.
Fix website description of cloning the libguestfs repository (Kashyap
Chamarthy).
We no longer use glibc custom printf.
We no longer use "LD_PRELOAD=libSegFault.so" in the appliance. This
feature was removed in glibc 2.35.
We no longer use dtrace / systemtap probes.
Internals
Fix regression tests to use correct paths (Nikolay Shirokovskiy).
Various improvements to qcow2 appliance handling (Andrey Drobyshev).
Disable 5-level page tables in qemu. This avoids a bug in older
versions of qemu.
Disable the LVM2 devicesfile in the appliance since it interferes with
cloned LVs that have the same UUID (Laszlo Ersek).
Don't use "-cpu max" on RISC-V as it is not yet supported by qemu's TCG
emulation of that architecture. This will be reenabled when qemu gets
support.
Avoid a rare hang that would happen when launching the appliance. This
turned out to be caused by using the unsafe call setenv(3) between fork
and exec (thanks Siddhesh Poyarekar).
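A common way to avoid this class of problem is to build the child's environment before forking and hand it directly to exec, so that nothing modifies the environment in the child. The Python sketch below shows that pattern only; it is an assumption about the general technique, not a description of the actual libguestfs fix, and run_with_env is a hypothetical helper.

```python
import os
import sys

def run_with_env(argv, extra_env):
    """Run argv[0] with extra environment variables and return its exit
    code, or -1 if it did not exit normally."""
    # Build the complete child environment in the parent, before fork.
    child_env = dict(os.environ)
    child_env.update(extra_env)

    pid = os.fork()
    if pid == 0:
        # Child: do not touch the environment here, go straight to exec.
        try:
            os.execve(argv[0], argv, child_env)
        finally:
            os._exit(127)    # only reached if exec failed
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1

# The child exits with 0 only if it sees the variable we passed in.
check = "import os, sys; sys.exit(0 if os.environ.get('DEMO_VAR') == '1' else 1)"
exit_code = run_with_env([sys.executable, "-c", check], {"DEMO_VAR": "1"})
```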
When running the file command inside the appliance we now disable
seccomp since it interferes with processing compressed files (thanks
David Runge, Toolybird).
Bugs fixed
https://bugzilla.redhat.com/2108425
compile of libguestfs-1.48.4 fails with Error: static declaration
of ‘caml_alloc_initialized_string’ follows non-static declaration
https://bugzilla.redhat.com/2064182
SHA 1 signatures required to inspect packages in RHEL 6 guests
https://bugzilla.redhat.com/2033247
document encrypted RBD disk limitation
https://bugzilla.redhat.com/2012658
libguestfs fails to detect Windows 11 guest image
https://bugzilla.redhat.com/1965941
lvm-set-filter failed in guestfish with the latest lvm2 package
https://bugzilla.redhat.com/1844341
The duplicate block device is listed when iface is set to 'virtio'
https://bugzilla.redhat.com/1809453
[RFE] Add support for LUKS encrypted disks with Clevis & Tang
https://bugzilla.redhat.com/1794518
Rewrite libguestfs use of setfiles so that it doesn't stop on ext4
immutable bits
https://bugzilla.redhat.com/1674392
No return values from a directory listing when there are simply too
many files in that directory (NULL value return)
https://bugzilla.redhat.com/1554735
RFE: customize --selinux-relabel should be the default, with
--no-selinux-relabel used to opt out
SEE ALSO
guestfs-examples(1), guestfs-faq(1), guestfs-performance(1),
guestfs-recipes(1), guestfs-testing(1), guestfs(3), guestfish(1),
http://libguestfs.org/
AUTHORS
Adolfo Jayme Barrientos
Andrey Drobyshev
Emilio Herrera
Ettore Atalan
Geoff Amey
Hela Basa
Jan Kuparinen
Kashyap Chamarthy
Laszlo Ersek
Marcin Stanclik
Michał Smyk
Neil Hanlon
Nikolay Shirokovskiy
Pavel Borecki
Piotr Drąg
Richard W.M. Jones
Ricky Tigg
Temuri Doghonadze
Yuri Chornoivan
COPYRIGHT
Copyright (C) 2009-2023 Red Hat Inc.
- - -
RELEASE NOTES FOR GUESTFS TOOLS 1.50
These are the release notes for guestfs tools stable release 1.50.
This describes the major changes since 1.48.
Guestfs tools 1.50.0 was released on 7 February 2023.
Security
CVE-2022-2211
https://bugzilla.redhat.com/show_bug.cgi?id=2100862
A buffer overflow was found in the --key option of several guestfs
tools. For more information on this low severity bug see the bug
report above (Laszlo Ersek).
New virt-drivers tool
This new tool can examine a disk image to determine:
• Whether it uses BIOS or UEFI for booting
• What bootloader it uses (Linux only)
• What kernels may be chosen at boot time (Linux only)
• What device drivers (kernel modules) are installed
This is useful for determining how (or if) a guest can boot on a
virtualization hypervisor.
virt-customize
--selinux-relabel is now the default for SELinux guests. You no longer
need to specify this flag. In the rare case where you don't want to
relabel a guest after customizing it, you can use --no-selinux-relabel.
Note this is not needed for non-SELinux guests; it will do the right
thing automatically (Laszlo Ersek).
New --inject-qemu-ga and --inject-virtio-win operations which
respectively inject QEMU Guest Agent and virtio-win drivers into
Windows guests.
Rocky Linux guests are now supported (thanks Harry Benson).
virt-inspector
Virt-inspector now outputs the new <build_id> element containing the
guest build ID, if using libguestfs ≥ 1.50.
virt-sysprep
New "lvm-system-devices" operation for removing LVM2's system.devices
file. This avoids certain problems when cloning a VM (Laszlo Ersek).
Virt-sysprep supports guests using LUKS logical volumes on top of LVM
(Laszlo Ersek).
Common changes
All the tools supporting the --key option can now use Clevis/Tang to
decrypt full disk encryption using this network-based scheme (Laszlo
Ersek).
Build changes
Note that libguestfs now requires minimum OCaml 4.04. It will not
compile on RHEL 6.
Note that libosinfo is a new required dependency.
OCaml 4.14 is now supported.
"./configure --disable-ocaml" and "./configure --disable-perl" now
disable the OCaml- and Perl-based tools respectively (thanks Simon
Walter).
Experimental support for compiling on macOS.
When running "make check-valgrind", Valgrind logs are no longer written
to separate files under tmp/. Instead the output is written to the
normal test-name.log file.
Bugs fixed
https://bugzilla.redhat.com/2133443
RFE: Support Rocky Linux in virt-customize
https://bugzilla.redhat.com/2106286
virt-sysprep: make an effort to support LUKS on LV
https://bugzilla.redhat.com/2089748
Removal of "--selinux-relabel" option breaks existing scripts
https://bugzilla.redhat.com/2075718
Having to use "--selinux-relabel" is not intuitive given Red Hat
products default to selinux enabled.
https://bugzilla.redhat.com/2072493
[RFE] Request to add lvm system.devices cleanup operation to virt-
sysprep
https://bugzilla.redhat.com/2059545
[RHEL 9.0][Nutanix] lvm partition "home" will lost with SCSI disk
either in the new cloned VM or restored from a snapshot
https://bugzilla.redhat.com/2028764
Install the qemu-guest-agent package during the conversion process
https://bugzilla.redhat.com/1809453
[RFE] Add support for LUKS encrypted disks with Clevis & Tang
https://bugzilla.redhat.com/1554735
RFE: customize --selinux-relabel should be the default, with
--no-selinux-relabel used to opt out
SEE ALSO
http://libguestfs.org/
AUTHORS
Laszlo Ersek
Richard W.M. Jones
COPYRIGHT
Copyright (C) 2009-2023 Red Hat Inc.
--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
nbdkit - Flexible, fast NBD server with plugins
https://gitlab.com/nbdkit/nbdkit
1 year, 9 months
[PATCH libnbd] generator: Pass LISTEN_FDNAMES=nbd with systemd socket activation
by Richard W.M. Jones
systemd allows sockets passed through socket activation to be named
with the protocol they require. We only ever pass one socket, so name
it. This environment variable is currently ignored by qemu-nbd and
nbdkit, but might be used by qemu-storage-daemon:
https://lists.nongnu.org/archive/html/qemu-devel/2023-01/msg06114.html
---
generator/states-connect-socket-activation.c | 41 +++++++++++---------
1 file changed, 23 insertions(+), 18 deletions(-)
diff --git a/generator/states-connect-socket-activation.c b/generator/states-connect-socket-activation.c
index 9a83834915..22f06d4fd3 100644
--- a/generator/states-connect-socket-activation.c
+++ b/generator/states-connect-socket-activation.c
@@ -34,16 +34,18 @@
/* This is baked into the systemd socket activation API. */
#define FIRST_SOCKET_ACTIVATION_FD 3
-/* == strlen ("LISTEN_PID=") | strlen ("LISTEN_FDS=") */
-#define PREFIX_LENGTH 11
-
extern char **environ;
/* Prepare environment for calling execvp when doing systemd socket
* activation. Takes the current environment and copies it. Removes
- * any existing LISTEN_PID or LISTEN_FDS and replaces them with new
- * variables. env[0] is "LISTEN_PID=..." which is filled in by
- * CONNECT_SA.START, and env[1] is "LISTEN_FDS=1".
+ * any existing LISTEN_PID, LISTEN_FDS or LISTEN_FDNAMES, and replaces
+ * them with new variables.
+ *
+ * env[0] is "LISTEN_PID=..." which is filled in by CONNECT_SA.START
+ *
+ * env[1] is "LISTEN_FDS=1"
+ *
+ * env[2] is "LISTEN_FDNAMES=nbd"
*/
static int
prepare_socket_activation_environment (string_vector *env)
@@ -53,26 +55,29 @@ prepare_socket_activation_environment (string_vector *env)
assert (env->len == 0);
- /* Reserve slots env[0] and env[1]. */
+ /* Reserve slots env[0]..env[2] */
+ if (string_vector_reserve (env, 3) == -1)
+ goto err;
p = strdup ("LISTEN_PID=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx");
if (p == NULL)
goto err;
- if (string_vector_append (env, p) == -1) {
- free (p);
- goto err;
- }
+ string_vector_append (env, p);
p = strdup ("LISTEN_FDS=1");
if (p == NULL)
goto err;
- if (string_vector_append (env, p) == -1) {
- free (p);
+ string_vector_append (env, p);
+ p = strdup ("LISTEN_FDNAMES=nbd");
+ if (p == NULL)
goto err;
- }
+ string_vector_append (env, p);
- /* Append the current environment, but remove LISTEN_PID, LISTEN_FDS. */
+ /* Append the current environment, but remove the special
+ * environment variables.
+ */
for (i = 0; environ[i] != NULL; ++i) {
- if (strncmp (environ[i], "LISTEN_PID=", PREFIX_LENGTH) != 0 &&
- strncmp (environ[i], "LISTEN_FDS=", PREFIX_LENGTH) != 0) {
+ if (strncmp (environ[i], "LISTEN_PID=", 11) != 0 &&
+ strncmp (environ[i], "LISTEN_FDS=", 11) != 0 &&
+ strncmp (environ[i], "LISTEN_FDNAMES=", 15) != 0) {
char *copy = strdup (environ[i]);
if (copy == NULL)
goto err;
@@ -194,7 +199,7 @@ CONNECT_SA.START:
char buf[32];
const char *v =
nbd_internal_fork_safe_itoa ((long) getpid (), buf, sizeof buf);
- strcpy (&env.ptr[0][PREFIX_LENGTH], v);
+ strcpy (&env.ptr[0][strlen ("LISTEN_PID=")], v);
/* Restore SIGPIPE back to SIG_DFL. */
signal (SIGPIPE, SIG_DFL);
--
2.39.0
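The environment handling in the patch above can be summarised with a short Python sketch. It mirrors the logic only: socket_activation_env is a hypothetical name, and the environment is modelled as a list of "KEY=value" strings like C's environ. Unlike the C code, which reserves a placeholder for LISTEN_PID and fills it in after fork, this version simply takes the PID as a parameter.

```python
def socket_activation_env(environ, pid):
    """Return a new environment list for the activated server.

    The three special variables come first, followed by a copy of the
    existing environment with any stale LISTEN_* entries removed.
    """
    special = ("LISTEN_PID=", "LISTEN_FDS=", "LISTEN_FDNAMES=")
    env = [
        f"LISTEN_PID={pid}",      # env[0]
        "LISTEN_FDS=1",           # env[1]: exactly one socket is passed
        "LISTEN_FDNAMES=nbd",     # env[2]: name of that socket
    ]
    env.extend(entry for entry in environ if not entry.startswith(special))
    return env

old_env = [
    "PATH=/usr/bin",
    "LISTEN_PID=1",
    "LISTEN_FDS=3",
    "LISTEN_FDNAMES=http",
    "HOME=/root",
]
new_env = socket_activation_env(old_env, 1234)
```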
1 year, 9 months
[PATCH nbdkit 0/6] curl: Use a curl handle pool
by Richard W.M. Jones
This experimental series changes the way that the curl plugin deals
with libcurl handles. It also changes the thread model of the plugin
from SERIALIZE_REQUESTS to PARALLEL.
Currently one NBD connection opens one libcurl handle. This also
implies one TCP connection to the web server. If you want to open
multiple libcurl handles (and multiple TCP connections), the client
must open multiple NBD connections, eg. using multi-conn.
After this series, there is a pool of libcurl handles shared across
all NBD connections. The pool defaults to 4 handles, but this can be
changed using the connections=N parameter.
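To make the pool idea concrete, here is a minimal Python sketch of a fixed-size handle pool shared by many concurrent requests. It is not the plugin's C implementation: HandlePool, run_demo and the use of plain objects as stand-ins for libcurl handles are all assumptions for illustration. The default of 4 matches the default pool size mentioned above.

```python
import queue
import threading
from contextlib import contextmanager

class HandlePool:
    """A fixed-size pool of handles shared by any number of callers."""

    def __init__(self, make_handle, connections=4):
        self._free = queue.Queue()
        for _ in range(connections):
            self._free.put(make_handle())

    @contextmanager
    def handle(self):
        h = self._free.get()      # blocks while all handles are busy
        try:
            yield h
        finally:
            self._free.put(h)     # always return the handle to the pool

def run_demo(requests=16, connections=4):
    """Issue many concurrent requests and report the highest number of
    handles that were in use at the same time."""
    pool = HandlePool(make_handle=object, connections=connections)
    lock = threading.Lock()
    stats = {"in_use": 0, "max_in_use": 0}

    def request():
        with pool.handle():
            with lock:
                stats["in_use"] += 1
                stats["max_in_use"] = max(stats["max_in_use"], stats["in_use"])
            with lock:
                stats["in_use"] -= 1

    threads = [threading.Thread(target=request) for _ in range(requests)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return stats["max_in_use"]

max_in_use = run_demo()
```

Because a handle is only ever held by one caller at a time, each handle stays single-threaded even though requests run in parallel, and the number of handles in use never exceeds the pool size regardless of how many requests are in flight.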
Previously the plugin relied on nbdkit SERIALIZE_REQUESTS to ensure
that a curl handle could not be used from multiple threads at the same
time (https://curl.se/libcurl/c/threadsafe.html). After this change
it is possible to use the PARALLEL thread model. This change is quite
valuable because it means we can use filters like readahead and scan.
Anyway, this all seems to work, but it actually reduces performance :-(
In particular this simple test slows down quite substantially:
time ./nbdkit -r -U - curl file:/var/tmp/fedora-36.img --run 'nbdcopy --no-extents -p "$uri" null:'
(where /var/tmp/fedora-36.img is a 10G file).
I've been looking at flamegraphs all morning and I can't really see
what the problem is (except that lots more time is spent with libcurl
calling sigaction?!?)
I'm wondering if it might be a locality issue, since curl handles are
now being scattered randomly across threads. (It might mean in the
file: case that Linux kernel readahead is ineffective). I can't
easily see a way to change the implementation to encourage handles to
be reused by the same thread.
Well, here we are ...
Rich.
1 year, 9 months