May 2022 - Libguestfs - Libguestfs List Archives

S3 plugin test case breaks test suite / development workflow...

by Laszlo Ersek

Hi,

I'm writing this about a specific problem and about a general problem.

* The specific problem is that commit 5130c43bc1f9 ("S3 plugin: add support for accessing multiple objects", 2022-05-12) introduced a dependency on the "botocore" python module, and now "make check" fails for me, because this module is unavailable on my system.

> Traceback (most recent call last):
>   File "[...]/nbdkit/plugins/S3/nbdkit-S3-plugin", line 43, in <module>
>     from botocore.exceptions import ClientError
> ModuleNotFoundError: No module named 'botocore'

Now, while the README file says, from commit 6715c3d8b3e6 ("New plugin: S3 plugin for accessing disks stored in AWS S3 and Ceph.", 2020-11-14):

> For the Python plugin:
>
>  - python interpreter (version 3 only)
>
>  - python development libraries
>
>  - python unittest to run the test suite
>
>  - boto3 is required to run the S3 plugin written in Python

the "boto3" dependency had never been a "hard" one until commit 5130c43bc1f9. The language suggests that running the S3 plugin -- even for testing -- is optional. Therefore, commit 5130c43bc1f9 should have updated the test cases "test-S3.sh" and "test-S3-unit.sh" with a proper "requires" line (I assume anyway).

FWIW, the following trick in "test-S3.sh":

> # There is a fake boto3 module in test-S3/ which we use as a test
> # harness for the plugin.

does not work. ...

So, after all, the bug in commit 5130c43bc1f9 may have been that it did not add the ClientError exception type to the fake module in "tests/test-S3/boto3". I'm unsure; but please fix it anyway.

* The generic problem is that I need to write this separate error report email, rather than commenting directly under the submission -- on the mailing list -- or the pull request -- on gitlab -- that ended up as commit 5130c43bc1f9. For the life of me, I just can't figure out *where* commit 5130c43bc1f9 was originally reviewed. I think that's wrong. ...

Ultimately, I've found the patch in the MR listing at <https://gitlab.com/nbdkit/nbdkit/-/merge_requests?scope=all&state=all>, namely in MR#9 -- <https://gitlab.com/nbdkit/nbdkit/-/merge_requests/9>. But this merge request has status *closed*, not *merged*. So even though a commit is given, I can't find the associated discussion because (a) MR#9 is not listed in the list of *merged* merge requests (only the "rejected" ones), (b) the commit message itself does not reference MR#9. I think we should improve our process here.

Thanks,
Laszlo
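The idea of skipping the S3 tests when a Python dependency is missing, rather than failing "make check", can be illustrated with a small probe. This is only a sketch: `module_available` and `missing_modules` are hypothetical helper names and are not part of nbdkit, whose test scripts are shell and would express the same check through the "requires" line the email refers to.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if the top-level module `name` can be located.

    find_spec() only looks the module up; it does not import it.
    """
    return importlib.util.find_spec(name) is not None


def missing_modules(names):
    """Return the entries of `names` that cannot be found, in order."""
    return [n for n in names if not module_available(n)]


if __name__ == "__main__":
    # Both names come from the email: boto3 (README) and botocore (traceback).
    missing = missing_modules(["boto3", "botocore"])
    if missing:
        print("SKIP: missing Python modules: " + ", ".join(missing))
    else:
        print("all S3 plugin dependencies found")
```

Probing top-level names keeps the check free of side effects; a dotted name such as "botocore.exceptions" would make find_spec() import the parent package first.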

Communication issues between NBD driver and NBDKit server

by Nikolaus Rath

Hi,

I am observing some strange errors when using the kernel's NBD driver with NBDkit. On the kernel side, I see:

May 15 16:16:11 vostro.rath.org kernel: nbd0: detected capacity change from 0 to 104857600
May 15 16:16:11 vostro.rath.org kernel: nbd1: detected capacity change from 0 to 104857600
May 15 16:18:23 vostro.rath.org kernel: block nbd0: Possible stuck request 00000000ae5feee7: control (write@4836316160,32768B). Runtime 30 seconds
May 15 16:18:25 vostro.rath.org kernel: block nbd0: Possible stuck request 000000007094eddc: control (write@5372947456,10240B). Runtime 30 seconds
May 15 16:18:27 vostro.rath.org kernel: block nbd0: Suspicious reply 89 (status 0 flags 0)
May 15 16:18:31 vostro.rath.org kernel: block nbd0: Possible stuck request 0000000075f8b9bc: control (write@8057764864,32768B). Runtime 30 seconds
May 15 16:18:41 vostro.rath.org kernel: block nbd0: Possible stuck request 000000002d1b3e8b: control (write@14499979264,32768B). Runtime 30 seconds
[...]

And userspace ('zfs snapshot' in this instance) is stuck afterwards.

On the NBDkit side, there seemingly are write errors when replying back to the kernel:

$ nbdkit --unix /tmp/tmpi5o59_y_/nbd_socket_sb --foreground --filter=exitlast --filter=stats --threads 16 S3 size=50G bucket=nikratio-backup key=sb statsfile=/tmp/tmpi5o59_y_/stats_sb.txt object-size=32K &
$ nbd-client -unix /tmp/tmpi5o59_y_/nbd_socket_sb /dev/nbd2
Warning: the oldstyle protocol is no longer supported.
This method now uses the newstyle protocol with a default export
Negotiation: ..size = 51200MB
Connected /dev/nbd0
[....]
nbdkit: python.10: error: write reply: NBD_CMD_WRITE: Broken pipe

What's the best way to narrow down who's the culprit here (kernel vs NBD server)?

Best,
-Nikolaus

-- 
GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F
»Time flies like an arrow, fruit flies like a Banana.«

[PATCH v2v 2/2] convert: Remove /dev/mapper/osprober-* devices left around by grub2

by Richard W.M. Jones

Still testing this one as well ... Rich.

[nbdkit PATCH] vddk: advise user on obscure thumbprint mismatch error condition

by Laszlo Ersek

If the thumbprint parameter is wrong, it's only reported in VixDiskLib_Open(), and then with the nondescript VIX_E_FAIL error code. If the user typed or cut-and-pasted the thumbprint incorrectly, said "Unknown error" message is not helpful for fixing the nbdkit command line. Hint at the thumbprint as the potential culprit.

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1905772
Signed-off-by: Laszlo Ersek <lersek(a)redhat.com>
---
 plugins/vddk/vddk-structs.h |  1 +
 plugins/vddk/vddk.c         | 20 ++++++++++++++++++++
 2 files changed, 21 insertions(+)

diff --git a/plugins/vddk/vddk-structs.h b/plugins/vddk/vddk-structs.h
index 799c4aecc5b8..4c7c6fe2e4fc 100644
--- a/plugins/vddk/vddk-structs.h
+++ b/plugins/vddk/vddk-structs.h
@@ -43,6 +43,7 @@ typedef uint64_t VixError;
 #define VIX_OK 0
+#define VIX_E_FAIL 1
 #define VIX_E_NOT_SUPPORTED 6
 #define VIX_ASYNC 25000
diff --git a/plugins/vddk/vddk.c b/plugins/vddk/vddk.c
index 2ea071d641e6..dbd3fdbe09af 100644
--- a/plugins/vddk/vddk.c
+++ b/plugins/vddk/vddk.c
@@ -769,6 +769,26 @@ vddk_open (int readonly)
   VDDK_CALL_END (VixDiskLib_Open, 0);
   if (err != VIX_OK) {
     VDDK_ERROR (err, "VixDiskLib_Open: %s", filename);
+
+    /* Attempt to advise the user on the extremely helpful "Unknown error"
+     * result of VixDiskLib_Open(). The one reason we've seen for this error
+     * mode is a thumbprint mismatch (RHBZ#1905772). Note that:
+     *
+     * (1) The thumbprint (as a part of "h->params") is passed to
+     *     VixDiskLib_ConnectEx() above, but the fingerprint mismatch is
+     *     detected only inside VixDiskLib_Open().
+     *
+     * (2) "thumb_print" may be NULL -- vddk_config_complete() is correct not to
+     *     require a non-NULL "thumb_print" for a remote connection; the sample
+     *     program "vixDiskLibSample.cpp" in vddk-7.0.3 explicitly permits
+     *     "-thumb" to be absent.
+     */
+    if (is_remote && err == VIX_E_FAIL)
+      nbdkit_error ("Please verify whether the \"thumbprint\" parameter (%s) "
+                    "matches the SHA1 fingerprint of the remote VMware "
+                    "server. Refer to nbdkit-vddk-plugin(1) section "
+                    "\"THUMBPRINTS\" for details.",
+                    thumb_print == NULL ? "not specified" : thumb_print);
     goto err2;
   }

base-commit: 5007409b03486fa4b43526412d3db8de50325efd
-- 
2.19.1.3.g30247aa5d201
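Since the patch targets thumbprints that were typed or pasted incorrectly, a quick format check before launching nbdkit can catch the most obvious mistakes. The sketch below is not part of the patch. It assumes the thumbprint is written as 20 colon-separated hex byte pairs (a SHA1 fingerprint); `looks_like_sha1_thumbprint` is a hypothetical name. A well-formed string can of course still fail to match the server's actual fingerprint.

```python
import re

# Assumption: a SHA1 thumbprint is 20 hex byte pairs joined by colons,
# e.g. "xx:xx:...:xx" with 20 groups in total.
_THUMBPRINT_RE = re.compile(r"[0-9A-Fa-f]{2}(?::[0-9A-Fa-f]{2}){19}")


def looks_like_sha1_thumbprint(s: str) -> bool:
    """Return True if `s` has the assumed shape of a SHA1 thumbprint.

    This only checks the format; it cannot tell whether the value
    matches the fingerprint of the remote server.
    """
    return _THUMBPRINT_RE.fullmatch(s) is not None


if __name__ == "__main__":
    candidate = ":".join(["ab"] * 20)
    print(candidate, looks_like_sha1_thumbprint(candidate))
```

Using fullmatch() means stray whitespace picked up by cut-and-paste is rejected rather than silently ignored.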

[PATCH] daemon: In list_filesystems ignore /dev/mapper/osprober-* devices

by Richard W.M. Jones

Still running virt-v2v to test this one ... Rich.

[PATCH v2v] convert: Ignore /dev/mapper/osprober-* devices when trimming

by Richard W.M. Jones

These devices can be left around by either grub2 or the osprober tool. They are read-only mirrors of existing filesystems and it appears we can safely ignore them.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2003503
---
 convert/convert.ml | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/convert/convert.ml b/convert/convert.ml
index 87fca7252b..997f6b08bd 100644
--- a/convert/convert.ml
+++ b/convert/convert.ml
@@ -194,10 +194,16 @@ and do_fstrim g inspect =
   (* Get all filesystems. *)
   let fses = g#list_filesystems () in
 
+  (* Ignore unknown/swap devices. *)
   let fses = List.filter_map (
     function (_, ("unknown"|"swap")) -> None | (dev, _) -> Some dev
   ) fses in
 
+  (* Ignore filesystems left around by osprober (RHBZ#2003503). *)
+  let fses =
+    List.filter (fun dev -> not (String.is_prefix dev "/dev/mapper/osprober-"))
+      fses in
+
   (* Trim the filesystems. *)
   List.iter (
     fun dev ->
-- 
2.35.1
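The two filtering steps in the patch can be restated outside OCaml as a short sketch. It is not the virt-v2v code: `filesystems_to_trim` is a hypothetical name, and the input is assumed to be a list of (device, filesystem type) pairs like the ones the patch pattern-matches on.

```python
OSPROBER_PREFIX = "/dev/mapper/osprober-"


def filesystems_to_trim(fses):
    """Return the devices that should be trimmed.

    `fses` is assumed to be a list of (device, fstype) pairs.
    """
    # Step 1: ignore unknown/swap devices.
    devs = [dev for dev, fstype in fses if fstype not in ("unknown", "swap")]
    # Step 2: ignore devices left around by osprober.
    return [dev for dev in devs if not dev.startswith(OSPROBER_PREFIX)]


if __name__ == "__main__":
    # Device names below are made up for illustration.
    sample = [
        ("/dev/sda1", "ext4"),
        ("/dev/sda2", "swap"),
        ("/dev/mapper/osprober-linux-sda1", "ext4"),
    ]
    print(filesystems_to_trim(sample))
```

Keeping the prefix test as a separate step mirrors the patch, which leaves the existing unknown/swap filter untouched and adds the osprober filter after it.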

[PATCH] lib: Disable 5-level page tables when using -cpu max

by Richard W.M. Jones

In https://bugzilla.redhat.com/show_bug.cgi?id=2082806 we've been tracking an insidious qemu bug which intermittently prevents the libguestfs appliance from starting. The symptoms are that SeaBIOS starts and displays its messages, but the kernel isn't reached. We found that the kernel does in fact start, but when it tries to set up page tables and jump to protected mode it gets a triple fault which causes the emulated CPU in qemu to reset (qemu exits).

This seems to only affect TCG (not KVM).

Yesterday I found that this is caused by using -cpu max which enables the "la57" feature (5-level page tables[0]), and that we can make the problem go away using -cpu max,la57=off. Note that I still don't fully understand the qemu bug, so this is only a workaround.

I chose to disable 5-level page tables for both TCG and KVM, partly to make the patch simpler, and partly because I guess it's not a feature (i.e. 57 bit linear addresses) that is useful for the libguestfs appliance case, where we have limited physical memory and no need to run any programs with huge address spaces.

I tested this by running both the direct & libvirt paths overnight.

I expect that this patch will fail with old qemu/libvirt which doesn't understand the "la57" feature, but this is only intended as a temporary workaround.

[0] Article about 5-level page tables as background: https://lwn.net/Articles/717293/

Thanks: Laszlo Ersek
Fixes: https://answers.launchpad.net/ubuntu/+source/libguestfs/+question/701625
---
 lib/launch-direct.c  | 15 +++++++++++++--
 lib/launch-libvirt.c |  7 +++++++
 2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/lib/launch-direct.c b/lib/launch-direct.c
index c07a8d78f..ff0eaeb62 100644
--- a/lib/launch-direct.c
+++ b/lib/launch-direct.c
@@ -518,8 +518,19 @@ launch_direct (guestfs_h *g, void *datav, const char *arg)
   } end_list ();
 
   cpu_model = guestfs_int_get_cpu_model (has_kvm && !force_tcg);
-  if (cpu_model)
-    arg ("-cpu", cpu_model);
+  if (cpu_model) {
+#if defined(__x86_64__)
+    /* Temporary workaround for RHBZ#2082806 */
+    if (STREQ (cpu_model, "max")) {
+      start_list ("-cpu") {
+        append_list (cpu_model);
+        append_list ("la57=off");
+      } end_list ();
+    }
+    else
+#endif
+      arg ("-cpu", cpu_model);
+  }
 
   if (g->smp > 1)
     arg_format ("-smp", "%d", g->smp);
diff --git a/lib/launch-libvirt.c b/lib/launch-libvirt.c
index 87da2f40e..03d69e027 100644
--- a/lib/launch-libvirt.c
+++ b/lib/launch-libvirt.c
@@ -1185,6 +1185,13 @@ construct_libvirt_xml_cpu (guestfs_h *g,
       else if (STREQ (cpu_model, "max")) {
         /* https://bugzilla.redhat.com/show_bug.cgi?id=1935572#c11 */
         attribute ("mode", "maximum");
+#if defined(__x86_64__)
+        /* Temporary workaround for RHBZ#2082806 */
+        start_element ("feature") {
+          attribute ("policy", "disable");
+          attribute ("name", "la57");
+        } end_element ();
+#endif
       }
       else
         single_element ("model", cpu_model);
-- 
2.31.1
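The branching in the direct-launch hunk can be summarised as a small sketch of how the "-cpu" argument ends up being built. This is a restatement for illustration, not libguestfs code: `qemu_cpu_args` and its `is_x86_64` parameter are hypothetical, with the latter standing in for the compile-time `__x86_64__` check.

```python
def qemu_cpu_args(cpu_model, is_x86_64=True):
    """Return the qemu arguments selecting the CPU model.

    On x86-64 with the "max" model, la57 is switched off as a
    workaround, giving "-cpu max,la57=off" as described in the email.
    """
    if not cpu_model:
        # No CPU model chosen: pass nothing, as in the original code.
        return []
    if is_x86_64 and cpu_model == "max":
        return ["-cpu", ",".join([cpu_model, "la57=off"])]
    return ["-cpu", cpu_model]


if __name__ == "__main__":
    print(qemu_cpu_args("max"))
    print(qemu_cpu_args("max", is_x86_64=False))
    print(qemu_cpu_args("host"))
```

Only the "max" model is touched, so any other model reaches qemu exactly as before the workaround.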

[libguestfs PATCH 0/2] daemon/selinux-relabel: tolerate relabeling errors

by Laszlo Ersek

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1794518

In the "selinux-relabel" API, utilize the "-C" option of setfiles(8), if it is available. (It's going to be part of the policycoreutils-3.4 release.) See patch#2 for a bit longer explanation.

Thanks
Laszlo

Laszlo Ersek (2):
  daemon/selinux-relabel: generalize setfiles_has_m_option()
  daemon/selinux-relabel: tolerate relabeling errors

 daemon/selinux-relabel.c | 36 +++++++++++++-------
 1 file changed, 24 insertions(+), 12 deletions(-)

base-commit: 08c4ac90f5a3c08b48444e2faf3d0f58d6ddc206
-- 
2.19.1.3.g30247aa5d201

[PATCH nbdkit] vddk: Demote another useless phone-home error message to debug

by Richard W.M. Jones

Earlier commit df7957c8b8 ("vddk: Demote useless VMware error message to a debug statement.") turned an error message from VMware's phone home anti-feature into a debug message. It turns out there is more than one of these messages.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2083617
Reported-by: Ming Xie
---
 plugins/vddk/vddk.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/plugins/vddk/vddk.c b/plugins/vddk/vddk.c
index 51ef8f33..2ea071d6 100644
--- a/plugins/vddk/vddk.c
+++ b/plugins/vddk/vddk.c
@@ -513,11 +513,15 @@ error_function (const char *fs, va_list args)
 
   trim (str);
 
-  /* VDDK 7 added a useless error message about their "phone home"
-   * system called CEIP which only panics users. Demote it to a debug
-   * statement. https://bugzilla.redhat.com/show_bug.cgi?id=1834267
+  /* VDDK 7 added some useless error messages about their "phone home"
+   * system called CEIP which only panics users. Demote to a debug
+   * statement.
+   * https://bugzilla.redhat.com/show_bug.cgi?id=1834267
+   * https://bugzilla.redhat.com/show_bug.cgi?id=2083617
    */
-  if (strstr (str, "Get CEIP status failed") != NULL) {
+  if (strstr (str, "Get CEIP status failed") != NULL ||
+      strstr (str, "VDDK_PhoneHome: Unable to load configuration "
+              "options from ") != NULL) {
     nbdkit_debug ("%s", str);
     return;
   }
-- 
2.35.1
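The substring test in the patch amounts to a small classifier: a message containing either known phone-home fragment is demoted to debug, everything else stays an error. The sketch below restates that logic only. `classify_vddk_message` is a hypothetical name, and the sample messages are made up for illustration, not real VDDK output.

```python
# The two fragments come from the patch; messages containing either
# one are demoted from error to debug.
DEMOTED_SUBSTRINGS = (
    "Get CEIP status failed",
    "VDDK_PhoneHome: Unable to load configuration options from ",
)


def classify_vddk_message(msg: str) -> str:
    """Return "debug" for known phone-home noise, otherwise "error"."""
    msg = msg.strip()
    if any(fragment in msg for fragment in DEMOTED_SUBSTRINGS):
        return "debug"
    return "error"


if __name__ == "__main__":
    # Hypothetical messages, for illustration only.
    print(classify_vddk_message("Get CEIP status failed: example"))
    print(classify_vddk_message("Cannot open disk: example"))
```

A tuple of fragments keeps the next addition to a one-line change, which matters since the commit message notes that more than one such message has already turned up.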

[PATCH nbdkit] nbd: Hide some state machine debugging behind a debug flag

by Richard W.M. Jones

When running virt-p2v which uses this plugin, the log file is consumed by messages about state machine transitions and so on. In a log file that was shared with me, out of the 135,023 lines in total, 94,653 (70%) were:

  nbdkit: debug: polling, dir=1

and 18,676 (14%) were:

  nbdkit: debug: cookie X completed state machine, status 0

This commit changes the logging so these state machine transitions are only printed when you use the debug flag “-D nbd.verbose=1”. I didn't document this flag because it's likely only of use to developers who are reading the code already.

There are some debug messages along error paths which are (a) generally useful and (b) did not appear in the log file, so I left those alone.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2083498
Reported-by: Ming Xie
---
 plugins/nbd/nbd.c | 24 +++++++++++++++++-------
 1 file changed, 17 insertions(+), 7 deletions(-)

diff --git a/plugins/nbd/nbd.c b/plugins/nbd/nbd.c
index 45bf05e9..d0d6544b 100644
--- a/plugins/nbd/nbd.c
+++ b/plugins/nbd/nbd.c
@@ -66,6 +66,9 @@
 #define USE_VSOCK 1
 #endif
 
+/* Use '-D nbd.verbose=1' for verbose messages about the state machine. */
+NBDKIT_DLL_PUBLIC int nbd_debug_verbose = 0;
+
 /* The per-transaction details */
 struct transaction {
   int64_t cookie;
@@ -421,7 +424,8 @@ nbdplug_reader (void *handle)
 {
   struct handle *h = handle;
 
-  nbdkit_debug ("nbd: started reader thread");
+  if (nbd_debug_verbose)
+    nbdkit_debug ("nbd: started reader thread");
 
   while (!nbd_aio_is_dead (h->nbd) && !nbd_aio_is_closed (h->nbd)) {
     int r;
@@ -433,7 +437,8 @@ nbdplug_reader (void *handle)
     unsigned dir;
 
     dir = nbd_aio_get_direction (h->nbd);
-    nbdkit_debug ("polling, dir=%d", dir);
+    if (nbd_debug_verbose)
+      nbdkit_debug ("polling, dir=%d", dir);
     if (dir & LIBNBD_AIO_DIRECTION_READ)
       fds[0].events |= POLLIN;
     if (dir & LIBNBD_AIO_DIRECTION_WRITE)
@@ -466,8 +471,11 @@ nbdplug_reader (void *handle)
     }
   }
 
-  nbdkit_debug ("state machine changed to %s", nbd_connection_state (h->nbd));
-  nbdkit_debug ("exiting reader thread");
+  if (nbd_debug_verbose) {
+    nbdkit_debug ("state machine changed to %s",
+                  nbd_connection_state (h->nbd));
+    nbdkit_debug ("exiting reader thread");
+  }
   return NULL;
 }
 
@@ -481,8 +489,9 @@ nbdplug_notify (void *opaque, int *error)
    * updated by nbdplug_register, but it's only an informational
    * message.
    */
-  nbdkit_debug ("cookie %" PRId64 " completed state machine, status %d",
-                trans->cookie, *error);
+  if (nbd_debug_verbose)
+    nbdkit_debug ("cookie %" PRId64 " completed state machine, status %d",
+                  trans->cookie, *error);
   trans->err = *error;
   if (sem_post (&trans->sem)) {
     nbdkit_error ("failed to post semaphore: %m");
@@ -514,7 +523,8 @@ nbdplug_register (struct handle *h, struct transaction *trans, int64_t cookie)
     return;
   }
 
-  nbdkit_debug ("cookie %" PRId64 " started by state machine", cookie);
+  if (nbd_debug_verbose)
+    nbdkit_debug ("cookie %" PRId64 " started by state machine", cookie);
   trans->cookie = cookie;
 
   if (write (h->fds[1], &c, 1) == -1 && errno != EAGAIN)
-- 
2.35.1
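The percentages quoted in the commit message can be checked with a few lines of arithmetic, which also gives a rough idea of how much log is left once the two quoted message types are hidden. The line counts are the ones from the email; `percent` is a hypothetical helper. The remaining count is an upper bound, since the patch also gates other messages whose frequency the email does not give.

```python
TOTAL_LINES = 135_023        # total lines in the shared log file
POLLING_LINES = 94_653       # "polling, dir=1"
COOKIE_DONE_LINES = 18_676   # "cookie X completed state machine, status 0"


def percent(part: int, whole: int) -> float:
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole


# Lines left after hiding only the two message types quoted above.
REMAINING_LINES = TOTAL_LINES - POLLING_LINES - COOKIE_DONE_LINES

if __name__ == "__main__":
    print(f"polling:   {percent(POLLING_LINES, TOTAL_LINES):.1f}%")
    print(f"cookie:    {percent(COOKIE_DONE_LINES, TOTAL_LINES):.1f}%")
    print(f"remaining: {REMAINING_LINES} lines "
          f"({percent(REMAINING_LINES, TOTAL_LINES):.1f}%)")
```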

