On Thu, Oct 13, 2022 at 09:49:09AM +0100, Richard W.M. Jones wrote:
On Wed, Oct 12, 2022 at 02:00:21PM -0500, Eric Blake wrote:
> > Job #3163966643 ( https://gitlab.com/nbdkit/libnbd/-/jobs/3163966643/raw )
> >
> > Stage: builds
> > Name: x86_64-opensuse-leap-153-prebuilt-env
>
> This one is still failing because of a bug in gnutls; the log is
> reporting:
>
> libnbd: debug: nbd1: nbd_connect_command: transition: NEWSTYLE.OPT_STARTTLS.RECV_REPLY_PAYLOAD -> NEWSTYLE.OPT_STARTTLS.CHECK_REPLY
> free(): invalid pointer
> libnbd: debug: nbd1: nbd_connect_command: transition: NEWSTYLE.OPT_STARTTLS.CHECK_REPLY -> NEWSTYLE.OPT_STARTTLS.TLS_HANDSHAKE_READ
> libnbd: debug: nbd1: nbd_connect_command: transition: NEWSTYLE.OPT_STARTTLS.TLS_HANDSHAKE_READ -> DEAD
> libnbd: debug: nbd1: nbd_connect_command: leave: error="nbd_connect_command: gnutls_handshake: Error in the pull function. (-1/1)"
>
> That libc message about invalid free() is scary; I'm not yet sure
> whether it is a bug in opensuse-leap's gnutls package or something
> we're doing wrong in libnbd.
I had a look into this. Unfortunately I only have OpenSUSE Tumbleweed
available. It doesn't fail for me in Tumbleweed. (It also doesn't
fail in the CI pipeline for Tumbleweed.)
Anyone can access the CI env. Line 9 of the build log
shows the container env used:
Using docker image sha256:e4a8e52b0bbb712a544a90d21b21010daad8ab3e85a768cfea38571461ec85fc for registry.gitlab.com/nbdkit/libnbd/ci-opensuse-leap-153:latest with digest registry.gitlab.com/nbdkit/libnbd/ci-opensuse-leap-153@sha256:11179119130...
...
You just need to launch the same container, clone the git repo and
then run the build commands.
IOW, on your local machine do:
$ podman run -it registry.gitlab.com/nbdkit/libnbd/ci-opensuse-leap-153:latest
# git clone https://gitlab.com/nbdkit/libnbd
# cd libnbd
# autoreconf -if
# ./configure --enable-gcc-warnings --with-gnutls --with-libxml2 --enable-fuse --enable-ocaml --enable-python --enable-golang
# make -j 20
# cd tests
# ./connect-tls-psk
requires nbdkit --tls-verify-peer -U - null --run 'exit 0'
nbdkit: pattern: error: failed to set TLS session priority to @NBDKIT,SYSTEM:+ECDHE-PSK:+DHE-PSK:+PSK: The request is invalid.
nbd_connect_command: gnutls_handshake: Error in the push function. (-1/1)
What's interesting here is that this shows the real error
message about the TLS session priority.
If you set MALLOC_CHECK_=1, however, then we lose the useful
error message:
# MALLOC_CHECK_=1 MALLOC_PERTURB_=146 ./connect-tls-psk
requires nbdkit --tls-verify-peer -U - null --run 'exit 0'
free(): invalid pointer
nbd_connect_command: gnutls_handshake: Error in the pull function. (-1/1)
which is unfortunate for debuggability.
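
To compare the two runs side by side, something along these lines might
help. It is only a sketch: run_both is a hypothetical helper, not
anything that exists in libnbd, and it assumes you are in the tests
directory of a built tree.

```shell
#!/bin/sh
# Hypothetical helper (not part of libnbd): run the same command twice,
# once in the default environment and once with the glibc malloc
# debugging variables used above, merging stderr into stdout each time.
run_both() {
    echo "== default environment =="
    "$@" 2>&1 || true
    echo "== MALLOC_CHECK_=1 MALLOC_PERTURB_=146 =="
    MALLOC_CHECK_=1 MALLOC_PERTURB_=146 "$@" 2>&1 || true
}

# Example, from the tests directory:
#   run_both ./connect-tls-psk
```

The "|| true" is there so that a failing first run does not stop the
second one when the script is run under "set -e".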
I confirmed it is nbdkit that is crashing and it appears to be
in gnutls code.
Looking at the image, there is no /etc/crypto-policies directory,
nor is there any 'crypto-policies' package available in the
distro.
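
A quick way to repeat the directory check inside the container could
look like the following. This is a minimal sketch: check_crypto_policies
is a made-up helper name, and the default path is simply the one
mentioned above.

```shell
#!/bin/sh
# Hypothetical helper: report whether a system crypto-policies
# directory exists. The default path is the one discussed above.
check_crypto_policies() {
    dir="${1:-/etc/crypto-policies}"
    if [ -d "$dir" ]; then
        echo "present"
    else
        echo "missing"
    fi
}

check_crypto_policies
```

On the distro side, "zypper search crypto-policies" should show whether
any such package can be installed.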
So they have mis-built nbdkit in Leap 15.3 with a TLS priority
string of @NBDKIT,SYSTEM, despite not having support for that
in their distro.
So I guess this problem is somehow specific to nbdkit or gnutls in
OpenSUSE 15.3.
Yep, broken nbdkit, compounded by a free() crash bug in gnutls
hiding the real error.
We can probably ignore this failure, under the assumption it is
fixed upstream.
In ci/manifest.yml set 'allow-failure: true' for 15.3, and
re-run lcitool manifest.
Or disable the gnutls build on 15.3 for CI purposes by passing
--without-gnutls to configure.
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|