Hi Daniel,
Thanks for the detailed report!
On 10/13/22 03:33, Daniel P. Berrangé wrote:
> On Thu, Oct 13, 2022 at 09:49:09AM +0100, Richard W.M. Jones wrote:
> > On Wed, Oct 12, 2022 at 02:00:21PM -0500, Eric Blake wrote:
> > > > Job #3163966643 ( https://gitlab.com/nbdkit/libnbd/-/jobs/3163966643/raw )
> > > >
> > > > Stage: builds
> > > > Name: x86_64-opensuse-leap-153-prebuilt-env
> > >
> > > This one is still failing because of a bug in gnutls; the log is
> > > reporting:
> > >
> > > libnbd: debug: nbd1: nbd_connect_command: transition: NEWSTYLE.OPT_STARTTLS.RECV_REPLY_PAYLOAD -> NEWSTYLE.OPT_STARTTLS.CHECK_REPLY
> > > free(): invalid pointer
> > > libnbd: debug: nbd1: nbd_connect_command: transition: NEWSTYLE.OPT_STARTTLS.CHECK_REPLY -> NEWSTYLE.OPT_STARTTLS.TLS_HANDSHAKE_READ
> > > libnbd: debug: nbd1: nbd_connect_command: transition: NEWSTYLE.OPT_STARTTLS.TLS_HANDSHAKE_READ -> DEAD
> > > libnbd: debug: nbd1: nbd_connect_command: leave: error="nbd_connect_command: gnutls_handshake: Error in the pull function. (-1/1)"
> > >
> > > That libc message about invalid free() is scary; I'm not yet sure
> > > whether it is a bug in opensuse-leap's gnutls package or something
> > > we're doing wrong in libnbd.
> >
> > I had a look into this. Unfortunately I only have OpenSUSE Tumbleweed
> > available. It doesn't fail for me in Tumbleweed. (It also doesn't
> > fail in the CI pipeline for Tumbleweed.)
>
> Anyone has access to the CI env. Line 9 of the build log
> shows the container env used:
>
> Using docker image sha256:e4a8e52b0bbb712a544a90d21b21010daad8ab3e85a768cfea38571461ec85fc for registry.gitlab.com/nbdkit/libnbd/ci-opensuse-leap-153:latest with digest registry.gitlab.com/nbdkit/libnbd/ci-opensuse-leap-153@sha256:11179119130... ...
>
> You just need to launch the same container, clone the git repo and
> then run the build commands
>
> IOW, on your local machine do:
>
> $ podman run -it registry.gitlab.com/nbdkit/libnbd/ci-opensuse-leap-153:latest
> # git clone https://gitlab.com/nbdkit/libnbd
> # cd libnbd
> # autoreconf -if
> # ./configure --enable-gcc-warnings --with-gnutls --with-libxml2 --enable-fuse --enable-ocaml --enable-python --enable-golang
>
> # make -j 20
> # cd tests
> # ./connect-tls-psk
> requires nbdkit --tls-verify-peer -U - null --run 'exit 0'
> nbdkit: pattern: error: failed to set TLS session priority to @NBDKIT,SYSTEM:+ECDHE-PSK:+DHE-PSK:+PSK: The request is invalid.
> nbd_connect_command: gnutls_handshake: Error in the push function. (-1/1)
>
> What's interesting here is that this shows the real error
> message about TLS session priority.
>
> If you set MALLOC_CHECK_=1, however, then we lose the useful
> error message:
>
> # MALLOC_CHECK_=1 MALLOC_PERTURB_=146 ./connect-tls-psk
> requires nbdkit --tls-verify-peer -U - null --run 'exit 0'
> free(): invalid pointer
> nbd_connect_command: gnutls_handshake: Error in the pull function. (-1/1)
>
> which is unfortunate for debuggability.
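One way to keep both messages around is to capture the two runs side by side. This is only a rough sketch: the function name run_both and the log file names are made up for illustration, and the environment variables are simply the ones from your command above.

```shell
#!/bin/sh
# run_both OUTDIR CMD [ARGS...]
# Run CMD twice, once plainly and once with the glibc malloc debugging
# variables set, saving the combined stdout/stderr of each run in OUTDIR.
# (run_both, plain.log and checked.log are illustrative names.)
run_both() {
    outdir=$1
    shift
    mkdir -p "$outdir" || return 1

    # Plain run: this is where the TLS session priority error was visible.
    "$@" > "$outdir/plain.log" 2>&1
    plain_status=$?

    # Same command with the malloc debugging variables from the quoted run.
    MALLOC_CHECK_=1 MALLOC_PERTURB_=146 "$@" > "$outdir/checked.log" 2>&1
    checked_status=$?

    echo "plain exit status:   $plain_status"
    echo "checked exit status: $checked_status"
}
```

From the tests directory this could be invoked as `run_both /tmp/psk-logs ./connect-tls-psk`, followed by `diff -u /tmp/psk-logs/plain.log /tmp/psk-logs/checked.log` to see what the second run hides.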
>
> I confirmed it is nbdkit that is crashing and it appears to be
> in gnutls code.
>
> Looking at the image there is no /etc/crypto-policies directory,
> and nor is there any 'crypto-policies' package available in the
> distro.
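The missing directory is easy to confirm from a shell inside the container. A minimal sketch, assuming nothing beyond POSIX sh; the function name has_crypto_policies is my own, and the optional root argument exists only so the check can be pointed at an unpacked image tree instead of /.

```shell
#!/bin/sh
# has_crypto_policies [ROOT]
# Succeed if ROOT/etc/crypto-policies exists as a directory.
# ROOT defaults to the empty string, i.e. the running system's /etc.
# (has_crypto_policies is an illustrative name, not part of any package.)
has_crypto_policies() {
    root=${1:-}
    [ -d "$root/etc/crypto-policies" ]
}

if has_crypto_policies; then
    echo "crypto-policies directory present"
else
    echo "crypto-policies directory missing"
fi
```

Run inside the Leap 15.3 image, this would be expected to report the directory as missing, matching what you found.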
Indeed. Leap 15.4 and newer include the crypto-policies package. Should the
container move to a 15.4 base?
Yes, we need to add 15.4 to the libvirt-ci facts database, given
the relative EOL dates.
With regards,
Daniel