On Wed, Jun 30, 2021 at 05:49:41PM +0200, Martin Kletzander wrote:
On Wed, Jun 30, 2021 at 05:11:45PM +0200, Martin Kletzander wrote:
>- Both openSUSE builds are failing to run check-valgrind and it looks
> like it might be unrelated to libnbd, although it would be nice for
> someone else to confirm that. For now I have disabled check-valgrind
> on those platforms in my branch.
>
>- Similarly to openSUSE, Ubuntu 20.04 fails in valgrind tests, but
> somewhere down the GnuTLS rabbit hole, which I presume is unrelated
> too, so I disabled check-valgrind on that one as well.
>
>I will send the patches once they are cleaned up, but I wanted to let
>everyone know what the current status is because eliminating all random
>issues is essential to properly consuming CI results.
>
I forgot to mention the pipeline with all the errors (before the
check-valgrind skips) is here:
https://gitlab.com/nertpinx/libnbd/-/pipelines/329661257
This one:
https://gitlab.com/nertpinx/libnbd/-/jobs/1389193576
FAIL: dlopen
The actual failure is this memory leak:
==17953== 4,096 bytes in 1 blocks are still reachable in loss record 4 of 4
==17953==    at 0x4C2E2DF: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==17953==    by 0x52CFEFB: _IO_file_doallocate (in /lib64/libc-2.26.so)
==17953==    by 0x52DF0A8: _IO_doallocbuf (in /lib64/libc-2.26.so)
==17953==    by 0x52DDC67: _IO_file_overflow@@GLIBC_2.2.5 (in /lib64/libc-2.26.so)
==17953==    by 0x52DCCDE: _IO_file_xsputn@@GLIBC_2.2.5 (in /lib64/libc-2.26.so)
==17953==    by 0x52AFCBA: vfprintf (in /lib64/libc-2.26.so)
==17953==    by 0x52B88D5: printf (in /lib64/libc-2.26.so)
==17953==    by 0x400ACE: thread_start (dlopen.c:117)
==17953==    by 0x50474F8: start_thread (in /lib64/libpthread-2.26.so)
==17953==    by 0x5359ECE: clone (in /lib64/libc-2.26.so)
It looks benign so the fix would be to add a suppression to
libnbd.git/valgrind/glibc.suppressions, probably something like this
(untested):
{
   glibc_6
   Memcheck:Leak
   fun:malloc
   fun:_IO_file_doallocate
}
---
https://gitlab.com/nertpinx/libnbd/-/jobs/1389193577
This has dozens of failures in the OCaml tests. Most but not
all of them are like this:
==20650== 16 bytes in 1 blocks are still reachable in loss record 1 of 38
==20650==    at 0x483E7B5: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==20650==    by 0x47239C: caml_stat_alloc_noexc (memory.c:818)
==20650==    by 0x47239C: caml_stat_alloc (memory.c:840)
==20650==    by 0x485BCD: caml_register_custom_operations (custom.c:121)
==20650==    by 0x485BCD: caml_init_custom_operations (custom.c:163)
==20650==    by 0x48BE46: caml_startup_common (startup_nat.c:132)
==20650==    by 0x48C05A: caml_startup_exn (startup_nat.c:163)
==20650==    by 0x48C05A: caml_startup (startup_nat.c:168)
==20650==    by 0x48C05A: caml_main (startup_nat.c:175)
==20650==    by 0x430CEB: main (main.c:41)
There's already a suppression for something similar
(ocaml_heap_leak_5), but it probably needs to be adjusted slightly for
the different OCaml compiler being used by SUSE.
This leads us to the general problem with attempting to run valgrind
tests across lots of different distros: We're going to be forever
chasing minor differences in versions of supporting software (glibc,
OCaml, GnuTLS, etc.).
---
https://gitlab.com/nertpinx/libnbd/-/jobs/1389193579
Multiple memory leaks in getaddrinfo. I'm going to guess that Ubuntu
uses a different NSS configuration from Fedora, and so the NSS plugin
they're using is leaky.
I think a suppression covering getaddrinfo -> ... -> malloc would be
too broad, since it would also suppress valid errors, e.g. where we
didn't call freeaddrinfo on all paths.
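To make concrete the kind of valid error a broad getaddrinfo
suppression would hide, here is a minimal sketch in C. It is not
libnbd code: first_address is a hypothetical helper invented for
illustration, and the numeric-only hints are an assumption chosen so
that no NSS plugin or DNS lookup is involved. The point is only that
every exit path taken after a successful getaddrinfo goes through
freeaddrinfo.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>

/* Hypothetical helper (not part of libnbd): copy the first address that
 * getaddrinfo returns for a numeric host/port into *out.
 * Returns 0 on success, -1 on error. */
int
first_address (const char *host, const char *port,
               struct sockaddr_storage *out, socklen_t *outlen)
{
  struct addrinfo hints;
  struct addrinfo *result = NULL;
  int err, ret = -1;

  memset (&hints, 0, sizeof hints);
  hints.ai_family = AF_UNSPEC;
  hints.ai_socktype = SOCK_STREAM;
  /* Assumption for this sketch: numeric input only, so no NSS/DNS. */
  hints.ai_flags = AI_NUMERICHOST | AI_NUMERICSERV;

  err = getaddrinfo (host, port, &hints, &result);
  if (err != 0) {
    fprintf (stderr, "getaddrinfo: %s\n", gai_strerror (err));
    return -1;                  /* no list was returned, nothing to free */
  }

  /* From here on, every exit must go through the cleanup label. */
  if (result->ai_addrlen > sizeof *out)
    goto out;                   /* error path: the list is still freed */

  memcpy (out, result->ai_addr, result->ai_addrlen);
  *outlen = result->ai_addrlen;
  ret = 0;

 out:
  freeaddrinfo (result);        /* single cleanup point for all paths */
  return ret;
}
```

If the "goto out" were replaced by a plain "return -1", valgrind would
report a leak whose stack passes through getaddrinfo and ends in
malloc, which is exactly the report a blanket suppression would
silence.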
Fun ...
Rich.
--
Richard Jones, Virtualization Group, Red Hat
http://people.redhat.com/~rjones