On 3/19/20 7:13 AM, Richard W.M. Jones wrote:
[Dropping devel, adding libguestfs]
This can be reproduced on x86-64, so I can reproduce it locally. It
only appears to happen when the tests are run under rpmbuild, not when
I run them with ‘make check’, though it's not clear to me why.
As Eric described earlier, the test runs two copies of nbdkit and a
client, connected like this:
qemu-img info ===> nbdkit nbd ===> nbdkit example1
     [3]              [2]              [1]
These are started in order [1], [2] then [3]. When the client
(process [3]) completes it exits and then the test harness kills
processes [1] and [2] in that order.
I just hit a breakthrough in understanding the deadlock.
The stack trace of [2] at the hang is:

Thread 3 (Thread 0x7fabbf4f7700 (LWP 3955842)):
#0  0x00007fabc05c0f0f in poll () from /lib64/libc.so.6
#1  0x00007fabc090abba in poll (__timeout=-1, __nfds=2,
    __fds=0x7fabbf4f6bb0) at /usr/include/bits/poll2.h:46
#2  nbdplug_reader (handle=0x5584020e09b0) at nbd.c:323
#3  0x00007fabc069d472 in start_thread () from /lib64/libpthread.so.0
#4  0x00007fabc05cc063 in clone () from /lib64/libc.so.6

This thread is calling poll() at the same time as:

Thread 2 (Thread 0x7fabbfcf8700 (LWP 3955793)):
#0  0x00007fabc069eab7 in __pthread_clockjoin_ex () from /lib64/libpthread.so.0
#1  0x00007fabc090af2b in nbdplug_close_handle (h=0x5584020e09b0) at nbd.c:538

This one just finished a poll(), because I used the blocking
nbd_shutdown instead of the non-blocking nbd_aio_disconnect. Depending
on which of the two threads wakes up first to service the server's
reaction, the other one can be stranded.
Closing the pipe-to-self is a band-aid that ensures the reader thread
eventually wakes up, but using the right API to begin with is even better.
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3226
Virtualization: qemu.org | libvirt.org