On Thu, Feb 27, 2020 at 03:12:11PM +0000, Richard W.M. Jones wrote:
In tests/test-eval.sh we have a test which looks something like
this:
nbdkit eval close='echo closed > file' --run 'qemu-img info $nbd'
if ! grep 'closed' file; then fail; fi
However there was a race condition here. nbdkit exits when the --run
command exits without waiting for the captive nbdkit subprocess. Thus
we couldn't be sure that the final 'closed' message got written to the
file. It worked most of the time, but on slow machines the test
failed.
This indicates that we ought to wait for the captive nbdkit to exit.
One reason is so that plugin cleanup is done before we continue after
the captive nbdkit. That should make shell scripts using captive
nbdkit + any plugin that does significant cleanup on exit more
predictable.
This adds the appropriate call to waitpid but we still ignore the exit
status of the captive nbdkit subprocess (in fact it is most likely to
fail since it will sent a SIGTERM signal).
---
server/captive.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/server/captive.c b/server/captive.c
index 102a77ea..0a243d2b 100644
--- a/server/captive.c
+++ b/server/captive.c
@@ -171,12 +171,14 @@ run_command (void)
r = EXIT_FAILURE;
break;
case 0:
- /* Captive nbdkit still running; kill it, but no need to wait
- * for it, as the --run program's exit status is good enough (if
- * the captive nbdkit fails to exit after SIGTERM, we have a
- * bigger bug to fix).
+ /* Captive nbdkit is still running; kill it. We want to wait
+ * for nbdkit to exit since that ensures all cleanup is done in
+ * the plugin before we return. However we don't care if nbdkit
+ * returns an error, the exit code we return always comes from
+ * the --run command.
*/
kill (pid, SIGTERM);
+ waitpid (pid, NULL, 0);
break;
default:
/* Captive nbdkit exited unexpectedly; update the exit status. */
I *thought* we'd fixed this, but it failed on a recent ppc64le build
(of 1.19.2, so including this patch). At the moment I can't think of
a plausible reason for why this is still failing.
https://github.com/libguestfs/nbdkit/blob/be44e94a7d0494ba562cb4c5c08bc58...
Rich.
FAIL: test-eval.sh
==================
+ requires qemu-img --version
+ files='eval.out eval.missing'
+ rm -f eval.out eval.missing
+ cleanup_fn rm -f eval.out eval.missing
+ _cleanup_hook[${#_cleanup_hook[@]}]='rm -f eval.out eval.missing'
+ nbdkit -U - eval 'get_size=echo 64M' 'pread=dd if=/dev/zero count=$3
iflag=count_bytes' 'missing=echo "in missing: $@" >> eval.missing;
exit 2' unload= --run 'qemu-img info $nbd'
+ cat eval.out
image: nbd+unix://?socket=/tmp/nbdkitExyrix/socket
file format: raw
virtual size: 64 MiB (67108864 bytes)
disk size: unavailable
+ grep '67108864 bytes' eval.out
virtual size: 64 MiB (67108864 bytes)
+ cat eval.missing
in missing: config_complete
in missing: thread_model
in missing: get_ready
in missing: thread_model
in missing: preconnect false
in missing: open false
in missing: can_write
in missing: can_flush
in missing: is_rotational
in missing: can_multi_conn
in missing: can_cache
in missing: can_extents
+ grep 'in missing' eval.missing
in missing: config_complete
in missing: thread_model
in missing: get_ready
in missing: thread_model
in missing: preconnect false
in missing: open false
in missing: can_write
in missing: can_flush
in missing: is_rotational
in missing: can_multi_conn
in missing: can_cache
in missing: can_extents
+ grep 'in missing: config_complete' eval.missing
in missing: config_complete
+ grep 'in missing: thread_model' eval.missing
in missing: thread_model
in missing: thread_model
+ grep 'in missing: can_write' eval.missing
in missing: can_write
+ grep 'in missing: is_rotational' eval.missing
in missing: is_rotational
+ grep 'in missing: get_ready' eval.missing
in missing: get_ready
+ grep 'in missing: preconnect' eval.missing
in missing: preconnect false
+ grep 'in missing: open' eval.missing
in missing: open false
+ grep 'in missing: close' eval.missing
++ _run_cleanup_hooks
++ status=1
++ set +e
++ trap '' INT QUIT TERM EXIT ERR
++ echo ./test-eval.sh: run cleanup hooks: exit code 1
./test-eval.sh: run cleanup hooks: exit code 1
++ (( i = 0 ))
++ (( i < 1 ))
++ rm -f eval.out eval.missing
++ (( ++i ))
++ (( i < 1 ))
++ exit 1
FAIL test-eval.sh (exit status: 1)
--
Richard Jones, Virtualization Group, Red Hat
http://people.redhat.com/~rjones
Read my programming and virtualization blog:
http://rwmj.wordpress.com
libguestfs lets you edit virtual machines. Supports shell scripting,
bindings from many languages.
http://libguestfs.org