When running captive, we were blindly calling kill(pid, SIGTERM) on the
captive nbdkit child process, even if that process had already exited
unexpectedly (most likely from an assertion failure) and another
process had been started with the same pid (pid recycling is rare, but
not impossible). We need to check that the child process still exists
and, if it died unexpectedly, ensure that our exit status reflects
that fact. Note that nbdkit should normally exit with status 0 (even
when it is sent a signal like SIGHUP or SIGINT) or die from a signal.

While at it, fix the handling of the value returned by system(): if
the child was not even spawned, system() returns -1, which is not
appropriate to hand to WIFEXITED(); conversely, system() can never
return a status for which WIFSTOPPED() is true, so that branch was
dead code. Also, map death from a signal to an exit status > 128
(128 plus the signal number) rather than 1 (we could be even fancier
and also re-raise the signal so that we die from the same thing, but
it's not obvious we need that much work...).

Signed-off-by: Eric Blake <eblake(a)redhat.com>
---
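For reviewers who want to try the status handling outside of nbdkit,
here is a minimal standalone sketch of the two ideas in the commit
message. It is not part of the patch: the helper names
status_to_exit_code() and reap_or_kill() are hypothetical, and error
reporting (nbdkit_error, fprintf) is left out to keep it short.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: turn the value returned by system() into a
 * shell-style exit code.  -1 means no status could be obtained, so it
 * must be checked before using any of the W*() macros.
 */
static int
status_to_exit_code (int status)
{
  if (status == -1)
    return 1;
  if (WIFEXITED (status))
    return WEXITSTATUS (status);
  /* system() waits without WUNTRACED, so a stopped status is impossible. */
  assert (WIFSIGNALED (status));
  return WTERMSIG (status) + 128;
}

/* Hypothetical helper: check on child 'pid' without blocking.  'r' is
 * the exit code computed so far; the return value is the updated code.
 */
static int
reap_or_kill (pid_t pid, int r)
{
  int status;

  switch (waitpid (pid, &status, WNOHANG)) {
  case -1:                      /* waitpid itself failed */
    return 1;
  case 0:                       /* still running: safe to signal it */
    kill (pid, SIGTERM);
    return r;
  default:                      /* already exited: fold in its status */
    if (WIFEXITED (status))
      return r == 0 ? WEXITSTATUS (status) : r;
    assert (WIFSIGNALED (status));
    return WTERMSIG (status) + 128;
  }
}
```

Because waitpid() only reports on our own unreaped child, a child
that has already exited is reaped here instead of being signalled, so
kill() is only sent while the pid still belongs to that child.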
server/captive.c | 43 +++++++++++++++++++++++++++++++++----------
1 file changed, 33 insertions(+), 10 deletions(-)
diff --git a/server/captive.c b/server/captive.c
index 90e42050..1606eb1a 100644
--- a/server/captive.c
+++ b/server/captive.c
@@ -54,7 +54,7 @@ run_command (void)
FILE *fp;
char *cmd = NULL;
size_t len = 0;
- int r;
+ int r, status;
pid_t pid;
if (!run)
@@ -135,20 +135,43 @@ run_command (void)
if (pid > 0) { /* Parent process is the run command. */
r = system (cmd);
- if (WIFEXITED (r))
+ if (r == -1) {
+ nbdkit_error ("failure to execute external command: %m");
+ r = 1;
+ }
+ else if (WIFEXITED (r))
r = WEXITSTATUS (r);
- else if (WIFSIGNALED (r)) {
+ else {
+ assert (WIFSIGNALED (r));
fprintf (stderr, "%s: external command was killed by signal %d\n",
program_name, WTERMSIG (r));
- r = 1;
- }
- else if (WIFSTOPPED (r)) {
- fprintf (stderr, "%s: external command was stopped by signal %d\n",
- program_name, WSTOPSIG (r));
- r = 1;
+ r = WTERMSIG (r) + 128;
}
- kill (pid, SIGTERM); /* Kill captive nbdkit. */
+ switch (waitpid (pid, &status, WNOHANG)) {
+ case -1:
+ nbdkit_error ("waitpid: %m");
+ r = 1;
+ break;
+ case 0:
+ /* Captive nbdkit still running; kill it, but no need to wait for it,
+ * as the captive program's exit status is good enough.
+ */
+ kill (pid, SIGTERM);
+ break;
+ default:
+ /* Captive nbdkit exited unexpectedly; update the exit status. */
+ if (WIFEXITED (status)) {
+ if (r == 0)
+ r = WEXITSTATUS (status);
+ }
+ else {
+ assert (WIFSIGNALED (status));
+ fprintf (stderr, "%s: nbdkit command was killed by signal %d\n",
+ program_name, WTERMSIG (status));
+ r = WTERMSIG (status) + 128;
+ }
+ }
_exit (r);
}
--
2.21.0