Re: [Libguestfs] [nbdkit PATCH v2 2/2] sh: Add exit status triggers for nbdkit_{shutdown, disconnect}

Friday, 11 November 2022

On Thu, Nov 10, 2022 at 04:08:57PM +0100, Laszlo Ersek wrote:
...
 On 11/09/22 21:43, Eric Blake wrote:
 > Make it possible for the sh and eval plugins to disconnect a client or
 > shut down the entire nbdkit server by use of special return values.
 > Prior to this patch we had reserved 4-7 for future use; this defines
 > 4-8, and extends the set of reserved return values to 9-15.  We figure
 > it is unlikely that anyone is using status 4-15 with the intent that
 > it behaves identically to status 1; more importantly, the choice of
 > soft disconnect with error for status 8 is the one least likely to
 > break such an existing sh or eval script.
 > 
 > For the testsuite, I only covered the eval plugin; but since it shares
 > common code with the sh plugin, both styles should work.
 > 
 > A note about the pod documentation: when the argument to =item would
 > otherwise appear to be a single number, the use of S<> to mask it is
 > necessary.

 Per perlpod(1), the canonical Formatting Code for this would be Z<>, as
 in (untested):

 =item Z<>6

 However, I can't see a single instance of Z<> in the v2v projects, so it
 could cause compatibility problems. I guess let's stick with S<> then. 
S<> it is, then.

...
 > 
 > -static void
 > -handle_script_error (const char *argv0, string *ebuf)
 > +/* Normalize return codes and parse error string. */
 > +static exit_code
 > +handle_script_error (const char *argv0, string *ebuf, exit_code code)
 >  {
 >    int err;
 >    size_t skip = 0;
 >    char *p;
 > 
 > -  if (ebuf->len == 0) {
 > +  switch (code) {
 > +  case OK:
 > +  case MISSING:
 > +  case RET_FALSE:
 > +    /* Script successful. */
 > +    return code; 
This refactoring is big enough that I decided to split it into two
separate patches: one to add the return value and get rid of all the
redundant switch statements, the other to add the new enum values.

...
 > +
 > +  case SHUTDOWN_OK:
 > +    nbdkit_shutdown ();
 > +    return OK;
 > +
 > +  case SHUTDOWN_ERR:
 > +    nbdkit_shutdown ();
 > +    err = ESHUTDOWN;
 > +    break;
 > +
 > +  case DISC_SOFT_OK:
 > +    nbdkit_disconnect (false);
 > +    return OK;
 > +
 > +  case DISC_FORCE:
 > +  case DISC_SOFT_ERR:
 > +    nbdkit_disconnect (code == DISC_FORCE);
 > +    err = ESHUTDOWN;
 > +    break;

 The user-facing documentation and the internal comment on DISC_FORCE
 both state that stderr and the retval are irrelevant, due to the client
 never getting a response after a forcible disconnect.

 Therefore, I *think* we could simplify the code here, like this:

   case DISC_FORCE:
     /* hoisted to match enum order */
     /* nbdkit_disconnect() argument simplified */ 
Both of those are compelling.

...
     /* stderr processing entirely skipped, as it is irrelevant anyway
*/
     nbdkit_disconnect (true);
     return OK; /* note: will never reach the client anyway */ 
Maybe MISSING behaves better than OK?  But yeah, there is some
additional simplification with your proposal.

...

   case DISC_SOFT_OK:
     /* unchanged */
     nbdkit_disconnect (false);
     return OK;

   case DISC_SOFT_ERR:
     /* nbdkit_disconnect() argument simplified */
     nbdkit_disconnect (false);
     err = ESHUTDOWN;
     break;

 Benefits I see from this:

 - small runtime benefit of eliminating the stderr processing code
 - *much* easier to read for a human (IMO) 
The human factor should never be underestimated :)

...

 Of course it all hinges on whether we are indeed allowed to return "OK"
 after DISC_FORCE!

 (We might as well return MISSING or RET_FALSE as well -- basically any
 return status that is not an error report, hence does not require the
 setting of "errno", from parsing stderr, or from an appropriate default.)

 This is a really complex refactoring, I think as-is it is entirely
 correct wrt. observable behavior, I just find it a little bit hard to
 read (due to the fallthrough from DISC_FORCE), and a tiny bit
 runtime-wasteful (same reason).

 NB I've only reviewed one call site thus far (from call()). 
Yep, and that's why I split this into two before pushing.

...
 > @@ -541,20 +555,10 @@ call_read (string *rbuf, const char
**argv)
 >    CLEANUP_FREE_STRING string ebuf = empty_vector;
 > 
 >    r = call3 (NULL, 0, rbuf, &ebuf, argv);
 > -  switch (r) {
 > -  case OK:
 > -  case MISSING:
 > -  case RET_FALSE:
 > -    /* Script successful. */
 > -    return r;
 > -
 > -  case ERROR:
 > -  default:
 > -    /* Error case. */
 > +  r = handle_script_error (argv[0], &ebuf, r);
 > +  if (r == ERROR)
 >      string_reset (rbuf);
 > -    handle_script_error (argv[0], &ebuf);
 > -    return ERROR;
 > -  }
 > +  return r;
 >  }

 Hmm OK I guess this is where my theory on DISC_FORCE may be flawed. If
 DISC_FORCE is supposed to call string_reset() here, then we can't return
 OK for DISC_FORCE from handle_script_error().

 But is it really a problem if we propagate "data read from stdout" after
 a forcible disconnect? 
NBD_CMD_READ expects a certain amount of data to be present on stdout
if it returned OK.  If it returned MISSING, things are odd (all other
plugins require a .pread callback).  But in the long run, even if we
return OK here and then the caller turns it into an error because
stdout was too short, I think we still end up sanely reaching the
point where the return value is irrelevant anyways since we can't send
any sort of response to the client (whether success or error) because
we already called shutdown() during the forceful disconnect.

...
...
 > +check_dead()
 > +{
 > +    # Server should die shortly, if not already dead at this point.
 > +    for (( i=0; i < 5; i++ )); do
 > +        kill -s 0 "$(cat eval-disconnect.pid)" || break
 > +        sleep 1
 > +    done
 > +    if [ $i = 5 ]; then
 > +        echo "nbdkit didn't exit fast enough"

 I'm mostly just skimming the test case; but: should we emit this
 complaint to the test case's stderr? 
Both get collected into the same .log file, but stderr does seem a bit
nicer (I'm not sure if stdout buffering vs. when flush() occurs can
cause interesting effects where the various lines of .log are no
longer in chronological order).  Updated accordingly.

...

 > +        exit 1
 > +    fi
 > +}
 > +serve
 > +
 > +# Noisy status 0 (OK) succeeds, despite text to stderr
 > +nbdsh -u "nbd+unix:///0?socket=$sock" -c - <<\EOF

 What's the deal with <<\EOF, why not just <<EOF? 
Habit.  Writing python code in a shell heredoc is a lot easier when
you don't have to also worry about ` and $ expansions being performed
by the shell.  Even if this particular python code avoided the
problematic characters.

...

 With the proposed updates, or without:

 Reviewed-by: Laszlo Ersek <lersek(a)redhat.com&gt; 
I've perhaps taken a bit of liberty by preserving your R-b even after
splitting the patch into two, but the series is now pushed as
242757dd..b1ba5275

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

Re: [Libguestfs] [nbdkit PATCH v2 2/2] sh: Add exit status triggers for nbdkit_{shutdown, disconnect}