Re: [Libguestfs] [libnbd PATCH] docs: Clarify how callbacks should handle errors

Thursday, 3 February 2022

On 02/02/22 17:58, Eric Blake wrote:
...
 Recent patches have demonstrated confusion on which order callbacks
 are reached, when it is safe or dangerous to ignore *error, and what a
 completion callback should do when auto-retirement is in use.  Add
 wording to make it more obvious that:

 - mid-command callbacks are reached before the completion callback, 
I think the documentation changes kind of imply this order, but I don't
see it stated explicitly, succinctly. Can you add just one sentence
somewhere that simply states this?

...
   and returning -1 does not prevent future callbacks
 - ignoring *error in a mid-command callback is safe
 - completion callbacks are reached unconditionally, and must NOT ignore
   *error
 - if auto-retirement is in use, completion callbacks should always use
   it rather than trying to return -1 on error 
I don't understand this. According to "docs/libnbd.pod" (and my current
understanding), auto-retirement is *only* the case when the completion
callback returns 1. Substituting that into the above paragraph, the
latter becomes:

- if the completion callback returns 1, rather than the "AIO submit"
  client logic calling nbd_aio_command_completed() manually, then the
  completion callback should always return 1, rather than trying to
  return -1 on error.

It seems to suggest that, with auto-retirement, there is no way to
report an error. If that's actually *intended* (IOW, only use
auto-retirement when you don't care about reporting errors), then I
think the docs should be more explicit about that.

... Hmm, OK, I think the actual documentation change is mostly
understandable below, I'm only confused by the commit message. Basically
the idea is, "if you *decide* to use auto-retirement, then use it
consistently, and return 1 on *all* exit paths, even those handling
errors -- so that you don't have to handle errors in the original AIO
submitting code (with manual completion) in *some* cases only".

How about something like:

s/auto-retirement is in use/you choose to retire asynchronous commands
automatically, rather than manually/

...
 - the contents of buf after nbd_pread and friends is unspecified on
   error 
Undefined perhaps? Unspecified means that the buffer can contain
anything, and it's valid for the caller to access the buffer. I'd think
it's invalid to read such buffers though.

Example: we have a local ("block scope, automatic storage duration")
uint8_t array with 512 elements, and no initializer. In the containing
function, we call nbd_aio_pread(), targeting the buffer, and then we
wait for completion too. (No automatic retirement.) If
nbd_aio_command_completed() returns -1, then the array may still have
indeterminate value, and reading that is undefined behavior in C.

Otherwise, the hunks below are in sync with the commit message, so I
have no additional comments. If you agree with some of my suggestions
for the commit message, then the patch body should reflect those too.

Thank you very much for having added this patch to your todo list :)
Laszlo

...
 ---
  docs/libnbd.pod  | 44 +++++++++++++++++++++++++++++++++++---------
  generator/API.ml | 24 ++++++++++++++++++------
  2 files changed, 53 insertions(+), 15 deletions(-)

 diff --git a/docs/libnbd.pod b/docs/libnbd.pod
 index 32088f64..006d530c 100644
 --- a/docs/libnbd.pod
 +++ b/docs/libnbd.pod
 @@ -870,15 +870,41 @@ still needs to be retired.
  =head2 Callbacks with C<int *error> parameter

  Some of the high-level commands (L<nbd_pread_structured(3)>,
 -L<nbd_block_status(3)>) involve the use of a callback function invoked by
 -the state machine at appropriate points in the server's reply before
 -the overall command is complete.  These callback functions, along with
 -all of the completion callbacks, include a parameter C<error>
 -containing the value of any error detected so far; if the callback
 -function fails, it should assign back into C<error> and return C<-1>
 -to change the resulting error of the overall command.  Assignments
 -into C<error> are ignored for any other return value; similarly,
 -assigning C<0> into C<error> does not have an effect.
 +L<nbd_block_status(3)>) involve the use of a callback function invoked
 +by the state machine at appropriate points in the server's reply
 +before the overall command is complete.  These callback functions,
 +along with all of the completion callbacks, include a parameter
 +C<error> containing the value of any error detected so far.  If a
 +callback function fails and wants to change the resulting error of the
 +overall command visible later in the API sequence, it should assign
 +back into C<error> and return C<-1>.  Assignments into C<error> are
 +ignored for any other return value; similarly, assigning C<0> into
 +C<error> does not have an effect.
 +
 +Note that a mid-command callback might never be reached, such as if
 +libnbd detects that the command was invalid to send (see
 +L<nbd_set_strict_mode(3)>) or if the server reports a failure.  It is
 +safe for a mid-command callback to ignore non-zero C<error>: all the
 +other parameters to the mid-command callback will still be valid
 +(corresponding to the current portion of the server's reply), and the
 +overall command will still fail (at the completion callback or
 +L<nbd_aio_command_completed(3)> for an asynchronous command, or as the
 +result of the overall synchronous command).  Returing C<-1> from a
 +mid-command callback does not prevent that callback from being reached
 +again, if the server sends more mid-command replies that warrant
 +another use of that callback.  A mid-command callback may be reached
 +more times than expected if the server is non-compliant.
 +
 +On the other hand, if a completion callback is supplied (only possible
 +with asynchronous commands), it will always be reached exactly once,
 +and the completion callback must not ignore the value of C<error>.  In
 +particular, the content of a buffer passed to L<nbd_aio_pread(3)> or
 +L<nbd_aio_pread_structured(3)> is unspecified if C<error> is set on
 +entry to the completion callback.  It is recommended that if your code
 +uses automatic command retirement, then the completion function should
 +return C<1> on all control paths, even when handling errors (note that
 +with automatic retirement, assigning into C<error> is pointless as
 +there is no later API to see that value).

  =head1 COMPILING YOUR PROGRAM

 diff --git a/generator/API.ml b/generator/API.ml
 index cf2e7543..bdb8e7c8 100644
 --- a/generator/API.ml
 +++ b/generator/API.ml
 @@ -1,6 +1,6 @@
  (* hey emacs, this is OCaml code: -*- tuareg -*- *)
  (* nbd client library in userspace: the API
 - * Copyright (C) 2013-2021 Red Hat Inc.
 + * Copyright (C) 2013-2022 Red Hat Inc.
   *
   * This library is free software; you can redistribute it and/or
   * modify it under the terms of the GNU Lesser General Public
 @@ -1829,7 +1829,9 @@   "pread", {
  L<nbd_get_block_size(3)>.

  The C<flags> parameter must be C<0> for now (it exists for future NBD
 -protocol extensions)."
 +protocol extensions).
 +
 +Note that if this command fails, the contents of C<buf> are unspecified."
  ^ strict_call_description;
      see_also = [Link "aio_pread"; Link "pread_structured";
                  Link "get_block_size"; Link "set_strict_mode"];
 @@ -1914,7 +1916,9 @@   "pread_structured", {
  C<LIBNBD_CMD_FLAG_DF> meaning that the server should not reply with
  more than one fragment (if that is supported - some servers cannot do
  this, see L<nbd_can_df(3)>). Libnbd does not validate that the server
 -actually obeys the flag."
 +actually obeys the flag.
 +
 +Note that if this command fails, the contents of C<buf> are unspecified."
  ^ strict_call_description;
      see_also = [Link "can_df"; Link "pread";
                  Link "aio_pread_structured"; Link "get_block_size";
 @@ -2155,7 +2159,8 @@   "block_status", {
  for contexts known to libnbd, B<E<lt>libnbd.hE<gt>> contains
constants
  beginning with C<LIBNBD_STATE_> that may help decipher the values.
  On entry to the callback, the C<error> parameter contains the errno
 -value of any previously detected error.
 +value of any previously detected error, but even if an earlier error
 +was detected, the current C<metacontext> and C<entries> are valid.

  It is possible for the extent function to be called
  more times than you expect (if the server is buggy),
 @@ -2454,7 +2459,10 @@   "aio_pread", {
  as described in L<libnbd(3)/Completion callbacks>.

  Note that you must ensure C<buf> is valid until the command has
 -completed.  Other parameters behave as documented in L<nbd_pread(3)>."
 +completed.  Furthermore, the contents of C<buf> are unspecified if the
 +C<error> parameter to C<completion_callback> is set, or if
 +L<nbd_aio_command_completed(3)> reports failure.  Other parameters behave
 +as documented in L<nbd_pread(3)>."
  ^ strict_call_description;
      example = Some "examples/aio-connect-read.c";
      see_also = [SectionLink "Issuing asynchronous commands";
 @@ -2478,7 +2486,11 @@   "aio_pread_structured", {
  Or supply the optional C<completion_callback> which will be invoked
  as described in L<libnbd(3)/Completion callbacks>.

 -Other parameters behave as documented in L<nbd_pread_structured(3)>."
 +Note that you must ensure C<buf> is valid until the command has
 +completed.  Furthermore, the contents of C<buf> are unspecified if the
 +C<error> parameter to C<completion_callback> is set, or if
 +L<nbd_aio_command_completed(3)> reports failure.  Other parameters behave
 +as documented in L<nbd_pread_structured(3)>."
  ^ strict_call_description;
      see_also = [SectionLink "Issuing asynchronous commands";
                  Link "aio_pread"; Link "pread_structured";

2026

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

Re: [Libguestfs] [libnbd PATCH] docs: Clarify how callbacks should handle errors