On Wed, Feb 2, 2022 at 7:00 PM Eric Blake <eblake(a)redhat.com> wrote:
Recent patches have demonstrated confusion on which order callbacks
are reached, when it is safe or dangerous to ignore *error, and what a
completion callback should do when auto-retirement is in use. Add
wording to make it more obvious that:
- mid-command callbacks are reached before the completion callback,
and returning -1 does not prevent future callbacks
- ignoring *error in a mid-command callback is safe
- completion callbacks are reached unconditionally, and must NOT ignore
*error
- if auto-retirement is in use, completion callbacks should always use
it rather than trying to return -1 on error
- the contents of buf after nbd_pread and friends is unspecified on
error
---
docs/libnbd.pod | 44 +++++++++++++++++++++++++++++++++++---------
generator/API.ml | 24 ++++++++++++++++++------
2 files changed, 53 insertions(+), 15 deletions(-)
diff --git a/docs/libnbd.pod b/docs/libnbd.pod
index 32088f64..006d530c 100644
--- a/docs/libnbd.pod
+++ b/docs/libnbd.pod
@@ -870,15 +870,41 @@ still needs to be retired.
=head2 Callbacks with C<int *error> parameter
Some of the high-level commands (L<nbd_pread_structured(3)>,
-L<nbd_block_status(3)>) involve the use of a callback function invoked by
-the state machine at appropriate points in the server's reply before
-the overall command is complete. These callback functions, along with
-all of the completion callbacks, include a parameter C<error>
-containing the value of any error detected so far; if the callback
-function fails, it should assign back into C<error> and return C<-1>
-to change the resulting error of the overall command. Assignments
-into C<error> are ignored for any other return value; similarly,
-assigning C<0> into C<error> does not have an effect.
+L<nbd_block_status(3)>) involve the use of a callback function invoked
+by the state machine at appropriate points in the server's reply
+before the overall command is complete. These callback functions,
+along with all of the completion callbacks, include a parameter
+C<error> containing the value of any error detected so far. If a
Error is a pointer to the value
+callback function fails and wants to change the resulting error of
the
+overall command visible later in the API sequence, it should assign
+back into C<error> and return C<-1>. Assignments into C<error> are
+ignored for any other return value; similarly, assigning C<0> into
+C<error> does not have an effect.
But maybe the text about assigning into it is good enough to make this clear.
+
+Note that a mid-command callback might never be reached, such as if
+libnbd detects that the command was invalid to send (see
+L<nbd_set_strict_mode(3)>) or if the server reports a failure. It is
+safe for a mid-command callback to ignore non-zero C<error>: all the
+other parameters to the mid-command callback will still be valid
+(corresponding to the current portion of the server's reply), and the
+overall command will still fail (at the completion callback or
+L<nbd_aio_command_completed(3)> for an asynchronous command, or as the
+result of the overall synchronous command). Returing C<-1> from a
+mid-command callback does not prevent that callback from being reached
+again, if the server sends more mid-command replies that warrant
+another use of that callback. A mid-command callback may be reached
+more times than expected if the server is non-compliant.
+
+On the other hand, if a completion callback is supplied (only possible
+with asynchronous commands), it will always be reached exactly once,
+and the completion callback must not ignore the value of C<error>. In
+particular, the content of a buffer passed to L<nbd_aio_pread(3)> or
+L<nbd_aio_pread_structured(3)> is unspecified if C<error> is set on
+entry to the completion callback. It is recommended that if your code
+uses automatic command retirement, then the completion function should
+return C<1> on all control paths, even when handling errors (note that
+with automatic retirement, assigning into C<error> is pointless as
+there is no later API to see that value).
I think we miss a warning here, that if you don't use automatic retirement,
you *must* call nbd_aio_command_completed() to retire the command,
otherwise commands will leak until you close the handle.
The text should make it clear that the caller need to chose how to use
the library, either use callbacks or use completion checks.
Can we make this simpler by preventing usage of completion checks
if you define a completion callback? This will eliminate the auto retire
concept - if you define a callback, commands always retires.
=head1 COMPILING YOUR PROGRAM
diff --git a/generator/API.ml b/generator/API.ml
index cf2e7543..bdb8e7c8 100644
--- a/generator/API.ml
+++ b/generator/API.ml
@@ -1,6 +1,6 @@
(* hey emacs, this is OCaml code: -*- tuareg -*- *)
(* nbd client library in userspace: the API
- * Copyright (C) 2013-2021 Red Hat Inc.
+ * Copyright (C) 2013-2022 Red Hat Inc.
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
@@ -1829,7 +1829,9 @@ "pread", {
L<nbd_get_block_size(3)>.
The C<flags> parameter must be C<0> for now (it exists for future NBD
-protocol extensions)."
+protocol extensions).
+
+Note that if this command fails, the contents of C<buf> are unspecified."
^ strict_call_description;
see_also = [Link "aio_pread"; Link "pread_structured";
Link "get_block_size"; Link "set_strict_mode"];
@@ -1914,7 +1916,9 @@ "pread_structured", {
C<LIBNBD_CMD_FLAG_DF> meaning that the server should not reply with
more than one fragment (if that is supported - some servers cannot do
this, see L<nbd_can_df(3)>). Libnbd does not validate that the server
-actually obeys the flag."
+actually obeys the flag.
+
+Note that if this command fails, the contents of C<buf> are unspecified."
^ strict_call_description;
see_also = [Link "can_df"; Link "pread";
Link "aio_pread_structured"; Link "get_block_size";
@@ -2155,7 +2159,8 @@ "block_status", {
for contexts known to libnbd, B<E<lt>libnbd.hE<gt>> contains
constants
beginning with C<LIBNBD_STATE_> that may help decipher the values.
On entry to the callback, the C<error> parameter contains the errno
-value of any previously detected error.
+value of any previously detected error, but even if an earlier error
+was detected, the current C<metacontext> and C<entries> are valid.
It is possible for the extent function to be called
more times than you expect (if the server is buggy),
@@ -2454,7 +2459,10 @@ "aio_pread", {
as described in L<libnbd(3)/Completion callbacks>.
Note that you must ensure C<buf> is valid until the command has
-completed. Other parameters behave as documented in L<nbd_pread(3)>."
+completed. Furthermore, the contents of C<buf> are unspecified if the
+C<error> parameter to C<completion_callback> is set, or if
+L<nbd_aio_command_completed(3)> reports failure. Other parameters behave
+as documented in L<nbd_pread(3)>."
^ strict_call_description;
example = Some "examples/aio-connect-read.c";
see_also = [SectionLink "Issuing asynchronous commands";
@@ -2478,7 +2486,11 @@ "aio_pread_structured", {
Or supply the optional C<completion_callback> which will be invoked
as described in L<libnbd(3)/Completion callbacks>.
-Other parameters behave as documented in L<nbd_pread_structured(3)>."
+Note that you must ensure C<buf> is valid until the command has
+completed. Furthermore, the contents of C<buf> are unspecified if the
+C<error> parameter to C<completion_callback> is set, or if
+L<nbd_aio_command_completed(3)> reports failure. Other parameters behave
+as documented in L<nbd_pread_structured(3)>."
^ strict_call_description;
see_also = [SectionLink "Issuing asynchronous commands";
Link "aio_pread"; Link "pread_structured";
--
Otherwise this makes error handling much more clear.
Nir