Re: [Libguestfs] libnbd: When are callbacks freed

Thursday, 13 July 2023

On Thu, Jul 13, 2023 at 03:05:30PM +0000, Tage Johansson wrote:
...

 On 7/13/2023 4:36 PM, Eric Blake wrote:
 >On Thu, Jul 13, 2023 at 01:37:49PM +0000, Tage Johansson wrote:
 >>On 7/13/2023 3:26 PM, Richard W.M. Jones wrote:
 >>>On Thu, Jul 13, 2023 at 08:01:09AM -0500, Eric Blake wrote:
 >>>>On Thu, Jul 13, 2023 at 07:13:37AM -0500, Eric Blake wrote:
 >>>>>>>I have replaced a call to `nbd_opt_info()` with a call to
 >>>>>>>`nbd_aio_opt_info()` and passed in a completion callback
which just
 >>>>>>>calls `exit(EXIT_FAILURE)`. So if the completion callback is
called
 >>>>>>>the test should fail, which it doesn't, at least not on
my machine.
 >>>>>>Isn't that OK?  Only .free is required to be called.
 >>>>>For the context callback (for opt_set_meta), .callback may be called
 >>>>>zero, one, or multiple times, but .free should be called exactly
once.
 >>>>>But for the completion callback, I agree that the docs state that
both
 >>>>>.callback and .free should each be called exactly once ("On the
other
 >>>>>hand, if a completion callback is supplied (only possible with
 >>>>>asynchronous commands), it will always be reached exactly once, and
 >>>>>the completion callback must not ignore the value pointed to by
 >>>>>C<error>."); we are indeed missing the call to .callback. 
I'll work
 >>>>>on a patch.
 >>>>Eww, the bug is bigger than just nbd_aio_opt* not always calling
 >>>>completion.callback exactly once.  With just this diff (to be folded
 >>>>into the larger patch I'm working on), I'm getting an assertion
 >>>>failure that we fail to call completion.callback for
 >>>>nbd_aio_pread_structured when calling nbd_close() prior to the command
 >>>>running to completion, so I'll have to fix that too.
 >>>Just to be clear about this, are we really sure the completion
 >>>callback should always be called once?  I'm not clear why that should
 >>>be the case.  (The .free callback however should be.)
 >>>
 >>>For example, if I call a function with bogus invalid parameters so
 >>>that it fails very early on (or when the handle is in an incorrect
 >>>state), should I expect the completion callback to be called?  I would
 >>>expect not.
 >>>
 >>>Rich.
 >>
 >>The user needs a way to know if an error occurred. So the completion
 >>callback must be called if the asynchronous function did not fail (returned
 >>0). If the completion callback should be called, with the error parameter
 >>set, even when the asynchronous function immediately failed with a non-zero
 >>return value is another question. I see two possibilities: Either the
 >>completion callback should always be called. Or it should be called iff the
 >>asynchronous function returned 0 (did not fail).
 >Good point.
 >
 >Our documentation currently states (docs/libnbd.pod) that the
 >completion callback is ALWAYS called, but this is inconsistent with
 >current code - you are correct that at present, the completion
 >callback is NOT called if the aio command returns non-zero (easy test:
 >call nbd_aio_block_status before nbd_connect*).  I'm game to updating
 >that part of the documentation to match existing practice (changing
 >our guarantee to the completion callback will be called once iff the
 >aio command reported success), since our testsuite wasn't covering it,
 >and it is probably an easier fix than munging the generator to call
 >completion.callback even for aio failures.  Meanwhile, the .free
 >callback MUST be called unconditionally, and I think our testsuite is
 >already covering that.
 >
 >Life-cycle wise, I see the following sequence as being something we
 >could usefully rely on (although we don't yet have enough testsuite
 >coverage to prove that we already have it).  Note that it is not only
 >important how many times things are called, but I would like it if we
 >can guarantee the specific ordering between them (neither .free should
 >be called until all .callback opportunities are complete finished, to
 >avoid any use-after-free issues regardless of which of the two .free
 >slots the programmer actually uses):
 >
 >- if aio command fails:
 >mid-command .callback: 0 calls
 >completion .callback: 0 calls
 >mid-command .free: 1 call
 >completion .free: 1 call
 >
 >- if aio command succeeds:
 >mid-command .callback: 0, 1, or multiple times
 >completion .callback: 1 call
 >mid-command .free: 1 call
 >completion .free: 1 call
 >
 >What I'm not sure of yet (without more code inspection) is whether we
 >can sometimes call completion.callback after mid-command.free.
 >
 >>
 >>By the way, if the error parameter is set in the completion callback, how
 >>can the user retrieve the error text? Is it possible to call
 >>`get_nbd_error(3)` from inside the completion callback?
 >You mean nbd_get_error().  We currently document that it is not safe
 >to call any nbd_* from within a callback.  Maybe we could relax that
 >to carve out exceptions for nbd_get_error/errno as special cases,
 >since they only inspect thread-local state and can be used even when
 >there is no current nbd_handle.  The problem is that I'm not sure if
 >there is a reliable string at that point in time: part of the reason
 >the completion callback has an errnum parameter is that the point of
 >reporting the completion failure may be in a different thread than
 >when the failure was first detected, and we only preserved a numeric
 >value in cmd->error rather than also preserving a string.
 >
 >Put another way, there is no way to guarantee that nbd_get_errno()
 >will return the same value as the errnum parameter to the callback;
 >and if that is true, then even if calling nbd_get_error() doesn't
 >crash or deadlock from within a callback, it is likewise probable that
 >any such string returned is not consistent with the errnum parameter
 >passed to the callback.

 So is there any safe way to get some description of the error from a
 completion callback apart from a non-zero number? It isn't too
 helpful to report to the user that the read operation faild with -1. 
As I recall, from the callback, no.

The error is not lost however, it will be returned by the API call
itself.  eg. If you're in nbd_aio_opt_list -> callback (error) then
nbd_aio_opt_list will return -1 and at that point you can use
nbd_get_error to report the error.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

Re: [Libguestfs] libnbd: When are callbacks freed