Although rare, it is possible that nbd_internal_run(cmd_issue) will
report failure (perhaps because the server disconnected at the wrong
time), even though we have queued the user's command. If we have a
valid cookie, we MUST return it for the sake of users that will be
calling nbd_aio_command_complete, as otherwise the user has no idea
what cookie to wait on. Ignoring the state machine failure is okay;
the whole idea of an asynch command is that the user will be calling
more APIs to track the eventual completion, and will eventually learn
that the state machine has moved to DEAD.
---
lib/rw.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/lib/rw.c b/lib/rw.c
index 51ee691..6d37daa 100644
--- a/lib/rw.c
+++ b/lib/rw.c
@@ -229,6 +229,12 @@ nbd_internal_command_common (struct nbd_handle *h,
/* Add the command to the end of the queue. Kick the state machine
* if there is no other command being processed, otherwise, it will
* be handled automatically on a future cycle around to READY.
+ * Beyond this point, we have to return a cookie to the user, since
+ * we are queuing the command, even if kicking the state machine
+ * detects a failure. Not reporting a state machine failure here is
+ * okay - any caller of an async command will be calling more API to
+ * await results, and will eventually learn that the machine has
+ * moved on to DEAD at that time.
*/
h->in_flight++;
if (h->cmds_to_issue != NULL) {
@@ -240,7 +246,7 @@ nbd_internal_command_common (struct nbd_handle *h,
h->cmds_to_issue = h->cmds_to_issue_tail = cmd;
if (nbd_internal_is_state_ready (get_next_state (h)) &&
nbd_internal_run (h, cmd_issue) == -1)
- return -1;
+ nbd_internal_debug (h, "command queued, ignoring state machine
failure");
}
return cmd->cookie;
--
2.20.1