On 6/5/19 6:15 AM, Richard W.M. Jones wrote:
For each nbd_(unlocked_)?aio_is_* function, split it into an internal
function that tests the state and a publicly visible API function.  For
calls within the library, use the internal function.  This is simple
refactoring.
As a side effect this fixes a race condition identified by Eric Blake:
  Thread A                            Thread B
  (in a call that holds h->lock)      (calling nbd_aio_pread)
  --------------------------------------------------------------------
  h->state is processing
                                      checks nbd_aio_is_ready
                                        (it's false)
  h->state is moved to READY
                                      checks nbd_aio_is_processing
                                        (it's false)
                                      validation check fails
(However, the state was valid, so the validation check should have
succeeded.)
Fixes commit e63a11736930c381a79a8cc2d03844cfff5db3ef.
Thanks: Eric Blake for identifying the race condition above.
---
@@ -2841,19 +2841,20 @@ let generate_lib_api_c () =
       pr "  /* We can check the state outside the handle lock because\n";
       pr "   * the state is atomic.\n";
       pr "   */\n";
+      pr "  enum state state = h->state;\n";
       let tests =
         List.map (
           function
-          | Created -> "nbd_aio_is_created (h)"
-          | Connecting -> "nbd_aio_is_connecting (h)"
-          | Connected -> "nbd_aio_is_ready (h) || nbd_aio_is_processing (h)"
-          | Closed -> "nbd_aio_is_closed (h)"
-          | Dead -> "nbd_aio_is_dead (h)"
+          | Created -> "nbd_internal_is_state_created (state)"
+          | Connecting -> "nbd_internal_is_state_connecting (state)"
+          | Connected -> "nbd_internal_is_state_ready (state) || nbd_internal_is_state_processing (state)"
Yes, this fixes the race: this code is executed outside the lock, so it
must read h->state exactly once before using it in multiple spots.
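
Concretely, the C that this generator change emits for (say) a
Connected-state call ends up looking roughly like the sketch below.  This is
reconstructed from the hunk above, not pasted from the generated file, and
the set_error () arguments are only illustrative:

  enum state state = h->state;     /* read h->state exactly once (atomic) */
  if (!(nbd_internal_is_state_ready (state) ||
        nbd_internal_is_state_processing (state))) {
    set_error (EINVAL, "invalid state for this call");  /* illustrative */
    return -1;
  }

Because both predicates test the same local copy, the check can no longer
observe two different states.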
+++ b/lib/connect.c
@@ -38,16 +38,16 @@
 static int
 error_unless_ready (struct nbd_handle *h)
 {
-  if (nbd_unlocked_aio_is_ready (h))
+  if (nbd_internal_is_state_ready (h->state))
     return 0;
   /* Why did it fail? */
-  if (nbd_unlocked_aio_is_closed (h)) {
+  if (nbd_internal_is_state_closed (h->state)) {
     set_error (0, "connection is closed");
     return -1;
   }
-  if (nbd_unlocked_aio_is_dead (h))
+  if (nbd_internal_is_state_dead (h->state))
error_unless_ready() is called while the lock is held, so multiple reads
of h->state are fine here.
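
For context, the generated public wrappers take h->lock before calling the
corresponding nbd_unlocked_* function, roughly like this sketch (not the
exact generated code):

  int
  nbd_shutdown (struct nbd_handle *h)
  {
    int ret;

    pthread_mutex_lock (&h->lock);
    ret = nbd_unlocked_shutdown (h);
    pthread_mutex_unlock (&h->lock);
    return ret;
  }

With the lock held for the whole call, nothing else can move h->state, so
code running under the lock can read it as many times as it likes.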
+++ b/lib/disconnect.c
@@ -29,15 +29,14 @@
 int
 nbd_unlocked_shutdown (struct nbd_handle *h)
 {
-
-  if (nbd_unlocked_aio_is_ready (h) ||
-      nbd_unlocked_aio_is_processing (h)) {
+  if (nbd_internal_is_state_ready (h->state) ||
+      nbd_internal_is_state_processing (h->state)) {
Likewise safe for multiple h->state reads.
+/* NB: is_locked = false, may_set_error = false. */
+int
+nbd_unlocked_aio_is_created (struct nbd_handle *h)
+{
+  return nbd_internal_is_state_created (h->state);
+}
+
+/* NB: is_locked = false, may_set_error = false. */
+int
+nbd_unlocked_aio_is_connecting (struct nbd_handle *h)
+{
+  return nbd_internal_is_state_connecting (h->state);
+}
While we've fixed our internal uses, external callers may still have a
race when calling multiple nbd_aio_is_* functions in a row, since the
state can change between those functions. Is that a problem? Are there
any common state combinations (such as is_ready||is_processing) that
warrant their own API entry points so as to be race-free?
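
For example (purely hypothetical -- neither the name nor the entry point is
part of this patch), a generated predicate covering the common combination
would be race-free because it reads the state once:

  /* NB: is_locked = false, may_set_error = false. */
  int
  nbd_unlocked_aio_is_ready_or_processing (struct nbd_handle *h)
  {
    enum state state = h->state;  /* single atomic read */

    return nbd_internal_is_state_ready (state) ||
           nbd_internal_is_state_processing (state);
  }

A caller testing the same thing with separate nbd_aio_is_ready () and
nbd_aio_is_processing () calls can still see an inconsistent answer if the
state moves between the two calls.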
At any rate, this is better than what we've had, and does fix a real
race (even if it doesn't solve everything yet), so ACK.
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3226
Virtualization:  qemu.org | libvirt.org