The python docs are clear that in a multi-threaded app, we CANNOT call
any python functions from C code (other than a short-list of
exceptions) without first owning the GIL in the calling thread.
Although many users of nbdsh are single-threaded (where the GIL does
not matter), it is indeed possible to write a python script that
creates python threads to manage h.poll(-1) in one thread while using
other threads to trigger aio_pread and aio_pwrite requests, so
callbacks can be reached from a different thread than the
corresponding thread that initiated the nbd I/O request.
Unfortunately, our generated code was not doing it correctly (for
example, debug_wrapper() to handle h.set_debug_callback() calls
Py_BuildValue outside PyGILState_Ensure()). But it appears that we
have so far been lucky that our callback actions generally occur based
on state machine actions triggered by another Python-based action in
the same thread, coupled with the fact that we have not been
explicitly releasing the GIL around calls into libnbd C API, so in
practice all our callbacks have already owned the GIL in spite of
coding it in the wrong place. However, I cannot rule out the risk of
a mysterious crash in a multi-threaded python program using the nbd
module. Worse, holding the GIL across a potentially-blocking C
operation is bad form; no other Python thread can make progress if we
don't release the GIL.
Fix this by widening the scope of PyGILState in all callback wrappers
(all other code calling into Python is assumed to already own the
GIL), and by dropping the GIL around calls into libnbd C code.
Note that the Python docs also state that manipulation of PyGILState
is incompatible with the use of sub-interpreters with
Py_NewInterpreter(); but if someone actually wants to write a complex
program that uses sub-interpreters to interact with libnbd via Python
independently from some other use of python, rather than just directly
interacting with libnbd via C, I'd be surprised.
Fixes: 936488d4 ("python: Implement Callback properly.", v0.1)
---
generator/Python.ml | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/generator/Python.ml b/generator/Python.ml
index e94f37b..f196e8e 100644
--- a/generator/Python.ml
+++ b/generator/Python.ml
@@ -158,7 +158,7 @@ let
pr " const struct user_data *data = user_data;\n";
pr " int ret = -1;\n";
pr "\n";
- pr " PyGILState_STATE py_save = PyGILState_UNLOCKED;\n";
+ pr " PyGILState_STATE py_save = PyGILState_Ensure();\n";
pr " PyObject *py_args, *py_ret;\n";
List.iter (
function
@@ -227,9 +227,7 @@ let
pr ");\n";
pr " if (!py_args) { PyErr_PrintEx (0); goto out; }\n";
pr "\n";
- pr " py_save = PyGILState_Ensure ();\n";
pr " py_ret = PyObject_CallObject (data->fn, py_args);\n";
- pr " PyGILState_Release (py_save);\n";
pr "\n";
pr " Py_DECREF (py_args);\n";
pr "\n";
@@ -268,6 +266,7 @@ let
| CBUInt _ | CBUInt64 _ -> ()
| CBArrayAndLen _ | CBMutable _ -> assert false
) cbargs;
+ pr " PyGILState_Release (py_save);\n";
pr " return ret;\n";
pr "}\n";
pr "\n"
@@ -478,7 +477,8 @@ let
| BytesPersistOut (n, _) ->
pr " if (nbd_internal_py_init_aio_buffer (%s) < 0) goto out;\n" n
| _ -> ()
- ) args;
+ ) args;
+ pr " Py_BEGIN_ALLOW_THREADS\n";
pr " ret = nbd_%s (" name;
pr_wrap ',' (fun () ->
pr "h";
@@ -487,6 +487,7 @@ let
| _, _, n -> pr ", %s" n
) params);
pr ");\n";
+ pr " Py_END_ALLOW_THREADS\n";
List.iter (
function
| Closure { cbname } -> pr " %s_user_data = NULL;\n" cbname
@@ -613,8 +614,10 @@ let
pr " struct user_data *data = user_data;\n";
pr "\n";
pr " if (data) {\n";
+ pr " PyGILState_STATE py_save = PyGILState_Ensure();\n";
pr " Py_XDECREF (data->fn);\n";
pr " Py_XDECREF (data->buf);\n";
+ pr " PyGILState_Release (py_save);\n";
pr " free (data);\n";
pr " }\n";
pr "}\n";
--
2.36.1