On 5/31/23 20:29, Jürgen Hötzel wrote:
"Richard W.M. Jones" <rjones(a)redhat.com> writes:
> On Sat, May 27, 2023 at 03:35:38PM +0200, Jürgen Hötzel wrote:
>> Fixes deadlocks on OCaml5 when trying to get the lock that is already
>> held:
>>
>> Fatal error during lock: Resource deadlock avoided
>> ---
>> ocaml/guestfs-c.c | 8 ++++++--
>> 1 file changed, 6 insertions(+), 2 deletions(-)
>>
>> diff --git a/ocaml/guestfs-c.c b/ocaml/guestfs-c.c
>> index 3888c9456..bcf8e6ab3 100644
>> --- a/ocaml/guestfs-c.c
>> +++ b/ocaml/guestfs-c.c
>> @@ -395,12 +395,16 @@ event_callback_wrapper (guestfs_h *g,
>> /* Ensure we are holding the GC lock before any GC operations are
>> * possible. (RHBZ#725824)
>> */
>> - caml_leave_blocking_section ();
>> + bool in_blocking_section = (caml_state == NULL);
>> +
>> + if (in_blocking_section)
>> + caml_leave_blocking_section ();
>>
>> event_callback_wrapper_locked (g, data, event, event_handle, flags,
>> buf, buf_len, array, array_len);
>>
>> - caml_enter_blocking_section ();
>> + if (in_blocking_section)
>> + caml_enter_blocking_section ();
>> }
>
> I don't understand the reason why this patch is needed.
the event_callback_wrapper is ALSO called from libguestfs code that already
holds the OCaml domain lock. (when blocking = false).
Where does that happen? I don't see it.
Either way, wherever it happens: that particular call site should be
modified to call event_callback_wrapper_locked() directly, IMO.
event_callback_wrapper() does nothing other than bracket
event_callback_wrapper_locked() with leave+enter, so if another call
site needs everything that event_callback_wrapper() does *except*
leave+enter (because it already holds the OCaml domain lock), then that
site should just directly call event_callback_wrapper_locked(). For this
purpose, event_callback_wrapper_locked() may have to be made a public
function (but again I don't know where that new call is supposed to come
from).
Laszlo
My Patch is just a (confusing) workaround to fix this issue at runtime.
An better solution might be using different callbacks at compile time:
event_callback_wrapper_locked (when called from a non-blocking function)
event_callback_wrapper (when called from a blocking function).
This problem also occurs only with OCaml 5
Minimal invalid OCAML/C Code:
============================== stubs.c ===============================
#include <caml/alloc.h>
#include <caml/custom.h>
#include <caml/memory.h>
#include <caml/mlvalues.h>
void dummy_leave_blocking() {
CAMLparam0();
/* we did not call caml_enter_blocking_section_before in C */
caml_leave_blocking_section();
caml_enter_blocking_section();
CAMLreturn0;
}
============================== main.ml ===============================
external dummy_leave_blocking : unit -> unit = "dummy_leave_blocking"
let () =
Printf.printf "Hello from OCaml %s\n%!" Sys.ocaml_version;
dummy_leave_blocking();
Printf.printf "Bye from OCaml %s\n%!" Sys.ocaml_version;
======================================================================
Using Ocaml 5 results in crash:
# dune exec -- bin/main.exe
Hello from OCaml 5.0.0
Fatal error: Fatal error during lock: Resource deadlock avoided
Aborted (core dumped)
VS OCaml 4:
# dune exec -- bin/main.exe
Hello from OCaml 4.11.1
Bye from OCaml 4.11.1
Jürgen
_______________________________________________
Libguestfs mailing list
Libguestfs(a)redhat.com
https://listman.redhat.com/mailman/listinfo/libguestfs