"Richard W.M. Jones" <rjones(a)redhat.com> writes:
On Sat, May 27, 2023 at 03:35:38PM +0200, Jürgen Hötzel wrote:
> Fixes deadlocks on OCaml5 when trying to get the lock that is already
> held:
>
> Fatal error during lock: Resource deadlock avoided
> ---
> ocaml/guestfs-c.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/ocaml/guestfs-c.c b/ocaml/guestfs-c.c
> index 3888c9456..bcf8e6ab3 100644
> --- a/ocaml/guestfs-c.c
> +++ b/ocaml/guestfs-c.c
> @@ -395,12 +395,16 @@ event_callback_wrapper (guestfs_h *g,
> /* Ensure we are holding the GC lock before any GC operations are
> * possible. (RHBZ#725824)
> */
> - caml_leave_blocking_section ();
> + bool in_blocking_section = (caml_state == NULL);
> +
> + if (in_blocking_section)
> + caml_leave_blocking_section ();
>
> event_callback_wrapper_locked (g, data, event, event_handle, flags,
> buf, buf_len, array, array_len);
>
> - caml_enter_blocking_section ();
> + if (in_blocking_section)
> + caml_enter_blocking_section ();
> }
I don't understand the reason why this patch is needed.
the event_callback_wrapper is ALSO called from libguestfs code that already
holds the OCaml domain lock. (when blocking = false).
My Patch is just a (confusing) workaround to fix this issue at runtime.
An better solution might be using different callbacks at compile time:
event_callback_wrapper_locked (when called from a non-blocking function)
event_callback_wrapper (when called from a blocking function).
This problem also occurs only with OCaml 5
Minimal invalid OCAML/C Code:
============================== stubs.c ===============================
#include <caml/alloc.h>
#include <caml/custom.h>
#include <caml/memory.h>
#include <caml/mlvalues.h>
void dummy_leave_blocking() {
CAMLparam0();
/* we did not call caml_enter_blocking_section_before in C */
caml_leave_blocking_section();
caml_enter_blocking_section();
CAMLreturn0;
}
============================== main.ml ===============================
external dummy_leave_blocking : unit -> unit = "dummy_leave_blocking"
let () =
Printf.printf "Hello from OCaml %s\n%!" Sys.ocaml_version;
dummy_leave_blocking();
Printf.printf "Bye from OCaml %s\n%!" Sys.ocaml_version;
======================================================================
Using Ocaml 5 results in crash:
# dune exec -- bin/main.exe
Hello from OCaml 5.0.0
Fatal error: Fatal error during lock: Resource deadlock avoided
Aborted (core dumped)
VS OCaml 4:
# dune exec -- bin/main.exe
Hello from OCaml 4.11.1
Bye from OCaml 4.11.1
Jürgen