[Libguestfs] Re: [PATCH libnbd] lib: Add nbd_get_subprocess_pid to return h->pid

Monday, 26 August 2024

On Sun, Aug 25, 2024 at 02:13:39PM GMT, Richard W.M. Jones wrote:
...
 While debugging an ugly nbdkit / OCaml crash[1] which only happened
 when nbdkit was run as a subprocess of libnbd, I needed a way to start
 gdb attached to nbdkit.  It's not possible to do this by running
 "gdb --args nbdkit -s ..." as a command for various reasons.

 My solution to this was (error checking removed for clarity):

   nbd_connect_command (nbd, { "nbdkit", "-s" /* ... */});
   char pid[100];
   snprintf (pid, sizeof pid, "%d", nbd->pid);
   if (fork () == 0) {
     execl ("/usr/bin/gdb", "gdb", "../server/nbdkit", pid, NULL);
     _exit (1);
   }
   while (access ("/tmp/continue", F_OK) != 0) {
     sleep (1);
   }
   /* Test case continues after the file is created. */

 In this case I was fishing directly into the private nbd_handle struct
 to get the pid field.  However this seems like a genuine (if rare)
 case for being able to get the process ID directly.

 The documentation is written to be clear that this is only useful for
 debugging cases, only works on some platforms, and shouldn't be relied
 on more generally.

Since we have given the appropriate disclaimer about it being mainly
useful to debugging sessions, I don't see a problem with including
this.
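
For anyone adapting the debugging recipe quoted above to the proposed accessor: the generator entry declares the return type as RInt64, so the value would presumably be an int64_t rather than an int, and "%d" is no longer the right format. The following is only a rough sketch under that assumption. The helper names format_pid_arg and wait_for_file are hypothetical and are not part of libnbd or of the patch.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: format a 64-bit PID as a decimal string, e.g. to
 * pass as the process argument to gdb.  Returns 0 on success, or -1 if
 * the PID is negative (an error return) or the buffer is too small. */
int
format_pid_arg (int64_t pid, char *buf, size_t len)
{
  int n;

  assert (buf != NULL && len > 0);
  if (pid < 0)
    return -1;
  n = snprintf (buf, len, "%" PRId64, pid);
  return (n >= 0 && (size_t) n < len) ? 0 : -1;
}

/* Hypothetical helper: poll for a sentinel file, giving up after
 * max_tries checks instead of looping forever.  Returns 0 once the
 * file exists, or -1 if it never appeared. */
int
wait_for_file (const char *path, unsigned max_tries, unsigned delay_secs)
{
  unsigned i;

  assert (path != NULL);
  for (i = 0; i < max_tries; i++) {
    if (access (path, F_OK) == 0)
      return 0;
    if (delay_secs > 0)
      sleep (delay_secs);
  }
  return -1;
}
```

A caller would pass the result of nbd_get_subprocess_pid to format_pid_arg, fork and exec gdb with the resulting string, then call something like wait_for_file ("/tmp/continue", 600, 1) before continuing the test.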

...

 [1]
https://discuss.ocaml.org/t/free-uninitalized-data-when-calling-caml-c-th...

Quite a debug session you went through there.

...
 +
 +  "get_subprocess_pid", {
 +    default_call with
 +    args = []; ret = RInt64;
 +    shortdesc = "get the process ID of the subprocess";
 +    longdesc = "\
 +For connections which create a subprocess such as
 +L<nbd_connect_command(3)>, this returns the process ID (PID)
 +of the subprocess.  This is only supported on some platforms.
 +
 +This is mainly useful in debugging cases.  For example we used
 +this to attach L<gdb(1)> to an nbdkit subprocess that was crashing
 +inside a plugin.";

The first-person past-tense tone of the sentence is a bit out of
character with the rest of the documentation.  Maybe simplify it to
just:

For example, this could be used to learn where to attach L<gdb(1)> to
diagnose a crash in an nbdkit subprocess.

...
 +++ b/tests/get-subprocess-pid.c 
...
 +int
 +main (int argc, char *argv[])
 +{
 +  struct nbd_handle *nbd;
 +  int64_t pid;
 +  const char *cmd[] = { NBDKIT, "-s", "--exit-with-parent",
 +                        "memory", "size=1m", NULL };

Is there any (easy?) way to get nbdkit to confess its own pid, so the
test can check that it matches what we queried from libnbd?  Maybe we
want to add a new mode, 'nbdkit info mode=pid', which exposes its own
pid as the lone big-endian 8-byte value of the export?  (Of course,
willingly exposing your pid over the network makes it that much easier
for clients to try and compromise that process - but it's no worse than
'nbdkit info mode=time' documenting that exposing the server's
wallclock time may be insecure.)

Reviewed-by: Eric Blake <eblake(a)redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org
