Re: [Libguestfs] [PATCH libnbd 2/2] api: Implement local command with systemd socket activation.

Friday, 27 September 2019

On 9/27/19 7:28 AM, Richard W.M. Jones wrote:

...
>> +      /* We must unset CLOEXEC on the fd.  (dup2 above does
this
>> +       * implicitly because CLOEXEC is set on the fd, not on the
>> +       * socket).
>> +       */
>> +      flags = fcntl (s, F_GETFD, 0);
>> +      if (flags == -1) {
>> +        perror ("fcntl: F_GETFD");
>
> perror is not async-signal-safe.  Calling it in a child of a
> potentially-multi-threaded parent is therefore prone to deadlock, if
> perror() triggers a request to grab any resource that was left
> locked by a different thread holding the lock but stranded by the
> fork.

 I suspect these sort of problems are unsolvable.  I've studied enough
 customer bug reports where we've had to reverse engineer exactly how
 something failed given only a slim error log to know that error
 messages are always useful and I wouldn't want to remove these.
 However because of the chance of deadlock, how about a deadlock-free
 function that write(2)'s the message + errno to stderr? 
Yes, sticking to write() is better than nothing.  Open-coding an 
int-to-string for displying errno is doable, or bypassing strerror() and 
going straight for the non-portable sys_errlist[errno] lookup (if 
configure says sys_errlist is available) can also be used - it's a 
question of how nice do you want to make things, while still trying to 
remain safe.

...
>> +
>> +    snprintf (listen_pid, sizeof listen_pid, "%ld", (long) getpid
());
>> +    setenv ("LISTEN_PID", listen_pid, 1);
>> +    setenv ("LISTEN_FDS", "1", 1);

 And we could use similar code here. 
Yeah, if we're going to open-code a signal-safe int-to-string, we have 
several uses for it :)

...

> snprintf() and setenv() are also not async-signal-safe.  Which is a
> bummer, since you really don't know the child pid until in the
> child. You could open-code the translation of a decimal number into
> a buffer without snprintf, but you also have to figure out how to
> safely put it into the environ seen by the child.  How does systemd
> do it?

 For setenv(), systemd does this by building a char **environ which is
 passed to execve.  The code is complicated to say the least:

https://github.com/systemd/systemd/blob/5ac1530eca652cd16819853fe06e76e15...

Can we pre-build the bulk of the replacement environ in the parent, 
leaving just the one entry to munge in the child?  At least that way we 
can malloc without worries.

...

>> +    _exit (EXIT_FAILURE);
>
> Is EXIT_FAILURE the best here, or should we consider exiting with
> status 126/127 to match what other programs (sh, env, nice, nohup,
> ...) do when execvp() fails?

 Yes (this is also a bug elsewhere).  Not sure if 126 or 127 is
 right:

 https://www.gnu.org/software/bash/manual/html_node/Exit-Status.html 
Depends on the value of errno after the fact.  ENOENT => 127, all others 
126.

...

 ...
> Do we document that nbd_close() may block?

 We should do.  I've added it in a separate patch.

 I'll have a rethink about this patch and come back with something
 better in a while.

 Rich.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

Re: [Libguestfs] [PATCH libnbd 2/2] api: Implement local command with systemd socket activation.