On 9/27/19 7:28 AM, Richard W.M. Jones wrote:
>> + /* We must unset CLOEXEC on the fd. (dup2 above does
this
>> + * implicitly because CLOEXEC is set on the fd, not on the
>> + * socket).
>> + */
>> + flags = fcntl (s, F_GETFD, 0);
>> + if (flags == -1) {
>> + perror ("fcntl: F_GETFD");
>
> perror is not async-signal-safe. Calling it in a child of a
> potentially-multi-threaded parent is therefore prone to deadlock, if
> perror() triggers a request to grab any resource that was left
> locked by a different thread holding the lock but stranded by the
> fork.
I suspect these sort of problems are unsolvable. I've studied enough
customer bug reports where we've had to reverse engineer exactly how
something failed given only a slim error log to know that error
messages are always useful and I wouldn't want to remove these.
However because of the chance of deadlock, how about a deadlock-free
function that write(2)'s the message + errno to stderr?
Yes, sticking to write() is better than nothing. Open-coding an
int-to-string for displying errno is doable, or bypassing strerror() and
going straight for the non-portable sys_errlist[errno] lookup (if
configure says sys_errlist is available) can also be used - it's a
question of how nice do you want to make things, while still trying to
remain safe.
>> +
>> + snprintf (listen_pid, sizeof listen_pid, "%ld", (long) getpid
());
>> + setenv ("LISTEN_PID", listen_pid, 1);
>> + setenv ("LISTEN_FDS", "1", 1);
And we could use similar code here.
Yeah, if we're going to open-code a signal-safe int-to-string, we have
several uses for it :)
> snprintf() and setenv() are also not async-signal-safe. Which is a
> bummer, since you really don't know the child pid until in the
> child. You could open-code the translation of a decimal number into
> a buffer without snprintf, but you also have to figure out how to
> safely put it into the environ seen by the child. How does systemd
> do it?
For setenv(), systemd does this by building a char **environ which is
passed to execve. The code is complicated to say the least:
https://github.com/systemd/systemd/blob/5ac1530eca652cd16819853fe06e76e15...
Can we pre-build the bulk of the replacement environ in the parent,
leaving just the one entry to munge in the child? At least that way we
can malloc without worries.
>> + _exit (EXIT_FAILURE);
>
> Is EXIT_FAILURE the best here, or should we consider exiting with
> status 126/127 to match what other programs (sh, env, nice, nohup,
> ...) do when execvp() fails?
Yes (this is also a bug elsewhere). Not sure if 126 or 127 is
right:
https://www.gnu.org/software/bash/manual/html_node/Exit-Status.html
Depends on the value of errno after the fact. ENOENT => 127, all others
126.
...
> Do we document that nbd_close() may block?
We should do. I've added it in a separate patch.
I'll have a rethink about this patch and come back with something
better in a while.
Rich.
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3226
Virtualization:
qemu.org |
libvirt.org