On Thu, Sep 26, 2019 at 03:56:27PM -0500, Eric Blake wrote:
>+ /* Use /tmp instead of TMPDIR because we must ensure the path
is
>+ * short enough to store in the sockaddr_un. On some platforms this
>+ * may cause problems so we may need to revisit it. XXX
>+ */
Is the use of socketpair() any better than creating a socket under /tmp?
AIUI socketpair() can't be used to create a listening socket. Systemd
socket activation has two modes -- one where you pass an already
accepted socket, and one where you pass a listening socket and the
server accepts connections until it is killed. In nbdkit and qemu-nbd
we are using the latter one.
...
If we fail here or later, do/should we try to clean up the
/tmp/libnbdXXX directory created earlier?
/me reads ahead - nbd_close tries to address it
Still, if we fail at this point, h->sa_sockpath is set but not yet
created [1]
I modified the code so that it doesn't try to delete literal
/tmp/libnbdXXXXXX on some error paths and should be more robust.
>+ close (s);
>+ }
>+ else {
>+ /* We must unset CLOEXEC on the fd. (dup2 above does this
>+ * implicitly because CLOEXEC is set on the fd, not on the
>+ * socket).
>+ */
>+ flags = fcntl (s, F_GETFD, 0);
>+ if (flags == -1) {
>+ perror ("fcntl: F_GETFD");
perror is not async-signal-safe. Calling it in a child of a
potentially-multi-threaded parent is therefore prone to deadlock, if
perror() triggers a request to grab any resource that was left
locked by a different thread holding the lock but stranded by the
fork.
I suspect these sort of problems are unsolvable. I've studied enough
customer bug reports where we've had to reverse engineer exactly how
something failed given only a slim error log to know that error
messages are always useful and I wouldn't want to remove these.
However because of the chance of deadlock, how about a deadlock-free
function that write(2)'s the message + errno to stderr?
>+ _exit (EXIT_FAILURE);
>+ }
>+ if (fcntl (s, F_SETFD, flags & ~FD_CLOEXEC) == -1) {
>+ perror ("fcntl: F_SETFD");
>+ _exit (EXIT_FAILURE);
>+ }
About all we can _safely_ do is let our _exit() status inform the
parent process that something failed, and let the parent process
print the error message. But even passing errno as the _exit()
value is not portable (GNU Hurd errno values are intentionally
larger than what fit in 8-bit returns that the parent would see).
It's also possible to set up a cloexec pipe (in addition to
everything else) so that the child can write() error details into
the pipe, but that's a lot of effort compared to just blindly
stating that the child failed but declaring full details of why as
lost.
>+ }
>+
>+ snprintf (listen_pid, sizeof listen_pid, "%ld", (long) getpid ());
>+ setenv ("LISTEN_PID", listen_pid, 1);
>+ setenv ("LISTEN_FDS", "1", 1);
And we could use similar code here.
snprintf() and setenv() are also not async-signal-safe. Which is a
bummer, since you really don't know the child pid until in the
child. You could open-code the translation of a decimal number into
a buffer without snprintf, but you also have to figure out how to
safely put it into the environ seen by the child. How does systemd
do it?
For setenv(), systemd does this by building a char **environ which is
passed to execve. The code is complicated to say the least:
https://github.com/systemd/systemd/blob/5ac1530eca652cd16819853fe06e76e15...
>+ _exit (EXIT_FAILURE);
Is EXIT_FAILURE the best here, or should we consider exiting with
status 126/127 to match what other programs (sh, env, nice, nohup,
...) do when execvp() fails?
Yes (this is also a bug elsewhere). Not sure if 126 or 127 is
right:
https://www.gnu.org/software/bash/manual/html_node/Exit-Status.html
...
Do we document that nbd_close() may block?
We should do. I've added it in a separate patch.
I'll have a rethink about this patch and come back with something
better in a while.
Rich.
--
Richard Jones, Virtualization Group, Red Hat
http://people.redhat.com/~rjones
Read my programming and virtualization blog:
http://rwmj.wordpress.com
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW