[libnbd PATCH v4 0/3] lib/utils: add async-signal-safe assert()


Laszlo Ersek

Wednesday, 15 March 2023

6:01 a.m.

This is version 4 of the following sub-series:

  [libnbd PATCH v3 06/29] lib/utils: introduce xwrite() as a more robust write()
  [libnbd PATCH v3 07/29] lib/utils: add async-signal-safe assert()
  [libnbd PATCH v3 08/29] lib/utils: add unit test for async-signal-safe assert()

  http://mid.mail-archive.com/20230215141158.2426855-7-lersek@redhat.com
  http://mid.mail-archive.com/20230215141158.2426855-8-lersek@redhat.com
  http://mid.mail-archive.com/20230215141158.2426855-9-lersek@redhat.com

The Notes section on each patch records the updates for that patch.

Thanks for reviewing,
Laszlo

Laszlo Ersek (3):
  lib/utils: introduce xwritel() as a more robust and convenient write()
  lib/utils: add async-signal-safe assert()
  lib/utils: add unit test for async-signal-safe assert()

 .gitignore                   |   2 +
 configure.ac                 |   5 +
 lib/Makefile.am              |  38 +++++++-
 lib/internal.h               |  13 +++
 lib/test-fork-safe-assert.c  |  66 +++++++++++++
 lib/test-fork-safe-assert.sh |  32 ++++++
 lib/utils.c                  | 103 +++++++++++++++++++-
 7 files changed, 251 insertions(+), 8 deletions(-)
 create mode 100644 lib/test-fork-safe-assert.c
 create mode 100755 lib/test-fork-safe-assert.sh

base-commit: bbf47ffd4ac48706cbdac080ad36d1250aa1c57b


Laszlo Ersek

Wednesday, 15 March

6:01 a.m.

New subject: [libnbd PATCH v4 1/3] lib/utils: introduce xwritel() as a more robust and convenient write()

While nbd_internal_fork_safe_perror() must indeed call write(), and arguably justifiably ignores the return value of write(), we can still make the write operations slightly more robust and convenient. Let's do that by introducing xwritel():

- Let the caller pass a list of NUL-terminated strings, via stdarg / ellipsis notation in the parameter list.

- Handle partial writes.

- Cope with EINTR and EAGAIN errors. (A busy loop around EAGAIN on a non-blocking file is not great in the general case, but it's good enough for writing out diagnostics before giving up.)

- In the common case, handle an nbd_internal_fork_safe_perror() call with a single xwritel() -> writev() call chain, rather than with four separate write() calls. In practice, this tends to make the error message written to a regular file contiguous, even if other threads are writing to the same file. Multiple separate write() calls tend to interleave chunks of data from different threads.

As a side bonus, remove the path in nbd_internal_fork_safe_perror() where at least one of the first two write() syscalls fails, and overwrites "errno", before we get to formatting the error string from "errno".

Thanks to Eric Blake for helping me understand the scope of Austin Group bug reports.

Signed-off-by: Laszlo Ersek <lersek(a)redhat.com>
---

Notes:
v4:

- Rework with <stdarg.h> and writev().

- Don't split the output into chunks of SSIZE_MAX bytes. In v3, the goal of that chunking was to avoid implementation-defined behavior. However, POSIX requires writev() to fail cleanly when more than SSIZE_MAX bytes would be transferred in a single call. Hence the original goal (avoiding implementation-defined behavior) is ensured simply by switching to writev(). The SSIZE_MAX limit is not expected to be hit in practice (_POSIX_SSIZE_MAX is 32767).

- As a "bonus", remove the pre-patch possibility to trample "errno" before formatting the error string.

- Refresh the commit message.

- The "contiguous output" from a single xwritel() -> writev() call (as opposed to the "interleaved output" from multiple xwrite() -> write() calls in v3) is easily testable in practice (on my end) with the following small patch, even though this "contiguity" is of course not guaranteed:

> diff --git a/generator/states-connect-socket-activation.c b/generator/states-connect-socket-activation.c
> index e4b3b565ae2e..c66c638d331f 100644
> --- a/generator/states-connect-socket-activation.c
> +++ b/generator/states-connect-socket-activation.c
> @@ -179,6 +179,8 @@ CONNECT_SA.START:
>       * socket).
>       */
>      int flags = fcntl (s, F_GETFD, 0);
> +    flags = -1;
> +    errno = EBADF;
>      if (flags == -1) {
>        nbd_internal_fork_safe_perror ("fcntl: F_GETFD");
>        _exit (126);

It results in the following snippet in "tests/connect-systemd-socket-activation.log":

> libnbd: debug: nbd1: nbd_connect_systemd_socket_activation: transition: CONNECT.CONNECTING -> MAGIC.START
> fcntl: F_GETFD: Bad file descriptor
> libnbd: debug: nbd1: nbd_connect_systemd_socket_activation: transition: MAGIC.START -> MAGIC.RECV_MAGIC

Note that the child process's output is on an isolated line.

- Do not pick up R-b tags from Eric and Rich due to significant changes in v4.

context:-U5

 lib/utils.c | 87 +++++++++++++++++++-
 1 file changed, 83 insertions(+), 4 deletions(-)

diff --git a/lib/utils.c b/lib/utils.c
index 6df4f14ce9f4..62b4bfdda5c3 100644
--- a/lib/utils.c
+++ b/lib/utils.c
@@ -23,11 +23,14 @@
 #include <string.h>
 #include <unistd.h>
 #include <ctype.h>
 #include <errno.h>
 #include <fcntl.h>
+#include <stdarg.h>
+#include <sys/uio.h>
 
+#include "array-size.h"
 #include "minmax.h"
 #include "internal.h"
 
 void
@@ -179,33 +182,109 @@ nbd_internal_fork_safe_itoa (long v, char *buf, size_t bufsize)
 #if defined (__GNUC__)
 #pragma GCC diagnostic push
 #pragma GCC diagnostic ignored "-Wunused-result"
 #endif
 
+/* "Best effort" function for writing out a list of NUL-terminated strings to a
+ * file descriptor (without the NUL-terminators). The list is terminated with
+ * (char *)NULL. Partial writes, and EINTR and EAGAIN failures are handled
+ * internally. No value is returned; only call this function for writing
+ * diagnostic data on error paths, when giving up on a higher-level action
+ * anyway.
+ *
+ * No more than 16 strings, excluding the NULL terminator, will be written. (As
+ * of POSIX Issue 7 + TC2, _XOPEN_IOV_MAX is 16.)
+ *
+ * The function is supposed to remain async-signal-safe.
+ *
+ * (The va_*() macros, while not marked async-signal-safe in Issue 7 + TC2, are
+ * considered such, per <https://www.austingroupbugs.net/view.php?id=711>, which
+ * is binding for Issue 7 implementations via the Interpretations Track.
+ *
+ * Furthermore, writev(), while also not marked async-signal-safe in Issue 7 +
+ * TC2, is considered such, per
+ * <https://www.austingroupbugs.net/view.php?id=1455>, which is slated for
+ * inclusion in Issue 7 TC3 (if there's going to be a TC3), and in Issue 8.)
+ */
+static void
+xwritel (int fildes, ...)
+{
+  /* Open-code the current value of _XOPEN_IOV_MAX, in order to contain stack
+   * footprint, should _XOPEN_IOV_MAX grow in the future.
+   */
+  struct iovec iovec[16], *filled, *end, *pos;
+  va_list ap;
+  char *arg;
+
+  /* Translate the variable argument list to IO vectors. Note that we cast away
+   * const-ness intentionally.
+   */
+  filled = iovec;
+  end = iovec + ARRAY_SIZE (iovec);
+  va_start (ap, fildes);
+  while (filled < end && (arg = va_arg (ap, char *)) != NULL)
+    *filled++ = (struct iovec){ .iov_base = arg, .iov_len = strlen (arg) };
+  va_end (ap);
+
+  /* Write out the IO vectors. */
+  pos = iovec;
+  while (pos < filled) {
+    ssize_t written;
+
+    /* Skip any empty vectors at the front. */
+    if (pos->iov_len == 0) {
+      ++pos;
+      continue;
+    }
+
+    /* Write out the vectors. */
+    do
+      written = writev (fildes, pos, filled - pos);
+    while (written == -1 && (errno == EINTR || errno == EAGAIN));
+
+    if (written == -1)
+      return;
+
+    /* Consume the vectors that have been written out (fully, or in part). Note
+     * that "written" is positive here.
+     */
+    do {
+      size_t advance;
+
+      advance = MIN (written, pos->iov_len);
+      /* Note that "advance" is positive here iff "pos->iov_len" is positive. */
+      pos->iov_base = (char *)pos->iov_base + advance;
+      pos->iov_len -= advance;
+      written -= advance;
+
+      /* At least one of "written" and "pos->iov_len" is zero here. */
+      if (pos->iov_len == 0)
+        ++pos;
+    } while (written > 0);
+  }
+}
+
 /* Fork-safe version of perror. ONLY use this after fork and before
  * exec, the rest of the time use set_error().
  */
 void
 nbd_internal_fork_safe_perror (const char *s)
 {
   const int err = errno;
   const char *m = NULL;
   char buf[32];
 
-  write (2, s, strlen (s));
-  write (2, ": ", 2);
 #ifdef HAVE_STRERRORDESC_NP
   m = strerrordesc_np (errno);
 #else
 #if HAVE_SYS_ERRLIST /* NB Don't use #ifdef */
   m = errno >= 0 && errno < sys_nerr ? sys_errlist[errno] : NULL;
 #endif
 #endif
   if (!m)
     m = nbd_internal_fork_safe_itoa ((long) errno, buf, sizeof buf);
-  write (2, m, strlen (m));
-  write (2, "\n", 1);
+  xwritel (STDERR_FILENO, s, ": ", m, "\n", (char *)NULL);
 
   /* Restore original errno in case it was disturbed by the system
    * calls above.
    */
   errno = err;

Eric Blake

9:01 a.m.


On Wed, Mar 15, 2023 at 12:01:55PM +0100, Laszlo Ersek wrote:

...

All nice benefits, even if we don't normally exercise the code.

...

Thanks to Eric Blake for helping me understand the scope of Austin Group bug reports.

Signed-off-by: Laszlo Ersek <lersek(a)redhat.com>
---

Notes:
v4:

- Rework with <stdarg.h> and writev().

- Don't split the output into chunks of SSIZE_MAX bytes. In v3, the goal of that chunking was to avoid implementation-defined behavior. However, POSIX requires writev() to fail cleanly when more than SSIZE_MAX bytes would be transferred in a single call. Hence the original goal (avoiding implementation-defined behavior) is ensured simply by switching to writev(). The SSIZE_MAX limit is not expected to be hit in practice (_POSIX_SSIZE_MAX is 32767).

Concur.

...

- As a "bonus", remove the pre-patch possibility to trample "errno" before formatting the error string.

Nice find; and one I missed in my earlier review.

...

- Refresh the commit message.

- The "contiguous output" from a single xwritel() -> writev() call (as opposed to the "interleaved output" from multiple xwrite() -> write() calls in v3) is easily testable in practice (on my end) with the following small patch, even though this "contiguity" is of course not guaranteed:

> diff --git a/generator/states-connect-socket-activation.c b/generator/states-connect-socket-activation.c
> index e4b3b565ae2e..c66c638d331f 100644
> --- a/generator/states-connect-socket-activation.c
> +++ b/generator/states-connect-socket-activation.c
> @@ -179,6 +179,8 @@ CONNECT_SA.START:
>       * socket).
>       */
>      int flags = fcntl (s, F_GETFD, 0);
> +    flags = -1;
> +    errno = EBADF;
>      if (flags == -1) {
>        nbd_internal_fork_safe_perror ("fcntl: F_GETFD");
>        _exit (126);

It results in the following snippet in "tests/connect-systemd-socket-activation.log":

> libnbd: debug: nbd1: nbd_connect_systemd_socket_activation: transition: CONNECT.CONNECTING -> MAGIC.START
> fcntl: F_GETFD: Bad file descriptor
> libnbd: debug: nbd1: nbd_connect_systemd_socket_activation: transition: MAGIC.START -> MAGIC.RECV_MAGIC

Note that the child process's output is on an isolated line.

Without the patch, it had a high likelihood (but not guarantee) of interleaving. And writev() is not a bulletproof avoidance of interleaving (if we hit a short write and retry the tail, interleaving is possible) - but we are very unlikely to see that in practice.
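The single-call gathering that makes interleaving unlikely can be tried in isolation. The following is only a sketch under stated assumptions: gather_message() is a hypothetical helper (not libnbd code), it writes into a pipe rather than to stderr, and it treats a short write as an error instead of retrying the tail as the patch does.

```c
#include <assert.h>
#include <string.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper, not part of libnbd: send the four parts of one
 * diagnostic message through a pipe with a single writev() call, then read
 * the message back.  Returns the number of bytes read into "out", or -1.
 */
static ssize_t
gather_message (char *out, size_t outsize)
{
  static const char *parts[] = {
    "fcntl: F_GETFD", ": ", "Bad file descriptor", "\n"
  };
  struct iovec iov[4];
  size_t total = 0;
  size_t i;
  int fds[2];
  ssize_t n;

  assert (out != NULL);
  for (i = 0; i < 4; ++i) {
    iov[i].iov_base = (char *)parts[i]; /* cast away const, like the patch */
    iov[i].iov_len = strlen (parts[i]);
    total += iov[i].iov_len;
  }

  if (pipe (fds) == -1)
    return -1;

  /* One system call for the whole message. */
  n = writev (fds[1], iov, 4);
  close (fds[1]);
  if (n != (ssize_t)total) {
    close (fds[0]);
    return -1;
  }

  n = read (fds[0], out, outsize);
  close (fds[0]);
  return n;
}
```

With four separate write() calls, another writer could slip output in between any two of them; with one writev() call that only becomes possible after a short write.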

...

- Do not pick up R-b tags from Eric and Rich due to significant changes in v4.

context:-U5

 lib/utils.c | 87 +++++++++++++++++++-
 1 file changed, 83 insertions(+), 4 deletions(-)

diff --git a/lib/utils.c b/lib/utils.c
index 6df4f14ce9f4..62b4bfdda5c3 100644
--- a/lib/utils.c
+++ b/lib/utils.c
@@ -23,11 +23,14 @@
 #include <string.h>
 #include <unistd.h>
 #include <ctype.h>
 #include <errno.h>
 #include <fcntl.h>
+#include <stdarg.h>
+#include <sys/uio.h>
 
+#include "array-size.h"
 #include "minmax.h"
 #include "internal.h"
 
 void
@@ -179,33 +182,109 @@ nbd_internal_fork_safe_itoa (long v, char *buf, size_t bufsize)
 #if defined (__GNUC__)
 #pragma GCC diagnostic push
 #pragma GCC diagnostic ignored "-Wunused-result"
 #endif
 
+/* "Best effort" function for writing out a list of NUL-terminated strings to a
+ * file descriptor (without the NUL-terminators). The list is terminated with
+ * (char *)NULL. Partial writes, and EINTR and EAGAIN failures are handled
+ * internally. No value is returned; only call this function for writing
+ * diagnostic data on error paths, when giving up on a higher-level action
+ * anyway.
+ *
+ * No more than 16 strings, excluding the NULL terminator, will be written. (As
+ * of POSIX Issue 7 + TC2, _XOPEN_IOV_MAX is 16.)
+ *
+ * The function is supposed to remain async-signal-safe.
+ *
+ * (The va_*() macros, while not marked async-signal-safe in Issue 7 + TC2, are
+ * considered such, per <https://www.austingroupbugs.net/view.php?id=711>, which
+ * is binding for Issue 7 implementations via the Interpretations Track.
+ *
+ * Furthermore, writev(), while also not marked async-signal-safe in Issue 7 +
+ * TC2, is considered such, per
+ * <https://www.austingroupbugs.net/view.php?id=1455>, which is slated for
+ * inclusion in Issue 7 TC3 (if there's going to be a TC3), and in Issue 8.)

Correct summary, and matches our off-list collaboration on determining what the Austin Group guarantees (while still waiting for their release of POSIX Issue 8 later this year).

...

+ */
+static void
+xwritel (int fildes, ...)
+{

Good candidate for __attribute__ ((sentinel)), so that -Wformat marks callers that fail to supply a trailing NULL argument.
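As a minimal sketch of what the attribute buys, assuming gcc or clang: count_strings() and example() below are hypothetical stand-ins for xwritel() and its caller, not code from the patch.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical helper, not part of libnbd: count the strings in a
 * NULL-terminated variable argument list.  With gcc or clang, the sentinel
 * attribute lets the compiler warn about calls whose last argument is not a
 * null pointer (gcc enables that warning with -Wformat).
 */
#if defined (__GNUC__)
__attribute__ ((sentinel))
#endif
static size_t
count_strings (const char *first, ...)
{
  va_list ap;
  const char *arg;
  size_t n = 0;

  va_start (ap, first);
  for (arg = first; arg != NULL; arg = va_arg (ap, const char *))
    ++n;
  va_end (ap);
  return n;
}

/* Correct call: the trailing (char *)NULL is the sentinel. */
static void
example (void)
{
  assert (count_strings ("fcntl: F_GETFD", ": ", "Bad file descriptor", "\n",
                         (char *)NULL) == 4);
}
```

Dropping the trailing `(char *)NULL` from the call in example() would still compile, but a compiler that understands the attribute should flag it; without the attribute, the mistake would only show up at run time as undefined behavior.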

...

+  /* Open-code the current value of _XOPEN_IOV_MAX, in order to contain stack
+   * footprint, should _XOPEN_IOV_MAX grow in the future.
+   */
+  struct iovec iovec[16], *filled, *end, *pos;
+  va_list ap;
+  char *arg;
+
+  /* Translate the variable argument list to IO vectors. Note that we cast away
+   * const-ness intentionally.
+   */
+  filled = iovec;
+  end = iovec + ARRAY_SIZE (iovec);
+  va_start (ap, fildes);
+  while (filled < end && (arg = va_arg (ap, char *)) != NULL)
+    *filled++ = (struct iovec){ .iov_base = arg, .iov_len = strlen (arg) };
+  va_end (ap);
+
+  /* Write out the IO vectors. */
+  pos = iovec;
+  while (pos < filled) {
+    ssize_t written;
+
+    /* Skip any empty vectors at the front. */
+    if (pos->iov_len == 0) {
+      ++pos;
+      continue;
+    }

In practice, writev() will handle 0-length iovs on all file types; but I agree with your effort to skip them since POSIX only guarantees behavior on regular files (and our typical fd of stderr is not always a regular file).

...

+
+    /* Write out the vectors. */
+    do
+      written = writev (fildes, pos, filled - pos);
+    while (written == -1 && (errno == EINTR || errno == EAGAIN));
+
+    if (written == -1)
+      return;
+
+    /* Consume the vectors that have been written out (fully, or in part). Note
+     * that "written" is positive here.

The POSIX wording is a bit tricky to read in this regard, but I think you are correct that write() (and therefore writev()) will never return 0 if passed a non-zero length on input: either a short write happens because of a signal before anything is written (return is -1, errno is EINTR or EAGAIN), or the short write occurred after partial write (return must be positive); the only time return can be 0 is if length was 0 but we don't have that issue.
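The retry condition can be isolated in a few lines. This is a sketch only: write_retry() is a hypothetical name, and, following the reading of POSIX above, it assumes that a call with a non-zero length returns either -1 or a positive count.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper, not part of libnbd: call write() again for as long
 * as it fails with EINTR or EAGAIN.  On a non-blocking descriptor this is a
 * busy loop, which the commit message accepts for last-gasp diagnostics.
 */
static ssize_t
write_retry (int fd, const void *buf, size_t len)
{
  ssize_t written;

  assert (len > 0);
  do
    written = write (fd, buf, len);
  while (written == -1 && (errno == EINTR || errno == EAGAIN));

  /* Either -1 (with errno set to some other error) or a positive count. */
  return written;
}
```

A caller that needs every byte written still has to loop on the positive return value, which is what the "consume" step of xwritel() does for vectors.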

...

+     */
+    do {
+      size_t advance;
+
+      advance = MIN (written, pos->iov_len);
+      /* Note that "advance" is positive here iff "pos->iov_len" is positive. */
+      pos->iov_base = (char *)pos->iov_base + advance;
+      pos->iov_len -= advance;
+      written -= advance;
+
+      /* At least one of "written" and "pos->iov_len" is zero here. */
+      if (pos->iov_len == 0)
+        ++pos;
+    } while (written > 0);
+  }
+}

Nice! Took me more than one pass to fully understand it, but I agree that your documented loop invariants hold, and that it does indeed do a best effort vectored write.
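For readers who also need more than one pass, the consume step can be exercised on its own. The sketch below lifts the inner loop into a hypothetical function, consume(), which is not in the patch; the added assertions spell out the preconditions that xwritel() gets for free from writev().

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/* Hypothetical extraction of the inner do/while loop of xwritel(): account
 * for "written" bytes by advancing through the vectors, and return the first
 * vector that still holds unwritten data (or "end" if none does).  The
 * caller must ensure 0 < written <= total length of the vectors in
 * [pos, end).
 */
static struct iovec *
consume (struct iovec *pos, struct iovec *end, size_t written)
{
  assert (written > 0);
  do {
    size_t advance;

    assert (pos < end);
    advance = written < pos->iov_len ? written : pos->iov_len;
    pos->iov_base = (char *)pos->iov_base + advance;
    pos->iov_len -= advance;
    written -= advance;

    /* At least one of "written" and "pos->iov_len" is zero here. */
    if (pos->iov_len == 0)
      ++pos;
  } while (written > 0);
  return pos;
}
```

For example, with vectors of lengths 3, 0 and 4 and a short write of 5 bytes, the first vector is used up, the empty one is stepped over, and the third is left with 2 bytes starting at offset 2.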

...

+
 /* Fork-safe version of perror. ONLY use this after fork and before
  * exec, the rest of the time use set_error().
  */
 void
 nbd_internal_fork_safe_perror (const char *s)
 {
   const int err = errno;
   const char *m = NULL;
   char buf[32];
 
-  write (2, s, strlen (s));
-  write (2, ": ", 2);
 #ifdef HAVE_STRERRORDESC_NP
   m = strerrordesc_np (errno);
 #else
 #if HAVE_SYS_ERRLIST /* NB Don't use #ifdef */
   m = errno >= 0 && errno < sys_nerr ? sys_errlist[errno] : NULL;
 #endif
 #endif
   if (!m)
     m = nbd_internal_fork_safe_itoa ((long) errno, buf, sizeof buf);

The bonus bug you fixed here could be independently fixed by s/errno/err/ without the rest of your patch, but now that nothing else touches errno prior to assigning to m, I don't see the need to make that change now.
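The pre-patch hazard is easy to reproduce in miniature. In this sketch, errno_demo() is a hypothetical function, and the failing write() to descriptor -1 merely stands in for the two write() calls that used to precede the errno lookup.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical demonstration, not libnbd code: a failing system call issued
 * after the original failure overwrites errno, while a copy saved up front
 * is unaffected.  Returns 1 if the intermediate write() failed as expected.
 */
static int
errno_demo (int *saved, int *clobbered)
{
  ssize_t r;
  int err;

  errno = ENOENT;          /* pretend an earlier call failed with ENOENT */
  err = errno;             /* what "const int err = errno;" preserves */

  r = write (-1, "x", 1);  /* fails and sets errno to EBADF */

  *saved = err;
  *clobbered = errno;
  return r == -1;
}
```

Formatting the message from the saved copy (the s/errno/err/ alternative) and moving the writes after the lookup (what the patch does) both avoid reporting the wrong error.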

...

-  write (2, m, strlen (m));
-  write (2, "\n", 1);
+  xwritel (STDERR_FILENO, s, ": ", m, "\n", (char *)NULL);

I also like the change of s/2/STDERR_FILENO/ that you snuck in here.

The only change I recommend is the addition of the __attribute__; but with or without it, I'm happy with:

Reviewed-by: Eric Blake <eblake(a)redhat.com>

--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org

Laszlo Ersek

9:30 a.m.


On 3/15/23 15:01, Eric Blake wrote:

...

[...]

Thanks for the thorough review; I'm glad all the fine points I sought to put in the patch were received -- and well-received! :) One question:

...

The only change I recommend is the addition of the __attribute__; but with or without it, I'm happy with:

Do we have general rules on attribute usage in libnbd vs. nbdkit?

The __sentinel__ (aka sentinel) attribute is used in nbdkit, but not yet in libnbd. Now, that could be happenstance, but it rhymes with another (obscure?) discrepancy in attribute usage.

Namely, when I was comparing the common/ subdirectories of both projects, I noticed that nbdkit used the cleanup attribute, and libnbd didn't. First I thought it was a mistake / oversight, but then I found a porting note from Rich, in libnbd commit f306e231d294 ("common/utils: Add extensible string, based on vector", 2022-03-12):

RWMJ: This removes the CLEANUP_FREE_STRING macro since libnbd does not use __attribute__((cleanup)).

and then again in f3828bfd42be ("common/utils: Add new string vector types", 2022-03-12):

RWMJ: Removed the CLEANUP_* macros.

So those comments (esp. the one on commit f306e231d294) at least confirm that the difference is intentional. I still don't know the reason for the difference. And now I wonder: does the same (unexplained) reason underlie the "sentinel" attribute's absence too, in libnbd?

If there is a common reason for avoiding both "cleanup" and "sentinel" in libnbd, we should probably not start using "sentinel" now. If, on the other hand, "sentinel" is not covered by the same argument as "cleanup" (not to mention if there isn't an actual reason for avoiding "cleanup" in the first place!), then I can add the sentinel attribute when merging this patch.

...

Reviewed-by: Eric Blake <eblake(a)redhat.com>

Thanks!
Laszlo

Eric Blake

12:14 p.m.


On Wed, Mar 15, 2023 at 03:30:12PM +0100, Laszlo Ersek wrote:

...

On 3/15/23 15:01, Eric Blake wrote:
> [...]

Thanks for the thorough review; I'm glad all the fine points I sought to put in the patch were received -- and well-received! :)

One question:

> The only change I recommend is the addition of the __attribute__; but
> with or without it, I'm happy with:

Do we have general rules on attribute usage in libnbd vs. nbdkit?

The __sentinel__ (aka sentinel) attribute is used in nbdkit, but not yet in libnbd. Now, that could be happenstance, but it rhymes with another (obscure?) discrepancy in attribute usage.

I think it's happenstance; until today, libnbd did not yet have a varargs function where annotating the need for a NULL terminator was useful to let the compiler aid in flagging erroneous usage.

...

Namely, when I was comparing the common/ subdirectories of both projects, I noticed that nbdkit used the cleanup attribute, and libnbd didn't. First I thought it was a mistake / oversight, but then I found a porting note from Rich, in libnbd commit f306e231d294 ("common/utils: Add extensible string, based on vector", 2022-03-12):

RWMJ: This removes the CLEANUP_FREE_STRING macro since libnbd does not use __attribute__((cleanup)).

and then again in f3828bfd42be ("common/utils: Add new string vector types", 2022-03-12):

RWMJ: Removed the CLEANUP_* macros.

Most attributes are merely extensions that aid the compiler in aiding you. __attribute__((sentinel)) is squarely in this camp - compilers that understand it can warn on questionable code, while you can #define a wrapper to an empty string for all other compilers with no change in program behavior.

But __attribute__((cleanup)) is a special beast - it affects runtime behavior, and if you use it, you are REQUIRED to have compiler support. There is no way to write preprocessor macros that will provide the same runtime functionality that cleanup implies for use by a purely standards-compliant cc. That said, it is still a localized compile-time effect, and does not impact ABI - it is merely reducing (a lot of!) boilerplate coding that would otherwise be needed without the attribute in play. I see no problem in mixing an executable that uses it with a library that does not (nbdkit does just that - our server uses cleanup, but can run a plugin compiled without cleanups just fine), nor with mixing a library that uses it with an executable that does not (which could be the case if libnbd starts using it).

Rather, the drawback of using __attribute__((cleanup)) in libnbd is that we would now REQUIRE libnbd to be compiled with gcc or clang. Right now, I don't know if anyone is trying to use libnbd with an alternative compiler (no one has submitted patches or bug reports for using libnbd under MSVC, for example), so it may be a non-issue. But it's a one-way bridge - once we explicitly decide that we expect a particular extension to the standards to even be able to use the library, it becomes a lot harder to port the code to other platforms that lack that specific compiler extension without replacing it with a lot of boilerplate code.

At the time we copied the vector code from nbdkit over to libnbd, we weren't sure what environments would try to use libnbd, so we intentionally did not port attribute cleanup stuff to avoid crippling an unknown user.

I'm not opposed to using the cleanup attribute, and if we DO decide to use it, I'd love to go all in and utilize it wherever it makes sense, which is more than just with vectors. Maybe the thing to do is have one major release where we announce our intention to utilize the attribute in a future release, unless someone speaks up with a reason why it would break with their preferred toolchain; it delays the decision, and means we can't use it right away, but at least would be a documented transition rather than a blind "sorry you can't build anymore".
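The contrast between the two attributes can be sketched as follows, assuming gcc or clang for the second half. All names here (EXAMPLE_ATTRIBUTE_SENTINEL, example_log, free_charp, with_cleanup, frees_seen) are hypothetical and come from neither libnbd nor nbdkit.

```c
#include <assert.h>
#include <stdlib.h>

/* A diagnostics-only attribute can hide behind a macro that expands to
 * nothing elsewhere; program behavior stays the same either way.
 */
#if defined (__GNUC__)
#define EXAMPLE_ATTRIBUTE_SENTINEL __attribute__ ((sentinel))
#else
#define EXAMPLE_ATTRIBUTE_SENTINEL /* empty: only the warning is lost */
#endif

/* Prototype only, to show where the wrapper goes. */
void example_log (const char *first, ...) EXAMPLE_ATTRIBUTE_SENTINEL;

/* The cleanup attribute changes what the program does at run time, so no
 * empty fallback exists: the code below requires compiler support.
 */
static int frees_seen;

static void
free_charp (char **p)
{
  assert (p != NULL);
  free (*p);
  ++frees_seen;
}

static int
with_cleanup (int fail_early)
{
  __attribute__ ((cleanup (free_charp))) char *buf = malloc (16);

  if (buf == NULL || fail_early)
    return -1;          /* free_charp (&buf) runs on this path ... */
  buf[0] = '\0';
  return 0;             /* ... and on this one, with no explicit free () */
}
```

Without the attribute, each return path in with_cleanup() would need its own free() call (or a goto to a common exit label), which is the boilerplate referred to above.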

...

So those comments (esp. the one on commit f306e231d294) at least confirm that the difference is intentional. I still don't know the reason for the difference. And now I wonder: does the same (unexplained) reason underlie the "sentinel" attribute's absence too, in libnbd?

If there is a common reason for avoiding both "cleanup" and "sentinel" in libnbd, we should probably not start using "sentinel" now. If, on the other hand, "sentinel" is not covered by the same argument as "cleanup" (not to mention if there isn't an actual reason for avoiding "cleanup" in the first place!), then I can add the sentinel attribute when merging this patch.

I think the argument for not backporting "cleanup" is much different than the one for not having needed to use "sentinel" to date.

--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org

Laszlo Ersek

Thursday, 16 March

4:30 a.m.


On 3/15/23 18:14, Eric Blake wrote:

...


Thanks for the detailed explanation.

While I don't like this extra (orthogonal) complexity, I've made an effort to collect some information.

(1) I couldn't figure out at what clang / gcc version the sentinel attribute was introduced.

(2) The gcc manual says that __attribute__ ((__sentinel__)) is equivalent to __attribute__ ((sentinel)); it's just that the former is more suitable for public header files, to be included by client apps, where "sentinel" could already exist as a macro, while __sentinel__ couldn't.

This is in fact (at least superficially) consistent with nbdkit's usage of the attribute; we have "__sentinel__" in "common/utils/utils.h" and "tests/test.h" (header files -- albeit not public), and "sentinel" in "tests/test-layers.c".

So I'll squash "__attribute__ ((sentinel))" into the C code of this patch.

(3) Whether or not we should add a wrapper macro.

Libnbd is not really consistent in this regard. The generator defines LIBNBD_ATTRIBUTE_NONNULL and LIBNBD_ATTRIBUTE_ALLOC_DEALLOC -- those are for the public "libnbd.h" header, so the wrapper macros are certainly justified there. The question is however what the libnbd-internal code does. And that's *seemingly* inconsistent:

(3.a) the internal code consumes LIBNBD_ATTRIBUTE_NONNULL and LIBNBD_ATTRIBUTE_ALLOC_DEALLOC liberally, from the public (generated) library header -- likely because that's the convenient thing to do,

(3.b) in a single case, we have an internal-only wrapper: NBD_ATTRIBUTE_PACKED in "lib/nbd-protocol.h" (whose definition effectively enforces clang or gcc) -- and this seems to be shared with nbdkit,

(3.c) we have a bunch of internal code that uses naked attributes, such as "__nonnull__", "__unused__", "constructor", "noreturn", "destructor", "packed".

For (3.a) and (3.b), I can cook up a guiding principle -- both "libnbd.h" and "nbd-protocol.h" look public, at least to an extent, while the stuff in (3.c) is totally internal.

So, I can equate wrapper macros to public headers, for now, and I won't introduce a new macro just for this one application of "__attribute__ ((sentinel))" in "lib/utils.c".

Laszlo

Eric Blake

8:30 p.m.


On Thu, Mar 16, 2023 at 10:30:22AM +0100, Laszlo Ersek wrote:

...

> I think the argument for not backporting "cleanup" is much different
> than the one for not having needed to use "sentinel" to date.

Thanks for the detailed explanation.

No problem - it also helps me to write it down in the archives ;)

...

While I don't like this extra (orthogonal) complexity, I've made an effort to collect some information. (1) I couldn't figure out at what clang / gcc version the sentinel attribute was introduced.

I didn't search either, but it's been a while. A quick grep of gnulib finds:

m4/gnulib-common.m4:# define _GL_ATTR_sentinel _GL_GNUC_PREREQ (4, 0)

which is fairly old these days. (That file dates a lot of other attributes as well...)

...

(2) The gcc manual says that __attribute__ ((__sentinel__)) is equivalent to __attribute__ ((sentinel)); it's just that the former is more suitable for public header files, to be included by client apps, where "sentinel" could already exist as a macro, while __sentinel__ couldn't. This is in fact (at least superficially) consistent with nbdkit's usage of the attribute; we have "__sentinel__" in "common/utils/utils.h" and "tests/test.h" (header files -- albeit not public), and "sentinel" in "tests/test-layers.c". So I'll squash "__attribute__ ((sentinel))" into the C code of this patch.

In general, gcc has __name__ aliases for ALL __attribute__((name))s, precisely for public headers. So my personal rule of thumb is if it is a .h to be installed, add wings; if it is local to the project, the shorter version is fine.
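To make the attribute under discussion concrete, here is a minimal sketch of a NULL-terminated variadic function declared with the "winged" spelling, the way one might write it in an installed header. The function name count_strings is hypothetical and is not part of libnbd or nbdkit; a gcc-compatible compiler is assumed.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical helper, not from libnbd: counts the strings passed before
 * the terminating null pointer.  The __sentinel__ spelling cannot collide
 * with a client macro named "sentinel".  With the attribute in place, gcc
 * and clang can warn at compile time when a call lacks the trailing NULL.
 */
size_t count_strings (const char *first, ...)
  __attribute__ ((__sentinel__));

size_t
count_strings (const char *first, ...)
{
  va_list args;
  size_t count = 0;
  const char *str;

  va_start (args, first);
  /* Walk the argument list until the sentinel is reached. */
  for (str = first; str != NULL; str = va_arg (args, const char *))
    count++;
  va_end (args);
  return count;
}
```

A call such as count_strings ("a", "b", (char *)NULL) counts two strings; a call that forgets the final (char *)NULL is what the attribute is meant to flag.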

...

(3) Whether or not we should add a wrapper macro. Libnbd is not really consistent in this regard. The generator defines LIBNBD_ATTRIBUTE_NONNULL and LIBNBD_ATTRIBUTE_ALLOC_DEALLOC -- those are for the public "libnbd.h" header, so the wrapper macros are certainly justified there. The question is however what the libnbd-internal code does. And that's *seemingly* inconsistent: (3.a) the internal code consumes LIBNBD_ATTRIBUTE_NONNULL and LIBNBD_ATTRIBUTE_ALLOC_DEALLOC liberally, from the public (generated) library header -- likely because that's the convenient thing to do,

Indeed.

...

(3.b) in a single case, we have an internal-only wrapper: NBD_ATTRIBUTE_PACKED in "lib/nbd-protocol.h" (whose definition effectively enforces clang or gcc) -- and this seems to be shared with nbdkit,

nbdkit made the conscious decision to enforce clang or gcc for __attribute__((cleanup)); as such, it can rely on that assumption in most cases. But you are right about nbd-protocol.h being shared; this may be one case where we could rework nbdkit's copy to not be so compiler-specific if it lets libnbd compile on a wider array of compilers. Or it may be proof that no one is really compiling libnbd with anything other than clang/gcc, at which point libnbd using __attribute__((cleanup)) is not going to cause anyone grief after all.

...

(3.c) we have a bunch of internal code that uses naked attributes, such as "__nonnull__", "__unused__", "constructor", "noreturn", "destructor", "packed". For (3.a) and (3.b), I can cook up a guiding principle -- both "libnbd.h" and "nbd-protocol.h" look public, at least to an extent, while the stuff in (3.c) is totally internal. So, I can equate wrapper macros to public headers, for now, and I won't introduce a new macro just for this one application of "__attribute__ ((sentinel))" in "lib/utils.c".

Fair enough. If it fails to compile, we can add a wrapper at that time (the main reason for a wrapper is to allow building with a wider array of compilers). -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
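For reference, a wrapper of the kind being deferred here could look roughly like the sketch below. The macro name EXAMPLE_ATTRIBUTE_SENTINEL and the function total_length are assumptions made up for illustration (libnbd defines neither); the version threshold follows the gnulib line quoted earlier in this thread, _GL_GNUC_PREREQ (4, 0).

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper macro: expands to the attribute on gcc-compatible
 * compilers (version 4 or later) and to nothing elsewhere, so the
 * declaration below still compiles on other compilers.
 */
#if defined (__GNUC__) && __GNUC__ >= 4
#define EXAMPLE_ATTRIBUTE_SENTINEL __attribute__ ((__sentinel__))
#else
#define EXAMPLE_ATTRIBUTE_SENTINEL
#endif

size_t total_length (const char *first, ...) EXAMPLE_ATTRIBUTE_SENTINEL;

/* Sums the lengths of all strings up to the terminating null pointer. */
size_t
total_length (const char *first, ...)
{
  va_list args;
  size_t total = 0;
  const char *str;

  va_start (args, first);
  for (str = first; str != NULL; str = va_arg (args, const char *))
    total += strlen (str);
  va_end (args);
  return total;
}
```

The only thing the wrapper buys is portability: the function body is identical either way, which is why postponing the macro until a build actually fails costs nothing.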

Laszlo Ersek

Wednesday, 15 March

6:01 a.m.

New subject: [libnbd PATCH v4 2/3] lib/utils: add async-signal-safe assert()

Add an assert() variant that we may call between fork() and exec*().

Signed-off-by: Laszlo Ersek <lersek(a)redhat.com>
---

Notes:
    v4:

    - rework with xwritel()

    - do not pick up R-b tags due to the above

    context:-U13

 lib/internal.h | 13 +++++++++++++
 lib/utils.c    | 16 ++++++++++++++++
 2 files changed, 29 insertions(+)

diff --git a/lib/internal.h b/lib/internal.h
index 73d243a13743..4e75f97d2a8a 100644
--- a/lib/internal.h
+++ b/lib/internal.h
@@ -517,14 +517,27 @@ extern char *nbd_internal_printable_string (const char *str)
 extern char *nbd_internal_printable_string_list (char **list)
   LIBNBD_ATTRIBUTE_ALLOC_DEALLOC (free);
 
 /* These are wrappers around socket(2) and socketpair(2).  They
  * always set SOCK_CLOEXEC.  nbd_internal_socket can set SOCK_NONBLOCK
  * according to the nonblock parameter.
  */
 extern int nbd_internal_socket (int domain, int type, int protocol,
                                 bool nonblock);
 extern int nbd_internal_socketpair (int domain, int type, int protocol,
                                     int *fds)
   LIBNBD_ATTRIBUTE_NONNULL (4);
 
+extern void nbd_internal_fork_safe_assert (int result, const char *file,
+                                           long line, const char *func,
+                                           const char *assertion)
+  LIBNBD_ATTRIBUTE_NONNULL (2, 4, 5);
+
+#ifdef NDEBUG
+#define NBD_INTERNAL_FORK_SAFE_ASSERT(expression) ((void)0)
+#else
+#define NBD_INTERNAL_FORK_SAFE_ASSERT(expression)                        \
+  (nbd_internal_fork_safe_assert ((expression) != 0, __FILE__, __LINE__, \
+                                  __func__, #expression))
+#endif
+
 #endif /* LIBNBD_INTERNAL_H */
diff --git a/lib/utils.c b/lib/utils.c
index 62b4bfdda5c3..255f428dd8b6 100644
--- a/lib/utils.c
+++ b/lib/utils.c
@@ -479,13 +479,29 @@ nbd_internal_socketpair (int domain, int type, int protocol, int *fds)
   if (ret == 0) {
     for (i = 0; i < 2; i++) {
       if (fcntl (fds[i], F_SETFD, FD_CLOEXEC) == -1) {
         close (fds[0]);
         close (fds[1]);
         return -1;
       }
     }
   }
 #endif
 
   return ret;
 }
+
+void
+nbd_internal_fork_safe_assert (int result, const char *file, long line,
+                               const char *func, const char *assertion)
+{
+  const char *line_out;
+  char line_buf[32];
+
+  if (result)
+    return;
+
+  line_out = nbd_internal_fork_safe_itoa (line, line_buf, sizeof line_buf);
+  xwritel (STDERR_FILENO, file, ":", line_out, ": ", func, ": Assertion `",
+           assertion, "' failed.\n", (char *)NULL);
+  abort ();
+}
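The patch above depends on two libnbd helpers that are not shown in this message, nbd_internal_fork_safe_itoa() and xwritel(). The following self-contained sketch illustrates the same idea with stand-ins: every sketch_* and SKETCH_* name is hypothetical, the number formatter only handles non-negative values, and partial writes are ignored. It is an illustration of the approach under those assumptions, not the libnbd implementation.

```c
#include <assert.h>   /* the standard assert(), only for comparison/tests */
#include <stdlib.h>
#include <string.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical stand-in for nbd_internal_fork_safe_itoa(): formats a
 * non-negative number at the end of buf without calling snprintf().
 * Assumes size >= 2; returns a pointer into buf.
 */
static const char *
sketch_itoa (long value, char *buf, size_t size)
{
  char *p = buf + size;

  *--p = '\0';
  do {
    *--p = (char)('0' + value % 10);
    value /= 10;
  } while (value > 0 && p > buf);
  return p;
}

/* Hypothetical failure path: hands all message parts to one writev() call,
 * so the diagnostic tends to stay contiguous, then aborts.  No stdio and no
 * memory allocation are involved.
 */
static void
sketch_assert_fail (const char *file, long line, const char *func,
                    const char *assertion)
{
  char line_buf[32];
  const char *parts[8];
  struct iovec iov[8];
  ssize_t ignored;
  size_t i;

  parts[0] = file;
  parts[1] = ":";
  parts[2] = sketch_itoa (line, line_buf, sizeof line_buf);
  parts[3] = ": ";
  parts[4] = func;
  parts[5] = ": Assertion `";
  parts[6] = assertion;
  parts[7] = "' failed.\n";
  for (i = 0; i < 8; i++) {
    iov[i].iov_base = (void *)parts[i];
    iov[i].iov_len = strlen (parts[i]);
  }
  ignored = writev (STDERR_FILENO, iov, 8); /* best effort only */
  (void)ignored;
  abort ();
}

/* Evaluates to nothing when the expression holds; otherwise reports the
 * stringified expression and aborts.
 */
#define SKETCH_FORK_SAFE_ASSERT(expression)                           \
  ((expression) ? (void)0                                             \
   : sketch_assert_fail (__FILE__, __LINE__, __func__, #expression))
```

The point of avoiding the standard assert() between fork() and exec*() is that its failure path typically goes through stdio, which is not safe to rely on in the child of a multi-threaded process.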

Eric Blake

12:18 p.m.

New subject: [libnbd PATCH v4 2/3] lib/utils: add async-signal-safe assert()

On Wed, Mar 15, 2023 at 12:01:56PM +0100, Laszlo Ersek wrote:

...

Add an assert() variant that we may call between fork() and exec*(). Signed-off-by: Laszlo Ersek <lersek(a)redhat.com> ---

...

+void
+nbd_internal_fork_safe_assert (int result, const char *file, long line,
+                               const char *func, const char *assertion)
+{
+  const char *line_out;
+  char line_buf[32];
+
+  if (result)
+    return;
+
+  line_out = nbd_internal_fork_safe_itoa (line, line_buf, sizeof line_buf);
+  xwritel (STDERR_FILENO, file, ":", line_out, ": ", func, ": Assertion `",
+           assertion, "' failed.\n", (char *)NULL);

xwritel() makes this so much shorter ;) Reviewed-by: Eric Blake <eblake(a)redhat.com> -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org

Laszlo Ersek

Thursday, 16 March

4:31 a.m.

New subject: [libnbd PATCH v4 2/3] lib/utils: add async-signal-safe assert()

On 3/15/23 18:18, Eric Blake wrote:

...

On Wed, Mar 15, 2023 at 12:01:56PM +0100, Laszlo Ersek wrote:
> Add an assert() variant that we may call between fork() and exec*().
>
> Signed-off-by: Laszlo Ersek <lersek(a)redhat.com>
> ---
>
> +void
> +nbd_internal_fork_safe_assert (int result, const char *file, long line,
> +                               const char *func, const char *assertion)
> +{
> +  const char *line_out;
> +  char line_buf[32];
> +
> +  if (result)
> +    return;
> +
> +  line_out = nbd_internal_fork_safe_itoa (line, line_buf, sizeof line_buf);
> +  xwritel (STDERR_FILENO, file, ":", line_out, ": ", func, ": Assertion `",
> +           assertion, "' failed.\n", (char *)NULL);

xwritel() makes this so much shorter ;)

I know, right? :)

...

Reviewed-by: Eric Blake <eblake(a)redhat.com>

Thanks!

Laszlo Ersek

Wednesday, 15 March

6:01 a.m.

New subject: [libnbd PATCH v4 3/3] lib/utils: add unit test for async-signal-safe assert()

Don't try to test async-signal-safety, only that NBD_INTERNAL_FORK_SAFE_ASSERT() works similarly to assert():

- it prints diagnostics to stderr,

- it calls abort().

Some unfortunate gymnastics are necessary to avoid littering the system with unwanted core dumps.

Signed-off-by: Laszlo Ersek <lersek(a)redhat.com>
---

Notes:
    v4:

    - update Red Hat Copyright Notices in the new files, in accordance with
      commit bbf47ffd4ac4 ("Update Red Hat Copyright Notices", 2023-03-04)

    - explain the TRUE and FALSE macro definitions in code comments [Eric]

    - test that the pattern "`TRUE'" is not present in the output file; that
      is, the passing assertion does not generate unexpected output [Eric]

    - do not pick up R-b tags due to the logic update in the last bullet
      above

 configure.ac                 |  5 ++
 lib/test-fork-safe-assert.c  | 66 ++++++++++++++++++++
 lib/Makefile.am              | 38 +++++++++--
 lib/test-fork-safe-assert.sh | 32 ++++++++++
 .gitignore                   |  2 +
 5 files changed, 139 insertions(+), 4 deletions(-)

diff --git a/configure.ac b/configure.ac
index b6d60c3df6a1..62fe470b6cd5 100644
--- a/configure.ac
+++ b/configure.ac
@@ -132,11 +132,16 @@ dnl Check for various libc functions, all optional.
 dnl
 dnl posix_fadvise helps to optimise linear reads and writes.
 dnl
+dnl When /proc/sys/kernel/core_pattern starts with a pipe (|) symbol, Linux
+dnl ignores "ulimit -c" and (equivalent) setrlimit(RLIMIT_CORE) actions, for
+dnl disabling core dumping. Only prctl() can be used then, for that purpose.
+dnl
 dnl strerrordesc_np (glibc only) is preferred over sys_errlist:
 dnl https://lists.fedoraproject.org/archives/list/glibc@lists.fedoraproject.o...
 AC_CHECK_FUNCS([\
         posix_fadvise \
         posix_memalign \
+        prctl \
         strerrordesc_np \
         valloc])
 
diff --git a/lib/test-fork-safe-assert.c b/lib/test-fork-safe-assert.c
new file mode 100644
index 000000000000..4a4f6e88ce65
--- /dev/null
+++ b/lib/test-fork-safe-assert.c
@@ -0,0 +1,66 @@
+/* nbd client library in userspace
+ * Copyright Red Hat
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include <config.h>
+
+#include <stdio.h>
+#include <stdlib.h>
+#ifdef HAVE_PRCTL
+#include <sys/prctl.h>
+#endif
+#include <sys/resource.h>
+
+#undef NDEBUG
+
+#include "internal.h"
+
+/* Define these to verify that NBD_INTERNAL_FORK_SAFE_ASSERT() properly
+ * stringifies the expression passed to it.
+ */
+#define TRUE  1
+#define FALSE 0
+
+int
+main (void)
+{
+  struct rlimit rlimit;
+
+  /* The standard approach for disabling core dumping. Has no effect on Linux if
+   * /proc/sys/kernel/core_pattern starts with a pipe (|) symbol.
+   */
+  if (getrlimit (RLIMIT_CORE, &rlimit) == -1) {
+    perror ("getrlimit");
+    return EXIT_FAILURE;
+  }
+  rlimit.rlim_cur = 0;
+  if (setrlimit (RLIMIT_CORE, &rlimit) == -1) {
+    perror ("setrlimit");
+    return EXIT_FAILURE;
+  }
+
+#ifdef HAVE_PRCTL
+  if (prctl (PR_SET_DUMPABLE, 0, 0, 0, 0) == -1) {
+    perror ("prctl");
+    return EXIT_FAILURE;
+  }
+#endif
+
+  NBD_INTERNAL_FORK_SAFE_ASSERT (TRUE);
+  NBD_INTERNAL_FORK_SAFE_ASSERT (FALSE);
+  return 0;
+}
diff --git a/lib/Makefile.am b/lib/Makefile.am
index 52b525819bde..326b550f90bd 100644
--- a/lib/Makefile.am
+++ b/lib/Makefile.am
@@ -25,15 +25,30 @@ generator_built = \
 	unlocked.h \
 	$(NULL)
 
+CLEANFILES += \
+	test-fork-safe-assert.err \
+	$(NULL)
+
 EXTRA_DIST = \
 	$(generator_built) \
 	libnbd.syms \
+	test-fork-safe-assert.sh \
 	$(NULL)
 
 lib_LTLIBRARIES = libnbd.la
 
 BUILT_SOURCES = $(generator_built)
 
+AM_CPPFLAGS = \
+	-I$(top_srcdir)/include \
+	-I$(top_srcdir)/common/include \
+	-I$(top_srcdir)/common/utils \
+	$(NULL)
+
+AM_CFLAGS = \
+	$(WARNINGS_CFLAGS) \
+	$(NULL)
+
 libnbd_la_SOURCES = \
 	aio.c \
 	api.c \
@@ -60,13 +75,11 @@ libnbd_la_SOURCES = \
 	utils.c \
 	$(NULL)
 libnbd_la_CPPFLAGS = \
-	-I$(top_srcdir)/include \
-	-I$(top_srcdir)/common/include \
-	-I$(top_srcdir)/common/utils \
+	$(AM_CPPFLAGS) \
 	-Dsysconfdir=\"$(sysconfdir)\" \
 	$(NULL)
 libnbd_la_CFLAGS = \
-	$(WARNINGS_CFLAGS) \
+	$(AM_CFLAGS) \
 	$(PTHREAD_CFLAGS) \
 	$(GNUTLS_CFLAGS) \
 	$(LIBXML2_CFLAGS) \
@@ -86,3 +99,20 @@ libnbd_la_LDFLAGS = \
 
 pkgconfigdir = $(libdir)/pkgconfig
 pkgconfig_DATA = libnbd.pc
+
+# Unit tests.
+
+TESTS = \
+	test-fork-safe-assert.sh \
+	$(NULL)
+
+check_PROGRAMS = \
+	test-fork-safe-assert \
+	$(NULL)
+
+test_fork_safe_assert_SOURCES = \
+	$(top_srcdir)/common/utils/vector.c \
+	errors.c \
+	test-fork-safe-assert.c \
+	utils.c \
+	$(NULL)
diff --git a/lib/test-fork-safe-assert.sh b/lib/test-fork-safe-assert.sh
new file mode 100755
index 000000000000..2cf0a29a4c6d
--- /dev/null
+++ b/lib/test-fork-safe-assert.sh
@@ -0,0 +1,32 @@
+#!/usr/bin/env bash
+# nbd client library in userspace
+# Copyright Red Hat
+#
+# This library is free software; you can redistribute it and/or
+# modify it under the terms of the GNU Lesser General Public
+# License as published by the Free Software Foundation; either
+# version 2 of the License, or (at your option) any later version.
+#
+# This library is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# Lesser General Public License for more details.
+#
+# You should have received a copy of the GNU Lesser General Public
+# License along with this library; if not, write to the Free Software
+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+
+set +e
+
+./test-fork-safe-assert 2>test-fork-safe-assert.err
+exit_status=$?
+
+set -e
+
+test $exit_status -gt 128
+signal_name=$(kill -l $exit_status)
+test "x$signal_name" = xABRT || test "x$signal_name" = xSIGABRT
+
+ptrn="^test-fork-safe-assert\\.c:[0-9]+: main: Assertion \`FALSE' failed\\.\$"
+grep -E -q -- "$ptrn" test-fork-safe-assert.err
+grep -v -q "\`TRUE'" test-fork-safe-assert.err
diff --git a/.gitignore b/.gitignore
index 4ebd96519c29..54bc20173f95 100644
--- a/.gitignore
+++ b/.gitignore
@@ -128,6 +128,8 @@ Makefile.in
 /lib/states-run.c
 /lib/states.c
 /lib/states.h
+/lib/test-fork-safe-assert
+/lib/test-fork-safe-assert.err
 /lib/unlocked.h
 /libtool
 /ltmain.sh
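The shell script above recognizes the abort through the shell's exit status (greater than 128, mapped back to a signal name with "kill -l"). The same check can be sketched at the C level, where the parent inspects the wait status directly. The helper name child_dies_with_sigabrt is hypothetical; the sketch only uses setrlimit(), so, as the patch's own comment notes, it would not suppress a piped core_pattern handler on Linux the way prctl() does.

```c
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks a child that lowers its core size limit to
 * zero and then calls abort().  Returns 1 if the child was killed by
 * SIGABRT, 0 if it ended some other way, -1 if fork() or waitpid() failed.
 */
static int
child_dies_with_sigabrt (void)
{
  pid_t pid;
  int status;

  pid = fork ();
  if (pid == -1)
    return -1;

  if (pid == 0) {
    struct rlimit rl;

    /* Best effort: keep the expected crash from leaving a core file. */
    if (getrlimit (RLIMIT_CORE, &rl) == 0) {
      rl.rlim_cur = 0;
      (void)setrlimit (RLIMIT_CORE, &rl);
    }
    abort ();
  }

  if (waitpid (pid, &status, 0) != pid)
    return -1;

  /* A shell would report this child as exit status 128 + SIGABRT. */
  return WIFSIGNALED (status) && WTERMSIG (status) == SIGABRT;
}
```

Checking WIFSIGNALED() and WTERMSIG() avoids the "ABRT" versus "SIGABRT" spelling difference that the script has to accommodate in the output of "kill -l".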

Eric Blake

12:25 p.m.

New subject: [libnbd PATCH v4 3/3] lib/utils: add unit test for async-signal-safe assert()

On Wed, Mar 15, 2023 at 12:01:57PM +0100, Laszlo Ersek wrote:

...

diff --git a/configure.ac b/configure.ac
index b6d60c3df6a1..62fe470b6cd5 100644
--- a/configure.ac
+++ b/configure.ac
@@ -132,11 +132,16 @@ dnl Check for various libc functions, all optional.
 dnl
 dnl posix_fadvise helps to optimise linear reads and writes.
 dnl
+dnl When /proc/sys/kernel/core_pattern starts with a pipe (|) symbol, Linux
+dnl ignores "ulimit -c" and (equivalent) setrlimit(RLIMIT_CORE) actions, for
+dnl disabling core dumping. Only prctl() can be used then, for that purpose.
+dnl
 dnl strerrordesc_np (glibc only) is preferred over sys_errlist:
 dnl https://lists.fedoraproject.org/archives/list/glibc@lists.fedoraproject.o...
 AC_CHECK_FUNCS([\
         posix_fadvise \
         posix_memalign \
+        prctl \
         strerrordesc_np \
         valloc])

AC_CHECK_FUNCS looks for whether the given entry point can be linked with, which is okay for functions in the common headers (<stdlib.h>, <unistd.h>, ...) that autoconf includes in its test programs by default. But...

...

diff --git a/lib/test-fork-safe-assert.c b/lib/test-fork-safe-assert.c new file mode 100644 index 000000000000..4a4f6e88ce65 --- /dev/null +++ b/lib/test-fork-safe-assert.c @@ -0,0 +1,66 @@

...

+
+#include <stdio.h>
+#include <stdlib.h>
+#ifdef HAVE_PRCTL
+#include <sys/prctl.h>
+#endif

...the fact that prctl() is in a non-standard header makes me wonder if we might fail to detect the function merely because we didn't include the right header, rather than because its symbol was not exported. On the other hand, prctl() is definitely Linux-specific, so I think you are quite safe in assuming that <sys/prctl.h> exists if and only if prctl() is a linkable entry point. If it does turn out to break someone, we can fix it in a followup patch, so no change needed in your usage at this time.

...

+++ b/lib/test-fork-safe-assert.sh @@ -0,0 +1,32 @@ +#!/usr/bin/env bash

Reviewed-by: Eric Blake <eblake(a)redhat.com> -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org

Laszlo Ersek

Thursday, 16 March

4:50 a.m.

New subject: [libnbd PATCH v4 3/3] lib/utils: add unit test for async-signal-safe assert()

On 3/15/23 18:25, Eric Blake wrote:

...

On Wed, Mar 15, 2023 at 12:01:57PM +0100, Laszlo Ersek wrote: > Don't try to test async-signal-safety, only that > NBD_INTERNAL_FORK_SAFE_ASSERT() works similarly to assert(): > > - it prints diagnostics to stderr, > > - it calls abort(). > > Some unfortunate gymnastics are necessary to avoid littering the system > with unwanted core dumps. > > Signed-off-by: Laszlo Ersek <lersek(a)redhat.com> > --- > diff --git a/configure.ac b/configure.ac > index b6d60c3df6a1..62fe470b6cd5 100644 > --- a/configure.ac > +++ b/configure.ac > @@ -132,11 +132,16 @@ dnl Check for various libc functions, all optional. > dnl > dnl posix_fadvise helps to optimise linear reads and writes. > dnl > +dnl When /proc/sys/kernel/core_pattern starts with a pipe (|) symbol, Linux > +dnl ignores "ulimit -c" and (equivalent) setrlimit(RLIMIT_CORE) actions, for > +dnl disabling core dumping. Only prctl() can be used then, for that purpose. > +dnl > dnl strerrordesc_np (glibc only) is preferred over sys_errlist: > dnl https://lists.fedoraproject.org/archives/list/glibc@lists.fedoraproject.o... > AC_CHECK_FUNCS([\ > posix_fadvise \ > posix_memalign \ > + prctl \ > strerrordesc_np \ > valloc]) AC_CHECK_FUNCS looks for whether the given entry point can be linked with, which is okay for functions in the common headers (<stdlib.h>, <unistd.h>, ...) that autoconf includes in its test programs by default. But... > > diff --git a/lib/test-fork-safe-assert.c b/lib/test-fork-safe-assert.c > new file mode 100644 > index 000000000000..4a4f6e88ce65 > --- /dev/null > +++ b/lib/test-fork-safe-assert.c > @@ -0,0 +1,66 @@ > + > +#include <stdio.h> > +#include <stdlib.h> > +#ifdef HAVE_PRCTL > +#include <sys/prctl.h> > +#endif ...the fact that prctl() is in a non-standard header makes me wonder if we might fail to detect the function merely because we didn't include the right header, rather than because its symbol was not exported. 
On the other hand, prctl() is definitely Linux-specific, so I think you are quite safe in assuming that <sys/prctl.h> exists if and only if prctl() is a linkable entry point. If it does turn out to break someone, we can fix it in a followup patch, so no change needed in your usage at this time.

*shudder*

Another hideous piece of complexity that's orthogonal to what one wants to do :)

So, I investigated a bit.

If I understand correctly, your point is that we could get a *false negative* here, because AC_CHECK_FUNCS might not find prctl() due to the autoconf-generated test program not #including <sys/prctl.h>.

Assuming I got that right, I have two comments on it:

(1) A false negative in this case would not be a huge problem; we'd miss out on prctl(), i.e. the test program would remain dumpable on Linux. The test would still function as needed, just litter the user's machine with a coredump during "make check". Not ideal, but also not tragic.

(2) I believe I disagree with the idea that AC_CHECK_FUNCS might not find an otherwise existent prctl() *due to* AC_CHECK_FUNCS not generating "#include <sys/prctl.h>" into the test program.

On my RHEL-9.1 install (using autoconf-2.69-38.el9.noarch), I've checked the generated "configure" script. First, we have

 18509  for ac_func in \
 18510          posix_fadvise \
 18511          posix_memalign \
 18512          prctl \
 18513          strerrordesc_np \
 18514          valloc
 18515  do :
 18516    as_ac_var=`$as_echo "ac_cv_func_$ac_func" | $as_tr_sh`
 18517  ac_fn_c_check_func "$LINENO" "$ac_func" "$as_ac_var"
 18518  if eval test \"x\$"$as_ac_var"\" = x"yes"; then :
 18519    cat >>confdefs.h <<_ACEOF
 18520  #define `$as_echo "HAVE_$ac_func" | $as_tr_cpp` 1
 18521  _ACEOF
 18522
 18523  fi
 18524  done

I.e., we call "ac_fn_c_check_func" with "prctl" passed as second argument. Then, that function is defined as follows:

  2001  # ac_fn_c_check_func LINENO FUNC VAR
  2002  # ----------------------------------
  2003  # Tests whether FUNC exists, setting the cache variable VAR accordingly
  2004  ac_fn_c_check_func ()
  2005  {
  2006    as_lineno=${as_lineno-"$1"} as_lineno_stack=as_lineno_stack=$as_lineno_stack
  2007    { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $2" >&5
  2008  $as_echo_n "checking for $2... " >&6; }
  2009  if eval \${$3+:} false; then :
  2010    $as_echo_n "(cached) " >&6
  2011  else
  2012    cat confdefs.h - <<_ACEOF >conftest.$ac_ext
  2013  /* end confdefs.h.  */
  2014  /* Define $2 to an innocuous variant, in case <limits.h> declares $2.
  2015     For example, HP-UX 11i <limits.h> declares gettimeofday.  */
  2016  #define $2 innocuous_$2
  2017
  2018  /* System header to define __stub macros and hopefully few prototypes,
  2019      which can conflict with char $2 (); below.
  2020      Prefer <limits.h> to <assert.h> if __STDC__ is defined, since
  2021      <limits.h> exists even on freestanding compilers.  */
  2022
  2023  #ifdef __STDC__
  2024  # include <limits.h>
  2025  #else
  2026  # include <assert.h>
  2027  #endif
  2028
  2029  #undef $2
  2030
  2031  /* Override any GCC internal prototype to avoid an error.
  2032     Use char because int might match the return type of a GCC
  2033     builtin and then its argument prototype would still apply.  */
  2034  #ifdef __cplusplus
  2035  extern "C"
  2036  #endif
  2037  char $2 ();
  2038  /* The GNU C library defines this for functions which it implements
  2039      to always fail with ENOSYS.  Some functions are actually named
  2040      something starting with __ and the normal name is an alias.  */
  2041  #if defined __stub_$2 || defined __stub___$2
  2042  choke me
  2043  #endif
  2044
  2045  int
  2046  main ()
  2047  {
  2048  return $2 ();
  2049    ;
  2050    return 0;
  2051  }
  2052  _ACEOF
  2053  if ac_fn_c_try_link "$LINENO"; then :
  2054    eval "$3=yes"
  2055  else
  2056    eval "$3=no"
  2057  fi
  2058  rm -f core conftest.err conftest.$ac_objext \
  2059      conftest$ac_exeext conftest.$ac_ext
  2060  fi
  2061  eval ac_res=\$$3
  2062                 { $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_res" >&5
  2063  $as_echo "$ac_res" >&6; }
  2064    eval $as_lineno_stack; ${as_lineno_stack:+:} unset as_lineno
  2065
  2066  } # ac_fn_c_check_func

As far as I can tell, the test program provides its own declaration for prctl() on line 2037, so it does not depend on any system header providing a real prototype.

... I figured autoconf should have a "header check" too, and indeed it does: AC_CHECK_HEADER, AC_CHECK_HEADERS, at <https://www.gnu.org/software/autoconf/manual/autoconf-2.69/html_node/Gene...>. We do use AC_CHECK_HEADERS in libnbd.

The question is now whether checking for <sys/prctl.h> with AC_CHECK_HEADERS is "more robust" than checking for prctl() with AC_CHECK_FUNCS(). I'd say: AC_CHECK_FUNCS() is a tiny bit stronger (we actually want to be able to call the particular prctl() function), but they should be mostly *interchangeable*. I'm saying that because prctl() is "Linux-specific" (per prctl(2) manual), so:

(a) <sys/prctl.h> existing (per AC_CHECK_HEADERS), but not exposing the real -- and linkable -- prctl() prototype,

(b) a call to prctl() being linkable (via the bogus declaration used by AC_CHECK_FUNCS), but <sys/prctl.h> not exposing a proper prctl() prototype,

are *equally* Linux userspace bugs.

So indeed I'll stick with the AC_CHECK_FUNCS approach.

...

> +++ b/lib/test-fork-safe-assert.sh > @@ -0,0 +1,32 @@ > +#!/usr/bin/env bash Reviewed-by: Eric Blake <eblake(a)redhat.com>

Thank you! Laszlo

Eric Blake

8:45 p.m.

New subject: [libnbd PATCH v4 3/3] lib/utils: add unit test for async-signal-safe assert()

On Thu, Mar 16, 2023 at 10:50:06AM +0100, Laszlo Ersek wrote:

...

On 3/15/23 18:25, Eric Blake wrote: > On Wed, Mar 15, 2023 at 12:01:57PM +0100, Laszlo Ersek wrote: >> Don't try to test async-signal-safety, only that >> NBD_INTERNAL_FORK_SAFE_ASSERT() works similarly to assert(): >> >> - it prints diagnostics to stderr, >> >> - it calls abort(). >> >> Some unfortunate gymnastics are necessary to avoid littering the system >> with unwanted core dumps. >> >> Signed-off-by: Laszlo Ersek <lersek(a)redhat.com> >> --- > >> diff --git a/configure.ac b/configure.ac >> index b6d60c3df6a1..62fe470b6cd5 100644 >> --- a/configure.ac >> +++ b/configure.ac >> @@ -132,11 +132,16 @@ dnl Check for various libc functions, all optional. >> dnl >> dnl posix_fadvise helps to optimise linear reads and writes. >> dnl >> +dnl When /proc/sys/kernel/core_pattern starts with a pipe (|) symbol, Linux >> +dnl ignores "ulimit -c" and (equivalent) setrlimit(RLIMIT_CORE) actions, for >> +dnl disabling core dumping. Only prctl() can be used then, for that purpose. >> +dnl >> dnl strerrordesc_np (glibc only) is preferred over sys_errlist: >> dnl https://lists.fedoraproject.org/archives/list/glibc@lists.fedoraproject.o... >> AC_CHECK_FUNCS([\ >> posix_fadvise \ >> posix_memalign \ >> + prctl \ >> strerrordesc_np \ >> valloc]) > > AC_CHECK_FUNCS looks for whether the given entry point can be linked > with, which is okay for functions in the common headers (<stdlib.h>, > <unistd.h>, ...) that autoconf includes in its test programs by > default. But... 
> >> >> diff --git a/lib/test-fork-safe-assert.c b/lib/test-fork-safe-assert.c >> new file mode 100644 >> index 000000000000..4a4f6e88ce65 >> --- /dev/null >> +++ b/lib/test-fork-safe-assert.c >> @@ -0,0 +1,66 @@ > >> + >> +#include <stdio.h> >> +#include <stdlib.h> >> +#ifdef HAVE_PRCTL >> +#include <sys/prctl.h> >> +#endif > > ...the fact that prctl() is in a non-standard header makes me wonder > if we might fail to detect the function merely because we didn't > include the right header, rather than because its symbol was not > exported. > > On the other hand, prctl() is definitely Linux-specific, so I think > you are quite safe in assuming that <sys/prctl.h> exists if and only > if prctl() is a linkable entry point. If it does turn out to break > someone, we can fix it in a followup patch, so no change needed in > your usage at this time. *shudder* Another hideous piece of complexity that's orthogonal to what one wants to do :) So, I investigated a bit. If I understand correctly, your point is that we could get a *false negative* here, because AC_CHECK_FUNCS might not find prctl() due to the autoconf-generated test program not #including <sys/prctl.h>.

Bingo - you caught my poorly stated conclusion.

...

Assuming I got that right, I have two comments on it: (1) A false negative in this case would not be a huge problem; we'd miss out on prctl(), i.e. the test program would remain dumpable on Linux. The test would still function as needed, just litter the user's machine with a coredump during "make check". Not ideal, but also not tragic.

I'm not sure if it remains a core dump, or if it becomes dumpable to ABRT (or whatever else was consuming the pipeline) which in turn leads to unnecessary load on the ABRT servers and bugzilla (etc) for dealing with an expected crash, but yeah, that appears to be the worst effect of a false negative.

...

(2) I believe I disagree with the idea that AC_CHECK_FUNCS might not find an otherwise existent prctl() *due to* AC_CHECK_FUNCS not generating "#include <sys/prctl.h>" into the test program. On my RHEL-9.1 install (using autoconf-2.69-38.el9.noarch), I've checked the generated "configure" script. First, we have

...

As far as I can tell, the test program provides its own declaration for prctl() on line 2037, so it does not depend on any system header providing a real prototype.

Yep, that's how autoconf checks whether an external symbol is provided by the current set of linked libraries. Where it can go wrong is when a public header has a macro that redefines the normal symbol name into the actual library linkage name (think about the public stat() vs. internal __stat64() mess when big files were first introduced, or more recently, figuring out how to support 64-bit time on 32-bit systems).

...

... I figured autoconf should have a "header check" too, and indeed it does: AC_CHECK_HEADER, AC_CHECK_HEADERS, at <https://www.gnu.org/software/autoconf/manual/autoconf-2.69/html_node/Gene...>. We do use AC_CHECK_HEADERS in libnbd.

Autoconf also has a way to write one-off function checks with AC_CHECK_FUNCTION where you can supply your own #include's specific to the normal usage of that function (the most robust configure tests are the ones that mirror actual usage as closely as possible); but AC_CHECK_FUNCS (with its shell loop) is so much more compact that I didn't think it was worth worrying about.

...

The question is now whether checking for <sys/prctl.h> with AC_CHECK_HEADERS is "more robust" than checking for prctl() with AC_CHECK_FUNCS(). I'd say: AC_CHECK_FUNCS() is a tiny bit stronger (we actually want to be able to call the particular prctl() function), but they should be mostly *interchangeable*. I'm saying that because prctl() is "Linux-specific" (per prctl(2) manual), so: (a) <sys/prctl.h> existing (per AC_CHECK_HEADERS), but not exposing the real -- and linkable -- prctl() prototype, (b) a call to prctl() being linkable (via the bogus declaration used by AC_CHECK_FUNCS), but <sys/prctl.h> not exposing a proper prctl() prototype, are *equally* Linux userspace bugs.

And your conclusion matched mine - this is such a niche function that even though we are checking only one of the two aspects that both have to be in play to use the function (the header, and the external linkage name), we can rely on both or neither aspects being present on a given system, effectively letting us pick just one configure probe as the witness for both aspects.

...

So indeed I'll stick with the AC_CHECK_FUNCS approach.

Yep, I think the function check is stronger than the header check, and that's why I gave R-b:

...

> >> +++ b/lib/test-fork-safe-assert.sh >> @@ -0,0 +1,32 @@ >> +#!/usr/bin/env bash > > Reviewed-by: Eric Blake <eblake(a)redhat.com> > Thank you!

And while you spent some time learning things, I'm glad we are in agreement that as ugly as this niche case is, we don't have to do even more churn to "improve" the situation. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org

Laszlo Ersek

Friday, 17 March

3:49 a.m.

New subject: [libnbd PATCH v4 3/3] lib/utils: add unit test for async-signal-safe assert()

On 3/17/23 02:45, Eric Blake wrote:

...

Where it can go wrong is when a public header has a macro that redefines the normal symbol name into the actual library linkage name (think about the public stat() vs. internal __stat64() mess when big files were first introduced, or more recently, figuring out how to support 64-bit time on 32-bit systems).

Ah, I got it now. The autoconf-generated test could theoretically fail (leading to a false negative) because "prctl" might not exist as a linkable symbol *at all* in the C library (so the fake function declaration in the generated test file would not link against anything). The actual function could be provided by the shared object under a symbol like "__prctl_foobar" *only*, and the only way to get from "prctl" to "__prctl_foobar" could be a #define available via inclusion of the <sys/prctl.h> header. Yeah, let's deal with *that* horror if it ever happens. :) Thanks! Laszlo
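The scenario described above can be mimicked in a single translation unit. In the sketch below every name (sketch_prctl, sketch_prctl_v2, call_through_header) is hypothetical; the function definition plays the role of the C library, and the #define plays the role of the line that a header such as <sys/prctl.h> would have to provide in that imagined world.

```c
#include <assert.h>

/* Stand-in for a C library that exports the function only under a
 * versioned linkage name.  No symbol called "sketch_prctl" exists.
 */
int sketch_prctl_v2 (int option);

int
sketch_prctl_v2 (int option)
{
  return option + 1;
}

/* Stand-in for the header: the public name is just a macro that redirects
 * to the real linkage name.
 */
#define sketch_prctl sketch_prctl_v2

/* Code that sees the macro compiles and links fine, because the call is
 * rewritten to sketch_prctl_v2() before the compiler proper runs.
 */
static int
call_through_header (int option)
{
  return sketch_prctl (option);
}

/* By contrast, an autoconf-style probe that declares the public name
 * itself, along the lines of
 *
 *   char sketch_prctl ();
 *
 * without the macro in scope, would reference a symbol literally named
 * "sketch_prctl".  Nothing defines that symbol, so the link step would
 * fail and the probe would report a false negative.
 */
```

This is only an illustration of the failure mode; the thread's conclusion is that nothing of the sort is known to affect prctl() on Linux today.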

Laszlo Ersek

Thursday, 16 March

5:30 a.m.

On 3/15/23 12:01, Laszlo Ersek wrote:

...

Series merged as commit range 48eca6a25468..c7d02b4b08ee. Thanks! Laszlo

guestfs@lists.libguestfs.org
