Hi Eric,
lots of great stuff in your email, thanks! Let me respond:
On 3/21/23 18:28, Eric Blake wrote:
On Tue, Mar 21, 2023 at 03:56:22PM +0100, Laszlo Ersek wrote:
>> $ podman build -f ci/containers/alpine-edge.Dockerfile -t libnbd-alpine-edge
>> $ podman run -it --rm --userns=keep-id -v .:/repo:z -w /repo libnbd-alpine-edge
bash
>> $ ./configure
>> $ make check
>> $ grep tmpd= lib/test-suite.log
>> + tmpd=/tmp/tmp.EMgKeF
>> $ /tmp/tmp.EMgKeF/bin/f 1 + 1
>> f: applet not found
>> 0b748c9fe495:~$
>>
>> So it looks like we need some way to work around busybox' insistance
>> that argv[0] determines which applet to run.
>
> I couldn't come up with a reproducer like yours. I couldn't figure
> out how to *quickly* get an interactive Alpine Linux environment with
> the test failing, and I also couldn't figure out how to trigger
> "verbose" test runs on gitlab, without polluting the master branch
> with debug patches. (I tried forking the project in my own space on
> gitlab, and pushed a debug patch with just "set -x" onto a non-master
> branch *there* -- but CI didn't start in response to that.)
ci/README.rst is informative on that regards; I'm impressed at what
lcitool (from the libvirt-ci) project gives us for local
reproducibility of repeating various CI build failures.
Indeed very informative. I was hoping I could skimp on reading it, but
now I've bitten the bullet.
It does offer precisely such podman commands (two of them) that I need
for seamless testing in an Alpine Linux container. Unfortunately, podman
does not work for me on RHEL-9.1 as a mere user!
podman build -f ci/containers/alpine-edge.Dockerfile -t libnbd-alpine-edge
it prints:
Error: error creating tmpdir: mkdir /run/user/500: permission denied
I've searched the web -- I've found nothing conclusive (for example, I
don't have a .config/containers/storage.conf file, so its contents
cannot be blamed for the misbehavior). However, at least I've determined
that Giuseppe Scrivano is an expert in this area, so I've emailed him
now. We'll see where that goes.
> So, I only installed a new Alpine Linux virtual machine -- what
a
> pain *that* was --, and investigated what /usr/bin/expr was. When I
> found it was a symlink to /bin/busybox, I started looking for Alpine
> Linux specific tweaks that could replace busybox (in this role) with
> a real binary executable "expr" utility.
>
> I was relieved to find the following wiki article:
>
>
https://wiki.alpinelinux.org/wiki/How_to_get_regular_stuff_working
>
> which promised -- I thought anyways -- a real coreutils package.
>
> Imagine my dismay when I found that, after installing coreutils with
> apk in the Alpine Linux VM, the symlink stayed in place, only its
> target binary changed from "/bin/busybox" to "coreutils". Well
done,
> Alpine Linux, well done.
Yeah, I remember the debates back when GNU coreutils was offered
patches to choose to build a multiname binary - it fits with Alpine's
goals and matches busybox's minimalist footprint, but it violates GNU
Coding Standards (which state that in general, an application's
behavior should not depend on argv[0], precisely so that renaming it
doesn't change its behavior).
Haha, I didn't expect this. Background that's good to know!
>
> So, no, this mess (= Alpine Linux) is not salvageable. The
> "lib/test-fork-safe-execvpe.sh" script depends on "expr" being
> functional under the name "f". And yes I want "f" to be a
> single-character filename; otherwise the nice tabular formatting of
> the script falls apart (or produces overlong lines).
>
> As last step, I learned about ci/skipped_tests, and used it to
> disable the test on alpine linux.
That works, but is not always my favorite: it fixes CI, but does not
help any developer on a similar platform. I probably would have tried
this in the test itself (untested here):
cp /bin/expr f
cleanup_fn rm f
requires ./f 1 + 1 # We insist on expr behavior independent of argv[0]
which will then skip the test (exit status 77) if a renamed /bin/expr
doesn't behave the same as the original. But since we haven't heard
complaints from anyone using Alpine as their primary development box
(or any other non-Alpine but otherwise-busybox setup), we can make
that change when such a complaint arises, rather than adding even more
churn to your latest series of patches.
Another thought - instead of skipping the test when /bin/expr is
underwhelming when renamed, we could rewrite the test to avoid that
dependency altogether. After all, we are building a
test-fork-safe-execvpe binary that does exactly what we say it should;
we could hack up test-fork-safe-execvpe.c to depend on argv[0] (I know
- using the very thing I don't like about busybox) so that when run by
its usual name it performs the unit test, but when run by the basename
of f, it does just enough for us to prove that we indeed executed our
desired binary (whether that be emulating a subset of /bin/expr, or
something else). But again, that's more effort to code up than just
skipping the test on a platform that's less likely to be a primary
development station in the first place.
This morning I have thought of two other approaches:
1. Simply write another test program in C that prints "hello world" to
the standard output. Totally deterministic behavior, a binary
executable, and does not depend on any system-provided utility. (The
original reason for using "expr" was to avoid the need for such a C
source file.)
2. I've recalled that "test-fork-safe-execvpe.c" actually takes argv[0]
for the program to execute, as another explicit argument.
Correspondingly, the "run" function in "test-fork-safe-execvpe.sh"
passes its $2 positional parameter *twice* to the
"test-fork-safe-execvpe" binary:
# $2 and onward are passed to $execvpe; note that $2 becomes *both*
# "program-to-exec" for the helper *and* argv[0] for the program executed by
the
# helper.
I *might* be able to add a "run0" function that does not perform this
duplication, i.e., it lets the caller pass in argv[0] separately from
"program-to-exec". Then (by my count) I should have to modify only 4
invocations of the form "f 1 + 1"; and in advance I hope that such a
change wouldn't destroy the formatting of the script.
>
> The latest pipeline passed:
> <
https://gitlab.com/nbdkit/libnbd/-/pipelines/813280321>.
>
> Either way, I wanted to highlight the following commits on the list:
>
> 1 b29ff42e5d00 lib: account for realpath deficiency on some platforms
I didn't even pay attention to this one before you pointed it out; but
it is indeed a bug in busybox now that POSIX is moving towards
standardizing realpath, so I've filed it:
https://bugs.busybox.net/show_bug.cgi?id=15466
Thanks!
I remember Austin Group bug #1457 from your earlier messages. Back then
I noticed that said ticket called for a utility named "readlink", not
"realpath". So it's not exactly that "realpath" (which I believe
will
remain non-standard?) should accept "--", but that a "readlink"
utility,
accepting "--", should exist.
... Ah wait, no, that's not right. Upon reading the ticket more
carefully now, there is a late comment in the ticket that *does* mandate
"realpath": <
https://www.austingroupbugs.net/view.php?id=1457#c5898>.
OK then! :)
(The "Summary" field on Austin Group bug #1457 might be worth updating!)
Thanks!
Laszlo
> 2 65631e5dfff5 lib/utils: try to placate attribute placement restriction from gcc
> 3 4cae20ccefaf Revert "lib: account for realpath deficiency on some
platforms"
> 4 f5a065aa3a9c ci: skip "lib/test-fork-safe-execvpe.sh" on Alpine Linux
>
> Thanks for investigating!
> Laszlo
>