Hi Richard,
On Fri, 16 Jun 2023 at 03:08, Richard W.M. Jones <rjones(a)redhat.com> wrote:
On Thu, Jun 15, 2023 at 09:18:38PM +0900, Vincent Mailhol wrote:
> Hello,
>
> I am using libguestfs in Bazel's linux-sandbox environment[1].
>
> When executing in that sandbox environment, I got frequent crashes.
>
> Please find attached below the results of libguestfs-test-tool when
> run in that linux-sandbox environment. The most relevant part seems
> to be:
>
> [ 0.797233] ldmtool[164]: segfault at 0 ip 0000564a892506a5 sp 00007fff8ee5b900 error 4 in ldmtool[564a8924e000+3000]
> [ 0.798117] Code: 18 64 48 33 1c 25 28 00 00 00 75 5e 48 83 c4 28 5b 5d 41 5c 41 5d 41 5e 41 5f c3 66 2e 0f 1f 84 00 00 00 00 00 e8 db fd ff ff <4c> 8b 20 48 89 44 24 08 4c 89 e7 e8 0b e1 ff ff 45 31 c0 4c 89 e1
> /init: line 154: 164 Segmentation fault ldmtool create all
>
> So the root cause seems to be around libldm. This mailing list seems
> to cover both libguestfs and libldm, so hopefully, I am at the right
> place to ask :)
>
> Needless to say, when run outside of the sandbox environment, no
> crashes were observed.
>
> [1] linux-sandbox.cc
> Link: https://github.com/bazelbuild/bazel/blob/master/src/main/tools/linux-sand...
>
> ---
...
> supermin: picked /sys/block/sdb/dev (8:16) as root device
> supermin: creating /dev/root as block special 8:16
> supermin: mounting new root on /root
> [ 0.678248] EXT4-fs (sdb): mounting ext2 file system using the ext4 subsystem
> [ 0.679832] EXT4-fs (sdb): mounted filesystem without journal. Opts: . Quota mode: none.
> supermin: deleting initramfs files
> supermin: chroot
> Starting /init script ...
> mount: only root can use "--types" option (effective UID is 65534)
> /init: line 38: /proc/cmdline: No such file or directory
> mount: only root can use "--types" option (effective UID is 65534)
> mount: only root can use "--options" option (effective UID is 65534)
> mount: only root can use "--types" option (effective UID is 65534)
> mount: only root can use "--types" option (effective UID is 65534)
> mount: only root can use "--options" option (effective UID is 65534)
It really goes wrong from here, where apparently it's not running as
root (instead UID 65534), even though we're supposed to be running
inside a Linux appliance virtual machine.
Any idea why that would be?
I looked at the sandbox and that would run the qemu process as UID
"nobody" (which might be 65534). However I don't understand why that
would affect anything running on the new kernel inside the appliance.

You were right. Because I got the crash inside the sandbox but not
outside of it, I jumped to the conclusion that the sandbox was the
root cause.
I continued the analysis and compared a successful
libguestfs-test-tool log with the failed one, looking at every
difference. It turned out that the sandbox was not the cause. The
culprit is on the first line of the log: TMPDIR=/tmp.
If I force TMPDIR=/var/tmp, the problem disappears!
This gave me a minimal reproducer:
TMPDIR=/tmp/ libguestfs-test-tool
That one crashes outside the sandbox too. Next, my attention went to this line:
libguestfs: checking for previously cached test results of
/usr/bin/qemu-system-x86_64, in /tmp/.guestfs-1001
I did a:
rm -rf /tmp/.guestfs-1001
and that solved my issue \o/
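
For anyone hitting the same thing, the cleanup above can be sketched as a
small shell helper. This is only a sketch under assumptions: it supposes the
cache always lives in "$TMPDIR/.guestfs-<uid>", which is what the log line
above suggests, and that /var/tmp is used when TMPDIR is unset. The function
names guestfs_cache_dir and clear_guestfs_cache are made up for illustration;
they are not part of libguestfs.

```shell
#!/bin/sh
# Hypothetical helpers, not part of libguestfs.
# Assumption: the cached appliance lives in "$TMPDIR/.guestfs-<uid>",
# falling back to /var/tmp when TMPDIR is unset or empty.

# Print the path of the per-user cache directory.
guestfs_cache_dir() {
    base="${TMPDIR:-/var/tmp}"
    # Strip one trailing slash so TMPDIR=/tmp/ and TMPDIR=/tmp agree.
    printf '%s/.guestfs-%s\n' "${base%/}" "$(id -u)"
}

# Remove the cache directory if it exists, and say what was done.
clear_guestfs_cache() {
    dir="$(guestfs_cache_dir)"
    if [ -d "$dir" ]; then
        rm -rf -- "$dir"
        echo "removed $dir"
    else
        echo "nothing to remove at $dir"
    fi
}
```

Calling clear_guestfs_cache with TMPDIR=/tmp exported, and then re-running
libguestfs-test-tool, corresponds to the manual rm -rf step above.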
I still do not understand how I ended up running as UID 65534 instead
of root in the first place. I had done other qemu experiments, so I
am not sure how, but I somehow got a corrupted environment under
/tmp/.guestfs-1001.
One last thing: the segfault in ldmtool [1] still seems to be a valid
issue. Even though I now have a workaround for my problem, that
segfault might be worth a bit more investigation.
Regardless, thanks a lot for your quick answer. It helped me continue
the troubleshooting.
[1] ldmtool line 164
Link: https://github.com/mdbooth/libldm/blob/master/src/ldmtool.c#L164