Hi Richard,
On Fri, 16 Jun 2023 at 03:08, Richard W.M. Jones <rjones(a)redhat.com> wrote:
 On Thu, Jun 15, 2023 at 09:18:38PM +0900, Vincent Mailhol wrote:
 > Hello,
 >
 > I am using libguestfs in a Bazel's linux-sandbox environment[1].
 >
 > When executing in that sandbox environment, I got frequent crashes.
 >
 > Please find attached below the results of libguestfs-test-tool when
 > run into that linux-sandbox environment. The most relevant part seems
 > to be:
 >
 >   [    0.797233] ldmtool[164]: segfault at 0 ip 0000564a892506a5 sp 00007fff8ee5b900 error 4 in ldmtool[564a8924e000+3000]
 >   [    0.798117] Code: 18 64 48 33 1c 25 28 00 00 00 75 5e 48 83 c4 28 5b 5d 41 5c 41 5d 41 5e 41 5f c3 66 2e 0f 1f 84 00 00 00 00 00 e8 db fd ff ff <4c> 8b 20 48 89 44 24 08 4c 89 e7 e8 0b e1 ff ff 45 31 c0 4c 89 e1
 >   /init: line 154:   164 Segmentation fault      ldmtool create all
 >
 > So the root cause seems to be around libldm. This mailing list seems
 > to cover both libguestfs and libldm, so hopefully, I am at the right
 > place to ask :)
 >
 > Needless to say, when run outside of the sandbox environment, no crash
 > were observed.
 >
 > [1] linux-sandbox.cc
 > Link:
https://github.com/bazelbuild/bazel/blob/master/src/main/tools/linux-sand...
 >
 > ---
 ...
 > supermin: picked /sys/block/sdb/dev (8:16) as root device
 > supermin: creating /dev/root as block special 8:16
 > supermin: mounting new root on /root
 > [    0.678248] EXT4-fs (sdb): mounting ext2 file system using the ext4 subsystem
 > [    0.679832] EXT4-fs (sdb): mounted filesystem without journal. Opts: . Quota mode: none.
 > supermin: deleting initramfs files
 > supermin: chroot
 > Starting /init script ...
 > mount: only root can use "--types" option (effective UID is 65534)
 > /init: line 38: /proc/cmdline: No such file or directory
 > mount: only root can use "--types" option (effective UID is 65534)
 > mount: only root can use "--options" option (effective UID is 65534)
 > mount: only root can use "--types" option (effective UID is 65534)
 > mount: only root can use "--types" option (effective UID is 65534)
 > mount: only root can use "--options" option (effective UID is 65534)
 It really goes wrong from here, where apparently it's not running as
 root (instead UID 65534), even though we're supposed to be running
 inside a Linux appliance virtual machine.
 Any idea why that would be?
 I looked at the sandbox and that would run the qemu process as UID
 "nobody" (which might be 65534).  However I don't understand why that
 would affect anything running on the new kernel inside the appliance. 
And you were right. I saw a crash inside the sandbox but not outside
of it, and I jumped to the conclusion that the sandbox was the root
cause.
I continued the analysis and compared a successful
libguestfs-test-tool log against the failed one, difference by
difference. It turned out that the sandbox was not the cause. The
culprit is on the first line of the log: TMPDIR=/tmp.
If I force TMPDIR=/var/tmp, the problem disappears!
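For anyone who wants to repeat that comparison, here is a rough sketch of how two libguestfs-test-tool logs could be diffed. It strips the kernel timestamps first, since those differ on every boot and would otherwise drown out real differences such as the TMPDIR line. The function names and the idea of saving each run to a log file are assumptions for illustration, not something libguestfs provides.

```shell
# Strip the leading "[    0.797233] " kernel timestamp from each line.
# Reads the files given as arguments, or stdin when there are none.
strip_kernel_timestamps() {
    sed -E 's/^\[ *[0-9]+\.[0-9]+\] //' "$@"
}

# Show a unified diff of two saved test-tool logs, ignoring timestamps.
# $1 = log of a successful run, $2 = log of a failed run (hypothetical files).
diff_guestfs_logs() {
    good="$(mktemp)"
    bad="$(mktemp)"
    strip_kernel_timestamps "$1" > "$good"
    strip_kernel_timestamps "$2" > "$bad"
    diff -u "$good" "$bad"
    status=$?   # diff exits with 1 when the logs differ; pass that through
    rm -f "$good" "$bad"
    return "$status"
}
```

With the output of each run saved to a file, `diff_guestfs_logs good.log bad.log` would then list only the lines that really differ.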
This gave me a minimal reproducer:
  TMPDIR=/tmp/ libguestfs-test-tool
That one crashed even outside the sandbox. Next, this line caught my
attention:
  libguestfs: checking for previously cached test results of /usr/bin/qemu-system-x86_64, in /tmp/.guestfs-1001
I ran:
  rm -rf /tmp/.guestfs-1001
and that solved my issue \o/
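In case it helps someone else with the same symptom, the workaround can be sketched as two small shell functions. This assumes the lookup order described in the guestfs(3) man page for the appliance cache (LIBGUESTFS_CACHEDIR, then TMPDIR, then /var/tmp) and the ".guestfs-<UID>" directory name seen in the log line above. The function names are made up for this example and are not part of libguestfs.

```shell
# Print the directory where libguestfs is assumed to cache its appliance.
# Assumption: lookup order is LIBGUESTFS_CACHEDIR, then TMPDIR, then /var/tmp,
# with a per-user ".guestfs-<effective UID>" subdirectory.
guestfs_cache_dir() {
    base="${LIBGUESTFS_CACHEDIR:-${TMPDIR:-/var/tmp}}"
    # Drop a trailing slash so TMPDIR=/tmp/ and TMPDIR=/tmp give the same path.
    printf '%s/.guestfs-%s\n' "${base%/}" "$(id -u)"
}

# Remove the cached appliance so that it gets rebuilt on the next run.
reset_guestfs_cache() {
    dir="$(guestfs_cache_dir)"
    # Refuse to delete anything that does not look like a libguestfs cache.
    case "$dir" in
        */.guestfs-*) rm -rf -- "$dir" ;;
        *) echo "unexpected cache path: $dir" >&2; return 1 ;;
    esac
}
```

Running `reset_guestfs_cache` with the same TMPDIR as the failing run, and then running libguestfs-test-tool again, should be equivalent to the manual rm above.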
I still do not understand how the appliance ended up running as UID
65534 instead of root in the first place. I had been doing other qemu
experiments, so I am not sure how, but I somehow ended up with a
corrupted environment under /tmp/.guestfs-1001.
One last thing: the segfault in ldmtool [1] still seems like a valid
issue. Even though I now have a workaround for my problem, that
segfault might be worth a bit more investigation.
Regardless, thanks a lot for your quick answer. It helped me continue
the troubleshooting.
[1] ldmtool line 164
Link: https://github.com/mdbooth/libldm/blob/master/src/ldmtool.c#L164