Re: [Libguestfs] libldm crashes in a linux-sandbox context

Monday, 19 June 2023

On 6/19/23 13:18, Vincent MAILHOL wrote:
...
 On Fri. 16 June 2023 at 16:34, Richard W.M. Jones
<rjones(a)redhat.com> wrote:
 (...)
>> Last thing, the segfault on ldmtool [1] still seems a valid issue.
>> Even if I now do have a workaround for my problem, that segfault might
>> be worth a bit more investigation.
>
> Yes that does look like a real problem.  Does it crash if you just run
> ldmtool as a normal command, nothing to do with libguestfs?  Might be
> a good idea to try to get a stack trace of the crash.

 The fact is that it only crashes with the UID 65534 in the qemu VM. I
 am not sure what command line is passed to ldmtool for this crash to
 occur.

 I can help to gather information, but my biggest issue is that I do
 not know how to interact with the VM under /tmp/.guestfs-1001/

   [    0.777352] ldmtool[164]: segfault at 0 ip 0000563a225cd6a5 sp
 00007ffe54965a60 error 4 in ldmtool[563a225cb000+3000]
                                         ^^^^ ^^^^^^^^^^^^^^^^^^^
 This smells like a NULL pointer dereference.

Hey, this is actually my line from an email I started writing earlier
today :), but I then decided not to send it.

It certainly looks like a null pointer dereference, and if you
disassemble the instruction byte stream dump (the "Code:" line from the
kernel log) with (e.g.) ndisasm, that confirms it. You get something like

00000025  E8DBFDFFFF        call 0xfffffffffffffe05
0000002A  4C8B20            mov r12,[rax]              <---- crash
0000002D  4889442408        mov [rsp+0x8],rax
00000032  4C89E7            mov rdi,r12
00000035  E80BE1FFFF        call 0xffffffffffffe145

with the "mov r12,[rax]" instruction faulting (with the previously
called function presumably having returned 0 in rax). See the "<4c> 8b
20" substring in the "Code:" line -- the angle brackets point at the
first byte of the crashing instruction.
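
To make the "<4c> 8b 20" reading concrete, here is a rough Python sketch.
The byte string below is only a fragment that I rebuilt from the ndisasm
listing above; it is not the full "Code:" line from the kernel log. The
function names are made up for illustration, and the decoder handles
nothing but the one "REX.W 8B /r" register-indirect form seen here.

```python
# Fragment rebuilt from the ndisasm listing (assumption: NOT the complete
# "Code:" line; the bytes before the call instruction are not known here).
CODE_FRAGMENT = "e8 db fd ff ff <4c> 8b 20 48 89 44 24 08 4c 89 e7 e8 0b e1 ff ff"

REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
         "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]


def split_code_line(text):
    """Return (raw bytes, index of the byte wrapped in angle brackets)."""
    data = bytearray()
    fault_index = None
    for i, tok in enumerate(text.split()):
        if tok.startswith("<") and tok.endswith(">"):
            fault_index = i          # first byte of the faulting instruction
            tok = tok[1:-1]
        data.append(int(tok, 16))
    return bytes(data), fault_index


def decode_mov_load(insn):
    """Decode only 'REX.W 8B /r' with mod=00 (plain [reg], no SIB/disp)."""
    rex, opcode, modrm = insn[0], insn[1], insn[2]
    if (rex & 0xF0) != 0x40 or not (rex & 0x08) or opcode != 0x8B:
        raise ValueError("not a 64-bit 'mov reg, r/m' instruction")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0 or rm in (4, 5):
        raise ValueError("addressing form not handled by this sketch")
    reg |= (rex & 0x04) << 1         # REX.R extends the destination register
    rm |= (rex & 0x01) << 3          # REX.B extends the base register
    return f"mov {REG64[reg]},[{REG64[rm]}]"


code_bytes, fault_index = split_code_line(CODE_FRAGMENT)
faulting = decode_mov_load(code_bytes[fault_index:fault_index + 3])
print(fault_index, faulting)
```

The load reads through rax, so a zero in rax at that point would match
the "segfault at 0" in the log.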

I didn't send the email ultimately because your email included a link
[1] pointing at a particular line number:

https://github.com/mdbooth/libldm/blob/master/src/ldmtool.c#L164

and so I assumed you actually traced the crash to that line.

Is that the case?

Or did you perhaps mistake *PID* 164 (from the kernel log) for the line
number?

...
 The instruction pointer
 being 563a225cd6a5, I installed libguestfs-tools-dbgsym and tried a:

   addr2line -e /usr/bin/ldmtool 564a892506a5

 Results:

   ??:0

 Without conviction, I also tried in GDB:

   $ gdb /usr/bin/ldmtool
   (...)
   Reading symbols from /usr/bin/ldmtool...
   Reading symbols from
 /usr/lib/debug/.build-id/21/37b4a64903ebe427c242be08b8d496ba570583.debug...
   (gdb) info line *0x564a892506a5
   No line number information available for address 0x564a892506a5

 Debug symbols are correctly installed, but it is impossible to convert
 that instruction pointer into a line number. It is as if the ldmtool on my
 host and the ldmtool in the qemu VM were from a different build. I
 tried to mount /tmp/.guestfs-1001/appliance.d/root but that disk image
 did not contain ldmtool.

 I am not sure how to generate a stack trace or a core dump within that
 qemu VM. If you can tell me how to get an interactive prompt (or any
 other guidance) I can try to collect more information.

The IP where the crash occurs is 0000563a225cd6a5. The ldmtool binary
(as opposed to a shared object / library) is mapped into the process's
address space at 563a225cb000, for a length of 0x3000 bytes. So the
offending instruction should sit at offset 0000563a225cd6a5 -
563a225cb000 = 0x26A5 from the start of that mapping.
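
For what it's worth, the same subtraction can be scripted. This is a
minimal Python sketch, assuming the kernel message has been joined back
onto a single line; the regular expression and the function name are my
own and only meant to match the format seen in this thread.

```python
import re

# Kernel log line from this thread, re-joined onto one line.
LOG_LINE = (
    "[    0.777352] ldmtool[164]: segfault at 0 ip 0000563a225cd6a5 "
    "sp 00007ffe54965a60 error 4 in ldmtool[563a225cb000+3000]"
)

# Illustrative pattern for the x86 "segfault at ..." message.
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<obj>[^\[]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_segfault(line):
    """Extract PID, faulting address and the IP's offset into the mapping."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("line does not look like a segfault message")
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    size = int(m["size"], 16)
    offset = ip - base               # offset of the IP from the mapping start
    if not 0 <= offset < size:
        raise ValueError("IP lies outside the reported mapping")
    return {
        "comm": m["comm"],
        "pid": int(m["pid"]),        # the number in brackets is the PID
        "fault_addr": int(m["addr"], 16),
        "offset": offset,
    }


info = parse_segfault(LOG_LINE)
print(f"pid={info['pid']} fault_addr={info['fault_addr']:#x} "
      f"offset={info['offset']:#x}")
```

Note that the offset is relative to the start of the mapping the kernel
reports, which is not necessarily the start of the file.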

With the debug symbols installed, can you attach the output of

  objdump --headers --wide -S /usr/bin/ldmtool

?

Can you try

  addr2line -p -i -f -e /usr/bin/ldmtool 26A5

?

(This still may not be good enough; we might have to offset the
difference 0x26A5 with some address related to the .text section... The
objdump output should help us experiment.)

Laszlo
