Hey everybody,
Richard, you were right. I managed to reproduce the same crash without
dealing with v2v (and libguestfs).
Actually, it's really easy to reproduce: I write a big file to /tmp
on L0 (until it is 100% full) and then run an L2 VM. Almost every time
it crashes with a double fault.
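For anyone who wants to try the fill step, here is a minimal sketch,
assuming a POSIX shell with dd and df available. The function name
fill_dir, the file name fill.bin and the optional size cap are made up
for illustration. Starting the L2 guest is left out because it depends
on your setup.

```shell
#!/bin/sh
# Sketch only: fill the filesystem that holds a directory with zeros.
# fill_dir, fill.bin and the MiB cap are illustrative names, not taken
# from the thread.

fill_dir() {
    dir=$1            # directory to fill, e.g. /tmp on L0
    max_mib=${2:-}    # optional cap in MiB; empty means "until full"

    if [ -n "$max_mib" ]; then
        # Capped run: writes exactly max_mib MiB, handy for a dry run.
        dd if=/dev/zero of="$dir/fill.bin" bs=1048576 count="$max_mib" 2>/dev/null
    else
        # Uncapped run: dd exits non-zero once the filesystem is full.
        # That is the expected outcome here, so the status is ignored.
        dd if=/dev/zero of="$dir/fill.bin" bs=1048576 2>/dev/null || true
    fi

    # Report how full the filesystem holding $dir is now.
    df -P "$dir"
}
```

Calling `fill_dir /tmp` on L0 should leave /tmp at 100%, after which the
L2 VM can be started. Remove /tmp/fill.bin afterwards to free the space.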
Debugging, debugging and more debugging.
Marcelo/Paolo, if you have any clue, I would like to hear from you.
Thanks,
Rom
On Fri, Jan 17, 2014 at 5:06 PM, Rom Freiman <rom(a)stratoscale.com> wrote:
Kashyap, just to be sure: does it happen to you during the v2v
conversion? On L2? While L1 and L0 work fine afterwards, right?
Thanks
On Fri, Jan 17, 2014 at 4:45 PM, Kashyap Chamarthy <kchamart(a)redhat.com> wrote:
> On 01/17/2014 03:38 PM, Richard W.M. Jones wrote:
>> On Fri, Jan 17, 2014 at 04:14:03PM +0200, Rom Freiman wrote:
>>> How do you know that the problem is with KVM/QEMU and not with libguestfs?
>>
>> The guestfsd daemon is simply running the regular 'mount' command.
>> The mount command causes the kernel to panic. There should be no
>> circumstances where running an ordinary command like that, albeit as
>> root, should cause the kernel to panic. Unless the kernel (or in this
>> case, something underneath the kernel) is broken.
>>
>> mount -o ro /dev/sdb /sysroot/
>> [ 12.645305] PANIC: double fault, error_code: 0x0
>> [ 12.645305] CPU: 0 PID: 141 Comm: mount Not tainted
>> 3.11.8-200.strato0002.fc19.strato.c3850ae03e9d.x86_64 #1
>> [ 12.645305] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
>> [ 12.645305] task: ffff88001cc816e0 ti: ffff88001cde6000 task.ti:
>> ffff88001cde6000
>> [ 12.645305] RIP: 0033:[<00007fa602c5b99b>] [<00007fa602c5b99b>]
>> 0x7fa602c5b99a
>> [ 12.645305] RSP: 002b:00007fff4f5884a0 EFLAGS: 00010216
>> [ 12.645305] RAX: 00007fa602008ff8 RBX: 00007fa601ff0000 RCX: 00007fa601ff0000
>> [ 12.645305] RDX: 00000000003b7068 RSI: 00007fff4f588560 RDI: 00007fa601ff3d18
>> [ 12.645305] RBP: 00007fff4f5885d0 R08: 00007fa60200f310 R09: 0000000000000000
>> [ 12.645305] R10: 0000000000000022 R11: 00007fa60200f310 R12: 00007fa60200e9b0
>> [ 12.645305] R13: 0000000000000000 R14: 0000000000000000 R15: 00007fa602e6e990
>> [ 12.645305] FS: 00007fa602e69880(0000) GS:ffff88001f000000(0000)
>> knlGS:0000000000000000
>> [ 12.645305] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>> [ 12.645305] CR2: 0000000000000000 CR3: 000000001d7fb000 CR4: 00000000000006f0
>> [ 12.645305]
>> [ 12.645305] Kernel panic - not syncing: Machine halted.
>> [ 12.645305] CPU: 0 PID: 141 Comm: mount Not tainted
>> 3.11.8-200.strato0002.fc19.strato.c3850ae03e9d.x86_64 #1
>> [ 12.645305] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
>> [ 12.645305] ffff88001f005f58 ffff88001f005e90 ffffffff8164024b
>> ffffffff819e89dc
>> [ 12.645305] ffff88001f005f08 ffffffff8163c272 0000000000000008
>> ffff88001f005f18
>> [ 12.645305] ffff88001f005eb8 ffffffff8163c8e5 0000000000000046
>> 00000000000000b1
>> [ 12.645305] Call Trace:
>> [ 12.645305] <#DF> [<ffffffff8164024b>] dump_stack+0x45/0x56
>> [ 12.645305] [<ffffffff8163c272>] panic+0xc8/0x1d7
>> [ 12.645305] [<ffffffff8163c8e5>] ? printk+0x67/0x69
>> [ 12.645305] [<ffffffff81048ae1>] df_debug+0x31/0x40
>> [ 12.645305] [<ffffffff810132ed>] do_double_fault+0x5d/0x80
>> [ 12.645305] [<ffffffff81650b88>] double_fault+0x28/0x30
>> [ 12.645305] <<EOE>>
>
> Correct.
>
> I encountered this same double_fault panic a week ago:
>
> http://kashyapc.fedorapeople.org/temp/double-fault-panic-nested-kvm-envir...
>
> With these versions:
>
> $ uname -r ; rpm -q libvirt qemu-system-x86
> 3.11.10-301.fc20.x86_64
> libvirt-1.1.3.1-2.fc20.x86_64
> qemu-system-x86-1.6.1-2.fc20.x86_64
>
> When I briefly discussed this double fault panic with Paolo Bonzini (KVM
> maintainer), he mentioned it is probably a host hypervisor bug. But this
> needs more investigation (ftrace for nested guest, x86info in L1 and L2
> - if possible).
>
>
> Answering Rom's earlier question ("Kashyap, can you please share your
> experience?"): Yes, nested virtualization with KVM and Intel is not
> *really* the most stable, but there's ongoing work upstream to improve
> this and fix bugs.
>
> Refer to these recent bugs I filed while in a nested KVM environment:
>
> https://bugzilla.kernel.org/show_bug.cgi?id=67761
> https://bugzilla.kernel.org/show_bug.cgi?id=68051
> https://bugzilla.kernel.org/show_bug.cgi?id=67751
>
>
>
> --
> /kashyap