[Adding Paolo and Vitaly, but FYI only as the bug seems to have an
upstream fix already.]
On Mon, Mar 26, 2018 at 09:13:45AM +0300, Roman Kagan wrote:
On Sat, Mar 24, 2018 at 03:11:12PM +0000, Richard W.M. Jones wrote:
> On Sat, Mar 24, 2018 at 03:08:16PM +0000, Tanmoy Sinha wrote:
> > Even though force_tcg works, I intend not to run it on emulation. Is there a
> > way I can run it over kvm? The other observation is, without force_tcg if I
> > use the machine type as pc-i440fx-2.1,accel=kvm it works fine. The
> > default machine type for my host is pc-i440fx-2.8, which seems to crib.
>
> I don't know, but this is basically a bug in VMware, so you need
> to ask them to fix their nested KVM-on-ESXi use case.
We've encountered this problem, too.
Strictly speaking, the bug is not in VMware, but rather in KVM: on
EPT_MISCONFIG vmexits it assumed that the processor sets the instruction
length field. This wasn't mandated by the spec, but real processors
did that. OTOH some hypervisors (VMware, Hyper-V) didn't do that for
the nested hypervisor. As a result, when handling MMIO the guest
instruction pointer didn't get advanced, i.e. the guest got stuck in an
infinite loop.
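
To illustrate the failure mode, here is a toy model in Python. It is not
KVM code: the function names, the address, and the instruction length are
all made up for illustration. It only shows why a handler that trusts an
unset (zero) instruction length never moves the guest past the MMIO
instruction, while one that works out the length itself does.

```python
# Toy model only; not KVM code. All names and values here are hypothetical.

MMIO_INSN_ADDR = 0x1000  # made-up address of the guest's MMIO instruction
MMIO_INSN_LEN = 3        # made-up true length of that instruction


def handle_mmio_exit(rip, reported_len, trust_hardware):
    """Return the guest instruction pointer after handling one MMIO exit."""
    if trust_hardware:
        # Advance by whatever the (possibly virtual) CPU reported.
        return rip + reported_len
    # Stand-in for decoding the instruction in software to find its length.
    return rip + MMIO_INSN_LEN


def exits_until_progress(reported_len, trust_hardware, max_exits=5):
    """Count exits until the guest moves past the MMIO instruction.

    Returns None if the guest is still stuck after max_exits exits.
    """
    rip = MMIO_INSN_ADDR
    for n in range(1, max_exits + 1):
        rip = handle_mmio_exit(rip, reported_len, trust_hardware)
        if rip != MMIO_INSN_ADDR:
            return n
    return None
```

With reported_len=3 (the length is filled in) the guest makes progress
after one exit. With reported_len=0 and trust_hardware=True it never does,
which corresponds to the hang described above.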
The difference between the old and the new machine type is that the
latter turns on a newer virtio protocol version that employs MMIO,
exposing this bug.
The fix is commit d391f1207067268261add0485f0f34503539c5b0 which went
into 4.16-rc1.
Can you (Tanmoy) please try a newer kernel inside the VMware guest?
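
A rough way to check whether the kernel running inside the VMware guest
is at least 4.16 is sketched below. The helper name kernel_at_least is
made up for this example. Note that it only compares version numbers:
a distro kernel with a lower version may carry a backport of the fix, so
a False result suggests, but does not prove, that the fix is missing.

```python
import platform
import re

FIX_VERSION = (4, 16)  # the fix went into 4.16-rc1


def kernel_at_least(release, minimum=FIX_VERSION):
    """Return True if a release string such as '4.15.0-112-generic'
    has a major.minor version of at least `minimum`."""
    m = re.match(r"(\d+)\.(\d+)", release)
    if m is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    return (int(m.group(1)), int(m.group(2))) >= minimum


if __name__ == "__main__":
    release = platform.release()  # same string as `uname -r` on Linux
    print(release, "-> 4.16 or newer:", kernel_at_least(release))
```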
Rich.
--
Richard Jones, Virtualization Group, Red Hat
http://people.redhat.com/~rjones
Read my programming and virtualization blog:
http://rwmj.wordpress.com
libguestfs lets you edit virtual machines. Supports shell scripting,
bindings from many languages.
http://libguestfs.org