On 07/03/2012 07:03 PM, Richard W.M. Jones wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=836710
https://bugzilla.redhat.com/show_bug.cgi?id=836913
There are at least two related bugs going on:
(1) Linux sync(2) system call doesn't send a write barrier to the
disk, so in effect it doesn't force the hard disk to flush its cache.
libguestfs used sync(2) to force changes to disk.
Surprising. So sync(2) is currently async. Ho hum.
I just noticed Jan Kara's patch set today actually:
https://lkml.org/lkml/2012/7/3/272
Would that fix the issue at the kernel level?
We didn't expect
that qemu was caching anything because we used 'cache=none' for all
writable disks, but it turns out that qemu creates a writeback cache
anyway when you do this (you need to use 'cache=directsync' when you
don't want a cache at all).
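To make the difference between the cache modes concrete, here is a rough sketch of how qemu's manual describes them. The table is a summary and not something stated in this thread. The helper name drive_arg and the file name in the usage note are purely illustrative.

```python
# Summary of qemu -drive cache modes (per qemu's documentation, not this
# thread).  "direct" means the host page cache is bypassed (O_DIRECT);
# "writeback" means qemu reports writes as complete before they are
# flushed, so the guest must issue explicit flushes for durability.
CACHE_MODES = {
    "writeback":    {"direct": False, "writeback": True},
    "writethrough": {"direct": False, "writeback": False},
    "none":         {"direct": True,  "writeback": True},
    "directsync":   {"direct": True,  "writeback": False},
}


def drive_arg(path, cache):
    """Build the value for a qemu -drive option (hypothetical helper)."""
    if cache not in CACHE_MODES:
        raise ValueError("unknown cache mode: %s" % cache)
    # qemu option parsing treats ',' as a separator; a literal comma in
    # a file name is written as ',,'.
    return "file=%s,cache=%s" % (path.replace(",", ",,"), cache)
```

For example, drive_arg("disk.img", "directsync") gives "file=disk.img,cache=directsync". Note that 'none' still has writeback set, which matches the surprise described above.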
And we're not using 'directsync' for performance reasons?
(2) qemu's qcow2 disk cache code is buggy. If there are I/Os in
flight when qemu shuts down, then qemu segfaults or fails an assertion.
This can result in unwritten data. Unfortunately libguestfs ignored
the result of waitpid(2) so we didn't see this problem happening.
Patch 1/7 fixes the first problem by issuing fsync(2) on each whole
block device when we sync.
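A minimal sketch of the idea in patch 1/7, written in Python rather than the C that the daemon uses. It is not the code from the patch. The function names are made up for illustration, and the list of devices is assumed to be supplied by the caller.

```python
import os


def flush_device(path):
    """Open path read-only and call fsync(2) on it.

    On a whole block device this is expected to make the kernel write
    out cached data and ask the disk to flush its own cache.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def sync_all(devices):
    """Call sync(2), then fsync each whole device in turn.

    Returns the list of paths that were flushed.
    """
    os.sync()
    flushed = []
    for dev in devices:
        flush_device(dev)
        flushed.append(dev)
    return flushed
```

Usage would be something like sync_all(["/dev/sda", "/dev/sdb"]), where the device names are only examples.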
Patches 2/7 - 7/7 are needed to fix the second problem. We add a new
API (guestfs_shutdown) so that we can actually catch the case where
qemu is segfaulting instead of just ignoring it. Since qemu itself
isn't likely to be fixed any time soon, patch 7/7 adds a crude but
effective workaround to virt-resize.
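To illustrate why checking the result of waitpid(2) matters, here is a small sketch that tells a clean exit apart from a crash. It is not the implementation of guestfs_shutdown. The helpers spawn_python and wait_and_check are hypothetical names, and a Python child process stands in for qemu.

```python
import os
import signal
import sys


def spawn_python(code):
    """Start a child Python process running `code`; return its pid."""
    return os.spawnv(os.P_NOWAIT, sys.executable,
                     [sys.executable, "-c", code])


def wait_and_check(pid):
    """Wait for pid and return (ok, message) instead of ignoring status."""
    _, status = os.waitpid(pid, 0)
    if os.WIFSIGNALED(status):
        sig = os.WTERMSIG(status)
        try:
            name = signal.Signals(sig).name
        except ValueError:
            name = "signal %d" % sig
        return False, "child killed by %s" % name
    if os.WIFEXITED(status):
        code = os.WEXITSTATUS(status)
        if code != 0:
            return False, "child exited with status %d" % code
        return True, "child exited cleanly"
    return False, "child ended with unexpected wait status %d" % status
```

A caller that gets ok == False back knows the child died abnormally and that pending writes may have been lost, which is the case that went unnoticed when the status was discarded.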
thanks for looking into this tricky issue so thoroughly,
Pádraig.