Re: [Libguestfs] Concurrent scanning of same disk

Thursday, 28 May 2015

2015-05-28 10:10 GMT+03:00 Richard W.M. Jones <rjones(a)redhat.com>:

...
 On Thu, May 28, 2015 at 09:48:41AM +0300, NoxDaFox wrote:
 > 2015-05-27 15:21 GMT+03:00 Richard W.M. Jones <rjones(a)redhat.com>:
 >
 > > On Wed, May 27, 2015 at 09:38:38AM +0300, NoxDaFox wrote:
 > > >  * RuntimeError: file receive cancelled by daemon - On r =
 > > > libguestfsmod.checksums_out (self._o, csumtype, directory, sumsfile)
 > > >  * RuntimeError: hivex_close: do_hivex_close: you must call 'hivex-open'
 > > > first to initialize the hivex handle - On r = libguestfsmod.inspect_os
 > > > (self._o)
 > >
 > > This error is likely to be -EIO (it's actually a bug in libguestfs
 > > that it doesn't report these properly in the error message).  However
 > > we cannot be sure unless you enable debugging and get the complete
 > > messages.
 > >
 > > http://libguestfs.org/guestfs-faq.1.html#debugging-libguestfs
 > >
 > > Rich.
 > >
 > >
 >
 > I'm starting to wonder whether these errors are due to the fact that I
 > compare snapshots of inconsistent disks. If so, is there a way to instruct
 > guestfs to ignore corrupted files?

 Are the snapshots "consistent"? - ie. taken in such a way that they
 provide a single point-in-time view across the whole disk?  You
 mentioned using 'qemu-img convert' before.  'qemu-img convert' on its
 own will not take a consistent snapshot (well, not unless you pause
 the guest during the copy, or you use some fancy new backup features
 recently added to qemu).

 > It's a bit challenging to generate such logs as the error appears every
 > now and then.
 > Here's the log related to a "RuntimeError: file receive cancelled by
 > daemon".
 [...]
 > mount -o  /dev/sda2 /sysroot/
 > The disk contains an unclean file system (0, 0).
 > The file system wasn't safely closed on Windows. Fixing.
 > libguestfs: trace: mount = 0
 > libguestfs: trace: checksums_out "sha1" "/" "/tmp/tmpAWHkYv"
 > guestfsd: main_loop: proc 1 (mount) took 2.02 seconds
 > guestfsd: main_loop: new request, len 0x38
 > cd /sysroot/ && find -type f -print0 | xargs -0 sha1sum
 > [   25.580340] perf interrupt took too long (2540 > 2500), lowering
 > kernel.perf_event_max_sample_rate to 50000
 > sha1sum: ./Windows/Prefetch/ReadyBoot/Trace7.fx: Value too large for
 > defined data type
 > [   67.835952] perf interrupt took too long (5048 > 5000), lowering
 > kernel.perf_event_max_sample_rate to 25000
 > [  143.304037] perf interrupt took too long (10010 > 10000), lowering
 > kernel.perf_event_max_sample_rate to 12500
 > pclose: /: Success
 > guestfsd: main_loop: proc 244 (checksums_out) took 245.25 seconds
 > libguestfs: trace: checksums_out = -1 (error)
 [...]
 >   File "/usr/lib/python2.7/dist-packages/guestfs.py", line 1427, in checksums_out
 >     r = libguestfsmod.checksums_out (self._o, csumtype, directory, sumsfile)
 > RuntimeError: file receive cancelled by daemon

 The error is confusing, but I think you are correct that it happens
 because the filesystem is unclean at the point at which it was
 snapshotted, maybe combined with partially written metadata which to
 the ntfs-3g driver looks like disk corruption.

 This is just what happens when you make inconsistent snapshots of
 disks, unfortunately.

 My best suggestion would be:

  - Catch the exception in Python

  - When you hit this error, skip this snapshot and go on to the next one

 That may involve rearchitecting your application a bit, but if the
 error is rare, it seems like the best way to handle it.
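
The two steps above could look roughly like the following. This is only a
sketch: scan_snapshots, checksum_snapshot and the /tmp/sums.txt path are
hypothetical names, not part of libguestfs, and it assumes that inspect_os
finds a single root filesystem that can be mounted read-only at "/". The
only behaviour taken from the log above is that the Python bindings report
the failure as a RuntimeError.

```python
def scan_snapshots(snapshots, scan_one):
    """Apply scan_one to each snapshot path, skipping those that fail.

    Returns (results, skipped): results maps path -> scan_one's return
    value; skipped is a list of (path, error message) pairs.
    """
    results = {}
    skipped = []
    for path in snapshots:
        try:
            results[path] = scan_one(path)
        except RuntimeError as exc:
            # The guestfs Python bindings raise RuntimeError on failure,
            # e.g. "file receive cancelled by daemon". Record and move on.
            skipped.append((path, str(exc)))
    return results, skipped


def checksum_snapshot(path, sumsfile="/tmp/sums.txt"):
    """Hypothetical scan_one: write SHA-1 sums of every file in one image."""
    import guestfs  # imported lazily so scan_snapshots works without it

    g = guestfs.GuestFS(python_return_dict=True)
    try:
        g.add_drive_opts(path, readonly=1)
        g.launch()
        roots = g.inspect_os()
        if not roots:
            raise RuntimeError("no operating system found in %s" % path)
        # Assumption: the first root can be mounted directly at "/".
        g.mount_ro(roots[0], "/")
        g.checksums_out("sha1", "/", sumsfile)
    finally:
        # Always shut the appliance down, even after an error.
        g.close()
    return sumsfile
```

It would be called as results, skipped = scan_snapshots(paths,
checksum_snapshot). Keeping the list of skipped snapshots lets you see
afterwards how often the error really occurs.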

 An alternative, if you're not doing it already, would be to take a
 consistent snapshot.  Assuming the guest is well-behaved and the
 filesystem uses journalling and the journalling is implemented
 correctly, a consistent snapshot should not have such errors.

 Rich. 

To create the snapshots I'm using the libvirt call snapshotCreateXML
with no flags set. Does libvirt support consistent snapshotting, or should I
rely only on the new QEMU backup feature?
