Hi Alex!
Thank you for your suggestions. Fiwalk seems great. I couldn't test it
yet, because the version in Debian does not support ext4 and because
upstream does not sign their images. (And suggesting the use of lots of
code verified only by GitHub HTTPS seems to open up a bigger hole than
the one I am trying to close.) I would be curious whether it produces
deterministic results, i.e. when run on the same image, it always
produces the very same report, and when run on a marginally different
image, the report is in roughly the same order with only marginal
changes. I will keep an eye on Fiwalk. One or the other issue may
resolve itself at some point.
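
To illustrate what "deterministic" means here, a minimal sketch in
Python, assuming the image's file system is already mounted read-only at
some directory. The function names are hypothetical and this is not how
Fiwalk works; it only hashes regular files and sorts the output by path,
so that a marginally different image yields a marginally different
report.

```python
import hashlib
import os


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_report(root):
    """Return 'digest  relative/path' lines for regular files under root.

    Lines are sorted by path, so the order does not depend on the order
    in which the file system happens to list directory entries.
    """
    lines = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full) or not os.path.isfile(full):
                continue  # symlinks and special files are skipped in this sketch
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            lines.append(f"{sha256_of_file(full)}  {rel}")
    lines.sort(key=lambda line: line.split("  ", 1)[1])
    return lines
```

Timestamps, ownership and permissions are left out on purpose; a real
report would have to decide which of those to record.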
Cheers,
adrelanos
Alex Nelson:
Hi all,
Chiming in here as someone who has worked on file system and Registry
differencing for a few years now. Taking diffs of a storage system is not
a straightforward task. Hopefully, this message saves you some
re-implementation heartache.
In the forensics world, there is a tool called Fiwalk, which enumerates the
contents of a file system and its metadata (with some basic data summaries,
including libmagic and checksums). The tool "idifference" compares file
system states and enumerates differences, using the Digital Forensics XML
output from Fiwalk.
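
To give a rough idea of that comparison, here is a minimal sketch in
Python. It is not idifference itself. It assumes each file appears in the
DFXML as a "fileobject" element with a "filename" child and "hashdigest"
children carrying a "type" attribute; please check those names against
the actual Fiwalk output before relying on them. The function names are
made up for this example.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace prefix such as '{uri}filename'."""
    return tag.rsplit("}", 1)[-1]


def load_file_hashes(dfxml_text, hash_type="sha1"):
    """Map filename -> digest for each fileobject in a DFXML document."""
    hashes = {}
    root = ET.fromstring(dfxml_text)
    for element in root.iter():
        if local_name(element.tag) != "fileobject":
            continue
        name, digest = None, None
        for child in element:
            tag = local_name(child.tag)
            if tag == "filename":
                name = child.text
            elif tag == "hashdigest" and child.get("type", "").lower() == hash_type:
                digest = child.text
        if name is not None:
            hashes[name] = digest
    return hashes


def diff_states(before, after):
    """Return sorted lists of new, deleted and changed filenames."""
    new = sorted(set(after) - set(before))
    deleted = sorted(set(before) - set(after))
    changed = sorted(n for n in set(before) & set(after) if before[n] != after[n])
    return new, deleted, changed
```

Comparing by filename alone misses renames and metadata-only changes,
which is part of why this is harder than it looks.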
A research publication on the forensic differencing process and idifference
is here:
http://dfrws.org/2012/proceedings/DFRWS2012-6.pdf
Fiwalk is a component of The SleuthKit, here:
https://github.com/sleuthkit/sleuthkit
If you wish to use Fiwalk on your images, you should convert any of your
disk images to a raw image or Expert Witness Format.
Actually, I don't suppose qemu-img has a FUSE-like wrapper that exposes the
underlying image as a raw file?
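
That does not answer the FUSE question, but the plain conversion route
can be sketched in Python as below. It assumes qemu-img is installed and
on the PATH; the helper names are hypothetical, and the only qemu-img
options used are "convert", "-f" (input format) and "-O" (output
format).

```python
import subprocess


def build_convert_command(source, destination, source_format=None):
    """Build a qemu-img command line that writes a raw copy of source."""
    command = ["qemu-img", "convert"]
    if source_format is not None:
        command += ["-f", source_format]  # e.g. "qcow2" or "vmdk"
    command += ["-O", "raw", source, destination]
    return command


def convert_to_raw(source, destination, source_format=None):
    """Run the conversion; raises CalledProcessError if qemu-img fails."""
    subprocess.run(
        build_convert_command(source, destination, source_format),
        check=True,
    )
```

The raw copy takes up to the full virtual disk size, so this is a
workaround rather than a substitute for an on-the-fly wrapper.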
DFXML has an entry on the Forensics Wiki:
http://www.forensicswiki.org/wiki/Category:Digital_Forensics_XML
As for your external-to-filesystem data question: I think you got the
essential non-file-system data. I can imagine data fragments from
past/shrunken file systems, or hidden-data regions that fall outside what's
recorded in the partition table. My imagination runs dry there, though.
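
For the regions outside the partition table, a minimal sketch in Python
of how one could list them on a raw image with a classic MBR. It assumes
512-byte sectors and only looks at the four primary entries; extended
partitions and GPT are not handled, and the function names are made up
for this example.

```python
import struct

SECTOR_SIZE = 512  # assumed; some disks use 4096-byte sectors


def parse_mbr_partitions(mbr):
    """Return sorted (start_lba, sector_count) for non-empty primary entries."""
    if len(mbr) < 512 or mbr[510:512] != b"\x55\xaa":
        raise ValueError("no MBR boot signature found")
    partitions = []
    for index in range(4):
        # The partition table starts at byte 446; each entry is 16 bytes.
        entry = mbr[446 + 16 * index: 446 + 16 * (index + 1)]
        start_lba, count = struct.unpack_from("<II", entry, 8)
        if entry[4] != 0 and count > 0:  # byte 4 is the partition type
            partitions.append((start_lba, count))
    return sorted(partitions)


def unallocated_ranges(partitions, total_sectors):
    """Return (first, last) sector ranges not covered by any partition."""
    gaps = []
    cursor = 1  # sector 0 holds the MBR itself
    for start, count in partitions:
        if start > cursor:
            gaps.append((cursor, start - 1))
        cursor = max(cursor, start + count)
    if cursor < total_sectors:
        gaps.append((cursor, total_sectors - 1))
    return gaps
```

Hashing each returned range (offset = sector * SECTOR_SIZE) alongside the
MBR and the per-partition data would cover every sector of the image at
least once.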
--Alex
On Fri, Nov 22, 2013 at 12:56 PM, adrelanos <adrelanos(a)riseup.net> wrote:
> Thank you all for your suggestions!
>
> Richard W.M. Jones:
>> I keep meaning to write a comprehensive "virt-diff" tool. I needed it
>> myself just yesterday.
>
> Most interesting. I guess there are two reasons for creating such a
> tool: just compare the images (show the diff) and/or check for malicious
> additions in the other image.
>
> Did you consider implementing the former or both?
>
> Do you think it's realistic to compare vm images with the goal of
> eventually finding deliberately hard to detect (malicious) changes?
>
> At the moment I am not trying to write a virt-diff like tool, but
> something simpler. A tool to create a report of all of a vm image's
> contents. (Checksums for all files, filesystem, for MBR and Volume Boot
> Record.) When publishing VM images, it might be useful to publish such a
> report together with the image, so that others who rebuild from source
> can be certain they ended up with a very similar image. Given two such
> reports, one could easily build a virt-diff-like tool.
>
>> although that *only*
>> compares files, not the other data outside the filesystem
>
> What other data can there be outside the filesystem?
>
> I can think of:
>
> - MBR
> - Volume Boot Record
>
> Anything else?
>
> If these have been compared, the compared image should be as safe to use
> as the original one?
>
> (I could imagine that there can be extra data outside filesystem, maybe
> in regions outside the partition table, but those data shouldn't get
> executed after starting the image in a VM.)
>
> Cheers,
> adrelanos
>
> _______________________________________________
> Libguestfs mailing list
> Libguestfs(a)redhat.com
> https://www.redhat.com/mailman/listinfo/libguestfs
>