Hi all,
Chiming in here as someone who has worked on file system and Registry differencing for a few years now. Taking diffs of a storage system is not a straightforward task, so hopefully this message saves you some re-implementation heartache.
In the forensics world, there is a tool called Fiwalk, which enumerates the contents of a file system along with its metadata (plus some basic data summaries, including libmagic file-type identification and checksums). A second tool, "idifference", compares two file system states and enumerates the differences between them, using the Digital Forensics XML (DFXML) output from Fiwalk as its input.
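To give a rough feel for what differencing two DFXML files involves, here is a minimal Python sketch. It is not how idifference itself works: it only matches files by path and compares size and hash, so it will miss renames and anything else that needs smarter matching. The function names, the sample file names, the hash strings, and the placeholder namespace are all made up for illustration; the only assumption about DFXML is that each file appears as a "fileobject" element with "filename", "filesize", and "hashdigest" children.

```python
import xml.etree.ElementTree as ET

# Tiny hand-written stand-ins for Fiwalk output. The namespace URI and the
# hash values are placeholders, not real DFXML or real digests.
BEFORE = """<dfxml xmlns="urn:example:dfxml">
  <fileobject><filename>a.txt</filename><filesize>3</filesize><hashdigest type="sha1">aaa</hashdigest></fileobject>
  <fileobject><filename>b.txt</filename><filesize>5</filesize><hashdigest type="sha1">bbb</hashdigest></fileobject>
  <fileobject><filename>c.txt</filename><filesize>7</filesize><hashdigest type="sha1">ccc</hashdigest></fileobject>
</dfxml>"""

AFTER = """<dfxml xmlns="urn:example:dfxml">
  <fileobject><filename>a.txt</filename><filesize>3</filesize><hashdigest type="sha1">aaa</hashdigest></fileobject>
  <fileobject><filename>b.txt</filename><filesize>6</filesize><hashdigest type="sha1">b2b</hashdigest></fileobject>
  <fileobject><filename>d.txt</filename><filesize>1</filesize><hashdigest type="sha1">ddd</hashdigest></fileobject>
</dfxml>"""


def local(tag):
    """Strip any '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def load_fileobjects(dfxml_text):
    """Map each file path to a dict of its size and hash digests."""
    files = {}
    for elem in ET.fromstring(dfxml_text).iter():
        if local(elem.tag) != "fileobject":
            continue
        record = {}
        for child in elem:
            name = local(child.tag)
            if name == "filename":
                record["filename"] = child.text
            elif name == "filesize":
                record["filesize"] = int(child.text)
            elif name == "hashdigest":
                record[child.get("type", "unknown").lower()] = child.text
        if "filename" in record:
            files[record.pop("filename")] = record
    return files


def diff_states(before, after):
    """Classify paths as new, deleted, or changed between two states."""
    common = set(before) & set(after)
    return {
        "new": sorted(set(after) - set(before)),
        "deleted": sorted(set(before) - set(after)),
        "changed": sorted(p for p in common if before[p] != after[p]),
    }


print(diff_states(load_fileobjects(BEFORE), load_fileobjects(AFTER)))
```

Even this toy version shows where the heartache comes from: the moment a file is renamed or moved, path-based matching reports it as one deletion plus one new file.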
A research publication on the forensic differencing process and idifference is here:
Fiwalk is a component of The Sleuth Kit, here:
If you wish to use Fiwalk on your images, you will first need to convert each disk image to either a raw image or Expert Witness Format.
Actually, I don't suppose qemu-img has a FUSE-like wrapper that exposes the underlying image as a raw file?
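In the meantime, the conversion to raw can be scripted. A minimal sketch, assuming qemu-img is installed and on your PATH; the helper names and the file names in the usage comment are hypothetical. It relies only on the standard "qemu-img convert" options -f (source format) and -O (output format).

```python
import subprocess


def build_convert_command(src, dst, src_format=None):
    """Build the qemu-img argument list for converting src to a raw image."""
    cmd = ["qemu-img", "convert"]
    if src_format is not None:
        # Naming the source format avoids relying on qemu-img's autodetection.
        cmd += ["-f", src_format]
    cmd += ["-O", "raw", src, dst]
    return cmd


def convert_to_raw(src, dst, src_format=None):
    """Run the conversion; raises CalledProcessError if qemu-img fails."""
    subprocess.run(build_convert_command(src, dst, src_format), check=True)


# Hypothetical usage:
# convert_to_raw("guest.qcow2", "guest.raw", src_format="qcow2")
```

Keep in mind the raw copy takes up to the full virtual disk size, which is why a wrapper exposing the image in place would be nicer.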
DFXML has an entry on the Forensics Wiki:
As for your question about data external to the file system: I think you covered the essential non-file-system data. Beyond that, I can imagine data fragments left over from past or shrunken file systems, or hidden-data regions that fall outside what's recorded in the partition table. My imagination runs dry there, though.
--Alex