On Mon, Feb 27, 2023 at 6:41 PM Richard W.M. Jones <rjones(a)redhat.com> wrote:
On Mon, Feb 27, 2023 at 04:24:33PM +0200, Nir Soffer wrote:
> On Mon, Feb 27, 2023 at 3:56 PM Richard W.M. Jones <rjones(a)redhat.com> wrote:
> >
> > https://github.com/kubevirt/containerized-data-importer/issues/1520
> >
> > Hi Eric,
> >
> > We had a question from the Kubevirt team related to the above issue.
> > The question is roughly if it's possible to calculate the checksum of
> > an image as an nbdkit filter and/or in the qemu block layer.
> >
> > Supplemental #1: could qemu-img convert calculate a checksum as it goes
> > along?
> >
> > Supplemental #2: could we detect various sorts of common errors, such
> > as a web server that is incorrectly configured and serves up an error
> > page containing "<html>"; or something which is supposed to be a disk
> > image but does not "look like" (in some ill-defined sense) a disk
> > image, e.g. it has no partition table.
> >
> > I'm not sure if qemu has any existing features covering the above (and
> > I know for sure that nbdkit doesn't).
> >
> > One issue is that calculating a checksum involves a linear scan of the
> > image, although we can at least skip holes.
>
> Kubevirt can use blksum:
>
> https://fosdem.org/2023/schedule/event/vai_blkhash_fast_disk/
>
> But we need to package it for Fedora/CentOS Stream.
>
> I am also working on "qemu-img checksum"; getting more reviews on this
> can help.
>
> Latest version:
> https://lists.nongnu.org/archive/html/qemu-block/2022-11/msg00971.html
>
> The latest reviews are here:
> https://lists.nongnu.org/archive/html/qemu-block/2022-12/
>
> More work is needed on the testing framework changes.

I think it would be more useful if (or in addition) it could compute
the checksum of a stream which is being converted with 'qemu-img
convert'. Extra points if it can compute the checksum over either the
input or output stream.

I thought about this. It could be a filter that you add to the graph,
which gives you the checksum as a side effect of copying. But this
requires disabling unordered writes, which is pretty bad for performance.
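
To illustrate why a single streaming checksum needs ordered writes, here
is a minimal Python sketch. It is not qemu code: the chunk list and the
stream_digest() helper are made up for the example, and SHA-256 is just an
assumed choice of hash. A streaming hash depends on the order in which the
data is fed to it, so the same chunks completing out of order give a
different digest.

```python
import hashlib


def stream_digest(chunks):
    """Feed (offset, data) chunks to one SHA-256 state in arrival order."""
    h = hashlib.sha256()
    for _offset, data in chunks:
        h.update(data)
    return h.hexdigest()


# Hypothetical 12-byte image split into three (offset, data) chunks.
image = [(0, b"A" * 4), (4, b"B" * 4), (8, b"C" * 4)]

# Digest of the whole image, computed in one go.
reference = hashlib.sha256(b"AAAABBBBCCCC").hexdigest()

# Chunks arriving in offset order reproduce the reference digest.
in_order = stream_digest(image)

# The same chunks completing out of order, as unordered writes allow,
# produce a different digest even though the copied data is identical.
out_of_order = stream_digest([image[1], image[0], image[2]])

print(in_order == reference)
print(out_of_order == reference)
```

So a filter that hashes data as it passes through either has to see the
writes in offset order, or has to use a scheme that does not depend on
arrival order.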

But even if you compute the checksum during a transfer, you still want to
verify it by reading the transferred data back from storage. Once you have
computed the checksum, you can keep it for verifying the same image in the
future.
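
A rough Python sketch of the read-back verification, under some
assumptions: image_digest() and verify_image() are hypothetical helper
names, SHA-256 is an assumed hash, and the code hashes the raw bytes of
the file. For a non-raw format such as qcow2 the file bytes are not the
same as the guest-visible disk content, so this only fits raw images.

```python
import hashlib


def image_digest(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of the file at path, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read sequentially so large images do not need to fit in memory.
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()


def verify_image(path, expected_hex):
    """Re-read the stored image and compare it with a saved checksum."""
    return image_digest(path) == expected_hex
```

The checksum would be computed once after the transfer, stored next to the
image, and passed to verify_image() whenever the same image needs to be
checked again.
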
Nir