On 3/21/19 7:18 AM, Martin Kletzander wrote:
> For a read-only client: caching 'data' regions is okay, caching 'zero'
> or 'hole' regions is bad (because even though you are not modifying the
> image, another writer might be; demoting 'hole' to 'data' is safe - it
> merely pessimizes into a read() instead of skipping work; but caching
> 'hole' that is later promoted to 'data' is wrong - it can cause data
> loss if the client doesn't read the actual data).
>
> For a writing client: either you are an exclusive writer (and should
> know what you wrote, so the cache is the fact that you changed the state
> yourself) or you are on a cluster filesystem (at which point, your
> cluster system better have its own rules for how to resolve races
> inherent in multiple writers, where you shouldn't be relying on block
> status but on the cluster protocol in the first place).
>
> Even for the reading client, you don't need to access a place on the
> disk twice; even one access is racy, because there can be a change
> between BLOCK_STATUS and READ. The same thing happens in the plugins
> for files and for anything else that someone else can access. I don't
> think it is designed for concurrent access. Or is it?

Indeed, there is always a TOCTTOU race when you rely on block status if
there is ever a concurrent writer.  But is it dangerous?  Without a
block status, we can have either:

  reader                        writer
  --------------------------------------------------
  read sector X as A
                                write sector X as B

or:

  reader                        writer
  --------------------------------------------------
                                write sector X as B
  read sector X as B

With the two steps, we have one of:

  reader                        writer
  --------------------------------------------------
  learn sector X has status Y
  read sector X as A
                                write sector X as B

  reader                        writer
  --------------------------------------------------
  learn sector X has status Y
                                write sector X as B
  read sector X as B

  reader                        writer
  --------------------------------------------------
                                write sector X as B
  learn sector X has status Y
  read sector X as B

where the shortcut is that if the reader sees status 'hole', it skips a
read. Had it done the read in spite of learning about a hole, it would
either see all 0s (contents A - but that's no different than read
winning the race without a status check), or the new content (contents B
just written by the writer - proving the hole status is out of date, but
no different from losing the race without a status check).
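
To make that argument concrete, here is a rough sketch in Python (not
nbdkit code; the names run() and outcomes() and the one-sector "disk"
are made up purely for illustration).  It replays every interleaving
above, assuming sector X starts as a hole reading as zeros (contents A)
and the writer stores contents B, and collects what the reader ends up
with, both with and without the status shortcut:

```python
from itertools import permutations

ZERO = b"\0"  # contents A: sector X starts as a hole that reads as zeros
NEW = b"B"    # contents B: what the concurrent writer stores


def run(order, use_status):
    """Replay one interleaving; return the contents the reader ends up with."""
    disk = {"data": ZERO, "status": "hole"}
    seen_status = None
    result = None
    for step in order:
        if step == "write":
            disk["data"] = NEW
            disk["status"] = "data"
        elif step == "status":
            seen_status = disk["status"]
        elif step == "read":
            if use_status and seen_status == "hole":
                result = ZERO  # shortcut: skip the read, assume zeros
            else:
                result = disk["data"]
    return result


def outcomes(use_status):
    """Collect the reader's possible results over all valid interleavings."""
    steps = ["status", "read", "write"] if use_status else ["read", "write"]
    results = set()
    for order in permutations(steps):
        # The reader always asks for status before it reads.
        if use_status and order.index("status") > order.index("read"):
            continue
        results.add(run(order, use_status))
    return results
```

Under those assumptions, outcomes(True) and outcomes(False) are the same
set: the reader sees either A or B, with or without the status check.
The sketch only models a single uncached status query; it says nothing
about caching a 'hole' result across later reads.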

In general, trying to copy an image while it is being modified is not
going to work reliably; the main point of block status is to make
copying more efficient, but when copying, you are assuming no concurrent
modifications.
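
As an illustration of the efficiency gain, a minimal sketch in Python of
a copy loop that skips holes.  The (offset, length, is_hole) extent
tuples and the copy_with_status() helper are hypothetical and are not
the NBD or nbdkit API; the sketch also assumes the destination already
reads as zeros and that nothing modifies the source during the copy:

```python
import io


def copy_with_status(src, dst, extents):
    """Copy src to dst, reading only the extents not reported as holes.

    extents is an iterable of (offset, length, is_hole) tuples, a made-up
    shape for this sketch.  dst must already read as zeros wherever
    nothing is written.  Returns the number of bytes actually read.
    """
    bytes_read = 0
    for offset, length, is_hole in extents:
        if is_hole:
            continue  # skip the read; dst is already zero here
        src.seek(offset)
        chunk = src.read(length)
        dst.seek(offset)
        dst.write(chunk)
        bytes_read += len(chunk)
    return bytes_read


# Example: 4 bytes of data, an 8-byte hole, 4 more bytes of data.
image = b"AAAA" + bytes(8) + b"BBBB"
source = io.BytesIO(image)
target = io.BytesIO(bytes(len(image)))  # pre-zeroed destination
extent_map = [(0, 4, False), (4, 8, True), (12, 4, False)]
copied = copy_with_status(source, target, extent_map)
```

Here only 8 of the 16 bytes are read, yet the target ends up identical
to the source.  If a writer turned the hole into data after the extent
map was fetched, the copy would silently miss it, which is why the
no-concurrent-modification assumption matters.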
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3226
Virtualization:  qemu.org | libvirt.org