On 11/21/18 10:17 AM, Richard W.M. Jones wrote:
Actually I think we are going to need to retain the block cache. It
solves a slightly different problem from placing the cache filter on
top (in fact both are useful).
Let's say you have an XZ file with a 100,000 byte block size. Then
reading the two ranges 0-1000 and 1000-2000 would result in reading and
uncompressing the same whole block twice. The block cache in the xz
plugin/filter avoids this; the cache on top does not.
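To make the point concrete, here is a minimal sketch of a one-block
decompression cache of the kind described above. This is not the real
nbdkit-xz code; the names (cached_pread, decompress_block) and the fake
decompressor are invented for illustration only.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define BLOCK_SIZE 100000       /* the 100,000 byte example block size */

struct block_cache {
  uint64_t start;               /* file offset of the cached block */
  int valid;
  uint8_t data[BLOCK_SIZE];
};

static int decompressions;      /* counts block decompressions */

/* Stand-in for uncompressing one whole XZ block starting at 'start'. */
static void
decompress_block (uint64_t start, uint8_t *buf)
{
  decompressions++;
  memset (buf, (int)(start / BLOCK_SIZE), BLOCK_SIZE);  /* fake data */
}

/* Read 'count' bytes at 'offset', decompressing each block at most
 * once while consecutive reads stay within the same block. */
static void
cached_pread (struct block_cache *c, uint8_t *out,
              uint64_t offset, size_t count)
{
  while (count > 0) {
    uint64_t start = offset - (offset % BLOCK_SIZE);
    if (!c->valid || c->start != start) {
      decompress_block (start, c->data);
      c->start = start;
      c->valid = 1;
    }
    size_t in_block = BLOCK_SIZE - (size_t)(offset - start);
    size_t n = count < in_block ? count : in_block;
    memcpy (out, c->data + (offset - start), n);
    out += n;
    offset += n;
    count -= n;
  }
}
```

With this cache, reading 0-1000 and then 1000-2000 decompresses the
block once; without it, each read would decompress the block again.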
Interesting factoid:
www.mirrorsite.org rapidly throttles any
connection that makes repeated range requests ... However, if you open
a new connection it is unaffected by the throttling on the existing
connection (I had thought it would throttle based on IP address). Anyway
this, combined with the large block size in the Fedora Cloud image,
makes xz + curl virtually unusable.
I also think the new filter would be better if it made larger reads.
The plugin makes 8K reads (BUFSIZ) which is likely reasonable for
reading from a local file. But the overhead of reading from the curl
plugin probably makes much larger reads sensible. I wonder if the
filter can intuit a good block size to use somehow?
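One way a filter could make larger reads is to widen each request to a
larger aligned granule before calling down to the plugin, so a curl
backend sees a few big range requests instead of many 8K ones. The
sketch below is an assumption about how that might look, not the real
filter API: widened_pread, underlying_pread, and the 2 MB granule are
all invented.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define GRANULE (2 * 1024 * 1024)   /* assumed preferred read size */

static uint64_t last_offset;        /* what the lower layer was asked */
static size_t last_count;           /* for, recorded for illustration */

/* Stand-in for the underlying plugin's pread (e.g. curl). */
static void
underlying_pread (uint8_t *buf, uint64_t offset, size_t count)
{
  last_offset = offset;
  last_count = count;
  for (size_t i = 0; i < count; i++)
    buf[i] = (uint8_t)((offset + i) & 0xff);    /* fake pattern */
}

/* Serve a small read by issuing one large aligned read downward. */
static void
widened_pread (uint8_t *out, uint64_t offset, size_t count,
               uint64_t devsize)
{
  uint64_t lo = offset - (offset % GRANULE);
  uint64_t hi = offset + count;
  hi = ((hi + GRANULE - 1) / GRANULE) * GRANULE;  /* round up */
  if (hi > devsize)
    hi = devsize;                                 /* clamp to device */
  uint8_t *tmp = malloc (hi - lo);
  underlying_pread (tmp, lo, (size_t)(hi - lo));
  memcpy (out, tmp + (offset - lo), count);
  free (tmp);
}
```

A real filter would combine this with the cache above, otherwise the
extra data fetched by the widened read is simply thrown away.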
Yes, we need to revisit adding block sizing to nbdkit: filters could
easily optimize based on the preferred block size of the layer below,
while possibly advertising a different block size up to the client. The
existing nbdkit-blocksize-filter would then gain some smarts, making it
more useful for controlling sizes between layers (which brings us back
to the question of whether we should improve nbdkit filters to allow
using the same filter more than once on a single plugin).
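As a rough illustration of what "advertising a different blocksize up
to the client" could mean, a filter might take the plugin's advertised
minimum/preferred/maximum sizes and compute its own preference, rounded
to stay compatible with the layer below. The struct and function names
here are hypothetical, not the nbdkit filter API.

```c
#include <stdint.h>

/* Hypothetical block-size triple, as an NBD server might advertise. */
struct sizes {
  uint32_t minimum, preferred, maximum;
};

/* Compute what the filter advertises upward, given the plugin's sizes
 * and the filter's own wish (e.g. a large size for a curl backend). */
static struct sizes
filter_block_size (struct sizes plugin, uint32_t want_preferred)
{
  struct sizes s = plugin;
  uint32_t m = plugin.minimum ? plugin.minimum : 1;

  /* Round the wish up to a multiple of the plugin's minimum. */
  s.preferred = ((want_preferred + m - 1) / m) * m;
  if (s.preferred > s.maximum)
    s.preferred = s.maximum;          /* never exceed the plugin's max */
  if (s.preferred < plugin.preferred)
    s.preferred = plugin.preferred;   /* never shrink below the plugin */
  return s;
}
```

For example, over a plugin advertising 512/4096/32M, a filter wanting
roughly a 100,000 byte granule would advertise the next 512-aligned
size up.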
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org