On Wed, Jan 17, 2018 at 08:01:35PM -0600, Eric Blake wrote:
Actually, thinking about this more:
When I added zero support, I documented in commit ac3f294a that for the
file plugin, wdelay is indeed doubled on systems lacking efficient zero
support. But the fallback for handling EOPNOTSUPP is currently done at
the plugins.c level (ie. it is part of next->zero()) - so a filter
should never see a plugin returning EOPNOTSUPP. Or put another way,
your movement of wdelay from being part of the file plugin to now being
a separate filter actually FIXES the double delay. On the other hand, a
filter can itself return EOPNOTSUPP (rather than calling next->zero), in
which case we still need to support the fallback to .pwrite higher up
in the call stack
OK, so ignore my previous email.
However, I'm wary of doing fallbacks in general in the filter layer,
for a couple of reasons:
(1) If you run some filters twice (eg. offset filter) then you end up
doubling offsets which will cause data corruption. This doesn't apply
in the EOPNOTSUPP case, but it does apply in the .zero fallback case
(where it falls back to .write).
(2) Filters are a new thing and we can just declare that they cannot
use EOPNOTSUPP.
Along the same lines, with my FUA patches, if we keep the fallback
of
calling .flush in connections.c, then the flush will pass through the
entire filter chain; but I had proposed moving the fallback flush into
plugins.c, at which point the fallback flush is done at the layer that
lacks FUA support. And maybe we want a flush delay to mirror the
existing rdelay and wdelay (simulating a long flush time can indeed be
useful in tests); where it definitely matters whether the flush delay is
triggered as part of FUA fallback, or only triggered on an actual
NBD_CMD_FLUSH from the client.
I'm also thinking ahead to expansions - for example, there are proposals
to add resize and block status commands to NBD. In your implementation,
if a filter does not define .can_flush or .flush, then its .can_flush
implementation merely returns whatever the underlying backend's version
would report. But as we add new callbacks, such as a new .can_resize,
we need to make sure we have sane defaults. For example, the offset
filter should probably not allow resizes (even if the underlying plugin
does). If the offset filter compiled in a version of nbdkit that lacks
the .can_resize hook is run by nbdkit that has added the hook, we
therefore want to default .can_resize to false as part of loading the
smaller filter struct into memory (and NOT default to falling through to
the plugin's .can_resize); a default based on fall-through should only
happen if the filter had a size large enough to admit that the user
explicitly omitted the callback in the filter, rather than it being NULL
because of version mismatch.
We may have to say that the ABI guarantee only applies to plugins,
which might not be a bad thing since filters are much harder to write
than plugins and are going to be quite closely tied to both the
version and precise implementation of nbdkit.
Rich.
--
Richard Jones, Virtualization Group, Red Hat
http://people.redhat.com/~rjones
Read my programming and virtualization blog:
http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine. Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/