On 01/18/2018 02:55 AM, Richard W.M. Jones wrote:
On Wed, Jan 17, 2018 at 08:01:35PM -0600, Eric Blake wrote:
> Actually, thinking about this more:
>
> When I added zero support, I documented in commit ac3f294a that for the
> file plugin, wdelay is indeed doubled on systems lacking efficient zero
> support. But the fallback for handling EOPNOTSUPP is currently done at
> the plugins.c level (ie. it is part of next->zero()) - so a filter
> should never see a plugin returning EOPNOTSUPP. Or put another way,
> your movement of wdelay from being part of the file plugin to now being
> a separate filter actually FIXES the double delay. On the other hand, a
> filter can itself return EOPNOTSUPP (rather than calling next->zero), in
> which case we still need to support the fallback to .pwrite higher up
> in the call stack
OK, so ignore my previous email.
However, I'm wary of doing fallbacks in general in the filter layer,
for a couple of reasons:
(1) If you run some filters twice (eg. offset filter) then you end up
doubling offsets which will cause data corruption. This doesn't apply
in the EOPNOTSUPP case, but it does apply in the .zero fallback case
(where it falls back to .write).
Good point - we have to be sure that any fallbacks pass the original
parameters, not munged parameters, when calling back into the same layer.
(2) Filters are a new thing and we can just declare that they cannot
use EOPNOTSUPP.
Yes, that's also reasonable.
> Along the same lines, with my FUA patches, if we keep the fallback of
> calling .flush in connections.c, then the flush will pass through the
> entire filter chain; but I had proposed moving the fallback flush into
> plugins.c, at which point the fallback flush is done at the layer that
> lacks FUA support. And maybe we want a flush delay to mirror the
> existing rdelay and wdelay (simulating a long flush time can indeed be
> useful in tests); where it definitely matters whether the flush delay is
> triggered as part of FUA fallback, or only triggered on an actual
> NBD_CMD_FLUSH from the client.
>
> I'm also thinking ahead to expansions - for example, there are proposals
> to add resize and block status commands to NBD. In your implementation,
> if a filter does not define .can_flush or .flush, then its .can_flush
> implementation merely returns whatever the underlying backend's version
> would report. But as we add new callbacks, such as a new .can_resize,
> we need to make sure we have sane defaults. For example, the offset
> filter should probably not allow resizes (even if the underlying plugin
> does). If the offset filter compiled in a version of nbdkit that lacks
> the .can_resize hook is run by nbdkit that has added the hook, we
> therefore want to default .can_resize to false as part of loading the
> smaller filter struct into memory (and NOT default to falling through to
> the plugin's .can_resize); a default based on fall-through should only
> happen if the filter had a size large enough to admit that the user
> explicitly omitted the callback in the filter, rather than it being NULL
> because of version mismatch.
We may have to say that the ABI guarantee only applies to plugins,
which might not be a bad thing since filters are much harder to write
than plugins and are going to be quite closely tied to both the
version and precise implementation of nbdkit.
Here's a thought: you've already mentioned that for adding FUA to
nbdkit-plugins.h, we would want to have the user choose between
'NBDKIT_PLUGIN_LEVEL 2' for new or 1 (default) for old, in addition to
the existing define for NBDKIT_API_VERSION 1. The struct already has an
_api_version field; my original thought was to leave that at 1, and then
either declare by documentation that old nbdkit can't load a
PLUGIN_LEVEL 2 plugin, or else provide automatic shims for any plugin
compiled to PLUGIN_LEVEL 2 so that it can still be run by an older
nbdkit. If we do the former, then we are really bumping API version
(rather than just documenting that new plugins can't run on old nbdkit,
we can explicitly set the _api_version field to 2, and old nbdkit will
refuse to load the plugin); only if we do the latter would we keep
_api_version at 1.
Then, in nbdkit proper, as part of loading a plugin, we would ensure
that nbdkit can load either version 1 or version 2 plugins in new
nbdkit, while old nbdkit is hard-wired to only load version 1 plugins.
Providing our ABI guarantee means that newer nbdkit must always be able
to load any version of plugin, from 1 to the level currently defined at
the time nbdkit was compiled. But your statement that filters are
stricter and don't have to provide ABI guarantees means that a similar
change to filter callback signatures would be an unconditional
_api_version bump (with no way for the user to opt into the older
spelling), and nbdkit performing loads of a filter would insist on exact
match rather than anything from 1 to the current version.
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization:
qemu.org |
libvirt.org