On 02/08/22 21:29, Eric Blake wrote:
 On Tue, Feb 08, 2022 at 12:39:02PM +0100, Laszlo Ersek wrote:
> On 02/04/22 17:44, Eric Blake wrote:
>> While writing tests for an nbdcopy bug, I found myself wanting a way
>> to easily view an entire image as data, but without disabling extents
>> support altogether.  The existing extentlist filter can do this, but
>> requires a secondary file.
>>
>> Still an RFC because I need testsuite coverage similar to
>> test-nozero.sh (eek - we don't have any direct test of the noextent
>> filter, but only indirect coverage through other tests).  Also, are
>> there any other extentmode=MODE values that might make sense?
>> ---
 
>> +=item B<extentmode=mask>
>> +
>> +Extent support is not advertised to the client; clients should not
>> +query for extent information, and must assume the entire disk is
>> +allocated.
>> +
>> +This is the default if the C<extentmode> parameter is not specified.
>> +
>> +=item B<extentmode=data>
>> +
>> +(nbdkit E<ge> 1.30)
>> +
>> +Extent support is advertised, but extent requests from the client will
>> +be answered with a claim that the entire disk forms a single allocated
>> +data extent.
>> +
>> +=back
>> +
>> +All other parameters are passed through to and processed by the
>> +underlying plugin in the normal way.
>
> Is this necessary to spell out? (Looked at a random other filter's
> documentation, and didn't see such a statement, so I think it's the
> default.)
>
> (The same question applies to "plugin-args" in the synopsys, more or
> less...)
 
 Hmm, we aren't always consistent, but I agree that it can be pruned
 without loss.
 
>> +/* Produce single data extent. */
>> +static int
>> +noextents_extents (nbdkit_next *next,
>> +                   void *handle, uint32_t count, uint64_t offset,
>> +                   uint32_t flags,
>> +                   struct nbdkit_extents *ret_extents,
>> +                   int *err)
>> +{
>> +  assert (extentmode == DATA);
>> +  return nbdkit_add_extent (ret_extents, offset, count, 0);
>>  }
>
> I don't have the context in which this function may be called, in my
> head, but this implementation seems to reflect precisely the requested
> offset range back at the client, so a question arises:
>
> - What if the client requests an offset range that (at least partially)
> exceeds the size of the file? In that case, I think we should not report
> that range as existent. For example, the chunked block status query in
> virt-v2v asks for 2GB offset ranges, and it's expected that the returned
> extent list will not exceed the file size.
 
 The code in server/ guarantees that we cannot call into a filter or
 plugin with an extents request that would read out of bounds; ie. on
 input, offset+count will never exceed what next->get_size() would tell
 us anyways. 
Great!
(This is what I sort of expected, but wasn't sure about, hence "I don't
have the context in which this function may be called".)
 Conversely, nbdkit_add_extent() already permits us to
 pass in redundant information prior to offset (as long as we make
 forward progress by eventually using nbdkit_add_extent for at least
 one byte at offset before returning), as well as to provide more
 information than needed (the upper layer can set a clamp, such as when
 FLAG_REQ_ONE is in use by the client, or at the 32-bit boundary
 condition, where our additions beyond that clamp are merely ignored).
 So we could just as easily write
 
   return nbdkit_add_extent (ret_extents, 0, INT64_MAX, 0);
 
 which will be trimmed to size as needed before going over the wire
 back to the client. 
Wow, that's very safe API then!
 But now that I've written that, I can see that
 there is indeed a useful benchmarking distinction between using
 offset/count vs. 0/INT_MAX - the former behaves as if the client had
 always passed in FLAG_REQ_ONE (we can never progress faster than the
 client wants us to go), while the latter gives as much information to
 the client as possible (subject to 32-bit limits, until my 64-bit
 block status work is revived).  So instead of just one new mode named
 'data', maybe we want a larger family of possible modes:
 
 - mask (default; existing behavior of masking extent support so client
   can't query them at all); client must assume disk is data
 - data-match (reply that the extent is data, but match the size of the
   client request; behaves as if the client unconditionally uses
   FLAG_REQ_ONE); client sees disk as data
 - data-full (reply that the entire remainder of the extent was data,
   although the client's use of FLAG_REQ_ONE clamps it back to the
   query size); client sees disk as data
 - req-one (pass the query on through to the plugin, but then
   purposefully trim it back before replying to the client so that it
   behaves as if the client unconditionally uses FLAG_REQ_ONE); client
   can see holes, but cannot read ahead
 - plugin (pass the query on through to the plugin, with no
   alterations to the plugin's result - normalizes the impact of having
   the filter in the stack when comparing to other modes); client can
   see holes, and can benefit from reading ahead 
Hmmmm. This sounds quite exhaustive, but it seems we don't have a use
for all of these modes at the moment. If you add all of them, then I
guess you'd want to write test cases for them too. Seems a bit like
overkill. Maybe capture these options as possible extensions for the
filter in comments, or in a TODO file?
 
 We can also bike-shed on better names for those modes (maybe
 'disabled' instead of 'mask'; or 'passthrough' instead of
'plugin',
 for example).
 
 In fact, as written above, it means the 'noextents' filter could serve
 the role of controlling three orthogonal aspects: are extents
 advertised, can extents return additional information beyond the
 client's requested length (when the client didn't use REQ_ONE), and
 can extents report hole/zero information.
 
 Is there ever a reason that we would want to always call into the
 plugin, but then mask some (or all) of the returned extent bits to
 only allow certain bits to return to the client?  For example, a mode
 that lets the plugin report the 'zero' bit but not the 'hole' bit?
 Being able to time how long the plugin takes to answer the query (even
 if we then throw away the plugin's answer to claim that we have only
 data) may be an interesting benchmark, as well, when comparing to the
 speed at which we simulate the entire disk as data without reading
 into the plugin. 
I think we shouldn't generalize / speculate until there's a need for all
this. It's useful to remain extensible, but the more logic we add, the
more we should keep covered by tests as well. I think there's enough
work already that's actually *needed* for various purposes :)
 
>
> (I understand that a server is permitted to cover a larger offset range
> than requested in its reply, unless LIBNBD_CMD_FLAG_REQ_ONE is set;
> however, this is different. Without LIBNBD_CMD_FLAG_REQ_ONE, the
> response may well be permitted to exceed the requested range, but it's
> still not permitted, AIUI, to report nonexistent data.)
 
 The nbdkit core takes care of that.  A filter or plugin can report on
 nonexistent data, but it will be clamped back so that the client never
 sees more than the advertised disk length. 
OK!
 
>
> In short, I think we should call get_size() (?) on the underlying plugin
> (?), and truncate the requested range accordingly.
 
 Already done implicitly by the core.
 
>
>>
>>  static struct nbdkit_filter filter = {
>>    .name              = "noextents",
>>    .longname          = "nbdkit noextents filter",
>> +  .config            = noextents_config,
>> +  .config_help       = noextents_config_help,
>>    .can_extents       = noextents_can_extents,
>> +  .extents           = noextents_extents,
>>  };
>>
>>  NBDKIT_REGISTER_FILTER(filter)
>>
>
> Thanks,
> Laszlo
>
  
So I think my only comments that remain relevant are possibly the slight
tweaks to the documentation?
Reviewed-by: Laszlo Ersek <lersek(a)redhat.com>
Thanks!
Laszlo