On Fri, Apr 08, 2022 at 09:25:01AM +0200, Wouter Verhelst wrote:
> Hi Eric,
> On Thu, Apr 07, 2022 at 04:37:19PM -0500, Eric Blake wrote:
> > The spec was silent on how many extents a server could reply with.
> > However, both qemu and nbdkit (the two server implementations known to
> > have implemented the NBD_CMD_BLOCK_STATUS extension) implement a hard
> > cap, and will truncate the number of extents in a reply to avoid
> > sending a client a reply larger than the maximum NBD_CMD_READ response
> > they are willing to tolerate:
> >
> > When qemu first implemented NBD_CMD_BLOCK_STATUS for the
> > base:allocation context (qemu commit e7b1948d51, Mar 2018), it behaved
> > as if NBD_CMD_FLAG_REQ_ONE were always passed by the client, and never
> > responded with more than one extent. Later, when adding its
> > qemu:dirty-bitmap:XXX context extension (qemu commit 3d068aff16, Jun
> > 2018), it added a cap of 128k extents (1M+4 bytes), and that cap was
> > applied to base:allocation once qemu started sending multiple extents
> > for that context as well (qemu commit fb7afc797e, Jul 2018). Qemu
> > extents are never smaller than 512 bytes (other than an exception at
> > the end of a file whose size is not aligned to 512), but even so, a
> > request for just under 4G of block status could produce 8M extents,
> > resulting in a reply of 64M if it were not capped smaller.
> >
> > When nbdkit first implemented NBD_CMD_BLOCK_STATUS (nbdkit 4ca66f70a5,
> > Mar 2019), it did not impose any restriction on the number of extents
> > in the reply chunk. But because it allows extents as small as one
> > byte, it is easy to write a server that can amplify a client's request
> > for status over 1M of the image into a reply over 8M in size, and it
> > was very easy to demonstrate that a hard cap was needed to avoid
> > crashing clients or otherwise killing the connection (a bad server
> > impacting the client negatively). Unique to nbdkit's situation is the
> > fact that, because it is designed for plugin server implementations,
> > not capping the number of extents also posed a problem to nbdkit as
> > the server (a client requesting large block status could cause the
> > server to run out of memory, depending on the plugin providing the
> > server callbacks). So nbdkit enforced a bound of 1M extents (8M+4
> > bytes, nbdkit commit 6e0dc839ea, Jun 2019).
> >
> > Since the limit chosen by these two implementations is different, and
> > since nbdkit has versions that were not limited, add this as a SHOULD
> > NOT instead of MUST NOT constraint on servers implementing block
> > status. It does not matter that qemu picked a smaller limit that it
> > truncates to, since we have already documented that the server may
> > truncate for other reasons (such as it being inefficient to collect
> > that many extents in the first place). But documenting the limit now
> > becomes even more important in the face of a future addition of 64-bit
> > requests, where a client's request is no longer bounded to 4G and
> > could thereby produce even more than 8M extents for the corner case
> > when every 512 bytes is a new extent, if it were not for this
> > recommendation.
> It feels backwards to me to make this a restriction on the server side.
> You're saying there are server implementations that will be inefficient
> if there are more than 2^20 extents, and therefore no server should send
> more than that, even if it can do so efficiently.
> Isn't it up to the server implementation to decide what can be done
> efficiently?
> Perhaps we can make the language about possibly reducing the length of
> extents a bit stronger, but I don't think adding explicit limits for a
> server's own protection is necessary.
I agree, but for a different reason.
I think Eric should add language that servers can consider limiting
response sizes in order to prevent possible amplification issues
and/or simply overwhelming the client with work (bad server DoS
attacks against clients are a thing!), but I don't think it's
necessarily a "SHOULD" issue.
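For concreteness, the sizes being discussed follow directly from the wire
format: each extent descriptor in an NBD_REPLY_TYPE_BLOCK_STATUS chunk is
8 bytes (32-bit length + 32-bit status flags), plus a 4-byte metadata
context ID per chunk. A quick back-of-the-envelope sketch in Python (the
helper name here is mine, not anything from the spec):

def block_status_reply_size(num_extents):
    # 4-byte metadata context ID + 8 bytes per extent descriptor
    return 4 + 8 * num_extents

print(block_status_reply_size(128 * 1024))        # qemu's cap: 1M+4 bytes
print(block_status_reply_size(1024 * 1024))       # nbdkit's cap: 8M+4 bytes
print(block_status_reply_size((4 << 30) // 512))  # ~4G of 512-byte extents: ~64M
print(block_status_reply_size(1 << 20))           # 1M of 1-byte extents: ~8M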
BTW attached is an nbdkit plugin that creates an NBD server that
responds with massive numbers of byte-granularity extents, in case
anyone wants to test how nbdkit and/or clients respond:
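[The attachment is not reproduced inline here; a minimal sketch of such a
plugin, assuming nbdkit's Python plugin API v2 (an extents() callback
returning (offset, length, type) tuples, and the nbdkit.EXTENT_HOLE /
nbdkit.EXTENT_ZERO constants), could look like the following. The nbdkit
path in the shebang may differ on your distro:]

#!/usr/sbin/nbdkit python
# Sketch: a 1M image where every even byte is reported as its own
# hole/zero extent and every odd byte as its own data extent.
import nbdkit

API_VERSION = 2
SIZE = 1024 * 1024

def open(readonly):
    return 1                    # no per-connection state needed

def get_size(h):
    return SIZE

def pread(h, buf, offset, flags):
    buf[:] = bytes(len(buf))    # the data itself is all zeroes

def can_extents(h):
    return True

def extents(h, count, offset, flags):
    # One 1-byte extent per byte of the request, alternating
    # hole/zero (even offsets) and data (odd offsets).
    e = []
    for o in range(offset, offset + count):
        if o % 2 == 0:
            e.append((o, 1, nbdkit.EXTENT_HOLE | nbdkit.EXTENT_ZERO))
        else:
            e.append((o, 1, 0))
    return e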
$ chmod +x /var/tmp/lots-of-extents.py
$ /var/tmp/lots-of-extents.py -f
$ nbdinfo --map nbd://localhost | head
0 1 3 hole,zero
1 1 0 data
2 1 3 hole,zero
3 1 0 data
4 1 3 hole,zero
5 1 0 data
6 1 3 hole,zero
7 1 0 data
8 1 3 hole,zero
9 1 0 data
$ nbdinfo --map --totals nbd://localhost
524288 50.0% 0 data
524288 50.0% 3 hole,zero
From experimentation I found this really hurts my laptop :-(
Rich.
--
Richard Jones, Virtualization Group, Red Hat
http://people.redhat.com/~rjones
Read my programming and virtualization blog:
http://rwmj.wordpress.com
nbdkit - Flexible, fast NBD server with plugins
https://gitlab.com/nbdkit/nbdkit