Commit 9f30fedb improved the spec to allow non-payload requests that
exceed any advertised maximum block size. Take this one step further
by permitting the server to use NBD_EOVERFLOW as a hint to the client
when a request is oversize (while permitting NBD_EINVAL for
back-compat), and by rewording the text to explicitly call out that
what is being advertised is the maximum payload length, not maximum
block size. This becomes more important when we add 64-bit
extensions, where it becomes possible to extend `NBD_CMD_BLOCK_STATUS`
to have both an effect length (how much of the image does the client
want status on) and a payload length (filtering the response to a
subset of negotiated metadata contexts).
---
doc/proto.md | 70 ++++++++++++++++++++++++++++++----------------------
1 file changed, 40 insertions(+), 30 deletions(-)
diff --git a/doc/proto.md b/doc/proto.md
index c3c7cd9..e332e21 100644
--- a/doc/proto.md
+++ b/doc/proto.md
@@ -769,8 +769,8 @@ learn the server's constraints without committing to them.
If block size constraints have not been advertised or agreed on
externally, then a server SHOULD support a default minimum block size
-of 1, a preferred block size of 2^12 (4,096), and a maximum block size
-that is effectively unlimited (0xffffffff, or the export size if that
+of 1, a preferred block size of 2^12 (4,096), and a maximum block payload
+size that is at least 2^25 (33,554,432) (or the export size if that
is smaller), while a client desiring maximum interoperability SHOULD
constrain its requests to a minimum block size of 2^9 (512), and limit
`NBD_CMD_READ` and `NBD_CMD_WRITE` commands to a maximum block size of
@@ -815,8 +815,8 @@ the preferred block size for that export. The server MAY advertise
an
export size that is not an integer multiple of the preferred block
size.
-The maximum block size represents the maximum length that the server
-is willing to handle in one request. If advertised, it MAY be
+The maximum block size represents the maximum payload length that the
+server is willing to handle in one request. If advertised, it MAY be
something other than a power of 2, but MUST be either an integer
multiple of the minimum block size or the value 0xffffffff for no
inherent limit, MUST be at least as large as the smaller of the
@@ -825,7 +825,20 @@ preferred block size or export size, and SHOULD be at least 2^20
MAY advertise a maximum block size that is larger than the export
size, although in that case, the client MUST treat the export size as
the effective maximum block size (as further constrained by a nonzero
-offset).
+offset). For commands that require a payload in either direction and
+where the client controls the payload length (`NBD_CMD_WRITE` or
+`NBD_CMD_READ`), the client MUST NOT use a length larger than the
+maximum block size. For replies where the payload length is controlled
+by the server (such as `NBD_CMD_BLOCK_STATUS` without the flag
+`NBD_CMD_FLAG_REQ_ONE`), the server MUST NOT send a reply larger than
+the maximum block size. For commands that do not require a payload
+(such as `NBD_CMD_TRIM`), the client MAY request a length larger than
+the maximum; the server SHOULD NOT disconnect, but MAY reply with an
+`NBD_EOVERFLOW` or `NBD_EINVAL` error if the oversize request would
+require more server resources than the same command operating on only
+the maximum block size (such as some implementations of
+`NBD_CMD_WRITE_ZEROES` without the flag `NBD_CMD_FLAG_FAST_ZERO`, or
+`NBD_CMD_CACHE`).
Where a transmission request can have a nonzero *offset* and/or
*length* (such as `NBD_CMD_READ`, `NBD_CMD_WRITE`, or `NBD_CMD_TRIM`),
@@ -833,24 +846,17 @@ the client MUST ensure that *offset* and *length* are integer
multiples of any advertised minimum block size, and SHOULD use integer
multiples of any advertised preferred block size where possible. For
those requests, the client MUST NOT use a *length* which, when added to
-*offset*, would exceed the export size. Also for NBD_CMD_READ,
-NBD_CMD_WRITE, NBD_CMD_CACHE and NBD_CMD_WRITE_ZEROES (except for
-when NBD_CMD_FLAG_FAST_ZERO is set), the client MUST NOT use a *length*
-larger than any advertised maximum block size.
-The server SHOULD report an `NBD_EINVAL` error if
-the client's request is not aligned to advertised minimum block size
-boundaries, or is larger than the advertised maximum block size.
+*offset*, would exceed the export size. The server SHOULD report an
+`NBD_EINVAL` error if the client's request is not aligned to advertised
+minimum block size boundaries or would exceed the export size.
+
Notwithstanding any maximum block size advertised, either the server
-or the client MAY initiate a hard disconnect if the payload of an
-`NBD_CMD_WRITE` request or `NBD_CMD_READ` reply would be large enough
-to be deemed a denial of service attack; however, for maximum
-portability, any *length* less than 2^25 (33,554,432) bytes SHOULD NOT
-be considered a denial of service attack (even if the advertised
-maximum block size is smaller). For all other commands, where the
-*length* is not reflected in the payload (such as `NBD_CMD_TRIM` or
-`NBD_CMD_WRITE_ZEROES`), a server SHOULD merely fail the command with
-an `NBD_EINVAL` error for a client that exceeds the maximum block size,
-rather than initiating a hard disconnect.
+or the client MAY initiate a hard disconnect if a payload length of
+either a request or a reply would be large enough to be deemed a
+denial of service attack; however, for maximum portability, any
+*length* less than 2^25 (33,554,432) bytes SHOULD NOT be considered a
+denial of service attack (even if the advertised maximum block size is
+smaller).
## Metadata querying
@@ -1592,7 +1598,7 @@ during option haggling in the fixed newstyle negotiation.
- 16 bits, `NBD_INFO_BLOCK_SIZE`
- 32 bits, minimum block size
- 32 bits, preferred block size
- - 32 bits, maximum block size
+ - 32 bits, maximum block payload size
* `NBD_REP_META_CONTEXT` (4)
@@ -1740,7 +1746,8 @@ Some chunk types can additionally be categorized by role, such as
*error chunks* or *content chunks*. Each type determines how to
interpret the "length" bytes of payload. If the client receives
an unknown or unexpected type, other than an *error chunk*, it
-MUST initiate a hard disconnect.
+MUST initiate a hard disconnect. A server MUST NOT send a chunk
+larger than any advertised maximum block payload size.
* `NBD_REPLY_TYPE_NONE` (0)
@@ -2111,11 +2118,12 @@ The following request types exist:
If the server advertised `NBD_FLAG_SEND_FAST_ZERO` but
`NBD_CMD_FLAG_FAST_ZERO` is not set, then the server MUST NOT fail
with `NBD_ENOTSUP`, even if the operation is no faster than a
- corresponding `NBD_CMD_WRITE`. Conversely, if
- `NBD_CMD_FLAG_FAST_ZERO` is set, the server MUST fail quickly with
- `NBD_ENOTSUP` unless the request can be serviced in less time than
- a corresponding `NBD_CMD_WRITE`, and SHOULD NOT alter the contents
- of the export when returning this failure. The server's
+ corresponding `NBD_CMD_WRITE`. Conversely, if `NBD_CMD_FLAG_FAST_ZERO`
+ is set, the server SHOULD NOT fail with `NBD_EOVERFLOW` regardless of
+ the client length, MUST fail quickly with `NBD_ENOTSUP` unless the
+ request can be serviced in less time than a corresponding
+ `NBD_CMD_WRITE`, and SHOULD NOT alter the contents of the export when
+ returning an `NBD_ENOTSUP` failure. The server's
determination on whether to fail a fast request MAY depend on a
number of factors, such as whether the request was suitably
aligned, on whether the `NBD_CMD_FLAG_NO_HOLE` flag was present,
@@ -2266,7 +2274,9 @@ SHOULD return `NBD_EPERM` if it receives a write or trim request on
a
read-only export.
The server SHOULD NOT return `NBD_EOVERFLOW` except as documented in
-response to `NBD_CMD_READ` when `NBD_CMD_FLAG_DF` is supported.
+response to `NBD_CMD_READ` when `NBD_CMD_FLAG_DF` is supported, or when
+a command without payload requests a length larger than an advertised
+maximum block payload length.
The server SHOULD NOT return `NBD_ENOTSUP` except as documented in
response to `NBD_CMD_WRITE_ZEROES` when `NBD_CMD_FLAG_FAST_ZERO` is
--
2.35.1