
[Nbd] [PATCH] doc: Clarify maximum size limits



Requiring a server to support 32M as its maximum block size
may be too much for some setups; reword the text to make it
clear that a server may advertise a maximum as small as 1M,
but that existing clients (hello, kernel NBD module) expect
32M to work when no maximum is advertised.

Clarify that servers SHOULD NOT treat any request smaller
than 32M as a denial of service (in contrast to the recent
patch for a 16k limit during option haggling), and that
denial-of-service limits may differ according to command.

Also add a way for a server to advertise, on a per-command basis,
that it is willing to accept larger limits when no payload is
involved; the two obvious commands are NBD_CMD_TRIM and
NBD_CMD_WRITE_ZEROES, which now have NBD_INFO_TRIM_SIZE and
NBD_INFO_ZERO_SIZE.  These new limits are listed separately
from NBD_INFO_BLOCK_SIZE since both commands are optional and
need not be implemented by a minimal server; but letting the
server advertise them makes sense since at least SCSI devices
natively have three different size constraints for read/write,
unmap, and writesame.  A server MAY honor larger trim requests
even without advertising, but a client cannot rely on that
working unless the additional sizes are advertised.
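
As a rough illustration of the client-side rule described above, the
following Python sketch picks the largest length a client can rely on
for a given command.  The function name, the dictionary of per-command
limits, and the use of 32M as the fallback when nothing is advertised
are assumptions made for this sketch; they are not part of the
protocol text.

```python
DEFAULT_MAX = 1 << 25  # 32M: assumed fallback when no maximum is advertised


def max_request_length(command, block_max=None, extra_max=None):
    """Return the largest length the client can rely on for `command`.

    block_max: maximum block size from NBD_INFO_BLOCK_SIZE, or None if
               the server did not advertise one.
    extra_max: hypothetical dict mapping a command name to a per-command
               maximum the server advertised (e.g. via NBD_INFO_TRIM_SIZE).
    """
    extra_max = extra_max or {}
    # A per-command limit only applies if it was actually advertised.
    if command in extra_max:
        return extra_max[command]
    # Otherwise fall back to the general limit, or to the assumed default.
    return block_max if block_max is not None else DEFAULT_MAX
```

A server may still accept a larger trim than this returns, but the
client cannot count on that.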

The full text for NBD_INFO_ZERO_SIZE will be written later once
either that extension or this one is promoted to stable; it can
copy the pattern of NBD_INFO_TRIM_SIZE, except that where
CMD_TRIM can round the request (and only actually trim a subset
of the client request), CMD_WRITE_ZEROES must write actual zeroes
over the entire request (but if it can punch holes, only a
subset of the client request need result in holes).
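
The rounding mentioned above can be sketched as follows: the start is
rounded up and the end rounded down to the server's alignment, so only
the aligned interior of the request is trimmed (or, for
CMD_WRITE_ZEROES, turned into holes while the unaligned head and tail
are still written as zeroes).  The helper name and its parameters are
hypothetical, chosen only for this illustration.

```python
def aligned_subrange(offset, length, align):
    """Round `offset` up and the end of the request down to `align`.

    Returns (start, length) of the aligned subrange; the length is 0
    when no fully aligned chunk fits inside the request.
    """
    start = -(-offset // align) * align          # round offset up
    end = ((offset + length) // align) * align   # round end down
    return start, max(0, end - start)
```

For a request at offset 1000 with length 10000 and a 4096-byte
alignment, this yields the subrange starting at 4096 with length 4096.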

Signed-off-by: Eric Blake <eblake@...696...>
---

This patch is for the extension-info branch, and builds on the
earlier patch sent for the master branch about handshake limits.
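
For reviewers who want to see the proposed NBD_INFO_TRIM_SIZE reply
layout in concrete terms, here is a minimal Python sketch that decodes
and sanity-checks the 10-byte payload (16-bit info type, 32-bit
minimum, 32-bit maximum, in network byte order).  The function names
are hypothetical, and the checks cover only the power-of-2 rules from
the proposed text, not the comparisons against NBD_INFO_BLOCK_SIZE.

```python
import struct

NBD_INFO_TRIM_SIZE = 4   # info type value proposed in this patch
NO_LIMIT = 0xFFFFFFFF    # "no inherent 32-bit limit"


def is_power_of_two(n):
    return n > 0 and (n & (n - 1)) == 0


def parse_trim_size(payload):
    """Decode an NBD_INFO_TRIM_SIZE payload into (min_trim, max_trim)."""
    if len(payload) != 10:
        raise ValueError("payload length must be 10")
    # ">HII": big-endian 16-bit type, then two 32-bit sizes (no padding).
    info, min_trim, max_trim = struct.unpack(">HII", payload)
    if info != NBD_INFO_TRIM_SIZE:
        raise ValueError("unexpected info type")
    if not is_power_of_two(min_trim):
        raise ValueError("minimum trim size must be a power of 2")
    if max_trim != NO_LIMIT and not is_power_of_two(max_trim):
        raise ValueError("maximum trim size must be 0xffffffff or a power of 2")
    return min_trim, max_trim
```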

 doc/proto.md | 92 +++++++++++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 79 insertions(+), 13 deletions(-)

diff --git a/doc/proto.md b/doc/proto.md
index 5692be5..bf29b3a 100644
--- a/doc/proto.md
+++ b/doc/proto.md
@@ -686,7 +686,9 @@ of 2^9 (512), a preferred block size of 2^16 (65,536), and a maximum
 block size of 2^25 (33,554,432)).  Notwithstanding any maximum block
 size advertised, either the server or the client MAY initiate a hard
 disconnect if the size of a request or a reply is large enough to be
-deemed a denial of service attack.
+deemed a denial of service attack.  However, for maximum portability,
+any length less than 2^25 (33,554,432) bytes SHOULD NOT be considered
+a denial of service attack.

 The minimum block size represents the smallest addressable length and
 alignment within the export, although writing to an area that small
@@ -710,29 +712,47 @@ preferred block size for that export.  The server MAY advertise an
 export size that is not an integer multiple of the preferred block
 size.

-The maximum block size represents the maximum length that the server
-is willing to handle in one request.  If advertised, it MUST be either
+The maximum block size represents the maximum payload length that the
+server is willing to handle in one request (whether the payload is
+from the client as in `NBD_CMD_WRITE` or from the server as in
+`NBD_CMD_READ`).  If advertised, it SHOULD be either
 an integer multiple of the minimum block size or the value 0xffffffff
 for no inherent limit, MUST be at least as large as the smaller of the
-preferred block size or export size, and SHOULD be at least 2^25
-(33,554,432) if the export is that large, but MAY be something other
+preferred block size or export size, and SHOULD be at least 2^20
+(1,048,576) if the export is that large, but MAY be something other
 than a power of 2.  For convenience, the server MAY advertise a
 maximum block size that is larger than the export size, although in
 that case, the client MUST treat the export size as the effective
 maximum block size (as further constrained by a non-zero offset).

+Some commands (such as `NBD_CMD_TRIM`) allow the client to specify
+*length* but do not involve a payload; for these commands, the client
+and server may negotiate additional information about larger size
+limits for those commands (such as `NBD_INFO_TRIM_SIZE`).  For these
+commands, a client SHOULD NOT use a *length* larger than the
+negotiated maximum block size from `NBD_INFO_BLOCK_SIZE` unless the
+additional information was successfully negotiated.
+
 Where a transmission request can have a non-zero *offset* and/or
 *length* (such as `NBD_CMD_READ`, `NBD_CMD_WRITE`, or `NBD_CMD_TRIM`),
 the client MUST ensure that *offset* and *length* are integer
 multiples of any advertised minimum block size, and SHOULD use integer
 multiples of any advertised preferred block size where possible.  For
 those requests, the client MUST NOT use a *length* larger than any
-advertised maximum block size or which, when added to *offset*, would
-exceed the export size.  The server SHOULD report an `EINVAL` error if
-the client's request is not aligned to advertised minimum block size
-boundaries, or is larger than the advertised maximum block size,
-although the server MAY instead initiate a hard disconnect if a large
-*length* could be deemed as a denial of service attack.
+appropriate advertised maximum block size or which, when added to
+*offset*, would exceed the export size.  The client SHOULD NOT depend
+on the server to diagnose requests that violate sizing constraints.
+
+The server SHOULD report an `EINVAL` error if the client's request is
+not aligned to advertised minimum block size boundaries, or is larger
+than the relevant advertised maximum block size, although the server
+MAY instead initiate a hard disconnect if a large *length* could be
+deemed as a denial of service attack.  The size limits and resulting
+behaviour of what the server deems as a denial of service attack MAY
+be different according to the request (for example, even if an
+`NBD_CMD_TRIM` request for 2^30 bytes succeeds or gives a graceful
+`EINVAL` error, the same size request for `NBD_CMD_WRITE` could result
+in a hard disconnect).

 ## Values

@@ -1114,6 +1134,36 @@ during option haggling in the fixed newstyle negotiation.
       - 32 bits, preferred block size  
       - 32 bits, maximum block size  

+    * `NBD_INFO_TRIM_SIZE` (4)
+
+      Represents alternate limits that the server will honour during
+      `NBD_CMD_TRIM`.  The server SHOULD NOT send this info unless it
+      will also be advertising the transmission flag
+      `NBD_FLAG_SEND_TRIM`.  The minimum trim size MUST be a power of
+      2, and at least as large as any preferred block size advertised
+      in `NBD_INFO_BLOCK_SIZE`, and represents the alignment and
+      minimum granularity that can be discarded.  The maximum trim
+      size MUST be either 0xffffffff (for no inherent 32-bit limit) or
+      a power of 2 at least as large as any maximum block size
+      advertised in `NBD_INFO_BLOCK_SIZE`.  Since a trim command
+      carries no payload and can often be performed in less time than
+      a read or a write, servers SHOULD advertise larger trim sizes
+      where supported.  If the server does not advertise separate
+      limits, the client MUST limit `NBD_CMD_TRIM` alignment and sizes
+      to the same limits as any other command.  The *length* MUST be
+      10, and the reply payload is interpreted as:
+
+      - 16 bits, `NBD_INFO_TRIM_SIZE`  
+      - 32 bits, minimum trim size  
+      - 32 bits, maximum trim size  
+
+    * `NBD_INFO_ZERO_SIZE` (5)
+
+      Reserved for advertising alternate limits that the server will
+      honour with the `NBD_CMD_WRITE_ZEROES`
+      [extension](https://github.com/yoe/nbd/blob/extension-write-zeroes/doc/proto.md).
+      The server SHOULD NOT send this info unless it will also be
+      advertising the transmission flag `NBD_FLAG_SEND_WRITE_ZEROES`.

 There are a number of error reply types, all of which are denoted by
 having bit 31 set. All error replies MAY have some data set, in which
@@ -1266,8 +1316,10 @@ The following request types exist:
 * `NBD_CMD_TRIM` (4)

     A hint to the server that the data defined by len and offset is no
-    longer needed. A server MAY discard len bytes starting at offset, but
-    is not required to.
+    longer needed. A server MAY discard *length* bytes starting at
+    offset, but is not required to; and MAY round *offset* up and
+    *length* down to meet internal alignment constraints so that only
+    a portion of the client's request is actually discarded.

     After issuing this command, a client MUST NOT make any assumptions
     about the contents of the export affected by this command, until
@@ -1276,10 +1328,24 @@ The following request types exist:
     A client MUST NOT send a trim request unless `NBD_FLAG_SEND_TRIM`
     was set in the transmission flags field.

+    A client SHOULD request `NBD_INFO_TRIM_SIZE` during `NBD_OPT_INFO`
+    or `NBD_OPT_GO` to learn if the server has separate minimum
+    alignment and maximum size constraints for trim operations; if
+    this information is not available, the client SHOULD use the
+    preferred block size for alignment and the maximum block size for
+    the maximum trim length.
+
 * `NBD_CMD_WRITE_ZEROES` (6)

     Defined by the experimental `WRITE_ZEROES` [extension](https://github.com/yoe/nbd/blob/extension-write-zeroes/doc/proto.md).

+    A client SHOULD request `NBD_INFO_ZERO_SIZE` during `NBD_OPT_INFO`
+    or `NBD_OPT_GO` to learn if the server has separate minimum
+    alignment and maximum size constraints for write zero operations;
+    if this information is not available, the client SHOULD use the
+    preferred block size for alignment and the maximum block size for
+    the maximum zero length.
+
 * Other requests

     Some third-party implementations may require additional protocol
-- 
2.5.5



