[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

[PATCH 1/2] doc: Clarify maximum size limits



The existing text on block sizes had a redundant statement about
hard disconnects on large requests, but did not specify a size
where that comes into play.  In practice, several NBD
implementations have settled on 32M (qemu, kernel module) or 64M
(nbdkit) as the limit for maximum request size (at the client)
or hard disconnect size (at the server); thus we should document
32M as the portable limit that clients can expect and servers
should support as a way to increase interoperability.

There is also some confusion on whether WRITE_ZEROES must abide
by the same limits as WRITE.  In practice, servers permit larger
limits for TRIM and WRITE_ZEROES than for WRITE because there is
no payload involved; but a server may still want to enforce a
maximum size other than 4G because of other constraints (for
example, if a WRITE_ZEROES request is implemented by malloc'ing
a buffer and performing a naive writes, clamping the maximum
malloc size or limiting the amount of time spent on looping
to perform the writes may still be desirable for the server).
But in practice, existing servers reserve hard disconnect for
overlarge request payloads (NBD_CMD_WRITE) or what would require
an overlarge reply payload (NBD_CMD_READ), while any other
command will just receive an error rather than a hard disconnect.

With just one value for 'maximum block size', we have a slight
ambiguity where the server has to decide whether to advertise the
hard disconnect limit (32M) or the maximum size for other
commands, but hopefully the wording chosen here will make it
acceptable for a server that advertises 32M and then permits
clients to try larger WRITE_ZEROES for less traffic (where the
client knows to fall back to smaller requests if the large
request fail with EINVAL); as well as a server that advertises
a larger maximum size (as its actual limit for TRIM and
WRITE_ZEROES) while expecting the client to not exceed 32M
for the denial of service aspect.  The former interpretation
is slightly preferred, especially if a later patch adds a new
NBD_INFO_ field for documenting exact limits for TRIM and
WRITE_ZEROES during NBD_OPT_GO.

Finally, requiring a server to support 32M as its maximum block
size may be a bit large for some setups; when sizes are
advertised, it is reasonable to allow a server with a smaller
explicit limit of 1M.

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 doc/proto.md | 60 ++++++++++++++++++++++++++++++++++--------------------------
 1 file changed, 34 insertions(+), 26 deletions(-)

diff --git a/doc/proto.md b/doc/proto.md
index c7803e0..c607fc9 100644
--- a/doc/proto.md
+++ b/doc/proto.md
@@ -739,28 +739,28 @@ information request, it MUST abide by the block size constraints it
 receives. Clients MAY issue `NBD_OPT_INFO` with `NBD_INFO_BLOCK_SIZE` to
 learn the server's constraints without committing to them.

-If block size constraints have not been advertised or agreed on externally,
-then a client SHOULD assume a default minimum block size of 1, a preferred
-block size of 2^12 (4,096), and a maximum block size of the smaller of
-the export size or 0xffffffff (effectively unlimited).  A server that
-wants to enforce block sizes other than the defaults specified here
-MAY refuse to go into transmission phase with a client that uses
-`NBD_OPT_EXPORT_NAME` (via a hard disconnect) or which fails to use
-`NBD_INFO_BLOCK_SIZE` with `NBD_OPT_GO` (where the server uses
-`NBD_REP_ERR_BLOCK_SIZE_REQD`), although a server SHOULD permit such
-clients if block size constraints are the default or can be agreed on
-externally.  When allowing such clients, the server MUST cleanly error
-commands that fall outside block size constraints without corrupting
-data; even so, this may limit interoperability.
+If block size constraints have not been advertised or agreed on
+externally, then a server SHOULD support a default minimum block size
+of 1, a preferred block size of 2^12 (4,096), and a maximum block size
+of the smaller of the export size or 0xffffffff (effectively
+unlimited), while a client desiring maximum interoperability SHOULD
+constrain its requests to a minimum block size of 2^9 (512), and limit
+`NBD_CMD_READ` and `NBD_CMD_WRITE` commands to a maximum block size of
+2^25 (33,554,432).  A server that wants to enforce block sizes other
+than the defaults specified here MAY refuse to go into transmission
+phase with a client that uses `NBD_OPT_EXPORT_NAME` (via a hard
+disconnect) or which fails to use `NBD_INFO_BLOCK_SIZE` with
+`NBD_OPT_GO` (where the server uses `NBD_REP_ERR_BLOCK_SIZE_REQD`),
+although a server SHOULD permit such clients if block size constraints
+are the default or can be agreed on externally.  When allowing such
+clients, the server MUST cleanly error commands that fall outside
+block size constraints without corrupting data; even so, this may
+limit interoperability.

-A client MAY choose to operate as if tighter block size constraints had
-been specified (for example, even when the server advertises the default
-minimum block size of 1, a client may safely use a minimum block size
-of 2^9 (512), a preferred block size of 2^16 (65,536), and a maximum
-block size of 2^25 (33,554,432)).  Notwithstanding any maximum block
-size advertised, either the server or the client MAY initiate a hard
-disconnect if the size of a request or a reply is large enough to be
-deemed a denial of service attack.
+A client MAY choose to operate as if tighter block size constraints
+had been specified (for example, even when the server advertises the
+default minimum block size of 1, a client may safely use a minimum
+block size of 2^9 (512)).

 The minimum block size represents the smallest addressable length and
 alignment within the export, although writing to an area that small
@@ -788,8 +788,8 @@ The maximum block size represents the maximum length that the server
 is willing to handle in one request.  If advertised, it MUST be either
 an integer multiple of the minimum block size or the value 0xffffffff
 for no inherent limit, MUST be at least as large as the smaller of the
-preferred block size or export size, and SHOULD be at least 2^25
-(33,554,432) if the export is that large, but MAY be something other
+preferred block size or export size, and SHOULD be at least 2^20
+(1,048,576) if the export is that large, but MAY be something other
 than a power of 2.  For convenience, the server MAY advertise a
 maximum block size that is larger than the export size, although in
 that case, the client MUST treat the export size as the effective
@@ -804,9 +804,17 @@ those requests, the client MUST NOT use a *length* larger than any
 advertised maximum block size or which, when added to *offset*, would
 exceed the export size.  The server SHOULD report an `EINVAL` error if
 the client's request is not aligned to advertised minimum block size
-boundaries, or is larger than the advertised maximum block size,
-although the server MAY instead initiate a hard disconnect if a large
-*length* could be deemed as a denial of service attack.
+boundaries, or is larger than the advertised maximum block size.
+Notwithstanding any maximum block size advertised, either the server
+or the client MAY initiate a hard disconnect if the payload of an
+`NBD_CMD_WRITE` request or `NBD_CMD_READ` reply would be large enough
+to be deemed a denial of service attack; however, for maximum
+portability, any length less than 2^25 (33,554,432) bytes SHOULD NOT
+be considered a denial of service attack.  For all other commands,
+where the *length* is not reflected in the payload (such as
+`NBD_CMD_TRIM` or `NBD_CMD_WRITE_ZEROES`), a server SHOULD merely fail
+the command with an `EINVAL` error for a client that exceeds the
+maximum block size, rather than initiating a hard disconnect.

 ## Values

-- 
2.14.3


Reply to: