[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Re: [PATCH v3 3/5] doc: Clarify maximum size limits



28.03.2018 22:43, Eric Blake wrote:
The existing text on block sizes had a redundant statement about
hard disconnects on large requests, but did not specify a size
where that comes into play.  In practice, several NBD
implementations have settled on 32M (qemu, kernel module) or 64M
(nbdkit) as the limit for maximum request size (at the client)
or hard disconnect size (at the server); thus we should document
32M as the portable limit that clients can expect and servers
should support as a way to increase interoperability.

There is also some confusion on whether WRITE_ZEROES must abide
by the same limits as WRITE.  In practice, servers permit larger
limits for TRIM and WRITE_ZEROES than for WRITE because there is
no payload involved; but a server may still want to enforce a
maximum size other than 4G because of other constraints (for
example, if a WRITE_ZEROES request is implemented by malloc'ing
a buffer and performing a naive writes, clamping the maximum
malloc size or limiting the amount of time spent on looping
to perform the writes may still be desirable for the server).
But in practice, existing servers reserve hard disconnect for
overlarge request payloads (NBD_CMD_WRITE) or what would require
an overlarge reply payload (NBD_CMD_READ), while any other
command will just receive an error rather than a hard disconnect.

With just one value for 'maximum block size', we have a slight
ambiguity where the server has to decide whether to advertise the
hard disconnect limit (32M) or the maximum size for other
commands, but hopefully the wording chosen here will make it
acceptable for a server that advertises 32M and then permits
clients to try larger WRITE_ZEROES for less traffic (where the
client knows to fall back to smaller requests if the large
request fail with EINVAL); as well as a server that advertises
a larger maximum size (as its actual limit for TRIM and
WRITE_ZEROES) while expecting the client to not exceed 32M
for the denial of service aspect.  The former interpretation
is slightly preferred, especially if a later patch adds a new
NBD_INFO_ field for documenting exact limits for TRIM and
WRITE_ZEROES during NBD_OPT_GO.

Finally, requiring a server to support 32M as its maximum block
size may be a bit large for some setups; when sizes are
advertised, it is reasonable to allow a server with a smaller
explicit limit of 1M.  Rephrase some constraints for easier
reading.

[A later patch will do a no-op reflowing of paragraphs that end
up looking ragged from this patch]

Signed-off-by: Eric Blake <eblake@redhat.com>

---
v3: split reflow from content change
v2: more wording tweaks
---
  doc/proto.md | 59 +++++++++++++++++++++++++++++++++++------------------------
  1 file changed, 35 insertions(+), 24 deletions(-)

diff --git a/doc/proto.md b/doc/proto.md
index 48a4ef5..4b6f638 100644
--- a/doc/proto.md
+++ b/doc/proto.md
@@ -740,27 +740,29 @@ receives. Clients MAY issue `NBD_OPT_INFO` with `NBD_INFO_BLOCK_SIZE` to
  learn the server's constraints without committing to them.

  If block size constraints have not been advertised or agreed on externally,
-then a client SHOULD assume a default minimum block size of 1, a preferred
-block size of 2^12 (4,096), and a maximum block size of the smaller of
-the export size or 0xffffffff (effectively unlimited).  A server that
+then a server SHOULD support a default minimum block size of 1, a preferred
+block size of 2^12 (4,096), and a maximum block size that is effectively unlimited (0xffffffff, or the export size if that
+is smaller), while a client desiring maximum interoperability SHOULD
+constrain its requests to a minimum block size of 2^9 (512), and limit
+`NBD_CMD_READ` and `NBD_CMD_WRITE` commands to a maximum block size of
+2^25 (33,554,432).  A server that
  wants to enforce block sizes other than the defaults specified here
  MAY refuse to go into transmission phase with a client that uses
-`NBD_OPT_EXPORT_NAME` (via a hard disconnect) or which fails to use
-`NBD_INFO_BLOCK_SIZE` with `NBD_OPT_GO` (where the server uses
-`NBD_REP_ERR_BLOCK_SIZE_REQD`), although a server SHOULD permit such
-clients if block size constraints are the default or can be agreed on
-externally.  When allowing such clients, the server MUST cleanly error
-commands that fall outside block size constraints without corrupting
-data; even so, this may limit interoperability.
+`NBD_OPT_EXPORT_NAME` (via a hard disconnect) or which uses
+`NBD_OPT_GO` without requesting `NBD_INFO_BLOCK_SIZE` (via an error reply of
+`NBD_REP_ERR_BLOCK_SIZE_REQD`); but servers SHOULD NOT refuse clients
+that do not request sizing information when the server supports
+default sizing or where sizing constraints can be agreed on
+externally.  When allowing clients that did not negotiate sizing via
+NBD, a server that enforces stricter block size constraints than the
+defaults MUST cleanly error commands that fall outside the constraints
+without corrupting data; even so, enforcing constraints in this manner
+may limit interoperability.

  A client MAY choose to operate as if tighter block size constraints had
  been specified (for example, even when the server advertises the default
  minimum block size of 1, a client may safely use a minimum block size
-of 2^9 (512), a preferred block size of 2^16 (65,536), and a maximum
-block size of 2^25 (33,554,432)).  Notwithstanding any maximum block
-size advertised, either the server or the client MAY initiate a hard
-disconnect if the size of a request or a reply is large enough to be
-deemed a denial of service attack.
+of 2^9 (512)).

  The minimum block size represents the smallest addressable length and
  alignment within the export, although writing to an area that small
@@ -785,12 +787,12 @@ export size that is not an integer multiple of the preferred block
  size.

  The maximum block size represents the maximum length that the server
-is willing to handle in one request.  If advertised, it MUST be either
-an integer multiple of the minimum block size or the value 0xffffffff
-for no inherent limit, MUST be at least as large as the smaller of the
-preferred block size or export size, and SHOULD be at least 2^25
-(33,554,432) if the export is that large, but MAY be something other
-than a power of 2.  For convenience, the server MAY advertise a
+is willing to handle in one request.  If advertised, it MAY be
+something other than a power of 2, but MUST be either an integer
+multiple of the minimum block size or the value 0xffffffff for no
+inherent limit, MUST be at least as large as the smaller of the
+preferred block size or export size, and SHOULD be at least 2^20
+(1,048,576) if the export is that large.  For convenience, the server MAY advertise a
  maximum block size that is larger than the export size, although in
  that case, the client MUST treat the export size as the effective
  maximum block size (as further constrained by a nonzero offset).
@@ -804,9 +806,18 @@ those requests, the client MUST NOT use a *length* larger than any
  advertised maximum block size or which, when added to *offset*, would
  exceed the export size.  The server SHOULD report an `EINVAL` error if
  the client's request is not aligned to advertised minimum block size
-boundaries, or is larger than the advertised maximum block size,
-although the server MAY instead initiate a hard disconnect if a large
-*length* could be deemed as a denial of service attack.
+boundaries, or is larger than the advertised maximum block size.
+Notwithstanding any maximum block size advertised, either the server
+or the client MAY initiate a hard disconnect if the payload of an
+`NBD_CMD_WRITE` request or `NBD_CMD_READ` reply would be large enough
+to be deemed a denial of service attack; however, for maximum
+portability, any length less than 2^25 (33,554,432) bytes SHOULD NOT
+be considered a denial of service attack (even if the advertised
+maximum block size is smaller).

intersting: "any lengths", what is it? is it length field? Or overall length of the whole request, which will be a bit larger?

   For all other commands, where the
+*length* is not reflected in the payload (such as `NBD_CMD_TRIM` or
+`NBD_CMD_WRITE_ZEROES`), a server SHOULD merely fail the command with
+an `EINVAL` error for a client that exceeds the maximum block size,
+rather than initiating a hard disconnect.

  ## Values



--
Best regards,
Vladimir


Reply to: