[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Re: [Nbd] [PATCH] Document format of strings in one place, limit to 4096 bytes



On Wed, Apr 06, 2016 at 08:06:04PM +0100, Alex Bligh wrote:
> * Document the format of strings centrally using correct terminology
> 
> * Remove duplications of dealing with `NUL` characters.
> 
> Signed-off-by: Alex Bligh <alex@...872...>
> ---
>  doc/proto.md | 39 ++++++++++++++++++++-------------------
>  1 file changed, 20 insertions(+), 19 deletions(-)
> 
> Note this should be applied on top of:
> 
>  0001-docs-proto.md-Clarify-SHOULD-MUST-MAY-etc.patch
> 
> diff --git a/doc/proto.md b/doc/proto.md
> index 49298e1..96d05cb 100644
> --- a/doc/proto.md
> +++ b/doc/proto.md
> @@ -31,6 +31,12 @@ The key words "**MUST**", "**MUST NOT**", "**REQUIRED**", "**SHALL**",
>  described in [RFC 2119](https://www.ietf.org/rfc/rfc2119.txt).
>  The same words in lower case carry their natural meaning.
>  
> +Where this document refers to a string, then unless otherwise stated,
> +that string is a sequence of UTF-8 code points, which is not `NUL`
> +terminated, **MUST NOT** contain `NUL` characters, and is of length
> +4096 bytes or less. This applies to export names and error messages

I would like to make that a "SHOULD be less than X bytes". A client
could read the first protocol-demanded bytes, drop the rest, and give a
warning to the user (while doing a read() into a dummy buffer).

Also, most of those strings exist solely to make user interaction
easier. In that context, a 4K string is *way* too long; I'm thinking
something like 256 seems saner (80 characters, times three because
UTF-8, round up to the next power of two).

> +(amongst others).
> +
>  ## Protocol phases
>  
>  The NBD protocol has two phases: the handshake and the transmission. During the
> @@ -358,11 +364,10 @@ of the newstyle negotiation.
>  - `NBD_OPT_EXPORT_NAME` (1)
>  
>      Choose the export which the client would like to use, end option
> -    haggling, and proceed to the transmission phase. Data: name of the
> -    export, free-form UTF-8 text (subject to limitations by server
> -    implementation). The length of the name is determined from the
> -    option header. The name is not `NUL` terminated, and **MUST NOT**

No bold (as per previous mails)

> -    contain embedded `NUL` characters. If the
> +    haggling, and proceed to the transmission phase.
> +
> +    Data: name of the export, being a free-form UTF-8 text string.
> +    The length of the name is determined from the option header. If the
>      chosen export does not exist or requirements for the chosen export
>      are not met (e.g., the client did not negotiate TLS for an export
>      where the server requires it), the server **MUST** close the
> @@ -440,16 +445,14 @@ during option haggling in the fixed newstyle negotiation.
>  
>      - 32 bits, length of name (unsigned); **MUST** be no larger than the
>        reply packet header length - 4
> -    - Name of the export, as expected by `NBD_OPT_EXPORT_NAME` (note
> -      that the length of name does NOT include a `NUL` terminator)
> +    - Name of the export, a string as expected by `NBD_OPT_EXPORT_NAME`.

For consistency, might be good to have something along the lines of

string, name of the export as expected by `NBD_OPT_EXPORT_NAME`.

(the rest of the document always puts the type of the variable first,
only then its description)

>      - If length of name < (reply packet header length - 4), then the
>        rest of the data contains some implementation-specific details
>        about the export. This is not currently implemented, but future
>        versions of nbd-server may send along some details about the
>        export. Therefore, unless explicitly documented otherwise by a
>        particular client request, this field is defined to be UTF-8
> -      encoded data suitable for direct display to a human being; with
> -      no embedded or terminating `NUL` characters.
> +      encoded data suitable for direct display to a human being.

Should refer to string here.

>  
>      The experimental `INFO` extension (see below) is a client
>      request where the extra data has a definition other than a
> @@ -457,8 +460,8 @@ during option haggling in the fixed newstyle negotiation.
>  
>  There are a number of error reply types, all of which are denoted by
>  having bit 31 set. All error replies **MAY** have some data set, in which
> -case that data is an error message in UTF-8 encoding suitable for
> -display to the user, with no embedded or terminating `NUL` characters.
> +case that data is an error message string in UTF-8 encoding suitable for
> +display to the user.

Can drop UTF-8 reference (since you say string)

>  * `NBD_REP_ERR_UNSUP` (2^31 + 1)
>  
> @@ -718,7 +721,7 @@ Therefore these commands share common documentation.
>  
>      Data (both commands):
>  
> -    - Name of the export (as with `NBD_OPT_EXPORT_NAME`, the length
> +    - Name of the export as a string (as with `NBD_OPT_EXPORT_NAME`, the length

string first (as above)

>        comes from the option header).
>  
>      If no name is specified (i.e. a zero length string is provided),
> @@ -747,7 +750,7 @@ Therefore these commands share common documentation.
>      - 64 bits, size of the export in bytes (unsigned)
>      - 16 bits, transmission flags.
>      - 32 bits, length of name (unsigned)
> -    - Name of the export. This name **MAY** be different from the one
> +    - Name of the export as a string. This name **MAY** be different from the one

idem

>        given in the `NBD_OPT_INFO` or `NBD_OPT_GO` option in case the server has
>        multiple alternate names for a single export, or a default
>        export was specified.
> @@ -930,9 +933,8 @@ error, and alters the reply to the `NBD_CMD_READ` request.
>        The payload is structured as:
>  
>        32 bits: error (**MUST** be nonzero)  
> -      *length - 4* bytes: (optional UTF-8 encoded data suitable for
> -         direct display to a human being, with no embedded or
> -         terminating `NUL` characters)  
> +      *length - 4* bytes: (optional UTF-8 encoded string suitable for
> +         direct display to a human being)

idem

>  
>      - `NBD_REPLY_TYPE_ERROR_OFFSET` (2)
>  
> @@ -952,9 +954,8 @@ error, and alters the reply to the `NBD_CMD_READ` request.
>  
>        32 bits: error (**MUST** be non-zero)  
>        64 bits: offset (unsigned)  
> -      *length - 12* bytes: (optional UTF-8 encoded data suitable for
> -         direct display to a human being, with no embedded or
> -         terminating `NUL` characters)  
> +      *length - 12* bytes: (optional UTF-8 encoded string suitable for
> +         direct display to a human being)

idem

-- 
< ron> I mean, the main *practical* problem with C++, is there's like a dozen
       people in the world who think they really understand all of its rules,
       and pretty much all of them are just lying to themselves too.
 -- #debian-devel, OFTC, 2016-02-12



Reply to: