[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Re: [Nbd] [PATCH v4] doc: Propose STRUCTURED_REPLY extension



Eric,

I know this has now been merged, but I have a few comments still (sorry).

On 1 Apr 2016, at 06:29, Eric Blake <eblake@...696...> wrote:

> +    A structured reply in the transmission phase consists of one or
> +    more structured reply chunk messages.  The server MUST NOT send
> +    this reply type unless the client has successfully negotiated
> +    structured replies via `NBD_OPT_STRUCTURED_REPLY`.  Conversely, if
> +    structured replies are negotiated, the server MUST use a
> +    structured reply for any response with a payload, and MUST NOT use
> +    a simple reply for `NBD_CMD_READ` (even for the case of an early
> +    `EINVAL` due to bad flags).

I think we are left with an ambiguity. If
structured replies HAVE been negotiated, this doesn't say what
the server MUST/MAY do with replies that do *not* have a payload.
Also it's not clear whether a read response with an error immediate
error 'has a payload'.

I don't want to rehash the discussion but please could the text
definitively set the position out (one way or another).

I'm hoping we allowed structured replies to things without payloads
in such circumstance if only so they can get extended errors.

> +    chunk.  Relative ordering of the chunks MAY be important;
> +    individual commands will document constraints on whether multiple
> +    chunks may be rearranged;

I think this would be better written with a default position, as no
doubt people will forget.

I suggest:

The data within the chunks MAY be sent in any order unless the individual
command documents otherwise. Therefore (save where documented) chunks
may be rearranged save that the final chunk must always have
`NBD_REPLY_FLAG_DONE` set.

> +  * Structured Reply types
> +
> +    These values are used in the "type" field of a structured reply.
> +    Some chunk types can additionally be categorized by role, such as
> +    *error chunks* or *content chunks*.  Each type determines how to
> +    interpret the "length" bytes of payload.  If the client receives
> +    an unknown or unexpected type, it SHOULD close the connection.

Nit: this is a 'MUST' everywhere else. I would have thought either
'MUST ignore', or 'MUST close'

> +    - `NBD_REPLY_TYPE_ERROR` (1)
> +
> +      This chunk type is in the error chunk category.  *length* MUST
> +      be at least 4.  This chunk represents that an error occurred,
> +      and the client MAY NOT make any assumptions about partial
> +      success. This type SHOULD NOT be used more than once in a
> +      structured reply.  Valid as a reply to any request.
> +
> +      The payload is structured as:
> +
> +      32 bits: error (MUST be nonzero)  
> +      *length - 4* bytes: (optional UTF-8 encoded data suitable for
> +         direct display to a human being, not NUL terminated)

s/terminated/terminated and may not contain NUL bytes/

> +    - `NBD_REPLY_TYPE_ERROR_OFFSET` (2)
> +
> +      This chunk type is in the error chunk category.  *length* MUST
> +      be at least 12.  This reply represents that an error occurred at
> +      a given offset, which MUST lie within the original offset and
> +      length of the request; the client can use this offset to
> +      determine if request had any partial success.  This chunk type
> +      MAY appear multiple times in a structured reply, although the
> +      same offset SHOULD NOT be repeated.  Likewise, if content chunks
> +      were sent earlier in the structured reply, the server SHOULD NOT
> +      send multiple distinct offsets that lie within the bounds of a
> +      single content chunk.  Valid as a reply to `NBD_CMD_READ`,
> +      `NBD_CMD_WRITE`, and `NBD_CMD_TRIM`.
> +
> +      The payload is structured as:
> +
> +      32 bits: error (MUST be nonzero)  
> +      64 bits: offset (unsigned)  
> +      *length - 12* bytes: (optional UTF-8 encoded data suitable for
> +         direct display to a human being, not NUL terminated)  

s/terminated/terminated and may not contain NUL bytes/

> +* `NBD_FLAG_SEND_DF`
> +
> +    The server MUST set this transmission flag to 1 if the
> +    `NBD_CMD_READ` request supports the `NBD_CMD_FLAG_DF` flag, and
> +    MUST leave this flag clear if structured replies have not been
> +    negotiated.  Clients MUST NOT rely on the state of this flag prior
> +    the final flags value reported by `NBD_OPT_EXPORT_NAME` or
> +    experimental `NBD_OPT_GO`.  Additionally, clients MUST NOT set the
> +    `NBD_CMD_FLAG_DF` request flag unless this transmission flag is
> +    set.

Now we've fixed the NBD_OPT_GO thing, do we need to mention this here,
as this surely applies to all transmission flags?
> 
> +* `NBD_CMD_READ`
> +
> +    If structured replies were not negotiated, then a read request
> +    MUST always be answered by a simple reply, as documented above
> +    (using magic 0x67446698 `NBD_SIMPLE_REPLY_MAGIC`, and containing
> +    length bytes of data according to the client's request, although
> +    those bytes MAY be invalid if an error is returned, and the
> +    connection MUST be closed if an error occurs after a header
> +    claiming no error).
> +
> +    If structured replies are negotiated, then a read request MUST
> +    result in a structured reply with one or more chunks (each using
> +    magic 0x668e33ef `NBD_STRUCTURED_REPLY_MAGIC`), where the final
> +    chunk has the flag `NBD_REPLY_FLAG_DONE`, and with the following
> +    additional constraints.
> +
> +    The server MAY split the reply into any number of content chunks;
> +    each chunk MUST describe at least one byte, although to minimize
> +    overhead, the server SHOULD use chunks with lengths and offsets as
> +    an integer multiple of 512 bytes, where possible (the first and
> +    last chunk of an unaligned read being the most obvious places for
> +    an exception).  The server MUST NOT content chunks that overlap

s/MUST NOT content/MUST NOT send content/

> +    with any earlier content or error chunk, and MUST NOT send chunks
> +    that describe data outside the offset and length of the request,
> +    but MAY send the chunks in any order (the client MUST reassemble
> +    content chunks into the correct order), and MAY send additional
> +    data chunks even after reporting an error chunk.  Note that a
> +    request for more than 2^32 - 8 bytes MUST be split into at least
> +    two chunks, so as not to overflow the length field of a reply
> +    while still allowing space for the offset of each chunk.  When no
> +    error is detected, the server MUST send enough data chunks to
> +    cover the entire region described by the offset and length of the
> +    client's request.
> +
> +    To minimize traffic, the server MAY use a content or error chunk
> +    as the final chunk by setting the `NBD_REPLY_FLAG_DONE` flag, but
> +    MUST NOT do so for a content chunk if it would still be possible
> +    to detect an error while transmitting the chunk.  The
> +    `NBD_REPLY_TYPE_DONE` chunk is always acceptable as the final
> +    chunk.

s/NBD_REPLY_TYPE_DONE/NBD_REPLY_TYPE_NONE/

> +    If an error is detected, the server MUST still complete the
> +    transmission of any current chunk (it SHOULD use padding bytes of
> +    zero

it MUST use padding bytes which SHOULD be zero

> for any remaining data portion of a chunk with type
> +    `NBD_REPLY_TYPE_OFFSET_DATA`), but MAY omit further content
> +    chunks.  The server MUST include an error chunk as one of the
> +    subsequent chunks, but MAY defer the error reporting behind other
> +    queued chunks.  An error chunk of type `NBD_REPLY_TYPE_ERROR`
> +    implies that the client MAY NOT make any assumptions about
> +    validity of data chunks,

validity of any data chunks (whether sent before or after the
error chunk)

> and if used, SHOULD be the only error
> +    chunk in the reply.  On the other hand, an error chunk of type
> +    `NBD_REPLY_TYPE_ERROR_OFFSET` gives fine-grained information about
> +    which earlier data chunk(s) encountered a failure, and MAY also be
> +    sent in lieu of a data chunk; as such, a server MAY still usefully
> +    follow it with further non-overlapping content chunks or with
> +    error offsets for other content chunks.  Generally, a server
> +    SHOULD NOT mix errors with offsets with a generic error.  As long
> +    as all errors are accompanied by offsets, the client MAY assume
> +    that any data chunks with no subsequent error offset are valid,
> +    that chunks with an overlapping error offset errors are valid up
> +    until the reported offset, and that portions of the read that do
> +    not have a corresponding content chunk are not valid.
> +
> +    A client MAY close the connection if it detects that the server

MUST?

> +    has sent invalid chunks (such as overlapping data, or not enough
> +    data before claiming success).
> +
> +    In order to avoid the burden of reassembly, the client MAY set the
> +    `NBD_CMD_FLAG_DF` flag ("don't fragment").  If this flag is set,
> +    the server MUST send at most one content chunk, although it MAY
> +    still send multiple chunks (the remaining chunks would be error
> +    chunks or a final type of `NBD_REPLY_TYPE_NONE`).  If the area
> +    being read contains both data and a hole, the server MUST use
> +    `NBD_REPLY_TYPE_OFFSET_DATA` with the zeroes explicitly present.
> +    A server MAY reject a client's request with the error `EOVERFLOW`
> +    if the length is too large to send without fragmentation, in which
> +    case it MUST NOT send a content chunk; however, the server MUST
> +    NOT use this error if the client's requested length does not
> +    exceed 65,536 bytes.

"the server MUST support unfragmented reads in which the client's request length
does not exceed 65,536 bytes."

Current text implies it might be acceptable to error the read with a
different error, e.g. EIO purely on the grounds it doesn't like the
length.

> +
> ## About this file
> 
> This file tries to document the NBD protocol as it is currently
> -- 
> 2.5.5
> 

-- 
Alex Bligh







Reply to: