[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

[PATCH] doc: Improvements to block-status



A patch was recently proposed to qemu to implement the block status
extension (targetted for qemu 2.12, to be released in a few more
weeks).  Make some tweaks to the extension documentation based on
lessons learned during the qemu implementation:

- Make it obvious that the export name in the final
NBD_OPT_SET_META_CONTEXT must match the export name given to
NBD_OPT_GO (or NBD_OPT_EXPORT_NAME, although any client using
structured replies should be favoring NBD_OPT_GO)

- Allow a difference between valid query strings during
NBD_OPT_LIST_META_CONTEXT and NBD_OPT_SET_META_CONTEXT. The
obvious example: in the "base:" namespace, the query "base:"
makes sense to LIST (list ALL contexts that this server knows
about in the base namespace), but not to SET (if a client
knows about only "base:allocation", but the server has also
learned "base:foo", having the server reply with 2 contexts
is a waste of computation since the client will be discarding
"base:foo" on every NBD_CMD_BLOCK_STATUS).

- Explicitly mention that unknown namespaces should be ignored
rather than causing an error (errors should only occur for
failure to find a "namespace:" prefix, although even that
should not be required as ignoring is fine there too; and for
a known "namespace:" prefix with a leaf-name that cannot be
parsed according to the namespace's rules).  Thus, each query
in the client's request can map to 0 (ignored), 1 (recognized),
or many (wildcarding) responses from the server (although
1-to-many expansion may be more appropriate during LIST than
SET).

- Mention that NBD_REP_ERR_TOO_BIG is not appropriate when the
client does NBD_OPT_LIST_META_CONTEXT with zero queries

- Allow the server responses in a different order than the
client's request (both for NBD_OPT_{LIST,SET}_META_CONTEXT,
and for NBD_CMD_BLOCK_STATUS)

- Require the server's response to NBD_CMD_BLOCK_STATUS to be
aligned to any advertised minimum block size

- Consistently uses 'x-' as the prefix for namespaces that will
not be standardized (rather than a mix of 'x-' and 'X-')

- Similar to recent changes to NBD_CMD_{READ,WRITE}, make it
clear that a 0-length NBD_CMD_BLOCK_STATUS is unsupported
(at which point, because NBD_REPLY_TYPE_BLOCK_STATUS requires
extent descriptors to have a non-zero length, a client will
always make progress if the command succeeds)

- General wording improvements

Signed-off-by: Eric Blake <eblake@redhat.com>
---

Applies to the extension-blockstatus branch.  I think this
covers all the points raised in various emails either here
or on the qemu list in the last few weeks, as we have been
pushing to get block-status implemented in qemu 2.12

I'm open to suggestions if any of the wording changes here are
still difficult to understand, or don't quite map to the
patchset currently pending for qemu:
https://lists.gnu.org/archive/html/qemu-devel/2018-03/msg04031.html

 doc/proto.md | 192 ++++++++++++++++++++++++++++++++++-------------------------
 1 file changed, 112 insertions(+), 80 deletions(-)

diff --git a/doc/proto.md b/doc/proto.md
index dc79c80..db52401 100644
--- a/doc/proto.md
+++ b/doc/proto.md
@@ -849,9 +849,11 @@ The procedure works as follows:
   of type `NBD_REPLY_TYPE_BLOCK_STATUS`.

 A client MUST NOT use `NBD_CMD_BLOCK_STATUS` unless it selected a
-nonzero number of metadata contexts during negotiation. Servers SHOULD
-reply to clients sending `NBD_CMD_BLOCK_STATUS` without
-selecting metadata contexts with `EINVAL`.
+nonzero number of metadata contexts during negotiation, and used the
+same export name for the subsequent `NBD_OPT_GO` (or
+`NBD_OPT_EXPORT_NAME`). Servers SHOULD reply with `EINVAL` to clients
+sending `NBD_CMD_BLOCK_STATUS` without selecting at least one metadata
+context.

 The reply to the `NBD_CMD_BLOCK_STATUS` request MUST be sent as a
 structured reply; this implies that in order to use metadata querying,
@@ -905,6 +907,11 @@ one of two forms:
   that exists within the `base` namespace (currently just
   `base:allocation`).

+The query string with no `[leaf-name]` is only valid within
+`NBD_OPT_LIST_META_CONTEXT`.  If a `[leaf-name]' requested by the
+client is not recognized, the server MUST ignore it rather than
+reporting an error.
+
 #### `base:allocation` metadata context

 The `base:allocation` metadata context is the basic "allocated at all"
@@ -1261,11 +1268,6 @@ of the newstyle negotiation.
     Return a list of `NBD_REP_META_CONTEXT` replies, one per context,
     followed by an `NBD_REP_ACK`.

-    If the query string is syntactically invalid, the server SHOULD send
-    `NBD_REP_ERR_INVALID`. If the query string is syntactically valid
-    but finds no metadata contexts, the server MUST send a single
-    reply of type `NBD_REP_ACK`.
-
     This option MUST NOT be requested unless structured replies have
     been negotiated first. If a client attempts to do so, a server
     SHOULD send `NBD_REP_ERR_INVALID`.
@@ -1278,43 +1280,56 @@ of the newstyle negotiation.
     - Zero or more queries, each being:
        - 32 bits, length of query.
        - String, query to list a subset of the available metadata
-         contexts.
+         contexts.  The syntax of this query is
+         implementation-defined, except that it MUST start with a
+         namespace and a colon.

-    For details on the query string, see under `NBD_OPT_SET_META_CONTEXT`.
+    For details on the query string, see the "Metadata querying"
+    section; note that a namespace may document that a different set
+    of queries are valid for `NBD_OPT_LIST_META_CONTEXT` than for
+    `NBD_OPT_SET_META_CONTEXT`.
+
+    If the query string is syntactically invalid (such as lacking the
+    colon that would delimit the namespace, or using a leaf name that
+    is invalid for a known namespace), the server MAY fail the request
+    with `NBD_REP_ERR_INVALID`.  However, the server MUST ignore query
+    strings belonging to an unknown namespace.  If none of the query
+    strings find any metadata contexts, the server MUST send a single
+    reply of type `NBD_REP_ACK`.

     The server MUST either reply with an error (for instance
-    `NBD_REP_ERR_UNSUP` if the option is not supported), or reply with a
-    list of `NBD_REP_META_CONTEXT` replies followed by `NBD_REP_ACK`.
+    `NBD_REP_ERR_UNSUP` if the option is not supported), or reply with
+    a list of zero or more `NBD_REP_META_CONTEXT` replies followed by
+    `NBD_REP_ACK`.

-    If zero queries are sent, then the server MUST return all
-    the metadata contexts that are available to the client to select
-    on the given export with `NBD_OPT_SET_META_CONTEXT`, except
-    as set out below.
+    If zero queries are sent, then the server MUST return all the
+    metadata contexts that are available to the client to select on
+    the given export with `NBD_OPT_SET_META_CONTEXT`, except as set
+    out below.  In this case, the server SHOULD NOT fail with
+    `NBD_REP_ERR_TOO_BIG`.

-    If one or more queries are sent, then the server MUST return
-    those metadata contexts that are available to the client to
-    select on the given export with `NBD_OPT_SET_META_CONTEXT`,
-    and which match one or more of the queries given. The
-    support of wildcarding within the leaf-name portion of
-    the query string is dependent upon the namespace.
+    If one or more queries are sent, then the server MUST return those
+    metadata contexts that are available to the client to select on
+    the given export with `NBD_OPT_SET_META_CONTEXT`, and which match
+    one or more of the queries given. The support of wildcarding
+    within the leaf-name portion of the query string is dependent upon
+    the namespace.  The server MAY send contexts in a different order
+    than in the client's query.

-    In either case, however, for any given namespace the
-    server MAY, instead of exhaustively listing every
-    matching context available to select (or every context
-    available to select where no query is given), send
-    sufficient context records back to allow a client with
-    knowledge of the namespace to select any context.
-     This may be helpful where a client can
-    construct algorithmic queries. For instance, a client might
-    reply simply with the namespace with no leaf-name (e.g.
-    'X-FooBar:') or with a range of values (e.g.
-    'X-ModifiedDate:20160310-20161214'). The semantics of
-    such a reply are a matter for the definition of the
-    namespace. However each namespace returned MUST begin
-    with the relevant namespace, followed by a colon, and then
-    other UTF-8 characters, with the entire string following the
-    restrictions for strings set out earlier in this
-    document.
+    In either case, however, for any given namespace the server MAY,
+    instead of exhaustively listing every matching context available
+    to select (or every context available to select where no query is
+    given), send sufficient context records back to allow a client
+    with knowledge of the namespace to select any context.  This may
+    be helpful where a client can construct algorithmic queries. For
+    instance, a client might reply simply with the namespace with no
+    leaf-name (e.g.  'x-FooBar:') or with a range of values (e.g.
+    'x-ModifiedDate:20160310-20161214'). The semantics of such a reply
+    are a matter for the definition of the namespace. However each
+    namespace returned MUST begin with the relevant namespace,
+    followed by a colon, and then other UTF-8 characters, with the
+    entire string following the restrictions for strings set out
+    earlier in this document.

     The metadata context ID in these replies is reserved and SHOULD be
     set to zero; clients MUST disregard it.
@@ -1322,16 +1337,21 @@ of the newstyle negotiation.
 * `NBD_OPT_SET_META_CONTEXT` (10)

     Change the set of active metadata contexts. Issuing this command
-    replaces all previously-set metadata contexts; clients must ensure
-    that all metadata contexts they are interested in are selected with
-    the final query that they sent.
+    replaces all previously-set metadata contexts (including when this
+    command fails); clients must ensure that all metadata contexts
+    they are interested in are selected with the final query that they
+    sent.

-    A client MUST NOT send `NBD_CMD_BLOCK_STATUS` unless
-    within the negotiation phase it sent `NBD_OPT_SET_META_CONTEXT`
-    at least once, and the final time it was sent, it referred
-    to the export name that was ultimately selected, the server
-    responded without an error, and returned at least one metadata
-    context.
+    This option MUST NOT be requested unless structured replies have
+    been negotiated first. If a client attempts to do so, a server
+    SHOULD send `NBD_REP_ERR_INVALID`.
+
+    A client MUST NOT send `NBD_CMD_BLOCK_STATUS` unless within the
+    negotiation phase it sent `NBD_OPT_SET_META_CONTEXT` at least
+    once, and where the final time it was sent, it referred to the
+    same export name that was ultimately selected for transmission
+    phase, and where the server responded by returning least one
+    metadata context without error.

     Data:
     - 32 bits, length of export name.
@@ -1346,17 +1366,19 @@ of the newstyle negotiation.

     If zero queries are sent, the server MUST select no metadata contexts.

-    The server MAY return `NBD_REP_ERR_TOO_BIG` if a request
-    seeks to select too many contexts. Otherwise
-    the server MUST reply with a number of `NBD_REP_META_CONTEXT`
-    replies, one for each selected metadata context, each with a unique
-    metadata context ID, followed by `NBD_REP_ACK`. The metadata context
-    ID is transient and may vary across calls to `NBD_OPT_SET_META_CONTEXT`;
-    clients MUST therefore treat the ID as an opaque value and not (for
-    instance) cache it between connections. It is not an error if a
-    `NBD_OPT_SET_META_CONTEXT` option does not select any metadata
-    context, provided the client then does not attempt to issue
-    `NBD_CMD_BLOCK_STATUS` commands.
+    The server MAY return `NBD_REP_ERR_TOO_BIG` if a request seeks to
+    select too many contexts. Otherwise the server MUST reply with a
+    number of `NBD_REP_META_CONTEXT` replies, one for each selected
+    metadata context, each with a unique metadata context ID, followed
+    by `NBD_REP_ACK`. The server MUST ignore queries that do not
+    select a single metadata context, and MAY return selected contexts
+    in a different order than in the client's request.  The metadata
+    context ID is transient and may vary across calls to
+    `NBD_OPT_SET_META_CONTEXT`; clients MUST therefore treat the ID as
+    an opaque value and not (for instance) cache it between
+    connections. It is not an error if a `NBD_OPT_SET_META_CONTEXT`
+    option does not select any metadata context, provided the client
+    then does not attempt to issue `NBD_CMD_BLOCK_STATUS` commands.

 #### Option reply types

@@ -1928,20 +1950,24 @@ The following request types exist:

 * `NBD_CMD_BLOCK_STATUS` (7)

-    A block status query request. Length and offset define the range of
-    interest.
+    A block status query request. Length and offset define the range
+    of interest.  The client SHOULD NOT request a status length of 0;
+    the behavior of a server on such a request is unspecified although
+    the server SHOULD NOT disconnect.

-    A client MUST NOT send `NBD_CMD_BLOCK_STATUS` unless
-    within the negotiation phase it sent `NBD_OPT_SET_META_CONTEXT`
-    at least once, and the final time it was sent, it referred
-    to the export name that was finally selected, the server
-    responded without an error, and returned at least one metadata
-    context. This in turn requires the client to
-    first negotiate structured replies. For a successful return, the
-    server MUST use a structured reply, containing at least one chunk of
-    type `NBD_REPLY_TYPE_BLOCK_STATUS`, where the status field of each
-    descriptor is determined by the flags field as defined by the
-    metadata context.
+    A client MUST NOT send `NBD_CMD_BLOCK_STATUS` unless within the
+    negotiation phase it sent `NBD_OPT_SET_META_CONTEXT` at least
+    once, and where the final time that was sent, it referred to the
+    same export name used to enter transmission phase, and where the
+    server returned at least one metadata context without an error.
+    This in turn requires the client to first negotiate structured
+    replies. For a successful return, the server MUST use a structured
+    reply, containing exactly one chunk of type
+    `NBD_REPLY_TYPE_BLOCK_STATUS` per selected context id, where the
+    status field of each descriptor is determined by the flags field
+    as defined by the metadata context.  The server MAY send chunks in
+    a different order than the context ids were assigned in reply to
+    `NBD_OPT_SET_META_CONTEXT`.

     The list of block status descriptors within the
     `NBD_REPLY_TYPE_BLOCK_STATUS` chunk represent consecutive portions
@@ -1949,14 +1975,20 @@ The following request types exist:
     *length* fields of each descriptor MUST not be greater than the
     overall *length* of the request. This means that the server MAY
     return less data than required. However the server MUST return at
-    least one status descriptor.  The server SHOULD use different
-    *status* values between consecutive descriptors, and SHOULD use
-    descriptor lengths that are an integer multiple of 512 bytes where
-    possible (the first and last descriptor of an unaligned query being
-    the most obvious places for an exception). The status flags are
-    intentionally defined so that a server MAY always safely report a
-    status of 0 for any block, although the server SHOULD return
-    additional status values when they can be easily detected.
+    least one status descriptor (and since each status descriptor has
+    a non-zero length, a client can always make progress on a
+    successful return).  The server SHOULD use different *status*
+    values between consecutive descriptors where feasible, although
+    the client SHOULD be prepared to handle consecutive descriptors
+    with the same *status* value.  The server SHOULD use descriptor
+    lengths that are an integer multiple of 512 bytes where possible
+    (the first and last descriptor of an unaligned query being the
+    most obvious places for an exception), and MUST use descriptor
+    lengths that are an integer multiple of any advertised minimum
+    block size. The status flags are intentionally defined so that a
+    server MAY always safely report a status of 0 for any block,
+    although the server SHOULD return additional status values when
+    they can be easily detected.

     If an error occurs, the server SHOULD set the appropriate error
     code in the error field of an error chunk. However, if the error
-- 
2.14.3


Reply to: