On 04/07/2016 10:05 AM, Alex Bligh wrote: > This commit adds documentation for NBD_CMD_CLOSE which provides a > safer way for the client to close connections, and indicates > unambiguously to the server and client whether a close is safe. > > Signed-off-by: Alex Bligh <alex@...872...> > --- > doc/proto.md | 69 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 69 insertions(+) > > This applies on top of: > [PATCHv5] docs/proto.md: Clarify SHOULD / MUST / MAY etc > so it can take advantage of the 'exposes' language. > > Changes from v1: > > * Make a NBD_CMD_CLOSE imply a flush > > * Nits from Eric Blake > > diff --git a/doc/proto.md b/doc/proto.md > index e5042aa..86dac02 100644 > --- a/doc/proto.md > +++ b/doc/proto.md > @@ -342,6 +342,7 @@ The field has the following format: > experimental `WRITE_ZEROES` extension; see below. > - bit 7, `NBD_FLAG_SEND_DF`: defined by the experimental `STRUCTURED_REPLY` > extension; see below. > +- bit 8, `NBD_FLAG_SEND_CLOSE`: exposes support for `NBD_CMD_CLOSE`. > > Clients SHOULD ignore unknown flags. > > @@ -608,6 +609,12 @@ The following request types exist: > The values of the length and offset fields in a disconnect request > are not defined. > > + A client SHOULD NOT send `NBD_CMD_DISC` if `NBD_FLAG_SEND_CLOSE` > + is set in the transmission flags field, but SHOULD use > + `NBD_CMD_CLOSE` instead; in such circumstances, if the client > + cannot wait for inflight requests to complete it SHOULD simply > + disconnect. > + > * `NBD_CMD_FLUSH` (3) > > A flush request; a write barrier. The server MUST NOT send a > @@ -638,6 +645,68 @@ The following request types exist: > > Defined by the experimental `WRITE_ZEROES` extension; see below. > > +* `NBD_CMD_CLOSE` (7) > + > + A close request. The server MUST handle all outstanding > + requests (there should be none by the spec), and then MUST send an > + `NBD_REP_ACK`. The server MUST NOT send an error reply. > + The server MUST NOT send anything after sending an `NBD_REP_ACK` > + in response to an `NBD_CMD_CLOSE`. Sorry for not catching this in v1, but NBD_REP_ACK is a handshake phase message, but we're in transmission phase. Particularly in the case of the userspace/kernel split-brained client, that would mean teaching the kernel to look for NBD_REP_ACK. Not good. Rather, we have to require that the server MUST send a normal transmission phase reply with no error set (we could go into details that the reply must be a simple reply if structured replies are not negotiated; or that if structured replies are available the reply should have NBD_REPLY_TYPE_NONE, but I don't think that adds anything over what the structured reply extension already documents). Furthermore, do we really require no error here... > + A client MUST wait until there are no outstanding requests > + before sending an `NBD_CMD_CLOSE`, and following this > + point it MUST treat either the receipt of an `NBD_REP_ACK` > + or a TCP level disconnect as a safe close. Again, NBD_REP_ACK is the wrong thing here; this should be a transmission reply with the error field set to 0. > + > + A disconnect in any other circumstances MUST be considered an > + unsafe close, except for `NBD_CMD_DISC` sent or received without > + outstanding requests, which the server or client MAY treat > + as it wishes. > + > + A client MUST NOT send anything to the server after sending an > + `NBD_CMD_CLOSE` command. On receipt of an `NBD_REP_ACK` in > + response to a `NBD_CMD_CLOSE`, the client MUST close the > + connection. > + > + The server MAY close the connection after sending `NBD_REP_ACK` > + provided it has first ensured all network traffic has been > + flushed to the socket (especially important if using TLS). More bad references to NBD_REP_ACK. > + > + The length and offset fields in an `NBD_CMD_CLOSE` request > + MUST both be zero. ...what happens if they aren't? Can the server set EINVAL in the error field, even though we said no errors up above? If we allow setting an error, would we want to leave the connection open on error? On the other hand, I guess the MUST is sufficient to state that a compliant client will pass 0, and then we don't care what happens to a non-compliant client. I'm trying to see if there's ever any way a future extension could make use of a magic length/offset pair (perhaps with a new flag), but even that would require that we decide whether the server has to behave differently on getting an unrecognized flag. I'm leaning towards: the server MAY report an error (such as EINVAL for an unrecognized flag or nonzero offset or length), but still MUST reply and send no further data; the client MUST disconnect whether the reply received was successful or an error. Then, if we ever DO need an extension, I would state that the extension requires structured replies, and we can amend the client to realize that if a new NBD_REPLY_TYPE_FOO is sent, then the extension behavior is in effect instead of the usual "traffic is done, client must close" semantics. Also, in reading this, I reviewed NBD_OPT_ABORT. Should we tighten any of the wording there? But a clean close after NBD_OPT_ABORT is less important: until the transmission phase starts, there is no data that needs to be flushed to persistent storage. So it doesn't matter if client and/or server do an unclean disconnect after client sends NBD_OPT_ABORT. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org
Attachment:
signature.asc
Description: OpenPGP digital signature