Re: [Nbd] [PATCH v3 4/5] doc: Propose STRUCTURED_REPLY extension

To: Eric Blake <eblake@...696...>
Cc: "nbd-general@lists.sourceforge.net" <nbd-general@lists.sourceforge.net>, Wouter Verhelst <w@...112...>
Subject: Re: [Nbd] [PATCH v3 4/5] doc: Propose STRUCTURED_REPLY extension
From: Alex Bligh <alex@...872...>
Date: Thu, 31 Mar 2016 16:28:37 +0100
Message-id: <C88C5437-C921-474E-A728-27A018BAC9D1@...872...>
In-reply-to: <56FD3966.5010307@...696...>
References: <1459404384-5258-1-git-send-email-eblake@...696...> <1459404384-5258-5-git-send-email-eblake@...696...> <A6C62F41-19EF-4C8A-AC3F-031903A0F9F3@...872...> <56FD3966.5010307@...696...>

On 31 Mar 2016, at 15:51, Eric Blake <eblake@...696...> wrote:
>>> -The server replies with:
>>> +[option #A1 - only replies with payload are affected]
>>> +The simple reply message MUST be sent by the server in response to a
>>> +request that requires no data payload.
>> 
>> Well that's *definitely* wrong (even if I wrote it!) as it implies
>> that if structured replies have been negotiated, you must still
>> send a simple reply if the result is an error. That can't be right!
>> I think you (I?) mean "in response to a request that never returns
>> a data payload".
> 
> Indeed, my wording isn't quite right.  I'm starting to favor option #2
> rather than #1, so maybe we don't need to try any harder on the wording
> here, but I am trying to convey that "if a request never requires data
> in its reply (which covers all existing requests except CMD_READ at the
> current time), then for back-compat reasons those requests must continue
> to use simple replies for both success and error even when structured
> replies are negotiated".
> 
> But having written that, I still see two variations within both option 1
> and option 2:
> 
> - if a request has any mode that requires data in its reply, then ALL
> replies to that request must be structured
> 
> - a reply that includes data MUST use a structured reply, but a request
> (other than NBD_CMD_READ) can still have a simple reply in cases where
> no data is required (such as when reporting an error like EINVAL on
> improper flags

Please let's have the first of these!

> That is, Option #A1, #A2, and #A3 were concerned about what
> NBD_CMD_WRITE can reply,

although as I indicate elsewhere, it would actually be useful if
NBD_CMD_WRITE can return a structured reply so it can indicate an
error offset.

> while my two variations are concerned about
> what NBD_CMD_GET_LBA_STATUS can reply (I still think, for back-compat
> reasons, the NBD_CMD_READ must be special-cased, and be all-or-none with
> structured replies, even when reporting an error like EINVAL for an
> unrecognized flag).
> 
> If we go with option #A1 (CMD_WRITE never uses structured), then the
> parallel argument is that commands with possibility of data must never
> use simple.  If we go with option #A2 (CMD_WRITE may use structured),
> then (other than CMD_READ), it makes a bit more sense to allow a simple
> reply if the server is sending an error without data (then again, the
> whole notion of the structured error with UTF-8 human-readable string
> makes it less likely that we'd even have a reason to reply without data).

Hmm.. Well my favourite remains #A2 over #A1, but this is one reason I
preferred #A3 - the simplicity of negotiating a single reply format
and no complex rules about when one can be used versus the other! I
suppose the second simplest route is "If structured replies have
been negotiated, then the server MAY always send either structured
or unstructured replies". Sigh.

>> Could someone recap why we need this flag? We know when DF can be
>> sent, it's exactly when structured replies have been negotiated.
>> I think Wouter suggested there should be a 1:1 relationship between
>> SEND_ flags and CMD_ flags for convenience of user space. But that
>> seems odd as there isn't at the moment. For instance SEND_FUA is
>> on bit 3, CMD_FUA is on bit 0 or (cough) 16. None of the others
>> have CMD flag equivalents as far as I can see.
>> 
>> My vote would be delete the flag (again)!
> 
> Wouter proposed a pretty compelling use case: when the client is split
> between user/kernel, then the flow is:
> 
> user->kernel: ioctl(): are structured replies supported? if so:
> user->server: NBD_OPT_STRUCTURED_REPLY
> user->server: OPT_EXPORT_NAME/OPT_SELECT-OPT_GO
> server->user: transmission flags reflect if structured replies are on
> user->kernel: ioctl(): inform about transmission flags
> user->kernel: ioctl(): phase change
> kernel->server: send CMD_READ, reported flag says which response to expect
> 
> Yeah, the user space already has to ask the kernel if the kernel is
> willing to receive structured replies before sending
> NBD_OPT_STRUCTURED_REPLY to the server, but making the server echo back
> the negotiation in the transmission flags saves the userspace from
> needing an additional ioctl() to inform the kernel about whether the
> negotiation with the server was successful.

Well it sounds to me like instead we need a NBD_FLAG_SEND_STRUCTURED_REPLY
i.e. we need a flag to say "send structured replies", not a flag to say
"send DF".

Or am I missing the point.

>> You don't specifically prohibit more than one NBD_REPLY_TYPE_ERROR which
>> is probably a good thing, but if you intended to, you should document
>> it.
> 
> Okay, at this point, I'm leaning towards 'this chunk type SHOULD NOT be
> sent more than once in a structured reply' (SHOULD NOT, rather than MUST
> NOT, because there may still be lazy implementations that uses this type
> of error after each chunk with an error rather than reporting offsets,
> but good implementations will not do it), and drop the bit about it
> needing to be last or next to last.

+1

>>> +      This reply represents the contents of *length - 8* bytes of the
>>> +      file, starting at *offset*.  The data MUST lie within the bounds
>>> +      of the original offset and length of the client's request, and
>>> +      MUST NOT overlap with any earlier data or error chunks of the
>>> +      same reply.
>> 
>> Nope. It can overlap with error chunks (see the sendfile example). It
>> can't overlap with holes.
> 
> No, I meant the wording I used.  If you send an error chunk, you MUST
> NOT then send a data chunk that repeats the offset mentioned in the
> earlier error chunk.  An error chunk may overlap an earlier data chunk,
> but not the other way around.

Ah. The key is 'later'.

>> "and MUST NOT overlap with any earlier data or hole chunks of the
>> same reply"
> 
> And here, I was TRYING to be clever by using "data chunk" to mean
> "either TYPE_OFFSET_DATA or TYPE_OFFSET_HOLE".  But that overloads the
> term "data chunk", so maybe I can come up with a better term?

Doh. Hoist by my own terminology. Perhaps we need a different term.

> Again, I was trying to lump data and hole types both as "data chunks".
> Maybe "content chunks" is a good unambiguous term?

Yes.

>>> +* `EOVERFLOW`
>>> +
>>> +    The server SHOULD return `EOVERFLOW`, rather than `EINVAL`, when a
>>> +    client has requested `NBD_CMD_FLAG_DF` for a length that is too
>>> +    large to read without fragmentation.  The server MUST NOT return
>>> +    this error if the read request did not exceed 65,536 bytes, and
>>> +    SHOULD NOT return this error if `NBD_CMD_FLAG_DF` is not set.
>> 
>> Although I'm implementing a server at the moment where I'd like to
>> do just that with lengths in excess of 2^31.
> 
> Oh, good point - particularly true if you are targeting 32-bit hardware
> where ssize_t limits read() and write() to 2^31 bytes in a transaction
> (and in that case, both CMD_READ and CMD_WRITE should consider this case).
> 
> But what do we do about Wouter's request that we not permit EOVERFLOW
> without structured replies being negotiated?  Does that mean that
> CMD_WRITE cannot reply with EOVERFLOW unless structured replies are
> negotiated?

I'm guessing EIO?

> I _want_ to call the NONE type a chunk, but it is neither a 'data chunk'
> ('content chunk'?) nor an 'error chunk', and is a special type in that
> if it is used in a structured reply, it must always be the final chunk.

How about you define:

* content chunks (HOLE and DATA)
* non-content chunks (NONE and ERROR)

and define everything explicitly as a chunk.

--
Alex Bligh

Attachment: signature.asc
Description: Message signed with OpenPGP using GPGMail

Reply to:

Follow-Ups:
- Re: [Nbd] [PATCH v3 4/5] doc: Propose STRUCTURED_REPLY extension
  - From: Eric Blake <eblake@...696...>
- Re: [Nbd] [PATCH v3 4/5] doc: Propose STRUCTURED_REPLY extension
  - From: Wouter Verhelst <w@...112...>

References:
- [Nbd] [PATCH v3 0/5] Structured replies
  - From: Eric Blake <eblake@...696...>
- [Nbd] [PATCH v3 4/5] doc: Propose STRUCTURED_REPLY extension
  - From: Eric Blake <eblake@...696...>
- Re: [Nbd] [PATCH v3 4/5] doc: Propose STRUCTURED_REPLY extension
  - From: Alex Bligh <alex@...872...>
- Re: [Nbd] [PATCH v3 4/5] doc: Propose STRUCTURED_REPLY extension
  - From: Eric Blake <eblake@...696...>

Prev by Date: Re: [Nbd] [PATCH v3 4/5] doc: Propose STRUCTURED_REPLY extension
Next by Date: Re: [Nbd] [PATCH v3 4/5] doc: Propose STRUCTURED_REPLY extension
Previous by thread: Re: [Nbd] [PATCH v3 4/5] doc: Propose STRUCTURED_REPLY extension
Next by thread: Re: [Nbd] [PATCH v3 4/5] doc: Propose STRUCTURED_REPLY extension
Index(es):
- Date
- Thread