On 31 Mar 2016, at 15:51, Eric Blake <eblake@...696...> wrote: >>> -The server replies with: >>> +[option #A1 - only replies with payload are affected] >>> +The simple reply message MUST be sent by the server in response to a >>> +request that requires no data payload. >> >> Well that's *definitely* wrong (even if I wrote it!) as it implies >> that if structured replies have been negotiated, you must still >> send a simple reply if the result is an error. That can't be right! >> I think you (I?) mean "in response to a request that never returns >> a data payload". > > Indeed, my wording isn't quite right. I'm starting to favor option #2 > rather than #1, so maybe we don't need to try any harder on the wording > here, but I am trying to convey that "if a request never requires data > in its reply (which covers all existing requests except CMD_READ at the > current time), then for back-compat reasons those requests must continue > to use simple replies for both success and error even when structured > replies are negotiated". > > But having written that, I still see two variations within both option 1 > and option 2: > > - if a request has any mode that requires data in its reply, then ALL > replies to that request must be structured > > - a reply that includes data MUST use a structured reply, but a request > (other than NBD_CMD_READ) can still have a simple reply in cases where > no data is required (such as when reporting an error like EINVAL on > improper flags Please let's have the first of these! > That is, Option #A1, #A2, and #A3 were concerned about what > NBD_CMD_WRITE can reply, although as I indicate elsewhere, it would actually be useful if NBD_CMD_WRITE can return a structured reply so it can indicate an error offset. > while my two variations are concerned about > what NBD_CMD_GET_LBA_STATUS can reply (I still think, for back-compat > reasons, the NBD_CMD_READ must be special-cased, and be all-or-none with > structured replies, even when reporting an error like EINVAL for an > unrecognized flag). > > If we go with option #A1 (CMD_WRITE never uses structured), then the > parallel argument is that commands with possibility of data must never > use simple. If we go with option #A2 (CMD_WRITE may use structured), > then (other than CMD_READ), it makes a bit more sense to allow a simple > reply if the server is sending an error without data (then again, the > whole notion of the structured error with UTF-8 human-readable string > makes it less likely that we'd even have a reason to reply without data). Hmm.. Well my favourite remains #A2 over #A1, but this is one reason I preferred #A3 - the simplicity of negotiating a single reply format and no complex rules about when one can be used versus the other! I suppose the second simplest route is "If structured replies have been negotiated, then the server MAY always send either structured or unstructured replies". Sigh. >> Could someone recap why we need this flag? We know when DF can be >> sent, it's exactly when structured replies have been negotiated. >> I think Wouter suggested there should be a 1:1 relationship between >> SEND_ flags and CMD_ flags for convenience of user space. But that >> seems odd as there isn't at the moment. For instance SEND_FUA is >> on bit 3, CMD_FUA is on bit 0 or (cough) 16. None of the others >> have CMD flag equivalents as far as I can see. >> >> My vote would be delete the flag (again)! > > Wouter proposed a pretty compelling use case: when the client is split > between user/kernel, then the flow is: > > user->kernel: ioctl(): are structured replies supported? if so: > user->server: NBD_OPT_STRUCTURED_REPLY > user->server: OPT_EXPORT_NAME/OPT_SELECT-OPT_GO > server->user: transmission flags reflect if structured replies are on > user->kernel: ioctl(): inform about transmission flags > user->kernel: ioctl(): phase change > kernel->server: send CMD_READ, reported flag says which response to expect > > Yeah, the user space already has to ask the kernel if the kernel is > willing to receive structured replies before sending > NBD_OPT_STRUCTURED_REPLY to the server, but making the server echo back > the negotiation in the transmission flags saves the userspace from > needing an additional ioctl() to inform the kernel about whether the > negotiation with the server was successful. Well it sounds to me like instead we need a NBD_FLAG_SEND_STRUCTURED_REPLY i.e. we need a flag to say "send structured replies", not a flag to say "send DF". Or am I missing the point. >> You don't specifically prohibit more than one NBD_REPLY_TYPE_ERROR which >> is probably a good thing, but if you intended to, you should document >> it. > > Okay, at this point, I'm leaning towards 'this chunk type SHOULD NOT be > sent more than once in a structured reply' (SHOULD NOT, rather than MUST > NOT, because there may still be lazy implementations that uses this type > of error after each chunk with an error rather than reporting offsets, > but good implementations will not do it), and drop the bit about it > needing to be last or next to last. +1 >>> + This reply represents the contents of *length - 8* bytes of the >>> + file, starting at *offset*. The data MUST lie within the bounds >>> + of the original offset and length of the client's request, and >>> + MUST NOT overlap with any earlier data or error chunks of the >>> + same reply. >> >> Nope. It can overlap with error chunks (see the sendfile example). It >> can't overlap with holes. > > No, I meant the wording I used. If you send an error chunk, you MUST > NOT then send a data chunk that repeats the offset mentioned in the > earlier error chunk. An error chunk may overlap an earlier data chunk, > but not the other way around. Ah. The key is 'later'. >> "and MUST NOT overlap with any earlier data or hole chunks of the >> same reply" > > And here, I was TRYING to be clever by using "data chunk" to mean > "either TYPE_OFFSET_DATA or TYPE_OFFSET_HOLE". But that overloads the > term "data chunk", so maybe I can come up with a better term? Doh. Hoist by my own terminology. Perhaps we need a different term. > Again, I was trying to lump data and hole types both as "data chunks". > Maybe "content chunks" is a good unambiguous term? Yes. >>> +* `EOVERFLOW` >>> + >>> + The server SHOULD return `EOVERFLOW`, rather than `EINVAL`, when a >>> + client has requested `NBD_CMD_FLAG_DF` for a length that is too >>> + large to read without fragmentation. The server MUST NOT return >>> + this error if the read request did not exceed 65,536 bytes, and >>> + SHOULD NOT return this error if `NBD_CMD_FLAG_DF` is not set. >> >> Although I'm implementing a server at the moment where I'd like to >> do just that with lengths in excess of 2^31. > > Oh, good point - particularly true if you are targeting 32-bit hardware > where ssize_t limits read() and write() to 2^31 bytes in a transaction > (and in that case, both CMD_READ and CMD_WRITE should consider this case). > > But what do we do about Wouter's request that we not permit EOVERFLOW > without structured replies being negotiated? Does that mean that > CMD_WRITE cannot reply with EOVERFLOW unless structured replies are > negotiated? I'm guessing EIO? > I _want_ to call the NONE type a chunk, but it is neither a 'data chunk' > ('content chunk'?) nor an 'error chunk', and is a special type in that > if it is used in a structured reply, it must always be the final chunk. How about you define: * content chunks (HOLE and DATA) * non-content chunks (NONE and ERROR) and define everything explicitly as a chunk. -- Alex Bligh
Attachment:
signature.asc
Description: Message signed with OpenPGP using GPGMail