[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Re: [Nbd] [PATCH v3 4/5] doc: Propose STRUCTURED_REPLY extension



On Thu, Mar 31, 2016 at 04:28:37PM +0100, Alex Bligh wrote:
> On 31 Mar 2016, at 15:51, Eric Blake <eblake@...696...> wrote:
> >>> -The server replies with:
> >>> +[option #A1 - only replies with payload are affected]
> >>> +The simple reply message MUST be sent by the server in response to a
> >>> +request that requires no data payload.
> >> 
> >> Well that's *definitely* wrong (even if I wrote it!) as it implies
> >> that if structured replies have been negotiated, you must still
> >> send a simple reply if the result is an error. That can't be right!
> >> I think you (I?) mean "in response to a request that never returns
> >> a data payload".
> > 
> > Indeed, my wording isn't quite right.  I'm starting to favor option #2
> > rather than #1, so maybe we don't need to try any harder on the wording
> > here, but I am trying to convey that "if a request never requires data
> > in its reply (which covers all existing requests except CMD_READ at the
> > current time), then for back-compat reasons those requests must continue
> > to use simple replies for both success and error even when structured
> > replies are negotiated".
> > 
> > But having written that, I still see two variations within both option 1
> > and option 2:
> > 
> > - if a request has any mode that requires data in its reply, then ALL
> > replies to that request must be structured
> > 
> > - a reply that includes data MUST use a structured reply, but a request
> > (other than NBD_CMD_READ) can still have a simple reply in cases where
> > no data is required (such as when reporting an error like EINVAL on
> > improper flags
> 
> Please let's have the first of these!

Agreed.

[...]
> >> Could someone recap why we need this flag? We know when DF can be
> >> sent, it's exactly when structured replies have been negotiated.
> >> I think Wouter suggested there should be a 1:1 relationship between
> >> SEND_ flags and CMD_ flags for convenience of user space. But that
> >> seems odd as there isn't at the moment. For instance SEND_FUA is
> >> on bit 3, CMD_FUA is on bit 0 or (cough) 16. None of the others
> >> have CMD flag equivalents as far as I can see.
> >> 
> >> My vote would be delete the flag (again)!
> > 
> > Wouter proposed a pretty compelling use case: when the client is split
> > between user/kernel, then the flow is:
> > 
> > user->kernel: ioctl(): are structured replies supported? if so:

This doesn't actually have to be an ioctl(), it could be a stat() on a file
under /sys.

> > user->server: NBD_OPT_STRUCTURED_REPLY
> > user->server: OPT_EXPORT_NAME/OPT_SELECT-OPT_GO
> > server->user: transmission flags reflect if structured replies are on
> > user->kernel: ioctl(): inform about transmission flags
> > user->kernel: ioctl(): phase change
> > kernel->server: send CMD_READ, reported flag says which response to expect
> > 
> > Yeah, the user space already has to ask the kernel if the kernel is
> > willing to receive structured replies before sending
> > NBD_OPT_STRUCTURED_REPLY to the server, but making the server echo back
> > the negotiation in the transmission flags saves the userspace from
> > needing an additional ioctl() to inform the kernel about whether the
> > negotiation with the server was successful.
> 
> Well it sounds to me like instead we need a NBD_FLAG_SEND_STRUCTURED_REPLY
> i.e. we need a flag to say "send structured replies", not a flag to say
> "send DF".
> 
> Or am I missing the point.

The client doesn't *need* to be informed that "there may be structured
replies coming your way". It can simply detect that the next reply
coming its way is a structured reply by looking at the magic number. Of
course, the server needs to know whether the client understands it (for
which there is the NBD_OPT_STRUCTURED_REPLY), but once the server's been
told, everything's fine as far as the client is concerned.

However, the client *does* need to be informed whether it may send a
"don't fragment" bit. If the first command the client wants to send is a
READ command, and it is not prepared to handle fragmented replies, then
it *has* to know whether it can tell the server not to send fragmented
replies.

The bit is not about telling the client about structured replies, it is
about telling the client about the DF bit.

When we add NBD_CMD_GET_BLOCK_STATUS, then there should be a similar
flag for that feature, informing the client that it may use it.

The above interaction (with /sys rather than ioctl) shows that by doing
this, we don't need to add any userspace API, which is probably a good
thing.

[...]
> >>> +* `EOVERFLOW`
> >>> +
> >>> +    The server SHOULD return `EOVERFLOW`, rather than `EINVAL`, when a
> >>> +    client has requested `NBD_CMD_FLAG_DF` for a length that is too
> >>> +    large to read without fragmentation.  The server MUST NOT return
> >>> +    this error if the read request did not exceed 65,536 bytes, and
> >>> +    SHOULD NOT return this error if `NBD_CMD_FLAG_DF` is not set.
> >> 
> >> Although I'm implementing a server at the moment where I'd like to
> >> do just that with lengths in excess of 2^31.

Why?

> > Oh, good point - particularly true if you are targeting 32-bit hardware
> > where ssize_t limits read() and write() to 2^31 bytes in a transaction
> > (and in that case, both CMD_READ and CMD_WRITE should consider this case).
> > 
> > But what do we do about Wouter's request that we not permit EOVERFLOW
> > without structured replies being negotiated?  Does that mean that
> > CMD_WRITE cannot reply with EOVERFLOW unless structured replies are
> > negotiated?
> 
> I'm guessing EIO?

No.

Large-file support on 32 bit architectures has existed since several
decades now. We should not cater to that; people should just write
software that can handle LFS offsets.

Even if your server cannot handle offsets in excess of 2^31, I still
don't see a problem -- in that case, you won't even be able to *open*
the file (let alone serve it), so what's the problem?

In other words, *if* a client sends an offset that the server cannot
handle, that eithe means the server sent an export size to the client
that it shouldn't have sent, *or* that the client sent a request for
some bytes outside of the announced size of the device, which would put
the server well within its rights to send an EINVAL.

-- 
< ron> I mean, the main *practical* problem with C++, is there's like a dozen
       people in the world who think they really understand all of its rules,
       and pretty much all of them are just lying to themselves too.
 -- #debian-devel, OFTC, 2016-02-12



Reply to: