[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Re: [Nbd] Thoughts about the handshake protocol



On Tue, May 31, 2011 at 09:03:32AM +0200, Goswin von Brederlow wrote:
> Wouter Verhelst <w@...112...> writes:
> 
> > *sigh*
> >
> > On Mon, May 30, 2011 at 01:45:27PM +0200, Goswin von Brederlow wrote:
> >> Hi,
> >> 
> >> revently it was mentioned that the new-style handshake has some problems
> >> and needs some fixing up for the future. So I wanted to collect some
> >> thoughts about this to maybe spark a redesign. Turned out to describe
> >> pretty much the complete handshake. Let me know what you think:
> >> 
> >> ----------------------------------------------------------------------
> >> Currently the handshake starts with:
> >> 
> >> S: "NBDMAGIC" (as in the old style handshake)
> >> S: 0x49484156454F5054 (note different magic number)
> >> S: 16 bits of zero (reserved for future use)
> >> C: 32 bits of zero (reserved for future use)
> >> 
> >> So the server identifies itself as NBD server. The client though does
> >> not identify itself but just starts sending payload.
> >> 
> >> Now what are those 16/32 bit of zero for? Some sort of version number or
> >> bitfield of options? The problem I have with them is that they allow no
> >> room for compromise. If the client is not at least as new as the server
> >> it will not understand all the bits the server sends. The only option
> >> the client then has it to give up.
> >
> > No, that's not true. Why would the client have to give up? If it doesn't
> > understand what a particular bit in the reserved field means, it must
> > ignore it. If a negotiation happens between a client and a server
> > whereby the client does not acknowledge understanding of a particular
> > option that is announced in the server's reserved field, then it's up to
> > the server to either end the communication, or fall back to old
> > behaviour. That's the whole point of these reserved bits.
> 
> Ah, ok. That isn't clear from the proto.txt.
> 
> > The only kind of 'required' future bits that I could imagine adding are
> > those that would pertain to authentication. Everything else we can
> > disable if the client would not support it. And since we're not
> > encrypting data between the client and the server (and I have no
> > intention of doing that, either), any authenticated TCP session could
> > theoretically be hijacked, which makes authentication useless; so I'm
> > not even sure authentication is ever going to happen.
> 
> +1.
> 
> >> Next the option haggling is a good idea. But it lacks a way for the
> >> server to say when it doesn't understand an option and for the client to
> >> say there are no more option comming.
> >
> > We've talked about that, earlier, and there is a solution already that
> > does not require breaking backwards compatibility.
> >
> > Now while I don't oppose improving the handshake per se if it can be
> > done and there is a benefit in doing so, I do oppose breaking backwards
> > compatibility if it can be avoided.
> 
> Didn't you say you didn't like that either?
> 
> >> ----------------------------------------------------------------------
> >> So I would like to get a bit more flexibility into this:
> >> 
> >> S: "NBDMAGIC" (as in the old style handshake)
> >> S: 8 byte magic number
> >> S: 16 bits required features
> >> S: 48 bits offer optional features
> >
> > Doing it that way *will* break backwards compatibility, because you're
> > changing the initial greeting you get from the server which is sent to
> > the client before the client can even respond. That's *bad*, period.
> >
> > I've changed the protocol incompatibly once, and I *do* *not* want to do
> > that again. Yes, I made some mistakes, and it would've been better with
> > some more review, but unfortunately I didn't get any at the time, so now
> > we're stuck with it. And that's the end of it.
> >
> > Also, there's no need to differ between 'optional' and 'required'
> > features. If there is a required feature, there are several ways in
> > which the client can inform the server that it understands this new
> > option (through setting a bit in its reserved field, or through sending
> > an option message). If that doesn't happen, the server can simply close
> > the connection.
> >
> > It might be nice to add a message that the server can send which would
> > say something like "you haven't acknowledged an option that I require",
> > or to have a way for the server to say "I don't understand that option".
> > But beyond that, I don't see the benefit of changing the negotation any
> > further.
> >
> > [...]
> >> A client MUST understand all required features set by the server. For
> >> now all bits would be reserved and 0 and any incompatible change would
> >> allocate one bit and set it. Hopefully there won't be any. Bit 15 of
> >> required features can be reserved to indicate another 8 byte of feature
> >> bits follow. The client acknowledges that it understands all the
> >> required features by echoing them back. Again bit 15 of the required
> >> features would indicate another 8 byte of feature bits follow the reply
> >> just like in the servers hello.
> >> 
> >> Optional features will be set by the server to indicate it supports a
> >> certain feature (e.g. FUA or FLUSH). The client responds with a bitfield
> >> indicating those features it wishes to use. Any unknown feature must be
> >> disabled (set to 0) and this must be safe to do so. Only bits set by the
> >> server might be returned set by the client. Only those features set by
> >> both server and client MAY be used.
> >
> > There's no need for any of that, IMO.
> >
> > Things I will change include:
> > - Supporting more options by making NBD_OPT_EXPORT_NAME 'special', as
> >   discussed previously.
> > - Setting a flag, which (if the client acknowledges it through its
> >   reserved field) will allow the server to drop the 124 bytes of zeroes
> >   (they're useless cruft that I forgot to remove)
> >
> > Beyond that, I don't see why it would be necessary to change more
> > things. As long as the client and the server can communicate and
> > negotiate with one another properly (and they can), there's no need to
> > change.
> 
> Ok. So you will use up the first reserved bit now. My suggestion then
> would be to do this all through setting a flag instead of making
> NBD_OPT_EXPORT_NAME 'special':
> 
> Bit 0 (server+client): support new-style handshake V2
> 
> Required changes:
> 
> - NBD_OPT_EXPORT_NAME now recieves a reply containing size+flags of the
>   export. No trailing zeroes. This option is required before NBD_OPT_END.

It also selects that export for use. I suppose a second option could be
added that gets the info but does not select it for use later on, if
needs be, but I don't think that would be necessary.

> - All options are acknowledged by the server
>   S: 0x49484156454F5054
>   S: 32 bits denoting the requested option
>   S: 32 bit error code (0 == SUCCESS)
>   S: 32 bit length of option data
>   S: length bytes of data
> 
>   On success the payload is option specific, on error the payload is
>   plain text describing the error.

Hmm. It seems more efficient if the server only replies when it has to:
- The client is using an option that the server does not understand: the
  server must reply with an error
- The client is using an option that requests some information from the
  server (like the "list exports" one you talked about)

> - If any further reserved bits (1-15) are not ACKed by the client or any
>   of the client options (bits 16-31) are not understood the server may
>   repsond to the first option with an error explaining what required
>   feature is not met and terminate the connection even though that reply
>   has nothing to do with the requested option.

There's no need to define this as "may respond to". The server replies
can be structured, and we can define that the NBD_OPT_END must also be
the last message sent by the server. If there is something going wrong,
the server can send a message that signals the client does not implement
a required option, with an error message in the payload.

What the server sends to the client may be in response to what the
client sends to the server, but it does not *have* to be.

> - New option NBD_OPT_END
>   This must be the last option send. On success the NBD protocol
>   starts. On error the connection is terminated.

Right, with an error message in the payload.

> Optional features (i.e. just reserve the numbers for the options,
> clients don't have to implement them):
> 
> - New option NBD_OPT_LIST_EXPORTS
>   Return a list of available exports (array of 0 terminated strings?)

Mmm. I'd prefer a newline-separated list of export names, that's going
to be easier -- and there's no need to support complex export names.

> - New option NBD_OPT_SYNC
>   Open the export with O_SYNC

Maybe make the 'sync' per-export option a tristate (yes, no, maybe, with
'maybe' defined as 'yes if the client asks for it, no otherwise').

> - New option NBD_OPT_READ_ONLY
>   Open the export read-only

No need for the client to ask the server that -- if it wants to open an
export read-only, all it needs to do is not issue write requests.

> - New option NBD_OPT_AUTO_SYNC
>   Run in sync mode until the first FUA/FLUSH command is used.

Don't see the benefit, but sure, why not.

> - New option NBD_OPT_ENABLE_PRIVATE
>   Enable private options between client and server. Used for
>   experimental features between client and server of the same make.
>   Client payload: client identifier
>   Server payload: defined by client

I'd prefer to define that as "server identifier", rather than "defined
by client". Or make it "defined by implementor", that works too.

> - New options NBD_OPT_PRIVATE_1-255
>   Client defined options, only valid after NBD_OPT_ENABLE_PRIVATE
>   returned success. Used for experimental features between client and
>   server of the same make.

Right. Or for stable features between client and server that are
specific to that particular implementation; doesn't necessarily have to
be experimental.

> Does that sound better?

Sure.

While we're at it: I'd also like to define the error codes a bit better.
Currently, the error codes are defined as 'whatever errno happened to
contain at the time the sever encountered an error'. Unfortunately,
that's not very well standardized; i.e., there's a difference in errno
not only from one OS to another, but also from one architecture to
another.

So instead, it might be better to have nbd-specific error codes, so that
the client might be able to fix up a request in case of some error codes
(like "oversized request", or similar, which the client would be able to
handle in theory). These could also be used in the option haggling
phase.

-- 
The volume of a pizza of thickness a and radius z can be described by
the following formula:

pi zz a



Reply to: