[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Re: [Nbd] NBD_CMD_DISC



Wouter,

On 9 Apr 2016, at 10:16, Wouter Verhelst <w@...112...> wrote:

> On Wed, Apr 06, 2016 at 02:32:59PM -0600, Eric Blake wrote:
>> New client, old server: server does not send reply, client is stuck
>> waiting for a reply that never comes.  But since the server will close
>> the connection, the client can detect that the connection is closed.
>> Client is no worse off than old->old case.
> 
> This seems to work.
> 
>> New client, new server: server sends reply, and client waits for reply.
>> No more traffic will be sent from either end, and both ends know that
>> they need to close the connection; either end can close first without
>> hurting the other.
> 
> This will be effectively the same thing. The reply to NBD_CMD_DISC could
> even have the FIN flag set, too, resulting in no net benefit to the
> above.

I can't remember now exactly what Eric was suggesting (I think it was
a flag that effectively made NBD_CMD_DISC have the roughly the same
semantics as a close, i.e. gave a reply).

The space of available flags is limited. Why use one up? Before you
say 'well you're using a flag bit anyway', I'm using a flag bit
from the server to say it supports this, which we'd need on Eric's
proposal too (I think he converted over to the command approach).

> We can just say that the client MUST ensure that enough time has elapsed
> for the TLS layer to pass the NBD_CMD_DISC command on to the server,

How would the client know that? I'm using Go's TLS library, and there is
no way (as far as I can tell) to ensure that.

> and
> SHOULD wait for the server to close the connection, rather than doing it
> itself (SHOULD, because the server may have died and the client should
> be allowed to time out).
> 
> [...]
>> In other words, I'm not seeing what value added we have in either choice
>> 2 or choice 3 (what behavior did we guarantee by extending the
>> handshake, that we cannot already get by specifying best practice that
>> server MUST reply, and client SHOULD wait for server to close connection
>> but SHOULD NOT expect the reply due to back-compat?)
> 
> Not much. Even the reply doesn't seem necessary to me.

Without the reply, the client has no guarantee that the server has
run through the a safe disconnect.

An alternative approach would be to load all the effort onto the client,
I suppose, i.e. first wait until all inflight requests have been
transmitted, then send an NBD_CMD_FLUSH, wait for the reply to that,
and then send an NBD_CMD_DISC. But in that case, we don't even need
NBD_CMD_DISC. The client might as well just disconnect.

To me the whole concept of NBD_CMD_DISC is problematic. NBD is
in essence an RPC protocol. Normally in RPC protocols it's for the
client to close the connection when it knows it's done. The reasons
for an explicit disconnect command are firstly so the client knows
that any session state has been safely recorded (NBD_CMD_FLUSH
and waiting for inflight requests does that), and secondly so the
server knows the disconnect is intentional (that doesn't work with
NBD_CMD_DISC as in the wild we see clients closing the connection
immediately having sent NBD_CMD_DISC because they don't know how
long to wait for the client to 'reply' by closing the connection -
try 'qemu-img info nbd://...' for an obvious example), and this
makes the server think it's an unsafe disconnection. NBD_CMD_DISC
is currently no better than simply disconnecting.

If you're concerned about saving commands, the easy answer is this:

Disconnection
=============

Client side
-----------

Where the client wishes to disconnect safely, it MUST follow the following
procedure:

* First it must wait until there are no inflight commands, i.e. every
  request that it has sent has been replied to. From this point onwards
  it MUST NOT send further commands.

* Second, if the flag NBD_FLAG_SEND_FLUSH is set, it must sent a
  NBD_CMD_FLUSH command, and wait for the reply.

* At this point it closes the TCP session.


Server side
-----------

There is no explicit command to disconnect or close a connection.
If the server did not set NBD_FLAG_SEND_FLUSH, any closing of the
TCP session at a point where there are no inflight commands is a safe
close. If the server did set NBD_FLAG_SEND_FLUSH, any closing of the
TCP session where the last command replied to was NBD_CMD_FLUSH and
there are no inflight commands is a safe close.



That way we can delete the pointless NBD_CMD_DISC, an do the whole thing
with -1 new commands and no new flags. 

Why not drop the NBD_CMD_FLUSH and just do the flush on a disconnect?
Because of incompletely processed requests or sent but not processed
replies.

Alex

>> So it may just be a matter of requiring servers to reply, and requiring
>> clients to wait for the server to close the connection unless a reply is
>> received (ie. any time the client closes the connection first without
>> getting a reply may cause the server to treat shutdown as unclean and
>> behave differently).  Plus an ordering constraint: the server MUST NOT
>> reply until after all other pending requests have had replies sent.
> 
> Right.
> 
> -- 
> < ron> I mean, the main *practical* problem with C++, is there's like a dozen
>       people in the world who think they really understand all of its rules,
>       and pretty much all of them are just lying to themselves too.
> -- #debian-devel, OFTC, 2016-02-12
> 

-- 
Alex Bligh







Reply to: