
Re: [Nbd] NBD_CMD_DISC



On 04/06/2016 10:04 AM, Alex Bligh wrote:
> NBD_CMD_DISC does not have a reply. With TLS negotiated (which changes timing, nothing else) what I'm frequently seeing is that the client sends NBD_CMD_DISC then immediately closes the connection (which per doc/proto.md is permissible). Of course NBD_CMD_DISC never actually gets transmitted by the TLS layer, or even if it does it never gets as far as being decoded before the underlying TCP session closes hard.
> 
> From a server's point of view it is thus impossible to reliably distinguish between an intended clean close, and a dirty close, which is an unfortunate state of affairs.

Indeed, clean shutdown is a desirable property, and may have
ramifications on what the server does or what data guarantees the client
may have the next time it connects to the server for the same export (as
in, a server that knows the client is doing a clean shutdown can do
fsync(), whereas a connection that just disappears can leave data
stranded because something else went wrong).
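
To make the fsync() point concrete, here is a minimal sketch of what a
server might do when a session ends. It is not taken from any existing
server; finish_session and the saw_disc flag are made-up names, and a
real server would have more state to clean up.

```python
import os

def finish_session(export_fd, saw_disc):
    """Hypothetical end-of-session hook for one export.

    saw_disc is True only if NBD_CMD_DISC was actually decoded before the
    connection went away.
    """
    if saw_disc:
        # Clean shutdown: the client meant to leave, so push data to disk.
        os.fsync(export_fd)
        return "clean"
    # The connection just disappeared; something else may have gone wrong,
    # so the caller may want to log or handle this differently.
    return "unclean"
```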

> 
> This would be remedied by introducing a reply for NBD_CMD_DISC (signalled somehow - yet another flag I guess), the idea being that the client SHOULD wait for a reply before closing the connection. Now if the server closes the connection hard whilst the reply is still in the TLS buffer (either server side or client side) this won't be a disaster as from the server's point of view the close was clean, and the client can consider a hard close after sending NBD_CMD_DISC as a clean close anyway.

How will signalling help?

Let's consider some scenarios:

0. Existing state: old client with old server; client sends CMD_DISC and
must not send anything else, but nothing specifies whether it waits
around to read replies or just closes the connection early.  Old server
can close the connection on any error without receiving CMD_DISC, and is
documented to close the connection without reply even when receiving
CMD_DISC.  A server that has out-of-order requests doesn't even have to
flush those replies before disconnecting.  Furthermore, the length and
offset fields are documented as unspecified, so we can't even assign
meaning to those fields (as in "if you pass length 1, you want to wait
for a reply") (that's why new commands should explicitly require clients
to send 0 rather than unspecified).  This scenario won't change,
regardless of what we do for new.

1. Require that the server MUST reply with NBD_CMD_ACK, but document
that the client SHOULD be prepared for older servers that did not do
this. No additional handshaking is added.

Old client, new server: server sends reply, reply may or may not be
delivered before client has closed connection.  Server is no worse off
than old->old case.

New client, old server: server does not send reply, client is stuck
waiting for a reply that never comes.  But since the server will close
the connection, the client can detect that the connection is closed.
Client is no worse off than old->old case.

New client, new server: server sends reply, and client waits for reply.
No more traffic will be sent from either end, and both ends know that
they need to close the connection; either end can close first without
hurting the other.
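
The client side of choice 1 could be sketched roughly as follows. This
is only an illustration under some assumptions: no other requests are in
flight, the reply to CMD_DISC uses the normal 16-byte reply header, and
the function name send_disc_and_wait is made up. A timeout is left to
propagate as an exception.

```python
import socket
import struct

NBD_REQUEST_MAGIC = 0x25609513
NBD_REPLY_MAGIC = 0x67446698
NBD_CMD_DISC = 2
REPLY_LEN = 16  # magic (4) + error (4) + handle (8)

def send_disc_and_wait(sock, handle, timeout=5.0):
    """Send NBD_CMD_DISC, then wait for a reply or for the server to close.

    Returns "reply" if a reply for `handle` arrived (new server), or
    "closed" if the server closed the connection first (old server).
    """
    # Request header: magic, type, handle, offset, length. Offset and
    # length are sent as 0, although they are documented as unspecified.
    sock.sendall(struct.pack(">IIQQI", NBD_REQUEST_MAGIC, NBD_CMD_DISC,
                             handle, 0, 0))
    sock.settimeout(timeout)
    buf = b""
    while len(buf) < REPLY_LEN:
        try:
            chunk = sock.recv(REPLY_LEN - len(buf))
        except ConnectionResetError:
            return "closed"  # hard close after CMD_DISC is still fine
        if not chunk:
            return "closed"  # EOF before a full reply: old server
        buf += chunk
    magic, _error, got_handle = struct.unpack(">IIQ", buf)
    if magic != NBD_REPLY_MAGIC or got_handle != handle:
        raise ValueError("unexpected data while waiting for CMD_DISC reply")
    return "reply"
```

Either return value lets the client close its end; the difference only
matters if the client wants to know that the server saw the request.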

2. Similar to 1, but require handshaking from client->server: client
must advertise that it wants reply to CMD_DISC (how, by new NBD_OPT?).
Server is changed so that it MUST send a reply, but only if the client
negotiated it.

Old client, new server: client does not request new behavior, so server
does not send reply to CMD_DISC.  Server still no worse off than
old->old case.

New client, old server: client request is not recognized, client now
knows that no reply will be forthcoming, and can close the connection
itself. But it can also wait for the server to close the connection, so
client is no worse off than old->old case.

New client, new server: client request is recognized, client now knows
that a reply will come.

3. Similar to 1, but require advertising from server->client: server
must advertise that it will reply to CMD_DISC (how, via transmission
flag or global flag?).  Client does not inform server about whether it
cares.

Old client, new server: client ignores the advertising; server sends
response to CMD_DISC but the response may not be delivered if the client
closes the connection early.  No worse than the old->old case.

New client, old server: client sees the missing advertisement, and now
knows that no reply will be forthcoming, so it can close the connection
itself. But it can also wait for the server to close the connection, so
client is no worse off than old->old case.

New client, new server: client sees that server will send response, so
it must wait for the reply.

In other words, I'm not seeing what added value we get from either choice
2 or choice 3 (what behavior do we guarantee by extending the
handshake that we cannot already get by specifying best practice that
server MUST reply, and client SHOULD wait for server to close connection
but SHOULD NOT expect the reply due to back-compat?).

So it may just be a matter of requiring servers to reply, and requiring
clients to wait for the server to close the connection unless a reply is
received (ie. any time the client closes the connection first without
getting a reply may cause the server to treat shutdown as unclean and
behave differently).  Plus an ordering constraint: the server MUST NOT
reply until after all other pending requests have had replies sent.
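
The ordering constraint could be tracked on the server with something
like the sketch below. It is purely illustrative: DisconnectGate and its
method names are made up, and it only decides when the CMD_DISC reply
may go out, leaving the actual I/O to the caller.

```python
class DisconnectGate:
    """Holds back the NBD_CMD_DISC reply until all other replies are sent."""

    def __init__(self):
        self.pending = set()     # handles of requests not yet replied to
        self.disc_handle = None  # handle of NBD_CMD_DISC, once received

    def request_received(self, handle):
        # The client must not send anything after CMD_DISC.
        if self.disc_handle is not None:
            raise ValueError("request received after NBD_CMD_DISC")
        self.pending.add(handle)

    def disc_received(self, handle):
        self.disc_handle = handle
        return self._ready()

    def reply_sent(self, handle):
        self.pending.discard(handle)
        return self._ready()

    def _ready(self):
        # Return the CMD_DISC handle once its reply may be sent, else None.
        if self.disc_handle is not None and not self.pending:
            return self.disc_handle
        return None
```

For example, with requests 1 and 2 outstanding, disc_received(9) returns
None, and only the reply_sent() call that empties the pending set
returns 9.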

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
