[Paul: This is in response to a mail I sent asking for wishlist items.
I should probably have Cc'ed you on the original mail, but you may want
to check this and give your opinion.]
On Tue, Jun 19, 2007 at 07:22:59PM -0400, Mike Snitzer wrote:
> Wouter,
>
> I've got one I'd like to run by you: have the nbd-client detect that
> the kernel's nbd connection to the nbd-server is unresponsive.
>
> There is a window of time where a potentially long TCP timeout
> prevents the nbd (in kernel) from erroring out back to userspace
> (nbd-client). But if the nbd-client could feel that the nbd isn't
> behaving the nbd-client could send a SIGKILL down to the kernel (nbd
> driver already aborts TCP transmit if a SIGKILL is received).
>
> Any ideas on how we might pull this off?

Yeah, the reliability of the connection is a big question, and one I'm
not sure can be solved properly without protocol changes. Currently, the
client has no real way to send a packet to the server to ask "are you
still alive?". Worse, the server cannot send an unsolicited message to
the client at all; if it suspects the client is dead, it currently has
no way to check.

I'm thinking it would be good to extend the protocol with two packets,
one PING and one PONG (or so), that could be sent by either the client
or the server, so that either of them can check whether the other is
still there. The exchange should include a timeout of (say) 60 seconds
(this could probably be negotiated during the handshake) within which
the other side has to reply with the appropriate packet; if it doesn't,
it is assumed dead and the connection is closed.
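
To make the idea concrete, here is a minimal sketch in Python of how
either side could track such an exchange. Everything in it is an
assumption made for illustration: the packet type values, the 12-byte
packet layout, the cookie field and the LivenessTracker name do not exist
in the current protocol. Time is passed in explicitly so that the logic
can be followed without any real socket.

```python
import struct

# Hypothetical packet type codes; the current protocol defines neither.
NBD_PING = 0x50494E47  # ASCII "PING"
NBD_PONG = 0x504F4E47  # ASCII "PONG"


def encode_packet(ptype, cookie):
    """Pack an assumed layout: 32-bit type, 64-bit cookie, big-endian."""
    return struct.pack(">IQ", ptype, cookie)


def decode_packet(data):
    """Return the (type, cookie) pair from a 12-byte packet."""
    return struct.unpack(">IQ", data)


class LivenessTracker:
    """Tracks unanswered PINGs; usable by the client or the server."""

    def __init__(self, timeout=60.0):
        self.timeout = timeout   # seconds the peer has to answer
        self.pending = {}        # cookie -> time the PING was sent
        self.next_cookie = 1

    def make_ping(self, now):
        """Build a PING and remember when it was sent."""
        cookie = self.next_cookie
        self.next_cookie += 1
        self.pending[cookie] = now
        return encode_packet(NBD_PING, cookie)

    def handle_packet(self, data):
        """Answer a PING with a PONG; a PONG clears the matching PING."""
        ptype, cookie = decode_packet(data)
        if ptype == NBD_PING:
            return encode_packet(NBD_PONG, cookie)
        if ptype == NBD_PONG:
            self.pending.pop(cookie, None)
        return None

    def peer_is_dead(self, now):
        """True once any PING has gone unanswered for longer than timeout."""
        return any(now - sent > self.timeout
                   for sent in self.pending.values())
```

A caller would send the result of make_ping() periodically, feed every
received packet to handle_packet(), and close the connection as soon as
peer_is_dead() returns True. Because both roles use the same class, the
check works in both directions.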

I don't think any other way can reliably allow either the client or the
server to detect the other end's death. We're already using TCP
keepalive probes, and there's the -a option to nbd-server, but neither
is really a good solution -- the former because it takes literally days
to discover a lost connection, the latter because it a) assumes that
there is never a good reason for a client to be inactive for longer than
the time given on the nbd-server command line, b) only allows the server
to detect the death of the client, never the other way around, and,
well, c) because the implementation is currently broken :)

Implementing this backwards-compatibly is going to be the hardest part,
I guess. Perhaps the client could open one of the NBD devices and call a
specific ioctl to check whether the kernel supports this interface, and
then a bit could be set in the field of 'reserved' bits in the
handshake, but I'm not sure how this would best be done.
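
As a rough illustration of the reserved-bit idea, the Python sketch below
shows how a capability could be advertised in a reserved field and
checked by the other side. The field length, the bit position and all
names are assumptions made for the example, not part of the existing
handshake. It also assumes that older implementations fill the reserved
field with zeroes and ignore its contents, which is what would make the
scheme backwards-compatible.

```python
RESERVED_LEN = 128           # assumed size of the reserved field, in bytes
FLAG_PING_SUPPORTED = 0x01   # hypothetical capability bit in byte 0


def build_reserved(supports_ping):
    """Return the reserved field, with the capability bit set if supported."""
    reserved = bytearray(RESERVED_LEN)  # all zeroes, as an old peer would send
    if supports_ping:
        reserved[0] |= FLAG_PING_SUPPORTED
    return bytes(reserved)


def peer_supports_ping(reserved):
    """Check whether the peer advertised the hypothetical PING capability."""
    return (len(reserved) == RESERVED_LEN
            and bool(reserved[0] & FLAG_PING_SUPPORTED))


def use_ping(local_supports, reserved_from_peer):
    """Enable PING/PONG only if both ends support it."""
    return local_supports and peer_supports_ping(reserved_from_peer)
```

With this scheme, a new implementation talking to an old one sees an
all-zero field, use_ping() returns False, and the connection behaves
exactly as it does today.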