
Re: [Nbd] NBD wishlist items?



On 6/21/07, Wouter Verhelst <w@...112...> wrote:
[Paul: This is in response to a mail I sent, asking for wishlist items.
Should probably have Cc'ed you on the original mail, but you may want to
check this and give your opinion]

On Tue, Jun 19, 2007 at 07:22:59PM -0400, Mike Snitzer wrote:
> Wouter,
>
> I've got one I'd like to run by you: have the nbd-client detect that
> the kernel's nbd connection to the nbd-server is unresponsive.
>
> There is a window of time where a potentially long TCP timeout
> prevents the in-kernel nbd driver from erroring out back to userspace
> (nbd-client).  But if the nbd-client could sense that the nbd device
> isn't responding, it could send a SIGKILL down to the kernel (the nbd
> driver already aborts its TCP transmit if a SIGKILL is received).
>
> Any ideas on how we might pull this off?

Yeah, the question of connection reliability is a big one, and one I'm
not sure can be addressed properly without protocol changes.

Currently, the client has no real way to send a packet to the server
asking "are you still alive?". Worse, the server has no way to send an
unsolicited message to the client; if it believes the client is dead,
there is currently no way to check.

I'm thinking it would be good to extend the protocol with two packets,
a PING and a PONG (or some such), that could be sent by either the
client or the server and that would allow either of them to check
whether the other is still there. It should include a timeout of (say)
60 seconds (this could probably be negotiated during the handshake)
during which the other side has to reply with the appropriate packet;
if it doesn't, it is assumed dead and the connection is killed.
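
Concretely, such packets could reuse the existing request framing.
Here's a sketch of what that might look like; the command numbers,
names, and field layout below are purely illustrative, not part of any
protocol (yet):

  #include <stdint.h>

  #define NBD_REQUEST_MAGIC 0x25609513

  /* Hypothetical command numbers; nothing is assigned yet. */
  #define NBD_CMD_PING 4  /* "are you still alive?" */
  #define NBD_CMD_PONG 5  /* "yes, I am" */

  /* Same layout as an ordinary request; all fields are big-endian on
   * the wire. */
  struct nbd_ping {
          uint32_t magic;     /* NBD_REQUEST_MAGIC */
          uint32_t type;      /* NBD_CMD_PING or NBD_CMD_PONG */
          char     handle[8]; /* echoed back in the PONG */
          uint64_t from;      /* unused for ping traffic; zero */
          uint32_t len;       /* unused for ping traffic; zero */
  };

Echoing the handle back would let a PONG be matched to the PING that
prompted it, in case several are in flight at once.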

I don't think any other way can reliably allow either the client or the
server to detect the other end's death. We're already using TCP
keepalive probes, and there's the -a option to nbd-server, but neither
is really a good solution -- the former because it takes literally days
to discover a lost connection, the latter because it a) assumes that
there is never a good reason for a client to be inactive for longer
than the time given on the nbd-server command line, b) only allows the
server to detect the death of the client, never the other way around,
and, well, c) is currently broken anyway :)
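
For what it's worth, the long detection window comes from the
system-wide keepalive defaults (on Linux, two hours of idle time before
probing even starts); the timers can be tightened per socket, which
shrinks the window even without protocol changes. A sketch, assuming an
already-connected socket fd:

  #include <netinet/in.h>
  #include <netinet/tcp.h>
  #include <sys/socket.h>

  /* Detect a dead peer after roughly 60 + 5 * 10 = 110 seconds
   * instead of the system-wide defaults. */
  static int tighten_keepalive(int fd)
  {
          int on = 1, idle = 60, intvl = 10, cnt = 5;

          if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on,
                         sizeof(on)) < 0)
                  return -1;
          if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle,
                         sizeof(idle)) < 0)
                  return -1;
          if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl,
                         sizeof(intvl)) < 0)
                  return -1;
          return setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &cnt,
                            sizeof(cnt));
  }

That still only detects a peer that is dead at the TCP level, though,
not one that is reachable but wedged at the NBD level -- which is
exactly what PING/PONG would catch.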

Implementing this backwards-compatibly is going to be the hardest part,
I guess. Perhaps having the client open the NBD device and call a
specific ioctl to verify that the kernel supports this interface, and
then setting a bit in the 'reserved' field of the handshake, could
work, but I'm not sure how this would best be done.
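
Just to illustrate the reserved-bits idea, the client-side check could
be as simple as the following; the flag name and bit position are
entirely made up:

  #include <stdint.h>

  /* Hypothetical: one bit carved out of the reserved area of the
   * handshake. A server that understands PING/PONG sets it; an old
   * server leaves it zero, so neither side ever sends ping traffic
   * to a peer that wouldn't understand it. */
  #define NBD_FLAG_SEND_PING (1 << 2)

  static int peer_supports_ping(uint32_t handshake_flags)
  {
          return (handshake_flags & NBD_FLAG_SEND_PING) != 0;
  }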

Thanks for the detailed response.  Yesterday I said that this
connection monitoring likely doesn't belong in the nbd-client, but I
was looking at the problem in a completely different way than the
protocol-level changes you suggest.  I think PING and PONG packets
usable by both client and server could be very good.  I look forward to
Paul's thoughts on this suggestion.

For existing nbd installations, I was looking for a less intrusive way
to check that the nbd-client (and, by association, /dev/nbd<x>) is
still fully connected and functional, so as to avoid changing the
kernel et al.  The following utility obviously wouldn't require
protocol changes, but it only addresses a subset of the concerns about
nbd connection reliability.

This utility could be as simple as an nbd-client -monitor (or
nbd-monitor) that upon start (via: nbd-monitor -t <timeout> [-p
<nbdClientPid>] /dev/nbd<x>) would look up the nbd-client process (with
the new /sys/block/nbd<x>/pid interface) or just use the specified
nbdClientPid.  It would then use multiple processes to periodically
perform a timed read from /dev/nbd<x>: each iteration would fork a
child process that reads from /dev/nbd<x>, and the parent process would
wait the specified timeout before killing the child's pid.  If the read
times out or fails, the nbd-client pid determined at startup gets a
SIGKILL (assuming that process is still running); a rough sketch
follows below.
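
In C, that loop might look something like this (the actual prototype
will be in Perl); the nbd-monitor interface is the hypothetical one
described above, with the option parsing simplified to positional
arguments:

  #include <fcntl.h>
  #include <signal.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <sys/types.h>
  #include <sys/wait.h>
  #include <unistd.h>

  /* Fork a child that reads one sector from the device; if it doesn't
   * finish within the timeout, assume the connection is wedged and
   * SIGKILL both the stuck child and the nbd-client process (which
   * makes the in-kernel nbd driver abort its TCP transmit). */
  static int timed_read_check(const char *dev, int timeout, pid_t client)
  {
          pid_t child = fork();

          if (child < 0)
                  return 0;
          if (child == 0) {
                  char buf[512];
                  int fd = open(dev, O_RDONLY);

                  _exit(fd >= 0 &&
                        read(fd, buf, sizeof(buf)) ==
                        (ssize_t)sizeof(buf) ? 0 : 1);
          }

          for (int waited = 0; waited < timeout; waited++) {
                  int status;

                  if (waitpid(child, &status, WNOHANG) == child)
                          return WIFEXITED(status) &&
                                 WEXITSTATUS(status) == 0;
                  sleep(1);
          }

          kill(child, SIGKILL);
          waitpid(child, NULL, 0);
          kill(client, SIGKILL);
          return 0;
  }

  int main(int argc, char **argv)
  {
          /* usage: nbd-monitor <timeout> <nbdClientPid> /dev/nbd<x>
           * (a real version would default to reading the pid from
           * /sys/block/nbd<x>/pid) */
          if (argc != 4)
                  return 1;

          while (timed_read_check(argv[3], atoi(argv[1]),
                                  (pid_t)atoi(argv[2])))
                  sleep(atoi(argv[1]));
          fprintf(stderr, "nbd-monitor: check failed, nbd-client killed\n");
          return 0;
  }

(A real version would probably also want O_DIRECT or a varying read
offset, so that the check isn't silently satisfied from the page
cache.)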

Please let me know if what I'm suggesting is fundamentally flawed in
some way.  I'm going to take a stab at prototyping this nbd-monitor in
Perl and give it a shot on SLES10 (where I happen to hit the TCP
timeout issue more frequently/reliably than on any other kernel).

Mike


