
Re: [Nbd] 3.12 BUG() on ext4, kernel crash on nbd-client when nbd server rebooting



On Sun 17-11-13 17:19:17, Alex Bligh wrote:
> 
> On 17 Nov 2013, at 09:46, Wouter Verhelst wrote:
> 
> >> 
> >> In order for nbd to seamlessly handle this situation, we'd have to do a
> >> reconnect in-kernel
> > 
> > This would be fairly complicated, since all the connection and
> > negotiation currently happens in userspace. I'm not sure I want to go
> > down that route.
> > 
> >> (or have a callout to userland to reconnect)
> > 
> > That sounds interesting, too. How would you do that?
> > 
> >> and
> >> then we'd have to retry any I/Os that may have failed in the meantime
> >> (or just let them fail, but that probably is not as useful).
> > 
> 
> Would another option be as follows:
> 
> 1. When persistence is required, a new persist flag is specified to
>    the kernel by the client.
> 
> 2. On a connection failure, if the persist flag is set, don't
>    clean up; instead, return with a specific error number. The fd is
>    still open (as it is still owned by the process), but (by
>    assumption) unusable.
> 
> 3. In persist mode, the block device only gets torn down when
>    the fd closes / the userland process terminates (whichever is
>    easier to detect; detection method TBD). Until then, all writes
>    block.
> 
> 4. A newer nbd client detects the errno in persist mode, opens another
>    fd, and calls the NBD_DO_IT ioctl passing the old fd as an
>    additional parameter (or does a new ioctl first to associate
>    the new fd with the old fd). A new kernel then detects this,
>    closes the old fd, and 'takes over' the existing block device
>    with the new fd.
> 
> On an old client, the kernel behaviour is thus unchanged. Similarly
> if persist is not required. If a new client in persist mode crashes
> after step (2), then the block device will still be torn down when
> the process exits.
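For concreteness, steps (1)-(4) might look roughly like this from the
client side. NBD_SET_SOCK and NBD_DO_IT are the existing ioctls, while
NBD_SET_PERSIST, NBD_TAKEOVER and EDEVPERSIST below are placeholders for
the proposed interface, not anything that exists today:

#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/nbd.h>                  /* NBD_SET_SOCK, NBD_DO_IT */

/* Placeholders for the proposed interface -- numbers picked at random. */
#define NBD_SET_PERSIST _IO(0xab, 100)  /* step (1) */
#define NBD_TAKEOVER    _IO(0xab, 101)  /* step (4) */
#define EDEVPERSIST     ENOTCONN        /* step (2): "device persists" */

extern int connect_and_negotiate(void); /* existing userspace code */

static int run_persistent(const char *dev)
{
        int fd = open(dev, O_RDWR);

        if (fd < 0)
                return -1;
        ioctl(fd, NBD_SET_PERSIST, 1);          /* step (1) */
        for (;;) {
                ioctl(fd, NBD_SET_SOCK, connect_and_negotiate());
                if (ioctl(fd, NBD_DO_IT) == 0)
                        break;                  /* clean disconnect */
                if (errno != EDEVPERSIST)
                        break;                  /* device was torn down */
                /*
                 * Step (4): the device survived the connection failure
                 * and writes are blocked.  Open a fresh fd, hand the
                 * device over to it, then reconnect.
                 */
                int new_fd = open(dev, O_RDWR);
                if (new_fd < 0)
                        break;
                ioctl(new_fd, NBD_TAKEOVER, fd);
                close(fd);
                fd = new_fd;
        }
        close(fd);
        return 0;
}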
  Just to make it clear, my only comment was that tearing the block device
down with kill_bdev() is the wrong way to do it (at least from a
filesystem POV). NBD should rather put the bdev into a state where it
returns EIO for anything you try to do with it after a network failure.
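In driver terms that means something like the following in the request
handling path (sketch only: the 'disconnected' flag is hypothetical, and
the completion calls just follow the style of the 3.12 driver rather
than its exact code):

/*
 * Once the connection is gone, complete every request with an error
 * (the filesystem then sees EIO) instead of nuking the page cache
 * behind the filesystem's back with kill_bdev().
 */
static void nbd_handle_req(struct nbd_device *nbd, struct request *req)
{
        if (nbd->disconnected) {        /* hypothetical per-device flag */
                req->errors++;          /* request completes with -EIO */
                nbd_end_request(req);
                return;
        }
        if (nbd_send_req(nbd, req) != 0) {
                req->errors++;          /* send failure also fails the IO */
                nbd_end_request(req);
        }
}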

If you want some kind of persistence over network failures, you can queue
IO and attempt a reconnect - that very much resembles the situation
dm-multipath solves for traditional Fibre Channel multipathing. So it
might be easiest to stack dm-multipath over NBD and teach the multipath
daemon about the specific needs of NBD, so that instead of switching to a
different Fibre Channel path it would try to reconnect the network
connection. Adding Hannes to CC, maybe he will know why that would be a
bad idea :).
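An untested sketch of the stacking (a single-path multipath table over
/dev/nbd0; the queue_if_no_path feature is what queues IO while the path
is down, device name and table made up):

# Wrap /dev/nbd0 in a dm-multipath device that queues I/O on path
# failure instead of erroring it out.
table="0 $(blockdev --getsz /dev/nbd0) multipath 1 queue_if_no_path 0 1 1 round-robin 0 1 1 /dev/nbd0 1000"
echo "$table" | dmsetup create nbd-persist

The hacked-up daemon would then reinstate the path once it has
re-established the connection to the server.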

								Honza
-- 
Jan Kara <jack@...1290...>
SUSE Labs, CR


