
Re: [Nbd] 3.12 BUG() on ext4, kernel crash on nbd-client when nbd server rebooting

On 15-11-13 22:11, Paul Clements wrote:
> On Fri, Nov 15, 2013 at 2:50 AM, Wouter Verhelst <w@...112...> wrote:
>     I'm not sure if this has been implemented that way (that's Paul's area,
>     not mine), but the intention was that the nbd kernel module would only
>     do cleanup once the nbd-client process exits. 
> Not quite. It cleans up at the end of NBD_DO_IT ioctl, before returning
> to userland to do the reconnect.

Oh. That's a misunderstanding on my part, then.

>     That is, if nbd-client has
>     not yet exited, that could be because it's in -persist mode and is
>     trying to reconnect.
> The -persist mode will only work if there is no ongoing I/O. With I/O
> you're likely to get a kernel panic in the filesystem.


> In order for nbd to seamlessly handle this situation, we'd have to do a
> reconnect in-kernel

This would be fairly complicated, since all the connection and
negotiation currently happens in userspace. I'm not sure I want to go
down that route.

> (or have a callout to userland to reconnect)

That sounds interesting, too. How would you do that?

> and
> then we'd have to retry any I/Os that may have failed in the meantime
> (or just let them fail, but that probably is not as useful).
> The solution that Jack mentions is worth looking into -- it should at
> least avoid the filesystem panics that we now have. I'll take a look...

I'm not sure. This would mean that auto-reconnect couldn't work anymore
-- unless you also do the callout to userland thing you mentioned above.
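
To make the "callout to userland plus retry" idea a bit more concrete, here
is a minimal userspace simulation, not kernel code and not how nbd.c works
today. All names in it (process_queue, reconnect_fn, send_fn, fake_link and
so on) are hypothetical and only serve the illustration. The assumption is
that a request which fails because the connection dropped stays pending,
the callout is asked to reconnect, and the request is retried; requests are
only failed once reconnecting is given up.

```c
#include <stdbool.h>
#include <stddef.h>

enum req_state { REQ_PENDING, REQ_DONE, REQ_FAILED };

struct request {
    int id;
    enum req_state state;
};

/* Hypothetical callout: asks userland to re-establish the connection.
 * Returns true if a usable connection is available afterwards. */
typedef bool (*reconnect_fn)(void *ctx);

/* Hypothetical transport: returns true if the request was served. */
typedef bool (*send_fn)(void *ctx, const struct request *req);

/* Serve requests in order. On a send failure, keep the request pending,
 * ask for a reconnect and retry it. If reconnecting fails, or the retry
 * budget is used up, fail that request and everything queued behind it.
 * Returns the number of requests that completed. */
static size_t process_queue(struct request *reqs, size_t n,
                            send_fn send, reconnect_fn reconnect,
                            void *ctx, int max_reconnects)
{
    size_t done = 0;
    size_t i = 0;
    int reconnects = 0;

    while (i < n) {
        if (send(ctx, &reqs[i])) {
            reqs[i].state = REQ_DONE;
            done++;
            i++;
            continue;
        }
        if (reconnects < max_reconnects && reconnect(ctx)) {
            reconnects++;
            continue; /* retry the same request on the new connection */
        }
        for (; i < n; i++)
            reqs[i].state = REQ_FAILED;
    }
    return done;
}

/* Fake link for experimenting: drops the connection once, when it sees
 * the request with id drop_at_id. */
struct fake_link {
    bool up;
    int drop_at_id;
    bool server_back; /* whether a reconnect attempt can succeed */
};

static bool fake_send(void *ctx, const struct request *req)
{
    struct fake_link *l = ctx;

    if (!l->up)
        return false;
    if (req->id == l->drop_at_id) {
        l->up = false;
        l->drop_at_id = -1; /* only drop once */
        return false;
    }
    return true;
}

static bool fake_reconnect(void *ctx)
{
    struct fake_link *l = ctx;

    if (l->server_back)
        l->up = true;
    return l->up;
}
```

With a link that drops at request 1 and a server that comes back, all
requests complete and the filesystem never sees an error; if the server
stays down, request 1 and everything after it fail, which is no worse than
what "just let them fail" gives.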

This end should point toward the ground if you want to go to space.

If it starts pointing toward space you are having a bad problem and you
will not go to space today.

  -- http://xkcd.com/1133/
