
Re: [Nbd] deadlock in nbd?



On 10/20/07, Peter Daum <gator_ml@...141...> wrote:
> Hi,
>
> --- Eric Gerlach <egerlach@...135...> wrote:
>
> > Is anyone out there running nbd-client on Debian
> > Etch successfully?  Or
> > is this problem particular to Debian?
>
> I don't think it is debian-specific: It also happens
> with "vanilla" kernel and userspace programs. To me,
> it looks like a deadlock in the kernel module. I am
> not sure yet (I only have 1 machine without interface
> bonding) but it seems like at least in my case there
> might be some bad interaction with the interface
> bonding ...

Which IO scheduler are you using?  Please have a look at this Red Hat
bug; it turns out there is definitely an issue with cfq (not
exclusive to Red Hat):
https://bugzilla.redhat.com/show_bug.cgi?id=241540

But if you're seeing this on a stock kernel.org kernel I'd imagine
you're just using anticipatory.  I've had good success using nbd with
deadline though...
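In case it helps anyone checking their own setup: the active scheduler
can be read per block device from sysfs (the bracketed entry is the one
in use), and switched by writing to the same file as root.  A minimal
sketch; the device name "nbd0" is just an example, substitute your own:

```shell
#!/bin/sh
# Print the current IO scheduler for a block device; the active
# scheduler is shown in brackets, e.g. "noop anticipatory [cfq] deadline".
show_scheduler() {
    f="/sys/block/$1/queue/scheduler"
    if [ -r "$f" ]; then
        cat "$f"
    else
        echo "no scheduler file for $1"
    fi
}

show_scheduler nbd0

# To switch to deadline (needs root):
#   echo deadline > /sys/block/nbd0/queue/scheduler
```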

What kind of workload do you have?  I've seen deadlocks (where mke2fs
is hung in blk_congestion_wait) with two servers being cross-connected
via MD over NBD, e.g.:
http://marc.info/?l=linux-mm&m=118981112030719&w=2

I found the solution to be Peter Z's per-block-device dirty threshold
patches (Linus has merged them into his latest git tree, so they'll be
in 2.6.24-rc1).

Mike
