Re: [Nbd] nbd-server 2.8.6 hangs on nbd-client reconnect
- To: Mike Snitzer <snitzer@...17...>
- Cc: nbd-general@lists.sourceforge.net
- Subject: Re: [Nbd] nbd-server 2.8.6 hangs on nbd-client reconnect
- From: Wouter Verhelst <wouter@...3...>
- Date: Thu, 16 Nov 2006 10:51:38 +0100
- Message-id: <20061116095138.GC8048@...39...>
- In-reply-to: <170fa0d20611151633v254f19fra1cdc31931bdaa94@...18...>
- References: <170fa0d20608290937h59a6654fn320c2c0a2820ed5e@...18...> <20060829231548.GA17396@...39...> <170fa0d20608311502i2ec0afep7af132887335005f@...18...> <170fa0d20609011441r63a7e0dbyd597fc0e5ec46fc0@...18...> <170fa0d20611151633v254f19fra1cdc31931bdaa94@...18...>
On Wed, Nov 15, 2006 at 07:33:07PM -0500, Mike Snitzer wrote:
> On 9/1/06, Mike Snitzer <snitzer@...17...> wrote:
> > On 8/31/06, Mike Snitzer <snitzer@...17...> wrote:
> > > On 8/29/06, Wouter Verhelst <wouter@...3...> wrote:
>
> > > > Did you see this behaviour with previous versions of the server?
> > > > If not, I know where to look...
> > >
> > > So I'm not sure if your gut on where to look was my select code but...
> >
> > FYI, I've verified this problem with both 2.8.5 (accept-based) and
> > 2.8.6 (select-based). It is clear that the nbd-server _seems_
> > perfectly fine (either blocked in accept or select) waiting for an
> > nbd-client connection. When the nbd-client connection is made without
> > gdb or strace attached the nbd-server fails the connection and then
> > promptly wedges itself trying to get a mutex.
>
> As you can see I previously established that I also saw the
> __lll_mutex_lock_wait() issue with 2.8.5.
Argl. Missed that. Okay, so then it's probably not that.
> I'm glad that you found a bug in my select code but I have my doubts
> that it _really_ matters relative to this hang.
The problem is that I /still/ haven't been able to reproduce this; that
makes debugging kinda hard.
> It is extremely strange that the 2.7 select code that you ported
> actually works given FD_SET() isn't _ever_ called... doesn't make any
> sense.
Indeed.
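For reference, the usual select() pattern is roughly the following. This
is only a minimal sketch, not the actual nbd-server code; wait_readable()
is a made-up helper name and the millisecond timeout is an assumption for
the example. The point is that without the FD_SET() call the read set
stays empty, so select() has no descriptor to report on and should only
ever return on timeout or error.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper (not from nbd-server): wait until fd is readable.
 * Returns 1 if readable, 0 on timeout, -1 on select() error. */
static int wait_readable(int fd, int timeout_ms)
{
    fd_set rfds;
    struct timeval tv;
    int rc;

    /* fd_set can only hold descriptors below FD_SETSIZE. */
    assert(fd >= 0 && fd < FD_SETSIZE);

    /* The set must be rebuilt before every select() call, because
     * select() overwrites it with the descriptors that are ready. */
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);   /* the step that was reported as missing */

    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* First argument is the highest descriptor in any set, plus one. */
    rc = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (rc < 0)
        return -1;
    return (rc > 0 && FD_ISSET(fd, &rfds)) ? 1 : 0;
}
```

Calling this on the listening socket before accept() would be one way to
use it; whether that matches what the server really does is a separate
question.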
> I'll retest with 2.8.8 and/or 2.9.0 to see if this
> __lll_mutex_lock_wait() issue is still lurking.
FWIW, 2.8.8 isn't released yet; you'd have to pick the latest from SVN.
--
<Lo-lan-do> Home is where you have to wash the dishes.
-- #debian-devel, Freenode, 2004-09-22