
Re: [Nbd] nbd-server 2.8.6 hangs on nbd-client reconnect



On Wed, Nov 15, 2006 at 07:33:07PM -0500, Mike Snitzer wrote:
> On 9/1/06, Mike Snitzer <snitzer@...17...> wrote:
> > On 8/31/06, Mike Snitzer <snitzer@...17...> wrote:
> > > On 8/29/06, Wouter Verhelst <wouter@...3...> wrote:
> 
> > > > Did you see this behaviour with previous versions of the server?
> > > > If not, I know where to look...
> > >
> > > So I'm not sure if your gut feeling on where to look was my select code, but...
> >
> > FYI, I've verified this problem with both 2.8.5 (accept-based) and
> > 2.8.6 (select-based).  It is clear that the nbd-server _seems_
> > perfectly fine (either blocked in accept or select) waiting for an
> > nbd-client connection.  When the nbd-client connection is made without
> > gdb or strace attached the nbd-server fails the connection and then
> > promptly wedges itself trying to get a mutex.
> 
> As you can see I previously established that I also saw the
> __lll_mutex_lock_wait() issue with 2.8.5.

Argl, I missed that. Okay, so then it's probably not the select code.

> I'm glad that you found a bug in my select code but I have my doubts
> that it _really_ matters relative to this hang.

The problem is that I /still/ haven't been able to reproduce this; that
makes debugging kinda hard.

> It is extremely strange that the 2.7 select code that you ported
> actually works given FD_SET() isn't _ever_ called... doesn't make any
> sense.

Indeed.
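
For reference, the usual pattern is to call FD_ZERO() and then FD_SET()
on the descriptor before every select() call; without FD_SET() the set
is empty and select() has nothing to watch. A minimal sketch, assuming
two hypothetical helpers (wait_readable() and pipe_demo()) that are not
taken from the nbd-server source, with a pipe standing in for the
listening socket:

```c
#include <assert.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper, not from nbd-server: wait until fd is readable
 * or timeout_sec seconds elapse.
 * Returns 1 if readable, 0 on timeout, -1 on error. */
int wait_readable(int fd, int timeout_sec)
{
    fd_set rset;
    struct timeval tv;
    int n;

    if (fd < 0 || fd >= FD_SETSIZE)
        return -1;          /* fd cannot be stored in an fd_set */

    FD_ZERO(&rset);         /* start from an empty set */
    FD_SET(fd, &rset);      /* without this, select() watches nothing */
    tv.tv_sec = timeout_sec;
    tv.tv_usec = 0;

    n = select(fd + 1, &rset, NULL, NULL, &tv);
    if (n <= 0)
        return n;           /* 0 = timeout, -1 = error */
    return FD_ISSET(fd, &rset) ? 1 : 0;
}

/* Hypothetical self-check: returns 0 if the helper behaves as expected. */
int pipe_demo(void)
{
    int fds[2];

    if (pipe(fds) != 0)
        return -1;
    assert(wait_readable(fds[0], 0) == 0);   /* nothing written yet */
    if (write(fds[1], "x", 1) != 1)
        return -1;
    assert(wait_readable(fds[0], 1) == 1);   /* one byte is pending */
    close(fds[0]);
    close(fds[1]);
    return 0;
}
```

The set has to be rebuilt on every call because select() overwrites it
with the descriptors that are actually ready.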

> I'll retest with 2.8.8 and/or 2.9.0 to see if this
> __lll_mutex_lock_wait() issue is still lurking.

FWIW, 2.8.8 isn't released yet; you'd have to pick the latest from SVN.

-- 
<Lo-lan-do> Home is where you have to wash the dishes.
  -- #debian-devel, Freenode, 2004-09-22


