Re: [Nbd] nbd-client hangs on negotiation - possible NBD bug
- To: Eric Tessler <maiden1134@...34...>
- Cc: email@example.com, Mike Snitzer <snitzer@...17...>
- Subject: Re: [Nbd] nbd-client hangs on negotiation - possible NBD bug
- From: Wouter Verhelst <wouter@...3...>
- Date: Tue, 1 Aug 2006 10:58:12 +0200
- Message-id: <20060801085812.GA11911@...85...>
- In-reply-to: <20060731225712.17085.qmail@...95...>
- References: <20060730231145.GF20140@...39...> <20060731225712.17085.qmail@...95...>
On Mon, Jul 31, 2006 at 03:57:12PM -0700, Eric Tessler wrote:
> Thanks to everyone for their input.
> I am not exporting the share as read-only - the failure test
> described below was only to see the client hanging on negotiation
> (typically I would share it as read only though - I tried it and the
> client still hangs).
> I think my engineers and I may take a closer look at this problem
> because it is blocking our development.
That would be great!
> We now know that the server side is blocking on the accept() call to
> the socket - even though the client is connecting to the server, the
> accept() call on the server side hangs.
One thing I've been meaning to do is to replace the current design
(where the server blocks in accept() until a client connects) with a
select()-based loop. I haven't done this yet, since the main server has
nothing else to do but wait for clients to connect and fork off servers
to handle them, but I don't know whether it's actually a good idea to
keep it this way.
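
For illustration only, here is a rough sketch of what waiting in select()
instead of accept() could look like. It is written in Python for brevity
rather than in the C that nbd-server uses, and it is not taken from the
nbd-server source; the helper names make_listener() and wait_for_client()
are made up for this example.

```python
import select
import socket


def make_listener(host="127.0.0.1", port=0):
    """Create a listening TCP socket (hypothetical helper).

    Port 0 asks the OS to pick any free port.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen(5)
    return listener


def wait_for_client(listener, timeout):
    """Wait up to `timeout` seconds for a pending connection.

    Returns the (connection, address) pair from accept(), or None if
    no client connected before the timeout expired.
    """
    # select() reports the listening socket as readable once a
    # connection is queued, so the accept() below should return
    # promptly instead of blocking indefinitely.
    readable, _, _ = select.select([listener], [], [], timeout)
    if listener not in readable:
        return None
    return listener.accept()
```

With a timeout, the main loop regains control periodically even when no
client connects, which leaves room for other housekeeping between calls.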
> The interesting thing is that when I disconnect the last client, the
> accept call unblocks the exact # of times that the client hung - it
> is almost like the client connection events are getting queued up on
> the server side of the socket and when you disconnect the last
> client, they are flushed out.
This could be because disconnecting a client means one of the forked off
servers will exit, sending a SIGCHLD to the main server. This in turn
will force it out of the accept() call to handle the SIGCHLD signal, which
might be just the little kick you need to get everything rolling again.
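
To make the mechanism concrete, here is a minimal sketch of a SIGCHLD
handler that reaps exited children. It is written in Python for brevity
and is not the nbd-server code; reap_children() and
install_sigchld_handler() are hypothetical names. In C, a blocked
accept() that is interrupted by a handled signal returns with EINTR
unless the handler was installed with SA_RESTART; Python 3.5 and later
run the handler and then retry the call automatically.

```python
import os
import signal


def reap_children():
    """Collect every child that has already exited, without blocking.

    Returns the list of reaped process IDs.
    """
    reaped = []
    while True:
        try:
            # -1 means "any child"; WNOHANG makes the call return
            # immediately instead of waiting for a child to exit.
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped.append(pid)
    return reaped


def install_sigchld_handler():
    """Reap exited children whenever SIGCHLD is delivered.

    Must be called from the main thread, as Python requires for
    signal.signal().
    """
    signal.signal(signal.SIGCHLD, lambda signum, frame: reap_children())
```

The loop matters because several children exiting close together may
be reported by a single SIGCHLD delivery.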
> Maybe this is a bug in the Linux kernel related to sockets? If we
> find what the problem is, I will post a fix for it.
If there is anything I can do to help while you're looking at this, feel
free to send mail to the mailing list, and I'll do what I can.
Fun will now commence
-- Seven Of Nine, "Ashes to Ashes", stardate 53679.4