
Re: [Nbd] nbd-client hangs on negotiation - possible NBD bug



On 7/24/06, Eric Tessler <maiden1134@...34...> wrote:

Hello,

I have been using NBD for the last year and I finally managed to break it. I
hit a case where the nbd-client hangs during the negotiation phase with the
nbd-server. In order to get this, I executed the following steps:

1. nbd-server 63333 /tmp/vol
2. nbd-client 192.168.123.203 63333 /dev/nbd1
     Negotiation: ..size = 409664KB
     bs=1024, sz=409664
3. nbd-client 192.168.123.203 63333 /dev/nbd2
     Negotiation: ..size = 409664KB
     bs=1024, sz=409664
4. nbd-client -d /dev/nbd2
5. Repeat step #3 above and the nbd-client hangs with the following output:
     Negotiation:
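
The sequence in steps 2 through 5 could be automated along these lines. This is only a sketch: the function names (repro_commands, run_step, reproduce) and the 30-second timeout are made up for illustration, and it assumes nbd-client returns to the shell once a device is set up, as the steps above suggest. It uses only the command lines already shown, and treats a client that does not return within the timeout as hung.

```python
import subprocess


def repro_commands(host, port, dev_a="/dev/nbd1", dev_b="/dev/nbd2"):
    """Build the command lines for steps 2-5 (hypothetical helper)."""
    connect_a = ["nbd-client", host, str(port), dev_a]
    connect_b = ["nbd-client", host, str(port), dev_b]
    disconnect_b = ["nbd-client", "-d", dev_b]
    # Step 5 repeats step 3, so connect_b appears twice.
    return [connect_a, connect_b, disconnect_b, connect_b]


def run_step(argv, timeout=30.0):
    """Run one command; report 'hung' if it does not exit within timeout."""
    try:
        done = subprocess.run(argv, timeout=timeout)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising.
        return "hung"
    return "ok" if done.returncode == 0 else "failed"


def reproduce(host, port, timeout=30.0):
    """Run the steps in order and stop at the first one that is not ok."""
    for argv in repro_commands(host, port):
        status = run_step(argv, timeout)
        print(" ".join(argv), "->", status)
        if status != "ok":
            return status
    return "ok"
```

With the nbd-server from step 1 already running, calling reproduce("192.168.123.203", 63333) as root would be expected to report "hung" on the last command if the problem is present.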

From this point onwards, I cannot connect to the NBD share on port 63333.
The interesting thing is that if I then kill the /dev/nbd1 device
(nbd-client -d /dev/nbd1) so there are no more clients using this share, I
can then execute steps #2 and #3 above and they both pass without a problem.
I have tried this both locally on the same machine and remotely, and the
nbd-client hang occurs every time. Everything seems to work fine until I
disconnect a single client; after that I cannot reconnect using a new client.

I added some debug output to the nbd client and server. The client is
connecting to the server through a socket, but the server's call to accept
on the socket never returns (as it should when a new client connects to the
server). I suspect this may have something to do with improper cleanup of
the client socket when the previous client disconnected from the server.
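
A symptom like this can be checked from the client side without involving the nbd kernel device. A TCP connect can complete in the kernel's listen backlog even when the server process never gets as far as accept, so the client sees a connection but no negotiation data. The sketch below is not part of NBD; probe_greeting is a hypothetical helper that connects, waits a few seconds for the first bytes, and reports what happened. It assumes the server opens the negotiation by sending data first (older NBD servers are understood to begin with the 8 bytes "NBDMAGIC").

```python
import socket


def probe_greeting(host, port, timeout=5.0, nbytes=8):
    """Connect and wait for the server's first bytes.

    Returns a (status, data) pair, where status is one of:
      'refused'  - nothing is listening on the port
      'silent'   - connected, but no data arrived within the timeout
      'closed'   - connected, but the server closed without sending
      'greeting' - the server sent data (returned in data)
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except ConnectionRefusedError:
        return ("refused", b"")
    with sock:
        sock.settimeout(timeout)
        try:
            data = sock.recv(nbytes)
        except socket.timeout:
            # Connected at the TCP level, yet the server never spoke.
            return ("silent", b"")
    return ("greeting", data) if data else ("closed", b"")
```

A "silent" result against the hung share, next to a "greeting" result against a freshly started server, would suggest the server is not servicing new connections rather than the client failing to reach it.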

Note that if I connect to the share using a single client and disconnect
that client, I can reconnect to the share w/o a problem. The problem appears
only when I have 2 clients connected to the share at the same time and I
disconnect one of them.

I am using NBD 2.7.3 with Fedora Core 3, 2.6.11.12 kernel.

Does anyone know of a bug fix that resolves this problem? I am not ready to
upgrade to the latest NBD just yet, but I would like to know if this is a
known problem that has already been fixed.

Eric,

This negotiation problem is still present with the latest nbd release
(2.8.5) BUT it is _not_ reproducible with the sequence that you say
reliably hangs your 2.7.3 setup.  I've yet to find a reliable
reproducer for 2.8.5 negotiation hangs.  I was hopeful that the
sequence you used would reproduce it on RHEL4U3.

With 2.8.5, using RHEL4U3 (2.6.9 + redhatstuff) the negotiation
problem occurs much more frequently than with a kernel.org 2.6.15+.
On the surface, it definitely appears to be an nbd-server issue given
that if you restart the nbd-server all is fine.  But given that a
kernel change makes this issue less frequent it is far from clear what
is going on.

Mike


