
Re: [Nbd] found bug in nbd-server.c, handle_info()



Hi all,

I traced the issue some more, and it is related to the client side. The client connection to the localhost end of the tunnel appears to drop, but if the tunnel is set up from a different computer on the local subnet and nbd-client sends its connection through that, then nbd is stable.

So I'm pursuing why nbd-client making a connection to a localhost tunnel endpoint is fragile. I'm going to try ssh tunnels on the local subnet, so they are fast, to see whether the behavior is related to WAN latency/bandwidth or not.
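
A minimal sketch of how the localhost tunnel endpoint could be polled while reproducing this, to note when it stops accepting connections and how long a connect takes. The function name probe_endpoint and the host/port values are hypothetical, not anything from nbd itself. Note that a successful connect only shows that the local listener is accepting; it does not prove the far end of the tunnel is healthy.

```python
import socket
import time


def probe_endpoint(host, port, timeout=2.0):
    """Try one TCP connect to (host, port).

    Returns the connect time in seconds, or None if the connection
    is refused, times out, or otherwise fails.
    """
    start = time.monotonic()
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        return None
```

Calling probe_endpoint("127.0.0.1", port) in a loop against both the localhost tunnel and the tunnel on the other machine, with port set to whatever the tunnel listens on, would give a rough timeline to compare against the point where nbd-client hangs.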

When the localhost connection drops, it tends to leave nbd-client and the mount point difficult to close. SIGKILL on the entire stack of related software sometimes fails to make the processes exit so that things can be unwound. When SIGKILL does work, use of the nbd device can be recovered. It has the appearance of a deadlock in the nbd kernel module. Forced unload of the kernel module does not succeed. I have also seen nbd-client -d deadlock on the affected devices, similarly unresponsive to SIGKILL.

Greg


On 5/12/2017 3:23 AM, Alex Bligh wrote:
> On 12 May 2017, at 09:20, Wouter Verhelst <w@...112...> wrote:
>
>> Hi Greg,
>>
>> On Thu, May 11, 2017 at 09:53:54AM -0400, Menke, Gregory D. (GSFC-582.0)[Arctic Slope Technical Services, Inc.] wrote:
>>> Thanks Wouter,
>>>
>>> I'm chasing another issue seen both on the trunk and 3.15.2 where an
>>> apparent deadlock occurs, leaving the nbd device jammed until the
>>> filesystem op, client and server are SIGKILLed at which point the mount
>>> can be cleaned up.  In this case the tcp connection is running thru an
>>> ssh tunnel over a wan,
>> In that case, you're stacking TCP over TCP, which does not work very
>> well. See http://sites.inka.de/~W1011/devel/tcp-tcp.html for a pretty
>> good (technical) explanation on why that is the case.
> I don't think that's right. nbd over an ssh tunnel over tcp only
> has one tcp layer (the one transporting ssh).




