[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

[Nbd] NBD server terminates on SIGPIPE during negotiation



Hi

Problem: The listener/root server process terminates on SIGPIPE during
negotiation. This is hardly the desired behavior, since any
malfunctioning client can brought the listener server down by closing
the socket unexpectedly.

A strace showing the core of the problem (thanks to Juha):

01: select(5, [4], NULL, NULL, NULL)        = 1 (in [4])
02: write(1, "accept, ", 8)                 = 8
03: accept(4, {sa_family=AF_INET, sin_port=htons(45734), sin_addr=inet_addr("10.249.8.2")}, [16]) = 5
04: write(5, "NBDMAGIC", 8)                 = 8
05: write(5, "IHAVEOPT", 8)                 = 8
06: write(5, "\0\1", 2)                     = 2
07: read(5, 0x7fff07bea2c0, 4)              = -1 ECONNRESET (Connection reset by peer)
08: write(2, "Error: Negotiation failed/4: Con"..., 63) = 63
09: read(5, "", 8)                          = 0
10: read(5, "", 4)                          = 0
11: writev(5, [{"\0\3\350\211\4Ue\251", 8}, {"\0\0\0\0", 4}, {"\200\0\0\1", 4}, {"\0\0\0\0", 4}, {NULL, 0}], 5) = -1 EPIPE (Broken pipe)
--- SIGPIPE (Broken pipe) @ 0 (0) ---

The trace above can be traced back to serveloop() and
negotiate(). select() is called from serveloop(), the modern socket is
ready and accepted and the negotiation steps 1-3 (writes 04, 05 and
06) are successful. Then something happens to the client which causes
it to close the socket. The server gets notified about this with
ECONNRESET (07), and it's logged (08), but then the server continues
the negotiation process as if nothing had happened (09 and 10). The
problem boils down to the next writev() (11): writing to the broken
pipe causes SIGPIPE to be sent and since this is all is happening in
the listener/root process context and SIGPIPE action is the default
(Term), the process terminates.

(Obviously, there is a reason why client behaves as it behaves, but
that is out of the scope of this report. The point is that the server
should not fail even if one client misbehaves.)

How this should be fixed?

#1 Set the default action of SIGPIPE to Ign.

This is simplest to do, but not the right thing in my opinion. It
would probably work and fix this specific issue, but in my opinion the
current way of pushing through the negotation procedure no matter what
happens seems a bit flawed.

#2 Move the negotation procedure to the child process.

This would be quite clean way to ensure that catastrophic failure in
one connection would not affect other clients. But it definitely would
not remove the need for more robust error handling in general. And could
it have some nasty unexpected consequences? It seems that not, but someone
who knows the complete codebase should verify this.

#3 Fix the error handling.

Fix the error handling in negotiate() to handle all failure situations
and abort the process if nothing can be done. I think this is something
which should be done anyways. The current way of walking through the
procedure no matter if it fails or not is a bit ugly.

#4 The combination: #1 + #2 + #3

At this point, I'd would probably go with combining at least #1 and
#3. I'm not sure about the #2, but if it makes sense too, then I don't
see any problem in combining all three.

Thoughts, ideas, corrections?

-- 
Tuomas



Reply to: