Bug#798375: Stale SCTP sockets / recovery only by reboot
Package: src:linux
Version: 3.16.5-1
Severity: normal
Tags: patch
I appear to have run into an SCTP bug that has been around since 2009:
http://sourceforge.net/p/lksctp/mailman/message/23492403/
and has been fixed in mainline as part of commit
bdf6fa52f01b941d4a80372d56de465bdbbd1d23.
However, the bug was still visible in the Debian kernel version
3.16.5-1, and I suppose all that's needed is a cherry-pick.
It is possible to reproduce as follows:
* start an SCTP server socket (bind/listen) on Linux
* make the client connect to that
* accept() the connection on the server
* make the client disappear suddenly (e.g. power it down, disconnect its
network cable) in a way that there is no handshake to terminate the
association
* close the server program
** Linux sends SHUTDOWN and the associated retransmissions
* start the server program again
* make the client re-connect
** INIT/INIT_ACK/HEARTBEAT/HEARTBEAT_ACK/DT1/SACK messages can be seen
in Wireshark on the server and traverse nicely to and from the client,
but none of that data ever arrives on the socket. The re-started
server never even receives any notification about the new connection,
and subsequently no socket is created for it.
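For anyone trying to reproduce this, the server side of the steps above can be sketched roughly as follows. This is only an illustrative stand-in, not the program I actually used: the function names, the port number 5000 and the one-to-one (SOCK_STREAM) socket style are all assumptions.

```python
import socket

# IANA protocol number for SCTP, used as a fallback in case this Python
# build does not export socket.IPPROTO_SCTP.
IPPROTO_SCTP = getattr(socket, "IPPROTO_SCTP", 132)


def make_listener(host="0.0.0.0", port=5000, proto=IPPROTO_SCTP):
    """Create a listening socket (bind + listen); port 5000 is arbitrary."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM, proto)
    srv.bind((host, port))
    srv.listen(1)
    return srv


def serve_once(srv):
    """accept() one association and report whatever arrives on it."""
    conn, peer = srv.accept()
    print("accepted", peer)
    with conn:
        while True:
            data = conn.recv(4096)
            if not data:
                break
            print("received", len(data), "bytes")
```

Running `serve_once(make_listener())`, cutting off the client, stopping the program and starting it again corresponds to the server-side steps in the list. On an affected kernel the second run would be expected to block in accept() even though the handshake is visible on the wire.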
I double-checked, no other process had a clone of that socket open from
the userspace point of view.
The only way I found to recover was to reboot the server system, which
of course is quite annoying. Unloading the sctp kernel module was not
an option, as it still had a non-zero reference count, despite no SCTP
sockets existing anymore in the system.
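The leftover reference count can be read from /proc/modules, where the third whitespace-separated field of each line is the module's use count. A small helper along these lines (the function name is made up for illustration) extracts it:

```python
def module_refcount(name, modules_text):
    """Return the use count of `name` from /proc/modules-style text.

    Returns None if the module is not listed, or if the kernel reports
    "-" instead of a number (module unloading not supported).
    """
    for line in modules_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == name:
            return int(fields[2]) if fields[2].isdigit() else None
    return None
```

For example, `module_refcount("sctp", open("/proc/modules").read())` returning a value greater than zero while no SCTP sockets are open would match the situation described above.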
The issue was possible to reproduce several times, so it was not some
strange one-off behavior.
-- Package-specific info:
** Version:
Linux version 3.16-3-amd64 (debian-kernel@lists.debian.org) (gcc version 4.8.3 (Debian 4.8.3-12) ) #1 SMP Debian 3.16.5-1 (2014-10-10)
** Command line:
BOOT_IMAGE=/boot/vmlinuz-3.16-3-amd64 root=UUID=2fa8d876-a03f-447f-bda0-d7f6fed53004 ro
** Not tainted
-- System Information:
Debian Release: stretch/sid
APT prefers unstable
APT policy: (500, 'unstable')
Architecture: amd64 (x86_64)
Foreign Architectures: i386
Kernel: Linux 3.16-3-amd64 (SMP w/2 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=de_DE.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash
Init: systemd (via /run/systemd/system)
Versions of packages linux-image-3.16-3-amd64 depends on:
ii debconf [debconf-2.0] 1.5.57
ii initramfs-tools [linux-initramfs-tool] 0.120
ii kmod 21-1
ii linux-base 4.0
ii module-init-tools 21-1
Versions of packages linux-image-3.16-3-amd64 recommends:
ii firmware-linux-free 3.4
Versions of packages linux-image-3.16-3-amd64 suggests:
pn debian-kernel-handbook <none>
ii extlinux 3:6.03+dfsg-10
ii grub-pc 2.02~beta2-26
pn linux-doc-3.16 <none>
Versions of packages linux-image-3.16-3-amd64 is related to:
pn firmware-atheros <none>
pn firmware-bnx2 <none>
pn firmware-bnx2x <none>
pn firmware-brcm80211 <none>
pn firmware-intelwimax <none>
pn firmware-ipw2x00 <none>
pn firmware-ivtv <none>
ii firmware-iwlwifi 0.44
pn firmware-libertas <none>
pn firmware-linux <none>
pn firmware-linux-nonfree <none>
pn firmware-myricom <none>
pn firmware-netxen <none>
pn firmware-qlogic <none>
ii firmware-ralink 0.44
pn firmware-realtek <none>
pn xen-hypervisor <none>
-- debconf information:
linux-image-3.16-3-amd64/postinst/depmod-error-initrd-3.16-3-amd64: false
linux-image-3.16-3-amd64/prerm/removing-running-kernel-3.16-3-amd64: true
linux-image-3.16-3-amd64/postinst/mips-initrd-3.16-3-amd64: