
Bug#798375: Stale SCTP sockets / recovery only by reboot



Package: src:linux
Version: 3.16.5-1
Severity: normal
Tags: patch

I appear to have run into an SCTP bug that has been around since 2009:
http://sourceforge.net/p/lksctp/mailman/message/23492403/
and has been fixed in mainline as part of commit
bdf6fa52f01b941d4a80372d56de465bdbbd1d23.

However, the bug was still visible in the Debian kernel version
3.16.5-1, and I suppose all that's needed is a cherry-pick.

It is possible to reproduce as follows:

* start an SCTP server socket (bind/listen) on Linux
* make the client connect to that
* accept() the connection on the server
* make the client disappear suddenly (e.g. power it down, disconnect its
  network cable) in a way that there is no handshake to terminate the
  association
* close the server program
** Linux sends SHUTDOWN and the associated retransmissions
* start the server program again
* make the client re-connect
** INIT/INIT_ACK/HEARTBEAT/HEARTBEAT_ACK/DT1/SACK messages can be seen
   in wireshark on the server and traverse nicely to and from the client,
   but none of that data ever arrives on the socket.  The re-started
   server never even receives any notification about the new connection,
   and subsequently no socket is created for it.
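
The server side of the steps above can be sketched roughly as follows.
This is only a minimal illustration, not the program used for the
original test: it assumes Python's socket module on Linux with a
one-to-one style SCTP socket, and the port number 5000 as well as the
function names make_listener/serve_once are made up for the example.

```python
import socket

# IPPROTO_SCTP is protocol number 132; fall back to the number in case
# the constant is missing from this Python build.
IPPROTO_SCTP = getattr(socket, "IPPROTO_SCTP", 132)


def make_listener(host, port, proto=IPPROTO_SCTP):
    """bind/listen a one-to-one style socket (SCTP unless proto says otherwise)."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM, proto)
    srv.bind((host, port))
    srv.listen(5)
    return srv


def serve_once(srv, bufsize=4096):
    """accept() one connection and return (peer address, first chunk of data)."""
    conn, peer = srv.accept()
    with conn:
        data = conn.recv(bufsize)
    return peer, data


if __name__ == "__main__":
    srv = make_listener("0.0.0.0", 5000)  # port 5000 is an arbitrary choice
    while True:
        # In the failure case described above, accept() never returns
        # after the restart, although INIT/INIT_ACK are visible on the wire.
        print(serve_once(srv))
```

Killing and restarting such a program after the client has vanished
corresponds to the "close the server program" / "start the server
program again" steps.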

I double-checked, no other process had a clone of that socket open from
the userspace point of view.

The only way I found to recover was to reboot the server system, which
of course is quite annoying.  Unloading the sctp kernel module was not
an option, as it still had a reference count, despite no SCTP sockets
existing anymore in the system.
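
A quick way to look at that reference count is the third column of
/proc/modules (the same number lsmod shows under "Used by").  The
helper below is just a sketch under that assumption; the function name
module_refcount is hypothetical, and on kernels built without module
unloading the column holds "-" instead of a number.

```python
def module_refcount(name, modules_text):
    """Return the use count of module `name` from /proc/modules text.

    Returns None if the module is not listed or the count is not numeric.
    """
    for line in modules_text.splitlines():
        fields = line.split()
        # Columns: name, size, use count, users, state, load address
        if len(fields) >= 3 and fields[0] == name:
            return int(fields[2]) if fields[2].isdigit() else None
    return None


if __name__ == "__main__":
    with open("/proc/modules") as f:
        print("sctp use count:", module_refcount("sctp", f.read()))
```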

I was able to reproduce the issue several times, so it was not some
strange one-off behavior.

-- Package-specific info:
** Version:
Linux version 3.16-3-amd64 (debian-kernel@lists.debian.org) (gcc version 4.8.3 (Debian 4.8.3-12) ) #1 SMP Debian 3.16.5-1 (2014-10-10)

** Command line:
BOOT_IMAGE=/boot/vmlinuz-3.16-3-amd64 root=UUID=2fa8d876-a03f-447f-bda0-d7f6fed53004 ro

** Not tainted

-- System Information:
Debian Release: stretch/sid
  APT prefers unstable
  APT policy: (500, 'unstable')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 3.16-3-amd64 (SMP w/2 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=de_DE.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash
Init: systemd (via /run/systemd/system)

Versions of packages linux-image-3.16-3-amd64 depends on:
ii  debconf [debconf-2.0]                   1.5.57
ii  initramfs-tools [linux-initramfs-tool]  0.120
ii  kmod                                    21-1
ii  linux-base                              4.0
ii  module-init-tools                       21-1

Versions of packages linux-image-3.16-3-amd64 recommends:
ii  firmware-linux-free  3.4

Versions of packages linux-image-3.16-3-amd64 suggests:
pn  debian-kernel-handbook  <none>
ii  extlinux                3:6.03+dfsg-10
ii  grub-pc                 2.02~beta2-26
pn  linux-doc-3.16          <none>

Versions of packages linux-image-3.16-3-amd64 is related to:
pn  firmware-atheros        <none>
pn  firmware-bnx2           <none>
pn  firmware-bnx2x          <none>
pn  firmware-brcm80211      <none>
pn  firmware-intelwimax     <none>
pn  firmware-ipw2x00        <none>
pn  firmware-ivtv           <none>
ii  firmware-iwlwifi        0.44
pn  firmware-libertas       <none>
pn  firmware-linux          <none>
pn  firmware-linux-nonfree  <none>
pn  firmware-myricom        <none>
pn  firmware-netxen         <none>
pn  firmware-qlogic         <none>
ii  firmware-ralink         0.44
pn  firmware-realtek        <none>
pn  xen-hypervisor          <none>

-- debconf information:
  linux-image-3.16-3-amd64/postinst/depmod-error-initrd-3.16-3-amd64: false
  linux-image-3.16-3-amd64/prerm/removing-running-kernel-3.16-3-amd64: true
  linux-image-3.16-3-amd64/postinst/mips-initrd-3.16-3-amd64:

