
Bug#798375: marked as done (Stale SCTP sockets / recovery only by reboot)



Your message dated Tue, 08 Sep 2015 19:06:38 +0100
with message-id <1441735598.2610.9.camel@decadent.org.uk>
and subject line Re: Bug#798375: Stale SCTP sockets / recovery only by reboot
has caused the Debian Bug report #798375,
regarding Stale SCTP sockets / recovery only by reboot
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact owner@bugs.debian.org
immediately.)


-- 
798375: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=798375
Debian Bug Tracking System
Contact owner@bugs.debian.org with problems
--- Begin Message ---
Package: src:linux
Version: 3.16.5-1
Severity: normal
Tags: patch

I appear to have run into an SCTP bug that has been around since 2009:
http://sourceforge.net/p/lksctp/mailman/message/23492403/
and has been fixed in mainline as part of commit
bdf6fa52f01b941d4a80372d56de465bdbbd1d23.

However, the bug was still visible in the Debian kernel version
3.16.5-1, and I suppose all that's needed is a cherry-pick.

It is possible to reproduce as follows:

* start an SCTP server socket (bind/listen) on Linux
* make the client connect to that
* accept() the connection on the server
* make the client disappear suddenly (e.g. power it down, disconnect its
  network cable) in a way that there is no handshake to terminate the
  association
* close the server program
** Linux sends SHUTDOWN and associated retransmissions
* start the server program again
* make the client re-connect
** INIT/INIT_ACK/HEARTBEAT/HEARTBEAT_ACK/DT1/SACK messages can be seen
   in wireshark on the server and traverse nicely to and from the client,
   but none of that data ever arrives on the socket.  The re-started
   server never even receives any notification about the new connection,
   and subsequently no socket is created for it.
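
For illustration, a minimal sketch of the server side of the steps above.
It assumes a one-to-one style SCTP socket (the report only says
bind/listen/accept). The port number 5000 and the helper names
make_listener and serve_one are hypothetical, not taken from the original
server program.

```python
import socket

# IANA protocol number for SCTP; the fallback covers Python builds that
# do not export socket.IPPROTO_SCTP.
IPPROTO_SCTP = getattr(socket, "IPPROTO_SCTP", 132)


def make_listener(host="0.0.0.0", port=5000, proto=IPPROTO_SCTP):
    """Return a bound, listening stream-style socket (SCTP by default)."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM, proto)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((host, port))
    srv.listen(5)
    return srv


def serve_one(srv):
    """Accept one connection and report whatever arrives on it."""
    conn, peer = srv.accept()
    print("accepted association from", peer)
    with conn:
        while True:
            data = conn.recv(4096)
            if not data:
                break
            print("received", len(data), "bytes")
```

Calling serve_one(make_listener()) in a loop gives a server that can be
stopped and restarted as in the steps above. With the behavior described
in this report, the restarted server would stay blocked in accept() even
though the new association is visible in wireshark.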

I double-checked, no other process had a clone of that socket open from
the userspace point of view.

The only way I found to recover was to reboot the server system, which
of course is quite annoying.  Unloading the sctp kernel module was not
an option, as it still had a reference count, despite no SCTP sockets
existing anymore in the system.
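
One way to observe the leftover reference count is to read the third
field of the sctp line in /proc/modules (the same number lsmod shows in
its "Used by" column). A small sketch follows; the helper name
module_refcount is hypothetical.

```python
def module_refcount(proc_modules_text, name="sctp"):
    """Return the use count of module `name`, or None if unavailable.

    Each line of /proc/modules has the form:
        name size refcount dependents state address
    """
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == name:
            try:
                return int(fields[2])
            except ValueError:
                # Kernels built without module unloading print "-" here.
                return None
    return None  # module not loaded
```

Passing it the contents of /proc/modules, for example
module_refcount(open("/proc/modules").read()), returns the current count.
A non-zero value while no SCTP sockets are open would match the situation
described above.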

The issue was possible to reproduce several times, so it was not some
strange one-off behavior.

-- Package-specific info:
** Version:
Linux version 3.16-3-amd64 (debian-kernel@lists.debian.org) (gcc version 4.8.3 (Debian 4.8.3-12) ) #1 SMP Debian 3.16.5-1 (2014-10-10)

** Command line:
BOOT_IMAGE=/boot/vmlinuz-3.16-3-amd64 root=UUID=2fa8d876-a03f-447f-bda0-d7f6fed53004 ro

** Not tainted

-- System Information:
Debian Release: stretch/sid
  APT prefers unstable
  APT policy: (500, 'unstable')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 3.16-3-amd64 (SMP w/2 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=de_DE.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash
Init: systemd (via /run/systemd/system)

Versions of packages linux-image-3.16-3-amd64 depends on:
ii  debconf [debconf-2.0]                   1.5.57
ii  initramfs-tools [linux-initramfs-tool]  0.120
ii  kmod                                    21-1
ii  linux-base                              4.0
ii  module-init-tools                       21-1

Versions of packages linux-image-3.16-3-amd64 recommends:
ii  firmware-linux-free  3.4

Versions of packages linux-image-3.16-3-amd64 suggests:
pn  debian-kernel-handbook  <none>
ii  extlinux                3:6.03+dfsg-10
ii  grub-pc                 2.02~beta2-26
pn  linux-doc-3.16          <none>

Versions of packages linux-image-3.16-3-amd64 is related to:
pn  firmware-atheros        <none>
pn  firmware-bnx2           <none>
pn  firmware-bnx2x          <none>
pn  firmware-brcm80211      <none>
pn  firmware-intelwimax     <none>
pn  firmware-ipw2x00        <none>
pn  firmware-ivtv           <none>
ii  firmware-iwlwifi        0.44
pn  firmware-libertas       <none>
pn  firmware-linux          <none>
pn  firmware-linux-nonfree  <none>
pn  firmware-myricom        <none>
pn  firmware-netxen         <none>
pn  firmware-qlogic         <none>
ii  firmware-ralink         0.44
pn  firmware-realtek        <none>
pn  xen-hypervisor          <none>

-- debconf information:
  linux-image-3.16-3-amd64/postinst/depmod-error-initrd-3.16-3-amd64: false
  linux-image-3.16-3-amd64/prerm/removing-running-kernel-3.16-3-amd64: true
  linux-image-3.16-3-amd64/postinst/mips-initrd-3.16-3-amd64:

--- End Message ---
--- Begin Message ---
Version: 3.16.7-1

Fixed upstream in 3.16.6.

Ben.

-- 
Ben Hutchings
This sentence contradicts itself - no actually it doesn't.



--- End Message ---
