
Bug#1023563: marked as done (linux-image-5.10.0-19-amd64: Ephemeral ports are reused too quickly, even when net.ipv4.tcp_tw_reuse = 0)



Your message dated Thu, 01 May 2025 17:15:19 +0200 (CEST)
with message-id <20250501151519.2E0E3BE2DE0@eldamar.lan>
and subject line Closing this bug (BTS maintenance for src:linux bugs)
has caused the Debian Bug report #1023563,
regarding linux-image-5.10.0-19-amd64: Ephemeral ports are reused too quickly, even when net.ipv4.tcp_tw_reuse = 0
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact owner@bugs.debian.org
immediately.)


-- 
1023563: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1023563
Debian Bug Tracking System
Contact owner@bugs.debian.org with problems
--- Begin Message ---
Package: linux-image-5.10.0-19-amd64
Version: 5.10.149-2
Severity: important

Dear Maintainer,

Starting with linux-image-5.10.0-15-amd64 (5.10.120-1), it seems that
the kernel is reusing ephemeral tcp ports too quickly, even if
net.ipv4.tcp_tw_reuse is set to 0.

linux-image-5.10.0-14-amd64 (5.10.113-1) and all earlier versions did
not show that behaviour.

The behaviour is the same for IPv4 and IPv6.

* What led up to the situation?

I have a couple of medium-to-fairly busy web servers that open TCP
sessions (~15-20 new connections per second) to a dedicated port on a backend server. 
The connections are short-lived and terminated by the backend server
after 1 second on average.
This setup has been working for many years through many Debian releases
and kernel versions.

On July 2 2022 I updated (apt update) the systems, which upgraded the
linux kernel image from 5.10.0-14 to 5.10.0-15. 

Shortly afterwards I noticed an increasing number of connection errors
being reported by the web servers (timeouts).

Further analysis (mostly with tcpdump) showed that the web servers
had started reusing ephemeral TCP ports as soon as 30 seconds after their
last use. At that time (30 sec) the backend server (which is also Debian) still
had the corresponding sockets in the TIME_WAIT status and replied to the
new SYN packet with an ACK instead of a SYN ACK (this is of course
normal behaviour, since the socket was still open). The web server did
not expect the ACK and discarded it, occasionally resending the SYN,
until a timeout occurred.
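
The reuse interval described above can be measured from a capture. The
following is a minimal sketch, assuming the outgoing SYNs have already been
extracted (for example from tcpdump output) into (timestamp, source port)
pairs in chronological order. The function name min_port_reuse_gap is
hypothetical, and the result is the SYN-to-SYN gap, which only approximates
the time since the previous connection on that port was closed.

```python
from typing import Dict, Iterable, Optional, Tuple


def min_port_reuse_gap(syns: Iterable[Tuple[float, int]]) -> Optional[float]:
    """Return the shortest time in seconds between two SYNs sent from the
    same source port, or None if no port appears twice.

    `syns` holds (timestamp, source_port) pairs in chronological order.
    """
    last_seen: Dict[int, float] = {}
    shortest: Optional[float] = None
    for ts, port in syns:
        if port in last_seen:
            gap = ts - last_seen[port]
            if shortest is None or gap < shortest:
                shortest = gap
        # Remember the most recent use of this port.
        last_seen[port] = ts
    return shortest
```

A result well below the TIME_WAIT duration on the backend server would match
the behaviour reported here.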

The choice of ephemeral source ports appeared quite erratic. For some
seconds they were chosen in ascending order as expected, then
seemed to jump back to some lower position, proceed in ascending order
from there again, then jump back to the higher position from where they
had left off before etc.
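
The jumps described above can be located in a captured sequence of source
ports. This is a minimal sketch with a hypothetical helper name,
backward_jumps. It reports every step at which the port number decreases,
which also includes the ordinary wrap-around at the end of the port range.

```python
from typing import List, Sequence, Tuple


def backward_jumps(ports: Sequence[int]) -> List[Tuple[int, int, int]]:
    """Return (index, previous_port, port) for each step where the source
    port is lower than the one used immediately before it."""
    return [
        (i, ports[i - 1], ports[i])
        for i in range(1, len(ports))
        if ports[i] < ports[i - 1]
    ]
```

With strictly sequential selection one would expect a single such step per
pass through the port range. Many of them within a few seconds would match
the erratic pattern described above.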

* What exactly did you do (or not do) that was effective (or
  ineffective)?

I first raised the port range for the ephemeral ports by setting
net.ipv4.ip_local_port_range=1024 60999 (from the default 32768 60999).
This alleviated the situation (so that the timeouts became less
frequent), but did not solve the problem.
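
For comparison, a rough estimate of how long a full pass through the port
range would take at the connection rate mentioned earlier. This is a sketch
under an idealized assumption: ports are handed out strictly in sequence and
each is used once per pass. The function name seconds_to_cycle is
hypothetical.

```python
def seconds_to_cycle(low: int, high: int, conns_per_second: float) -> float:
    """Seconds needed to use every port in [low, high] once at a steady rate."""
    if conns_per_second <= 0:
        raise ValueError("conns_per_second must be positive")
    port_count = high - low + 1  # the range is inclusive at both ends
    return port_count / conns_per_second


# Default range 32768-60999 holds 28232 ports; 1024-60999 holds 59976.
default_cycle = seconds_to_cycle(32768, 60999, 20)  # about 1412 seconds
widened_cycle = seconds_to_cycle(1024, 60999, 20)   # about 2999 seconds
```

Under that assumption a port would not come up again for many minutes in
either configuration, so reuse after 30 seconds is not explained by the size
of the range alone.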

I then set net.ipv4.tcp_tw_reuse = 0 (from the default 2), which did not
change anything (as is expected in this case).
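
To confirm which values are actually in effect, both settings can be read
back from /proc/sys. This is a minimal sketch, assuming a Linux system with
procfs mounted. The helper names read_sysctl and parse_port_range are
hypothetical, and the simple dot-to-slash mapping only suits keys without
dots inside a component, such as the two used here.

```python
from pathlib import Path
from typing import Tuple


def parse_port_range(text: str) -> Tuple[int, int]:
    """Parse the whitespace-separated 'low high' pair found in
    /proc/sys/net/ipv4/ip_local_port_range."""
    low, high = text.split()
    return int(low), int(high)


def read_sysctl(name: str, root: str = "/proc/sys") -> str:
    """Read a sysctl value via procfs, e.g. 'net.ipv4.tcp_tw_reuse'."""
    return (Path(root) / name.replace(".", "/")).read_text().strip()
```

For example, parse_port_range(read_sysctl("net.ipv4.ip_local_port_range"))
returns the range as a pair of integers, and
read_sysctl("net.ipv4.tcp_tw_reuse") returns the current reuse setting as a
string.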

* What was the outcome of this action?

None of the measures I took proved effective. 

So I downgraded the kernel to 5.10.0-14, and the problem immediately
went away. The web servers now cycle through the available ~60000
ephemeral ports and come around to reusing them long after the socket
on the backend server has been closed.


I am opening this bug here because I am not knowledgeable enough about
the Debian kernel patches to decide whether or not this issue is already
present in the upstream vanilla kernel.

Thank you for looking into this.

Best regards

Markus Wernig

-- System Information:
Debian Release: 11.5
  APT prefers stable-updates
  APT policy: (500, 'stable-updates'), (500, 'stable-security'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 5.10.0-14-amd64 (SMP w/4 CPU threads)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8) (ignored: LC_ALL set to en_US.utf8), LANGUAGE=en_US:en
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)

Versions of packages linux-image-5.10.0-14-amd64 depends on:
ii  initramfs-tools [linux-initramfs-tool]  0.140
ii  kmod                                    28-1
ii  linux-base                              4.6

Versions of packages linux-image-5.10.0-14-amd64 recommends:
ii  apparmor             2.13.6-10
ii  firmware-linux-free  20200122-1

Versions of packages linux-image-5.10.0-14-amd64 suggests:
pn  debian-kernel-handbook  <none>
ii  grub-pc                 2.06-3~deb11u2
pn  linux-doc-5.10          <none>

--- End Message ---
--- Begin Message ---
Hi

This bug was filed for a (very) old kernel or the bug is old itself
without resolution. Maybe it was for a feature enablement which nobody
acted on. We are sorry we were not able to deal with this issue in a
timely manner.
There are many open bugs for the src:linux package and thus we are
closing older bugs where it's unclear if they still occur in newer
versions and are still relevant to the reporter. For an overview see:
https://bugs.debian.org/src:linux .

If you can reproduce your issue with

- the current version in unstable/testing
- the latest kernel from backports

or, if it was a feature addition/wishlist and you still consider it
relevant, then:

Please reopen the bug, see https://www.debian.org/Bugs/server-control
for details.

Please try to provide as many fresh details as possible, including kernel
logs where relevant. In particular, where an issue is coupled with specific
hardware we might ask you to do additional debugging on your side as the
owner of the hardware.

Regards,
Salvatore

--- End Message ---
