Bug#1037223: marked as done (Possible bug causing I/O hangs)
Your message dated Sat, 10 Aug 2024 14:20:05 +0200 (CEST)
with message-id <20240810122005.E5120BE2DE0@eldamar.lan>
and subject line Closing this bug (BTS maintenance for src:linux bugs)
has caused the Debian Bug report #1037223,
regarding Possible bug causing I/O hangs
to be marked as done.
This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.
(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact owner@bugs.debian.org
immediately.)
--
1037223: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1037223
Debian Bug Tracking System
Contact owner@bugs.debian.org with problems
--- Begin Message ---
- To: submit@bugs.debian.org
- Subject: Possible bug causing I/O hangs
- From: Niels Hendriks <niels@rootnet.nl>
- Date: Thu, 8 Jun 2023 11:33:13 +0200
- Message-id: <2589613972-3529444@mx.rootnet.nl>
- In-reply-to: <2587769829-3531453@mx.rootnet.nl>
Package: linux-image-amd64
Version: 5.10.178-3
Hi all,
I do not usually report kernel bugs so hopefully this is the right place!
We recently updated the kernel of our Debian 11 servers and since then we have encountered a bunch of servers (both VMs and bare metal) that suffer I/O hanging issues.
We can only access the server through a console from which I cannot copy text, so I have attached a screenshot showing the message we see in dmesg.
We initially thought this was related to the ext4 fast_commit feature flag we have enabled, and we do feel the issue occurs less often with fast_commit disabled, but it does not appear to be solved completely when we disable this feature.
Searching for this error led me first to https://github.com/flatcar/Flatcar/issues/847 and from there to this thread: https://www.spinics.net/lists/linux-ext4/msg86261.html
It mentions this fix: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/fs/ext4?h=linux-5.15.y&id=5bc0b2fda4b47c86278f7c6d30c211f425bf51cf
I believe this fix is currently not present in the 5.10 kernel available for Debian 11.
However, the linked fix also mentions:
> This bug has been around for many years, but it became *much* easier
> to hit after commit 65f8b80053a1 ("ext4: fix race when reusing xattr
> blocks").
Looking at the changelog: https://metadata.ftp-master.debian.org/changelogs//main/l/linux-signed-amd64/linux-signed-amd64_5.10.178+3_changelog
We do see the "ext4: fix race when reusing xattr blocks" change being added in 5.10.178-1.
This is why we believe we are now hitting this bug.
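For anyone who wants to repeat the changelog check above, here is a minimal sketch in Python. It assumes the changelog has already been downloaded as plain text and that the patch subject sits on a single line; the function name and the sample excerpt are hypothetical and only for illustration, not the real changelog contents.

```python
import re

# Header line of a Debian changelog stanza, e.g.
# "linux (5.10.178-1) bullseye; urgency=medium"
STANZA_HEADER = re.compile(r"^(?P<source>\S+) \((?P<version>[^)]+)\) ")


def versions_mentioning(changelog_text, subject):
    """Return the versions whose changelog stanza mentions `subject`."""
    hits = []
    current = None
    for line in changelog_text.splitlines():
        match = STANZA_HEADER.match(line)
        if match:
            # A new stanza starts; remember which version we are in.
            current = match.group("version")
        elif current is not None and subject in line and current not in hits:
            hits.append(current)
    return hits


# Hypothetical excerpt (placeholder entries, maintainers and dates).
SAMPLE = """\
linux (5.10.178-3) bullseye; urgency=medium

  * Placeholder entry for a later upload

 -- Maintainer <maintainer@example.org>  Mon, 01 Jan 2024 00:00:00 +0000

linux (5.10.178-1) bullseye; urgency=medium

  * New upstream stable update:
    - ext4: fix race when reusing xattr blocks

 -- Maintainer <maintainer@example.org>  Mon, 01 Jan 2024 00:00:00 +0000
"""

print(versions_mentioning(SAMPLE, "ext4: fix race when reusing xattr blocks"))
```

A plain substring match like this misses subjects that are wrapped across two changelog lines, so an empty result does not prove the patch is absent.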
My question is whether this seems plausible, and if so, whether the fix I linked can also be released for Debian 11?
We could also upgrade to the bullseye-backports kernel. However, this issue makes the system essentially unusable, and we hit it every few days on one of our servers, so it may be more widespread and worth fixing in the regular bullseye kernel as well.
Thank you!
Best regards,
--- End Message ---
--- Begin Message ---
Hi
This bug was filed for a very old kernel or the bug is old itself
without resolution.
If you can reproduce it with
- the current version in unstable/testing
- the latest kernel from backports
please reopen the bug, see https://www.debian.org/Bugs/server-control
for details.
Regards,
Salvatore
--- End Message ---