
Bug#953619: marked as done (umount stuck on D state if an isolated process running in busy loop)



Your message dated Thu, 20 Feb 2025 12:55:09 +0100 (CET)
with message-id <20250220115509.70702BE2EE7@eldamar.lan>
and subject line Closing this bug (BTS maintenance for src:linux bugs)
has caused the Debian Bug report #953619,
regarding umount stuck on D state if an isolated process running in busy loop
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact owner@bugs.debian.org
immediately.)


-- 
953619: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=953619
Debian Bug Tracking System
Contact owner@bugs.debian.org with problems
--- Begin Message ---
Package: src:linux
Version: 4.9.210-1
Severity: normal

Dear Maintainer,

I ran umount on my system and it got stuck.
Investigation findings:
* umount was stuck in D state
* after two minutes, the kernel logged "task umount:1302 blocked for more than 120 seconds"
* the process that caused umount to get stuck was:
  - running on an isolated CPU (no other processes were running on that CPU)
  - running with real-time priority (SCHED_FIFO)
  - running as root
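
For anyone wanting to confirm the D state without ps, the state letter can be read from /proc/<pid>/stat. The following is a minimal sketch, not part of the original report: the helper names parse_state and read_state are hypothetical, and it assumes the usual Linux layout of that file ("pid (comm) state ..."), where comm may itself contain spaces or parentheses.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: extract the state letter from one /proc/<pid>/stat line.
// The command name is wrapped in parentheses and may contain ')' itself,
// so search for the LAST ')' and take the character two positions after it.
// Returns '?' if the line does not have the expected shape.
char parse_state(const std::string& stat_line) {
    std::string::size_type close = stat_line.rfind(')');
    if (close == std::string::npos || close + 2 >= stat_line.size())
        return '?';
    return stat_line[close + 2];
}

// Hypothetical helper: read /proc/<pid>/stat and return the state letter,
// or '?' if the file cannot be opened or read.
char read_state(const std::string& pid) {
    std::ifstream in("/proc/" + pid + "/stat");
    std::string line;
    if (!in || !std::getline(in, line))
        return '?';
    return parse_state(line);
}
```

Calling read_state with the PID of the hung umount as a string should return 'D' while the process is in uninterruptible sleep.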
 
I managed to create an easy reproduction:
* created and mounted a loopback filesystem as follows:
dd if=/dev/zero of=foodisk bs=1M count=10
mkfs.ext4 foodisk
mkdir foodir
mount foodisk foodir

* compiled and ran the attached program
* ran "umount foodir"
* waited; umount stays blocked
* when the program is killed, umount is released

Note: when I tried to reproduce without the syslog call, the issue did not reproduce. I don't understand the correlation.

Thanks,
Roi


-- Package-specific info:
** Version:
Linux version 4.9.0-12-amd64 (debian-kernel@lists.debian.org) (gcc version 6.3.0 20170516 (Debian 6.3.0-18+deb9u1) ) #1 SMP Debian 4.9.210-1 (2020-01-20)

** Command line:
BOOT_IMAGE=/vmlinuz ro isolcpus=3 quiet init=/bin/systemd systemd.sysv_console=true systemd.show_status=yes

** Tainted: O (4096)
 * Out-of-tree module has been loaded.

** Kernel log:
[  108.864321] EXT4-fs (loop0): mounted filesystem with ordered data mode. Opts: (null)
[  364.235368] INFO: task umount:1302 blocked for more than 120 seconds.
[  364.236530]       Tainted: G           O    4.9.0-12-amd64 #1 Debian 4.9.210-1
[  364.237682] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  364.238842] umount          D    0  1302   1267 0x00000000
[  364.238850]  0000000000000086 ffff8e9ab1e57400 0000000000000000 ffff8e8ab47f8140
[  364.238855]  ffff8e8abf218980 ffffffff85e11500 ffffb964ce97bc90 ffffffff858193f9
[  364.238860]  ffff8e8abf7989e8 00ffb964ce97bcc0 ffff8e8abf218980 000000000000732a
[  364.238865] Call Trace:
[  364.238877]  [<ffffffff858193f9>] ? __schedule+0x239/0x6f0
[  364.238881]  [<ffffffff858198e2>] ? schedule+0x32/0x80
[  364.238885]  [<ffffffff8581cc7d>] ? schedule_timeout+0x1dd/0x380
[  364.238890]  [<ffffffff852b4a22>] ? enqueue_task_fair+0x82/0x940
[  364.238894]  [<ffffffff8581a321>] ? wait_for_completion+0xf1/0x130
[  364.238899]  [<ffffffff852a72d0>] ? wake_up_q+0x70/0x70
[  364.238906]  [<ffffffff852968fe>] ? flush_work+0x10e/0x1c0
[  364.238910]  [<ffffffff85292e60>] ? destroy_worker+0x80/0x80
[  364.238916]  [<ffffffff853970dd>] ? lru_add_drain_all+0x11d/0x160
[  364.238925]  [<ffffffff854497a1>] ? invalidate_bdev+0x21/0x40
[  364.238973]  [<ffffffffc02c755b>] ? ext4_put_super+0x1fb/0x3a0 [ext4]
[  364.238980]  [<ffffffff8540ff6c>] ? generic_shutdown_super+0x6c/0xf0
[  364.238984]  [<ffffffff854102d1>] ? kill_block_super+0x21/0x60
[  364.238988]  [<ffffffff854103da>] ? deactivate_locked_super+0x3a/0x70
[  364.238993]  [<ffffffff8542f4fb>] ? cleanup_mnt+0x3b/0x80
[  364.238998]  [<ffffffff8529a2bf>] ? task_work_run+0x7f/0xa0
[  364.239003]  [<ffffffff85203754>] ? exit_to_usermode_loop+0xa4/0xb0
[  364.239007]  [<ffffffff85203bd9>] ? do_syscall_64+0xe9/0x100
[  364.239011]  [<ffffffff8581e1ce>] ? entry_SYSCALL_64_after_swapgs+0x58/0xc6

** Model information
sys_vendor: Intel Corporation
product_name: S2600WT2R
product_version: ....................
chassis_vendor: ...............................
chassis_version: ..................
bios_vendor: Intel Corporation
bios_version: SE5C610.86B.01.01.0016.C5.033120161139
board_vendor: Intel Corporation
board_name: S2600WT2R
board_version: H21573-365


** USB devices:
not available


-- System Information:
Debian Release: 9.12
  APT prefers oldstable-updates
  APT policy: (500, 'oldstable-updates'), (500, 'oldstable')
Architecture: amd64 (x86_64)

Kernel: Linux 4.9.0-12-amd64 (SMP w/88 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE=en_US:en (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)

Versions of packages linux-image-4.9.0-12-amd64 depends on:
ii  initramfs-tools [linux-initramfs-tool]  0.130
ii  kmod                                    23-2
ii  linux-base                              4.5

Versions of packages linux-image-4.9.0-12-amd64 recommends:
pn  firmware-linux-free  <none>
pn  irqbalance           <none>

Versions of packages linux-image-4.9.0-12-amd64 suggests:
pn  debian-kernel-handbook  <none>
ii  grub-pc                 2.02~beta3-5+deb9u2
pn  linux-doc-4.9           <none>

Versions of packages linux-image-4.9.0-12-amd64 is related to:
pn  firmware-amd-graphics     <none>
pn  firmware-atheros          <none>
pn  firmware-bnx2             <none>
pn  firmware-bnx2x            <none>
pn  firmware-brcm80211        <none>
pn  firmware-cavium           <none>
pn  firmware-intel-sound      <none>
pn  firmware-intelwimax       <none>
pn  firmware-ipw2x00          <none>
pn  firmware-ivtv             <none>
pn  firmware-iwlwifi          <none>
pn  firmware-libertas         <none>
pn  firmware-linux-nonfree    <none>
pn  firmware-misc-nonfree     <none>
pn  firmware-myricom          <none>
pn  firmware-netxen           <none>
pn  firmware-qlogic           <none>
pn  firmware-realtek          <none>
pn  firmware-samsung          <none>
pn  firmware-siano            <none>
pn  firmware-ti-connectivity  <none>
pn  xen-hypervisor            <none>

-- no debconf information
#include <iostream>
#include <sched.h>
#include <syslog.h>

using namespace std;

// CPU isolated via isolcpus=3 on the kernel command line.
static const int isolCpuNum = 3;

int main() {
    cout << "Starting do_nothing..." << endl;

    // Pin the calling process (pid 0) to the isolated CPU.
    pid_t pid = 0;
    cpu_set_t mask;
    CPU_ZERO(&mask);
    CPU_SET(isolCpuNum, &mask);
    size_t cpusetsize = sizeof(mask);
    sched_setaffinity(pid, cpusetsize, &mask);

    // Switch to real-time scheduling (SCHED_FIFO, priority 90).
    int policy = SCHED_FIFO;
    sched_param param;
    param.sched_priority = 90;
    int ret_val = sched_setscheduler(pid, policy, &param);

    cout << "sched_setscheduler ret_val: " << ret_val << endl;

    // Without this syslog call the issue did not reproduce.
    syslog(LOG_NOTICE, "Hello World from do_nothing");

    // Busy loop: never sleeps, blocks, or yields.
    int count = 0;
    while (true) {
        if (count == 50000)
            count = 0;
        count++;
    }
    return 0;
}

--- End Message ---
--- Begin Message ---
Hi

This bug was filed for a very old kernel or the bug is old itself
without resolution.

If you can reproduce it with

- the current version in unstable/testing
- the latest kernel from backports

please reopen the bug, see https://www.debian.org/Bugs/server-control
for details.

Regards,
Salvatore

--- End Message ---
