
Bug#1071501: marked as done (linux-image-6.1.0-21-arm64: Linux NFS client hangs in nfs4_lookup_revalidate)



Your message dated Thu, 27 Jun 2024 22:10:11 +0000
with message-id <E1sMxJj-000sZ1-JY@fasolo.debian.org>
and subject line Bug#1071501: fixed in linux 6.9.7-1
has caused the Debian Bug report #1071501,
regarding linux-image-6.1.0-21-arm64: Linux NFS client hangs in nfs4_lookup_revalidate
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact owner@bugs.debian.org
immediately.)


-- 
1071501: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1071501
Debian Bug Tracking System
Contact owner@bugs.debian.org with problems
--- Begin Message ---
Package: src:linux
Version: 6.1.90-1
Severity: normal
X-Debbugs-Cc: richard+debian+bugreport@kojedz.in

Dear Maintainer,

I am running Kubernetes on Debian, and the pods mount multiple NFS
shares. Dovecot processes run in the pods; they receive mail from
the internet and also serve as an IMAP server for clients. I
monitor my mail system by sending mails periodically (every 15 seconds) and
downloading them via IMAP. A few times I have found a dovecot process
stuck in D state, and a reboot was always needed to recover.

Unfortunately, I was not able to trigger the bug quickly, and I don't
know which operations dovecot issues, or in what order, to trigger
this behavior. Until I get closer, I have set up a similar but smaller
environment with just a single dovecot instance doing the
same work: delivering only test mails locally and serving them via IMAP
to the monitoring client, with everything stored on NFS. Fortunately, this also
triggers the bug: after a few hours, one of the dovecot processes is stuck
in D state. The kernel also reports the blocked task:

May 19 12:16:49 k8s-node07 kernel: INFO: task lmtp:665683 blocked for more than 120 seconds.
May 19 12:16:49 k8s-node07 kernel:       Not tainted 6.1.0-21-arm64 #1 Debian 6.1.90-1
May 19 12:16:49 k8s-node07 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 19 12:16:49 k8s-node07 kernel: task:lmtp            state:D stack:0     pid:665683 ppid:2881   flags:0x00000000
May 19 12:16:49 k8s-node07 kernel: Call trace:
May 19 12:16:49 k8s-node07 kernel:  __switch_to+0xf0/0x170
May 19 12:16:49 k8s-node07 kernel:  __schedule+0x340/0x940
May 19 12:16:49 k8s-node07 kernel:  schedule+0x58/0xf0
May 19 12:16:49 k8s-node07 kernel:  __nfs_lookup_revalidate+0x118/0x160 [nfs]
May 19 12:16:49 k8s-node07 kernel:  nfs4_lookup_revalidate+0x20/0x30 [nfs]
May 19 12:16:49 k8s-node07 kernel:  lookup_fast+0x138/0x150
May 19 12:16:49 k8s-node07 kernel:  walk_component+0x30/0x1a0
May 19 12:16:49 k8s-node07 kernel:  path_lookupat+0x80/0x1a4
May 19 12:16:49 k8s-node07 kernel:  filename_lookup+0xb4/0x1b0
May 19 12:16:49 k8s-node07 kernel:  vfs_statx+0x94/0x19c
May 19 12:16:49 k8s-node07 kernel:  vfs_fstatat+0x68/0x90
May 19 12:16:49 k8s-node07 kernel:  __do_sys_newfstatat+0x58/0xa0
May 19 12:16:49 k8s-node07 kernel:  __arm64_sys_newfstatat+0x28/0x34
May 19 12:16:49 k8s-node07 kernel:  invoke_syscall+0x78/0x100
May 19 12:16:49 k8s-node07 kernel:  el0_svc_common.constprop.0+0x4c/0xf4
May 19 12:16:49 k8s-node07 kernel:  do_el0_svc+0x34/0xd0
May 19 12:16:49 k8s-node07 kernel:  el0_svc+0x34/0xd4
May 19 12:16:49 k8s-node07 kernel:  el0t_64_sync_handler+0xf4/0x120
May 19 12:16:49 k8s-node07 kernel:  el0t_64_sync+0x18c/0x190

Or, for another process:

May 20 04:50:01 k8s-node07 kernel: INFO: task imap:8337 blocked for more than 120 seconds.
May 20 04:50:01 k8s-node07 kernel:       Not tainted 6.1.0-21-arm64 #1 Debian 6.1.90-1
May 20 04:50:01 k8s-node07 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 20 04:50:01 k8s-node07 kernel: task:imap            state:D stack:0     pid:8337  ppid:3164   flags:0x00000000
May 20 04:50:01 k8s-node07 kernel: Call trace:
May 20 04:50:01 k8s-node07 kernel:  __switch_to+0xf0/0x170
May 20 04:50:01 k8s-node07 kernel:  __schedule+0x340/0x940
May 20 04:50:01 k8s-node07 kernel:  schedule+0x58/0xf0
May 20 04:50:01 k8s-node07 kernel:  __nfs_lookup_revalidate+0x118/0x160 [nfs]
May 20 04:50:01 k8s-node07 kernel:  nfs4_lookup_revalidate+0x20/0x30 [nfs]
May 20 04:50:01 k8s-node07 kernel:  lookup_fast+0x138/0x150
May 20 04:50:01 k8s-node07 kernel:  walk_component+0x30/0x1a0
May 20 04:50:01 k8s-node07 kernel:  path_lookupat+0x80/0x1a4
May 20 04:50:01 k8s-node07 kernel:  filename_lookup+0xb4/0x1b0
May 20 04:50:01 k8s-node07 kernel:  vfs_statx+0x94/0x19c
May 20 04:50:01 k8s-node07 kernel:  vfs_fstatat+0x68/0x90
May 20 04:50:01 k8s-node07 kernel:  __do_sys_newfstatat+0x58/0xa0
May 20 04:50:01 k8s-node07 kernel:  __arm64_sys_newfstatat+0x28/0x34
May 20 04:50:01 k8s-node07 kernel:  invoke_syscall+0x78/0x100
May 20 04:50:01 k8s-node07 kernel:  el0_svc_common.constprop.0+0x4c/0xf4
May 20 04:50:01 k8s-node07 kernel:  do_el0_svc+0x34/0xd0
May 20 04:50:01 k8s-node07 kernel:  el0_svc+0x34/0xd4
May 20 04:50:01 k8s-node07 kernel:  el0t_64_sync_handler+0xf4/0x120
May 20 04:50:01 k8s-node07 kernel:  el0t_64_sync+0x18c/0x190
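
Both traces above block in the same place (__nfs_lookup_revalidate,
reached from a stat-family syscall). As a rough aid for comparing such
reports, the following is a minimal sketch that groups hung-task
messages from a kernel log and extracts the symbol names of each call
trace. The function name parse_hung_tasks, the SAMPLE excerpt, and the
regular expressions are illustrative assumptions, not part of any
existing tool; the patterns are only matched against the log format
shown in this report.

```python
import re

# Matches the header line the hung-task detector prints for each blocked task.
HUNG_RE = re.compile(
    r"INFO: task (?P<comm>\S+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)
# Matches one stack frame such as "__schedule+0x340/0x940".
FRAME_RE = re.compile(
    r"kernel:\s+(?P<sym>[A-Za-z0-9_.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)

# Excerpt copied from the first trace in this report.
SAMPLE = """\
May 19 12:16:49 k8s-node07 kernel: INFO: task lmtp:665683 blocked for more than 120 seconds.
May 19 12:16:49 k8s-node07 kernel:       Not tainted 6.1.0-21-arm64 #1 Debian 6.1.90-1
May 19 12:16:49 k8s-node07 kernel: Call trace:
May 19 12:16:49 k8s-node07 kernel:  __switch_to+0xf0/0x170
May 19 12:16:49 k8s-node07 kernel:  __schedule+0x340/0x940
May 19 12:16:49 k8s-node07 kernel:  schedule+0x58/0xf0
May 19 12:16:49 k8s-node07 kernel:  __nfs_lookup_revalidate+0x118/0x160 [nfs]
May 19 12:16:49 k8s-node07 kernel:  nfs4_lookup_revalidate+0x20/0x30 [nfs]
"""


def parse_hung_tasks(lines):
    """Return one dict per hung-task report, with its stack frame symbols."""
    reports = []
    current = None
    for line in lines:
        header = HUNG_RE.search(line)
        if header:
            current = {
                "comm": header["comm"],
                "pid": int(header["pid"]),
                "secs": int(header["secs"]),
                "frames": [],
            }
            reports.append(current)
            continue
        frame = FRAME_RE.search(line)
        if frame and current is not None:
            current["frames"].append(frame["sym"])
    return reports


if __name__ == "__main__":
    for report in parse_hung_tasks(SAMPLE.splitlines()):
        # Skip the scheduler frames to show where the task actually waits.
        waiting_in = [f for f in report["frames"]
                      if f not in ("__switch_to", "__schedule", "schedule")]
        print(report["comm"], report["pid"], report["secs"], waiting_in[:1])
```

Feeding it the full journal output instead of SAMPLE would list every
reported task; the frame filter in the usage part is only a convenience
for this particular trace shape.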


Of course, the NFS server is running, and other NFS mounts are still
working on the node. Also, this started to happen with Debian's
kernel. Before that, I was compiling my own upstream kernel, version
5.15, and with it I never experienced such a lockup.

Unfortunately, I don't know how to go further or how to collect more
relevant debugging information.
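
One low-effort starting point, offered only as a sketch, is to list the
processes currently in D state together with the kernel symbol they are
waiting in, using /proc/<pid>/stat and /proc/<pid>/wchan. The helper
names parse_stat and blocked_tasks are made up for this example. The
sketch assumes a Linux /proc layout, looks only at thread-group leaders
(not individual threads), and wchan may read as "0" when the kernel does
not expose the wait channel.

```python
import os


def parse_stat(text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat.

    The comm field is wrapped in parentheses and may itself contain
    spaces or parentheses, so split on the first "(" and the last ")".
    """
    lpar = text.index("(")
    rpar = text.rindex(")")
    pid = int(text[:lpar])
    comm = text[lpar + 1:rpar]
    state = text[rpar + 1:].split()[0]
    return pid, comm, state


def blocked_tasks(proc="/proc"):
    """Return (pid, comm, wchan) for every process in uninterruptible sleep."""
    found = []
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
            if state != "D":
                continue
            with open(os.path.join(proc, entry, "wchan")) as fh:
                wchan = fh.read().strip()
        except OSError:
            # The process exited in the meantime or the file is unreadable.
            continue
        found.append((pid, comm, wchan))
    return found


if __name__ == "__main__":
    for pid, comm, wchan in blocked_tasks():
        print(f"{pid}\t{comm}\t{wchan}")
```

Running this periodically on the affected node and keeping the output
alongside the kernel log would show which tasks stay blocked and where.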

I expect that dovecot, being just an application, should not be able to
cause any kernel-side lockups. In my test lab, this specific NFS mount is
mounted on only one machine, which suggests a Linux NFS client-side
issue, not one related to cache coherency between multiple clients.

-- Package-specific info:
** Version:
Linux version 6.1.0-21-arm64 (debian-kernel@lists.debian.org) (gcc-12 (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1 SMP Debian 6.1.90-1 (2024-05-03)

** Command line:
net.ifnames=0 console=ttyS2,1500000 console=tty1 root=UUID=b4ff4167-1fe9-4fd6-9b9c-c3c68d98108b rw rootwait panic=10

** Not tainted

** Kernel log:
May 20 04:52:02 k8s-node07 kernel: INFO: task imap:8337 blocked for more than 241 seconds.
May 20 04:52:02 k8s-node07 kernel:       Not tainted 6.1.0-21-arm64 #1 Debian 6.1.90-1
May 20 04:52:02 k8s-node07 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 20 04:52:02 k8s-node07 kernel: task:imap            state:D stack:0     pid:8337  ppid:3164   flags:0x00000000
May 20 04:52:02 k8s-node07 kernel: Call trace:
May 20 04:52:02 k8s-node07 kernel:  __switch_to+0xf0/0x170
May 20 04:52:02 k8s-node07 kernel:  __schedule+0x340/0x940
May 20 04:52:02 k8s-node07 kernel:  schedule+0x58/0xf0
May 20 04:52:02 k8s-node07 kernel:  __nfs_lookup_revalidate+0x118/0x160 [nfs]
May 20 04:52:02 k8s-node07 kernel:  nfs4_lookup_revalidate+0x20/0x30 [nfs]
May 20 04:52:02 k8s-node07 kernel:  lookup_fast+0x138/0x150
May 20 04:52:02 k8s-node07 kernel:  walk_component+0x30/0x1a0
May 20 04:52:02 k8s-node07 kernel:  path_lookupat+0x80/0x1a4
May 20 04:52:02 k8s-node07 kernel:  filename_lookup+0xb4/0x1b0
May 20 04:52:02 k8s-node07 kernel:  vfs_statx+0x94/0x19c
May 20 04:52:02 k8s-node07 kernel:  vfs_fstatat+0x68/0x90
May 20 04:52:02 k8s-node07 kernel:  __do_sys_newfstatat+0x58/0xa0
May 20 04:52:02 k8s-node07 kernel:  __arm64_sys_newfstatat+0x28/0x34
May 20 04:52:02 k8s-node07 kernel:  invoke_syscall+0x78/0x100
May 20 04:52:02 k8s-node07 kernel:  el0_svc_common.constprop.0+0x4c/0xf4
May 20 04:52:02 k8s-node07 kernel:  do_el0_svc+0x34/0xd0
May 20 04:52:02 k8s-node07 kernel:  el0_svc+0x34/0xd4
May 20 04:52:02 k8s-node07 kernel:  el0t_64_sync_handler+0xf4/0x120
May 20 04:52:02 k8s-node07 kernel:  el0t_64_sync+0x18c/0x190

** Model information

** Loaded modules:
sd_mod
t10_pi
crc64_rocksoft_generic
crc64_rocksoft
crc_t10dif
crct10dif_generic
crc64
sg
iscsi_tcp
libiscsi_tcp
libiscsi
scsi_transport_iscsi
scsi_mod
scsi_common
nf_conntrack_netlink
rpcsec_gss_krb5
auth_rpcgss
nfsv4
dns_resolver
nfs
lockd
grace
fscache
netfs
nft_log
nft_limit
xt_limit
xt_NFLOG
nfnetlink_log
xt_physdev
xt_TCPMSS
xt_tcpudp
xt_mark
xt_multiport
xt_addrtype
dummy
ipt_REJECT
nf_reject_ipv4
ip_set_hash_ipport
nft_chain_nat
xt_nat
xt_MASQUERADE
xt_ipvs
nf_nat
xt_set
ip_set_hash_ip
ip_set_hash_net
ip_set
veth
xt_conntrack
xt_comment
nft_compat
nf_tables
nfnetlink
overlay
sunrpc
binfmt_misc
evdev
aes_ce_blk
snd_soc_rk817
aes_ce_cipher
polyval_ce
snd_soc_core
polyval_generic
snd_pcm_dmaengine
ext4
ghash_ce
gf128mul
sha2_ce
leds_gpio
snd_pcm
sha256_arm64
sha1_ce
rockchip_thermal
crc16
mbcache
snd_timer
jbd2
snd
dw_wdt
soundcore
rk817_charger
rk805_pwrkey
cpufreq_dt
br_netfilter
bridge
stp
llc
ip_vs_sh
ip_vs_wrr
ip_vs_rr
ip_vs
nf_conntrack
nf_defrag_ipv6
nf_defrag_ipv4
drm
loop
fuse
efi_pstore
dm_mod
dax
configfs
ip_tables
x_tables
autofs4
xfs
libcrc32c
crc32c_generic
realtek
rk808_regulator
fan53555
dwmac_rk
stmmac_platform
stmmac
pcs_xpcs
spi_rockchip
phylink
dw_mmc_rockchip
dw_mmc_pltfm
of_mdio
dw_mmc
fixed
crct10dif_ce
crct10dif_common
fixed_phy
fwnode_mdio
pl330
i2c_rk3x
io_domain
libphy

** PCI devices:
not available

** USB devices:
not available


-- System Information:
Debian Release: 12.5
  APT prefers stable-updates
  APT policy: (500, 'stable-updates'), (500, 'stable-security'), (500, 'stable')
Architecture: arm64 (aarch64)

Kernel: Linux 6.1.0-21-arm64 (SMP w/4 CPU threads)
Locale: LANG=C, LC_CTYPE=C.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: unable to detect

Versions of packages linux-image-6.1.0-21-arm64 depends on:
ii  initramfs-tools [linux-initramfs-tool]  0.142
ii  kmod                                    30+20221128-1
ii  linux-base                              4.9

Versions of packages linux-image-6.1.0-21-arm64 recommends:
ii  apparmor             3.0.8-3
ii  firmware-linux-free  20200122-1

Versions of packages linux-image-6.1.0-21-arm64 suggests:
pn  debian-kernel-handbook  <none>
pn  linux-doc-6.1           <none>

Versions of packages linux-image-6.1.0-21-arm64 is related to:
pn  firmware-amd-graphics     <none>
pn  firmware-atheros          <none>
pn  firmware-bnx2             <none>
pn  firmware-bnx2x            <none>
pn  firmware-brcm80211        <none>
pn  firmware-cavium           <none>
pn  firmware-intel-sound      <none>
pn  firmware-intelwimax       <none>
pn  firmware-ipw2x00          <none>
pn  firmware-ivtv             <none>
pn  firmware-iwlwifi          <none>
pn  firmware-libertas         <none>
pn  firmware-linux-nonfree    <none>
pn  firmware-misc-nonfree     <none>
pn  firmware-myricom          <none>
pn  firmware-netxen           <none>
pn  firmware-qlogic           <none>
pn  firmware-realtek          <none>
pn  firmware-samsung          <none>
pn  firmware-siano            <none>
pn  firmware-ti-connectivity  <none>
pn  xen-hypervisor            <none>

-- no debconf information

--- End Message ---
--- Begin Message ---
Source: linux
Source-Version: 6.9.7-1
Done: Salvatore Bonaccorso <carnil@debian.org>

We believe that the bug you reported is fixed in the latest version of
linux, which is due to be installed in the Debian FTP archive.

A summary of the changes between this version and the previous one is
attached.

Thank you for reporting the bug, which will now be closed.  If you
have further comments please address them to 1071501@bugs.debian.org,
and the maintainer will reopen the bug report if appropriate.

Debian distribution maintenance software
pp.
Salvatore Bonaccorso <carnil@debian.org> (supplier of updated linux package)

(This message was generated automatically at their request; if you
believe that there is a problem with it please contact the archive
administrators by mailing ftpmaster@ftp-master.debian.org)


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

Format: 1.8
Date: Thu, 27 Jun 2024 20:37:47 +0200
Source: linux
Architecture: source
Version: 6.9.7-1
Distribution: unstable
Urgency: medium
Maintainer: Debian Kernel Team <debian-kernel@lists.debian.org>
Changed-By: Salvatore Bonaccorso <carnil@debian.org>
Closes: 1063161 1070083 1071378 1071501
Changes:
 linux (6.9.7-1) unstable; urgency=medium
 .
   * New upstream stable update:
     https://www.kernel.org/pub/linux/kernel/v6.x/ChangeLog-6.9.3
     https://www.kernel.org/pub/linux/kernel/v6.x/ChangeLog-6.9.4
     https://www.kernel.org/pub/linux/kernel/v6.x/ChangeLog-6.9.5
     https://www.kernel.org/pub/linux/kernel/v6.x/ChangeLog-6.9.6
     - [x86] cpu: Provide default cache line size if not enumerated
       (Closes: #1071378)
     - NFS: add barriers when testing for NFS_FSDATA_BLOCKED (Closes: #1071501)
     https://www.kernel.org/pub/linux/kernel/v6.x/ChangeLog-6.9.7
 .
   [ Salvatore Bonaccorso ]
   * [x86] Refresh "intel-iommu: Add option to exclude integrated GPU only"
   * [x86] Refresh "intel-iommu: Add Kconfig option to exclude iGPU by default"
   * [rt] Drop "drm/i915/gt: Queue and wait for the irq_work item."
   * [arm64] Disable RELR.
     Temporarily disable RELR relocation packing to workaround failing boots
     on arm64 with recent binutils/2.42.50.20240618-1, cf. #1074111.
   * lib/python/debian_linux: Fix two E201/E202 whitespace errors
   * Drop "sched: Do not enable autogrouping by default" patch (Closes: #1070083)
   * [rt] init: Disable SCHED_AUTOGROUP on RT configurations
 .
   [ Aurelien Jarno ]
   * [riscv64] crypto: enable CRYPTO_AES_RISCV64, CRYPTO_CHACHA_RISCV64,
     CRYPTO_GHASH_RISCV64, CRYPTO_SHA256_RISCV64, CRYPTO_SHA512_RISCV64 as
     modules.
   * [riscv64] Improve Microchip Polarfire support: enable
     POLARFIRE_SOC_AUTO_UPDATE and USB_MUSB_POLARFIRE_SOC as modules.
   * [riscv64] Improve T-Head TH1520 support: enable MMC_SDHCI_OF_DWCMSHC as
     module.
   * [riscv64] Improve VisionFive 2 support: enable SND_DESIGNWARE_I2S,
     SND_SIMPLE_CARD and SND_SOC_JH7110_PWMDAC as modules.
   * [riscv64] Improve JH7110 support: enable STARFIVE_STARLINK_PMU as module.
 .
   [ Vincent Blut ]
   * [amd64] drivers/tee: Enable TEE as module (Closes: #1063161)
Checksums-Sha1:
 126e43ccf0af2d3d65709bec756f2088f3178322 229590 linux_6.9.7-1.dsc
 0bdc0583b2146124a9d44a14984436566b8f647a 146865244 linux_6.9.7.orig.tar.xz
 6423eac96e0aae9a6126d03c322080e4f4257d3a 1542468 linux_6.9.7-1.debian.tar.xz
 5c7977f28d13c3fd704dea5861169eca0782cd10 6895 linux_6.9.7-1_source.buildinfo
Checksums-Sha256:
 0a5660556f021f051a122659ccf8f62a8c5baabc525d5c0e6cd2f15d39c270ea 229590 linux_6.9.7-1.dsc
 babecb9b532988ab92ef2c3a32f51d11bb389edefd11caac7a9b3f5f37b4db7a 146865244 linux_6.9.7.orig.tar.xz
 be9bfd7e4877e8f8158a54342a3c039bdc414d0e9b0f6dc9f1389689f81bceb5 1542468 linux_6.9.7-1.debian.tar.xz
 010b629cb51f9ba47444e1822e33d3351e3e7bf1e08f6ea8c3bf2ce3dd136d5a 6895 linux_6.9.7-1_source.buildinfo
Files:
 d1f058aba39e6af9065dedb7b9270f07 229590 kernel optional linux_6.9.7-1.dsc
 d172ab241c16d0531911322fdb7e6b9a 146865244 kernel optional linux_6.9.7.orig.tar.xz
 d245856158e63b4efa064af03d5af75f 1542468 kernel optional linux_6.9.7-1.debian.tar.xz
 9b30bde2ddc34ababe7ff39c0399c240 6895 kernel optional linux_6.9.7-1_source.buildinfo

-----BEGIN PGP SIGNATURE-----

iQKmBAEBCgCQFiEERkRAmAjBceBVMd3uBUy48xNDz0QFAmZ9s35fFIAAAAAALgAo
aXNzdWVyLWZwckBub3RhdGlvbnMub3BlbnBncC5maWZ0aGhvcnNlbWFuLm5ldDQ2
NDQ0MDk4MDhDMTcxRTA1NTMxRERFRTA1NENCOEYzMTM0M0NGNDQSHGNhcm5pbEBk
ZWJpYW4ub3JnAAoJEAVMuPMTQ89EN4IP/1pm2A6c+jNOhjJHGU8XbjSDvx6hgp8R
thN4sGOQSjtMAuQjsAwcfkyYcPSv8UtZiy/BTETFzYGePxcRoQBWN9NrpwFvC/qi
eVCbafJUgxhnn+uc4X1s5X6AxgIJCzaIzsoJWu65IrMWg9iw2DAyYWHlD3BpvDUx
C91AtFRGcYQPFv8M8eGQ2Nvkf7+kkiGpW8hVq+/CC33bfnCNbGG8U7TGz0O57hbU
tXLh2T+q+LoVlLhUE9lYN7P8Z9G1z/qqPR5VYlMn5pTLoPErk3MUPVlUumjYms4c
511XPeqSW5WjphU38WOZ6SCEe/3opUmB+oWpG2XVVFssclDvY1zjwWczyKMaTDLv
g9djILLC2x4FwX/1kMCBNx6lO4KdcqepDzMNwI7fBgzLSefbfcCD7Y3hs8K8tvAy
V2xRlacwEOITP0lulbVITVXnkUVVAq68eTRbAeH3RvhUsSBaHRC9VkmNTubEjRyv
o+BHwtFpFlrFA0vSrCq4NHCM2IUmBcGWUX8xXsiBF5/A7dsvGfo+CMJLDj9hB6UB
HoJvZGNyBZCHitwQJ8Igp3o7VOY6gm6njgOwgI1slv/UEZcKQN8kHDrZj7SJrFqn
2FwPNROnXHxVvlOEK5Nau+17ClVk/DVclZcDctzVI7BEaOhXTIk/Q3UH1aEMLAcu
IsaZY2DlRBU1
=FnKH
-----END PGP SIGNATURE-----



--- End Message ---
