Bug#1071562: nfsd blocks indefinitely in nfsd4_destroy_session
Package: nfs-kernel-server
Version: 1:2.6.2-4
Package: linux-image-6.1.0-21-amd64
Version: 6.1.90-1
While testing Proxmox VE with a Debian NFS server as shared storage, we noticed
that nfsd sometimes becomes unresponsive and the server has to be rebooted.
The same error is probably reported here:
https://bugs.launchpad.net/ubuntu/+source/nfs-utils/+bug/2062568
NFS server:
* DELL PowerEdge R730xd, 2x 10-core Xeon E5-2640, Samsung SM863 SSDs, 8 GB RAM
* fresh installation of Debian Bookworm
* Linux 6.1.0-21-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.90-1 (2024-05-03) x86_64 GNU/Linux
* connected using 10GE link
* nfsd.conf configured with nthreads=16 (also tested with 8 and 4), other options left on defaults
* XFS mount exported with options: rw,sync,no_root_squash,no_subtree_check,no_wdelay
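For completeness, that export corresponds to an /etc/exports entry roughly like the following (the path and client network are placeholders, not our actual values):

```
# /etc/exports -- sketch; path and network are hypothetical
/srv/nfs/pve  10.xx.xx.0/24(rw,sync,no_root_squash,no_subtree_check,no_wdelay)
```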
NFS client:
* DELL PowerEdge FC630, 2x 14C Xeon E5-2680 v4, 256 GB RAM
* fresh installation of Proxmox VE 8.2
* Proxmox Linux 6.8.4-3-pve kernel
* connected using 10GE link
* nfs client mount options: rw,noatime,nodiratime,vers=4.2,rsize=1048576,wsize=1048576,
namlen=255,hard,proto=tcp,nconnect=8,max_connect=16,timeo=600,retrans=2,sec=sys,
clientaddr=10.xx.xx.xx,local_lock=none,addr=10.xx.xx.xx
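The option list above is taken from /proc/mounts; a client-side fstab entry producing it would look roughly like this (server name and paths are placeholders; addr=, clientaddr= and local_lock= are reported by the kernel rather than specified manually):

```
# /etc/fstab -- sketch; host and paths are hypothetical
server:/srv/nfs/pve  /mnt/pve/nfs  nfs  rw,noatime,nodiratime,vers=4.2,rsize=1048576,wsize=1048576,hard,proto=tcp,nconnect=8,max_connect=16,timeo=600,retrans=2,sec=sys  0  0
```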
dmesg on the NFS server side (the trace below repeats indefinitely):
[ 3142.693181] INFO: task nfsd:1035 blocked for more than 120 seconds.
[ 3142.693217] Not tainted 6.1.0-21-amd64 #1 Debian 6.1.90-1
[ 3142.693239] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3142.693264] task:nfsd state:D stack:0 pid:1035 ppid:2 flags:0x00004000
[ 3142.693273] Call Trace:
[ 3142.693275] <TASK>
[ 3142.693279] __schedule+0x34d/0x9e0
[ 3142.693288] schedule+0x5a/0xd0
[ 3142.693294] schedule_timeout+0x118/0x150
[ 3142.693301] wait_for_completion+0x86/0x160
[ 3142.693307] __flush_workqueue+0x152/0x420
[ 3142.693317] nfsd4_destroy_session+0x1b6/0x250 [nfsd]
[ 3142.693379] nfsd4_proc_compound+0x355/0x660 [nfsd]
[ 3142.693433] nfsd_dispatch+0x1a1/0x2b0 [nfsd]
[ 3142.693478] svc_process_common+0x289/0x5e0 [sunrpc]
[ 3142.693551] ? svc_recv+0x4e5/0x890 [sunrpc]
[ 3142.693631] ? nfsd_svc+0x360/0x360 [nfsd]
[ 3142.693676] ? nfsd_shutdown_threads+0x90/0x90 [nfsd]
[ 3142.693720] svc_process+0xad/0x100 [sunrpc]
[ 3142.693790] nfsd+0xd5/0x190 [nfsd]
[ 3142.693836] kthread+0xda/0x100
[ 3142.693843] ? kthread_complete_and_exit+0x20/0x20
[ 3142.693849] ret_from_fork+0x22/0x30
[ 3142.693858] </TASK>
Dump of nfsd threads:
/proc/1032/stack:
[<0>] svc_recv+0x7f3/0x890 [sunrpc]
[<0>] nfsd+0xc3/0x190 [nfsd]
[<0>] kthread+0xda/0x100
[<0>] ret_from_fork+0x22/0x30
/proc/1033/stack:
[<0>] svc_recv+0x7f3/0x890 [sunrpc]
[<0>] nfsd+0xc3/0x190 [nfsd]
[<0>] kthread+0xda/0x100
[<0>] ret_from_fork+0x22/0x30
/proc/1034/stack:
[<0>] svc_recv+0x7f3/0x890 [sunrpc]
[<0>] nfsd+0xc3/0x190 [nfsd]
[<0>] kthread+0xda/0x100
[<0>] ret_from_fork+0x22/0x30
/proc/1035/stack:
[<0>] __flush_workqueue+0x152/0x420
[<0>] nfsd4_destroy_session+0x1b6/0x250 [nfsd]
[<0>] nfsd4_proc_compound+0x355/0x660 [nfsd]
[<0>] nfsd_dispatch+0x1a1/0x2b0 [nfsd]
[<0>] svc_process_common+0x289/0x5e0 [sunrpc]
[<0>] svc_process+0xad/0x100 [sunrpc]
[<0>] nfsd+0xd5/0x190 [nfsd]
[<0>] kthread+0xda/0x100
[<0>] ret_from_fork+0x22/0x30
/proc/130/stack:
[<0>] rpc_shutdown_client+0xf2/0x150 [sunrpc]
[<0>] nfsd4_process_cb_update+0x4c/0x270 [nfsd]
[<0>] nfsd4_run_cb_work+0x9f/0x150 [nfsd]
[<0>] process_one_work+0x1c7/0x380
[<0>] worker_thread+0x4d/0x380
[<0>] kthread+0xda/0x100
[<0>] ret_from_fork+0x22/0x30
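For reference, the per-thread stacks above can be collected with a loop like the following (a sketch; run as root on the server, since reading /proc/<pid>/stack requires it):

```shell
#!/bin/sh
# Print the kernel stack of every kernel thread named "nfsd".
for pid in $(pgrep -x nfsd); do
    echo "/proc/$pid/stack:"
    cat "/proc/$pid/stack"
done
```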
On the NFS client side, there are repeated backchannel reply errors (error -110 is ETIMEDOUT):
[78636.676789] RPC: Could not send backchannel reply error: -110
[78647.905675] RPC: Could not send backchannel reply error: -110
[78675.207201] RPC: Could not send backchannel reply error: -110
[78744.201603] RPC: Could not send backchannel reply error: -110
[78784.138769] RPC: Could not send backchannel reply error: -110
We can reproduce this bug quite often (several times a day) when
restoring a 500 GB virtual machine image from Proxmox Backup Server to
the NFS shared storage. On the other hand, we cannot trigger it in other
ways, e.g. with random and/or sequential fio stress tests. According to
iostat, the VM restore job writes to the NFS server in 300-400 MiB bursts
separated by 3-4 seconds of inactivity.
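A fio job approximating this bursty pattern (an untested sketch; the thinktime numbers are my rough translation of the iostat observation, and the directory is a placeholder) might look like:

```
; burst.fio -- hypothetical sketch approximating the VM-restore pattern:
; ~350 MiB of sequential 1 MiB writes, then ~3.5 s idle, repeated.
[burst-write]
; placeholder: the NFS mount point on the client
directory=/mnt/pve/nfs
rw=write
bs=1M
size=10G
; sleep 3.5 s (value in usec) after every 350 blocks (~350 MiB burst)
thinktime=3500000
thinktime_blocks=350
```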
Interestingly, this issue apparently occurs only with a recent kernel on
the NFS client side. We can hit this bug only with the Proxmox Linux
6.8.4-3-pve kernel on the client. With the Proxmox 6.5.13-5-pve kernel
there are no client-side backchannel reply errors and the nfsd server
runs without any hangs. It seems that changes in the NFS client code
between 6.5.x and 6.8.x uncovered a latent race in the nfsd server code.
Based on Ubuntu bug report #2062568, I assume this is not a
Proxmox-specific issue; the Proxmox VM restore workload together with
our testing hardware setup just makes it easier to hit.
Regards,
Martin