Your message dated Sat, 19 Mar 2022 17:37:13 +0100 with message-id <c64b7a8e7c1fa86ae1446835c9d0a715d6b2ba72.camel@decadent.org.uk> and subject line Re: nfs-kernel-server: nfs stop working after some time - rpc.mountd and rpciod in D state has caused the Debian Bug report #620299, regarding nfs-kernel-server: nfs stop working after some time - rpc.mountd and rpciod in D state to be marked as done. This means that you claim that the problem has been dealt with. If this is not the case it is now your responsibility to reopen the Bug report if necessary, and/or fix the problem forthwith. (NB: If you are a system administrator and have no idea what this message is talking about, this may indicate a serious mail system misconfiguration somewhere. Please contact owner@bugs.debian.org immediately.) -- 620299: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=620299 Debian Bug Tracking System Contact owner@bugs.debian.org with problems
--- Begin Message ---
- To: Debian Bug Tracking System <submit@bugs.debian.org>
- Subject: nfs-kernel-server: nfs stop working after some time - rpc.mountd and rpciod in D state
- From: Yann Autissier <yann@autissier.net>
- Date: Thu, 31 Mar 2011 22:51:20 +0200
- Message-id: <20110331205120.10641.15416.reportbug@dev.lan.internethic.net>
Package: nfs-kernel-server Version: 1:1.2.2-4 Severity: important Tags: squeeze Hi, I've got a strange behavior with my nfs server since I use nfsv4. I'm using nfsv3 and nfsv4 with libpam-ldap, libnss-ldap, and unscd. The server, after some times (it can be few days or some weeks), stop responding, and the clients cannot mount shares anymore. Once the server has hung, the first rpciod process is in D state, as the rpc.mountd process. I can kill -9 the rpc.mountd process and restart it, but it get back in D state as soon as a mount request occurs. I'm unable to get nfs working again, unless rebooting the server. Here is my configuration. 'nas' and 'dev' are two nfs servers, mounted on each other. The problems occurs on the server 'dev'. root@dev:/# cat /etc/fstab # <file system> <mount point> <type> <options> <dump> <pass> proc /proc proc defaults 0 0 cgroup /cgroup cgroup defaults 0 0 /dev/sda3 /boot ext2 defaults 0 2 /dev/mapper/dev-swap none swap sw 0 0 /dev/mapper/dev-root / ext4 errors=remount-ro 0 1 /dev/mapper/dev-home /home ext4 defaults 0 2 /dev/mapper/dev-tmp /tmp ext4 defaults 0 2 /dev/mapper/dev-usr /usr ext4 defaults 0 2 /dev/mapper/dev-var /var ext4 defaults 0 2 /dev/mapper/dev-log /var/log ext4 defaults 0 2 /dev/mapper/dev-dns /var/dns-dev ext4 defaults 0 2 /dev/mapper/dev-dns /dns-dev ext4 defaults 0 2 nas:/dns-nas /dns-nas nfs4 _netdev,rw,soft,tcp 0 0 nas:/sav /var/sav nfs4 _netdev,rw,soft 0 0 root@dev:/# cat /etc/exports # nfsv3 /dns-dev 192.168.151.0/24(rw,sync,subtree_check,no_root_squash) # nfsv4 /var/ 192.168.151.0/24(fsid=0,rw,sync,crossmnt,no_subtree_check) /var/dns-dev 192.168.151.0/24(rw,sync,crossmnt,no_subtree_check,no_root_squash) root@dev:/# showmount -a rpc mount dump: RPC: Timed out root@dev:/# showmount -e rpc mount dump: RPC: Timed out root@dev:/# rpcinfo -p program no_version protocole no_port 100000 2 tcp 111 portmapper 100024 1 udp 47486 status 100024 1 tcp 33830 status 100000 2 udp 111 portmapper 100021 1 udp 49373 nlockmgr 100021 3 udp 49373 nlockmgr 100021 4 udp 49373 nlockmgr 100021 1 tcp 36200 nlockmgr 100021 3 tcp 36200 nlockmgr 100021 4 tcp 36200 nlockmgr 100003 2 tcp 2049 nfs 100003 3 tcp 2049 nfs 100003 4 tcp 2049 nfs 100003 2 udp 2049 nfs 100003 3 udp 2049 nfs 100003 4 udp 2049 nfs 100005 1 udp 41885 mountd 100005 1 tcp 54023 mountd 100005 2 udp 41885 mountd 100005 2 tcp 54023 mountd 100005 3 udp 41885 mountd 100005 3 tcp 54023 mountd root@dev:/# cat /proc/fs/nfsd/exports # Version 1.1 # Path Client(Flags) # IPs root@nas:/# cat /etc/fstab # <file system> <mount point> <type> <options> <dump> <pass> proc /proc proc defaults 0 0 /dev/hda2 none swap sw 0 0 /dev/hda1 / ext3 errors=remount-ro 0 1 /dev/mapper/nas-tmp /tmp ext3 defaults 0 2 /dev/mapper/nas-usr /usr ext3 defaults 0 2 /dev/mapper/nas-var /var ext3 defaults 0 2 /dev/mapper/nas-cache /var/cache ext3 defaults 0 2 /dev/mapper/nas-log /var/log ext3 defaults 0 2 /dev/mapper/nas-dns /dns ext4 defaults 0 2 /dev/mapper/nas-dns /dns-nas ext4 defaults 0 2 /dev/mapper/nas-dns /var/dns-nas ext4 defaults 0 2 /dev/mapper/nas-sav /var/sav ext4 defaults 0 2 dev:/dns-dev /dns-dev nfs4 _netdev,rw,soft 0 0 root@nas:/# cat /etc/exports # nfsv3 /dns 192.168.151.0/24(rw,sync,subtree_check,no_root_squash) /dns-nas 192.168.151.0/24(rw,sync,subtree_check,no_root_squash) /dns/net/internethic/ 192.168.151.0/24(rw,sync,subtree_check,no_root_squash) # /var/sav/ 192.168.151.0/24(rw,sync,subtree_check,no_root_squash) # nfsv4 /var/ 192.168.151.0/24(fsid=0,rw,sync,crossmnt,no_subtree_check) /var/dns-nas 192.168.151.0/24(rw,sync,crossmnt,no_subtree_check) /var/sav 192.168.151.0/24(rw,sync,crossmnt,no_subtree_check) root@nas:/# rpcinfo -p program no_version protocole no_port 100000 2 tcp 111 portmapper 391002 2 tcp 946 sgi_fam 100000 2 udp 111 portmapper 100021 1 udp 32768 nlockmgr 100021 3 udp 32768 nlockmgr 100021 4 udp 32768 nlockmgr 100021 1 tcp 32768 nlockmgr 100021 3 tcp 32768 nlockmgr 100021 4 tcp 32768 nlockmgr 100003 2 tcp 2049 nfs 100003 3 tcp 2049 nfs 100003 4 tcp 2049 nfs 100003 2 udp 2049 nfs 100003 3 udp 2049 nfs 100003 4 udp 2049 nfs 100005 1 udp 48417 mountd 100005 1 tcp 54133 mountd 100005 2 udp 48417 mountd 100005 2 tcp 54133 mountd 100005 3 udp 48417 mountd 100005 3 tcp 54133 mountd 100024 1 udp 38083 status 100024 1 tcp 58070 status nas:/home/aya# cat /proc/fs/nfs/exports # Version 1.1 # Path Client(Flags) # IPs /var 192.168.151.0/24(rw,root_squash,sync,wdelay,crossmnt,no_subtree_check,fsid=0,uuid=2f751ba1:c12849cd:8273c407:429f825e) /dns 192.168.151.0/24(rw,no_root_squash,sync,wdelay,uuid=fe5c0bb9:cafd4765:887a7f94:0a6045bc) Before rpc.mountd hangs, I get the following : root@nas:/# showmount -e dev Export list for dev: /var/dns-dev 192.168.151.0/24 /var 192.168.151.0/24 /dns-dev 192.168.151.0/24 After rpc.mount hangs, I get : root@nas:/# showmount -e dev rpc mount export: RPC: Timed out In the same time, rpc.mountd says, before hanging : root@dev:/home/aya# rpc.mountd -F -d all mountd: from_local: updating local if addr list mountd: from_local: checked 17 local if addrs; incoming address not found mountd: check_default: access by 192.168.151.10 ALLOWED mountd: export request from 192.168.151.10. and while hanging (because of a mount request) : mountd: auth_unix_ip: inbuf 'nfsd 192.168.151.7' mountd: auth_unix_ip: client 0xe28e90 '192.168.151.0/24' mountd: nfsd_fh: inbuf '192.168.151.0/24 1 \x00000000' I get a strace of the rcp.mountd before hanging : root@dev:/# strace -e trace=read,write -s4096 -p`pidof rpc.mountd` Process 32261 attached - interrupt to quit read(0, "\200\0\0DL\330\341\273\0\0\0\0\0\0\0\2\0\1\206\245\0\0\0\3\0\0\0\5\0\0\0\1\0\0\0\34M\224` \0\0\0\3nas\0\0\0\0\0\0\0\0\0\0\0\0\1\0\0\0\0\0\0\0\0\0\0\ 0\0", 4000) = 72 write(2, "mountd: ", 8) = 8 write(2, "from_local: updating local if addr list", 39) = 39 write(2, "\n", 1) = 1 write(2, "mountd: ", 8) = 8 write(2, "from_local: checked 17 local if addrs; incoming address not found", 65) = 65 write(2, "\n", 1) = 1 read(9, "# /etc/hosts.allow: list of hosts that are allowed to access the system.\n# See the manual pages hosts_access(5) and hosts_options(5).\n#\n# Example: ALL: LOCAL @some_netgroup\n# ALL: .foobar.edu EXCEPT terminalserver.foobar.edu\n#\n# If you're going to protect the portmapper use the name \"portmap\" for the\n# daemon name. Remember that you can only use the keyword \"ALL\" and IP\n# addresses (NOT host or domain names) for the portmapper, as well as for\n# rpc.mountd (the NFS mount daemon). See portmap(8) and rpc.mountd(8)\n# for further information.\n#\n\n", 4096) = 580 read(9, "", 4096) = 0 read(9, "# /etc/hosts.deny: list of hosts that are _not_ allowed to access the system.\n# See the manual pages hosts_access(5) and hosts_options(5).\n#\n# Example: ALL: some.host.name, .some.domain\n# ALL EXCEPT in.fingerd: other.host.name, .other.domain\n#\n# If you're going to protect the portmapper use the name \"portmap\" for the\n# daemon name. Remember that you can only use the keyword \"ALL\" and IP\n# addresses (NOT host or domain names) for the portmapper, as well as for\n# rpc.mountd (the NFS mount daemon). See portmap(8) and rpc.mountd(8)\n# for further information.\n#\n# The PARANOID wildcard matches any host whose name does not match its\n# address.\n#\n# You may wish to enable this to ensure any programs that don't\n# validate looked up hostnames still leave understandable logs. In past\n# versions of Debian this has been the default.\n# ALL: PARANOID\n\n", 4096) = 880 read(9, "", 4096) = 0 write(2, "mountd: ", 8) = 8 write(2, "check_default: access by 192.168.151.10 ALLOWED", 47) = 47 write(2, "\n", 1) = 1 write(2, "mountd: ", 8) = 8 write(2, "export request from 192.168.151.10.", 35) = 35 write(2, "\n", 1) = 1 write(0, "\200\0\0\240L\330\341\273\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\1\0\0\0\f/var/dns-dev\0\0\0\1\0\0\0\020192.168.151.0/24\0\0\0\0\0\0\0\1\0\0\0\4/var\0\0\0\1\0\0\0\020192.168.151.0/24\0\0\0\0\0\0\0\1\0\0\0\10/dns-dev\0\0\0\1\0\0\0\020192.168.151.0/24\0\0\0\0\0\0\0\0", 164) = 164 read(0, "", 4000) = 0 and while hanging : read(3, "nfsd 192.168.151.7\n", 2048) = 19 write(2, "mountd: ", 8) = 8 write(2, "auth_unix_ip: inbuf 'nfsd 192.168.151.7'", 40) = 40 write(2, "\n", 1) = 1 write(3, "nfsd 192.168.151.7 1301571202 192.168.151.0/24 \n", 48) = 48 write(2, "mountd: ", 8) = 8 write(2, "auth_unix_ip: client 0xe28e90 '192.168.151.0/24'", 48) = 48 write(2, "\n", 1) = 1 read(5, "192.168.151.0/24 1 \\x00000000\n", 2048) = 30 write(2, "mountd: ", 8) = 8 write(2, "nfsd_fh: inbuf '192.168.151.0/24 1 \\x00000000'", 46) = 46 write(2, "\n", 1) = 1 read(0, "/dev/mapper/dev-root / ext4 rw,errors=remount-ro 0 0\ntmpfs /lib/init/rw tmpfs rw,nosuid,mode=0755 0 0\nproc /proc proc rw,noexec,nosuid,nodev 0 0\nsysfs /sys sysfs rw,noexec,nosuid,nodev 0 0\nudev /dev tmpfs rw,mode=0755 0 0\ntmpfs /dev/shm tmpfs rw,nosuid,nodev 0 0\ndevpts /dev/pts devpts rw,noexec,nosuid,gid=5,mode=620 0 0\ncgroup /cgroup cgroup rw 0 0\n/dev/sda3 /boot ext2 rw 0 0\n/dev/mapper/dev-home /home ext4 rw 0 0\n/dev/mapper/dev-tmp /tmp ext4 rw 0 0\n/dev/mapper/dev-usr /usr ext4 rw 0 0\n/dev/mapper/dev-var /var ext4 rw 0 0\n/dev/mapper/dev-log /var/log ext4 rw 0 0\n/dev/mapper/dev-dns /var/dns-dev ext4 rw 0 0\n/dev/mapper/dev-dns /dns-dev ext4 rw 0 0\nnas:/sav /var/sav nfs4 rw,soft,addr=192.168.151.10,clientaddr=192.168.151.9 0 0\nnfsd /proc/fs/nfsd nfsd rw 0 0\nrpc_pipefs /var/lib/nfs/rpc_pipefs rpc_pipefs rw 0 0\n", 4096) = 820 After the hang, I can see rpciod and rpc.mountd in D state : root@dev:/# ps aux |awk '$8 ~ /D/' USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND root 1672 0.0 0.0 0 0 ? D Mar09 0:09 [rpciod/0] root 32261 0.0 0.0 14628 1176 pts/17 D+ 13:02 0:00 rpc.mountd -F -d all Here is the debug in my syslog, during the mount request from host 192.168.151.7 : root@dev:/# echo 32767 > /proc/sys/sunrpc/rpc_debug root@dev:/# echo 32767 > /proc/sys/sunrpc/nfs_debug root@dev:/# echo 32767 > /proc/sys/sunrpc/nfsd_debug root@dev:/# tail -f /var/log/syslog Mar 31 13:34:20 dev kernel: [1941071.365643] RPC: killing all tasks for client ffff8803efe08a00 Mar 31 13:34:21 dev kernel: [1941072.363938] RPC: killing all tasks for client ffff8803efe08a00 Mar 31 13:34:22 dev kernel: [1941073.362216] RPC: killing all tasks for client ffff8803efe08a00 Mar 31 13:34:22 dev kernel: [1941073.781870] svc: socket ffff88000dc46e80 TCP data ready (svsk ffff88042e724000) Mar 31 13:34:22 dev kernel: [1941073.781877] svc: transport ffff88042e724000 served by daemon ffff880418a00000 Mar 31 13:34:22 dev kernel: [1941073.781896] svc: socket ffff88000dc46e80 TCP (connected) state change 8 (svsk ffff88042e724000) Mar 31 13:34:22 dev kernel: [1941073.781900] svc: transport ffff88042e724000 busy, not enqueued Mar 31 13:34:22 dev kernel: [1941073.781902] svc: socket ffff88000dc46e80 TCP data ready (svsk ffff88042e724000) Mar 31 13:34:22 dev kernel: [1941073.781905] svc: transport ffff88042e724000 busy, not enqueued Mar 31 13:34:22 dev kernel: [1941073.781920] svc_recv: found XPT_CLOSE Mar 31 13:34:22 dev kernel: [1941073.781923] svc: svc_delete_xprt(ffff88042e724000) Mar 31 13:34:22 dev kernel: [1941073.781925] svc: svc_tcp_sock_detach(ffff88042e724000) Mar 31 13:34:22 dev kernel: [1941073.781927] svc: svc_sock_detach(ffff88042e724000) Mar 31 13:34:22 dev kernel: [1941073.781937] svc: transport ffff88042e724000 is dead, not enqueued Mar 31 13:34:22 dev kernel: [1941073.781940] svc: server ffff880418a00000 waiting for data (to = 900000) Mar 31 13:34:22 dev kernel: [1941073.782184] svc: socket ffff88042e00d480 TCP (listen) state change 10 Mar 31 13:34:22 dev kernel: [1941073.782188] svc: transport ffff8803a2ece000 served by daemon ffff880418a00000 Mar 31 13:34:22 dev kernel: [1941073.782202] svc: socket ffff88000dc41a00 TCP (listen) state change 1 Mar 31 13:34:22 dev kernel: [1941073.782220] svc: socket ffff88000dc41a00 TCP (listen) state change 1 Mar 31 13:34:22 dev kernel: [1941073.782226] svc: socket ffff88000dc41a00 TCP (listen) state change 1 Mar 31 13:34:22 dev kernel: [1941073.782242] svc: tcp_accept ffff8803a2ece000 sock ffff88042ee298c0 Mar 31 13:34:22 dev kernel: [1941073.782248] nfsd: connect from 192.168.151.7, port=811 Mar 31 13:34:22 dev kernel: [1941073.782250] svc: svc_setup_socket ffff88042bd24d00 Mar 31 13:34:22 dev kernel: [1941073.782253] setting up TCP socket for reading Mar 31 13:34:22 dev kernel: [1941073.782256] svc: socket ffff88042e723000(inet ffff88000dc41a00), write_space busy=1 Mar 31 13:34:22 dev kernel: [1941073.782259] svc: svc_setup_socket created ffff88042e723000 (inet ffff88000dc41a00) Mar 31 13:34:22 dev kernel: [1941073.782262] svc: transport ffff88042e723000 served by daemon ffff88040fbc6000 Mar 31 13:34:22 dev kernel: [1941073.782267] svc: transport ffff8803a2ece000 served by daemon ffff88040fbc4000 Mar 31 13:34:22 dev kernel: [1941073.782273] svc: transport ffff8803a2ece000 busy, not enqueued Mar 31 13:34:22 dev kernel: [1941073.782276] svc: server ffff880418a00000 waiting for data (to = 900000) Mar 31 13:34:22 dev kernel: [1941073.782281] svc: server ffff88040fbc6000, pool 0, transport ffff88042e723000, inuse=2 Mar 31 13:34:22 dev kernel: [1941073.782286] svc: tcp_recv ffff88042e723000 data 1 conn 0 close 0 Mar 31 13:34:22 dev kernel: [1941073.782289] svc: socket ffff88042e723000(inet ffff88000dc41a00), write_space busy=1 Mar 31 13:34:22 dev kernel: [1941073.782293] svc: transport ffff88042e723000 busy, not enqueued Mar 31 13:34:22 dev kernel: [1941073.782300] svc: socket ffff88042e723000 recvfrom(ffff88042e72328c, 0) = 4 Mar 31 13:34:22 dev kernel: [1941073.782303] svc: TCP record, 100 bytes Mar 31 13:34:22 dev kernel: [1941073.782312] svc: socket ffff88000dc41a00 TCP data ready (svsk ffff88042e723000) Mar 31 13:34:22 dev kernel: [1941073.782316] svc: transport ffff88042e723000 busy, not enqueued Mar 31 13:34:22 dev kernel: [1941073.782322] svc: socket ffff88042e723000 recvfrom(ffff880351f9c008, 4088) = 8 Mar 31 13:34:22 dev kernel: [1941073.782325] svc: tcp_accept ffff8803a2ece000 sock ffff88042ee298c0 Mar 31 13:34:22 dev kernel: [1941073.782332] svc: socket ffff88042e723000 recvfrom(ffff880351f9c064, 3996) = 92 Mar 31 13:34:22 dev kernel: [1941073.782336] svc: server ffff88040fbc4000 waiting for data (to = 900000) Mar 31 13:34:22 dev kernel: [1941073.782339] svc: TCP complete record (100 bytes) Mar 31 13:34:22 dev kernel: [1941073.782343] svc: transport ffff88042e723000 served by daemon ffff88040fbc4000 Mar 31 13:34:22 dev kernel: [1941073.782353] svc: got len=100 Mar 31 13:34:22 dev kernel: [1941073.782356] svc: svc_authenticate (1) Mar 31 13:34:22 dev kernel: [1941073.782360] svc: server ffff88040fbc4000, pool 0, transport ffff88042e723000, inuse=3 Mar 31 13:34:22 dev kernel: [1941073.782366] RPC: Want update, refage=1800, age=1860 Mar 31 13:34:22 dev kernel: [1941073.782370] svc: tcp_recv ffff88042e723000 data 1 conn 0 close 0 Mar 31 13:34:22 dev kernel: [1941073.782375] svc: socket ffff88042e723000 recvfrom(ffff88042e72328c, 0) = 4 Mar 31 13:34:22 dev kernel: [1941073.782378] svc: TCP record, 100 bytes Mar 31 13:34:22 dev kernel: [1941073.782383] svc: socket ffff88042e723000 recvfrom(ffff88041ea23008, 4088) = 8 Mar 31 13:34:22 dev kernel: [1941073.782389] svc: socket ffff88042e723000 recvfrom(ffff88041ea23064, 3996) = 92 Mar 31 13:34:22 dev kernel: [1941073.782392] svc: svc_process dropit Mar 31 13:34:22 dev kernel: [1941073.782394] svc: xprt ffff88042e723000 dropped request Mar 31 13:34:22 dev kernel: [1941073.782397] svc: TCP complete record (100 bytes) Mar 31 13:34:22 dev kernel: [1941073.782399] svc: transport ffff88042e723000 busy, not enqueued Mar 31 13:34:22 dev kernel: [1941073.782403] svc: server ffff88040fbc6000 waiting for data (to = 900000) Mar 31 13:34:22 dev kernel: [1941073.782407] svc: transnee Mar 31 13:34:22 dev kernel: 4[917.869 v:scsc_reff802720)<>1403723]sc evrff801a00,po ,tasotff802730,iue5<>1403723]s:tprc ff84e200dt on0coe0<>1403724]sc oktff802730 evrmff802738,0 Mar 31 13:34:22 dev kernel: 4[917.865 v:TPrcr,10bts<>1403724]sc oktff802730 evrmff80ba50,48)=8<>1403725]rvstcnee Mar 31 13:34:22 dev kernel: 4[917.864 eii acld<>1407.825]rvstcneld<>1407.25]rvstcnee Mar 31 13:34:22 dev kernel: 4[917.869 eii anee Mar 31 13:34:22 dev kernel: 4[917.861 eii cnee Mar 31 13:34:22 dev kernel: [1941073.782663] svc: svc_sock_free(ffff88042e727c00) Mar 31 13:34:22 dev kernel: [1941073.782675] svc: socket ffff88042e723000 recvfrom(ffff8802b0ae5064, 3996) = 92 Mar 31 13:34:22 dev kernel: [1941073.782678] revisit canceled Mar 31 13:34:22 dev kernel: [1941073.782680] revisit canceled Mar 31 13:34:22 dev kernel: [1941073.782681] revisit canceled Mar 31 13:34:22 dev kernel: [1941073.782683] revisit canceled Mar 31 13:34:22 dev kernel: [1941073.782685] revisit canceled Mar 31 13:34:22 dev kernel: [1941073.782687] svc: TCP complete record (100 bytes) Mar 31 13:34:22 dev kernel: [1941073.782690] svc: transport ffff88042e723000 served by daemon ffff88040fbc4000 Mar 31 13:34:22 dev kernel: [1941073.782693] revisit canceled Mar 31 13:34:22 dev kernel: [1941073.782694] revisit canceled Mar 31 13:34:22 dev kernel: [1941073.782697] svc: svc_sock_free(ffff88042e720800) Mar 31 13:34:22 dev kernel: [1941073.782702] svc: got len=100 Mar 31 13:34:22 dev kernel: [1941073.782705] svc: svc_authenticate (1) Mar 31 13:34:22 dev kernel: [1941073.782707] revisit canceled Mar 31 13:34:22 dev kernel: [1941073.782709] revisit canceled Mar 31 13:34:22 dev kernel: [1941073.782711] revisit canceled Mar 31 13:34:22 dev kernel: [1941073.782713] revisit canceled Mar 31 13:34:22 dev kernel: [1941073.782715] svc: transport ffff88042e723000 busy, not enqueued Mar 31 13:34:22 dev kernel: [1941073.782717] svc: calling dispatcher Mar 31 13:34:22 dev kernel: [1941073.782720] nfsd_dispatch: vers 4 proc 1 Mar 31 13:34:22 dev kernel: [1941073.782722] revisit canceled Mar 31 13:34:22 dev kernel: [1941073.782725] nfsv4 compound op #1/1: 30 (OP_RENEW) Mar 31 13:34:22 dev kernel: [1941073.782727] process_renew(4d9458d1/0000073f): starting Mar 31 13:34:22 dev kernel: [1941073.782730] nfsd4_renew: clientid not found! Mar 31 13:34:22 dev kernel: [1941073.782733] nfsv4 compound op ffff88025b2e0078 opcnt 1 #1: 30: status 10011 Mar 31 13:34:22 dev kernel: [1941073.782735] revisit canceled Mar 31 13:34:22 dev kernel: [1941073.782737] revisit canceled Mar 31 13:34:22 dev kernel: [1941073.782739] revisit canceled Mar 31 13:34:22 dev kernel: [1941073.782741] nfsv4 compound returned 10011 Mar 31 13:34:22 dev kernel: [1941073.782744] svc: svc_sock_free(ffff88042e724000) Mar 31 13:34:22 dev kernel: [1941073.782755] svc: socket ffff88042e723000 sendto([ffff8803aa615000 48... ], 48) = 48 (addr 192.168.151.7, port=811) Mar 31 13:34:22 dev kernel: [1941073.782758] revisit queued Mar 31 13:34:22 dev kernel: [1941073.782760] svc: transport ffff88042e723000 busy, not enqueued Mar 31 13:34:22 dev kernel: [1941073.782762] revisit queued Mar 31 13:34:22 dev kernel: [1941073.782765] svc: transport ffff88042e723000 busy, not enqueued Mar 31 13:34:22 dev kernel: [1941073.782767] revisit queued Mar 31 13:34:22 dev kernel: [1941073.782769] svc: transport ffff88042e723000 busy, not enqueued Mar 31 13:34:22 dev kernel: [1941073.782773] svc: transport ffff88042e723000 busy, not enqueued Mar 31 13:34:22 dev kernel: [1941073.782777] svc: server ffff880418a00000 waiting for data (to = 900000) Mar 31 13:34:22 dev kernel: [1941073.782786] svc: server ffff88040fbc4000, pool 0, transport ffff88042e723000, inuse=2 Mar 31 13:34:22 dev kernel: [1941073.782790] svc: transport ffff88042e723000 served by daemon ffff880418a00000 Mar 31 13:34:22 dev kernel: [1941073.782796] svc: got len=100 Mar 31 13:34:22 dev kernel: [1941073.782817] svc: svc_authenticate (1) Mar 31 13:34:22 dev kernel: [1941073.782821] svc: transport ffff88042e723000 busy, not enqueued Mar 31 13:34:22 dev kernel: [1941073.782823] svc: calling dispatcher Mar 31 13:34:22 dev kernel: [1941073.782825] nfsd_dispatch: vers 4 proc 1 Mar 31 13:34:22 dev kernel: [1941073.782828] nfsv4 compound op #1/1: 30 (OP_RENEW) Mar 31 13:34:22 dev kernel: [1941073.782831] process_renew(4d9458d1/0000073f): starting Mar 31 13:34:22 dev kernel: [1941073.782832] nfsd4_renew: clientid not found! Mar 31 13:34:22 dev kernel: [1941073.782835] nfsv4 compound op ffff8802afe5f078 opcnt 1 #1: 30: status 10011 Mar 31 13:34:22 dev kernel: [1941073.782838] nfsv4 compound returned 10011 Mar 31 13:34:22 dev kernel: [1941073.782857] svc: socket ffff88042e723000 sendto([ffff88041ea23000 48... ], 48) = 48 (addr 192.168.151.7, port=811) Mar 31 13:34:22 dev kernel: [1941073.782860] svc: transport ffff88042e723000 busy, not enqueued Mar 31 13:34:22 dev kernel: [1941073.782863] svc: server ffff88040fbc4000 waiting for data (to = 900000) Mar 31 13:34:22 dev kernel: [1941073.782871] svc: server ffff880418a00000, pool 0, transport ffff88042e723000, inuse=2 Mar 31 13:34:22 dev kernel: [1941073.782874] svc: transport ffff88042e723000 served by daemon ffff88040fbc4000 Mar 31 13:34:22 dev kernel: [1941073.782878] svc: got len=100 Mar 31 13:34:22 dev kernel: [1941073.782880] svc: svc_authenticate (1) Mar 31 13:34:22 dev kernel: [1941073.782885] svc: server ffff88040fbc4000, pool 0, transport ffff88042e723000, inuse=3 Mar 31 13:34:22 dev kernel: [1941073.782889] svc: transport ffff88042e723000 served by daemon ffff88040fbc6000 Mar 31 13:34:22 dev kernel: [1941073.782895] svc: got len=100 Mar 31 13:34:22 dev kernel: [1941073.782897] svc: transport ffff88042e723000 busy, not enqueued Mar 31 13:34:22 dev kernel: [1941073.782900] svc: calling dispatcher Mar 31 13:34:22 dev kernel: [1941073.782902] nfsd_dispatch: vers 4 proc 1 Mar 31 13:34:22 dev kernel: <4n eund101<>1403737]sc o e=0 Mar 31 13:34:22 dev kernel: 4[917.803 v:scatetct 1 Mar 31 13:34:22 dev kernel: 4[917.808 v:tasotff802730 uy o nuud<>1403738]sc aln ipthr<>1403738]ns_ipth es4po Mar 31 13:34:22 dev kernel: 4[917.807 v:sce ff84e200snt(ff80ffd0 8. ,4)=4 ad 9.6.5.,pr=1)<>1403739]ns4cmon (o=900)<>1403730]ns4rnw letdntfud Mar 31 13:34:22 dev kernel: 4[917.818 fv opudo ff82f508oct1#:3:sau 01 Mar 31 13:34:22 dev kernel: 4[917.812 v:sre ff848000 ol0 rnpr ff84e200 ns= Mar 31 13:34:22 dev kernel: 4[917.816 v:tprc ff84e200dt on0coe0<>1403732]ns4cmon eund101<>1403732]sc oktff802730 evrmff802738,0 Mar 31 13:34:22 dev kernel: 4[917.816 v:TPrcr,10bts<>1403733]sc oktff802730 evrmff807280,48)=8<>1403734]sc oktff802730 edo[ff8489004..] 8 8(dr1218117 ot81 Mar 31 13:34:22 dev kernel: 4[917.815 v:tasotff802730 uy o nuud<>1403735]sc oktff802730 evrmff807286,39)=9 Mar 31 13:34:22 dev kernel: 4[917.813 v:TPcmlt eod(0 ye)<>1403735]sc rnpr ff84e200sre ydeo ff84fc00<>1403735]sc evrff800b40 atn o aa(o=900)<>1403736]sc o e=0 Mar 31 13:34:22 dev kernel: 4[917.818 v:scatetct 1 Mar 31 13:34:22 dev kernel: 4[917.812 v:tasotff802730 uy o nuud<>1403737]sc aln dispatcher Mar 31 13:34:22 dev kernel: [1941073.783177] nfsd_dispatch: vers 4 proc 1 Mar 31 13:34:22 dev kernel: [1941073.783181] nfsv4 compound op #1/1: 30 (OP_RENEW) Mar 31 13:34:22 dev kernel: [1941073.783183] process_renew(4d9458d1/0000073f): starting Mar 31 13:34:22 dev kernel: [1941073.783195] nfsd4_renew: clientid not found! Mar 31 13:34:22 dev kernel: [1941073.783198] nfsv4 compound op ffff88025b2e0078 opcnt 1 #1: 30: status 10011 Mar 31 13:34:22 dev kernel: [1941073.783201] nfsv4 compound returned 10011 Mar 31 13:34:22 dev kernel: [1941073.783211] svc: socket ffff88042e723000 sendto([ffff8804132e7000 48... ], 48) = 48 (addr 192.168.151.7, port=811) Mar 31 13:34:22 dev kernel: [1941073.783215] svc: transport ffff88042e723000 busy, not enqueued Mar 31 13:34:22 dev kernel: [1941073.783217] svc: server ffff880418a00000 waiting for data (to = 900000) Mar 31 13:34:22 dev kernel: [1941073.783225] svc: server ffff88040fbc6000, pool 0, transport ffff88042e723000, inuse=2 Mar 31 13:34:22 dev kernel: [1941073.783228] svc: tcp_recv ffff88042e723000 data 1 conn 0 close 0 Mar 31 13:34:22 dev kernel: [1941073.783232] svc: socket ffff88042e723000 recvfrom(ffff88042e72328c, 0) = 4 Mar 31 13:34:22 dev kernel: [1941073.783234] svc: TCP record, 100 bytes Mar 31 13:34:22 dev kernel: [1941073.783237] svc: socket ffff88042e723000 recvfrom(ffff880351f9c008, 4088) = 8 Mar 31 13:34:22 dev kernel: [1941073.783241] svc: socket ffff88042e723000 recvfrom(ffff880351f9c064, 3996) = 92 Mar 31 13:34:22 dev kernel: [1941073.783244] svc: TCP complete record (100 bytes) Mar 31 13:34:22 dev kernel: [1941073.783246] svc: transport ffff88042e723000 served by daemon ffff880418a00000 Mar 31 13:34:22 dev kernel: [1941073.783251] svc: got len=100 Mar 31 13:34:22 dev kernel: [1941073.783253] svc: svc_authenticate (1) Mar 31 13:34:22 dev kernel: [1941073.783258] svc: server ffff880418a00000, pool 0, transport ffff88042e723000, inuse=3 Mar 31 13:34:22 dev kernel: [1941073.783261] svc: transport ffff88042e723000 busy, not enqueued Mar 31 13:34:22 dev kernel: [1941073.783265] svc: tcp_recv ffff88042e723000 data 1 conn 0 close 0 Mar 31 13:34:22 dev kernel: [1941073.783270] svc: socket ffff88042e723000 recvfrom(ffff88042e723288, 4) = -11 Mar 31 13:34:22 dev kernel: [1941073.783273] RPC: TCP recv_record got EAGAIN Mar 31 13:34:22 dev kernel: [1941073.783275] svc: calling dispatcher Mar 31 13:34:22 dev kernel: [1941073.783278] nfsd_dispatch: vers 4 proc 1 Mar 31 13:34:22 dev kernel: [1941073.783281] svc: got len=-11 Mar 31 13:34:22 dev kernel: [1941073.783283] nfsv4 compound op #1/1: 30 (OP_RENEW) Mar 31 13:34:22 dev kernel: [1941073.783286] process_renew(4d9458d1/0000073f): starting Mar 31 13:34:22 dev kernel: [1941073.783288] nfsd4_renew: clientid not found! Mar 31 13:34:22 dev kernel: [1941073.783292] nfsv4 compound op ffff880377c22078 opcnt 1 #1: 30: status 10011 Mar 31 13:34:22 dev kernel: [1941073.783295] svc: server ffff880418a00000 waiting for data (to = 900000) Mar 31 13:34:22 dev kernel: [1941073.783299] nfsv4 compound returned 10011 Mar 31 13:34:22 dev kernel: [1941073.783314] svc: socket ffff88042e723000 sendto([ffff88041ea23000 48... ], 48) = 48 (addr 192.168.151.7, port=811) Mar 31 13:34:22 dev kernel: [194ffff880351f9c008, 4088) = 8 Mar 31 13:34:22 dev kernel: [1941073.792890] svc: socket ffff88042e723000 recvfrom(ffff880351f9c0bc, 3908) = 180 Mar 31 13:34:22 dev kernel: [1941073.792893] svc: TCP complete record (188 bytes) Mar 31 13:34:22 dev kernel: [1941073.792896] svc: transport ffff88042e723000 served by daemon ffff880418a00000 Mar 31 13:34:22 dev kernel: [1941073.792900] svc: got len=188 Mar 31 13:34:22 dev kernel: [1941073.792903] svc: svc_authenticate (1) Mar 31 13:34:22 dev kernel: [1941073.792909] svc: server ffff880418a00000, pool 0, transport ffff88042e723000, inuse=3 Mar 31 13:34:22 dev kernel: [1941073.792912] svc: transport ffff88042e723000 busy, not enqueued Mar 31 13:34:22 dev kernel: [1941073.792914] svc: calling dispatcher Mar 31 13:34:22 dev kernel: [1941073.792917] nfsd_dispatch: vers 4 proc 1 Mar 31 13:34:22 dev kernel: [1941073.792920] svc: tcp_recv ffff88042e723000 data 1 conn 0 close 0 Mar 31 13:34:22 dev kernel: [1941073.792923] nfsv4 compound op #1/1: 35 (OP_SETCLIENTID) Mar 31 13:34:22 dev kernel: [1941073.792927] NFSD: nfs4_make_rec_clidname for 192.168.151.6/192.168.151.9 tcp UNIX 0 Mar 31 13:34:22 dev kernel: [1941073.792931] svc: socket ffff88042e723000 recvfrom(ffff88042e723288, 4) = -11 Mar 31 13:34:22 dev kernel: [1941073.792934] RPC: TCP recv_record got EAGAIN Mar 31 13:34:22 dev kernel: [1941073.792936] svc: got len=-11 Mar 31 13:34:22 dev kernel: [1941073.792938] svc: server ffff880418a00000 waiting for data (to = 900000) Mar 31 13:34:22 dev kernel: [1941073.792952] nfsv4 compound op ffff880377c22078 opcnt 1 #1: 35: status 0 Mar 31 13:34:22 dev kernel: [1941073.792956] nfsv4 compound returned 0 Mar 31 13:34:22 dev kernel: [1941073.792971] svc: socket ffff88042e723000 sendto([ffff8802b0ae5000 64... ], 64) = 64 (addr 192.168.151.7, port=811) Mar 31 13:34:22 dev kernel: [1941073.792979] svc: server ffff88040fbc6000 waiting for data (to = 900000) Mar 31 13:34:22 dev kernel: [1941073.793055] svc: socket ffff88000dc41a00 TCP data ready (svsk ffff88042e723000) Mar 31 13:34:22 dev kernel: [1941073.793058] svc: transport ffff88042e723000 served by daemon ffff88040fbc6000 Mar 31 13:34:22 dev kernel: [1941073.793067] svc: server ffff88040fbc6000, pool 0, transport ffff88042e723000, inuse=2 Mar 31 13:34:22 dev kernel: [1941073.793070] svc: tcp_recv ffff88042e723000 data 1 conn 0 close 0 Mar 31 13:34:22 dev kernel: [1941073.793075] svc: socket ffff88042e723000 recvfrom(ffff88042e72328c, 0) = 4 Mar 31 13:34:22 dev kernel: [1941073.793077] svc: TCP record, 128 bytes Mar 31 13:34:22 dev kernel: [1941073.793080] svc: socket ffff88042e723000 recvfrom(ffff880351f9c008, 4088) = 8 Mar 31 13:34:22 dev kernel: [1941073.793084] svc: socket ffff88042e723000 recvfrom(ffff880351f9c080, 3968) = 120 Mar 31 13:34:22 dev kernel: [1941073.793087] svc: TCP complete record (128 bytes) Mar 31 13:34:22 dev kernel: [1941073.793089] svc: transport ffff88042e723000 served by daemon ffff880418a00000 Mar 31 13:34:22 dev kernel: [1941073.793093] svc: got len=128 Mar 31 13:34:22 dev kernel: [1941073.793096] svc: svc_authenticate (1) Mar 31 13:34:22 dev kernel: [1941073.793100] svc: server ffff880418a00000, pool 0, transport ffff88042e723000, inuse=3 Mar 31 13:34:22 dev kernel: [1941073.793103] svc: transport ffff88042e723000 busy, not enqueued Mar 31 13:34:22 dev kernel: [1941073.793106] svc: calling dispatcher Mar 31 13:34:22 dev kernel: [1941073.793108] nfsd_dispatch: vers 4 proc 1 Mar 31 13:34:22 dev kernel: [1941073.793112] svc: tcp_recv ffff88042e723000 data 1 conn 0 close 0 Mar 31 13:34:22 dev kernel: [1941073.793115] nfsv4 compound op #1/3: 36 (OP_SETCLIENTID_CONFIRM) Mar 31 13:34:22 dev kernel: [1941073.793119] NFSD: move_to_confirm nfs4_client ffff88042e725800 Mar 31 13:34:22 dev kernel: [1941073.793123] svc: socket ffff88042e723000 recvfrom(ffff88042e723288, 4) = -11 Mar 31 13:34:22 dev kernel: [1941073.793125] RPC: TCP recv_record got EAGAIN Mar 31 13:34:22 dev kernel: [1941073.793128] svc: got len=-11 Mar 31 13:34:22 dev kernel: [1941073.793130] svc: server ffff880418a00000 waiting for data (to = 900000) Mar 31 13:80<>1403730]RC 21 albn sau )<>1403730]RC 21 alcnetxr ff825e00i o once Mar 31 13:34:22 dev kernel: 4[917.927 P:312xr_onc ptff80f8a0 sntonce Mar 31 13:34:22 dev kernel: 4[917.920 P:312sepo(uu xr_edn"tm 71037 Mar 31 13:34:22 dev kernel: 4[917.923 P:312addt uu ff825e20"ptpnig Mar 31 13:34:22 dev kernel: 4[917.926 P:312stigaamfr600m Mar 31 13:34:22 dev kernel: 4[917.929 P: x_onc ceue ptff80f8a0 Mar 31 13:34:22 dev kernel: 4[917.923 fv opudo ff837208oct3#:3:sau Mar 31 13:34:22 dev kernel: 4[917.926 fv opudo 23 4(PPTOTH Mar 31 13:34:22 dev kernel: 4[917.920 P: Wn pae eae10 g= Mar 31 13:34:22 dev kernel: 4[917.920 fv opudo ff837208oct3#:2:sau 00 Mar 31 13:34:22 dev kernel: 4[917.923 fv opudrtre 00 Mar 31 13:34:22 dev kernel: 4[917.925 fd rpigrqet a ervstdltr<>1403734]sc v_rcs rpt<>1403735]sc ptff802730 rpe eus Mar 31 13:34:22 dev kernel: 4[917.924 v:sre ff84fc00wiigfrdt t 000 Mar 31 13:34:22 dev kernel: 4[917.937 F:rvldtn 01/)<>1403734]RC okn pUI rd<>1403734]RC okn pUI rd<>1403734]RC e akiiilzd rci 21 Mar 31 13:34:22 dev kernel: 4[917.938 P: alctdts ff84d3b0<>1403735]RC 21 _p_xct lg=x8 Mar 31 13:34:22 dev kernel: 4[917.933 P:313cl_tr f4po EAT sn)<>1403735]RC 21 alrsre(tts0 Mar 31 13:34:22 dev kernel: 4[917.938 P: wiigfrrqetso Mar 31 13:34:22 dev kernel: 4[917.930 P:313sepo(uu xr_alg ie48014)<>140.933 P:313addt uu f80279a xr_akg Mar 31 13:34:22 dev kernel: 4[403736]P:313sn akgi ose Mar 31 13:34:23 dev kernel: [1941074.360512] RPC: killing all tasks for client ffff8803efe08a00 Mar 31 13:34:24 dev kernel: [1941075.358733] RPC: killing all tasks for client ffff8803efe08a00 Mar 31 13:34:25 dev kernel: [1941076.357001] RPC: killing all tasks for client ffff8803efe08a00 Mar 31 13:34:26 dev kernel: [1941077.355322] RPC: killing all tasks for client ffff8803efe08a00 So, I'm unable to get the nfs working again, unless I reboot the server. I'm using portmap 6.0.0-2. Shall I use rcpbind instead ? I'm not able to reproduce the bug, but I know it will come again, it's the third time that I get it. Thanks for reading. Best regards, Yann -- System Information: Debian Release: 6.0 APT prefers stable APT policy: (500, 'stable') Architecture: amd64 (x86_64) Kernel: Linux 2.6.32-5-amd64 (SMP w/16 CPU cores) Locale: LANG=fr_FR.UTF-8, LC_CTYPE=fr_FR.UTF-8 (charmap=UTF-8) Shell: /bin/sh linked to /bin/dash Versions of packages nfs-kernel-server depends on: ii libblkid1 2.17.2-9 block device id library ii libc6 2.11.2-10 Embedded GNU C Library: Shared lib ii libcomerr2 1.41.12-2 common error description library ii libgssapi-krb5-2 1.8.3+dfsg-4 MIT Kerberos runtime libraries - k ii libgssglue1 0.1-4 mechanism-switch gssapi library ii libk5crypto3 1.8.3+dfsg-4 MIT Kerberos runtime libraries - C ii libkrb5-3 1.8.3+dfsg-4 MIT Kerberos runtime libraries ii libnfsidmap2 0.23-2 An nfs idmapping library ii librpcsecgss3 0.19-2 allows secure rpc communication us ii libwrap0 7.6.q-19 Wietse Venema's TCP wrappers libra ii lsb-base 3.2-23.2squeeze1 Linux Standard Base 3.2 init scrip ii nfs-common 1:1.2.2-4 NFS support files common to client ii ucf 3.0025+nmu1 Update Configuration File: preserv nfs-kernel-server recommends no packages. nfs-kernel-server suggests no packages. -- no debconf information
--- End Message ---
--- Begin Message ---
- To: 620299-done@bugs.debian.org
- Subject: Re: nfs-kernel-server: nfs stop working after some time - rpc.mountd and rpciod in D state
- From: Ben Hutchings <ben@decadent.org.uk>
- Date: Sat, 19 Mar 2022 17:37:13 +0100
- Message-id: <c64b7a8e7c1fa86ae1446835c9d0a715d6b2ba72.camel@decadent.org.uk>
- In-reply-to: <20110331205120.10641.15416.reportbug@dev.lan.internethic.net>
- References: <20110331205120.10641.15416.reportbug@dev.lan.internethic.net>
Closing this due to lack of response. Ben., -- Ben Hutchings Beware of programmers who carry screwdrivers. - Leonard BrandweinAttachment: signature.asc
Description: This is a digitally signed message part
--- End Message ---