Bug#1120598: ls input/output error ("NFS: readdir(/) returns -5") on krb5 NFSv4 client using SHA2
- To: "Tyler W. Ross" <TWR@tylerwross.com>, "1120598@bugs.debian.org" <1120598@bugs.debian.org>, Jeff Layton <jlayton@kernel.org>, NeilBrown <neil@brown.name>, Scott Mayhew <smayhew@redhat.com>, Steve Dickson <steved@redhat.com>, Salvatore Bonaccorso <carnil@debian.org>
- Cc: Olga Kornievskaia <okorniev@redhat.com>, Dai Ngo <Dai.Ngo@oracle.com>, Tom Talpey <tom@talpey.com>, Trond Myklebust <trondmy@kernel.org>, Anna Schumaker <anna@kernel.org>, linux-nfs@vger.kernel.org, linux-kernel@vger.kernel.org
- Subject: Bug#1120598: ls input/output error ("NFS: readdir(/) returns -5") on krb5 NFSv4 client using SHA2
- From: Chuck Lever <chuck.lever@oracle.com>
- Date: Thu, 13 Nov 2025 12:47:23 -0500
- Message-id: <[🔎] c7136bad-5a00-4224-a25c-0cf7e8252f4a@oracle.com>
- Reply-to: Chuck Lever <chuck.lever@oracle.com>, 1120598@bugs.debian.org
- In-reply-to: <[🔎] c8-cRKuS2KXjk19lBwOGLCt21IbVv7HsS-V-ihDmhQ1Uae_LHNm83T0dOKvbYqsf4AeP5T8PR_xdiKLj5-Nvec-QVTLqIC4NGuU2FA0hN5U=@tylerwross.com>
- References: <[🔎] 176298368872.955.14091113173156448257.reportbug@nfsclient-sid.ipa.twrlab.net> <[🔎] aRVl8yGqTkyaWxPM@eldamar.lan> <[🔎] 8d873978-2df6-4b79-891d-f0fd78485551@oracle.com> <[🔎] c8-cRKuS2KXjk19lBwOGLCt21IbVv7HsS-V-ihDmhQ1Uae_LHNm83T0dOKvbYqsf4AeP5T8PR_xdiKLj5-Nvec-QVTLqIC4NGuU2FA0hN5U=@tylerwross.com> <[🔎] 176298368872.955.14091113173156448257.reportbug@nfsclient-sid.ipa.twrlab.net>
On 11/13/25 12:16 PM, Tyler W. Ross wrote:
> Thanks, Chunk.
>
> Suggested trace-cmd report from the client follows. Last 3 lines appear salient, but I've included the full report just in case.
>
> <idle>-0 [001] ..s2. 270.327040: xs_data_ready: peer=[10.108.2.102]:2049
> kworker/u16:0-12 [001] ...1. 270.327048: xprt_lookup_rqst: peer=[10.108.2.102]:2049 xid=0x7b569c7a status=0
> kworker/u16:0-12 [001] ...2. 270.327050: rpc_task_wakeup: task:00000008@00000005 flags=MOVEABLE|DYNAMIC|SENT|NORTO|CRED_NOREF runstate=0x6 status=0 timeout=15000 queue=xprt_pending
> kworker/u16:0-12 [001] ..... 270.327054: xs_stream_read_request: peer=[10.108.2.102]:2049 xid=0x7b569c7a copied=988 reclen=988 offset=988
> kworker/u16:0-12 [001] ..... 270.327055: xs_stream_read_data: peer=[10.108.2.102]:2049 err=-11 total=992
> ls-969 [003] ..... 270.327062: rpc_task_sync_wake: task:00000008@00000005 flags=MOVEABLE|DYNAMIC|SENT|NORTO|CRED_NOREF runstate=RUNNING|0x4 status=0 action=call_status
> ls-969 [003] ..... 270.327062: rpc_task_run_action: task:00000008@00000005 flags=MOVEABLE|DYNAMIC|SENT|NORTO|CRED_NOREF runstate=RUNNING|0x4 status=0 action=xprt_timer
> ls-969 [003] ..... 270.327063: rpc_task_run_action: task:00000008@00000005 flags=MOVEABLE|DYNAMIC|SENT|NORTO|CRED_NOREF runstate=RUNNING|0x4 status=0 action=call_status
> ls-969 [003] ..... 270.327063: rpc_task_run_action: task:00000008@00000005 flags=MOVEABLE|DYNAMIC|SENT|NORTO|CRED_NOREF runstate=RUNNING|0x4 status=0 action=call_decode
> ls-969 [003] ..... 270.327063: rpc_xdr_recvfrom: task:00000008@00000005 head=[0xffff8895c29fef64,140] page=4008(88) tail=[0xffff8895c29feff0,36] len=988
> ls-969 [003] ..... 270.327067: rpc_xdr_overflow: task:00000008@00000005 nfsv4 READDIR requested=8 p=0xffff8895c29fefec end=0xffff8895c29feff0 xdr=[0xffff8895c29fef64,140]/4008/[0xffff8895c29feff0,36]/988
Here's the problem. This is a sign of an XDR decoding issue. If you
capture the traffic with Wireshark, does Wireshark indicate where the
XDR is malformed?
If it doesn't, then there is some problem with the client code. Since
Fedora 43 is working as expected, I would guess there's a misapplied
patch on Debian 13's kernel...?
> ls-969 [003] ..... 270.327068: rpc_task_run_action: task:00000008@00000005 flags=MOVEABLE|DYNAMIC|SENT|NORTO|CRED_NOREF runstate=RUNNING|0x4 status=-5 action=rpc_exit_task
> ls-969 [003] ..... 270.327068: rpc_task_end: task:00000008@00000005 flags=MOVEABLE|DYNAMIC|SENT|NORTO|CRED_NOREF runstate=RUNNING|0x4 status=-5 action=rpc_exit_task
> ls-969 [003] ..... 270.327068: rpc_stats_latency: task:00000008@00000005 xid=0x7b569c7a nfsv4 READDIR backlog=7 rtt=110 execute=137 xprt_id=1
> ls-969 [003] ..... 270.327068: rpc_task_call_done: task:00000008@00000005 flags=MOVEABLE|DYNAMIC|SENT|NORTO|CRED_NOREF runstate=RUNNING|0x4 status=-5 action=nfs41_call_sync_done
> ls-969 [003] ..... 270.327068: nfs4_sequence_done: error=0 (OK) session=0x5988ad3c slot_nr=0 seq_nr=26 highest_slotid=63 target_highest_slotid=63 status_flags=0x0 ()
> ls-969 [003] ...1. 270.327069: xprt_release_xprt: task:00000008@00000005 snd_task:ffffffff
> ls-969 [003] ...1. 270.327070: nfs_set_cache_invalid: error=0 (OK) fileid=00:2d:262146 fhandle=0xad8c294c type=4 (DIR) version=31 size=4096 cache_validity=0x4 (INVALID_ATIME) nfs_flags=0x4 (ACL_LRU_SET)
> ls-969 [003] ..... 270.327070: nfs4_readdir: error=-5 (EIO) fileid=00:2d:262146 fhandle=0xad8c294c
> ls-969 [003] ..... 270.327071: nfs_readdir_cache_fill_done: error=-5 (IO) fileid=00:2d:262146 fhandle=0xad8c294c type=4 (DIR) version=31 size=4096 cache_validity=0x4 (INVALID_ATIME) nfs_flags=0x4 (ACL_LRU_SET)
--
Chuck Lever
Reply to: