
Bug#1120598: ls input/output error ("NFS: readdir(/) returns -5") on krb5 NFSv4 client using SHA2



On 11/13/25 4:21 PM, Salvatore Bonaccorso wrote:
> Hi Chuck,
> 
> On Thu, Nov 13, 2025 at 12:47:23PM -0500, Chuck Lever wrote:
>> On 11/13/25 12:16 PM, Tyler W. Ross wrote:
>>> Thanks, Chuck.
>>>
>>> Suggested trace-cmd report from the client follows. Last 3 lines appear salient, but I've included the full report just in case.
>>>
>>>           <idle>-0     [001] ..s2.   270.327040: xs_data_ready:        peer=[10.108.2.102]:2049
>>>    kworker/u16:0-12    [001] ...1.   270.327048: xprt_lookup_rqst:     peer=[10.108.2.102]:2049 xid=0x7b569c7a status=0
>>>    kworker/u16:0-12    [001] ...2.   270.327050: rpc_task_wakeup:      task:00000008@00000005 flags=MOVEABLE|DYNAMIC|SENT|NORTO|CRED_NOREF runstate=0x6 status=0 timeout=15000 queue=xprt_pending
>>>    kworker/u16:0-12    [001] .....   270.327054: xs_stream_read_request: peer=[10.108.2.102]:2049 xid=0x7b569c7a copied=988 reclen=988 offset=988
>>>    kworker/u16:0-12    [001] .....   270.327055: xs_stream_read_data:  peer=[10.108.2.102]:2049 err=-11 total=992
>>>               ls-969   [003] .....   270.327062: rpc_task_sync_wake:   task:00000008@00000005 flags=MOVEABLE|DYNAMIC|SENT|NORTO|CRED_NOREF runstate=RUNNING|0x4 status=0 action=call_status
>>>               ls-969   [003] .....   270.327062: rpc_task_run_action:  task:00000008@00000005 flags=MOVEABLE|DYNAMIC|SENT|NORTO|CRED_NOREF runstate=RUNNING|0x4 status=0 action=xprt_timer
>>>               ls-969   [003] .....   270.327063: rpc_task_run_action:  task:00000008@00000005 flags=MOVEABLE|DYNAMIC|SENT|NORTO|CRED_NOREF runstate=RUNNING|0x4 status=0 action=call_status
>>>               ls-969   [003] .....   270.327063: rpc_task_run_action:  task:00000008@00000005 flags=MOVEABLE|DYNAMIC|SENT|NORTO|CRED_NOREF runstate=RUNNING|0x4 status=0 action=call_decode
>>>               ls-969   [003] .....   270.327063: rpc_xdr_recvfrom:     task:00000008@00000005 head=[0xffff8895c29fef64,140] page=4008(88) tail=[0xffff8895c29feff0,36] len=988
>>>               ls-969   [003] .....   270.327067: rpc_xdr_overflow:     task:00000008@00000005 nfsv4 READDIR requested=8 p=0xffff8895c29fefec end=0xffff8895c29feff0 xdr=[0xffff8895c29fef64,140]/4008/[0xffff8895c29feff0,36]/988
>>
>> Here's the problem. This is a sign of an XDR decoding issue. If you
>> capture the traffic with Wireshark, does Wireshark indicate where the
>> XDR is malformed?
>>
>> If it doesn't, then there is some problem with the client code. Since
>> Fedora 43 is working as expected, I would guess there's a misapplied
>> patch on Debian 13's kernel...?
> 
> If it is helpful: Debian follows the upstream stable releases (6.12.y
> for trixie/Debian 13, currently 6.17.y for Debian unstable), and we try
> to keep the set of patches we apply on top small. So far I see none
> that touches net/sunrpc/. The applied patches are here:
> https://salsa.debian.org/kernel-team/linux/-/tree/debian/6.17/forky/debian/patches?ref_type=heads
> (in case this helps narrow down the issue).
> 
> But additionally, if Tyler is able to do so, we could try the 6.17.7
> upstream version directly, without the Debian patches applied.

A bisect between broken v6.12.y and working v6.17.7 could identify
what is possibly missing from v6.12.y.
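
For reference, the arithmetic behind the rpc_xdr_overflow event quoted
above can be checked with a short script. This is only a sketch: it
assumes that p is the decoder's current position, that end is the end of
the buffer segment being decoded, and that the first bracketed pair in
xdr= is the head buffer's base address and length. The function name
summarize_overflow is made up for illustration.

```python
import re

# The rpc_xdr_overflow event exactly as it appears in the quoted trace.
TRACE_LINE = (
    "rpc_xdr_overflow:     task:00000008@00000005 nfsv4 READDIR requested=8 "
    "p=0xffff8895c29fefec end=0xffff8895c29feff0 "
    "xdr=[0xffff8895c29fef64,140]/4008/[0xffff8895c29feff0,36]/988"
)

# Field meanings are an assumption based on the field names in the trace.
PATTERN = re.compile(
    r"requested=(?P<requested>\d+)\s+"
    r"p=(?P<p>0x[0-9a-f]+)\s+"
    r"end=(?P<end>0x[0-9a-f]+)\s+"
    r"xdr=\[(?P<head_base>0x[0-9a-f]+),(?P<head_len>\d+)\]"
)


def summarize_overflow(line):
    """Return how many bytes were requested versus left before 'end'."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("not an rpc_xdr_overflow line in the expected format")
    requested = int(m["requested"])
    p = int(m["p"], 16)
    end = int(m["end"], 16)
    head_end = int(m["head_base"], 16) + int(m["head_len"])
    available = end - p
    return {
        "requested": requested,
        "available": available,
        "short_by": requested - available,
        "end_is_head_end": end == head_end,
    }


if __name__ == "__main__":
    print(summarize_overflow(TRACE_LINE))
```

Under those assumptions, the decoder asked for 8 bytes with only 4 left
before the end of the 140-byte head buffer, which is consistent with the
reply being decoded with a different layout than the one the server sent.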


-- 
Chuck Lever

