
Bug#1120598: ls input/output error ("NFS: readdir(/) returns -5") on krb5 NFSv4 client using SHA2



On 11/13/25 1:05 PM, Tyler W. Ross wrote:
> On Thursday, November 13th, 2025 at 10:47 AM, Chuck Lever <chuck.lever@oracle.com> wrote:
> 
>>> ls-969 [003] ..... 270.327063: rpc_xdr_recvfrom: task:00000008@00000005 head=[0xffff8895c29fef64,140] page=4008(88) tail=[0xffff8895c29feff0,36] len=988
>>> ls-969 [003] ..... 270.327067: rpc_xdr_overflow: task:00000008@00000005 nfsv4 READDIR requested=8 p=0xffff8895c29fefec end=0xffff8895c29feff0 xdr=[0xffff8895c29fef64,140]/4008/[0xffff8895c29feff0,36]/988
>>
>>
>> Here's the problem. This is a sign of an XDR decoding issue. If you
>> capture the traffic with Wireshark, does Wireshark indicate where the
>> XDR is malformed?
> 
> Wireshark appears to decode the READDIR reply without issue. Nothing is obviously marked as malformed, and values all appear sane when spot-checking fields in the decoded packet.

Then I would start looking for differences between the Debian 13 and
Fedora 43 kernel code base under net/sunrpc/ .

Alternatively, "git bisect first, ask questions later" ... :-)

I didn't find an indication in the report of whether this mount uses
sec=krb5, sec=krb5i, or sec=krb5p. Knowing that might narrow down where
the code changed.

Also, the xdr_buf might have a page boundary positioned in the middle of
an XDR data item. Knowing which data item is being decoded where the
"overflow" occurs might be helpful (I think adding pr_info() call sites
or trace_printk() will be adequate to gain some better observability).
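
As a starting point before instrumenting the kernel, the numbers already
in the rpc_xdr_overflow line quoted above can be checked by hand. Below
is a minimal Python sketch that does the arithmetic. It assumes the xdr=
field reads as [head base,head len]/page len/[tail base,tail len]/buf len;
the helper name summarize_overflow and the result keys are made up for
illustration and are not part of any kernel or nfs-utils tooling.

```python
import re

# Trace line copied from the report quoted earlier in this message.
LINE = (
    "rpc_xdr_overflow: task:00000008@00000005 nfsv4 READDIR requested=8 "
    "p=0xffff8895c29fefec end=0xffff8895c29feff0 "
    "xdr=[0xffff8895c29fef64,140]/4008/[0xffff8895c29feff0,36]/988"
)

# Assumed field layout: xdr=[head_base,head_len]/page_len/[tail_base,tail_len]/buf_len
PATTERN = re.compile(
    r"requested=(?P<requested>\d+) "
    r"p=(?P<p>0x[0-9a-f]+) end=(?P<end>0x[0-9a-f]+) "
    r"xdr=\[(?P<head_base>0x[0-9a-f]+),(?P<head_len>\d+)\]/(?P<page_len>\d+)/"
    r"\[(?P<tail_base>0x[0-9a-f]+),(?P<tail_len>\d+)\]/(?P<buf_len>\d+)"
)


def to_int(text):
    """Convert a decimal or 0x-prefixed hexadecimal string to an int."""
    return int(text, 16) if text.startswith("0x") else int(text)


def summarize_overflow(line):
    """Hypothetical helper: derive byte counts from one rpc_xdr_overflow line."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("not an rpc_xdr_overflow line in the expected format")
    f = {name: to_int(value) for name, value in match.groupdict().items()}
    head_end = f["head_base"] + f["head_len"]
    return {
        "requested": f["requested"],                 # bytes the decoder asked for
        "remaining": f["end"] - f["p"],              # bytes left before 'end'
        "offset_in_head": f["p"] - f["head_base"],   # decode position within head
        "end_is_head_end": f["end"] == head_end,     # does 'end' match end of head?
        "segments_total": f["head_len"] + f["page_len"] + f["tail_len"],
        "buf_len": f["buf_len"],
    }


if __name__ == "__main__":
    for key, value in summarize_overflow(LINE).items():
        print(f"{key}: {value}")
```

For the quoted line this reports 4 bytes remaining against 8 requested,
at offset 136 of the 140-byte head, with 'end' equal to the end of the
head. That is consistent with a data item straddling a segment boundary,
but it does not say which item; that still needs pr_info() or
trace_printk() in the decoder.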


-- 
Chuck Lever

