
Bug#1120598: ls input/output error ("NFS: readdir(/) returns -5") on krb5 NFSv4 client using SHA2



On 11/13/25 1:51 PM, Tyler W. Ross wrote:
> On Thursday, November 13th, 2025 at 11:12 AM, Chuck Lever <chuck.lever@oracle.com> wrote:
> 
>> Then I would start looking for differences between the Debian 13 and
>> Fedora 43 kernel code base under net/sunrpc/ .
>>
>> Alternatively, "git bisect first, ask questions later" ... :-)
> 
> This is outside my day-to-day, so I don't have a workflow for this kind of
> testing/debugging, but I'll see what I can do.
> 
> Thanks for the starting place.
> 
>> So I didn't find an indication of whether this was sec=krb5, sec=krb5i,
>> or sec=krb5p. That might narrow down where the code changed.
> 
> I confirmed the issue with all 3 krb5 sec modes, in both the 6.12 kernel
> that ships with Debian 13 and the 6.17 that currently ships with Debian
> Sid/unstable. Similarly, I confirmed NFSv4.2, 4.1 and 4.0 are impacted.
> 
>> Also, the xdr_buf might have a page boundary positioned in the middle of
>> an XDR data item. Knowing which data item is being decoded where the
>> "overflow" occurs might be helpful (I think adding pr_info() call sites
>> or trace_printk() will be adequate to gain some better observability).
> 
> No experience with kernel hacking, so I'm not confident I can locate
> meaningful places to insert those.
> 
> I'll see where some snooping and a bisect gets me. Failing that, if
> anyone has recommendations on where to add those calls, I'd appreciate
> the guidance.

Start with xdr_inline_decode(). The easiest approach (though somewhat
noisy) is to add a WARN_ON() just after each of the
trace_rpc_xdr_overflow() call sites. The stack trace for the failing
decode will then be dumped into the system journal.
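
To illustrate where such a warning would sit, here is a minimal userspace
sketch of the pattern. It is not the kernel source: every mock_* name and
MOCK_WARN_ON are hypothetical stand-ins, the stream struct is reduced to two
pointers, and the bounds check is a simplification of what the real decoder
does. In the kernel, the added line would be a WARN_ON(1) placed directly
after the existing trace_rpc_xdr_overflow() call.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for the kernel's struct xdr_stream. */
struct mock_xdr_stream {
	const uint32_t *p;   /* current decode position */
	const uint32_t *end; /* one past the last available word */
};

/* Counts how many times the mock warning fired. */
static unsigned int mock_warn_count;

/* Stand-in for WARN_ON(): the kernel macro also dumps a stack trace. */
#define MOCK_WARN_ON(cond)                                        \
	do {                                                      \
		if (cond) {                                       \
			mock_warn_count++;                        \
			fprintf(stderr, "WARNING at %s:%d\n",     \
				__FILE__, __LINE__);              \
		}                                                 \
	} while (0)

/* Stand-in for the rpc_xdr_overflow tracepoint. */
static void mock_trace_rpc_xdr_overflow(const struct mock_xdr_stream *xdr,
					size_t nbytes)
{
	fprintf(stderr, "overflow: requested %zu bytes, %zu available\n",
		nbytes, (size_t)(xdr->end - xdr->p) * sizeof(uint32_t));
}

/* Simplified decode helper: returns NULL when the request does not fit. */
static const uint32_t *mock_xdr_inline_decode(struct mock_xdr_stream *xdr,
					      size_t nbytes)
{
	size_t nwords = (nbytes + 3) / 4; /* round up to 4-byte XDR words */
	const uint32_t *p = xdr->p;

	if (nwords > (size_t)(xdr->end - xdr->p))
		goto out_overflow;
	xdr->p = p + nwords;
	return p;

out_overflow:
	mock_trace_rpc_xdr_overflow(xdr, nbytes);
	MOCK_WARN_ON(1); /* the added line: flags the failing call path */
	return NULL;
}
```

Decoding 8 bytes from a two-word buffer succeeds here, and a further
4-byte request takes the overflow path and fires the warning once. In the
kernel, the backtrace attached to that warning is what identifies which
decoder was running when the overflow occurred.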


-- 
Chuck Lever

