Bug#1120598: ls input/output error ("NFS: readdir(/) returns -5") on krb5 NFSv4 client using SHA2
- To: Chuck Lever <chuck.lever@oracle.com>
- Cc: "1120598@bugs.debian.org" <1120598@bugs.debian.org>, Jeff Layton <jlayton@kernel.org>, NeilBrown <neil@brown.name>, Scott Mayhew <smayhew@redhat.com>, Steve Dickson <steved@redhat.com>, Salvatore Bonaccorso <carnil@debian.org>, Olga Kornievskaia <okorniev@redhat.com>, Dai Ngo <Dai.Ngo@oracle.com>, Tom Talpey <tom@talpey.com>, Trond Myklebust <trondmy@kernel.org>, Anna Schumaker <anna@kernel.org>, linux-nfs@vger.kernel.org, linux-kernel@vger.kernel.org
- Subject: Bug#1120598: ls input/output error ("NFS: readdir(/) returns -5") on krb5 NFSv4 client using SHA2
- From: "Tyler W. Ross" <TWR@tylerwross.com>
- Date: Thu, 13 Nov 2025 18:51:57 +0000
- Message-id: <eUtqaTOrHO8Sj-82m04dsCpmYX8bPkr5r9Nla1muHxSnxBYq57wxk7LLf_RuI377WMpUcczBXteWGvF5OfNfe5gwLmfTn_YblJucaF58POo=@tylerwross.com>
- Reply-to: "Tyler W. Ross" <TWR@tylerwross.com>, 1120598@bugs.debian.org
- In-reply-to: <4b77bf39-bc1a-47a1-9a16-14c44c31614f@oracle.com>
- References: <176298368872.955.14091113173156448257.reportbug@nfsclient-sid.ipa.twrlab.net> <aRVl8yGqTkyaWxPM@eldamar.lan> <8d873978-2df6-4b79-891d-f0fd78485551@oracle.com> <c8-cRKuS2KXjk19lBwOGLCt21IbVv7HsS-V-ihDmhQ1Uae_LHNm83T0dOKvbYqsf4AeP5T8PR_xdiKLj5-Nvec-QVTLqIC4NGuU2FA0hN5U=@tylerwross.com> <c7136bad-5a00-4224-a25c-0cf7e8252f4a@oracle.com> <N14GL1WKSGqrFl8nF0e6sa0QxOZrnrpoC7IZlZ20YqUyfsxpsoqu2W3a31H_GfQv7OEqaEWKwDXdgtAV-xv613w_slTAFZIoyWMutIE5pKk=@tylerwross.com> <4b77bf39-bc1a-47a1-9a16-14c44c31614f@oracle.com> <176298368872.955.14091113173156448257.reportbug@nfsclient-sid.ipa.twrlab.net>
On Thursday, November 13th, 2025 at 11:12 AM, Chuck Lever <chuck.lever@oracle.com> wrote:
> Then I would start looking for differences between the Debian 13 and
> Fedora 43 kernel code base under net/sunrpc/ .
>
> Alternatively, "git bisect first, ask questions later" ... :-)
This is outside my day-to-day, so I don't have a workflow for this kind of
testing/debugging, but I'll see what I can do.
Thanks for the starting place.
> So I didn't find an indication of whether this was sec=krb5, sec=krb5i,
> or sec=krb5p. That might narrow down where the code changed.
I confirmed the issue with all 3 krb5 sec modes, in both the 6.12 kernel
that ships with Debian 13 and the 6.17 that currently ships with Debian
Sid/unstable. Similarly, I confirmed NFSv4.2, 4.1 and 4.0 are impacted.
> Also, the xdr_buf might have a page boundary positioned in the middle of
> an XDR data item. Knowing which data item is being decoded where the
> "overflow" occurs might be helpful (I think adding pr_info() call sites
> or trace_printk() will be adequate to gain some better observability).
No experience with kernel hacking, so I'm not confident I can locate
meaningful places to insert those.
I'll see where some snooping and a bisect get me. Failing that, if
anyone has recommendations on where to add those calls, I'd appreciate
the guidance.
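For reference, the page-boundary condition mentioned in the quoted text can be illustrated in user space. The sketch below is not kernel code and does not claim to locate the bug: the helper name item_straddles_page and the SKETCH_PAGE_SIZE constant are hypothetical, and a fixed 4096-byte page is assumed. It only shows the arithmetic for deciding whether an item of a given length, starting at a given byte offset in a page-backed buffer, spans more than one page.

```c
#include <stddef.h>

/* Assumed page size for illustration only; real kernel code uses PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Hypothetical helper (not a kernel function): returns 1 if an item of
 * `len` bytes starting at byte `offset` within a page-backed buffer
 * occupies more than one page, and 0 otherwise.
 */
static int item_straddles_page(size_t offset, size_t len)
{
    if (len == 0)
        return 0;
    /* Compare the page index of the first byte with that of the last byte. */
    return (offset / SKETCH_PAGE_SIZE) != ((offset + len - 1) / SKETCH_PAGE_SIZE);
}
```

With these assumptions, a 4-byte XDR word at offset 4092 ends exactly at the page boundary and does not straddle it, while an 8-byte item starting at the same offset does. A check along these lines could accompany a temporary pr_info() or trace_printk() call to record which decoded item crosses a boundary.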
TWR