
Bug#1017720: nfs-common: No such file or directory



I have the same issue after adding actimeo=30 to /etc/fstab, rebooting, and testing.
I also confirmed via /proc/mounts that those settings were applied; it shows the snippet below for each mountpoint.
nfs4 rw,relatime,vers=4.1,rsize=131072,wsize=131072,namlen=255,acregmin=30,acregmax=30,acdirmax=30,hard,noresvport,proto=tcp,timeo=600,retrans=2,sec=krb5,clientaddr=X.X.X.X,lookupcache=pos,local_lock=none,addr=Y.Y.Y.Y 0 0
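
For anyone who wants to repeat the same check on their own clients, here is a minimal Python sketch that splits the option string from the /proc/mounts snippet above and prints the attribute cache timeouts. The helper name parse_mount_options is hypothetical, and the sample line is simply copied from this message (device and mountpoint omitted, addresses masked). Note that acdirmin is not listed in the output above; my assumption is that this is because actimeo=30 leaves it at what nfs(5) documents as its default of 30 seconds, so it is not shown, but I have not verified that.

```python
def parse_mount_options(opts):
    """Split a comma-separated mount option string into a dict.

    Options of the form key=value map to the string value;
    bare flags such as "hard" map to True.
    """
    parsed = {}
    for item in opts.split(","):
        key, sep, value = item.partition("=")
        parsed[key] = value if sep else True
    return parsed


# Sample copied from the /proc/mounts snippet in this message.
line = (
    "nfs4 rw,relatime,vers=4.1,rsize=131072,wsize=131072,namlen=255,"
    "acregmin=30,acregmax=30,acdirmax=30,hard,noresvport,proto=tcp,"
    "timeo=600,retrans=2,sec=krb5,clientaddr=X.X.X.X,lookupcache=pos,"
    "local_lock=none,addr=Y.Y.Y.Y 0 0"
)

# Fields here are: filesystem type, options, dump, pass.
fstype, opts, _dump, _pass = line.split()
options = parse_mount_options(opts)

for name in ("acregmin", "acregmax", "acdirmin", "acdirmax"):
    print(name, options.get(name, "not listed"))
```

On a live system the same function could be fed the fourth field of each nfs4 line read from /proc/mounts instead of the hard-coded sample.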

> -----Original Message-----
> From: Jason Breitman
> Sent: Tuesday, August 23, 2022 2:42 PM
> To: Ben Hutchings <ben@decadent.org.uk>; 1017720@bugs.debian.org
> Subject: RE: Bug#1017720: nfs-common: No such file or directory
> 
> What additional information can I provide for us to move forward with this
> process?
> 
> To summarize and include further details, rsync is used to sync applications to
> a file server which behaves like a repository.
> We do preserve timestamps from the build server and also use --delete.  We
> do not run the applications from the file server.  All servers use NTP.
> 
> The application has a sub-directory that contains files with version numbers.
> These are libraries.
> When a new build is complete, a developer pushes their updates via rsync to
> the file server / repository.
> 
> I believe that the dentry cache thinks the "old" files exist and generates a No
> such file or directory error showing question marks for that file's attributes.
> Dropping the dentry cache via echo 2 > /proc/sys/vm/drop_caches resolves
> the issue.
> 
> This behavior is not observed in Debian 10.8 with that distribution's associated
> kernel and packages.
> 
> > -----Original Message-----
> > From: Jason Breitman
> > Sent: Friday, August 19, 2022 9:52 PM
> > To: Ben Hutchings <ben@decadent.org.uk>; 1017720@bugs.debian.org
> > Subject: RE: Bug#1017720: nfs-common: No such file or directory
> >
> > > -----Original Message-----
> > > From: Ben Hutchings <ben@decadent.org.uk>
> > > Sent: Friday, August 19, 2022 7:27 PM
> > > To: Jason Breitman <jbreitman@tildenparkcapital.com>;
> > > 1017720@bugs.debian.org
> > > Subject: Re: Bug#1017720: nfs-common: No such file or directory
> > >
> > > Control: tag -1 moreinfo
> > >
> > > On Fri, 2022-08-19 at 13:16 +0000, Jason Breitman wrote:
> > > > Package: nfs-common
> > > > Version: 1:1.3.4-6
> > > > Severity: important
> > > >
> > > > Kernel: 5.10.0-16-amd64 #1 SMP Debian 5.10.127-1 (2022-06-30) x86_64
> > > > GNU/Linux
> > > >
> > > > -- Description
> > > >     After updating and/or creating new files on our file server via
> > > > rsync, we see many files report the error message below from NFSv4
> > > > clients since upgrading from Debian 10.8 to Debian 11.4.
> > > >     Clearing the dentry cache resolves the issue right away.
> > > >     I am not sure that nfs-common is the package to blame, but listed
> > > > it based on the bug submission recommendations.
> > >
> > > The NFS implementation is mostly in the kernel, so probably this issue
> > > belongs there.  But the kernel team is responsible for both packages.
> > >
> > > [...]
> > > > -- Error message
> > > >     ls: cannot access 'filename': No such file or directory
> > > >     -????????? ? ?    ?            ?            ? filename
> > > [...]
> > >
> > > So we know the file's there but can't stat it.  I think this means the
> > > client has cached the handle of the old file of that name, which has
> > > been deleted.
> > >
> > > - Are client and server clocks closely synchronised?  If not, that
> > > needs to be fixed.
> > >
> > The clocks are synchronized using NTP.
> >
> > > - Are clients likely to read this directory while rsync is running, or
> > > shortly before?  If so, it may help to reduce the attribute caching
> > > timeout on the client.  See the "Directory entry caching" section in
> > > the nfs(5) manual page.
> > >
> > Clients are not likely to read this directory while rsync is running for the
> > observed cases.  That can happen in our environment, but not in this case.
> > I am using the lookupcache=pos option.  I tried noac, but the performance
> > penalty was too much.  Which option are you referring to and what setting
> > do you recommend testing?
> >
> > > I don't know why you're only seeing this after an upgrade of the
> > > clients, though.  I'm not aware that there has been any big change to
> > > attribute caching.
> > >
> > I appreciate you responding to my report and am happy to answer any
> > questions.
> > We have multiple monitors and log scrapers to detect "file not found"
> > exceptions that would let us know if this was happening before.
> > To share more, I have 2 environments mounting from the same file server.
> > Each environment has several servers.  The issue is only seen in the
> > environment running Debian 11.4.
> > I also should have mentioned that the files in question have a version
> > number appended.  filename-1111.  When the file is updated via rsync, it is
> > called filename-1112 and the prior file is removed.  The error is about
> > filename-1111.
> > I am not sure if this is the proper terminology, but the issue appears to be
> > the negative dentry cache.
> >
> > > Ben.
> > >
> > > --
> > > Ben Hutchings
> > > Beware of bugs in the above code;
> > > I have only proved it correct, not tried it. - Donald Knuth
> >
> > Jason Breitman
> Jason Breitman
Jason Breitman
