Bug#617666: nfs-kernel-server: Periodic nfsd failure - single nfsd process with high CPU and no mounts working
On 20/03/11 17:20, Luk Claes wrote:
> On 10/03/11 12:54, Debian Bug Tracking System wrote:
>> I have some extra information about this problem - the syslog contains
>> some kernel error messages related to nfs and xfs (the filesystem of the
>> /export partition). I have attached the relevant log section...
>> It could be this is a problem with xfs or even with our hardware raid
>> controller. I have rebooted the machine with /export unmounted and am
>> currently running xfs_repair over it to see if that picks up any problems.
> I guess your xfs_repair finished by now? Did it shed some more light on
> the issue or should we look more closely into the nfs code?
Thanks for getting back to me. My xfs_repair did finish and it found a
few errors, but I'm not sure if they are from hard resetting the machine
or an indication of a more serious hardware error. I am, however,
pretty sure that this is not a purely NFS problem - since the repair
finished, the system has crashed in a couple of different ways. Once it
dumped the kernel to the console and went completely unresponsive and
another time the /export partition unmounted itself and wouldn't remount
(giving I/O errors). In both cases there was no weird NFS process
hanging around (the mounts just became inaccessible as you would expect
them to after such crashes).
At this point I am pretty sure that I have a hardware issue on my hands,
either with bad RAM or my RAID controller. I think we can safely say
NFS is in the clear :) Sorry for wasting your time!