
Bug#837907: more on NFS client hangs



I have some more information about [what I believe to be] this problem.

We've had similar incidents from several clients, running kernels 3.16.{36,39}
and 4.9 (jessie-backports). I think this rules out a client hardware issue.

The trigger (from the client's perspective) seems to be loss of contact with
the NFS server. The incidents are almost always preceded by one or more

nfs: server <name> not responding, still trying

log entries. Sometimes there is a known server-side explanation (e.g.,
nfsd thread exhaustion), but not always. In any case, the effects persist
well after communication with the server has recovered; "reboot -f" seems
to be necessary for client recovery, as sync() also hangs indefinitely.
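To correlate hangs with server outages, the kernel log can be scanned for
these messages. The following is a minimal sketch, not a tested tool: it
assumes the log text is available as lines (e.g. from dmesg or kern.log) and
that recovery is reported as "nfs: server <name> OK". The function names are
made up for illustration.

```python
import re

# Kernel messages for loss of contact and for recovery.
NOT_RESPONDING = re.compile(r"nfs: server (\S+) not responding, still trying")
RECOVERED = re.compile(r"nfs: server (\S+) OK")


def outage_events(lines):
    """Return (line_number, server, state) tuples in log order.

    state is "down" for a "not responding" message, "ok" for a recovery.
    """
    events = []
    for number, line in enumerate(lines, 1):
        match = NOT_RESPONDING.search(line)
        if match:
            events.append((number, match.group(1), "down"))
            continue
        match = RECOVERED.search(line)
        if match:
            events.append((number, match.group(1), "ok"))
    return events


def unrecovered(events):
    """Return servers whose most recent event is still "down"."""
    last_state = {}
    for _, server, state in events:
        last_state[server] = state
    return sorted(s for s, state in last_state.items() if state == "down")
```

Passing sys.stdin to outage_events() while piping in the kernel log gives a
timeline to compare against the time each hang was first noticed.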

Kernel stack traces on the client vary, as do the affected files and
applications; the issue is by no means limited to Firefox or sqlite.
If desired, I can submit a selection of stack traces (as one bug or as several).

I'm looking for suggestions on how to debug this. I'm thinking of turning on
logging with rpcdebug on the most frequently affected clients, to better
understand the trigger. Is there anything else I should be looking at?
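
One thing that could be captured alongside the rpcdebug output is a list of
the tasks stuck in uninterruptible sleep at the time of an incident. A rough
sketch follows, assuming a Linux /proc layout; it only inspects each process's
main thread, and the function names are hypothetical.

```python
import os


def parse_stat(text):
    """Split the start of a /proc/<pid>/stat line into (pid, comm, state)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so locate the last ')' instead of splitting on whitespace.
    open_paren = text.index("(")
    close_paren = text.rindex(")")
    pid = int(text[:open_paren])
    comm = text[open_paren + 1:close_paren]
    state = text[close_paren + 1:].split()[0]
    return pid, comm, state


def blocked_tasks(proc_root="/proc"):
    """Return sorted (pid, comm) pairs for processes in state D."""
    tasks = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, comm, state = parse_stat(handle.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited, or the file was unreadable
        if state == "D":
            tasks.append((pid, comm))
    return tasks if not tasks else sorted(tasks)
```

Calling blocked_tasks() periodically and logging the result with a timestamp
would show which processes blocked first relative to the "not responding"
messages.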

