
Massive load average, can't log in



Hi there,

This morning I noticed that my Etch box wouldn't let me start up a web browser window. Odd, I thought, so I took a look at the load average, which was sitting at 210! Some processes such as vi or firefox can no longer launch, but simpler ones such as ps or top can still be run from a regular user terminal. A ps uaxw reveals dozens of crond processes in the all-too-familiar 'D' state, each one with a similarly stalled mrtg child process. I've never even used mrtg on that machine beyond running apt-get install mrtg.
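For anyone wanting to check for the same symptom, here is a minimal sketch of how the 'D' state processes could be listed straight from /proc, which may help when only small programs will still start. It assumes a Linux /proc filesystem and the usual "pid (command) state ..." layout of /proc/<pid>/stat; the function names parse_stat_line and blocked_processes are my own, not part of any tool.

```python
import os


def parse_stat_line(line):
    """Split one /proc/<pid>/stat line into (pid, command, state)."""
    # The command name sits in parentheses and may itself contain spaces
    # or parentheses, so locate the last ')' instead of splitting on spaces.
    open_paren = line.index("(")
    close_paren = line.rindex(")")
    pid = int(line[:open_paren])
    comm = line[open_paren + 1:close_paren]
    state = line[close_paren + 1:].split()[0]
    return pid, comm, state


def blocked_processes(proc_root="/proc"):
    """Return a sorted list of (pid, command) for processes in state 'D'."""
    blocked = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat_line(f.read())
        except (OSError, ValueError):
            continue  # process went away or the line was unreadable
        if state == "D":
            blocked.append((pid, comm))
    return sorted(blocked)


if __name__ == "__main__":
    for pid, comm in blocked_processes():
        print(pid, comm)
```

Run as a regular user, this should print one line per process stuck in uninterruptible sleep, which in my case would presumably be the crond and mrtg entries.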

If I try to su - to kill some processes, that particular terminal goes into an uninterruptible sleep and I have to switch to another one. I get something similar when I try to log in remotely - it never gets to a password prompt. CPU usage is at 0%, memory usage is below 50% and there is no disk activity.

So in light of this I have two questions:

1. Why would the mrtg cron job be stalling? Is there a known problem with this program, or is it looking for some non-existent NFS share?

2. Why can't I log in or start any new large processes? Is there some load average threshold in Debian above which no one is allowed to log in? The high load average here does not reflect high disk, CPU or memory usage, just stalled processes, so there is plenty of computing power available. Perhaps the load average calculation needs to be updated to ignore processes that have been stalled for a period?
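To illustrate the point in question 2: on Linux the load average counts tasks in uninterruptible sleep as well as runnable ones, so a large figure alongside an idle CPU is consistent with stalled processes. Below is a rough sketch of reading /proc/loadavg and comparing the two, assuming the standard five-field layout of that file. The sample numbers, the looks_blocked helper and its threshold are made up for illustration, and the comparison is only a crude heuristic because the load figure is smoothed over time while the runnable count is instantaneous.

```python
def parse_loadavg(text):
    """Parse the contents of /proc/loadavg into a dictionary."""
    # Layout: load1 load5 load15 runnable/total last_pid
    fields = text.split()
    runnable, total = fields[3].split("/")
    return {
        "load1": float(fields[0]),
        "load5": float(fields[1]),
        "load15": float(fields[2]),
        "runnable": int(runnable),
        "total": int(total),
        "last_pid": int(fields[4]),
    }


def looks_blocked(info, threshold=1.0):
    """Crude heuristic (hypothetical): is the 1-minute load well above
    the number of currently runnable tasks? If so, most of the load is
    probably tasks waiting in uninterruptible sleep rather than on the CPU."""
    return info["load1"] - info["runnable"] > threshold


if __name__ == "__main__":
    with open("/proc/loadavg") as f:
        info = parse_loadavg(f.read())
    print(info)
    print("mostly blocked tasks?", looks_blocked(info))
```

With a hypothetical reading such as "210.00 208.50 190.25 1/412 30123", the load is 210 but only one task is runnable, which is roughly the situation on my machine.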

thanks,
Greg


