I hit my work system remotely this morning to check mail and got
bash: fork: resource temporarily unavailable
Ok, that sux.
'exec' and 'sash' are your friends here: 'exec' replaces the current
shell instead of forking, and sash's built-in commands (the ones
prefixed with '-') don't need a new process either. It also helps that
sshd appears to fork to create a session rather than 'exec'ing. I
managed to get a couple of root shells, list out /proc to get an idea of
running processes, and somewhat arbitrarily started killing
high-numbered ones (only killed my shell once <g>).
$ exec su -
> <type the password -- *correctly* or you lose your ssh session>
$ exec sash # as root
$ -ls /proc
$ -kill <arbitrary list of high-valued PIDs>
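If sash isn't installed, much the same listing can be had from the
shell's own builtins, since globbing, 'read' and 'printf' don't fork in
bash. This is only a sketch: the function name list_procs and its
optional proc-root argument are made up for illustration, and the
per-process 'comm' file is assumed to exist (it is missing on older
kernels, in which case the name prints as '?').

```shell
# Print "PID NAME" for each process, using only shell builtins (no fork).
# The optional argument is the proc root; it defaults to /proc.
list_procs() {
    procroot=${1:-/proc}
    for dir in "$procroot"/[0-9]*; do
        [ -d "$dir" ] || continue    # glob matched nothing, or process exited
        pid=${dir##*/}
        name=
        # comm holds the short command name; it is absent on older kernels.
        if [ -r "$dir/comm" ]; then
            { read -r name < "$dir/comm"; } 2>/dev/null || name=
        fi
        printf '%s %s\n' "$pid" "${name:-?}"
    done
}
```

Running 'list_procs' with no argument walks /proc. Nothing is killed
here; it only prints candidates.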
I actually had to run through this process several times as my process
table would fill periodically.
With sufficient processes killed, I could get a top running. What
follows is actually from *after* cleaning up; before that I'd had a load
average of ~240 and some 520+ processes. No zombies, FWIW.
10:24am up 7 days, 11:29, 8 users, load average: 102.56, 176.03, 195.58
104 processes: 99 sleeping, 4 running, 1 zombie, 0 stopped
CPU states: 89.5% user, 5.6% system, 4.7% nice, 0.0% idle
Mem: 128180K av, 106680K used, 21500K free, 46636K shrd, 24824K buff
Swap: 409536K av, 5740K used, 403796K free 25856K cached
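Once forking works again, a tally of processes by command name is one
quick way to see what is filling the table. A rough sketch, assuming a
ps that accepts POSIX-style -o output formats; the function name
tally_commands is invented here.

```shell
# Tally command names read from stdin, most frequent first.
tally_commands() {
    sort | uniq -c | sort -rn
}

# Usage (prints the ten most common command names):
#   ps -e -o comm= | tally_commands | head
```

A single command name with a count in the hundreds points at the
runaway.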
So it looks like a process count problem rather than out of memory (been
there too). pstree showed a heck of a lot of cron jobs. Found and
killed the root cron process, then slaughtered the rest with:
ps aux | grep CRON | awk '{ print $2 }' | egrep -v "^($$|$PPID)$" | xargs kill
...a bit crude, but it works. Incidentally, the command showed up
uppercased in the process list: "/USR/SBIN/CRON", not "/usr/sbin/cron".
I've seen this before but never had the runaway problem. Not sure if a
'killall' would have worked, but I believe it matches on the exact
command name, which I didn't have.
Anyone seen this or have clues to tracking down the problem?
--
Karsten M. Self <kmself@ix.netcom.com> http://www.netcom.com/~kmself
Evangelist, Opensales, Inc. http://www.opensales.org
What part of "Gestalt" don't you understand? Debian GNU/Linux rocks!
http://gestalt-system.sourceforge.net/ K5: http://www.kuro5hin.org
GPG fingerprint: F932 8B25 5FDD 2528 D595 DC61 3847 889F 55F2 B9B0