
runaway crons



I hit my work system remotely this morning to check mail and got

   bash:  fork: resource temporarily unavailable

Ok, that sux.

'exec' and 'sash' are your friends (as well as the fact that sshd
appears to fork to create a session rather than 'exec'ing).   I managed
to get a couple of root shells, list out /proc to get an idea of running
processes, and somewhat arbitrarily started killing high-numbered ones
(only killed my shell once <g>).

   $ exec su -
   > <type the password -- *correctly* or you lose your ssh session>
   $ exec sash      # as root
   $ -ls /proc
   $ -kill <arbitrary list of high-valued PIDs>

I actually had to run through this process several times as my process
table would fill periodically.
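
If sash isn't installed, the /proc listing step can be approximated from a shell that is already running, using only builtins so that nothing has to fork. This is a rough sketch, not what I actually ran: `list_procs` is a made-up helper name, and it assumes a Linux-style /proc where each `/proc/<pid>/stat` starts with `pid (comm) ...`.

```shell
# List "PID command-name" pairs using only shell builtins (no fork needed).
# Assumption: Linux-style /proc, where /proc/<pid>/stat begins "pid (comm) ...".
# The optional argument is the proc root; it defaults to /proc.
list_procs() {
    local root dir pid stat name
    root=${1:-/proc}
    for dir in "$root"/[0-9]*; do
        pid=${dir##*/}
        # A process can exit between the glob and the read; skip it if so.
        { read -r stat < "$dir/stat"; } 2>/dev/null || continue
        name=${stat#*\(}    # drop everything up to the first "("
        name=${name%\)*}    # drop everything from the last ")" onward
        echo "$pid $name"
    done
}
```

`kill` is a builtin in bash as well, so `kill <pid>` on anything picked from that listing also avoids a fork.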

With sufficient processes killed, I could get a top running.  The
following is actually from *after* cleaning up; before that I'd had a
load average of ~240 and some 520+ processes.  No zombies, FWIW.

 10:24am  up 7 days, 11:29,  8 users,  load average: 102.56, 176.03, 195.58
104 processes: 99 sleeping, 4 running, 1 zombie, 0 stopped
CPU states: 89.5% user,  5.6% system,  4.7% nice,  0.0% idle
Mem:  128180K av, 106680K used,  21500K free,  46636K shrd,  24824K buff
Swap: 409536K av,   5740K used, 403796K free                 25856K cached

So it looks like a process count problem rather than out of memory (been
there too).  pstree showed a heck of a lot of cron jobs.  Found and
killed the root cron process, then slaughtered the rest with:

  ps aux | grep CRON | awk '{ print $2 }' | egrep -vx "$$|$PPID" | xargs kill

...a bit crude, but it works.  Incidentally, the command name was
uppercased: "/USR/SBIN/CRON", not "/usr/sbin/cron".  I've seen this before
but never had the runaway problem.  Not sure if a 'killall' would have
worked, but I believe it matches on the exact command name, which I
didn't have.
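
The "don't kill my own shell" part of that pipeline can be pulled out into a small filter. This is only a sketch: `exclude_self` is a hypothetical name, and the `[C]RON` pattern in the usage comment is the usual trick for keeping grep from matching its own entry in the ps output.

```shell
# Read PIDs on stdin, one per line, and drop this shell and its parent.
# -x matches whole lines only, so excluding PID 12 does not also drop 123.
exclude_self() {
    grep -vx -e "$$" -e "$PPID"
}

# Possible usage (not run here):
#   ps aux | grep '[C]RON' | awk '{ print $2 }' | exclude_self | xargs kill
```

Matching whole lines matters because a bare substring match on a short PID would silently protect unrelated processes whose PIDs happen to contain it.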

Anyone seen this or have clues to tracking down the problem?
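
As a starting point for catching it earlier next time, a tally of processes by command name shows quickly which one is piling up. This is a minimal sketch: `count_by_name` is a made-up helper, and it assumes command names arrive one per line on stdin (for example from `ps -e -o comm=`).

```shell
# Count how many times each command name appears on stdin,
# then print "count name" pairs, largest count first.
count_by_name() {
    awk '{ n[$1]++ } END { for (k in n) print n[k], k }' | sort -rn
}

# Possible usage (not run here):
#   ps -e -o comm= | count_by_name | head
```

Run from cron itself or from a watchdog loop, something like this could log the top few entries so there is a record of what was multiplying before the process table filled.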

-- 
Karsten M. Self <kmself@ix.netcom.com>         http://www.netcom.com/~kmself
  Evangelist, Opensales, Inc.                       http://www.opensales.org
   What part of "Gestalt" don't you understand?      Debian GNU/Linux rocks!
     http://gestalt-system.sourceforge.net/      K5: http://www.kuro5hin.org
GPG fingerprint: F932 8B25 5FDD 2528 D595  DC61 3847 889F 55F2 B9B0
