
Re: NOC scripting



On Tue, Jan 23, 2001 at 03:37:25PM +0100, Stephane Bortzmeyer wrote:
> On Tuesday 23 January 2001, at 10 h 26, the keyboard of Michael Boman 
> <michael@elinux.com.sg> wrote:
> 
> > you may ask? Well, what if the
> > router/switch/firewall/another-single-point-of-failure between your
> > monitoring server and the rest of the network goes down? BB will scream
> > that every server/router (etc) you have on the other side is down, while
> > NetSaint understands that it's the router/firewall/etc that is down.
> 
> mon <http://www.kernel.org/software/mon/> does the same, with its
> useful feature 'depend':
> 
> watch kata
>    service http
>         interval 2m
>         monitor http.monitor
>         depend kata:ping
>         ^^^^^^^^^^^^^^^^
> If the machine does not reply to pings, there is no need to test
> Apache.

Yes, and mon is very, very nice.  I have a 'main-mon' instance running on
one machine that monitors everything from pingability to router port
status (via SNMP) to MySQL status, RADIUS server status, disk space,
etc., with a simple dependency setup so that if a router port goes down
(did I mention MCI sucks?), I don't get alerted about remote systems
being unreachable or ssh failing on them, just about the router port.
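
To make the dependency idea concrete, here is a toy model in Python.  It
is only a sketch of the concept, not mon's actual implementation: the
function name, the "host:service" labels, and the dictionary layout are
all made up for illustration.

```python
def alerts_to_send(failed, depends_on):
    """Return the failed services that should actually page someone.

    failed     -- iterable of "host:service" names currently failing
    depends_on -- dict mapping a service to the services it depends on
                  (hypothetical layout, loosely modelled on 'depend')
    A failure is suppressed when any service it depends on is also failing.
    """
    failed = set(failed)
    return sorted(
        svc for svc in failed
        if not any(dep in failed for dep in depends_on.get(svc, ()))
    )


# Hypothetical setup: remote checks sit behind one router port.
depends_on = {
    "remote1:ping": ["router:port"],
    "remote1:ssh": ["remote1:ping"],
}
failed = ["router:port", "remote1:ping", "remote1:ssh"]
print(alerts_to_send(failed, depends_on))
```

With the router port down, only "router:port" is reported; the ping and
ssh failures behind it stay quiet.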

Because I'm paranoid, I have a 'mini-mon' running on another machine
that just makes sure 'main-mon' is running.   (Okay, it's never failed,
but the machine 'mini-mon' is running on is sorta flaky and makes me
paranoid....)

It's simple to use, simple to write your own monitors for, and flexible
(use m4 for configs!).  It can be accessed from the command line ('monshow'
or 'moncmd') or from a web interface ('monshow' or 'mon.cgi'); which
one I use depends on where I am.
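
For a sense of how little a custom monitor needs, here is a minimal
sketch in Python.  It assumes the usual mon convention as I understand
it: the hosts to check arrive as command-line arguments, a non-zero exit
status means failure, and the first line of output is the summary
(typically the failed hosts).  The port number, the timeout, and the
function names are illustrative assumptions, not anything shipped with
mon.

```python
import socket
import sys


def tcp_ok(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def run_monitor(hosts, probe):
    """Probe each host; return (exit_status, summary_line)."""
    failed = [h for h in hosts if not probe(h)]
    return (1 if failed else 0), " ".join(failed)


# Only act when hosts were passed on the command line.
if __name__ == "__main__" and sys.argv[1:]:
    # Port 80 is an assumed example; a real monitor would take it as an option.
    status, summary = run_monitor(sys.argv[1:], lambda h: tcp_ok(h, 80))
    print(summary)
    sys.exit(status)
```

Keeping the probe as a separate function means the same skeleton can be
reused for other checks by swapping in a different probe.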

Oh, and the way cool feature: 'acks' of alerts.  You can say 'damn, MCI
sucks again', ack the page, and mon won't page you again about that
outage.  Once the service comes back up, alerting is re-enabled on its own
(i.e., it's "disable this until it's fixed, then re-enable it", so you
don't have to remember to do it).  Jim Trocki has a cool pager that he
can use to ack pages without actually logging in.
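
The ack behaviour boils down to a small state machine.  The sketch below
models it in Python under that reading; the class and method names are
hypothetical, and this is not how mon itself is written.

```python
class AckState:
    """Toy model of an acknowledgeable alert for one service."""

    def __init__(self):
        self.failing = False
        self.acked = False

    def ack(self):
        """Acknowledge the current outage; no effect if nothing is failing."""
        if self.failing:
            self.acked = True

    def report(self, ok):
        """Record a check result; return True if a page should go out."""
        if ok:
            # Recovery clears the ack, so the next outage pages again.
            self.failing = False
            self.acked = False
            return False
        self.failing = True
        return not self.acked


svc = AckState()
pages = [svc.report(False)]       # outage starts: page
svc.ack()                         # "damn, MCI sucks again"
pages.append(svc.report(False))   # still down, but acked: quiet
pages.append(svc.report(True))    # back up: ack is cleared
pages.append(svc.report(False))   # new outage: page again
print(pages)
```

The point of the design is that the ack is tied to the outage and not to
the service, so nobody has to remember to turn alerting back on.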

Mon is what lets me sleep at night.


