Re: Distributed monitoring

To: debian-isp@lists.debian.org
Cc: debian-isp@lists.debian.org
Subject: Re: Distributed monitoring
From: Thomas Goirand <thomas@goirand.fr>
Date: Sun, 29 Mar 2009 08:09:01 +0800
Message-id: <49CEBC1D.9000805@goirand.fr>
In-reply-to: <200903282228.07607.jesus.navarro@undominio.net>
References: <49CE59DA.5040505@goirand.fr> <200903282228.07607.jesus.navarro@undominio.net>

Jesús M. Navarro wrote:
> On Saturday 28 March 2009 18:09:46 Thomas Goirand wrote:
>> Hi *,
>>
>> I've read about distributed monitoring, and that Nagios can do it.
>> Then I saw this:
>>
>> http://nagios.sourceforge.net/docs/1_0/images/distributed.png
>>
>> and was disappointed. This is not quite what I want. Let me explain.
> 
> [you are flooded by false-positive alerts]
> 
> You should really take more time to learn the ins and outs of Nagios.
> Probably all of them can be avoided with the topology suggested in the
> docs.
> 
> You need to study host parent relationships, service dependencies,
> state change notifications and contact definitions.
> 
> What you want is to have alerts on a 24x7 basis, but...
> 
> Express host parentship.  In a topology such as A->B->C, where A is
> your Nagios central server, C the monitored host and B some
> intermediate node (say, a Nagios satellite or a router), if B fails,
> A will know that C is not necessarily DOWN but only UNREACHABLE.
> 
> Express service dependencies.  Say you are monitoring some internals
> on a remote host by means of NRPE.  With proper service dependencies
> in place, if the remote NRPE daemon dies, Nagios will know that this
> doesn't mean the dependent services are failing and will mark them
> properly as UNKNOWN.
> 
> Declare proper notification options on your contacts.  Given the
> above, you don't want to be notified by SMS on UNREACHABLE or UNKNOWN
> states, only on properly detected DOWN, CRITICAL or RECOVERY states.
> So define a contact that will only be notified with
> "host_notification_options d,r" and "service_notification_options c,r"
> (where d==DOWN, c==CRITICAL and r==RECOVERY).
> 
> Remember that the nearer the Nagios monitoring node (be it a central
> server or a local satellite) is to the tested hosts and services, the
> better it will avoid false positives and negatives.
> 
> All in all, you will probably receive more and better feedback on the
> Nagios users mailing list than on this one.  Other people have managed
> to deploy Nagios multisite over bad-quality links and found ways not
> to be flooded with alerts in the middle of the night (me, for one).
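
For reference, the three points quoted above might look roughly like
this as Nagios object definitions. This is only a sketch: the host
names, the address, the service descriptions, the "generic-host" and
"generic-contact" templates, the "24x7" time period and the SMS
notification commands are all assumed for illustration and would have
to match what a real setup defines.

```
# Host parentship: host-c is reached through router-b, so if router-b
# goes DOWN, host-c is reported as UNREACHABLE rather than DOWN.
define host {
    use                             generic-host    ; assumed template
    host_name                       host-c          ; hypothetical name
    address                         192.0.2.30      ; documentation address
    parents                         router-b        ; hypothetical parent
}

# Service dependency: "Disk Space" is checked through NRPE, so do not
# notify about it while the NRPE service itself is in a non-OK state.
# Both services are assumed to be defined elsewhere.
define servicedependency {
    host_name                       host-c
    service_description             NRPE
    dependent_host_name             host-c
    dependent_service_description   Disk Space
    execution_failure_criteria      n       ; always run the check
    notification_failure_criteria   w,u,c   ; suppress notifications
}

# Contact that is only paged on DOWN/CRITICAL and on recoveries.
define contact {
    use                             generic-contact ; assumed template
    contact_name                    oncall-sms      ; hypothetical name
    host_notification_period        24x7            ; assumed time period
    service_notification_period     24x7
    host_notification_options       d,r
    service_notification_options    c,r
    host_notification_commands      notify-host-by-sms      ; hypothetical
    service_notification_commands   notify-service-by-sms   ; hypothetical
}
```

With definitions along these lines, UNREACHABLE hosts and UNKNOWN
services should never reach the SMS contact.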

First of all, yes, we have implemented the topology and dependencies,
and we already do not receive notifications for UNKNOWN states. But
this doesn't seem to be enough...

I guess the only way would be to set up 2 Nagios instances in each Xen
server location, but that is quite a pain to maintain. I was hoping
for an out-of-the-box magical solution that would be easier to deploy.
If we do choose this way, does it have to be a "full" Nagios setup in
each location? Or is it a kind of plugin or client?

Thomas
