
How much downtime can we afford for nagios?



Hi,

For a few days the false notifications from nagios went down, but they
have increased again.

Looking at /configs/system/nagios/services/template.cfg shows
that it is configured as
max_check_attempts = 4 and retry_check_interval = 1 for hosts
and
max_check_attempts = 3 and retry_check_interval = 1 for services.

So if a service or host is unreachable for 3 or 4 minutes, we get a
notification. (However, in most cases it is a false positive caused by
network congestion or some other transient problem.)

How about working out a delay we can actually afford before being
notified when a service or host is really down? How about 10 minutes
(5 attempts x 2 minutes)?
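
To make the arithmetic explicit, here is a rough sketch in Python. It
assumes, as I do above, that the delay before a notification is simply
max_check_attempts multiplied by retry_check_interval (in minutes). It
ignores the normal check interval and check timeouts, so treat the
numbers as estimates only. The function name is made up for illustration.

```python
def minutes_until_notification(max_check_attempts, retry_check_interval):
    """Rough estimate used in this mail: attempts x retry interval (minutes).

    Assumption: a notification goes out once all attempts have failed, and
    attempts are spaced retry_check_interval minutes apart. Real scheduling
    in nagios may differ somewhat.
    """
    return max_check_attempts * retry_check_interval


# Current template values mentioned above.
current_hosts = minutes_until_notification(4, 1)
current_services = minutes_until_notification(3, 1)

# Proposed values: 5 attempts, 2 minutes apart.
proposed = minutes_until_notification(5, 2)

print(current_hosts, current_services, proposed)
```

With the current values this gives 4 and 3 minutes; the proposed values
give 10 minutes.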

We could also list which services and hosts are critical and which are
not. That would help us define different notification periods for
different hosts and services.

I had planned to do this after the freeze, but it's becoming too annoying.

Thanks


-- 
Regards,
Susmit.

=============================================
ssh
0x86DD170A
http://www.fedoraproject.org/wiki/SusmitShannigrahi
=============================================

