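Types of States

Overview

Because monitoring includes the concept of retries, Nagios defines the state that a host or service is in while its check is being retried as "soft". The state reached when the retries still do not succeed is defined as "hard".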

State Types
Introduction
The current state of services and hosts is determined by two components: the status of the service or
host (e.g. OK, WARNING, UP, DOWN) and the type of state it is in. There are two state types in
Nagios - "soft" states and "hard" states. State types are a crucial part of Nagios’ monitoring logic. They
are used to determine when event handlers are executed and when notifications are sent out.
Service and Host Check Retries
In order to prevent false alarms, Nagios allows you to define how many times a service or host check
will be retried before the service or host is considered to have a real problem. This maximum number
of check attempts is controlled by the <max_check_attempts> option in the service and host
definitions, respectively. The attempt that a service or host check is currently on determines what
type of state it is in. There are a few exceptions to this in the service monitoring logic, but we’ll
ignore those for now. Let’s take a look at the different state types...
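The retry rule above can be sketched in a few lines of Python. This is only an illustrative model, not Nagios source code: the function name problem_state_type is hypothetical, and it assumes attempts are counted from 1 and that the check result is non-OK.

```python
def problem_state_type(current_attempt: int, max_check_attempts: int) -> str:
    """Return the state type for a non-OK check result (simplified model).

    Assumption: attempts are numbered starting at 1, so the result is
    "HARD" once the check has been tried max_check_attempts times.
    """
    if current_attempt < 1 or max_check_attempts < 1:
        raise ValueError("attempt counts must be >= 1")
    if current_attempt >= max_check_attempts:
        return "HARD"
    return "SOFT"


# With max_check_attempts = 3, attempts 1 and 2 are soft and attempt 3 is hard.
for attempt in (1, 2, 3):
    print(attempt, problem_state_type(attempt, 3))
```

Under this model, setting <max_check_attempts> to 1 means a problem is hard on the very first failed check, so no soft state is ever entered.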
Soft States
Soft states occur for services and hosts in the following situations...
When a service or host check results in a non-OK state and it has not yet been (re)checked the
number of times specified by the <max_check_attempts> option in the service or host definition.
Let’s call this a soft error state...
When a service or host recovers from a soft error state. This is considered to be a soft recovery.
Soft State Events
What happens when a service or host is in a soft error state or experiences a soft recovery?
The soft error or recovery is logged if you enabled the log_service_retries or log_host_retries
options in the main configuration file.
Event handlers are executed (if you defined any) to handle the soft error or recovery for the
service or host. (Before any event handler is executed, the $HOSTSTATETYPE$ or
$SERVICESTATETYPE$ macro is set to "SOFT").
Nagios does not send out notifications to any contacts because there is (or was) no "real" problem
with the service or host.
As can be seen, the only important thing that really happens during a soft state is the execution of
event handlers. Using event handlers can be particularly useful if you want to try and proactively fix a
problem before it turns into a hard state. More information can be found in the Nagios event handler documentation.
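As a rough illustration of proactive fixing, an event handler can look at the state type and the attempt number before deciding what to do. The sketch below assumes the handler is given the values of the $SERVICESTATE$, $SERVICESTATETYPE$ and $SERVICEATTEMPT$ macros. The function name choose_action, the returned action labels, and the choice of acting on the third soft attempt are all hypothetical.

```python
def choose_action(state: str, state_type: str, attempt: int,
                  restart_on_soft_attempt: int = 3) -> str:
    """Decide what a hypothetical event handler should do.

    state       -- value of $SERVICESTATE$ (e.g. "OK", "CRITICAL")
    state_type  -- value of $SERVICESTATETYPE$ ("SOFT" or "HARD")
    attempt     -- value of $SERVICEATTEMPT$ (1-based attempt number)
    """
    if state != "CRITICAL":
        # Nothing to repair for OK, WARNING or UNKNOWN in this sketch.
        return "none"
    if state_type == "SOFT":
        # Give the service a chance to recover on its own, then try a
        # restart shortly before the problem would turn into a hard state.
        return "restart" if attempt == restart_on_soft_attempt else "wait"
    if state_type == "HARD":
        # Last resort: the soft-state attempt did not help (or never ran).
        return "restart"
    return "none"


print(choose_action("CRITICAL", "SOFT", 1))  # still retrying
print(choose_action("CRITICAL", "SOFT", 3))  # try to fix it now
```

A real handler would replace the returned label with an actual repair command. Which attempt to act on depends on the <max_check_attempts> value of the service.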
Hard States
Hard states occur for services in the following situations (hard host states are discussed later)...
When a service check results in a non-OK state and it has been (re)checked the number of times
specified by the <max_check_attempts> option in the service definition. This is a hard error state.
When a service recovers from a hard error state. This is considered to be a hard recovery.
When a service check results in a non-OK state and its corresponding host is either DOWN or
UNREACHABLE. This is an exception to the general monitoring logic, but makes perfect sense.
If the host isn’t up why should we try and recheck the service?
Hard states occur for hosts in the following situations...
When a host check results in a non-OK state and it has been (re)checked the number of times
specified by the <max_check_attempts> option in the host definition. This is a hard error state.
When a host recovers from a hard error state. This is considered to be a hard recovery.
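The soft and hard rules for services, including recoveries and the host-down exception, can be combined into a small state tracker. This is a simplified sketch under stated assumptions, not the real Nagios scheduler. The class ServiceState and its process method are hypothetical names, and the model assumes that a recovery takes the type of the error state it recovers from and that the next OK result after a soft recovery is a hard OK.

```python
from dataclasses import dataclass


@dataclass
class ServiceState:
    """Tracks status and state type of one service (simplified model)."""
    max_check_attempts: int
    status: str = "OK"
    state_type: str = "HARD"
    attempt: int = 1

    def process(self, new_status: str, host_up: bool = True) -> str:
        """Feed one check result and return the resulting state type."""
        was_ok = self.status == "OK"
        if new_status == "OK":
            if was_ok:
                # OK following OK: a plain hard OK state.
                self.state_type = "HARD"
            # Otherwise this is a recovery: it keeps the type (SOFT or
            # HARD) of the error state it recovers from.
            self.attempt = 1
        else:
            already_hard_problem = (not was_ok) and self.state_type == "HARD"
            if was_ok:
                self.attempt = 1
            else:
                self.attempt = min(self.attempt + 1, self.max_check_attempts)
            if (already_hard_problem
                    or not host_up  # exception: host DOWN or UNREACHABLE
                    or self.attempt >= self.max_check_attempts):
                self.state_type = "HARD"
            else:
                self.state_type = "SOFT"
        self.status = new_status
        return self.state_type


svc = ServiceState(max_check_attempts=3)
for result in ["CRITICAL", "CRITICAL", "CRITICAL", "OK"]:
    print(result, svc.process(result))
```

The host_up flag models the exception described above: when the host is not up, a failing service check goes straight to a hard state without further retries.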
Hard State Changes
Before I discuss what happens when a host or service is in a hard state, you need to know about hard
state changes. Hard state changes occur when a service or host...
changes from a hard OK state to a hard non-OK state
changes from a hard non-OK state to a hard OK state
changes from a hard non-OK state of some kind to a hard non-OK state of another kind (e.g. from
a hard WARNING state to a hard UNKNOWN state)
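The three kinds of hard state change listed above can be expressed as a comparison of the previous and current hard status. The following is a minimal sketch: the function name hard_state_change and the labels it returns are hypothetical, and both arguments are assumed to already be hard states.

```python
from typing import Optional

# Statuses treated as "OK" for services and hosts respectively.
OK_STATUSES = {"OK", "UP"}


def hard_state_change(previous: str, current: str) -> Optional[str]:
    """Classify the transition between two hard states.

    Returns "problem", "recovery", "problem changed", or None when no
    hard state change has occurred.
    """
    if previous == current:
        return None
    prev_ok = previous in OK_STATUSES
    cur_ok = current in OK_STATUSES
    if prev_ok and not cur_ok:
        return "problem"           # hard OK -> hard non-OK
    if not prev_ok and cur_ok:
        return "recovery"          # hard non-OK -> hard OK
    if not prev_ok and not cur_ok:
        return "problem changed"   # e.g. hard WARNING -> hard UNKNOWN
    return None                    # both statuses are OK-like


print(hard_state_change("OK", "CRITICAL"))
print(hard_state_change("WARNING", "UNKNOWN"))
print(hard_state_change("CRITICAL", "CRITICAL"))
```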
Hard State Events
What happens when a service or host is in a hard error state or experiences a hard recovery? Well, that
depends on whether or not a hard state change (as described above) has occurred.
If a hard state change has occurred and the service or host is in a non-OK state, the following things
will occur...
The hard service or host problem is logged.
Event handlers are executed (if you defined any) to handle the hard problem for the service or
host. (Before any event handler is executed, the $HOSTSTATETYPE$ or $SERVICESTATETYPE$
macro is set to "HARD").
Contacts will be notified of the service or host problem (if the notification logic allows it).
If a hard state change has occurred and the service or host is in an OK state, the following things will
occur...
The hard service or host recovery is logged.
Event handlers are executed (if you defined any) to handle the hard recovery for the service or
host. (Before any event handler is executed, the $HOSTSTATETYPE$ or $SERVICESTATETYPE$
macro is set to "HARD").
Contacts will be notified of the service or host recovery (if the notification logic allows it).
If a hard state change has NOT occurred and the service or host is in a non-OK state, the following
things will occur...
Contacts will be re-notified of the service or host problem (if the notification logic allows it).
If a hard state change has NOT occurred and the service or host is in an OK state, nothing happens.
This is because the service or host was already in an OK state the last time it was checked.
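The four cases above form a small decision table, which the following sketch spells out. It is an illustration only: the function hard_state_actions and the action strings are hypothetical, and whether a notification is really sent still depends on the notification logic, which is not modelled here.

```python
# Statuses treated as "OK" for services and hosts respectively.
OK_STATUSES = {"OK", "UP"}


def hard_state_actions(previous_hard: str, current_hard: str) -> list:
    """List what happens for a hard state, given the previous hard state."""
    changed = previous_hard != current_hard
    is_ok = current_hard in OK_STATUSES
    if changed and not is_ok:
        return ["log problem",
                "run event handlers (state type HARD)",
                "notify contacts of problem"]
    if changed and is_ok:
        return ["log recovery",
                "run event handlers (state type HARD)",
                "notify contacts of recovery"]
    if not is_ok:
        # No hard state change, still a problem: contacts may be re-notified.
        return ["re-notify contacts of problem"]
    # No hard state change and still OK: nothing happens.
    return []


print(hard_state_actions("OK", "CRITICAL"))
print(hard_state_actions("CRITICAL", "CRITICAL"))
print(hard_state_actions("OK", "OK"))
```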


Last updated: 2005-12-18 00:45
