The following chapter describes the various administrative tasks involved in maintaining a cluster after it has been installed and configured.
Monitoring cluster and service status can help identify and resolve problems in the cluster environment. The following tools assist in displaying cluster status:
- The clustat command
- Log file messages
- The cluster monitoring GUI
Note that status is always from the point of view of the cluster system on which an administrator is running a tool. To obtain comprehensive cluster status, run a tool on all cluster systems.
Cluster and service status includes the following information:
- Cluster member system status
- Power switch status
- Heartbeat channel status
- Service status and which cluster system is running the service or owns the service
- Service monitoring status of the cluster system
The following tables describe how to analyze the status information shown by the clustat command and the cluster GUI.
Table 8-1. Member Status
| Member Status | Description |
|---|---|
| UP | The member system is communicating with the other member system and accessing the quorum partitions. |
| DOWN | The member system is unable to communicate with the other member system. |
Table 8-2. Power Switch Status
| Power Switch Status | Description |
|---|---|
| OK | The power switch is operating properly. |
| Wrn | Could not obtain power switch status. |
| Err | A failure or error has occurred. |
| Good | The power switch is operating properly. |
| Unknown | The other cluster member is DOWN. |
| Timeout | The power switch is not responding to power daemon commands, possibly because of a disconnected serial cable. |
| Error | A failure or error has occurred. |
| None | The cluster configuration does not include power switches. |
| Initializing | The switch is in the process of being initialized and its definitive status has not been concluded. |
Table 8-3. Heartbeat Channel Status
| Heartbeat Channel Status | Description |
|---|---|
| OK | The heartbeat channel is operating properly. |
| Wrn | Could not obtain channel status. |
| Err | A failure or error has occurred. |
| ONLINE | The heartbeat channel is operating properly. |
| OFFLINE | The other cluster member appears to be UP, but it is not responding to heartbeat requests on this channel. |
| UNKNOWN | Could not obtain the status of the other cluster member system over this channel, possibly because the system is DOWN or the cluster daemons are not running. |
Table 8-4. Service Status
| Service Status | Description |
|---|---|
| running | The service resources are configured and available on the cluster system that owns the service. The running state is a persistent state. From this state, a service can enter the stopping state (for example, if the preferred member rejoins the cluster). |
| disabled | The service has been disabled, and does not have an assigned owner. The disabled state is a persistent state. From this state, the service can enter the starting state (if a user initiates a request to start the service). |
| starting | The service is in the process of being started. The starting state is a transient state. The service remains in the starting state until the service start succeeds or fails. From this state, the service can enter the running state (if the service start succeeds), the stopped state (if the service start fails), or the error state (if the status of the service resources cannot be determined). |
| stopping | The service is in the process of being stopped. The stopping state is a transient state. The service remains in the stopping state until the service stop succeeds or fails. From this state, the service can enter the stopped state (if the service stop succeeds) or the running state (if the service stop fails and the service can be started). |
| stopped | The service is not running on any cluster system, does not have an assigned owner, and does not have any resources configured on a cluster system. The stopped state is a persistent state. From this state, the service can enter the disabled state (if a user initiates a request to disable the service), or the starting state (if the preferred member joins the cluster). |
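The transitions in Table 8-4 can be summarized as a small lookup table. The following Python sketch is only an illustration of the table, not part of the cluster software; the names `TRANSITIONS`, `PERSISTENT_STATES`, and `can_transition` are hypothetical. The error state appears in the table only as a target, so no outgoing transitions are assumed for it.

```python
# Allowed service state transitions, as listed in Table 8-4.
# This is an illustrative summary, not the cluster manager's implementation.
TRANSITIONS = {
    "running": {"stopping"},
    "disabled": {"starting"},
    "starting": {"running", "stopped", "error"},
    "stopping": {"stopped", "running"},
    "stopped": {"disabled", "starting"},
}

# States the table describes as persistent; starting and stopping are transient.
PERSISTENT_STATES = {"running", "disabled", "stopped"}


def can_transition(current: str, target: str) -> bool:
    """Return True if Table 8-4 lists target as reachable from current."""
    return target in TRANSITIONS.get(current, set())
```

For example, `can_transition("running", "stopping")` is true, while `can_transition("running", "disabled")` is false because a running service must first pass through the stopping and stopped states.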
To display a snapshot of the current cluster status, invoke the clustat utility. For example:
```
clustat
Cluster Status Monitor (Fileserver Test Cluster)
07:46:05
Cluster alias: clu1alias.boston.redhat.com

=====================  M e m b e r   S t a t u s  =======================
  Member         Status     Node Id    Power Switch
  -------------- ---------- ---------- ------------
  clu1           Up         0          Good
  clu2           Up         1          Good

===================  H e a r t b e a t   S t a t u s  ===================
  Name                           Type       Status
  ------------------------------ ---------- ------------
  clu1         <-->  clu2        network    ONLINE

===================  S e r v i c e   S t a t u s  =======================
                                         Last             Monitor  Restart
  Service       Status   Owner           Transition       Interval Count
  ------------- -------- ------------- ---------------- -------- -------
  nfs1          started  clu1          16:07:42 Feb 27  15       0
  nfs2          started  clu2          00:03:52 Feb 28  2        0
  nfs3          started  clu1          07:43:54 Feb 28  90       0
```
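When status must be checked from a script, the member section of a snapshot like the one above can be picked apart with a few lines of Python. This is a minimal sketch that assumes the four-column member layout shown in the example; the actual clustat layout may differ between releases, and `parse_member_status` and `SAMPLE` are hypothetical names used only for illustration.

```python
# Excerpt of the example snapshot; real output would come from running clustat.
SAMPLE = """\
=====================  M e m b e r   S t a t u s  =======================
  Member         Status     Node Id    Power Switch
  -------------- ---------- ---------- ------------
  clu1           Up         0          Good
  clu2           Up         1          Good
===================  H e a r t b e a t   S t a t u s  ===================
  Name                           Type       Status
  ------------------------------ ---------- ------------
  clu1         <-->  clu2        network    ONLINE
"""


def parse_member_status(clustat_output: str) -> list:
    """Extract member rows from the Member Status section of a snapshot."""
    members = []
    section = None
    for line in clustat_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("="):
            # Banner line: drop '=' and the letter spacing to get the section name.
            section = stripped.replace("=", "").replace(" ", "")
            continue
        if section != "MemberStatus" or not stripped:
            continue
        if stripped.startswith("-") or stripped.startswith("Member"):
            continue  # column header and separator rows
        fields = stripped.split()
        # Assumed layout: member name, status, node id, power switch status.
        if len(fields) == 4 and fields[2].isdigit():
            name, status, node_id, power_switch = fields
            members.append({
                "member": name,
                "status": status,
                "node_id": int(node_id),
                "power_switch": power_switch,
            })
    return members
```

Calling `parse_member_status(SAMPLE)` yields one dictionary per member. Because status reflects the point of view of the system where clustat runs, a script like this would need to be run on each cluster system to build a complete picture.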
To monitor the cluster and display status at specific time intervals, invoke clustat with the -i time command-line option, where time specifies the number of seconds between status snapshots.
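The interval option can also be supplied from a wrapper script. The sketch below only assembles the command line described above (`clustat -i time`) and hands it to the operating system; the function names `clustat_argv` and `watch_cluster` are hypothetical, and the sketch assumes clustat is on the PATH of the cluster system where it runs.

```python
import subprocess


def clustat_argv(interval_seconds=None):
    """Build the clustat command line, optionally with the -i interval option."""
    argv = ["clustat"]
    if interval_seconds is not None:
        if interval_seconds <= 0:
            raise ValueError("interval must be a positive number of seconds")
        argv += ["-i", str(interval_seconds)]
    return argv


def watch_cluster(interval_seconds):
    """Run clustat with the given interval; assumes clustat is installed."""
    return subprocess.run(clustat_argv(interval_seconds), check=False)
```

For example, `clustat_argv(10)` produces the arguments for `clustat -i 10`, which displays a status snapshot every 10 seconds.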