
[Linux-cluster] rgmanager or clustat problem



 
I am running a four-node GFS cluster with about 20 services per node. All four nodes belong to the same failover domain, and each has a priority of 1. My shared storage is an iSCSI SAN.
 
After rgmanager has been running for a couple of days, clustat produces the following output on all four nodes:

Timed out waiting for a response from Resource Group Manager
Member Status: Quorate

  Member Name                              Status
  ------ ----                              ------
  node01                                   Online, rgmanager
  node02                                   Online, Local, rgmanager
  node03                                   Online, rgmanager
  node04                                   Online, rgmanager

I also get a timeout when I try to check the status of a particular service with "clustat -s servicename".

All of the services seem to be up and running, but clustat does not work. Is something wrong with the cluster? Is there a way for me to increase the timeout?
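
To narrow down when the timeout starts, the clustat call could be timed periodically on each node. The following is only a minimal Python sketch: the helper names time_command and rgmanager_timed_out are hypothetical, the 60-second limit is an arbitrary assumption, and the only clustat behavior it relies on is the error line quoted above.

```python
import subprocess
import time

# Error line copied from the clustat output quoted in this post.
RGMANAGER_TIMEOUT_MSG = "Timed out waiting for a response from Resource Group Manager"


def time_command(cmd, timeout_s=60.0):
    """Run cmd and return (elapsed_seconds, returncode, stdout).

    returncode is None if the command had to be killed after timeout_s.
    """
    start = time.monotonic()
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
        return time.monotonic() - start, proc.returncode, proc.stdout
    except subprocess.TimeoutExpired as exc:
        out = exc.stdout or ""
        if isinstance(out, bytes):  # partial output may arrive as bytes
            out = out.decode(errors="replace")
        return time.monotonic() - start, None, out


def rgmanager_timed_out(stdout):
    """Return True if the output contains the rgmanager timeout line."""
    return RGMANAGER_TIMEOUT_MSG in stdout
```

For example, calling time_command(["clustat"]) from a cron job and logging the elapsed time together with rgmanager_timed_out(stdout) would show whether the response time degrades gradually or fails suddenly.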

On node02, clurgmgrd and dlm_recvd seem to be using a lot of CPU: about 40 and 60 percent, respectively.

Thank you for your help.

Output of "cman_tool services" on each node:

NODE01:

Service          Name                              GID LID State     Code
Fence Domain:    "default"                           4   2 run       -
[1 3 2 4]

DLM Lock Space:  "clvmd"                             1   3 run       -
[1 3 2 4]

DLM Lock Space:  "Magma"                             3   5 run       -
[1 3 2 4]

DLM Lock Space:  "gfslv"                             5   6 run       -
[2 1 3 4]

GFS Mount Group: "gfslv"                             6   7 run       -
[2 1 3 4]

User:            "usrm::manager"                     2   4 run       -
[1 3 2 4]

NODE02:
Service          Name                              GID LID State     Code
Fence Domain:    "default"                           4   5 run       -
[1 3 2 4]

DLM Lock Space:  "clvmd"                             1   1 run       -
[1 3 2 4]

DLM Lock Space:  "Magma"                             3   3 run       -
[1 3 2 4]

DLM Lock Space:  "gfslv"                             5   6 run       -
[1 4 2 3]

GFS Mount Group: "gfslv"                             6   7 run       -
[1 4 2 3]

User:            "usrm::manager"                     2   2 run       -
[1 3 2 4]

NODE03:
Service          Name                              GID LID State     Code
Fence Domain:    "default"                           4   2 run       -
[1 2 3 4]

DLM Lock Space:  "clvmd"                             1   3 run       -
[1 2 3 4]

DLM Lock Space:  "Magma"                             3   5 run       -
[1 2 3 4]

DLM Lock Space:  "gfslv"                             5   6 run       -
[1 2 4 3]

GFS Mount Group: "gfslv"                             6   7 run       -
[1 2 4 3]

User:            "usrm::manager"                     2   4 run       -
[1 2 3 4]

NODE04:
Service          Name                              GID LID State     Code
Fence Domain:    "default"                           4   2 run       -
[1 2 3 4]

DLM Lock Space:  "clvmd"                             1   3 run       -
[1 2 3 4]

DLM Lock Space:  "Magma"                             3   5 run       -
[1 2 3 4]

DLM Lock Space:  "gfslv"                             5   6 run       -
[1 4 2 3]

GFS Mount Group: "gfslv"                             6   7 run       -
[1 4 2 3]

User:            "usrm::manager"                     2   4 run       -
[1 2 3 4]
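
Listings like the ones above can be checked mechanically for groups that are not in the "run" state or that are missing members. The following is a rough Python sketch that assumes the output format is exactly as pasted here (a group line followed by a bracketed member list); parse_services and find_problems are hypothetical helper names, not part of any cluster tool.

```python
import re

# Group line, e.g.: DLM Lock Space:  "clvmd"   1   3 run   -
GROUP_RE = re.compile(
    r'^(?P<type>[^:]+):\s+"(?P<name>[^"]+)"\s+(?P<gid>\d+)\s+(?P<lid>\d+)'
    r'\s+(?P<state>\S+)\s+(?P<code>\S+)\s*$'
)
# Member line, e.g.: [1 3 2 4]
MEMBERS_RE = re.compile(r'^\[(?P<ids>[\d\s]+)\]\s*$')


def parse_services(text):
    """Parse one node's listing into a list of group dicts."""
    groups = []
    for raw in text.splitlines():
        line = raw.strip()
        m = GROUP_RE.match(line)
        if m:
            groups.append({"type": m["type"], "name": m["name"],
                           "state": m["state"], "members": set()})
            continue
        m = MEMBERS_RE.match(line)
        if m and groups:
            # The member list belongs to the group line just before it.
            groups[-1]["members"] = {int(x) for x in m["ids"].split()}
    return groups


def find_problems(groups, expected_members):
    """Return descriptions of groups not in 'run' or with unexpected members."""
    problems = []
    for g in groups:
        label = '%s "%s"' % (g["type"], g["name"])
        if g["state"] != "run":
            problems.append("%s is in state %s" % (label, g["state"]))
        if g["members"] != expected_members:
            problems.append("%s has members %s" % (label, sorted(g["members"])))
    return problems
```

Calling find_problems(parse_services(text), {1, 2, 3, 4}) on each node's listing returns an empty list when every group is in the "run" state with all four members. That is the case for the output above: only the order of the member IDs differs between nodes.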

