I set up an HTTP HA cluster consisting of 3 nodes.
Node 1 is the gnbd server used for fencing.
Nodes 2 and 3 provide the HTTP HA service.
Suppose the http service is running on node 3.
When the network cable of node 3 was unplugged,
the service failed over to node 2 properly,
but the cman service on node 3 was killed after the cable was plugged back in,
and cman's pid file was still there.
Worse, the cman service cannot be started again,
and node 3 cannot be shut down.
OS: RHEL 5 (kernel 2.6.18-8.el5)
Partial log messages on node 3:
openais: [CPG ] got joinlist message from node 1
openais: [CPG ] got joinlist message from node 2
openais: [CMAN ] cman killed by node 3 for reason 2
gnbd_import: ERROR [../../utils/gnbd_utils.c:78] cman_init failed : Connection refused
gfs_controld: cman_start_notification error -1 104
dlm_controld: cluster is down, exiting
fenced: cluster is down, exiting
fence_node: agent "fence_gnbd" reports: gnbd_import: ERROR cannot get node name : Connection refused gnbd_import: ERROR If you are not planning to use a cluster manager, use -n failed: fence_gnbd, node03
kernel: dlm: closing connection to node 3
fence_node: Fence of "node03" was unsuccessful
kernel: dlm: closing connection to node 2
kernel: dlm: closing connection to node 1
ccsd: Unable to connect to cluster infrastructure after 30 seconds.
ccsd: Unable to connect to cluster infrastructure after 60 seconds.
Any help would be greatly appreciated.