
Re: [Linux-cluster] Fencing required for node failover





On Thu, Dec 29, 2011 at 5:06 PM, Digimer <linux alteeve com> wrote:
On 12/29/2011 04:49 PM, Achint Mehta wrote:
> Hi All,
>
> I am using RHCS in RHEL 6.2.
>
> I am trying to perform a failover for a node in the cluster.
> All the services have fail-over configured on them with recovery method
> set to relocate.
> When the node goes down the services are not relocated to other nodes.
>
> Though the node failure is detected by rgmanager:
> ------
> Dec 29 16:20:57 rgmanager State change: pcs_linuxha_1 DOWN
> Dec 29 16:28:25 rgmanager Status Child Max set to 7
> ------
> and fenced has the following logs:
> ------
> Dec 29 16:21:04 fenced fencing node pcs_linuxha_1
> Dec 29 16:21:04 fenced fence pcs_linuxha_1 dev 0.0 agent none result: error no method
> Dec 29 16:21:04 fenced fence pcs_linuxha_1 failed
> ------
>
> 1. Do I require fencing to be enabled to make node failover work?
> 2. If yes, what kind of fencing device should I add? (All the nodes are
> simple servers.)
>
>
> Thanks!
> Achint

Yes, you absolutely need fencing.

As soon as a node is lost, fenced informs dlm, which then stops
providing locks. Only when the fence succeeds is dlm informed, after
which it resumes issuing locks. In turn, rgmanager uses dlm, so with
dlm not providing locks, rgmanager can't recover services.
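
To make that dependency chain concrete, here is a minimal toy model in Python. It is only a sketch of the ordering described above, not real RHCS code: the ToyCluster class, its method names, and the second node name pcs_linuxha_2 are all made up for illustration. A node with no fence method stands in for the "error no method" result in the log.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCluster:
    """Toy model of the fenced -> dlm -> rgmanager ordering (hypothetical, not RHCS code)."""
    # node name -> callable that returns True when the fence action succeeds
    fence_methods: dict = field(default_factory=dict)
    # service name -> node currently owning it
    services: dict = field(default_factory=dict)
    locks_blocked: bool = False

    def node_lost(self, node: str) -> None:
        # Lock traffic is frozen as soon as the node is declared lost.
        self.locks_blocked = True
        method = self.fence_methods.get(node)
        # A missing method models the "error no method" case: the fence
        # never succeeds, so locks are never released.
        if method is not None and method():
            self.locks_blocked = False

    def recover_services(self, lost_node: str, target: str) -> list:
        # Recovery needs locks, so nothing is relocated while they are frozen.
        if self.locks_blocked:
            return []
        moved = [s for s, owner in self.services.items() if owner == lost_node]
        for s in moved:
            self.services[s] = target
        return moved


# No fence method configured: the service stays stuck on the dead node.
unfenced = ToyCluster(services={"web": "pcs_linuxha_1"})
unfenced.node_lost("pcs_linuxha_1")
print(unfenced.recover_services("pcs_linuxha_1", "pcs_linuxha_2"))

# A fence method that reports success: recovery can proceed.
fenced = ToyCluster(
    fence_methods={"pcs_linuxha_1": lambda: True},
    services={"web": "pcs_linuxha_1"},
)
fenced.node_lost("pcs_linuxha_1")
print(fenced.recover_services("pcs_linuxha_1", "pcs_linuxha_2"))
```

The first call relocates nothing because the fence never succeeded; the second relocates the service once the (pretend) fence method returns success.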

See this for a more detailed explanation:

https://alteeve.com/w/2-Node_Red_Hat_KVM_Cluster_Tutorial#Concept.3B_Fencing

--
Digimer
E-Mail:              digimer alteeve com
Freenode handle:     digimer
Papers and Projects: http://alteeve.com
Node Assassin:       http://nodeassassin.org
"omg my singularity battery is dead again.
stupid hawking radiation." - epitron

Thanks for the explanation.

It appears that recovery will not happen until the fence daemon has fenced the failed node, so I will have to set up fencing.

1. I am not sure what kind of fencing device to add for simple servers.
2. Is this fencing device supposed to be a dedicated machine outside of the cluster?
3. Also, would fencing succeed if the node to be fenced is not reachable on the network?

--Achint
