[Linux-cluster] One question about IPMI fencing with Cluster Suite v5.1

Subhendu Ghosh sghosh at redhat.com
Fri Sep 12 23:41:46 UTC 2008


Celso K. Webber wrote:
> Hello all,
> 
> Sorry if this question has been answered before, but I didn't find 
> anything in the archives.
> 
> We deployed Red Hat Cluster Suite at a customer site, and everything 
> appears to work fine until one node needs to fence the other (for 
> instance, when we turn a node off to test failover).
> 
> As usual for us, we configured the fencing using IPMI, which is 
> available on every modern branded server.
> 
> It seems that sometimes, one machine can't fence the other. Although we 
> can see the Cluster starting "ipmitool -I lanplus -H xxx -U xxx -P xxx 
> chassis power off", it times out while trying to power off the other 
> machine.

Have you tried the above command by itself to see if IPMI on the systems 
responds correctly by shutting down?
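
One way to run that check by hand is sketched below. It wraps the same 
ipmitool invocation quoted above and reports the exit status and elapsed 
time, so a slow or hanging "power off" can be compared with "power 
status". The function names timed_run and ipmi_power are made up for this 
example, and the address, user and password are placeholders you supply; 
this is not what the fence agent itself does internally.

```sh
#!/bin/sh
# Hypothetical helper: run a command, then print its exit status and
# how many whole seconds it took.
timed_run() {
    start=$(date +%s)
    status=0
    "$@" || status=$?
    end=$(date +%s)
    echo "exit=$status elapsed=$((end - start))s"
    return "$status"
}

# Hypothetical wrapper around the ipmitool command quoted in this thread.
# Arguments: BMC address, user, password, chassis power action
# (for example "status" or "off").
ipmi_power() {
    host=$1
    user=$2
    pass=$3
    action=$4
    timed_run ipmitool -I lanplus -H "$host" -U "$user" -P "$pass" \
        chassis power "$action"
}
```

From the node that is supposed to do the fencing, something like 
"ipmi_power <bmc-address> <user> <password> status" followed by the same 
call with "off" (against a node that is safe to power off) would show 
whether only the power-off request is slow or failing.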

> 
> Stranger still, if at that exact moment we run 
> "ipmitool ... chassis power status" from the command line, it works 
> fine against the same node that could not be fenced.
> 
> So I have a few questions:
> * can a problem like this (the fencing agent not being able to fence) 
> cause instability in the cluster? In our case, the cluster goes crazy: 
> even if we reboot the failed node and it rejoins the cluster, rgmanager 
> never gets started;
> 
> * has anyone faced this problem with IPMI? We have used IPMI as a fence 
> agent on tens of implementations with Red Hat Cluster Suite, since 
> version 3, and we have never had this kind of problem. The servers in 
> question are Dell PowerEdge 2900s, and there is a crossover cable 
> between the onboard #1 NICs of the two servers, so that we have a 
> dedicated network path for one machine to power off the other.
> 
> 
> Thank you all for your support.
> 
> Regards,
> 
> Celso.
> 


-- 
Subhendu Ghosh
Solutions Architect
Red Hat



