[Linux-cluster] What if the fence device doesn't work?

Eric Kerin eric at bootseg.com
Wed Nov 22 14:21:59 UTC 2006


Janne Peltonen wrote:
> On Tue, Nov 21, 2006 at 10:36:44AM -0500, Eric Kerin wrote:
>   
>> Well, first it tries ilo.  If that fails, it uses the APC.  If that 
>> fails, then we have a multipoint failure, and obviously protecting 
>> against multipoint failure costs a lot more than protecting against a 
>> single point failure...
>>     
>
> OK, I thought that the power would go through the APC first, iLO second.
> So if the APC failed, then a working iLO would do no good, because it
> wouldn't have power. But then, I'm not an expert on hardware...
>   
In this case you're right.  If the APC died a horrible fiery death, then 
the server would probably be down, and neither the APC nor the iLO would 
be accessible for fencing. 

In my situation I have redundant power supplies and two APC devices 
(both left power supplies go into one, both right into the other).  
Therefore, if one of my power controllers fails, the systems are still 
online, and hopefully I'll notice the downed power supplies before the 
next time the cluster needs to fence something.  This lowers the risk to 
an acceptable level for me, but not necessarily for everyone.
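
To make the ordering discussed above concrete (try iLO first, fall back 
to the APC, and give up only when both fail), here is a minimal sketch in 
Python.  It is only an illustration of the fallback logic, not how the 
cluster's fence daemon is actually implemented; fence_node, 
ilo_unreachable and apc_ok are hypothetical names made up for the example.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical type: a fence method is a name plus a callable that
# returns True once the node is confirmed fenced.
FenceMethod = Tuple[str, Callable[[str], bool]]


def fence_node(node: str, methods: Sequence[FenceMethod]) -> str:
    """Try each fence method in order; return the name of the first that works."""
    for name, attempt in methods:
        try:
            if attempt(node):
                return name
        except OSError:
            # Device unreachable (for example, it lost power);
            # fall through to the next method.
            pass
    # Every method failed: the multipoint-failure case.
    raise RuntimeError(f"all fence methods failed for {node}")


def ilo_unreachable(node: str) -> bool:
    # Stand-in for an iLO that does not answer.
    raise ConnectionError("iLO not responding")


def apc_ok(node: str) -> bool:
    # Stand-in for an APC switch that cuts power successfully.
    return True


print(fence_node("node1", [("ilo", ilo_unreachable), ("apc", apc_ok)]))
```

If every method in the list fails, fence_node raises instead of reporting 
success, which corresponds to the multipoint failure case: the cluster 
cannot safely assume the node is down.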

Thanks,
Eric Kerin
eric at bootseg.com
