
Re: [Linux-cluster] What if the fence device doesn't work?



Eric Kerin wrote:

Janne Peltonen wrote:

On Tue, Nov 21, 2006 at 08:26:20AM -0500, Eric Kerin wrote:
So to keep that scenario from happening, the cluster software ensures that a successful fence occurs before continuing operation. It's a fail-safe style setup. Better to take 30 minutes of downtime while an admin makes the right decision than to corrupt your filesystems and have to take 8-24 hours of downtime to restore the system.


I do understand the basics. I wouldn't want the cluster suite to think
that a node couldn't access a resource such as an FS when it can. It
would just be nice to configure the cluster suite so that if one method
of fencing fails, it tries another, <SNIP>

Actually, that's entirely possible.

See: http://sources.redhat.com/cluster/faq.html#fence_levels
And here's an example block from the cluster.conf file. First it tries ilo (HP Lights Out), and if that fails, apc (APC network power controller). (The iLO section is probably not correct; I just threw it in there as an example of how you'd set up the method tags.)

This is correct -- but I wanted to take this chance to provide some additional information on the configuration that you present.


                       <fence>
<method name="ilo">

The method name can be anything here. It does not correspond to anything else in the file; it could just as easily be name="1" and name="2", as long as the method names are unique beneath any given clusternode. system-config-cluster uses "1", "2", "3", etc.


<device name="server1-ilo" option="off"/>
<device name="server1-ilo" option="on"/>

It is usually not necessary to specify options for off and on. The default for every fence agent is off, then on.


                               </method>
                               <method name="apc">
<device name="APC01a" port="1" option="off"/>
<device name="APC01b" port="1" option="off"/>
<device name="APC01a" port="1" option="on"/>
<device name="APC01b" port="1" option="on"/>

Now here is a case where it IS necessary to specify options. This configuration is usually used for nodes with dual power supplies, to ensure that both power supplies are powered off before either one is powered back on. If you use system-config-cluster for configuration, there is no way to set options by hand. However, if you are using power fencing and there are multiple power fence device tags inside a method block, the configuration written out by system-config-cluster looks like the one above: the option fields are written for you, and the devices are ordered so that all power-type fences are powered off before any are powered on. The new management interface will also set this up for you.
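
Putting the points above together, a node's complete fence block might look roughly like the sketch below. This is only an illustration, not a tested configuration: the clusternode name "server1" is a placeholder I made up, the device names are the ones from the example, and each of them would still need a matching fencedevice entry elsewhere in cluster.conf. The first method omits the option attributes and relies on the default (off, then on); the second keeps them, because of the dual power supplies.

```xml
<!-- Sketch only: "server1" is a hypothetical node name. -->
<clusternode name="server1">
  <fence>
    <!-- Level 1: tried first. No option attribute, so the
         agent's default behaviour (off, then on) applies. -->
    <method name="1">
      <device name="server1-ilo"/>
    </method>
    <!-- Level 2: tried only if level 1 fails. Both power
         supplies are switched off before either is switched on. -->
    <method name="2">
      <device name="APC01a" port="1" option="off"/>
      <device name="APC01b" port="1" option="off"/>
      <device name="APC01a" port="1" option="on"/>
      <device name="APC01b" port="1" option="on"/>
    </method>
  </fence>
</clusternode>
```

The method names "1" and "2" here follow the system-config-cluster convention mentioned earlier; "ilo" and "apc" would work just as well.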

Regards,

-Jim


