[Linux-cluster] Cluster Suite v3 IPMI fencing time

Celso K. Webber celso at webbertek.com.br
Mon Mar 27 19:54:37 UTC 2006


Lon,

Thank you very much for your response. Please see below:

On Mon, 27 Mar 2006 13:05:04 -0500, Lon Hohberger wrote
...
> > 2. if it is a correct behaviour that member1 waits for the fencing completion
> > to take over services, how can I reduce the total fencing time? I didn't find
> > any parameters under the cludb man page for this.
> 
> Turn off ACPI on both machines, or else IPMI will try a "graceful"
> shutdown, which is exactly what you do *not* want.  If the machine is
> dead or hung, it may not respond to the ACPI request at all.

Yes, the ACPI is already off.

If I issue an "ipmitool -I lan ... chassis power cycle" command, with
clumanager turned off, the machine resets within a few seconds, with no
graceful shutdown (it does a hard power off/on).

While the fence process is underway, I can go to a command line and issue a
"ipmitool -I lan ... power status" command, so I think there would be no
difficulties with IPMI status checking for the ipmilan stonith module.

If, on the other hand, I issue a "clufence -r <node>", it turns off the
machine immediately but waits almost 2 minutes to turn it on again, and after
that the clufence command waits for another 2 minutes until it returns to the
shell prompt.
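
For anyone who wants to put numbers on the difference, here is a minimal
sketch of a timing helper. It is plain Python, not part of Cluster Suite or
ipmitool, and the name time_command is only illustrative. It runs whatever
command it is given and reports how long that command took to return.

```python
import subprocess
import sys
import time


def time_command(argv):
    """Run argv to completion and return (exit_code, elapsed_seconds)."""
    # A monotonic clock is not affected by system clock adjustments.
    start = time.monotonic()
    result = subprocess.run(argv)
    return result.returncode, time.monotonic() - start


# Example (node is a placeholder for the member being fenced):
#   code, elapsed = time_command(["clufence", "-r", node])
#   print(f"exit={code} elapsed={elapsed:.1f}s")
```

Timing the direct ipmitool power cycle and the clufence call the same way
should show whether the delay comes from the BMC or from the fencing layer.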

So I'm guessing this has something to do with IBM's implementation of IPMI on
these xSeries 366 machines. On Dell PowerEdges the fence process was quite
quick for me.

By the way, both the BIOS and the BMC firmware were updated to the
latest release before setting up this environment.

> Note that with IPMI, you should use the NIC with IPMI for only IPMI
> traffic (or at least separate IPMI traffic from the cluster traffic),
>  or it can become a single point of failure.

Hmm ... interesting, maybe I'll have to change my architecture a little bit,
although I'm issuing IPMI commands through the least used interface (IPMI seems
to work on both onboard interfaces of this server).

If anyone has had a similar experience with such long fencing delays,
please let me know.

Regards,

Celso.



