
Re: [Linux-cluster] two fencing problems



Bryan Cardillo wrote:
>         I'm in the process of testing the attached patch, basically
>         just had to remove a portion of the match for the `Control
>         Outlet' option.

Interesting ... I see you were getting hung on the menu after the one where I was stuck. It looks like my problem was that the author didn't expect anyone to rename their outlets to something more useful than "Outlet 1", "Outlet 2", etc. The same problem plagues the next menu, because the script tries to match the "----- Outlet # -------" banner, but the assigned name shows up there instead. The following patch (against the "original") seems to fix both of these problems in general, and incorporates Bryan's fix as well.

--- /sbin/fence_apc     2005-08-01 19:01:17.000000000 -0400
+++ fence_apc   2005-12-06 09:09:55.000000000 -0500
@@ -244,10 +244,10 @@
                        /--\s*device manager.*(\d+)\s*-\s*Outlet Control/is ||

                        # "Device Manager", "1- Cluster Node 0   ON"
-                       /--\s*Outlet Control.*(\d+)\s*-\s+Outlet\s+$opt_n\D[^\n]*\s(?-i:ON|OFF)\*?\s/ism ||
+                       /--\s*Outlet Control.*($opt_n)\s*-[^\n]+\s(?-i:ON|OFF)\*?\s/ism ||

                        # Administrator Outlet Control menu
-                       /--\s*Outlet $opt_n\D.*(\d+)\s*-\s*control outlet\s+$opt_n\D/ism
+                       /Outlet\s+:\s*$opt_n\D.*(\d+)\s*-\s*control outlet/ism
                ) {
                        $t->print($1);
                        next;
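
To illustrate why the original patterns stop matching once outlets are renamed, here is a minimal sketch that ports the four regexes from the patch to Python and runs them against two made-up menu screens. The screen text, the helper names (`outlet_list_patterns`, `outlet_detail_patterns`, `menu_choice`), and the use of `re.escape` are assumptions for illustration only; the real APC menus and the Perl script may differ in detail.

```python
import re

# Perl's /ism modifiers correspond to these Python flags.
FLAGS = re.IGNORECASE | re.DOTALL | re.MULTILINE


def outlet_list_patterns(opt_n):
    """Return (old, new) patterns for the outlet list menu (first hunk line)."""
    n = re.escape(str(opt_n))  # assumption: Perl interpolates $opt_n unescaped
    old = re.compile(
        r"--\s*Outlet Control.*(\d+)\s*-\s+Outlet\s+" + n
        + r"\D[^\n]*\s(?-i:ON|OFF)\*?\s", FLAGS)
    new = re.compile(
        r"--\s*Outlet Control.*(" + n + r")\s*-[^\n]+\s(?-i:ON|OFF)\*?\s", FLAGS)
    return old, new


def outlet_detail_patterns(opt_n):
    """Return (old, new) patterns for the per-outlet menu (second hunk line)."""
    n = re.escape(str(opt_n))
    old = re.compile(
        r"--\s*Outlet " + n + r"\D.*(\d+)\s*-\s*control outlet\s+" + n + r"\D",
        FLAGS)
    new = re.compile(
        r"Outlet\s+:\s*" + n + r"\D.*(\d+)\s*-\s*control outlet", FLAGS)
    return old, new


# Hypothetical screens with renamed outlets; not captured from a real PDU.
RENAMED_LIST = (
    "------- Outlet Control -------\n"
    "     1- Cluster Node 0        ON\n"
    "     2- Cluster Node 1        ON\n"
)
RENAMED_DETAIL = (
    "------- Cluster Node 0 -------\n"
    "        Outlet       : 1\n"
    "        State        : ON\n"
    "\n"
    "     1- Control Outlet\n"
    "     2- Configure Outlet\n"
)


def menu_choice(pattern, screen):
    """Return the captured menu number, or None if the screen is not recognized."""
    m = pattern.search(screen)
    return m.group(1) if m else None


if __name__ == "__main__":
    for label, (old, new), screen in [
        ("outlet list", outlet_list_patterns(1), RENAMED_LIST),
        ("outlet detail", outlet_detail_patterns(1), RENAMED_DETAIL),
    ]:
        print(label, "old:", menu_choice(old, screen),
              "new:", menu_choice(new, screen))
```

On these sample screens the old patterns fail because they require the literal text "Outlet <n>" in the menu entry or banner, while the patched ones key on the outlet number alone.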

>         here is the clusternode elem I'm using, with the port
>         specified, and seems to work so far.  as far as I know, this
>         must be specified in the cluster.conf manually.
>
> <clusternode name="node1" votes="1">
>     <fence>
>         <method name="pdu">
>             <device name="pdu" port="1"/>
>         </method>
>     </fence>
> </clusternode>

Ah, I see I was confusing <fencedevice ...> with <fence>. It looks like it is configurable in the configuration tool after all, under "manage fencing for this node". Here's what I got after setting it up with my two cross-wired PDUs (the nodes have redundant power, so node 1 is plugged into outlet 1 on each PDU, and node 2 into outlet 2 on each PDU):

                <clusternode name="NODE1" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="FENCE1" option="off" port="1" switch="1"/>
                                        <device name="FENCE2" option="off" port="1" switch="1"/>
                                        <device name="FENCE1" option="on" port="1" switch="1"/>
                                        <device name="FENCE2" option="on" port="1" switch="1"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="NODE2" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="FENCE1" option="off" port="2" switch="1"/>
                                        <device name="FENCE2" option="off" port="2" switch="1"/>
                                        <device name="FENCE1" option="on" port="2" switch="1"/>
                                        <device name="FENCE2" option="on" port="2" switch="1"/>
                                </method>
                        </fence>
                </clusternode>

Except that when I stopped the configurator and started it again, it complained about the "switch=" options that it had put there itself! Removing them by hand seems to have fixed it. *sigh*
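
For anyone who would rather script that cleanup than edit by hand, the following is a rough sketch of stripping the "switch" attribute from every <device> element. It works on an inline copy of the NODE1 fragment above, not on a real cluster.conf; the function name `strip_switch_attributes` is made up for this example, and whether dropping the attribute is appropriate for a given setup is an assumption based only on the experience described here.

```python
import xml.etree.ElementTree as ET

# Inline copy of the NODE1 entry from this message, used as sample input.
SAMPLE = """<clusternode name="NODE1" votes="1">
    <fence>
        <method name="1">
            <device name="FENCE1" option="off" port="1" switch="1"/>
            <device name="FENCE2" option="off" port="1" switch="1"/>
            <device name="FENCE1" option="on" port="1" switch="1"/>
            <device name="FENCE2" option="on" port="1" switch="1"/>
        </method>
    </fence>
</clusternode>"""


def strip_switch_attributes(xml_text):
    """Remove the 'switch' attribute from all <device> elements.

    Returns the rewritten XML text and the number of attributes removed.
    """
    root = ET.fromstring(xml_text)
    removed = 0
    for device in root.iter("device"):
        # pop() returns None when the attribute is absent, so devices
        # without a switch attribute are left untouched.
        if device.attrib.pop("switch", None) is not None:
            removed += 1
    return ET.tostring(root, encoding="unicode"), removed


if __name__ == "__main__":
    cleaned, count = strip_switch_attributes(SAMPLE)
    print("removed", count, "switch attributes")
    print(cleaned)
```

The order of the <device> elements is preserved, which matters here because the "off" entries for both PDUs need to come before the "on" entries.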

And it still doesn't appear to work ... I can turn the outlets on and off from the command line, but if I bring down the interface on a node, the other node reports that it's removing the "failed" node from the cluster and that it's fencing the "failed" node, yet the "failed" node never actually gets shut down. Does this get logged somewhere besides /var/log/messages, or is there a way to force it to be more verbose? If I could see what command fenced is actually invoking, that might help ...

-g

Greg Forte
gforte udel edu
IT - User Services
University of Delaware
302-831-1982
Newark, DE

