Greetings,<br><br>I am using the stock &quot;clustering&quot; and &quot;cluster-storage&quot; packages from the RHEL5 update 4 x86_64 ISO.<br><br>As an example, using the config below: <br><br>Node1 is running service1, node2 is running service2, and so on; node5 is a spare, available for the relocation of any failover domain / cluster service.<br>
<br>If I go into the APC PDU and turn off the electrical port to node1, node2 will fence node1 (going into the APC PDU and doing an off/on on node1&#39;s port). This is fine and works well. When node1 comes back up, it shuts down service1 and service1 relocates to node5.<br>
<br>Now if I go into the lab and literally pull the plug on node5 while it is running service1, another node fences node5 via the APC. I can check the APC PDU log and see that it has done an off/on on node5&#39;s electrical port just fine.<br>
<br>But I pulled the plug on node5, so resetting the power doesn&#39;t matter. I want to simulate a completely dead node, and have the service relocate in this case of complete node failure.<br><br>In this RHEL5.4 cluster, the service never relocates. I can simulate this on any node for any service. What if a node&#39;s motherboard fries? <br>
<br>What can I set to have the remaining nodes stop waiting for the reboot of a failed node and just go ahead and relocate the cluster service that had been running on the now failed node?<br><br>Thank you!<br><br>versions:<br>
<br>cman-2.0.115-1.el5<br>openais-0.80.6-8.el5<br>modcluster-0.12.1-2.el5<br>lvm2-cluster-2.02.46-8.el5<br>rgmanager-2.0.52-1.el5<br>ricci-0.12.2-6.el5<br><br>cluster.conf (sanitized, real scripts removed, all gfs2 mounts gone for clarity):<br>
&lt;?xml version=&quot;1.0&quot;?&gt;<br>&lt;cluster config_version=&quot;1&quot; name=&quot;alderaanDefenseShieldRebelAllianceCluster&quot;&gt;<br>    &lt;fence_daemon clean_start=&quot;0&quot; post_fail_delay=&quot;3&quot; post_join_delay=&quot;60&quot;/&gt;<br>
    &lt;clusternodes&gt;<br>        &lt;clusternode name=&quot;192.168.1.1&quot; nodeid=&quot;1&quot; votes=&quot;1&quot;&gt;<br>            &lt;fence&gt;<br>                &lt;method name=&quot;1&quot;&gt;<br>                    &lt;device name=&quot;apc_pdu&quot; port=&quot;1&quot; switch=&quot;1&quot;/&gt;<br>
                &lt;/method&gt;<br>            &lt;/fence&gt;<br>        &lt;/clusternode&gt;<br>        &lt;clusternode name=&quot;192.168.1.2&quot; nodeid=&quot;2&quot; votes=&quot;1&quot;&gt;<br>            &lt;fence&gt;<br>
                &lt;method name=&quot;1&quot;&gt;<br>                    &lt;device name=&quot;apc_pdu&quot; port=&quot;2&quot; switch=&quot;1&quot;/&gt;<br>                &lt;/method&gt;<br>            &lt;/fence&gt;<br>
        &lt;/clusternode&gt;<br>        &lt;clusternode name=&quot;192.168.1.3&quot; nodeid=&quot;3&quot; votes=&quot;1&quot;&gt;<br>            &lt;fence&gt;<br>                &lt;method name=&quot;1&quot;&gt;<br>                    &lt;device name=&quot;apc_pdu&quot; port=&quot;3&quot; switch=&quot;1&quot;/&gt;<br>
                &lt;/method&gt;<br>            &lt;/fence&gt;<br>        &lt;/clusternode&gt;<br>        &lt;clusternode name=&quot;192.168.1.4&quot; nodeid=&quot;4&quot; votes=&quot;1&quot;&gt;<br>            &lt;fence&gt;<br>
                &lt;method name=&quot;1&quot;&gt;<br>                    &lt;device name=&quot;apc_pdu&quot; port=&quot;4&quot; switch=&quot;1&quot;/&gt;<br>                &lt;/method&gt;<br>            &lt;/fence&gt;<br>
        &lt;/clusternode&gt;<br>        &lt;clusternode name=&quot;192.168.1.5&quot; nodeid=&quot;5&quot; votes=&quot;1&quot;&gt;<br>            &lt;fence&gt;<br>                &lt;method name=&quot;1&quot;&gt;<br>                    &lt;device name=&quot;apc_pdu&quot; port=&quot;5&quot; switch=&quot;1&quot;/&gt;<br>
                &lt;/method&gt;<br>            &lt;/fence&gt;<br>        &lt;/clusternode&gt;<br>    &lt;/clusternodes&gt;<br>    &lt;cman expected_votes=&quot;6&quot;/&gt;<br>    &lt;fencedevices&gt;<br>        &lt;fencedevice agent=&quot;fence_apc&quot; ipaddr=&quot;192.168.1.20&quot; login=&quot;device&quot; name=&quot;apc_pdu&quot; passwd=&quot;wonderwomanWasAPrettyCoolSuperhero&quot;/&gt;<br>
    &lt;/fencedevices&gt;<br>    &lt;rm&gt;<br>        &lt;failoverdomains&gt;<br>            &lt;failoverdomain name=&quot;fd1&quot; nofailback=&quot;0&quot; ordered=&quot;1&quot; restricted=&quot;1&quot;&gt;<br>                &lt;failoverdomainnode name=&quot;192.168.1.1&quot; priority=&quot;1&quot;/&gt;<br>
                &lt;failoverdomainnode name=&quot;192.168.1.2&quot; priority=&quot;2&quot;/&gt;<br>                &lt;failoverdomainnode name=&quot;192.168.1.3&quot; priority=&quot;3&quot;/&gt;<br>                &lt;failoverdomainnode name=&quot;192.168.1.4&quot; priority=&quot;4&quot;/&gt;<br>
                &lt;failoverdomainnode name=&quot;192.168.1.5&quot; priority=&quot;5&quot;/&gt;<br>            &lt;/failoverdomain&gt;<br>            &lt;failoverdomain name=&quot;fd2&quot; nofailback=&quot;0&quot; ordered=&quot;1&quot; restricted=&quot;1&quot;&gt;<br>
                &lt;failoverdomainnode name=&quot;192.168.1.1&quot; priority=&quot;5&quot;/&gt;<br>                &lt;failoverdomainnode name=&quot;192.168.1.2&quot; priority=&quot;1&quot;/&gt;<br>                &lt;failoverdomainnode name=&quot;192.168.1.3&quot; priority=&quot;2&quot;/&gt;<br>
                &lt;failoverdomainnode name=&quot;192.168.1.4&quot; priority=&quot;3&quot;/&gt;<br>                &lt;failoverdomainnode name=&quot;192.168.1.5&quot; priority=&quot;4&quot;/&gt;<br>            &lt;/failoverdomain&gt;<br>
            &lt;failoverdomain name=&quot;fd3&quot; nofailback=&quot;0&quot; ordered=&quot;1&quot; restricted=&quot;1&quot;&gt;<br>                &lt;failoverdomainnode name=&quot;192.168.1.1&quot; priority=&quot;4&quot;/&gt;<br>
                &lt;failoverdomainnode name=&quot;192.168.1.2&quot; priority=&quot;5&quot;/&gt;<br>                &lt;failoverdomainnode name=&quot;192.168.1.3&quot; priority=&quot;1&quot;/&gt;<br>                &lt;failoverdomainnode name=&quot;192.168.1.4&quot; priority=&quot;2&quot;/&gt;<br>
                &lt;failoverdomainnode name=&quot;192.168.1.5&quot; priority=&quot;3&quot;/&gt;<br>            &lt;/failoverdomain&gt;<br>            &lt;failoverdomain name=&quot;fd4&quot; nofailback=&quot;0&quot; ordered=&quot;1&quot; restricted=&quot;1&quot;&gt;<br>
                &lt;failoverdomainnode name=&quot;192.168.1.1&quot; priority=&quot;3&quot;/&gt;<br>                &lt;failoverdomainnode name=&quot;192.168.1.2&quot; priority=&quot;4&quot;/&gt;<br>                &lt;failoverdomainnode name=&quot;192.168.1.3&quot; priority=&quot;5&quot;/&gt;<br>
                &lt;failoverdomainnode name=&quot;192.168.1.4&quot; priority=&quot;1&quot;/&gt;<br>                &lt;failoverdomainnode name=&quot;192.168.1.5&quot; priority=&quot;2&quot;/&gt;<br>            &lt;/failoverdomain&gt;<br>
        &lt;/failoverdomains&gt;<br>        &lt;resources&gt;<br>            &lt;ip address=&quot;10.1.1.1&quot; monitor_link=&quot;1&quot;/&gt;<br>            &lt;ip address=&quot;10.1.1.2&quot; monitor_link=&quot;1&quot;/&gt;<br>
            &lt;ip address=&quot;10.1.1.3&quot; monitor_link=&quot;1&quot;/&gt;<br>            &lt;ip address=&quot;10.1.1.4&quot; monitor_link=&quot;1&quot;/&gt;<br>            &lt;ip address=&quot;10.1.1.5&quot; monitor_link=&quot;1&quot;/&gt;<br>
            &lt;script file=&quot;/usr/local/bin/service1&quot; name=&quot;service1&quot;/&gt;<br>            &lt;script file=&quot;/usr/local/bin/service2&quot; name=&quot;service2&quot;/&gt;<br>            &lt;script file=&quot;/usr/local/bin/service3&quot; name=&quot;service3&quot;/&gt;<br>
            &lt;script file=&quot;/usr/local/bin/service4&quot; name=&quot;service4&quot;/&gt;<br>        &lt;/resources&gt;<br>        &lt;service autostart=&quot;1&quot; domain=&quot;fd1&quot; exclusive=&quot;1&quot; name=&quot;service1&quot; recovery=&quot;relocate&quot;&gt;<br>
            &lt;ip ref=&quot;10.1.1.1&quot;/&gt;<br>            &lt;script ref=&quot;service1&quot;/&gt;<br>        &lt;/service&gt;<br>        &lt;service autostart=&quot;1&quot; domain=&quot;fd2&quot; exclusive=&quot;1&quot; name=&quot;service2&quot; recovery=&quot;relocate&quot;&gt;<br>
            &lt;ip ref=&quot;10.1.1.2&quot;/&gt;<br>            &lt;script ref=&quot;service2&quot;/&gt;<br>        &lt;/service&gt;<br>        &lt;service autostart=&quot;1&quot; domain=&quot;fd3&quot; exclusive=&quot;1&quot; name=&quot;service3&quot; recovery=&quot;relocate&quot;&gt;<br>
            &lt;ip ref=&quot;10.1.1.3&quot;/&gt;<br>            &lt;script ref=&quot;service3&quot;/&gt;<br>        &lt;/service&gt;<br>        &lt;service autostart=&quot;1&quot; domain=&quot;fd4&quot; exclusive=&quot;1&quot; name=&quot;service4&quot; recovery=&quot;relocate&quot;&gt;<br>
            &lt;ip ref=&quot;10.1.1.4&quot;/&gt;<br>            &lt;script ref=&quot;service4&quot;/&gt;<br>        &lt;/service&gt;<br>    &lt;/rm&gt;<br>&lt;/cluster&gt;<br><br>
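For anyone reading the config above, here is a rough Python sketch that summarizes it: for each service it lists the nodes of its failover domain in priority order, and it compares the sum of the per-node votes with cman's expected_votes. It only reads the XML and does not talk to rgmanager or cman. The function names and the SAMPLE string are made up for illustration, and it assumes a missing votes attribute means 1 vote.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-node excerpt in the same shape as the cluster.conf above.
SAMPLE = """
<cluster config_version="1" name="example">
  <clusternodes>
    <clusternode name="192.168.1.1" nodeid="1" votes="1"/>
    <clusternode name="192.168.1.2" nodeid="2" votes="1"/>
  </clusternodes>
  <cman expected_votes="3"/>
  <rm>
    <failoverdomains>
      <failoverdomain name="fd1" ordered="1" restricted="1">
        <failoverdomainnode name="192.168.1.2" priority="2"/>
        <failoverdomainnode name="192.168.1.1" priority="1"/>
      </failoverdomain>
    </failoverdomains>
    <service autostart="1" domain="fd1" name="service1" recovery="relocate"/>
  </rm>
</cluster>
"""


def failover_order(conf_xml):
    """Map each service name to its domain's nodes, lowest priority number first."""
    root = ET.fromstring(conf_xml)
    domains = {}
    for fd in root.iter("failoverdomain"):
        nodes = sorted(
            fd.findall("failoverdomainnode"),
            key=lambda n: int(n.get("priority", "0")),
        )
        domains[fd.get("name")] = [n.get("name") for n in nodes]
    return {
        svc.get("name"): domains.get(svc.get("domain"), [])
        for svc in root.iter("service")
    }


def vote_summary(conf_xml):
    """Return (sum of clusternode votes, cman expected_votes or None)."""
    root = ET.fromstring(conf_xml)
    # Assumption: a clusternode without a votes attribute counts as 1 vote.
    node_votes = sum(int(n.get("votes", "1")) for n in root.iter("clusternode"))
    cman = root.find("cman")
    expected = cman.get("expected_votes") if cman is not None else None
    return node_votes, (int(expected) if expected is not None else None)


if __name__ == "__main__":
    for service, nodes in failover_order(SAMPLE).items():
        print(service, "->", ", ".join(nodes))
    total, expected = vote_summary(SAMPLE)
    print("node votes:", total, "expected_votes:", expected)
```

Run against the real file, this makes it easy to see at a glance which node each service would prefer next, and whether expected_votes matches the number of votes the nodes actually carry.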