
Re: [Linux-cluster] Manual fencing doest work

With manual fencing, the fence daemon simply waits until either
1) the user reboots the failed node _and_ runs fence_ack_manual to tell the node that asked for the fence that this has been done, or
2) the node that "failed" comes back up.

In the steps you described, you never acknowledged the request for fencing - hence, you have to wait for the machine to come back up.
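
As a rough sketch of the acknowledgement step, assuming a fence_ack_manual that accepts -n <nodename> (as in the RHEL4-era man page; please check "man fence_ack_manual" on your release, since the options have changed between versions). The ack_cmd helper is hypothetical and not part of the cluster suite; it only prints the command so you can review it before running it on the surviving node (fcc1 in your case):

```shell
#!/bin/sh
# Hypothetical helper (not part of the cluster suite): prints the
# acknowledgement command for a node instead of running it.
# Assumption: fence_ack_manual takes -n <nodename>; verify with the man page.
ack_cmd() {
    if [ -z "$1" ]; then
        echo "usage: ack_cmd NODENAME" >&2
        return 1
    fi
    printf 'fence_ack_manual -n %s\n' "$1"
}

# Run the printed command on the surviving node, and only after the failed
# node (fcc4 here) has really been powered off or reset.
ack_cmd fcc4
```

Acknowledging a fence for a node that is still running and still has the file system mounted can corrupt GFS, which is why the command is only printed here rather than executed.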


BTW, I'd never use manual fencing in production.

On Apr 3, 2006, at 5:30 AM, Thai Duong wrote:

Hi all,

 I have a 2 node GFS 6.1 cluster with the following configuration:

 <?xml version="1.0"?>
 <cluster name="fccrac" config_version="5">

     <cman two_node="1" expected_votes="1">

       <clusternode name="fcc1" votes="1">
         <method name="single">
          <device name="human" nodename="fcc1"/>

       <clusternode name="fcc4" votes="1">
         <method name="single">
          <device name="human" nodename="fcc4"/>

    <fence_device name="human" agent="fence_manual"/>


It turns out that manual fencing doesn't work as expected. When I force a power-down of one node, the other cannot fence it and, worse, the whole GFS file system freezes, waiting for the downed node to come back up. I get something like the following in the kernel log:

 Apr  2 16:46:28 fcc1 fenced[3444]: fencing node "fcc4"
 Apr  2 16:46:28 fcc1 fenced[3444]: fence "fcc4" failed

 Some information about GFS and kernel:

 [root fcc1 ~]# rpm -qa | grep GFS

 [root fcc1 ~]# uname -a
Linux fcc1 2.6.9-22.0.2.EL #1 SMP Thu Jan 5 17:04:58 EST 2006 ia64 ia64 ia64 GNU/Linux

 Please help.


 Thai Duong.
