
Re: [Linux-cluster] Tiebreaker IP Address



Barry Brimer <lists brimer org>:
>> As you said, Fencing is a nice way of saying "make sure the non-responsive
>> node can not write anything to our disks, by whatever means necessary".
>> This usually involves the equivalent of pulling the power plug out of the
>> non-responsive node.  Why be so harsh?  Why not do a normal shutdown?
>
> A "normal" shutdown will always flush buffers to disk.  The most important 
> thing is the integrity of our data.  If the cluster has determined the node 
> is not functioning properly, we don't want to give it the opportunity to 
> write bad/corrupted data to our disk.  By "pulling the plug" it will not be 
> able to do so.
>
>> So does that mean that in any case of cluster failure (suppose the
>> network fails), the node will always shut down abnormally, or can it be a
>> clean shutdown? And once a node has been shut down due to a failure, will
>> it come back up automatically, or does it need to be brought up manually?
>
> As mentioned before, fencing is "pulling the plug" .. if fencing is set up 
> correctly, the node will reboot and rejoin the cluster.

Unless it's been simply IO fenced (unplugged from the shared storage).  In
that case you'll need to reboot and verify its "live" status in the
cluster before you unfence it.  In my experience though the node tends to
freak out if it's only been IO fenced and an unclean reboot is necessary
anyways.  This is probably what you meant by "if fencing is set up
correctly" - both IO and power fencing methods should be used.
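
To make the two recovery paths concrete, here is a rough sketch in Python.
It is not taken from any cluster software; FenceMethod and recovery_steps
are hypothetical names, and the steps only restate what is described in
this thread.

```python
from enum import Enum


class FenceMethod(Enum):
    """Hypothetical labels for the two fencing styles discussed here."""
    POWER = "power"  # power-cycle the node ("pulling the plug")
    IO = "io"        # cut the node off from the shared storage


def recovery_steps(method: FenceMethod) -> list[str]:
    """Return the recovery steps for a fenced node, per this thread."""
    if method is FenceMethod.POWER:
        # With power fencing set up correctly, no manual step is needed.
        return [
            "node reboots",
            "node rejoins the cluster",
        ]
    # IO fencing leaves the node running but cut off from storage,
    # so an operator has to bring it back by hand.
    return [
        "reboot the node",
        "verify the node's live status in the cluster",
        "unfence the node from the shared storage",
    ]


for method in FenceMethod:
    print(method.value, "->", "; ".join(recovery_steps(method)))
```

The ordering in the IO case is the point: the unfence comes last, only
after the node has been rebooted and checked.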

Brian

>
> Barry
>> -----Original Message-----
>> From: linux-cluster-bounces redhat com
>> [mailto:linux-cluster-bounces redhat com] On Behalf Of Barry Brimer
>> Sent: Sunday, January 20, 2008 7:31 PM
>> To: linux clustering
>> Subject: Re: [Linux-cluster] Tiebreaker IP Address
>>
>>> Can anyone explain to me what exactly the tiebreaker IP is and how it
>>> functions? What is the use of setting the tiebreaker IP to the default
>>> gateway address?
>>
>> In clustering, it is important that the cluster nodes are able to
>> communicate with one another.  It is also important that the cluster nodes
>> agree on the status of the cluster.  To achieve this, various methods are
>> used to communicate between cluster nodes to inform the other nodes that
>> this node is active and participating in the cluster.  Quorum is usually
>> defined as "greater than one half".
>>
>> In a cluster larger than 2 nodes, if the cluster nodes stop receiving
>> cluster communications (usually referred to as heartbeat) from a
>> particular node, they assume that the non-responsive node is not
>> functioning correctly, and one of the remaining nodes in the cluster will
>> fence the non-responsive node.
>>
>> Fencing is a nice way of saying "make sure the non-responsive node
>> can not write anything to our disks, by whatever means necessary".  This
>> usually involves the equivalent of pulling the power plug out of the
>> non-responsive node.  Why be so harsh?  Why not do a normal shutdown?  If
>> the non-responsive node has data in buffers that has not been written to
>> disk, and the other cluster nodes feel that this node is having a problem,
>> they want to ensure that the non-responsive node can not write its buffers
>> out to disk, so that it has no chance of corrupting the data used by the
>> cluster.
>>
>> This is all fine, because if you have more than 2 nodes, you should be
>> able to get agreement by a majority on whether a node is functioning, and
>> therefore whether the cluster is allowed to operate.  In a two-node
>> cluster, we need some other way to determine which cluster member is
>> healthy and which one isn't.  A cluster node that is functioning correctly
>> should be able to reach its default gateway, so the default gateway is
>> used as the tiebreaker IP address.  If one node can reach the tiebreaker
>> IP address and the other can't, the node that can reach it is assumed to
>> be the properly running one; that breaks the tie and allows that node to
>> fence the other node.
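
The quorum rule and the two-node tiebreaker described above can be sketched
in a few lines of Python.  This is only an illustration of the reasoning,
not the cluster manager's actual code: has_quorum and two_node_decision are
hypothetical names, the reachability check is passed in as a function so
the sketch stays self-contained, and the "stand-down" result for a node
that cannot reach the tiebreaker is an assumption (the explanation above
only says which node is allowed to fence).

```python
from typing import Callable


def has_quorum(votes_seen: int, total_votes: int) -> bool:
    """Quorum as "greater than one half" of all votes in the cluster."""
    return 2 * votes_seen > total_votes


def two_node_decision(peer_heartbeat_ok: bool,
                      tiebreaker_reachable: Callable[[], bool]) -> str:
    """Decide what one node of a two-node cluster should do.

    tiebreaker_reachable is any check against the tiebreaker IP
    (for example a ping of the default gateway).
    """
    if peer_heartbeat_ok:
        return "ok"          # peer is alive, nothing to break
    if tiebreaker_reachable():
        return "fence-peer"  # we look healthy, so we win the tie
    return "stand-down"      # assumption: we are the isolated node


# One node alone out of two is exactly half, which is not a majority.
print(has_quorum(1, 2))
# Two of three nodes still agree, so the cluster may keep operating.
print(has_quorum(2, 3))
# Heartbeat lost, but the gateway answers: this node may fence the other.
print(two_node_decision(False, lambda: True))
```

The first call shows why a two-node cluster needs the tiebreaker at all:
after a split, neither side can reach a majority on its own.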
>>
>> Barry
>>
>> --
>> Linux-cluster mailing list
>> Linux-cluster redhat com
>> https://www.redhat.com/mailman/listinfo/linux-cluster
>>


