
[Linux-cluster] Workings of Tiebreaker IP (RHCS)

I pulled up a message from 2005 about tiebreakers. I have some questions about it, and it does not seem to agree with what I see clumanager do.
>> Hello,
>> To completely understand what the role of a tiebreaker IP within a two
>> or four node RHCS cluster is, I've searched redhat and Google. I can't
>> however find anything describing the precise workings of the
>> tiebreaker-IP. I would really like to know what happens exactly when
>> the tiebreaker is used and how (maybe even some kind of flow diagram).
>> Can anyone here maybe explain that to me, or point me in the direction
>> of more specific information regarding tiebreaker?
>The tiebreaker IP address is used as an additional vote in the event
>that half the nodes become unreachable or dead in a 2 or 4 node cluster
>on RHCS.
>The IP address must reside on the same network as is used for cluster
>communication.  To be a little more specific, if your cluster is using
>eth0 for communication, your IP address used for a tiebreaker must be
>reachable only via eth0 (otherwise, you will end up with a split brain).
>When enabled, the nodes ping the given IP address at regular intervals.
>When the IP address is not reachable, the tiebreaker is considered
>"dead".  When it is reachable, it is considered "alive".
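To make the alive/dead classification above concrete, here is a minimal Python sketch. It is not clumanager's code: the function names are made up for illustration, the ping flags assume the Linux iputils `ping`, and a real implementation would probably require several missed pings before declaring the tiebreaker "dead".

```python
import subprocess
from typing import Callable


def ping_once(ip: str, timeout_s: int = 1) -> bool:
    """Send one ICMP echo with the system ping (Linux iputils flags assumed)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), ip],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def tiebreaker_state(ip: str, probe: Callable[[str], bool] = ping_once) -> str:
    """Classify the tiebreaker from a single probe: reachable means "alive"."""
    return "alive" if probe(ip) else "dead"
```

Each node would call something like `tiebreaker_state(...)` on a timer. Note that this only reads reachability; nothing is locked or reserved on the tiebreaker host.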
>It acts as an additional vote (like an extra cluster member), except for
>one key difference: Unless the default configuration is overridden, the
How does this work? Does the node trying to become the active node access the tiebreaker and put a lock on it? How does it reserve it?
Just pinging it would not prevent the other node from doing the same.
>IP tiebreaker may not be used to *form* a quorum where one did not already exist.
>So, if one node of a two node cluster is online, it will never become
>quorate unless the other node comes online (or administrator override,
>see man pages for "cluforce" and "cludb").
>So, in a 2 node cluster, if one node fails and the other node is online
>(and the tiebreaker is still "alive" according to that node), the
>remaining node considers itself quorate and "shoots" (aka STONITHs, aka
>fences) the dead node and takes over services.
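The vote counting described in the quoted text can be sketched as follows. This is only a model of the description, not clumanager's actual algorithm: it assumes one vote per node plus one for the tiebreaker, a simple majority of all possible votes, and that the tiebreaker vote counts only for a node that already had quorum. The function and parameter names are hypothetical.

```python
def is_quorate(nodes_online: int, total_nodes: int,
               tiebreaker_alive: bool, already_quorate: bool) -> bool:
    """Model of the quoted rule: the tiebreaker can keep a quorum, not form one."""
    total_votes = total_nodes + 1      # every node, plus the tiebreaker
    needed = total_votes // 2 + 1      # simple majority of all possible votes
    votes = nodes_online
    # The tiebreaker vote is only honoured if a quorum existed beforehand.
    if tiebreaker_alive and already_quorate:
        votes += 1
    return votes >= needed


# Two-node cluster, one node fails, survivor still sees the tiebreaker.
print(is_quorate(1, 2, tiebreaker_alive=True, already_quorate=True))   # True
# Two-node cluster, a single node boots alone: it may not form a quorum.
print(is_quorate(1, 2, tiebreaker_alive=True, already_quorate=False))  # False
```

In this model, pinging is not a reservation at all. If both nodes see the tiebreaker but not each other, both compute "quorate", and the conflict is settled by fencing.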
>If a network partition occurs such that both nodes see the tiebreaker
>but not each other, the first one to fence the other will naturally
>Ok, moving on...
>The disk tiebreaker works in a similar way, except that it lets the
>cluster limp along in a safe, semi-split-brain state during a
>network outage.  What I mean is that because there's state information
>written to/read from the shared raw partitions, the nodes can actually
>tell via other means whether or not the other node is "alive", as
>opposed to relying solely on the network traffic.
>Both nodes update state information on the shared partitions.  When one
>node detects that the other node has not updated its information for a
>period of time, that node is "down" according to the disk subsystem.  If
>this coincides with a "down" status from the membership daemon, the node
>is fenced and services are failed over.  If the node never goes down
>(and keeps updating its information on the shared partitions), then the
I do not use an IP tiebreaker. I have a two-node system. When the active node shows as down via membership but up via disk, clumanager determines it is in an “uncertain state” and shoots it.
>node is never fenced and services never fail over.
-- Lon
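For comparison, the disk-tiebreaker rule as described in the quoted message can be written as a small Python sketch. It models only that description: the function name, the timestamp arguments and the `stale_after` threshold are hypothetical, and the behaviour reported above (a node that is down via membership but up via disk still being shot) is not what this model predicts.

```python
def should_fence(last_disk_update: float, now: float,
                 stale_after: float, membership_up: bool) -> bool:
    """Fence only when the disk heartbeat is stale AND membership reports down."""
    disk_down = (now - last_disk_update) > stale_after
    return disk_down and not membership_up


# Peer stopped updating the shared partition and membership says it is down.
print(should_fence(last_disk_update=100.0, now=130.0,
                   stale_after=10.0, membership_up=False))  # True
# Peer is unreachable over the network but still updating the disk.
print(should_fence(last_disk_update=128.0, now=130.0,
                   stale_after=10.0, membership_up=False))  # False
```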
