
Re: [Linux-cluster] Re: CS5 two-nodes with quorum disk

On Tue, 2007-12-11 at 09:38 +0100, Alain Moulle wrote:
> Hi Lon
> Thanks for your information about votes values with quorumd.
> Another question about my tests :
> Now I have the quorum disk working correctly, and so I wanted
> to do this test : ifdown on the heart beat interface, to simulate
> a heart beat network breakdown. I expected the cluster NOT to failover
> because of quorum disk always available, but in fact after the
> 21s the node where I've stopped the if eth has been fenced despite
> the quorumdisk ...
> Where is my misunderstanding ?

QDisk provides additional votes based on user-defined heuristics (or
with no heuristics at all, depending on the configuration).  The
combination of heuristics and votes can be used to:

* prevent even-split fence races in the event of a network partition -
one cluster partition can, given well-defined heuristics, decide it is
unfit for cluster participation (and usually remove itself), while the
other remains "fit" and therefore fences the bad partition

* allow a minority partition to become the surviving partition in a split -
similar to the above - given a 4-node cluster, 3 nodes in a majority
partition could decide that they are *all* unfit for cluster
participation and remove themselves - while the 1-node minority
partition continues to operate

* prevent a partition from becoming quorate after being fenced - on
boot, if a node does not meet its heuristic requirements and a master
node exists in the cluster, it cannot become quorate unless it has
communications with the master qdisk node (optionally, you can have
qdisk stop CMAN in this case)

... and possibly other things, but those are the main ones.
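
To make the vote arithmetic behind the first two points concrete, here is a
rough sketch. It is a simplified model, not how CMAN or qdiskd are actually
implemented: it assumes quorum is a simple majority of expected votes, that
each node has 1 vote, and that a partition only gets the qdisk votes when its
heuristics say it is fit. The vote values and the helper names (quorum,
partition_is_quorate) are made up for illustration.

```python
def quorum(expected_votes: int) -> int:
    """Simple majority: more than half of all expected votes."""
    return expected_votes // 2 + 1


def partition_is_quorate(node_votes: int, qdisk_votes: int,
                         fit: bool, expected_votes: int) -> bool:
    """Assumption: a partition counts the qdisk votes only if it is fit."""
    votes = node_votes + (qdisk_votes if fit else 0)
    return votes >= quorum(expected_votes)


# 4 nodes with 1 vote each, plus a quorum disk worth 3 votes (assumed values).
four_node_expected = 4 * 1 + 3  # 7 expected votes, so quorum is 4

# The 3-node majority decides it is unfit and loses the qdisk votes.
majority_unfit = partition_is_quorate(3, 3, fit=False,
                                      expected_votes=four_node_expected)
# The 1-node minority is fit and keeps the qdisk votes.
minority_fit = partition_is_quorate(1, 3, fit=True,
                                    expected_votes=four_node_expected)

# 2 nodes with 1 vote each, plus a quorum disk worth 1 vote (assumed values).
two_node_expected = 2 * 1 + 1  # 3 expected votes, so quorum is 2
lone_fit = partition_is_quorate(1, 1, fit=True,
                                expected_votes=two_node_expected)
lone_unfit = partition_is_quorate(1, 1, fit=False,
                                  expected_votes=two_node_expected)

print(majority_unfit, minority_fit, lone_fit, lone_unfit)
```

Note that this only covers who ends up quorate. It says nothing about
fencing, which still happens through CMAN when the nodes cannot talk to
each other.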

QDisk is not a replacement for cluster communications, nor is it a
replacement for CMAN's membership (in fact, it relies on CMAN's
membership - and fencing - to do its job).

Even if qdiskd told CMAN which nodes were online, much of the internal
network traffic (for example, DLM traffic) cannot be pushed through the
disk in a meaningful way, meaning GFS access would be blocked.

-- Lon
