
[Linux-cluster] Re: CS5 two-nodes with quorum disk

Hi Lon

I've carefully read your last detailed explanation. I have a
better understanding now, but one thing is still not clear to me:
in my two-node cluster (node1/node2), with a quorum disk and without any
heuristics, I would like to be sure that if the heartbeat network fails,
only one node fences the other, and not both. So:
when I run ifdown on node2 on the heartbeat eth interface, what is the
mechanism via the quorum disk that ensures that?
Or how must I configure it to ensure that?
I think everything will be clear to me once I understand this case ...

QDisk provides additional votes based on user-defined heuristics (or
with no heuristics at all, depending on the configuration).  The
combination of the heuristics + votes can be used to:

* prevent even-split fence races in the event of a network partition -
one cluster partition can, given well-defined heuristics, decide it is
unfit for cluster participation (and usually remove itself), while the
other remains "fit" and therefore fences the bad partition

* allow a minority partition to become the surviving partition in a
split - similar to the above - given a 4-node cluster, 3 nodes in a
majority partition could decide that they are *all* unfit for cluster
participation and remove themselves - while the 1-node minority
partition continues to operate

* prevent a partition from becoming quorate after being fenced - on
boot, if a node does not meet its heuristic requirements and a master
node exists in the cluster, it cannot become quorate unless it has
communications with the master qdisk node (optionally, you can have
qdisk stop CMAN in this case)

... and possibly other things, but those are the main ones.
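
The vote arithmetic behind the first two cases can be sketched as
follows. This is only an illustration, not how qdiskd or CMAN are
implemented. It assumes one vote per node, a simple-majority quorum of
expected_votes // 2 + 1, and that the quorum disk's votes are counted
only for a partition that considers itself fit. The function names and
the vote counts (1 qdisk vote for two nodes, 3 qdisk votes for four
nodes) are assumptions chosen for the example.

```python
def quorum_threshold(expected_votes):
    """Simple majority: more than half of the expected votes."""
    return expected_votes // 2 + 1


def partition_votes(node_votes, qdisk_votes, fit):
    """Votes a partition can count: its nodes, plus the quorum disk
    votes only if the partition is fit (assumption of this sketch)."""
    return sum(node_votes) + (qdisk_votes if fit else 0)


def is_quorate(node_votes, qdisk_votes, fit, expected_votes):
    votes = partition_votes(node_votes, qdisk_votes, fit)
    return votes >= quorum_threshold(expected_votes)


# Two-node cluster split in half: 1 vote per node, 1 qdisk vote,
# expected_votes = 3, so 2 votes are needed for quorum.
node1_quorate = is_quorate([1], qdisk_votes=1, fit=True, expected_votes=3)
node2_quorate = is_quorate([1], qdisk_votes=1, fit=False, expected_votes=3)
print(node1_quorate, node2_quorate)  # True False

# Four-node cluster split 3/1: 1 vote per node, 3 qdisk votes,
# expected_votes = 7, so 4 votes are needed for quorum.
majority_quorate = is_quorate([1, 1, 1], qdisk_votes=3, fit=False,
                              expected_votes=7)
minority_quorate = is_quorate([1], qdisk_votes=3, fit=True,
                              expected_votes=7)
print(majority_quorate, minority_quorate)  # False True
```

In both cases only one partition ends up quorate, so only that
partition is in a position to fence the other.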

It's not a replacement for cluster communications nor is it a
replacement for CMAN's membership (in fact, it relies on CMAN's
membership - and fencing - to do its job).

Even if qdiskd told CMAN which nodes were online, much of the internal
network traffic (for example, DLM traffic) cannot be pushed through the
disk in a meaningful way, meaning GFS access would be blocked.

-- Lon

