
Re: [Linux-cluster] Redhat without qdisk



On 04/12/2012 12:04 PM, emmanuel segura wrote:
If you don't use the qdisk, who is the master in a split-brain?

Remember, DLM/rgmanager recovery is performed -after- fencing. In a two node cluster:

with no qdisk (two_node="1"):
- by default, both nodes try to fence each other at the same time :(
+ fencing delay helps administrator predetermine which node wins
  + combine with failover domain rules for a bonus
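
As a rough sketch of the fencing-delay approach for two_node="1", assuming fence agents that accept the standard delay option (the node names, device names, and the 10-second value are hypothetical, and the matching <fencedevices> definitions are omitted):

  <cman two_node="1" expected_votes="1"/>
  <clusternodes>
    <clusternode name="node1" nodeid="1">
      <fence>
        <method name="1">
          <!-- assumed value: wait 10s before fencing node1, so node1 wins a race -->
          <device name="fence-node1" delay="10"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node2" nodeid="2">
      <fence>
        <method name="1">
          <!-- no delay: node2 is fenced immediately -->
          <device name="fence-node2"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>

The delay goes on the device entry used to fence the node you want to survive. If that node is also the preferred member of the failover domain, services stay where they are after the race.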

with qdiskd:
+ (given proper heuristics or master_wins) only one node fences
- with master_wins, no method to predetermine who "wins"
- incorrect configuration is no better than two_node="1": both
  nodes will fence at the same time

Qdiskd works by simply adjusting quorum: Fencing requires quorum, so by taking it away from a host, you can prevent that host from fencing.
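
To illustrate, here is a sketch of a two-node cluster with a heuristic-based qdiskd setup. The ping target, scores, and timing values are assumptions for illustration, not recommendations:

  <!-- 2 node votes + 1 qdisk vote = 3 expected; quorum is 2 -->
  <cman two_node="0" expected_votes="3"/>
  <quorumd label="foo" votes="1" interval="1" tko="10">
    <!-- hypothetical heuristic: the node must be able to reach the router -->
    <heuristic program="ping -c1 -w1 192.0.2.1" score="1" interval="2" tko="3"/>
  </quorumd>

A node whose heuristic fails loses the qdisk vote, is left with 1 vote out of 3, drops below quorum, and therefore cannot fence its peer. As far as I recall, master_wins="1" is meant to be used in place of heuristics rather than alongside them.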

The problem with qdiskd is that the 'configuration' bit has historically been significantly harder than it needed to be (hindsight...).

In STABLE32, this has been alleviated quite a lot, though - it pretty much configures itself. For two node clusters using STABLE32:

  <quorumd label="foo" />


fence_scsi in Red Hat 5.X doesn't reboot the node, so the fenced node never
releases the resources

More or less.


Why did Red Hat make the qdisk a tie-breaker, when some people from support
say it is optional and others say it is not needed?

It is optional and is often not needed. It was developed really for two purposes:

- to help resolve fencing races (which can be resolved using delays or other tactics)

- to allow 'last-man-standing' in >2-node clusters.

With qdiskd you can go from 4 to 1 node (given properly configured heuristics). The other 3 nodes then, because heuristics fail, can't "gang up" (by forming a quorum) on the surviving node and take over - this means your critical service stays running and available. The problem is that, in practice, the "last node" is rarely able to handle the workload.
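
The vote arithmetic behind this, sketched for a 4-node cluster with one vote per node. It assumes the usual convention of giving the quorum disk (number of nodes - 1) votes; the heuristic and timing values are hypothetical:

  <!-- 4 node votes + 3 qdisk votes = 7 expected; quorum is 4 -->
  <cman expected_votes="7"/>
  <quorumd label="foo" votes="3" interval="1" tko="10">
    <!-- hypothetical heuristic: nodes failing it lose the qdisk votes -->
    <heuristic program="ping -c1 -w1 192.0.2.1" score="1" interval="2" tko="3"/>
  </quorumd>

The one surviving node plus the quorum disk holds 1 + 3 = 4 votes and stays quorate. The other three nodes together hold only 3 votes without the disk, so they cannot form a quorum of their own.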

This behavior is obviated by features in corosync 2.0, which gives administrators the ability to state that a -new- quorum can only form if all members are present (but joining an existing quorum is always allowed).

-- Lon

