[Linux-cluster] Redhat without qdisk

Lon Hohberger lhh at redhat.com
Thu Apr 12 16:31:44 UTC 2012


On 04/12/2012 12:04 PM, emmanuel segura wrote:
> If we don't use the qdisk, who is the master in a split-brain?

Remember, DLM/rgmanager recovery is performed -after- fencing.  In a 
two-node cluster:

with no qdisk (two_node="1"):
- by default, both nodes try to fence each other at the same time :(
+ a fencing delay lets the administrator predetermine which node wins
  (see the sketch below)
   + combine it with failover domain rules for a bonus
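
For example, the delay approach might look roughly like this in
cluster.conf (device names, ports and the 10-second value are just
placeholders, and it assumes the fence agent in use understands the
common "delay" parameter):

   <clusternode name="node1" nodeid="1">
     <fence>
       <method name="1">
         <!-- node1 is fenced immediately, so it loses a fence race -->
         <device name="pdu1" port="1"/>
       </method>
     </fence>
   </clusternode>
   <clusternode name="node2" nodeid="2">
     <fence>
       <method name="1">
         <!-- fencing node2 is delayed 10s, so node2 wins a fence race -->
         <device name="pdu1" port="2" delay="10"/>
       </method>
     </fence>
   </clusternode>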

with qdiskd:
+ (given proper heuristics or master_wins) only one node fences
  (see the quorumd sketch below)
- with master_wins, there is no way to predetermine which node "wins"
- an incorrect configuration is no better than two_node="1": both
   nodes will fence at the same time
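
A rough sketch of the qdiskd side (the label, timings and ping target are
placeholders; use either a heuristic or master_wins, as appropriate):

   <quorumd label="qdisk" interval="1" tko="10" votes="1">
     <heuristic program="ping -c1 -w1 192.168.1.254" score="1"
                interval="2" tko="3"/>
   </quorumd>

or, without heuristics:

   <quorumd label="qdisk" master_wins="1"/>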

Qdiskd works by simply adjusting quorum: fencing requires quorum, so by 
taking quorum away from a host, you can prevent that host from fencing.
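
As a worked example of the vote math in a two-node cluster (assuming the
usual one vote per node and one vote on the quorum disk, i.e.
expected_votes="3"):

   2 node votes + 1 qdisk vote = 3 expected votes -> quorum = 2 votes

   node holding the qdisk vote:  1 + 1 = 2 votes -> quorate, may fence
   its partner:                  1 vote          -> inquorate, cannot fence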

The problem with qdiskd is that the 'configuration' bit has historically 
been significantly harder than it needed to be (hindsight...).

In STABLE32, this has been alleviated quite a lot, though - it pretty 
much configures itself.  For two-node clusters using STABLE32:

   <quorumd label="foo" />


> fence_scsi in Red Hat 5.x doesn't reboot the node; the fenced node never
> releases the resources

More or less.


> Why did Red Hat make the qdisk a tie-breaker, and why do some people from
> support say it's optional or sometimes say it's not needed?

It is optional and is often not needed.  It was really developed for two 
purposes:

- to help resolve fencing races (which can be resolved using delays or 
other tactics)

- to allow 'last-man-standing' in >2-node clusters.

With qdiskd you can go from 4 nodes to 1 (given properly configured 
heuristics).  The other 3 nodes, because their heuristics fail, can't 
"gang up" (by forming a quorum) on the surviving node and take over - 
this means your critical service stays running and available (the vote 
math is sketched below).  The problem is that, in practice, the "last 
node" is rarely able to handle the workload.
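
A rough sketch of the vote math for that 4-to-1 case (assuming the common
recipe of giving qdiskd N-1 votes, i.e. 3 here):

   4 node votes + 3 qdisk votes = 7 expected votes -> quorum = 4 votes

   last node holding the qdisk:            1 + 3 = 4 votes -> quorate
   the other 3 nodes (heuristics failed):  3 votes         -> inquorate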

This behavior is obviated by features in corosync 2.0, which give 
administrators the ability to state that a -new- quorum can only form if 
all members are present (but joining an existing quorum is always allowed).
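
That is the votequorum wait_for_all option; a minimal corosync.conf
sketch (corosync 2.x, everything else omitted) would look something like:

   quorum {
       provider: corosync_votequorum
       # the cluster becomes quorate for the first time only after all
       # configured nodes have been seen at the same time
       wait_for_all: 1
   }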

-- Lon



