[Linux-cluster] GFS+DRBD+Quorum: Help wrap my brain around this

Colin Simpson Colin.Simpson at iongeo.com
Wed Nov 17 21:02:35 UTC 2010


You are right so far in your first paragraph.

You cannot totally solve the quorum problem with a two node cluster. The
basic issue you are really trying to address is avoiding a split brain
scenario; that is really all quorum is giving you.

So with DRBD your best bet is to do your level best to avoid a split
brain between your two nodes: use decent fencing (maybe multiple fence
methods), have redundant bonded network links and interlinks (I'm
looking at splitting these over two physical cards on the nodes), set up
DRBD's startup waiting appropriately, and be careful at startup (see the
scenario below).
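
For the bonded interconnect, something like the below is roughly what I
have in mind on RHEL-style boxes. This is only a sketch: the interface
names, IP address and bond mode are made-up examples, adjust to suit
your hardware.

  # /etc/sysconfig/network-scripts/ifcfg-bond0  (cluster interconnect)
  DEVICE=bond0
  IPADDR=192.168.100.1
  NETMASK=255.255.255.0
  ONBOOT=yes
  BOOTPROTO=none
  # mode=1 (active-backup), link checked every 100 ms
  BONDING_OPTS="mode=1 miimon=100"

  # /etc/sysconfig/network-scripts/ifcfg-eth2  (slave on first physical card)
  DEVICE=eth2
  MASTER=bond0
  SLAVE=yes
  ONBOOT=yes
  BOOTPROTO=none

  # /etc/sysconfig/network-scripts/ifcfg-eth3  (slave on second physical card)
  DEVICE=eth3
  MASTER=bond0
  SLAVE=yes
  ONBOOT=yes
  BOOTPROTO=none

Splitting the slaves over two cards means the interconnect survives the
loss of either NIC.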

Then just tell RHCS that you want to run with two nodes in cluster.conf,
e.g.

<cman expected_votes="1" two_node="1"/>
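
In context, a minimal two-node cluster.conf looks roughly like the
sketch below. The cluster name, node names and the IPMI fence devices
are just placeholders for illustration; your fencing agent and its
parameters will differ.

  <?xml version="1.0"?>
  <cluster name="gfscluster" config_version="1">
    <!-- two_node="1" with expected_votes="1" lets one surviving node stay quorate -->
    <cman expected_votes="1" two_node="1"/>
    <clusternodes>
      <clusternode name="node1.example.com" nodeid="1">
        <fence>
          <method name="1">
            <device name="ipmi-node1"/>
          </method>
        </fence>
      </clusternode>
      <clusternode name="node2.example.com" nodeid="2">
        <fence>
          <method name="1">
            <device name="ipmi-node2"/>
          </method>
        </fence>
      </clusternode>
    </clusternodes>
    <fencedevices>
      <fencedevice name="ipmi-node1" agent="fence_ipmilan" ipaddr="10.0.0.1" login="admin" passwd="secret"/>
      <fencedevice name="ipmi-node2" agent="fence_ipmilan" ipaddr="10.0.0.2" login="admin" passwd="secret"/>
    </fencedevices>
  </cluster>

The important part for the two-node case is the cman line; everything
else is the usual node and fencing definitions.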

And in drbd.conf I have,

  startup {
  	wfc-timeout  300;       # Wait 300 seconds for the initial connection
  	degr-wfc-timeout 60;    # Wait only 60 seconds if this node was in a degraded cluster
  	become-primary-on both;
  }
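
For completeness, running GFS on both nodes also needs DRBD in
dual-primary mode, which is set in the net section. The resource below
is a rough sketch of how the whole thing hangs together; the resource
name, hostnames, devices, disks and addresses are placeholders, and the
after-split-brain policies are just one reasonable choice, not the only
one.

  resource r0 {
  	protocol C;                            # synchronous replication

  	startup {
  		wfc-timeout  300;
  		degr-wfc-timeout 60;
  		become-primary-on both;
  	}

  	net {
  		allow-two-primaries;               # required for GFS mounted on both nodes
  		after-sb-0pri discard-zero-changes;
  		after-sb-1pri discard-secondary;
  		after-sb-2pri disconnect;
  	}

  	on node1.example.com {
  		device    /dev/drbd0;
  		disk      /dev/sdb1;
  		address   192.168.100.1:7788;
  		meta-disk internal;
  	}
  	on node2.example.com {
  		device    /dev/drbd0;
  		disk      /dev/sdb1;
  		address   192.168.100.2:7788;
  		meta-disk internal;
  	}
  }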

Many may prefer the system to wait indefinitely in DRBD on some of
these conditions (so they can bring things up manually in a bad
situation). Basically, here I will wait 5 minutes for the other node to
join my DRBD before doing any cluster stuff, but wait less (60s) if I
was already degraded (I'm assuming the other node is probably broken for
an extended period in that case, so I want my surviving server up pretty
quickly). I'm still thinking this through just now.

On a two node non-shared storage setup you can never fully guard against
the scenario where node A is shut down, node B is shut down later, and
then node A is brought back up with no way of knowing that it has older
data than B while B is still down. You can mitigate this, though, by
making sure DRBD waits long enough (or forever) on boot, and/or by being
careful to start things up in the right order after long periods of
downtime on one node (the good node needs to be up already). It just
needs a bit of scenario thought.
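
When in doubt at boot time, DRBD's own status tools are the quickest
sanity check before you start any cluster services. The resource name r0
below is just the placeholder from the sketch above:

  cat /proc/drbd        # overall connection state, roles and sync status
  drbdadm cstate r0     # connection state (e.g. Connected, WFConnection, StandAlone)
  drbdadm dstate r0     # disk state, local/peer (e.g. UpToDate/UpToDate, UpToDate/Outdated)
  drbdadm role r0       # role, local/peer (e.g. Primary/Primary)

If the disk state shows anything other than UpToDate on the node you are
about to bring up, that is the cue to stop and think before going
Primary.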

From what you are saying, a third node just adds needless complexity.

Those are my thoughts on this; I'm pretty new to this too, and this is
just how I think it should work at the moment.

Colin


On Wed, 2010-11-17 at 15:22 -0500, Andrew Gideon wrote:
> I'm trying to figure out the best solution for GFS+DRBD.  My mental
> block isn't really with GFS, though, but with clustered LVM (I think).
> 
> I understand the quorum problem with a two-node cluster.  And I
> understand that DRBD is not suitable for use as a quorum disk
> (presumably because it too would suffer from any partitioning, unlike a
> physical array connected directly to both nodes).
> 
> Am I right so far?
> 
> What I'd really like to do is have a three (or more) node cluster with
> two nodes having access to the DRBD storage.  This solves the quorum
> problem (effectively having the third node as a quorum server).
> 
> But when I try to create a volume on a volume group on a device shared
> by two nodes of a three node cluster, I get an error indicating that the
> volume group cannot be found on the third node.  Which is true: the
> shared volume isn't available on that node.
> 
> In the Cluster Logical Volume Manager document, I found:
> 
>         By default, logical volumes created with CLVM on shared storage
>         are visible to all computers that have access to the shared
>         storage. 
>         
> What I've not figured out is how to tell CLVMD (or whomever) that only
> nodes one and two have access to the shared storage.  Is there a way to
> do this? 
> 
> I've also read, in the GFS2 Overview document:
> 
>         When you configure a GFS2 file system as a cluster file system,
>         you must ensure that all nodes in the cluster have access to the
>         shared storage
> 
> This suggests that a cluster running GFS must have access to the storage
> on all nodes.  Which would clearly block my idea for a three node
> cluster with only two nodes having access to the shared storage.
> 
> I do have one idea, but it sounds like a more complex version of a Rube
> Goldberg device: A two node cluster with a third machine providing
> access to a device via iSCSI.  The LUN exported from that third system
> could be used as the quorum disk by the two cluster nodes (effectively
> making that little iSCSI target the quorum server).
> 
> This assumes that a failure of the quorum disk in an otherwise healthy
> two node cluster is survived.  I've yet to confirm this.
> 
> This seems ridiculously complex, so much so that I cannot imagine that
> there's not a better solution.  But I just cannot get my brain wrapped
> around this well enough to see it.
> 
> Any suggestions would be very welcome.
> 
> Thanks...
> 
> 	Andrew
> 
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
