
Re: [Linux-cluster] DRBD with GFS applicable for this scenario?

Zaeem Arshad wrote:
Hi List,

We have 2 geographically distant sites located approximately 35km
apart with dark fiber connectivity available between them. Mail01 and
SAN1 are placed at site A while Mail02 and SAN2 are at site B. Our
requirement is to have the mail servers in a cluster configuration in
an active/active mode. To cater for the loss of connectivity or losing
a SAN itself, I have come up with the following design.

What's the point of having a SAN if you're using DRBD? You might as well have DAS in each of the two mail servers. Unless you need so much storage space that you can't put enough disks directly into the server...

1) Export 1 block device from each SAN to its mail server i.e. SAN1
exports to Mail01
2) Use DRBD to configure a block device comprising the 2 SAN
volumes and use it as a physical volume in clvm.

The CLVM bit isn't relevant per se; you don't strictly need it, but it won't hurt.

3) Create a GFS logical volume from this PV that can be used by both servers.

That's fine.

I am wondering if this is a correct design as theoretically it looks
to address both node and SAN failure or connectivity loss.

The problem you have is that you have no way of enacting fencing if the connectivity between the sites fails. If a node fails, any cluster file system (GFS included) will mandate a fencing action to ensure that one of the nodes gets taken down and stays down. If you have lost cross-site connectivity, the nodes won't be able to fence each other, and GFS will simply block until connectivity is restored and fencing succeeds. The chances are that when this happens, it'll also cause a fencing shoot-out and both nodes may well end up getting fenced.

You could use some kind of cheat-fencing, say, by setting a firewall rule that will prevent the nodes from re-connecting (you'd need to write your own fencing agent, but that's not particularly difficult), but then you would be pretty much guaranteeing a split-brain situation, where the nodes would end up operating independently without any hope of ever re-synchronising.
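A rough sketch of what such a firewall-based agent could look like, with the same split-brain caveat as above. It assumes the agent is handed its options as name=value lines on standard input; the option name "ipaddr" and both function names are made up for illustration. It only prints the iptables commands (a dry run) rather than executing them:

```python
import io
import sys


def parse_agent_options(stream):
    """Parse name=value lines into a dict, skipping blanks and comments."""
    opts = {}
    for line in stream:
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        opts[name.strip()] = value.strip()
    return opts


def build_block_commands(peer_ip):
    """Return iptables argument lists that drop traffic to and from the peer."""
    return [
        ["iptables", "-I", "INPUT", "-s", peer_ip, "-j", "DROP"],
        ["iptables", "-I", "OUTPUT", "-d", peer_ip, "-j", "DROP"],
    ]


def main(stream):
    """Dry run: print the commands; return 0 on success, 1 on missing option."""
    opts = parse_agent_options(stream)
    peer = opts.get("ipaddr")  # hypothetical option name
    if not peer:
        return 1
    for cmd in build_block_commands(peer):
        print(" ".join(cmd))
    return 0


if __name__ == "__main__":
    sys.exit(main(sys.stdin))
```

A real agent would run the commands (for example with subprocess.run) and report failure with a non-zero exit code. Note that this only isolates the peer; it does not power it off, which is exactly why it invites split-brain.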

The bottom line is that you need a reliable out-of-band fencing mechanism. If you have GSM/wireless signal in both areas, you could rig up a separate, small fencing "server" on each site with a GSM modem, and write a fencing agent that sends a fencing request by SMS. When the fencing server receives a fencing request, it would issue a local fencing action using one of the more standard fencing agents. Note that, because of the high latency of things like SMS, you'd need to implement accurate time stamping and deliberately semi-randomize the delay before each fencing request is sent. The fencing servers could then compare time stamps and sensibly decide whether to obey the local fencing request or the remote one.
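The time stamp comparison could be sketched roughly as below. This is only an illustration of the decision logic, not a working SMS agent. It assumes the clocks on both sites are synchronised, and all the names (FenceRequest, send_delay, should_fence_local) and the delay values are invented for the example:

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class FenceRequest:
    origin: str     # site that issued the request (hypothetical label)
    target: str     # node the request wants fenced
    sent_at: float  # sender's time stamp in seconds; assumes synced clocks


def send_delay(base=5.0, jitter=10.0, rng=random):
    """Semi-random wait (seconds) before sending, to make ties unlikely."""
    return base + rng.uniform(0.0, jitter)


def should_fence_local(local, remote):
    """Decide on one site's fencing server whether to fence its own node.

    local  -- request issued by this site's node (or None)
    remote -- request received from the other site (or None)
    """
    if remote is None:
        return False  # nobody asked for our node to be fenced
    if local is None:
        return True   # only the remote side asked; obey it
    if remote.sent_at != local.sent_at:
        return remote.sent_at < local.sent_at  # earlier request wins
    # Exact tie: break it deterministically so exactly one node survives.
    return remote.origin < local.origin


if __name__ == "__main__":
    a = FenceRequest(origin="siteA", target="Mail02", sent_at=100.0)
    b = FenceRequest(origin="siteB", target="Mail01", sent_at=103.0)
    print("site A fences Mail01:", should_fence_local(local=a, remote=b))
    print("site B fences Mail02:", should_fence_local(local=b, remote=a))
```

Because both fencing servers apply the same rule to the same pair of requests, they reach opposite conclusions, so one node is fenced and the other keeps running instead of both being shot.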

You have to get a little creative about it and write a few lines of code to glue it together. I've been meaning to implement something like this for a while, but I haven't gotten around to it yet.

