
Re: [Linux-cluster] DRBD with GFS applicable for this scenario?

Zaeem Arshad wrote:

We have 2 geographically distant sites located approximately 35km
apart with dark fiber connectivity available between them. Mail01 and
SAN1 are placed at site A while Mail02 and SAN2 are at site B. Our
requirement is to have the mail servers in a cluster configuration in
an active/active mode. To cater for the loss of connectivity or losing
a SAN itself, I have come up with the following design.

What's the point of having a SAN if you're using DRBD? You might as well
have DAS in each of the two mail servers. Unless you need so much storage
space that you can't put enough disks directly into the server...

We have already bought the SAN, that's why. We do expect our storage
needs to be on the higher side. Also, I am sharing storage on both
SANs as theoretically DRBD will use the local resource first for
read/write and fail over to the IP-reachable disk volume.

Hmm... What sort of ping time do you get? I presume you have established that it is on the sensible side.

In terms of performance you will need to make sure that the machines tend to access only their own sub-paths on the file system (e.g. spool/1 and spool/2, and server 1 doesn't touch spool/2 until server 2 goes down). Otherwise the performance is going to be atrocious, since file locks will end up bouncing between the machines. These normally live in cache on a conventional file system, so if they have to be exchanged on most accesses you are looking at a latency degradation from ~50ns to some milliseconds. Even if your connectivity is VERY good, at 35km I would be surprised if your latencies are better than 10ms, which you'll feel even against disk latency, let alone memory latency - we are talking 200,000x slower in the best case scenario.
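
To make the arithmetic behind that figure explicit, here is a small sketch that only re-derives the ratio from the two numbers quoted above (~50ns for a cached lock, 10ms for a cross-site exchange). Both inputs are the rough estimates from this post, not measurements, and the function name is just for illustration.

```python
# Rough estimates taken from the post above, not measured values.
LOCAL_LOCK_NS = 50     # lock state served from local cache, ~50 ns
REMOTE_LOCK_MS = 10    # lock state exchanged across the inter-site link, ~10 ms

NS_PER_MS = 1_000_000


def slowdown(local_ns: float, remote_ms: float) -> float:
    """Return how many times slower a remote lock exchange is than a cached one."""
    return (remote_ms * NS_PER_MS) / local_ns


if __name__ == "__main__":
    print(f"{slowdown(LOCAL_LOCK_NS, REMOTE_LOCK_MS):,.0f}x slower")
```

Plugging in your own measured ping time instead of 10ms gives a quick feel for how badly contended locks would hurt on your link.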

1) Export 1 block device from each SAN to its mail server i.e. SAN1
exports to Mail01
2) Use DRBD to configure a block device comprising the 2 SAN
volumes and use it as a physical volume in clvm.

The CLVM bit isn't relevant per se, you don't strictly need it, but it
won't hurt.

3) Create a GFS logical volume from this PV that can be used by both

That's fine.

I am wondering if this is a correct design as theoretically it looks
to address both node and SAN failure or connectivity loss.

The problem you have is that you have no way of enacting fencing if the
connectivity between the sites fails. If a node fails, any cluster file
system (GFS included) will mandate a fencing action to ensure that one of
the nodes gets taken down and stays down. If you have lost cross-site
connectivity, the nodes won't be able to fence each other, and GFS will
simply block until connectivity is restored and fencing succeeds. The
chances are that when this happens, it'll also cause a fencing shoot-out and
both nodes may well end up getting fenced.
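
As a toy illustration of the behaviour described above (not GFS code, and all names here are made up for the example), the decision a surviving node is stuck with looks roughly like this: it keeps running while the peer responds, and once the peer stops responding it can only resume after fencing has succeeded.

```python
from enum import Enum


class Outcome(Enum):
    RUNNING = "running normally"
    RESUMED = "peer fenced, file system resumes"
    BLOCKED = "blocked until fencing succeeds"


def filesystem_outcome(peer_responding: bool, fence_succeeded: bool) -> Outcome:
    """Toy model of what a cluster file system does from one node's point of view."""
    if peer_responding:
        return Outcome.RUNNING
    # The peer is presumed dead, so it must be fenced before I/O may continue.
    if fence_succeeded:
        return Outcome.RESUMED
    return Outcome.BLOCKED


if __name__ == "__main__":
    # Cross-site link lost: the peer is unreachable AND the fence request,
    # which travels over the same link, cannot get through either.
    print(filesystem_outcome(peer_responding=False, fence_succeeded=False).value)
```

The point of the last case is that when the fencing path shares the inter-site link, a link failure takes out both the peer and the means of fencing it.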

You could use some kind of cheat-fencing, say, by setting a firewall rule
that will prevent the nodes from re-connecting (you'd need to write your own
fencing agent, but that's not particularly difficult), but then you would be
pretty much guaranteeing a split-brain situation, where the nodes would end
up operating independently without any hope of ever re-synchronising.
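
For what it's worth, the core of such a cheat-fencing agent could be sketched as below. This is only an outline under assumptions: I am assuming the agent is handed its options as name=value lines on stdin, the option name `ipaddr` is hypothetical, and the sketch only builds the iptables commands rather than running them. It also does nothing about the split-brain problem just described.

```python
import ipaddress


def parse_options(text: str) -> dict:
    """Parse name=value lines, skipping blank lines and # comments."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        options[name.strip()] = value.strip()
    return options


def build_block_commands(peer_ip: str) -> list:
    """Return iptables commands that drop all traffic to and from peer_ip."""
    # Raises ValueError if peer_ip is not a valid IP address.
    ipaddress.ip_address(peer_ip)
    return [
        ["iptables", "-I", "INPUT", "-s", peer_ip, "-j", "DROP"],
        ["iptables", "-I", "OUTPUT", "-d", peer_ip, "-j", "DROP"],
    ]


if __name__ == "__main__":
    # "ipaddr" is a hypothetical option name used for this example only.
    opts = parse_options("ipaddr=192.0.2.12\n")
    for cmd in build_block_commands(opts["ipaddr"]):
        # A real agent would run each command as root, e.g. with
        # subprocess.run(cmd, check=True), and exit 0 only on success.
        print(" ".join(cmd))
```

Validating the address before building the commands keeps a malformed option from ending up on an iptables command line.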

In such a case where we lose site connectivity altogether, I'd like
the Site 2 servers to shut themselves down to avoid a split-brain
condition. Since I am implementing clustering, won't the quorum
server take care of this issue?

So you propose to have a quorum disk on site 2? OK, that works. The problem is that fencing works by one server fencing another, not itself. So you'll still need a reliable OOB fencing mechanism such as the one I described.
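
To illustrate the quorum side of this, here is a minimal sketch assuming a plain majority rule (a partition is quorate when it holds more than half of the expected votes) and one vote each for the two mail servers and the quorum disk. The vote assignment and the names are assumptions for the example, not taken from any actual configuration.

```python
# Assumed vote assignment: one vote per mail server plus one for the quorum disk.
VOTES = {"mail01": 1, "mail02": 1, "qdisk": 1}
EXPECTED_VOTES = sum(VOTES.values())


def partition_votes(members) -> int:
    """Total votes held by the members a partition can still see."""
    return sum(VOTES[m] for m in members)


def has_quorum(votes_held: int, expected_votes: int = EXPECTED_VOTES) -> bool:
    """Simple majority rule: strictly more than half of the expected votes."""
    return 2 * votes_held > expected_votes


if __name__ == "__main__":
    # After a cross-site split, only one side can still reach the quorum disk.
    with_qdisk = partition_votes(["mail02", "qdisk"])
    without_qdisk = partition_votes(["mail01"])
    print(has_quorum(with_qdisk), has_quorum(without_qdisk))
```

Quorum only decides which side is allowed to carry on. The quorate side still has to fence the other node before the file system will resume, which is why the fencing mechanism remains necessary.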

