
[Linux-cluster] Re: Fencing question in geo cluster (dual sites clustering)


No comments on this, RHCS gurus? Am I trying to set up something (a multisite cluster) that will never be supported?

Or is the qdiskd reboot action considered sufficient? (The reboot action should be a dirty power reset, so that no data gets synced.)

If so, all I/O on the wrong nodes (at the isolated site) should be frozen until quorum is eventually regained. If not, it will end up with a (dirty) reboot.


2009/8/21 brem belguebli <brem belguebli gmail com>
I'm trying to find out which fencing solution would best fit a dual-site cluster.
The cluster is equally sized on each site (2 nodes/site), and each site hosts a SAN array, so that each node at either site can see both arrays.
The quorum disk (an iSCSI LUN) is hosted at a third site.
SAN and LAN use the same telco infrastructure (2 redundant DWDM loops).
If something happens at the telco level (both DWDM loops are broken) that leaves 1 of the 2 sites completely isolated from the rest of the world,
the nodes at the good site (the one still operational) won't be able to fence any node at the wrong site (the one that is isolated), as there is no way for them to reach the iLOs there or to do any SAN fencing, since the switches at the wrong site are no longer reachable.
As the quorum disk is not reachable from the wrong nodes, they end up being rebooted by qdiskd, but there is a short time (a few seconds) during which the wrong nodes still see their local SAN array storage and may potentially write data to it.
Any ideas or comments on how to ensure data integrity in such a setup?
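
To make the scenario above concrete, here is a rough Python sketch of the vote arithmetic and of the exposure window. It is only an illustration under stated assumptions, not a description of the actual configuration: one vote per node, one vote for the quorum disk, the usual majority rule (expected_votes // 2 + 1), and an eviction delay of roughly interval * tko seconds are all assumed, and the numeric values are hypothetical.

```python
# Sketch only: votes, interval and tko below are assumed values,
# not taken from the cluster described in this thread.

def quorum_threshold(expected_votes: int) -> int:
    """Majority rule: more than half of the expected votes."""
    return expected_votes // 2 + 1


def partition_is_quorate(node_votes: int, qdisk_votes: int,
                         sees_qdisk: bool, expected_votes: int) -> bool:
    """A partition counts the quorum disk votes only if it can reach the disk."""
    votes = node_votes + (qdisk_votes if sees_qdisk else 0)
    return votes >= quorum_threshold(expected_votes)


def exposure_window_seconds(interval_s: int, tko: int) -> int:
    """Approximate time an isolated node keeps running before qdiskd reboots it."""
    return interval_s * tko


NODES_PER_SITE = 2   # from the description: 2 nodes per site, 1 vote each (assumed)
QDISK_VOTES = 1      # assumption
EXPECTED = 2 * NODES_PER_SITE + QDISK_VOTES

good_site = partition_is_quorate(NODES_PER_SITE, QDISK_VOTES, True, EXPECTED)
isolated_site = partition_is_quorate(NODES_PER_SITE, QDISK_VOTES, False, EXPECTED)
window = exposure_window_seconds(interval_s=1, tko=10)  # hypothetical settings

print(f"good site quorate: {good_site}")
print(f"isolated site quorate: {isolated_site}")
print(f"possible write window on the isolated site: ~{window}s")
```

With these assumed numbers the good site keeps quorum and the isolated site loses it, but the isolated nodes can still reach their local array for the length of the window, which is the interval the question is about.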
