On Tue, 17 Apr 2007, Bryn M. Reeves wrote:
Assuming a dedicated VLAN which serves only AoE and GFS traffic among
the Coraid boxes and the GFS hosts, is there any need for fencing at all?
What are the hidden traps behind such setups?
If you are using GFS on shared storage then you need fencing. Period.
The only way you can guarantee data integrity in this scenario is by
completely cutting a failed or misbehaving node off from the storage,
either by power cycling it or by having the storage reject its access.
Otherwise, imagine a situation where a node hangs for some reason and is
ejected from the cluster. At this point none of its locks for the shared
data are valid anymore. Some time later, the node recovers from the hang
and begins flushing writes to the storage -> corruption.
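
To make the sequence above concrete, here is a toy simulation in Python. It
is only a sketch, not how GFS or any fence agent is actually implemented:
the SharedStorage class, the FencedError exception, the run_scenario
function and the node names are all made up for illustration. It models
fencing of the "storage rejects access" kind, where the storage refuses
writes from a node once the cluster has ejected it.

```python
class FencedError(Exception):
    """Raised when the (hypothetical) storage rejects a fenced node's write."""


class SharedStorage:
    """Toy model of shared block storage with optional storage-side fencing."""

    def __init__(self, fencing_enabled):
        self.fencing_enabled = fencing_enabled
        self.blocks = {}     # block number -> data
        self.fenced = set()  # names of nodes cut off from the storage

    def fence(self, node_name):
        # Without fencing, ejecting a node from the cluster has no effect
        # on what that node can still do to the storage.
        if self.fencing_enabled:
            self.fenced.add(node_name)

    def write(self, node_name, block, data):
        if node_name in self.fenced:
            raise FencedError(f"{node_name} is fenced")
        self.blocks[block] = data


def run_scenario(fencing_enabled):
    """Replay the hang/eject/recover sequence and return the final block."""
    storage = SharedStorage(fencing_enabled)

    # node_a holds the lock on block 0, buffers a write, then hangs.
    pending = ("node_a", 0, "stale data from node_a")

    # The cluster ejects node_a; its locks are no longer valid.
    storage.fence("node_a")

    # node_b now takes the lock and writes the block.
    storage.write("node_b", 0, "fresh data from node_b")

    # node_a recovers from the hang and flushes its buffered write.
    try:
        storage.write(*pending)
    except FencedError:
        pass  # the storage refused the stale write

    return storage.blocks[0]


if __name__ == "__main__":
    print("without fencing:", run_scenario(False))
    print("with fencing:   ", run_scenario(True))
```

Without fencing, the final content of block 0 is the stale write from
node_a, which silently overwrites what node_b wrote under a valid lock.
With fencing, the stale write is refused and node_b's data survives.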
I see - the sentences above make it much clearer why fencing is needed.
Thank you for the explanation - and for the hint about the possibility of
rejecting access at the storage itself!
Address: KFKI Research Institute for Particle and Nuclear Physics
H-1525 Budapest 114, POB. 49, Hungary
Linux-cluster mailing list