Re: [Linux-cluster] GFS over AOE without fencing?

Kadlecsik Jozsi wrote:
>> If you are using GFS on shared storage then you need fencing. Period.
>> The only way you can guarantee data integrity in this scenario is by
>> completely cutting a failed or misbehaving node off from the storage;
>> either by power cycling it or having the storage reject its access.
>> Otherwise, imagine a situation where a node hangs for some reason and is
>> ejected from the cluster. At this point none of its locks for the shared
>> data are valid anymore. Some time later, the node recovers from the hang
>> and begins flushing writes to the storage -> corruption.
> I see - the sentences above make it much clearer why fencing is needed.
> Thank you for the explanation - and for the hint about the possibility
> of rejecting access at the storage itself!
> Best regards,
> Jozsef

Hi Jozsef,

No problem! To cut off access at the storage, there is a new fence
agent in the cluster CVS called fence_scsi - you can use that if the
storage supports SCSI-3 persistent reservations. I don't know whether
that is the case for AoE, though.

Otherwise, you could use a regular network power switch, or maybe cook
something up with a custom fencing script (I've used hacks with iptables
before for Linux-based iSCSI devices).
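
To give a rough idea of the iptables approach, here is a minimal sketch
in Python. It is not the script I used, just an illustration under some
assumptions: it runs on the Linux box exporting the iSCSI target, the
agent options arrive as key=value lines on stdin (as far as I know that
is how fenced hands options to agents), and the option name "ipaddr" as
well as the function names are made up for the example.

```python
"""Hypothetical fence agent sketch: block a node's iSCSI traffic with iptables."""
import subprocess
import sys

ISCSI_PORT = 3260  # default iSCSI target port


def parse_options(text):
    """Parse key=value lines, skipping blank lines and # comments."""
    opts = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        opts[key.strip()] = value.strip()
    return opts


def build_block_cmd(node_ip, port=ISCSI_PORT):
    """Return the iptables argv that drops the node's TCP traffic to the port."""
    # -I inserts at the top of INPUT so the rule wins over later ACCEPT rules.
    return ["iptables", "-I", "INPUT", "-s", node_ip,
            "-p", "tcp", "--dport", str(port), "-j", "DROP"]


def fence(text, run=subprocess.call):
    """Fence the node named by the (assumed) 'ipaddr' option.

    Returns 0 on success and non-zero on failure, so the caller can tell
    whether the node was really cut off.
    """
    opts = parse_options(text)
    node_ip = opts.get("ipaddr")  # assumption: option name chosen for this sketch
    if not node_ip:
        sys.stderr.write("fence sketch: missing ipaddr option\n")
        return 1
    return run(build_block_cmd(node_ip))
```

A real script would finish with sys.exit(fence(sys.stdin.read())) and
would have to be run as root on the storage box. Unfencing would be the
matching "iptables -D INPUT ..." rule once the node has been rebooted
and has rejoined the cluster cleanly.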

Kind regards,
