[Linux-cluster] fencing: external vs watchdog

Fri Aug 17 07:29:05 UTC 2007

Hi,

I'd like to discuss and collect information about the two different fencing 
approaches.

external fencing: The failed cluster node is disconnected from the storage 
device by another node in the cluster. After a failure is detected, all cluster 
activities are suspended until the IO fencing of the failed node has been 
completed successfully.
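
To make the sequence concrete, here is a rough sketch of the external fencing 
flow as I understand it. It is only a toy model: the Cluster class, the 
fence_agent callback and max_attempts are hypothetical names of mine, not the 
API of any real cluster software.

```python
from typing import Callable, Set


class Cluster:
    """Toy model of external (IO) fencing; all names are hypothetical."""

    def __init__(self, nodes: Set[str], fence_agent: Callable[[str], bool]):
        self.members = set(nodes)
        # fence_agent(node) returns True once storage access is confirmed cut.
        self.fence_agent = fence_agent
        self.suspended = False

    def on_node_failure(self, node: str, max_attempts: int = 3) -> bool:
        # Suspend all cluster activity as soon as the failure is detected.
        self.suspended = True
        for _ in range(max_attempts):
            if self.fence_agent(node):
                # Fencing confirmed: the node can no longer write to storage.
                self.members.discard(node)
                self.suspended = False
                return True
        # Fencing not confirmed: stay suspended rather than risk corruption.
        return False
```

The point of the sketch is that recovery never proceeds on an assumption: the 
cluster resumes only after the fence operation reports success.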

watchdog fencing: A failed cluster node has to recognize the failure by itself 
and is then shut down by some kind of internal watchdog feature.
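
Again as a rough sketch only, assuming a simple timer-based watchdog: the node 
resets ("pets") the timer while it considers itself healthy and is shut down 
once the timer runs out. Watchdog, pet, expired and node_tick are hypothetical 
names of mine, and time is passed in explicitly to keep the example 
deterministic.

```python
class Watchdog:
    """Toy watchdog timer; all names are hypothetical."""

    def __init__(self, timeout: float, now: float = 0.0):
        self.timeout = timeout
        self.last_pet = now

    def pet(self, now: float) -> None:
        # Called by the node while it believes it is healthy.
        self.last_pet = now

    def expired(self, now: float) -> bool:
        # True once the timer has not been reset for longer than the timeout.
        return now - self.last_pet > self.timeout


def node_tick(wd: Watchdog, healthy: bool, now: float) -> str:
    # The node itself decides whether it is healthy; no other node is involved.
    if healthy:
        wd.pet(now)
    return "shutdown" if wd.expired(now) else "running"
```

Note that the other nodes get no confirmation here; they can only assume that 
the failed node has gone down once the timeout has passed.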

Now, I see that theoretically the external fencing method (when configured 
correctly) is the better approach because the cluster is in an exactly defined 
state during a fencing and recovery operation.

But the question is: What are real-world examples of failures where 
watchdog fencing would fail and cause data corruption on the storage device?
I'd like to collect some real-world examples as well as theoretical scenarios.

All comments welcome!

Mark
-- 
Gruss / Regards,

Dipl.-Ing. Mark Hlawatschek
http://www.atix.de/
http://www.open-sharedroot.org/

ATIX Informationstechnologie und Consulting AG