[Linux-cluster] Re: More CS4 fencing fun

Hi Lon,
your mail is "music" to my ears :D

I will try your /sbin/fence_dontcare immediately.

Anyway, I will try to explain myself better, because English is not my main language.

I understand that Cluster Suite is also about protection against multiple failures and about data integrity, but our goal is a 100% NSPOF (no single point of failure) cluster. I don't want to be interrupted on weekends while I play my favourite video game (WoW) just because ONE component broke and the whole cluster hung :-)

Sure, our hardware configuration can also sustain some multi-point failures, but NSPOF is our main goal.

We have almost everything redundant.

Every server has dual power supplies connected to independent power sources, dual NICs, mirrored internal disks with a hot spare, and 2 FC cards connecting to an MSA 1000, which has redundant controllers and redundant power supplies, also connected to independent power sources.

On the MSA 1000 we have a RAID 5 with a hot spare.

We have all these things, and it's really frustrating for us that if the active node's mainboard fails, because of a short circuit, too high a temperature, some vital component failure or whatever, then everything hangs.

About WTI:

In my case the WTI would be useful only in case of multiple failures, for example if both network switches fail, so heartbeat fails and iLO fails too, and with /sbin/fence_dontcare I would get corruption. Is this correct?

I will need an additional NIC in every server to connect to the WTI, but since the WTI has only one Ethernet port I will need a separate hub or switch to connect to it. Or can I connect one server to the Ethernet port and the other one to the serial port? Can I manage it through both the serial and Ethernet ports?


