[Linux-cluster] partly OT: failover <500ms

Fri Sep 2 07:03:33 UTC 2005

Lon Hohberger wrote:
> On Thu, 2005-09-01 at 21:58 +0200, Jure Pečar wrote:
> 
>>Hi all,
>>
>>Sorry if this is somewhat offtopic here ...
>>
>>Our telco is looking into linux HA solutions for their VoIP needs. Their
>>main requirement is that the failover happens in the order of a few 100ms. 
>>
>>Can redhat cluster be tweaked to work reliably with such short time
>>periods? This would mean heartbeat on the level of few ms and status probes
>>on the level of 10ms. Is this even feasible?
> 
> 
> Possibly, I don't think it can do it right now.  A couple of things to
> remember:
> 
> * For such a fast requirement, you'll want a dedicated network for
> cluster traffic and a real-time kernel.
> 
> * Also, "detection and initiation of recovery" is all the cluster
> software can do for you; your application - by itself - may take longer
> than this to recover.
> 
> * It's practically impossible to guarantee completion of I/O fencing in
> this amount of time, so your application must be able to do without, or
> you need to create a new specialized fencing mechanism which is
> guaranteed to complete within a very fast time.
> 
> * I *think* CMAN is currently at the whole-second granularity, so some
> changes would need to be made to give it finer granularity.  This
> shouldn't be difficult (but I'll let the developers of CMAN answer this
> definitively, though... ;) )
> 

All true :) All cman timers are calibrated in seconds. I did run some tests a
while ago with the timers in milliseconds and 100ms timeouts, and it worked
/reasonably/ well. However, without an RT kernel I wouldn't like to put this
into a production system: we've had several instances of the cman kernel thread
(which runs at the top RT priority) being stalled for up to 5 seconds and that
node being fenced. Smaller stalls may be more common, so with timeouts set that
low you may well get nodes fenced for small delays.
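
To make that concrete, here is a minimal sketch of the timeout check itself. It
is not cman code; the function name, the 20ms heartbeat interval and the 150ms
stall are all made up for illustration. It only shows that the same short stall
which goes unnoticed with a whole-second timeout trips a 100ms one.

```python
def missed_deadlines(arrivals_ms, timeout_ms):
    """Return (last_seen_ms, gap_ms) for every gap between consecutive
    heartbeats that is longer than timeout_ms."""
    misses = []
    for prev, cur in zip(arrivals_ms, arrivals_ms[1:]):
        gap = cur - prev
        if gap > timeout_ms:
            # A real cluster manager would start fencing the node here.
            misses.append((prev, gap))
    return misses


# Hypothetical trace: heartbeats every 20 ms, with one 150 ms stall after t=40.
arrivals = [0, 20, 40, 190, 210, 230]

print(missed_deadlines(arrivals, timeout_ms=100))   # [(40, 150)]
print(missed_deadlines(arrivals, timeout_ms=1000))  # []
```

With the 100ms timeout the node would be declared dead even though it recovered
on its own 150ms later; with a 1 second timeout nothing happens.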

To be quite honest, I'm not really sure what causes these stalls. As they
generally happen under heavy I/O load, I assume (possibly wrongly) that they are
related to disk flushes, but someone who knows the VM better may put me right on
this.
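
One rough way to get a feel for such stalls on a given box is to ask for a short
sleep in a loop and time how late each wakeup is, while the I/O load is running.
This is only a hypothetical user-space probe (the function name and the 10ms
interval are my own choices); it does not measure the cman kernel thread, and an
ordinary-priority process will see worse delays than an RT-priority one.

```python
import time


def measure_overruns(interval_s, iterations):
    """Sleep for interval_s repeatedly and return, in milliseconds,
    how much longer than requested each sleep actually took."""
    overruns_ms = []
    for _ in range(iterations):
        start = time.monotonic()
        time.sleep(interval_s)
        elapsed = time.monotonic() - start
        overruns_ms.append((elapsed - interval_s) * 1000.0)
    return overruns_ms


# 20 sleeps of 10 ms each; run this while the machine is under heavy I/O.
overruns = measure_overruns(interval_s=0.010, iterations=20)
print("worst wakeup delay: %.3f ms" % max(overruns))
```

If the worst delay regularly approaches the heartbeat timeout you intend to use,
that timeout is probably too tight for that kernel and load.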

-- 

patrick