Re: [Linux-cluster] CS4 Update 2 / problem with systems dump ?

On Wed, Mar 22, 2006 at 08:59:17AM +0100, Alain Moulle wrote:
> >> You might set a fencing delay that would allow the dump to complete, e.g.
> >>   <fence_daemon post_fail_delay="10">
> >>   </fence_daemon>

> OK but does that mean that once we have patched this, the peer node will
> wait in all cases this delay before fencing the node with problem, even
> if this node is not dumping , right ?

When fenced goes to fence a failed node, it waits post_fail_delay
seconds (10 in the example above) before actually killing it.  That
applies to every node that fails, whether or not it is dumping.

> So, the workaround that you propose is to be used only this way :
> 1. a node has crashed and was about to dump but has been fenced.
> 2. patch the post_fail_delay
> 3. re-start CS4 on both nodes
> 4. wait for a new crash and dump, and in this case, the failover
>    will take at least the post_fail_delay value.

I'm not sure what you mean by this, but it doesn't sound right.
post_fail_delay is added permanently to cluster.conf, which is the
same on all nodes.  You set it once and you don't change it.
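
As a rough sketch of where the setting would sit in cluster.conf (the
cluster name and config_version below are placeholders, not taken from
your setup, and 10 is just the example value from above; you'd pick a
delay long enough for the dump to complete):

  <cluster name="mycluster" config_version="2">
    <!-- wait 10s after any node fails before fencing it -->
    <fence_daemon post_fail_delay="10">
    </fence_daemon>
    <!-- your existing clusternodes, fencedevices, etc. stay as they are -->
  </cluster>

The same file goes on every node, so the delay applies whichever node
ends up doing the fencing.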


