Re: [Linux-cluster] CS5 / About qdisk parameters

On Thu, 2008-05-15 at 16:21 +0200, Alain Moulle wrote:
> Hi Lon
> Thanks again, but that's strange because in the man page, the recommended
> values are:
> interval="1" tko="10" and so we have a result < 21s which is the
> default value of heart-beat timer, so not a hair above like you
> recommended in previous email ...
> extract of man qdisk :
>          interval="1"
>             This is the frequency of read/write cycles, in seconds.
>          tko="10"
>             This  is  the  number  of  cycles  a node must miss in order to be
>             declared dead.
> ?
> So the better values to match with the default heart-beat timeout of 21s should
> be :
> interval="2" and tko="11"
> right ?

Yes, but you don't want to match it.

You want qdisk to time out before CMAN, with enough margin that if the
qdisk master node dies, there is enough time to elect a new master
*before* CMAN would normally transition.

On RHEL4, the default CMAN timeout is 21 seconds.

On RHEL5, it's 5 seconds - which currently has to be tuned using the
<totem token="..."/> parameter.
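
To make the arithmetic concrete, here is a minimal Python sketch (not part
of qdiskd, just an illustration). It assumes the qdisk timeout is roughly
interval * tko seconds, as the man page extract implies, and checks it
against the 21 second RHEL4 CMAN timeout. The function names and the
5 second MARGIN for master re-election are hypothetical; pick a margin
that suits your cluster.

```python
# Sketch only: function names and MARGIN are illustrative assumptions,
# not qdiskd settings or recommended values.

def qdisk_timeout(interval: int, tko: int) -> int:
    """Approximate seconds before qdiskd declares a node dead."""
    return interval * tko


def leaves_headroom(interval: int, tko: int,
                    cman_timeout: int, margin: int) -> bool:
    """True if qdisk times out at least `margin` seconds before CMAN."""
    return qdisk_timeout(interval, tko) + margin <= cman_timeout


CMAN_TIMEOUT_RHEL4 = 21  # seconds, RHEL4 default mentioned in this thread
MARGIN = 5               # hypothetical allowance for electing a new master

for interval, tko in [(1, 10), (2, 11)]:
    ok = leaves_headroom(interval, tko, CMAN_TIMEOUT_RHEL4, MARGIN)
    print(f"interval={interval} tko={tko}: "
          f"qdisk timeout {qdisk_timeout(interval, tko)}s, headroom ok: {ok}")
```

With these assumptions, interval="1" tko="10" gives 10 seconds and leaves
headroom, while interval="2" tko="11" gives 22 seconds, which is already
past the CMAN timeout.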

I intend to make qdiskd automatically detect the CMAN death detection
time in the near future and automatically configure itself, because this
is something users/administrators just *shouldn't* have to deal with...

(Does anyone disagree with that? :) )

Anyway, here's a graphical representation as to why qdiskd needs to time
out (long) before CMAN:


-- Lon

