
Re: [Cluster-devel] Question about /etc/init.d/cman start



On 08/29/2011 09:01 AM, Dietmar Maurer wrote:
>> It is actually configurable via /etc/sysconfig/cman (or /etc/default/cman on
>> Debian-based systems)
>>
>> # CMAN_QUORUM_TIMEOUT -- amount of time to wait for a quorate cluster on
>> #     startup quorum is needed by many other applications, so we may as
>> #     well wait here.  If CMAN_QUORUM_TIMEOUT is zero, quorum will
>> #     be ignored.
>> [ -z "$CMAN_QUORUM_TIMEOUT" ] && CMAN_QUORUM_TIMEOUT=45
>>
>> Setting CMAN_QUORUM_TIMEOUT=0 will simply stop waiting for quorum and
>> continue the execution of the init script.
> 
> Sure, but I want to wait for quorum.
> 
>> Assuming you want to retain the default behavior, once quorum is gained, it is
>> enough to execute /etc/init.d/cman start again. The script is clever enough to
>> start only what is necessary.
>> You have a good point regarding cmannotifyd. In theory it could be used to
>> trigger a "/etc/init.d/cman start" once quorum is achieved and notification
>> dispatched. I can fix this upstream, but for any RHEL6 changes, I'll need you to
> 
> I compile my own packages for debian, so a fix for upstream would be great. I am just unsure
> if we should call unfence_self() from cmannotifyd. I guess it is OK if we check that
> we got quorum for the first time?

No, you can't call unfencing from cmannotifyd. I honestly don't recall
all the details on why, but one of the reasons is (for example):

- node 1 and node 2
- node 1 starts experiencing network problems
- node 2 fences node 1 via fence_scsi
- node 1 unfences itself in a non-clean state due to cman notifications
of up/downs.
- cluster goes kaboom.

> Besides, why do you want that extra complexity running 'cman start' from cmannotifyd? Especially error handling is somehow unclear to me (what if cman start fails there?).

Well it's one way to do it.

cmannotifyd (as documented) does not provide error handling itself. The
reason is that you can't really halt all cluster operations because a
bad script is activated by a "random" user via cmannotifyd.
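
Since cmannotifyd will not handle errors for you, a notification script
has to catch and report its own failures. A minimal sketch of such a
script follows. The variable names CMAN_NOTIFICATION and
CMAN_NOTIFICATION_QUORUM and the value CMAN_REASON_STATECHANGE are quoted
from memory of the cmannotifyd man page and should be checked against
your version; handle_notification and run_hook are hypothetical helper
names. Note that it only reports quorum changes and never unfences.

```shell
#!/bin/sh
# Hypothetical notification script (sketch only, not shipped with cman).
# Assumption: cmannotifyd exports CMAN_NOTIFICATION and, for state
# changes, CMAN_NOTIFICATION_QUORUM (1 = quorate, 0 = not quorate).

handle_notification() {
    case "${CMAN_NOTIFICATION:-}" in
        CMAN_REASON_STATECHANGE)
            if [ "${CMAN_NOTIFICATION_QUORUM:-0}" = "1" ]; then
                echo "quorum gained"
            else
                echo "quorum lost"
            fi
            ;;
        *)
            # Anything else is reported but otherwise ignored.
            echo "ignored: ${CMAN_NOTIFICATION:-unset}"
            ;;
    esac
}

# Run a command, report a failure on stderr, and always return success,
# because nothing upstream is going to act on our exit status.
run_hook() {
    if ! "$@"; then
        echo "hook failed: $*" >&2
    fi
    return 0
}

run_hook handle_notification
```

The point of run_hook is that a broken script logs its own failure
instead of relying on cmannotifyd to notice it.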

> So can't we simply make those daemons smart enough so that we can start them at boot time (always)?

They are smart enough. You are misreading the comments about waiting for
quorum in the cman init script.

The daemons can be safely started at boot time, even without quorum, but
they can't do anything useful till quorum is achieved. That is why it is
possible to override the wait for quorum.
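
As a rough sketch of that override (not the actual init script code):
the default of 45 and the meaning of zero come from the snippet quoted
above, while quorum_check and wait_for_quorum are hypothetical names, and
the cman_tool invocation inside quorum_check is an assumption about how
the wait is performed.

```shell
#!/bin/sh
# Sketch of the quorum-wait override; illustrative only.

# Same effect as the quoted default: 45 seconds unless already set.
: "${CMAN_QUORUM_TIMEOUT:=45}"

# Hypothetical wrapper around the real check. Assumption: the wait is
# done with "cman_tool -t <seconds> -q wait".
quorum_check() {
    cman_tool -t "$1" -q wait >/dev/null 2>&1
}

wait_for_quorum() {
    # Zero means "ignore quorum": skip the wait and carry on.
    if [ "$CMAN_QUORUM_TIMEOUT" -eq 0 ]; then
        return 0
    fi
    # Otherwise succeed only if quorum arrives within the timeout.
    quorum_check "$CMAN_QUORUM_TIMEOUT"
}
```

With CMAN_QUORUM_TIMEOUT=0 in /etc/sysconfig/cman, wait_for_quorum
returns immediately and the daemons start without quorum.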

Most users have requested, and want, to wait for quorum and fail if
there is no quorum, since it really doesn't help to have more daemons
running on top of cman.

So maybe what you want is an option to:

wait for quorum and, if there is no quorum after the timeout, still
allow everything else to start?
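
Something along these lines, as a sketch only. CMAN_QUORUM_REQUIRED is a
made-up name for the proposed option and does not exist in cman today;
quorum_check and start_after_quorum_wait are likewise hypothetical, and
the cman_tool call is an assumption about how the wait is performed.

```shell
#!/bin/sh
# Sketch of the proposed "wait, but do not fail on timeout" option.

: "${CMAN_QUORUM_TIMEOUT:=45}"
# Hypothetical knob: "yes" fails on timeout, "no" keeps going after it.
: "${CMAN_QUORUM_REQUIRED:=yes}"

# Assumption: the wait is done with "cman_tool -t <seconds> -q wait".
quorum_check() {
    cman_tool -t "$1" -q wait >/dev/null 2>&1
}

start_after_quorum_wait() {
    if [ "$CMAN_QUORUM_TIMEOUT" -ne 0 ] &&
       ! quorum_check "$CMAN_QUORUM_TIMEOUT"; then
        if [ "$CMAN_QUORUM_REQUIRED" = "yes" ]; then
            echo "no quorum after ${CMAN_QUORUM_TIMEOUT}s, aborting" >&2
            return 1
        fi
        echo "no quorum after ${CMAN_QUORUM_TIMEOUT}s, continuing" >&2
    fi
    # Placeholder for starting the remaining daemons.
    echo "starting remaining daemons"
}
```

The daemons would then come up at boot in every case and simply sit
idle until quorum is achieved.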

Fabio

