
Re: [Linux-cluster] $OCF_ERR_CONFIGURED - recovers service on another cluster node



On 01/27/2012 04:03 AM, Parvez Shaikh wrote:
Hi guys,

I am using Red Hat Cluster Suite which comes with RHEL 5.5 -

cman_tool version
 >>6.2.0 config xxx

Now I have a script resource in which I return $OCF_ERR_CONFIGURED in
case of a fatal, unrecoverable error, hoping that my service would not
be started on another cluster node.

But I see that the cluster relocates it to another cluster node and
attempts to start it there.

I referred to the return code documentation at
http://www.linux-ha.org/doc/dev-guides/_return_codes.html

Is there any return code which makes RHCS give up on recovering the service?


The resource must fail during the 'stop' phase if you want rgmanager to not try to recover it. There is no 'start' phase error condition that tells rgmanager to give up.
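
A minimal sketch of what that could look like in a script resource, assuming the script remembers a fatal 'start' error in a marker file and then deliberately fails the 'stop' that follows. MARKER, FATAL_FLAG and check_fatal_condition are hypothetical names made up for illustration; they are not part of rgmanager.

```shell
#!/bin/sh
# Sketch only: fail 'stop' after a fatal 'start' error so the service
# is not relocated. MARKER, FATAL_FLAG and check_fatal_condition are
# hypothetical, not part of rgmanager.

MARKER="${MARKER:-/var/run/myservice.fatal}"

# Hypothetical check: succeeds when the error is fatal and unrecoverable.
check_fatal_condition() {
    [ -e "${FATAL_FLAG:-/etc/myservice/fatal}" ]
}

do_start() {
    if check_fatal_condition; then
        # Remember the fatal error so the following 'stop' fails as well.
        : > "$MARKER"
        return 1
    fi
    rm -f "$MARKER"
    # Start the real application here.
    return 0
}

do_stop() {
    # Stop the real application here.
    if [ -e "$MARKER" ]; then
        # A non-zero exit from 'stop' is what halts further recovery.
        return 1
    fi
    return 0
}

# Dispatch on the action; with no action the file only defines the
# functions, so it can also be sourced.
case "${1:-}" in
    start)  do_start; exit $? ;;
    stop)   do_stop;  exit $? ;;
    status) exit 0 ;;  # placeholder: report the real application status
esac
```

In this sketch the marker file has to be removed by hand once the underlying problem is fixed, otherwise every later 'stop' keeps failing.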

The history: If you don't have a program installed or configured on host1 but try to enable a service there, it will obviously fail to start (rightfully so). However, host2 may have the configuration. So, rgmanager will then stop the service and try to start it on host2. In fact, it will systematically try every host in the cluster until:

  - the service starts successfully

  - no more hosts are available (e.g. restricted failover domain,
    exclusive services, or simply all hosts were tried).  At this
    point, the service is placed in the 'stopped' state in
    the hopes that the next host to come online will be able to
    start the service

  - a failure during 'stop' occurs.  Most errors during the stop
    phase will abort the enable request (except
    'OCF_NOT_INSTALLED' when a <script> is missing)
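
The three outcomes above can be sketched as a small shell simulation. This is a simplified model, not rgmanager code: start_on and stop_on are hypothetical stand-ins for running the resource's start and stop on a host, driven by the made-up variables GOOD_HOSTS and BAD_STOP_HOSTS, and the 'OCF_NOT_INSTALLED' exception is not modelled.

```shell
#!/bin/sh
# Simplified model of the recovery loop described above.
# start_on/stop_on, GOOD_HOSTS and BAD_STOP_HOSTS are hypothetical.

# Succeeds when the host is listed in GOOD_HOSTS (space separated).
start_on() {
    case " ${GOOD_HOSTS:-} " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Fails when the host is listed in BAD_STOP_HOSTS (space separated).
stop_on() {
    case " ${BAD_STOP_HOSTS:-} " in
        *" $1 "*) return 1 ;;
        *)        return 0 ;;
    esac
}

# try_hosts HOST...: try each host in order and print the outcome.
try_hosts() {
    for host in "$@"; do
        if start_on "$host"; then
            echo "started on $host"
            return 0
        fi
        # Start failed: clean up on this host before moving on.
        if ! stop_on "$host"; then
            echo "aborted: stop failed on $host"
            return 1
        fi
    done
    # Every host was tried without success.
    echo "stopped"
    return 0
}
```

For example, with GOOD_HOSTS="host2", calling try_hosts host1 host2 fails on host1, stops cleanly there, and then starts on host2.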

-- Lon

