[Linux-cluster] Can anyone help tackling issues with custom resource agent for RHCS?

Wed Apr 3 18:07:16 UTC 2013

On 02/24/2013 11:49 AM, Ralph.Grothe at itdz-berlin.de wrote:
> Hello digimer,
> 
> I already knew this link and have read FAQs and other stuff
> there.
> 
> Unfortunately, many features that our customers demand in their
> clusters, such as dependencies between cluster services (which
> they were accustomed to in their former clusters, e.g. Veritas,
> that are now to be migrated to RHCS), are hardly documented
> anywhere.
> 
> But when I posted my query I was mistaken.
> My ifxdb agent isn't dysfunctional. It really works.
> What it still lacks is logging: clurgmgrd doesn't log the agent's
> actions, even though I use the mentioned ocf_log function (I also
> check in my agent whether that function is defined at run time
> and, if not, I source /usr/share/cluster/ocf-shellfuncs), and
> even though the agent logs every step whenever I run it through
> the rg_test utility while the service is disabled.
> I have no explanation why clurgmgrd is so taciturn when it comes
> to logging output from my ifxdb agent.
> 
> I think that I have enabled logging up to debug level.
> 
> [root at altair:/usr/share/cluster]
> # grep rm /etc/cluster/cluster.conf 
> 	<rm log_facility="local6" log_level="7" central_processing="1">
> 	</rm>
> [root at altair:/usr/share/cluster]
> # grep local6 /etc/syslog.conf 
> local6.*	/var/log/clurgmgrd.log

You can test your agents by running them from the command line.

So, for example, this is one of my resources:

<mdraid config_file="/etc/mdadm-extern.conf" name="extern" ssh_check="1"/>

and this is the service that uses it:

<service domain="dom21" name="pg" recovery="restart">
	<ip ref="10.200.213.202">
		<mdraid ref="extern">
			<fs ref="extern"/>
		</mdraid>
	</ip>
</service>

I have two custom RAs, mdraid and pgsql91, although only mdraid appears
in the service above.

When I tested the 'mdraid' agent, I changed my service to look like this:

<service domain="dom21" name="pg" recovery="restart">
	<ip ref="10.200.213.202"/>
</service>

After enabling the service, I would test the agent from the CLI by running:

# OCF_RESKEY_config_file="/etc/mdadm-extern.conf" \
OCF_RESKEY_name="extern" \
OCF_RESKEY_ssh_check="1" \
bash /usr/share/cluster/mdraid.sh status

and I get the following output (for example):

<err>    mdraid: Improper setup detected
[mdraid.sh] mdraid: Improper setup detected
<err>    * device "extern" is active on node "database02-xc"
[mdraid.sh] * device "extern" is active on node "database02-xc"

and in /var/log/messages:

Apr  3 20:04:50 database01 rgmanager[28145]: [mdraid.sh] mdraid: Improper setup detected
Apr  3 20:04:50 database01 rgmanager[28167]: [mdraid.sh] * device "extern" is active on node "database02-xc"

So, I guess what you should do is run your agent by hand and check that
it logs both to stdout and to /var/log/messages. Maybe you are not using
the ocf_log function properly?
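
For comparison, here is a minimal sketch of how an agent might load and
call ocf_log. The agent name "myagent", the "pidfile" attribute and the
status check are hypothetical, and the fallback stub is only my
assumption so the snippet can be tried on a machine without rgmanager;
its output format merely imitates the "<err>" lines shown above.

```shell
#!/bin/bash
# Hypothetical agent fragment: "myagent" and OCF_RESKEY_pidfile are
# illustrative names, not part of any shipped agent.

OCF_SHELLFUNCS=/usr/share/cluster/ocf-shellfuncs

# Make sure ocf_log exists before the agent uses it.
if ! declare -F ocf_log >/dev/null; then
    if [ -r "$OCF_SHELLFUNCS" ]; then
        # On a cluster node: use the helper library shipped with rgmanager.
        . "$OCF_SHELLFUNCS"
    else
        # Assumption: stand-in for testing elsewhere; prints
        # "<level> message" to stderr instead of going through syslog.
        ocf_log() {
            local level=$1
            shift
            echo "<$level> $*" >&2
        }
    fi
fi

# First argument to ocf_log is the priority (debug, info, err, ...),
# the remaining arguments are the message.
myagent_status() {
    ocf_log debug "myagent: checking status"
    if [ -e "${OCF_RESKEY_pidfile:-/nonexistent}" ]; then
        ocf_log info "myagent: running"
        return 0
    fi
    ocf_log err "myagent: not running"
    return 1
}
```

If the messages show up when you call myagent_status by hand but not
under the cluster daemon, the problem is more likely in the syslog
routing than in the agent itself.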

Could you share your agents with us, so that somebody can test them in
their own environment?
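
For anyone who wants to repeat the command-line test above with another
agent, the invocation can be wrapped in a small helper. This is only a
sketch: run_agent is a name I made up, and it simply turns attr=value
pairs into OCF_RESKEY_* environment variables before calling the script.

```shell
#!/bin/bash
# run_agent SCRIPT ACTION [attr=value ...]
# Hypothetical helper: exports each attr=value pair as OCF_RESKEY_attr
# for the duration of one agent call, as in the manual test above.
run_agent() {
    if [ $# -lt 2 ]; then
        echo "usage: run_agent SCRIPT ACTION [attr=value ...]" >&2
        return 2
    fi
    local script=$1 action=$2
    shift 2

    local -a vars=()
    local kv
    for kv in "$@"; do
        vars+=("OCF_RESKEY_$kv")
    done

    # env sets the variables only for this single invocation.
    env "${vars[@]}" bash "$script" "$action"
}
```

With it, the mdraid test becomes:

run_agent /usr/share/cluster/mdraid.sh status \
	config_file=/etc/mdadm-extern.conf name=extern ssh_check=1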

Hope this post helps ;)