[Linux-cluster] How to integrate a custom resource agent into RHCS?
Digimer
linux at alteeve.com
Mon May 30 13:15:37 UTC 2011
On 05/30/2011 08:28 AM, Ralph.Grothe at itdz-berlin.de wrote:
> Hi,
>
> I hope this is the right forum. So bear with me Pacemaker
> aficionados et alii when I talk about Red Hat Cluster Suite
> (RHCS).
> That's the clusterware product I have been given to set up the
> cluster, and I'm not free to choose other software of my liking.
>
> Though this may sound ridiculous, I've been labouring for days
> to get a fairly simple custom resource agent (hence RA) to be
> acknowledged by RHCS and correctly executed through its
> rgmanager.
>
> When scripting my RA I mostly adhered to
> http://www.linux-ha.org/doc/dev-guides/ra-dev-guide.html apart
> from where RHCS RAs differ from general OCF.
>
> I put my RA in /usr/share/cluster and afterwards restarted
> rgmanager on all nodes.
>
> When I try to start the service that my RA's managed resource is
> part of, the service gets started but my resource does not, as if
> it weren't part of the service at all.
>
>
> When I try to start my resource via rg_test nothing happens apart
> from this obscure log entry
>
>
> [root at aruba:~]
> # rg_test test /etc/cluster/cluster.conf start aDIStn_sec
> Running in test mode.
> Entity: line 2: parser error : Char 0x0 out of allowed range
>
> ^
> Entity: line 2: parser error : Premature end of data in tag error line 1
>
> ^
> [root at aruba:~]
> # echo $?
> 0
>
> [root at aruba:~]
> # grep rg_test /var/log/cluster.log|tail -1
> May 30 13:54:55 aruba rg_test: [28643]:<err> Cannot dump meta-data because '/usr/share/cluster/default.metadata' is missing
>
>
> Though this is true
>
> [root at aruba:~]
> # ls -l /usr/share/cluster/default.metadata
> ls: /usr/share/cluster/default.metadata: No such file or directory
>
> no such file is part of the installed clusterware at all
> either
>
> [root at aruba:~]
> # yum groupinfo Clustering|tail -10|xargs rpm -ql|grep -c default\\.metadata
> 0
>
> And besides, I don't understand this error because, since I wrote
> my RA according to the above-mentioned RA Developer's Guide, it of
> course dumps its metadata
>
>
> [root at aruba:~]
> # /usr/share/cluster/aDIStn_sec.sh meta-data|grep action
> <actions>
> <action name="start" timeout="0"/>
> <action name="stop" timeout="0"/>
> <action name="status" timeout="5"/>
> <action name="monitor" timeout="5"/>
> <action name="meta-data" timeout="0"/>
> <action name="verify-all" timeout="5"/>
> <action name="validate-all" timeout="5"/>
> </actions>
>
> (Note: RHCS deviates from OCF here in naming its actions
> verify-all instead of validate-all and status instead of monitor.
> In my RA, each pair is handled by the same case block.)
>
>
> I also don't understand the "Char 0x0 out of allowed range" error
> from the XML parser.
>
> If it really refers to line 2 of my cluster.conf this looks
> pretty ok to me
>
>
> [root at aruba:~]
> # sed -n 2p /etc/cluster/cluster.conf
> <cluster alias="rhcs_mock" config_version="43" name="rhcs_mock">
>
>
> If I run a validity check of the XML of my cluster.conf against
> RHCS's RNG schema I also get an incomprehensible error about
> extra elements in interleave.
>
> Nevertheless, all other resources of my cluster which rely on
> RHCS's standard RAs are managed ok by the clusterware.
>
>
>
> [root at aruba:~]
> # declare -f cluconf_valid
> cluconf_valid ()
> {
> xmllint --noout --relaxng /usr/share/system-config-cluster/misc/cluster.ng ${1:-/etc/cluster/cluster.conf}
> }
> [root at aruba:~]
> # cluconf_valid
> Relax-NG validity error : Extra element cman in interleave
> /etc/cluster/cluster.conf:2: element cluster: Relax-NG validity error : Element cluster failed to validate content
> /etc/cluster/cluster.conf fails to validate
>
>
> Btw. is there a schema file available to check an RA's metadata
> for validity?
>
>
>
> Of course I tested my RA script for correct functionality when
> used like an init script (to which end I provided the required
> environment of OCF_RESKEY_parameter(s)),
> and it starts, stops and monitors my resource as intended.
>
>
> Can anyone help?
>
>
> Regards
> Ralph
Can you paste in your cluster.conf file? Please only alter the passwords.

Generally speaking, if your script can work like an init.d script (taking
start/stop/status arguments), then you should be able to use the
"script" resource type.
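
To illustrate what "works like an init.d script" means here, below is a
rough sketch of a wrapper that accepts start/stop/status and returns 0 on
success. It is not taken from your RA: the pidfile path, the stand-in
"sleep 300" command and all function names are placeholders made up for
the example, so adapt them to whatever your resource really runs.

```shell
#!/bin/sh
# Hypothetical init-style wrapper. PIDFILE and DAEMON_CMD are placeholders,
# not anything shipped with RHCS.
#
# In cluster.conf such a script would then be referenced roughly as
#   <script name="mydaemon" file="/path/to/this/script"/>
# inside the <service> block (name and path are assumptions).

PIDFILE="${PIDFILE:-/tmp/mydaemon.pid}"
DAEMON_CMD="${DAEMON_CMD:-sleep 300}"    # stand-in for the real daemon

svc_status() {
    # "Running" means: pidfile exists and the recorded process is alive.
    [ -f "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null
}

svc_start() {
    svc_status && return 0               # already running counts as success
    # DAEMON_CMD is left unquoted on purpose so it splits into words.
    $DAEMON_CMD >/dev/null 2>&1 &
    echo $! > "$PIDFILE"
}

svc_stop() {
    # Already stopped counts as success; clean up a stale pidfile.
    svc_status || { rm -f "$PIDFILE"; return 0; }
    kill "$(cat "$PIDFILE")" && rm -f "$PIDFILE"
}

svc_ctl() {
    case "$1" in
        start)  svc_start ;;
        stop)   svc_stop ;;
        status) svc_status ;;
        *)      echo "usage: $0 {start|stop|status}" >&2; return 2 ;;
    esac
}

# Dispatch only when an action was passed on the command line.
if [ $# -gt 0 ]; then
    svc_ctl "$1"
fi
```

The point of the sketch is only the calling convention: each action exits
0 on success and non-zero on failure, and start/stop are safe to call
twice. If your existing script already behaves like that when run by
hand, it is probably a reasonable candidate for the "script" resource.
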
I am not too familiar with OCF, I am afraid, but I think I can help with
RHCS as that is what I am most familiar with.
--
Digimer
E-Mail: digimer at alteeve.com
Freenode handle: digimer
Papers and Projects: http://alteeve.com
Node Assassin: http://nodeassassin.org
"I feel confined, only free to expand myself within boundaries."