[Linux-cluster] How to integrate a custom resource agent into RHCS?

Digimer linux at alteeve.com
Mon May 30 13:15:37 UTC 2011


On 05/30/2011 08:28 AM, Ralph.Grothe at itdz-berlin.de wrote:
> Hi,
>
> I hope this is the right forum. So bear with me Pacemaker
> aficionados et alii when I talk about Red Hat Cluster Suite
> (RHCS).
> That's the clusterware product I have been given to set up the cluster,
> and I'm not free to choose other software of my liking.
>
> Though this may sound ridiculous, I've been labouring for days
> to get a fairly simple custom resource agent (henceforth RA) to be
> acknowledged by RHCS and correctly executed through its
> rgmanager.
>
> When scripting my RA I mostly adhered to
> http://www.linux-ha.org/doc/dev-guides/ra-dev-guide.html apart
> from where RHCS RAs differ from general OCF.
>
> I put my RA in /usr/share/cluster and afterwards restarted
> rgmanager on all nodes.
>
> When I try to start the service that my RA's managed resource
> is part of, the service gets started but my resource does not,
> as if it weren't part of the service at all.
>
>
> When I try to start my resource via rg_test nothing happens apart
> from this obscure log entry
>
>
> [root@aruba:~]
> # rg_test test /etc/cluster/cluster.conf start aDIStn_sec
> Running in test mode.
> Entity: line 2: parser error : Char 0x0 out of allowed range
>
> ^
> Entity: line 2: parser error : Premature end of data in tag error line 1
>
> ^
> [root@aruba:~]
> # echo $?
> 0
>
> [root@aruba:~]
> # grep rg_test /var/log/cluster.log|tail -1
> May 30 13:54:55 aruba rg_test: [28643]:<err>  Cannot dump meta-data because '/usr/share/cluster/default.metadata' is missing
>
>
> Though this is true
>
> [root@aruba:~]
> # ls -l /usr/share/cluster/default.metadata
> ls: /usr/share/cluster/default.metadata: No such file or directory
>
> no such file is part of the installed clusterware at all
> either
>
> [root@aruba:~]
> # yum groupinfo Clustering|tail -10|xargs rpm -ql|grep -c default\\.metadata
> 0
>
> Besides, I don't understand this error: since I wrote my RA
> according to the above-mentioned RA Developer's Guide, it of
> course dumps its metadata
>
>
> [root@aruba:~]
> # /usr/share/cluster/aDIStn_sec.sh meta-data|grep action
>      <actions>
>          <action name="start" timeout="0"/>
>          <action name="stop" timeout="0"/>
>          <action name="status" timeout="5"/>
>          <action name="monitor" timeout="5"/>
>          <action name="meta-data" timeout="0"/>
>          <action name="verify-all" timeout="5"/>
>          <action name="validate-all" timeout="5"/>
>      </actions>
>
> (Note: RHCS deviates from OCF here in naming its actions
> verify-all instead of validate-all and status instead of monitor.
> In my RA, the two names of each pair refer to the same case block.)
>
>
> I also don't understand the "Char 0x0 out of allowed range" error
> from the XML parser.
>
> If it really refers to line 2 of my cluster.conf this looks
> pretty ok to me
>
>
> [root@aruba:~]
> # sed -n 2p /etc/cluster/cluster.conf
> <cluster alias="rhcs_mock" config_version="43" name="rhcs_mock">
>
>
> If I run a validity check of the XML of my cluster.conf against
> RHCS's RNG schema I also get an incomprehensible error about
> extra elements in interleave.
>
> Nevertheless, all other resources of my cluster which rely on
> RHCS's standard RAs are managed ok by the clusterware.
>
>
>
> [root@aruba:~]
> # declare -f cluconf_valid
> cluconf_valid ()
> {
>      xmllint --noout --relaxng /usr/share/system-config-cluster/misc/cluster.ng ${1:-/etc/cluster/cluster.conf}
> }
> [root@aruba:~]
> # cluconf_valid
> Relax-NG validity error : Extra element cman in interleave
> /etc/cluster/cluster.conf:2: element cluster: Relax-NG validity error : Element cluster failed to validate content
> /etc/cluster/cluster.conf fails to validate
>
>
> Btw. is there a schema file available to check an RA's metadata
> for validity?
>
>
>
> Of course I tested my RA script for correct functionality when
> used like an init script (to which end I provided the required
> environment of OCF_RESKEY_parameter(s)),
> and it starts, stops and monitors my resource as intended.
>
>
> Can anyone help?
>
>
> Regards
> Ralph

Can you paste in your cluster.conf file? Please only alter the passwords.

Generally speaking, if your script can work like an init.d script (taking
start/stop/status arguments), then you should be able to use the
"script" resource type.
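
As a rough illustration of what "works like an init.d script" means
here, the following is a minimal sketch of a wrapper that accepts
start/stop/status. Everything in it is an assumption made for
illustration: the function names, the lock file path, and the use of a
lock file in place of a real daemon are hypothetical and not taken from
the original agent.

```shell
#!/bin/sh
# Hypothetical init-style wrapper; all names and paths are illustrative.
# A lock file stands in for the real daemon so the sketch is self-contained.
LOCKFILE="${LOCKFILE:-${TMPDIR:-/tmp}/adistn_sec_demo.lock}"

svc_start() {
    # Replace with the real start command for the managed resource.
    : > "$LOCKFILE"
}

svc_stop() {
    # Replace with the real stop command.
    # Stopping an already stopped service should still succeed.
    rm -f "$LOCKFILE"
}

svc_status() {
    # Return 0 when running, 3 (LSB "not running") otherwise.
    [ -e "$LOCKFILE" ] && return 0
    return 3
}

svc_main() {
    case "$1" in
        start)  svc_start ;;
        stop)   svc_stop ;;
        status) svc_status ;;
        *)      echo "Usage: $0 {start|stop|status}" >&2; return 2 ;;
    esac
}

# Dispatch only when an action was given, so the file can also be sourced.
if [ $# -gt 0 ]; then
    svc_main "$@"
fi
```

Saved as an executable file, this would be called with a single
argument such as "start" or "status". The "script" resource in
cluster.conf then takes the path to that file in its "file" attribute
and a label in its "name" attribute; where the entry goes depends on
your service definition.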

I am not too familiar with OCF, I am afraid, but I think I can help with 
RHCS as that is what I am most familiar with.


-- 
Digimer
E-Mail:              digimer at alteeve.com
Freenode handle:     digimer
Papers and Projects: http://alteeve.com
Node Assassin:       http://nodeassassin.org
"I feel confined, only free to expand myself within boundaries."



