[Linux-cluster] heartbeat to rgmanager design question

Thu Nov 27 14:57:50 UTC 2008

Hello all,

I've used Heartbeat in the past to do resource failover with the
following scheme:

1) Each node in the cluster runs a dummy monitoring resource agent as a
clone.  This resource agent monitors the health of a service on the node
using whatever rules one wants to write into it.  For instance, it can
make sure the service is not in maintenance mode, mysql is running,
queries return in a timely manner, and replication is up to date.  If
all the checks pass, it uses attrd_updater to set an attribute for that
service on the node to 1.  Otherwise, it sets the attribute to 0.  Note
that this resource agent in no way affects the service it is monitoring.
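
As a rough illustration of step 1, the check-and-publish logic could
look like the Python sketch below.  This is not the actual resource
agent: the function names and the attribute name passed in are
hypothetical, and the -n/-v options are assumed to match the
attrd_updater shipped with the Heartbeat version in use.

```python
import subprocess
from typing import Callable, Iterable, List


def service_health(checks: Iterable[Callable[[], bool]]) -> int:
    """Return 1 only if every check passes, otherwise 0."""
    for check in checks:
        try:
            if not check():
                return 0
        except Exception:
            # A check that crashes is treated as a failed check.
            return 0
    return 1


def attrd_command(attribute: str, value: int) -> List[str]:
    """Build the attrd_updater call (assumed options: -n name, -v value)."""
    return ["attrd_updater", "-n", attribute, "-v", str(value)]


def publish(attribute: str, checks: Iterable[Callable[[], bool]]) -> int:
    """Evaluate the checks and push the result; the service is never touched."""
    value = service_health(checks)
    subprocess.run(attrd_command(attribute, value), check=False)
    return value
```

The agent's monitor action would call something like publish() on every
interval, so the attribute always reflects the most recent check.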

2) The cluster configuration uses the attributes for each of the
monitored services to generate a score for the machine.  The machine
with the highest score gets to host the virtual IP for that service.
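
The placement rule in step 2 amounts to something like the following
Python sketch.  It is a simplified model of the idea, not how Heartbeat
actually evaluates its configuration: the node names are hypothetical,
and I am assuming that a node scoring 0 should never host the IP and
that ties are broken by node name.

```python
from typing import Dict, Optional


def pick_host(scores: Dict[str, int]) -> Optional[str]:
    """Return the node with the highest positive score, or None."""
    eligible = {node: s for node, s in scores.items() if s > 0}
    if not eligible:
        return None
    # Ties are broken by node name so the result is deterministic.
    return min(eligible, key=lambda node: (-eligible[node], node))
```

For example, pick_host({"node1": 0, "node2": 1}) selects node2, and the
choice is re-evaluated whenever an attribute changes.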

This scheme allows one to, for instance, touch a file on a machine to
signify that it's in maintenance mode.  The service IP would then be
moved to another node, leaving one free to test the service on the
machine's management IP without removing the machine from the cluster
itself, which would cause it to lose GFS access.  The scheme also
provides for more granular monitoring of each service.
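
The maintenance-mode check itself can be as small as the Python sketch
below.  The flag path is made up for illustration; any file the
monitoring agent can see would work.

```python
import os

# Hypothetical flag path; "touch" this file to enter maintenance mode.
MAINTENANCE_FLAG = "/var/run/mysql.maintenance"


def not_in_maintenance(flag_path: str = MAINTENANCE_FLAG) -> bool:
    """Pass the check only while the maintenance flag file is absent."""
    return not os.path.exists(flag_path)
```

Removing the file lets the check pass again, so the node becomes
eligible for the service IP without any cluster reconfiguration.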

I want to know how I would configure rgmanager to do something similar:
have resource agents that continually monitor the status of a service on
each node and move the service IPs accordingly.

I see that one can write one's own agents, but I don't see a scoring
scheme anywhere.  My concern is that if I simply write an agent to
monitor a service and have an IP depend upon the return code of that
monitoring agent, the service would never be failed back to the
original host.
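
To make the concern concrete, here is a small Python model of the two
behaviours.  It only models the difference as I understand it and says
nothing about what rgmanager really does; the function and node names
are hypothetical.

```python
from typing import Dict, List


def sticky_placement(current: str, healthy: Dict[str, bool]) -> str:
    """Move only when the current host is unhealthy (no failback)."""
    if healthy.get(current, False):
        return current
    for node in sorted(healthy):
        if healthy[node]:
            return node
    return current  # no healthy node available, stay put


def preferred_placement(order: List[str], healthy: Dict[str, bool],
                        current: str) -> str:
    """Always pick the first healthy node in preference order (fails back)."""
    for node in order:
        if healthy.get(node, False):
            return node
    return current  # no healthy node available, stay put
```

If node1 fails and later recovers, sticky_placement leaves the IP on
node2, while preferred_placement returns it to node1.  The second
behaviour is what the scoring scheme gives me today.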

Does this make sense?

Thanks,
Brian