[Linux-cluster] heartbeat to rgmanager design question

Fri Nov 28 07:50:24 UTC 2008

On Thu, Nov 27, 2008 at 15:57, Brian Kroth <bpkroth at gmail.com> wrote:
> Hello all,
>
> I've been using Heartbeat in the past to do resource failover with the
> following scheme:
>
> 1) Each node in the cluster runs a dummy monitoring resource agent as a
> clone.  This resource agent monitors the health of a service on the node
> using whatever rules one wants to write into it.  For instance, make
> sure the service is not in maintenance mode, mysql is running, queries
> return promptly, and replication is up to date.  If all the checks pass, it
> uses attrd_updater to set an attribute for that service on the node to
> 1.  Otherwise, it is set to 0.  Note that this resource agent in no way
> affects the service it is monitoring.
>
> 2) The cluster configuration uses the attributes for each of the
> monitored services to generate a score for the machine.  The machine
> with the highest score gets to host the virtual ip for that service.
>
> This scheme allows one to, for instance, touch a file on a machine that
> will signify that it's in maintenance mode.  The service ip would then
> be moved to another node, leaving one to test out the service on the
> machine's management ip without removing it from the cluster itself,
> which would cause it to lose GFS access.  It also provides for more
> granular monitoring of each service.
>
> I want to know how I would configure rgmanager with something similar to
> this - to have resource agents that continually monitor the status of a
> service on each node and then move service IPs accordingly.
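
As an illustration of the scheme quoted above, here is a minimal sketch in
Python of the attribute-and-score logic. It is a standalone model only: it
does not call Heartbeat, attrd_updater, or rgmanager, and every name in it
(NodeHealth, health_attribute, pick_host, the thresholds, and the per-node
preference weights) is hypothetical. The preference weights are an assumption
added to show how a scoring scheme can give failback to the original host.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class NodeHealth:
    """Results of the per-node checks (all field names are hypothetical)."""
    in_maintenance: bool
    mysql_running: bool
    query_seconds: float
    replication_lag_seconds: float


def health_attribute(h: NodeHealth,
                     max_query_seconds: float = 1.0,
                     max_lag_seconds: float = 30.0) -> int:
    """Return 1 if every check passes, else 0.

    This models the value the monitoring agent would publish for the node;
    the thresholds are illustrative assumptions.
    """
    ok = (not h.in_maintenance
          and h.mysql_running
          and h.query_seconds <= max_query_seconds
          and h.replication_lag_seconds <= max_lag_seconds)
    return 1 if ok else 0


def pick_host(attributes: Dict[str, int],
              preference: Dict[str, int]) -> Optional[str]:
    """Choose the node with the highest score, or None if none is healthy.

    score = attribute * preference, so a node whose attribute is 0 scores 0.
    Preferences are assumed to be positive integers (default 1).
    """
    scores = {n: attributes[n] * preference.get(n, 1) for n in attributes}
    best = max(scores, key=scores.get, default=None)
    if best is None or scores[best] <= 0:
        return None
    return best


if __name__ == "__main__":
    preference = {"node1": 200, "node2": 100}  # node1 is the preferred host
    # node1 is in maintenance mode, so the service ip goes to node2.
    print(pick_host({"node1": 0, "node2": 1}, preference))
    # node1 is healthy again, so the higher preference moves the ip back.
    print(pick_host({"node1": 1, "node2": 1}, preference))
```

Because the choice is recomputed from the current attributes each time, the
ip returns to the preferred node as soon as its attribute goes back to 1.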

Just out of interest, where did the rgmanager requirement come from?

<blatant-advertisement>
The Heartbeat resource manager also runs on OpenAIS now which, IIRC,
is what rgmanager uses... so, in theory, it can manage anything
rgmanager can.
</blatant-advertisement>

>
> I see that one can write their own agents, but I don't see a scoring
> scheme anywhere.  My concern is that if I simply write an agent to
> monitor a service and have an ip depend upon the return code of that
> monitoring agent, the service would never be failed back to the
> original host.
>
> Does this make sense?
>
> Thanks,
> Brian
>