[Linux-cluster] Interfacing csnap to cluster stack

Lon Hohberger lhh at redhat.com
Tue Oct 12 18:40:57 UTC 2004


On Tue, 2004-10-12 at 01:44 -0400, Daniel Phillips wrote:

> > You never answered: how would a resource manager know to pick the
> > "best" choice?
> 
> That depends on how it is told to pick, either by pre-ordained 
> configuration, or automagic balancing algorithms, or a combination of 
> the two.

Load monitoring is on the road map, but it can add some complexity.

The current "domain" model is 100% user-configured, and at the time of
a failover, all nodes have all the information they need to decide
whether or not they are a good candidate to start the failed resource
group.

If we can rely on "last known" or passive load monitoring, then load
monitoring (including things like free RAM vs. total RAM, typical
run-queue-based load, etc.) becomes much easier.
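
To make "passive" monitoring concrete: think of a node periodically
sampling numbers like these and advertising the last known values.  A
minimal sketch reading the standard Linux /proc files; this is not
rgmanager code, and the struct and function names are made up:

	#include <stdio.h>
	#include <string.h>

	struct load_sample {
		double runq_load;		/* 1-minute load average */
		unsigned long mem_free_kb;	/* from /proc/meminfo */
		unsigned long mem_total_kb;
	};

	static int sample_load(struct load_sample *s)
	{
		char line[128];
		FILE *f;

		memset(s, 0, sizeof(*s));

		f = fopen("/proc/loadavg", "r");
		if (!f)
			return -1;
		if (fscanf(f, "%lf", &s->runq_load) != 1) {
			fclose(f);
			return -1;
		}
		fclose(f);

		f = fopen("/proc/meminfo", "r");
		if (!f)
			return -1;
		while (fgets(line, sizeof(line), f)) {
			/* non-matching lines leave the fields zeroed */
			sscanf(line, "MemTotal: %lu kB", &s->mem_total_kb);
			sscanf(line, "MemFree: %lu kB", &s->mem_free_kb);
		}
		fclose(f);
		return 0;
	}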

If, however, we require instantaneous load monitoring at the time of
failover, it becomes much more complex...


> > AFAICT, resource management is higher up the stack, and having shared
> > storage like the cluster snapshot depend on it would cause circular
> > dependencies.

Yes, it most certainly could.

> Not only that, but after a read-through, rgmanager is not suited to
> low-level use as currently conceived.  Just one of many problems: we
> don't want to be parsing XML in a block device failover path.  So I
> will stop bothering Lon about making this be what it's not.

Actually, it doesn't parse anything on failover, but it does fork + exec
things.  Some or all of those things may be scripts, which I think is
probably far worse for your needs than merely parsing some random XML
document.  For instance, starting an Oracle instance isn't exactly
bounded and predictable WRT memory consumption.  That said, we can
pretty easily guarantee that csnap servers recover before Oracle
instances do.

It only parses XML (or reads from CCS) (a) at startup and (b) during a
configuration change.


> We also need to arrange for the csnap server to give up the PW lock if 
> its node leaves the cluster.  The agent had better subscribe to some 
> sort of cluster management event here, except the only such event comes 
> when gdlm is already dead, which isn't much use.  This is a big 
> fat deficiency.

I think the DLM handles this case, right?  Isn't the lock just lost
(i.e., nothing to clean up)?

You're right about the shutdown event, though.  It seems to be delivered
only after shutdown.
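
To illustrate what's missing, the hook Daniel is asking for would look
something like the following.  This is purely hypothetical; nothing
like it exists in the current stack:

	/* Hypothetical: a pre-leave notification, delivered while the
	 * node is still a cluster member and the lock manager is still
	 * alive, so the csnap agent could drop its PW lock cleanly.
	 * The event we actually get arrives only after shutdown. */
	typedef void (*cluster_pre_leave_cb)(void *arg);

	int cluster_notify_pre_leave(cluster_pre_leave_cb cb, void *arg);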


> >  If the administrator knows which machine is "best", have him
> >  start the snapshot targets on that machine first.  Not perfect,
> >  but simple, and it provides high availability.
> >
> >  It is also possible for the csnap server to put its
> >  server address and port information in the LVB.
> >
> >  This seems simple, workable, and easy to program.
> 
> And it maps easily to either gdlm or gulm, though somebody would have 
> to write a userland interface to make this transparent.  (Lon?).

That would probably be a welcome addition, and shouldn't be terribly
hard to do.
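
For the curious, the LVB scheme above maps to something like the
following with libdlm's synchronous wrappers.  This is a rough sketch
only, not csnap code: lockspace setup, error paths, and blocking ASTs
are all omitted, and the resource name is made up.

	#include <stdio.h>
	#include <string.h>
	#include <libdlm.h>

	#define RES_NAME "csnap-server"	/* made-up resource name */

	static struct dlm_lksb server_lksb;
	static char server_lvb[DLM_LVB_LEN];	/* LVB is 32 bytes */

	/* Server side: take EX, fill the LVB, downconvert to PW.  The
	 * downconvert from EX commits the LVB to the lock manager; the
	 * held PW lock keeps rival servers out while CR readers get
	 * through. */
	static int publish_server_address(const char *addr_port)
	{
		int rv;

		server_lksb.sb_lvbptr = server_lvb;
		rv = dlm_lock_wait(LKM_EXMODE, &server_lksb, LKF_VALBLK,
				   RES_NAME, strlen(RES_NAME), 0,
				   NULL, NULL, NULL);
		if (rv)
			return rv;

		memset(server_lvb, 0, sizeof(server_lvb));
		strncpy(server_lvb, addr_port, sizeof(server_lvb) - 1);

		return dlm_lock_wait(LKM_PWMODE, &server_lksb,
				     LKF_CONVERT | LKF_VALBLK,
				     RES_NAME, strlen(RES_NAME), 0,
				     NULL, NULL, NULL);
	}

	/* Client side: CR is compatible with the server's PW, and a
	 * granted CR lock with LKF_VALBLK returns the current LVB
	 * contents. */
	static int read_server_address(char *buf, size_t len)
	{
		struct dlm_lksb lksb;
		char lvb[DLM_LVB_LEN];
		int rv;

		memset(&lksb, 0, sizeof(lksb));
		lksb.sb_lvbptr = lvb;
		rv = dlm_lock_wait(LKM_CRMODE, &lksb, LKF_VALBLK,
				   RES_NAME, strlen(RES_NAME), 0,
				   NULL, NULL, NULL);
		if (rv)
			return rv;

		snprintf(buf, len, "%.*s", DLM_LVB_LEN, lvb);
		return dlm_unlock_wait(lksb.sb_lkid, 0, &lksb);
	}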


> >  How complicated a resource metric were you thinking about?
> 
> User defined, where one of the things the user can say is "automagic".  
> 
> A simple priority scheme would let the user assign a priority number for 
> each node, and the resource manager picks the node with the highest 
> priority (there is no point in distributing this algorithm).

Sounds like failover domains.
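
The priority rule itself is a few lines of code; illustrative only, as
the node table and membership flag here are made up, not rgmanager
structures:

	/* Pick the highest-priority node that is currently online,
	 * per the scheme described above.  Returns NULL if no
	 * candidate is online. */
	struct node_prio {
		const char *name;
		int priority;	/* user-assigned; higher wins here */
		int online;	/* from cluster membership */
	};

	static const struct node_prio *
	pick_target(const struct node_prio *nodes, int count)
	{
		const struct node_prio *best = NULL;
		int i;

		for (i = 0; i < count; i++) {
			if (!nodes[i].online)
				continue;
			if (!best || nodes[i].priority > best->priority)
				best = &nodes[i];
		}
		return best;
	}

An ordered failover domain expresses the same rule in configuration
rather than code, which is why all nodes can evaluate it locally at
failover time.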


-- Lon



