[Cluster-devel] [RFC] Killing ccs for good

Fabio Massimo Di Nitto fabbione at ubuntu.com
Tue Oct 23 07:33:17 UTC 2007


Hi everybody,

A few months ago David tickled me with the idea of killing ccs, so I started
poking around for fun here and there, and I think I am at a point where
we can start looking at the work that has been done so far.

What does killing ccs buy us?

- removal of another daemon from the stack, with everything that comes with it
  (less code to handle, fewer open ports on the network, less stuff to
  synchronize, etc.)
- the option to design a new, more useful API.
- add your reason here.

I created a git branch on the git.fugedabout.it sandbox to work on this project:

http://git.fugedabout.it/?p=people/fabbione/cluster-noccs.git;a=shortlog;h=noccs

Please remember that this is a test/private branch that gets rebased once in a
while (breaking the classic git pull), and that it lives on a sandbox ==
*absolutely no guarantee it will exist forever*

So far we have:

- completely removed ccs/ from the tree.
- implemented a small libcman API that:
  (high level)
  - reads arbitrary cluster.conf files.
  - loads them.
  - queries them.
  (low level)
  - handles config conversion from buf to conf and vice versa.
  - downloads/uploads the config from/to cman/aisexec.
- implemented the basic bits in cman/aisexec.
- converted all services to use the new API (except rgmanager - see below)
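
As a rough illustration of the high-level and low-level bits listed above, here
is a minimal Python sketch. It is not the libcman code (which lives in the
branch); the function names buf_to_conf, conf_to_buf and query are made up for
this example, and the sample configuration is a hypothetical, heavily trimmed
cluster.conf.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down cluster.conf; real files carry many more sections.
SAMPLE_CONF = b"""<?xml version="1.0"?>
<cluster name="demo" config_version="1">
  <clusternodes>
    <clusternode name="node1" nodeid="1"/>
    <clusternode name="node2" nodeid="2"/>
  </clusternodes>
</cluster>
"""


def buf_to_conf(buf: bytes) -> ET.Element:
    """Low level: turn a raw config buffer into an in-memory tree."""
    return ET.fromstring(buf)


def conf_to_buf(conf: ET.Element) -> bytes:
    """Low level: serialize the in-memory tree back into a buffer."""
    return ET.tostring(conf)


def query(conf: ET.Element, path: str) -> list:
    """High level: return every element matching a path-style query."""
    return conf.findall(path)


# Load, query, then round-trip through a buffer (as an upload/download would).
conf = buf_to_conf(SAMPLE_CONF)
names = [node.get("name") for node in query(conf, "./clusternodes/clusternode")]
roundtrip = buf_to_conf(conf_to_buf(conf))
```

The round trip stands in for the upload/download path only conceptually; how
the buffer actually travels to and from cman/aisexec is not modelled here.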

The libcman API is not final yet. There might be changes related to the way in
which we upload/download the config. The API is still missing a number of ccs_*
functions whose absence would be seen as a regression (even though I doubt
there are that many users out there).
So far the new API offers some interesting bits compared to the ccs_ one,
especially on the query front. The way in which queries are made is the same,
but the results are passed back in a slightly different way that will help to
clean up all those endless query loops in various services.
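
To make the "endless query loops" point concrete, here is a small Python
sketch of the two styles. It rests on an assumption: that the loops in question
issue one indexed query per iteration until a lookup fails. The function names
are hypothetical and this is not the libcman result format, only the general
idea of getting all matches back from a single query.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample configuration used only for this illustration.
conf = ET.fromstring(
    '<cluster name="demo" config_version="1">'
    "<clusternodes>"
    '<clusternode name="node1" nodeid="1"/>'
    '<clusternode name="node2" nodeid="2"/>'
    '<clusternode name="node3" nodeid="3"/>'
    "</clusternodes>"
    "</cluster>"
)


def names_old_style(conf):
    """One indexed query per iteration, stopping at the first miss."""
    names = []
    index = 1  # position predicates are 1-based
    while True:
        node = conf.find(f"./clusternodes/clusternode[{index}]")
        if node is None:
            break
        names.append(node.get("name"))
        index += 1
    return names


def names_new_style(conf):
    """A single query whose result already holds every match."""
    return [n.get("name") for n in conf.findall("./clusternodes/clusternode")]
```

Both functions return the same list of names; the second one simply has no
index bookkeeping and no "query until error" termination condition.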

The cman/aisexec implementation is really basic at the moment. It doesn't
support downloading of the configuration from the network yet and I have just
spotted a lot of duplicated code that can be simplified by being all in the same
place (and not split between ccs and cman).

The patch set in git is not as atomic as it could be. This work started in a
private CVS branch and was imported into git almost in one shot. Some commits
are just too big to be as understandable as they should be.

We tested the overall result with simple configurations and everything seems to
work so far.

What we would like to do next:

- first of all, we need to reach consensus on whether we all want to kill ccs.
I have done enough work now for us to make a decision, but I don't plan to spend
more time on this unless we all agree on what direction to take.

and assuming we have consensus:

- get people to start testing the tree. I expect bugs and regressions even with
those simple changes. This will give us enough input to see whether what we have
as a basic start is solid. At this point I don't expect much code review, just
some plain testing in your setups.

- start a general review of the implemented API bits and the rgmanager
conversion. This will be a very useful way to force somebody (hi Lon! :)) to
review the entire API in detail and to use an extra pair of eyes to pin down
what's missing. rgmanager also exercises a lot of external bits, such as
ccs_tools, that need to be re-implemented in a compatibility form.

- add a service config callback: this is going to be a major win over ccs. A
service will register a callback that is invoked each time there is a config
change. This will greatly simplify config sync compared to how it stands now.

- while somebody (hi again Lon! ;)) converts rgmanager, we will implement the
missing ccs bits in cman.
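
The service config callback mentioned in the list above does not exist yet, so
the following Python sketch only illustrates the general pattern: services
register a function, and it is invoked whenever a newer configuration arrives.
The ConfigNotifier class, its methods and the use of a version number to detect
changes are all assumptions made for this example.

```python
from typing import Callable, List

# A callback receives the version number of the newly active configuration.
ConfigCallback = Callable[[int], None]


class ConfigNotifier:
    """Hypothetical registry that fans config changes out to services."""

    def __init__(self) -> None:
        self._callbacks: List[ConfigCallback] = []
        self.version = 0

    def register(self, callback: ConfigCallback) -> None:
        """Called once by each service that cares about config changes."""
        self._callbacks.append(callback)

    def update(self, new_version: int) -> bool:
        """Apply a new config version; notify services only if it is newer."""
        if new_version <= self.version:
            return False
        self.version = new_version
        # Iterate over a copy so a callback may register further callbacks.
        for callback in list(self._callbacks):
            callback(new_version)
        return True
```

A service would call register() once at startup and then re-read whatever it
needs from inside its callback, instead of polling for config changes itself.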

Once the API is stable and tested:

- move the config handler to its own openais service. This will be totally
transparent to the applications and we will rework most of the internal bits
to use the aisexec db.

Thanks
Fabio

PS I am aware of a regression in fence caused by a recent bug fix that Ryan
committed, which adds a new call to ccs_lib. Patrick and I discussed it on IRC:
the same data are already available in some form within the aisexec / cman db,
but we have had no time to convert it yet. This will happen in CVS HEAD as well
(if I understood Patrick correctly), as there is no point in storing or querying
the same information twice.

-- 
I'm going to make him an offer he can't refuse.



