[Linux-cluster] new userland cman

Fri Sep 30 19:40:00 UTC 2005

Patrick,

Thanks for the work.

I have a few comments inline.

On Fri, 2005-09-30 at 14:44 +0100, Patrick Caulfield wrote:
> This has got to the stage where I'd be grateful for any testing other people can
>  do, though obviously don't endanger a production system!
> 
> You should be able to run the DLM and GFS on this, see
> 
> https://www.redhat.com/archives/linux-cluster/2005-September/msg00177.html
> 
> for (very) brief instructions. There is a new clvm patch available in the
> cluster CVS at cman/lib/clvmd-libcman.diff
> 
> Here's a list of the user-visible changes, please feel free to ask questions on
> the list.
> 
> good
> ----
> - (optional) encryption & authentication of communications
> - Multiple interface support (unfinished, needs AIS and cman work)
> - Automatic re-reading of CCS if a new node joins with an updated config file
> 
> bad
> ---
> - Always uses CCS (cman_tool join -X removed)*
> - Compulsory static node IDs (easily enforced by GUI or command-line)
> - Can't have multiple clusters using the same port number unless they use a
> different encryption key. Currently cluster name is ignored.**
> - Hard limit to size of cluster (set at compile time to 32 currently)***
> 

I hope to have multiring in 2006; then we should scale to hundreds of
processors...

> neutral
> -------
> - Always uses multicast (no broadcast). A default multicast address is supplied
> if none is given

If broadcast is important, which I guess it may be, we can pretty easily
add this support...

> - libcman is the only API (a compatible libcman is available for the kernel
> version)
> - Simplified CCS schema, but will read old one if it has nodeids in it.****
> 
> internal
> --------
> - Usable messaging API
> - Robust membership algorithm
> - Community involvement, multiple developers.
> 
> 
> * I very much doubt that anyone will notice apart from maybe Dave & me
> 
> ** Could fix this in AIS, but I'm not sure the patch would be popular upstream.
> It's much more efficient to run them on different ports or multicast addresses
> anyway. Incidentally: DON'T run an encrypted and a non-encrypted cluster on the
> same port & multicast address (not that you would!) - the non-encrypted ones
> will crash.
> 

On this point, you mention you could fix "this".  Do you mean having two
clusters use the same port and IPs?  I have also considered this and do
want it, by having each "cluster" join a specific group at startup to
serve as the cluster membership view.  Unfortunately this would require
process group membership, and the process groups interface (totempg.c)
is unfinished, so this isn't possible today.  Note I'd take a patch from
someone who finished the job on this interface :)  I, for example, would
like communication for a specific checkpoint to go over a specific named
group, instead of to everyone connected to totem.  Then the clm could
join a group and get membership events, and the checkpoint service for a
specific checkpoint could join a group, communicate on that group, and
get membership events for that group, etc.
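
To make the named-group idea a bit more concrete, here is a rough sketch
in Python.  It is only a toy in-process model and says nothing about how
totempg.c does or would work; GroupBus and its join/leave/send methods
are names made up for illustration.  The point is just that messages and
membership events go only to those who joined the named group.

```python
from collections import defaultdict


class GroupBus:
    """Toy stand-in for a process-group layer (hypothetical, not totempg)."""

    def __init__(self):
        # group name -> {member name: (on_members, on_message)}
        self._groups = defaultdict(dict)

    def join(self, group, member, on_members, on_message):
        """Add a member to a named group and publish the new view."""
        self._groups[group][member] = (on_members, on_message)
        self._notify(group)

    def leave(self, group, member):
        """Remove a member and publish the new view to those remaining."""
        if self._groups[group].pop(member, None) is not None:
            self._notify(group)

    def send(self, group, sender, payload):
        """Deliver only to members of the named group, not to everyone."""
        for _, on_message in list(self._groups[group].values()):
            on_message(group, sender, payload)

    def _notify(self, group):
        # Membership events are scoped to the group that changed.
        view = sorted(self._groups[group])
        for on_members, _ in list(self._groups[group].values()):
            on_members(group, view)


events = []


def recorder(name):
    """Build a pair of callbacks that log what a given joiner sees."""
    return (
        lambda g, view: events.append((name, "members", g, tuple(view))),
        lambda g, sender, msg: events.append((name, "msg", g, sender, msg)),
    )


bus = GroupBus()
bus.join("clm", "node1", *recorder("node1-clm"))
bus.join("clm", "node2", *recorder("node2-clm"))
bus.join("ckpt:A", "node1", *recorder("node1-ckptA"))
bus.send("ckpt:A", "node1", "write section 0")
```

Here the "clm" joiners see the membership view for "clm" only, and the
checkpoint message reaches only whoever joined "ckpt:A".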

What did you have in mind here?

regards
-steve

> *** I doubt that the old cman worked well above 30 nodes anyway. I intend to do
> some AIS hacking to improve this situation by drastically reducing the network
> packet size.
> 
> **** The main difference here is that the multicast address need only be
> specified once, in the <cman> section of cluster.conf. The interface used will
> be the one that is bound to the hostname mentioned.
> 
> 
> patrick
>