[Linux-cluster] [RFC] Generic Kernel API

Daniel McNeil daniel at osdl.org
Thu Sep 30 17:52:15 UTC 2004


On Thu, 2004-09-30 at 00:10, Patrick Caulfield wrote:
> On Wed, Sep 29, 2004 at 04:55:41PM -0700, Daniel McNeil wrote:
> > 
> > Patrick,
> > 
> > I read over your api and have a few comments.
> > 
> > Simple stuff first.  The membership_node looks very similar to the SAF
> > interfaces, so I assume the fields mean the same.  mn_member is 32 bits
> > but it just specifies if this node is a member (1) or not (0), right?
> 
> Yes.
>  
> > The mni_viewnumber is 32 bits; in SAF it is 64 bits.  Might want it to
> > be 64 bits.  (I think nodeid should be 64 bits, but SAF has it as 32 bits,
> > so I guess it is ok).
> > 
> > What is mni_context?
> 
> It's an opaque structure passed in from the caller that gets passed back via the
> callback so that the caller can identify the request (or attach private
> information).
>  
> > A bit more description of these fields would be nice -- don't have to
> > be as verbose as SAF :)
> > 
> > In membership_ops, you have start_notify and notify_stop -- might want
> > to be consistent with the naming (either notify_start or stop_notify).
> 
> Yes, I fixed that!
>  
> > Now the more complicated stuff:
> > 
> > I think we need more information on how this api works and a description
> > of how the calls are used.
> > 
> > cm_attach() is used to attach to a particular cluster provider that
> > has been registered.  Who calls cm_attach()?
> > 
> > I assume whoever calls cm_attach() will then be calling the ops
> > functions.
> > 
> > What is cmprivate in start_notify?
> > 
> > Once start_notify is called the CM module will call the callback
> > function whenever there is a change until notify_stop is called?
> > 
> > The membership_callback_routine only has "context" and "reason".
> > Again, what is context?  What is reason?
> > How is the data returned?  I'm guessing a struct membership_notify_info
> > is filled in, in the buffer passed to start_notify.  Is that
> > right?  A bit more description here would be good.
> > 
> > What is the difference between get_quorate() and get_info() which
> > returns a struct quorum_info with qi_quorum?
> 
> get_quorate returns a boolean value that just says whether the cluster has
> quorum or not. get_info returns a struct showing the elements that went into
> making that decision. I'm not really sure how much use it is to applications but
> I don't like hiding information!
>  
> > Should get_quorate() and get_info() take a viewnumber so we can match
> > up the list of members and whether it had quorum?  (It could have changed
> > after the callback with membership before we call get_quorate.)
> 
> The problem there is keeping a list of members for each view, which seems like
> rather a waste of memory in kernel space.

Would it be ok to just keep the info for the last viewnumber only?
If the viewnumber did not match then an error could be returned
for get_quorate.  get_info could return the viewnumber as part
of quorum info.
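
A minimal sketch of that idea follows. It is not taken from Patrick's
API: the qi_viewnumber field, the record_view() helper, the choice of
-ESTALE as the error, and the 64-bit view number are all assumptions
made for illustration. Only qi_quorum and the get_quorate/get_info
names come from the thread.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical quorum state. Only the most recent view is remembered,
 * so no per-view member list has to be kept in kernel memory.
 */
struct quorum_info {
	uint64_t qi_viewnumber;	/* assumed field: view this info belongs to */
	int      qi_quorum;	/* nonzero if that view had quorum */
};

static struct quorum_info last_view;

/* Hypothetical hook: the provider calls this on every membership change. */
void record_view(uint64_t viewnumber, int quorate)
{
	last_view.qi_viewnumber = viewnumber;
	last_view.qi_quorum = quorate;
}

/*
 * Returns 1 or 0 for the requested view, or -ESTALE (an assumed error
 * code) if the caller's view number is no longer the current one.
 */
int get_quorate(uint64_t viewnumber)
{
	if (viewnumber != last_view.qi_viewnumber)
		return -ESTALE;
	return last_view.qi_quorum ? 1 : 0;
}

/* Copies out the latest info, view number included, so callers can match it. */
void get_info(struct quorum_info *out)
{
	*out = last_view;
}
```

A caller that gets -ESTALE would know the membership changed after its
callback and could wait for the next one. Real kernel code would also
need locking around last_view, which is left out here.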

What does get_votes do if nodeid is NOT currently in the cluster
membership?

> 
> 
> I'm in the middle of writing an implementation of this (with a cman plugin) that
> I'll post shortly. That should clear up any other points that I may seem to have
> ignored above! Some of the things have been fixed in the meantime. I should get
> it posted this week.

Good.  Code should clear things up!

Thanks,

Daniel



