
Re: [Linux-cluster] RHCS 5.1 latest packages, 2-node cluster, doesn't come up with only 1 node

On Fri, 2008-02-08 at 19:33 -0200, Celso K. Webber wrote:
> Hi Lon,
> On Fri, 08 Feb 2008 16:15:36 -0500, Lon Hohberger wrote
> > On Fri, 2008-02-08 at 11:18 -0200, Celso K. Webber wrote:
> > > Feb  7 20:07:01 mrp02 kernel: dlm: no local IP address has been set
> > > Feb  7 20:07:01 mrp02 kernel: dlm: cannot start dlm lowcomms -107
> > 
> > This is why rgmanager didn't work (and possibly even exited).  Does
> > 'uname -n' match what's in cluster.conf?
> > 
> No, it does not! I didn't know it should match. I've been configuring RHCS 
> clusters since version 2.1 and this has never been a problem, sorry!!!
> Well, I usually do the following in /etc/hosts:
> -> assume network is for public access
> -> assume network is for heartbeat
>   realservername1.domainname realservername1
>   realservername2.domainname realservername2
>      node1.localdomain node1
>      node2.localdomain node2
>   servicename1.domainname servicename1
>   servicename2.domainname servicename2
> ... and so on for other virtual IPs for services ...
> Then I configure in cluster.conf the names associated with the private 
> addresses/interfaces, so that I'm sure that heartbeat traffic is going 
> through the correct interfaces.
> For obvious reasons, "uname -n" returns the public hostnames, such as 
> realservername1.domainname.
> I noticed that for some time there has been a question in the FAQ explaining 
> how to "bind" the heartbeat traffic to a specific interface/address. But I was 
> happy with my solution, especially because the answer to that question 
> suggested touching the init script, and I don't like to alter standard system 
> files, especially init scripts. At least in RHCS v4, I didn't find a better 
> way to "bind" the heartbeat traffic to a specific interface. I didn't 
> experiment with this in RHCS v5, I just went on with my previous method.
> For me this is common practice. For instance, Oracle Database respects an 
> environment variable called ORACLE_HOSTNAME, so that you can "instruct" the 
> various utilities to use that name instead of the real server's name. 
> This is very useful in a Cluster environment.
> Please tell me:
> * is it really wrong to set the node names in cluster.conf to a name 
> different from that reported by "uname -n"?
> * if it is "ugly" or considered wrong, what is the best way to instruct CMAN 
> which interface to use for heartbeat?

I think it's mostly fixed in RHEL5.

We have updated the CMAN init script for RHEL5 to
allow /etc/sysconfig/cluster to have "NODENAME=preferred_host_name".  It
will go out with the next update, but here it is in CVS:

*massive url*
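
For illustration only, and assuming the private name node1.localdomain from
your /etc/hosts layout (substitute whatever name cluster.conf uses for that
node), the file would carry a single line along these lines:

```shell
# /etc/sysconfig/cluster (read by the updated RHEL5 cman init script)
# node1.localdomain is an assumed example name; use the name that
# cluster.conf lists for this node.
NODENAME=node1.localdomain
```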

It could still be a bug.  The DLM's failure to determine the local hostname
is definitely why rgmanager died (it needs the DLM!).  Updating the
script / trying to force CMAN with a specific node name is just one way
to eliminate a possible cause (and it might fix it, too ;) ).
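
A quick way to confirm the mismatch is to look for the output of 'uname -n'
among the clusternode entries.  The helper below is only a sketch:
node_in_conf is a made-up name, and it assumes that 'name' is the first
attribute of each clusternode element and that cluster.conf lives in the
usual /etc/cluster/cluster.conf.

```shell
# Sketch: succeed if the given name appears as a clusternode in cluster.conf.
# node_in_conf is a hypothetical helper, not part of the cluster suite.
node_in_conf() {
    # $1 = node name to look for, $2 = path to cluster.conf
    # Assumes name="..." directly follows "<clusternode ".
    grep -Fq "<clusternode name=\"$1\"" "$2"
}

# Typical use on a cluster node (path assumed):
#   node_in_conf "$(uname -n)" /etc/cluster/cluster.conf \
#       || echo "uname -n is not a node name in cluster.conf"
```

If the check fails, CMAN has to be told explicitly which name to use.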

> * does this solution work both for RHCS v4 and v5?

The RHEL5 script is not backwards compatible, but cman_tool join -n
<preferred_host_name> is.
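
As a sketch of the manual route, assuming the same example name
node1.localdomain (the join_cluster wrapper and its DRY_RUN switch are
illustrative, not something shipped with CMAN):

```shell
# Sketch: join the cluster under an explicit node name.
# join_cluster and DRY_RUN are hypothetical conveniences for this example.
join_cluster() {
    # $1 = node name as it appears in cluster.conf
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Only show the command that would be run.
        echo "cman_tool join -n $1"
    else
        cman_tool join -n "$1"
    fi
}

# Example (prints the command instead of joining):
#   DRY_RUN=1 join_cluster node1.localdomain
```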

> * would it be better to have only one interface for public and heartbeat 
> traffic, maybe channel bonding dual NICs?

Better is certainly a matter of perception in this case.  I would expect
you'd want to get your current configuration working before altering
your network topology.  Also, it's not like your configuration is
particularly strange...

> * is there any other significant difference between RHCSv4 and v5 I should be 
> aware of?

> As always, thank you very very much for your support!

We do what we can, but please keep in mind that a public mailing list
isn't a very good support forum compared to (for example): 


-- Lon
