Re: [Linux-cluster] RHCS 5.1 latest packages, 2-node cluster, doesn't come up with only 1 node



On Fri, 2008-02-08 at 19:33 -0200, Celso K. Webber wrote:
> Hi Lon,
> 
> On Fri, 08 Feb 2008 16:15:36 -0500, Lon Hohberger wrote
> > On Fri, 2008-02-08 at 11:18 -0200, Celso K. Webber wrote:
> > > Feb  7 20:07:01 mrp02 kernel: dlm: no local IP address has been set
> > > Feb  7 20:07:01 mrp02 kernel: dlm: cannot start dlm lowcomms -107
> > 
> > This is why rgmanager didn't work (and possibly even exited).  Does
> > 'uname -n' match what's in cluster.conf?
> > 
> No, it does not! I didn't know it should match. I've been configuring RHCS 
> clusters since version 2.1 and this never bothered me, sorry!!!
> 
> Well, I usually do the following in /etc/hosts:
> -> assume network 192.168.1.0/24 is for public access
> -> assume network 10.0.0.0/8 is for heartbeat
> 
> 192.168.1.1   realservername1.domainname realservername1
> 192.168.1.2   realservername2.domainname realservername2
> 
> 10.0.0.1      node1.localdomain node1
> 10.0.0.2      node2.localdomain node2
> 
> 192.168.1.3   servicename1.domainname servicename1
> 192.168.1.4   servicename2.domainname servicename2
> ... and so on for other virtual IPs for services ...
> 
> Then I configure in cluster.conf the names associated with the private 
> addresses/interfaces, so that I'm sure that heartbeat traffic is going 
> through the correct interfaces.
> 
> For obvious reasons, "uname -n" returns the public hostnames, such as 
> realservername1.domainname.
> 
> I noticed that for some time there has been a question in the FAQ explaining 
> how to "bind" the heartbeat traffic to a specific interface/address. But I was 
> happy with my solution, especially because the answer to that question 
> suggested touching the init script, and I don't like to alter standard system 
> files, especially init scripts. At least in RHCS v4, I didn't find a better 
> way to "bind" the heartbeat traffic to a specific interface. I haven't 
> tried this with RHCS v5; I just went on with my previous method.
> 
> For me this is common practice, for instance, Oracle Database respects an 
> environment variable called ORACLE_HOSTNAME, so that you can "instruct" the 
> several utilities to consider that name instead of the real server's name. 
> This is very useful in a Cluster environment.
> 
> Please tell me:
> * is it really wrong to set the node names in cluster.conf to a name 
> different from that reported by "uname -n"?
> * if it is "ugly" or considered wrong, what is the best way to instruct CMAN 
> which interface to use for heartbeat?

I think it's mostly fixed in RHEL5.

We have updated the CMAN init script for RHEL5 to
allow /etc/sysconfig/cluster to have "NODENAME=preferred_host_name".  It
will go out with the next update, but here it is in CVS:

*massive url*

http://sources.redhat.com/cgi-bin/cvsweb.cgi/~checkout~/cluster/cman/init.d/Attic/cman?rev=1.26.2.6&content-type=text/plain&cvsroot=cluster&hideattic=0&only_with_tag=RHEL5

tinyurl:

http://tinyurl.com/2fg6nd
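
For reference, the setting might look something like the fragment below.
This is only a sketch: "node1" is the heartbeat name from the /etc/hosts
layout quoted above, not anything the script requires, so substitute
whatever name your cluster.conf actually uses for this node.

```shell
# /etc/sysconfig/cluster
# Read by the updated RHEL5 cman init script.
# Assumption: "node1" is the private (heartbeat) name listed for this
# node in cluster.conf and resolvable through /etc/hosts.
NODENAME=node1
```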


It still could be a bug.  The DLM being unable to determine the local
hostname is definitely why rgmanager died (it needs the DLM!).  Updating
the script / trying to force CMAN with a specific node name is just one
way to eliminate a possible cause (and it might fix it, too ;) ).
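
A quick way to see the mismatch is to compare 'uname -n' with the
clusternode names in cluster.conf.  The following is a rough sketch, not
a supported tool: the node_in_conf helper and the CLUSTER_CONF override
are made up for illustration, and it assumes each clusternode element
keeps its name attribute on the same line as the opening tag.

```shell
#!/bin/sh
# Sketch: report whether this host's `uname -n` is a clusternode name.
# Assumption: entries look like <clusternode name="node1" nodeid="1" ...>
# with the name attribute on the same line as the tag.

node_in_conf() {
    # $1 = node name to look for, $2 = path to cluster.conf
    grep '<clusternode ' "$2" | grep -qF "name=\"$1\""
}

# CLUSTER_CONF is a hypothetical override, handy for trying this on a copy.
conf=${CLUSTER_CONF:-/etc/cluster/cluster.conf}
me=$(uname -n)

if [ ! -r "$conf" ]; then
    echo "cannot read $conf"
elif node_in_conf "$me" "$conf"; then
    echo "OK: $me is listed as a clusternode in $conf"
else
    echo "MISMATCH: $me is not a clusternode name in $conf"
fi
```

With the /etc/hosts layout quoted above, this would report a mismatch,
since 'uname -n' returns the public name while cluster.conf lists the
private one.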


> * does this solution work both for RHCS v4 and v5?

The RHEL5 script is not backwards compatible with RHCS v4, but
'cman_tool join -n <preferred_host_name>' is.

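As a sketch of the manual approach, assuming "node1" is the heartbeat
name from the /etc/hosts example above.  The join_as wrapper and the
CMAN_TOOL variable are hypothetical conveniences for illustration; only
'cman_tool join -n' itself comes from the tools.

```shell
#!/bin/sh
# Sketch: join the cluster under an explicit node name instead of
# whatever `uname -n` reports.

join_as() {
    # $1 = node name as it appears in cluster.conf
    # CMAN_TOOL is a hypothetical knob: set it to "echo" for a dry run.
    ${CMAN_TOOL:-cman_tool} join -n "$1"
}

# Usage, as root on the cluster node:
#   join_as node1
# Dry run that only prints the arguments:
#   CMAN_TOOL=echo join_as node1
```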

> * would it be better to have only one interface for public and heartbeat 
> traffic, maybe channel bonding dual NICs?

Better is certainly a matter of perception in this case.  I would expect
you'd want to get your current configuration working before altering
your network topology.  Also, it's not like your configuration is
particularly strange...


> * is there any other significant difference between RHCSv4 and v5 I should be 
> aware of?


> As always, thank you very very much for your support!

We do what we can, but please keep in mind that a public mailing list
isn't a very good support forum compared to (for example): 

  https://www.redhat.com/apps/support/

-- Lon
