[Linux-cluster] GFS on CentOS - cman unable to start
Luiz Gustavo Tonello
gustavo.tonello at gmail.com
Fri Jan 6 20:05:23 UTC 2012
Hi,
Are these servers running on VMware? On the same host?
Is SELinux disabled? Is iptables blocking anything?
In my environment I had a problem starting GFS2 with the servers on different
hosts.
To get the servers to form a cluster, I had to migrate one server to the same
host as the other and restart it.
I think one of the causes was the virtual switches.
To solve it, I changed the multicast IP to 225.0.0.13 in cluster.conf:
<multicast addr="225.0.0.13"/>
I also added a static route on both servers, to use the default gateway.
I don't know if this is the correct approach, but it solved my problem.
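
A minimal sketch for checking that change, assuming the <multicast> element
sits inside <cman> (that placement and the trimmed-down sample config are
assumptions for illustration, not taken from a working cluster). It reads
the configured address and confirms it falls in the IPv4 multicast range:

```python
import ipaddress
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down cluster.conf used only for illustration.
SAMPLE_CONF = """\
<cluster config_version="26" name="gdao_cluster">
  <clusternodes>
    <clusternode name="test01" nodeid="1" votes="1"/>
    <clusternode name="test02" nodeid="2" votes="1"/>
  </clusternodes>
  <cman>
    <multicast addr="225.0.0.13"/>
  </cman>
</cluster>
"""


def multicast_addrs(conf_xml):
    """Return every addr attribute found on <multicast> elements."""
    root = ET.fromstring(conf_xml)
    return [el.get("addr") for el in root.iter("multicast") if el.get("addr")]


def is_ipv4_multicast(addr):
    """True if addr parses as an IPv4 address inside 224.0.0.0/4."""
    try:
        ip = ipaddress.ip_address(addr)
    except ValueError:
        return False
    return ip.version == 4 and ip.is_multicast


if __name__ == "__main__":
    for addr in multicast_addrs(SAMPLE_CONF):
        status = "ok" if is_ipv4_multicast(addr) else "NOT a multicast address"
        print(addr, status)
```

This only validates the config text. It does not prove that multicast
traffic actually passes through the virtual switches between the hosts.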
I hope that helps.
Regards.
On Fri, Jan 6, 2012 at 5:01 PM, Wes Modes <wmodes at ucsc.edu> wrote:
> Hi, Steven.
>
> I've tried just about every possible combination of hostname and
> cluster.conf.
>
> ping to test01 resolves to 128.114.31.112
> ping to test01.gdao.ucsc.edu resolves to 128.114.31.112
>
> It feels like the right thing is being returned. This feels like it
> might be a quirk (or bug possibly) of cman or openais.
>
> There are some old bug reports around this, for example
> https://bugzilla.redhat.com/show_bug.cgi?id=488565. It sounds like the
> way that cman reports this error is anything but straightforward.
>
> Is there anyone who has encountered this error and found a solution?
>
> Wes
>
>
> On 1/6/2012 2:00 AM, Steven Whitehouse wrote:
> > Hi,
> >
> > On Thu, 2012-01-05 at 13:54 -0800, Wes Modes wrote:
> >> Howdy, y'all. I'm trying to set up GFS in a cluster on CentOS systems
> >> running on VMware. The GFS FS is on a Dell EqualLogic SAN.
> >>
> >> I keep running into the same problem despite many differently-flavored
> >> attempts to set up GFS. The problem comes when I try to start cman, the
> >> cluster management software.
> >>
> >> [root at test01]# service cman start
> >> Starting cluster:
> >> Loading modules... done
> >> Mounting configfs... done
> >> Starting ccsd... done
> >> Starting cman... failed
> >> cman not started: Can't find local node name in cluster.conf
> >> /usr/sbin/cman_tool: aisexec daemon didn't start
> >> [FAILED]
> >>
> > This looks like what it says... whatever the node name is in
> > cluster.conf, it doesn't exist when the name is looked up, or possibly
> > it does exist, but is mapped to the loopback address (it needs to map to
> > an address which is valid cluster-wide)
> >
> > Since your config files look correct, the next thing to check is what
> > the resolver is actually returning. Try (for example) a ping to test01
> > (you need to specify exactly the same form of the name as is used in
> > cluster.conf) from test02 and see whether it uses the correct ip
> > address, just in case the wrong thing is being returned.
> >
> > Steve.
> >
> >> [root at test01]# tail /var/log/messages
> >> Jan 5 13:39:40 testbench06 ccsd[13194]: Unable to connect to
> >> cluster infrastructure after 1193640 seconds.
> >> Jan 5 13:40:10 testbench06 ccsd[13194]: Unable to connect to
> >> cluster infrastructure after 1193670 seconds.
> >> Jan 5 13:40:24 testbench06 openais[3939]: [MAIN ] AIS Executive
> >> Service RELEASE 'subrev 1887 version 0.80.6'
> >> Jan 5 13:40:24 testbench06 openais[3939]: [MAIN ] Copyright (C)
> >> 2002-2006 MontaVista Software, Inc and contributors.
> >> Jan 5 13:40:24 testbench06 openais[3939]: [MAIN ] Copyright (C)
> >> 2006 Red Hat, Inc.
> >> Jan 5 13:40:24 testbench06 openais[3939]: [MAIN ] AIS Executive
> >> Service: started and ready to provide service.
> >> Jan 5 13:40:24 testbench06 openais[3939]: [MAIN ] local node name
> >> "test01.gdao.ucsc.edu" not found in cluster.conf
> >> Jan 5 13:40:24 testbench06 openais[3939]: [MAIN ] Error reading CCS
> >> info, cannot start
> >> Jan 5 13:40:24 testbench06 openais[3939]: [MAIN ] Error reading
> >> config from CCS
> >> Jan 5 13:40:24 testbench06 openais[3939]: [MAIN ] AIS Executive
> >> exiting (reason: could not read the main configuration file).
> >>
> >> Here are details of my configuration:
> >>
> >> [root at test01]# rpm -qa | grep cman
> >> cman-2.0.115-85.el5_7.2
> >>
> >> [root at test01]# echo $HOSTNAME
> >> test01.gdao.ucsc.edu
> >>
> >> [root at test01]# hostname
> >> test01.gdao.ucsc.edu
> >>
> >> [root at test01]# cat /etc/hosts
> >> # Do not remove the following line, or various programs
> >> # that require network functionality will fail.
> >> 128.114.31.112 test01 test01.gdao test01.gdao.ucsc.edu
> >> 128.114.31.113 test02 test02.gdao test02.gdao.ucsc.edu
> >> 127.0.0.1 localhost.localdomain localhost
> >> ::1 localhost6.localdomain6 localhost6
> >>
> >> [root at test01]# sestatus
> >> SELinux status: enabled
> >> SELinuxfs mount: /selinux
> >> Current mode: permissive
> >> Mode from config file: permissive
> >> Policy version: 21
> >> Policy from config file: targeted
> >>
> >> [root at test01]# cat /etc/cluster/cluster.conf
> >> <?xml version="1.0"?>
> >> <cluster config_version="25" name="gdao_cluster">
> >> <fence_daemon post_fail_delay="0" post_join_delay="120"/>
> >> <clusternodes>
> >> <clusternode name="test01" nodeid="1" votes="1">
> >> <fence>
> >> <method name="single">
> >> <device name="gfs_vmware"/>
> >> </method>
> >> </fence>
> >> </clusternode>
> >> <clusternode name="test02" nodeid="2" votes="1">
> >> <fence>
> >> <method name="single">
> >> <device name="gfs_vmware"/>
> >> </method>
> >> </fence>
> >> </clusternode>
> >> </clusternodes>
> >> <cman/>
> >> <fencedevices>
> >> <fencedevice agent="fence_manual" name="gfs1_ipmi"/>
> >> <fencedevice agent="fence_vmware" name="gfs_vmware"
> >> ipaddr="gdvcenter.ucsc.edu" login="root" passwd="1hateAmazon.com"
> >> vmlogin="root" vmpasswd="esxpass"
> >> port="/vmfs/volumes/49086551-c64fd83c-0401-001e0bcd6848/eagle1/gfs1.vmx"/>
> >> </fencedevices>
> >> <rm>
> >> <failoverdomains/>
> >> </rm>
> >> </cluster>
> >>
> >> I've seen much discussion of this problem, but no definitive solutions.
> >> Any help you can provide will be welcome.
> >>
> >> Wes Modes
> >>
> >> --
> >> Linux-cluster mailing list
> >> Linux-cluster at redhat.com
> >> https://www.redhat.com/mailman/listinfo/linux-cluster
--
Luiz Gustavo P Tonello.