
Re: [Linux-cluster] Cluster doesn't come up while rebooting



I wouldn't worry about the "Magma Event: Membership Change" messages. I think they get printed whenever a machine joins or leaves the cluster. (You have to be part of the cluster to see the changes, which is why everyone sees the local change first, followed by whoever comes after them.) Do you have syslog set to print 'debug' messages? That may explain some of these messages.

Just to get this straight: after all machines are up, if you use 'clusvcadm' to start the services, it works, but if you reboot all machines, the services don't start on bootup? What happens if you reboot just one machine?

Someone will have to confirm my next few statements, but this is what I think is happening: rgmanager does a 'stop' when a machine comes up. I'm guessing this is why you are seeing the "is not mounted" and other messages. In your cluster.conf, you have the services set to 'autostart="0"', which I believe means they will not start by default, so you need to start them by hand when the machines come up. A potential solution is to ignore the messages you've attached (or figure out why syslog is being so verbose) and take out the 'autostart="0"' from cluster.conf.
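
If it helps, here is a rough sketch of how one might check which services have autostart disabled before editing the file. The cluster.conf fragment, the service names and the IP addresses in it are hypothetical (they are not taken from the attached files), and the function name is my own. It also assumes, as guessed above, that a service without the attribute is started by default.

```python
import xml.etree.ElementTree as ET

# Hypothetical cluster.conf fragment; service names and IP addresses are
# made up for illustration and do not come from the attached files.
SAMPLE_CONF = """\
<cluster name="example" config_version="1">
  <rm>
    <service name="mb1-svc" autostart="0">
      <ip address="192.0.2.10" monitor_link="1"/>
    </service>
    <service name="mb2-svc">
      <ip address="192.0.2.11" monitor_link="1"/>
    </service>
  </rm>
</cluster>
"""


def services_without_autostart(conf_xml):
    """Return the names of <service> elements that set autostart="0"."""
    root = ET.fromstring(conf_xml)
    return [
        svc.get("name")
        for svc in root.iter("service")
        if svc.get("autostart") == "0"
    ]


# Services listed here would need to be enabled by hand after a reboot.
print(services_without_autostart(SAMPLE_CONF))
```

To run it against a real file, read the contents of /etc/cluster/cluster.conf into a string and pass that in instead of SAMPLE_CONF.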

 brassow

On Jul 1, 2008, at 4:06 AM, Stevan Colaco wrote:

Hello All,

I need your help with an issue I am facing.

OS: RHEL4 ES Update 6 64bit

I have a deployment with a 2 + 1 cluster (2 active and one passive).
I have a service that should fail over, but I ran into problems when
I rebooted all 3 servers: the services came up disabled. When I use
clusvcadm to enable a service manually, it works. Here are the logs:

Jun 25 11:13:15 mb1 clurgmgrd[14825]: Resource Group Manager Starting
Jun 25 11:13:15 mb1 clurgmgrd[14825]: Loading Service Data
Jun 25 11:13:17 mb1 clurgmgrd[14825]: Initializing Services
Jun 25 11:13:17 mb1 clurgmgrd: [14825]: /dev/sdh1 is not mounted
Jun 25 11:13:17 mb1 clurgmgrd: [14825]: stop: Could not match LABEL=MB2-BACKUP with a real device
Jun 25 11:13:17 mb1 clurgmgrd[14825]: stop on fs:MB2-BACKUP returned 2 (invalid argument(s))
Jun 25 11:13:17 mb1 clurgmgrd: [14825]: stop: Could not match LABEL=MB2-STORE with a real device
Jun 25 11:13:17 mb1 clurgmgrd[14825]: stop on fs:MB2-STORE returned 2 (invalid argument(s))
Jun 25 11:13:17 mb1 clurgmgrd: [14825]: stop: Could not match LABEL=MB2-DBDATA with a real device
Jun 25 11:13:17 mb1 clurgmgrd[14825]: stop on fs:MB2-DBDATA returned 2 (invalid argument(s))
Jun 25 11:13:17 mb1 clurgmgrd: [14825]: stop: Could not match LABEL=MB2-CONF with a real device
Jun 25 11:13:17 mb1 clurgmgrd[14825]: stop on fs:MB2-CONF returned 2 (invalid argument(s))
Jun 25 11:13:17 mb1 clurgmgrd: [14825]: stop: Could not match LABEL=MB2-REDOLOG with a real device
Jun 25 11:13:17 mb1 clurgmgrd[14825]: stop on fs:MB2-REDOLOG returned 2 (invalid argument(s))
Jun 25 11:13:17 mb1 clurgmgrd: [14825]: stop: Could not match LABEL=MB2-INDEX with a real device
Jun 25 11:13:17 mb1 clurgmgrd[14825]: stop on fs:MB2-INDEX returned 2 (invalid argument(s))
Jun 25 11:13:17 mb1 clurgmgrd: [14825]: stop: Could not match LABEL=MB2-LOG with a real device
Jun 25 11:13:17 mb1 clurgmgrd[14825]: stop on fs:MB2-LOG returned 2 (invalid argument(s))
Jun 25 11:13:17 mb1 clurgmgrd: [14825]: stop: Could not match LABEL=MB2-ZIMBRA-CLUST with a real device
Jun 25 11:13:17 mb1 clurgmgrd[14825]: stop on fs:MB2-CLUSTER returned 2 (invalid argument(s))
Jun 25 11:13:22 mb1 clurgmgrd: [14825]: /dev/sdg1 is not mounted
Jun 25 11:13:27 mb1 clurgmgrd: [14825]: /dev/sdf1 is not mounted
Jun 25 11:13:33 mb1 clurgmgrd: [14825]: /dev/sde1 is not mounted
Jun 25 11:13:38 mb1 clurgmgrd: [14825]: /dev/sdd1 is not mounted
Jun 25 11:13:43 mb1 clurgmgrd: [14825]: /dev/sdc1 is not mounted
Jun 25 11:13:45 mb1 rgmanager: clurgmgrd startup failed
Jun 25 11:13:48 mb1 clurgmgrd: [14825]: /dev/sdb1 is not mounted
Jun 25 11:13:53 mb1 clurgmgrd: [14825]: /dev/sda1 is not mounted
Jun 25 11:13:58 mb1 clurgmgrd[14825]: Services Initialized
Jun 25 11:14:01 mb1 clurgmgrd[14825]: Logged in SG "usrm::manager"
Jun 25 11:14:01 mb1 clurgmgrd[14825]: Magma Event: Membership Change
Jun 25 11:14:01 mb1 clurgmgrd[14825]: State change: Local UP
Jun 25 11:14:01 mb1 clurgmgrd[14825]: State change: mbstandby.ku.edu.kw UP
Jun 25 11:14:03 mb1 clurgmgrd[14825]: Magma Event: Membership Change
Jun 25 11:14:03 mb1 clurgmgrd[14825]: State change: mb2.ku.edu.kw UP


MB2 server Logs

Jun 25 11:13:40 mb2 clurgmgrd[14776]: Resource Group Manager Starting
Jun 25 11:13:40 mb2 clurgmgrd[14776]: Loading Service Data
Jun 25 11:13:41 mb2 clurgmgrd[14776]: Initializing Services
Jun 25 11:13:41 mb2 clurgmgrd: [14776]: stop: Could not match LABEL=MB1-DBDATA with a real device
Jun 25 11:13:41 mb2 clurgmgrd[14776]: stop on fs:MB1-DBDATA returned 2 (invalid argument(s))
Jun 25 11:13:41 mb2 clurgmgrd: [14776]: stop: Could not match LABEL=MB1-INDEX with a real device
Jun 25 11:13:41 mb2 clurgmgrd[14776]: stop on fs:MB1-INDEX returned 2 (invalid argument(s))
Jun 25 11:13:41 mb2 clurgmgrd: [14776]: stop: Could not match LABEL=MB1-LOG with a real device
Jun 25 11:13:41 mb2 clurgmgrd[14776]: stop on fs:MB1-LOG returned 2 (invalid argument(s))
Jun 25 11:13:41 mb2 clurgmgrd: [14776]: stop: Could not match LABEL=MB1-CONF with a real device
Jun 25 11:13:41 mb2 clurgmgrd[14776]: stop on fs:MB1-CONF returned 2 (invalid argument(s))
Jun 25 11:13:41 mb2 clurgmgrd: [14776]: /dev/sdh1 is not mounted
Jun 25 11:13:41 mb2 clurgmgrd: [14776]: stop: Could not match LABEL=MB1-BACKUP with a real device
Jun 25 11:13:41 mb2 clurgmgrd[14776]: stop on fs:MB1-BACKUP returned 2 (invalid argument(s))
Jun 25 11:13:41 mb2 clurgmgrd: [14776]: stop: Could not match LABEL=MB1-REDOLOG with a real device
Jun 25 11:13:41 mb2 clurgmgrd[14776]: stop on fs:MB1-REDOLOG returned 2 (invalid argument(s))
Jun 25 11:13:41 mb2 clurgmgrd: [14776]: stop: Could not match LABEL=MB1-STORE with a real device
Jun 25 11:13:41 mb2 clurgmgrd[14776]: stop on fs:MB1-STORE returned 2 (invalid argument(s))
Jun 25 11:13:41 mb2 clurgmgrd: [14776]: stop: Could not match LABEL=MB1-ZIMBRA-CLUST with a real device
Jun 25 11:13:41 mb2 clurgmgrd[14776]: stop on fs:MB1-CLUSTER returned 2 (invalid argument(s))
Jun 25 11:13:46 mb2 clurgmgrd: [14776]: /dev/sdf1 is not mounted
Jun 25 11:13:52 mb2 clurgmgrd: [14776]: /dev/sdg1 is not mounted
Jun 25 11:13:57 mb2 clurgmgrd: [14776]: /dev/sde1 is not mounted
Jun 25 11:14:02 mb2 clurgmgrd: [14776]: /dev/sdd1 is not mounted
Jun 25 11:14:07 mb2 clurgmgrd: [14776]: /dev/sdc1 is not mounted
Jun 25 11:14:10 mb2 rgmanager: clurgmgrd startup failed
Jun 25 11:14:12 mb2 clurgmgrd: [14776]: /dev/sdb1 is not mounted
Jun 25 11:14:18 mb2 clurgmgrd: [14776]: /dev/sda1 is not mounted
Jun 25 11:14:23 mb2 clurgmgrd[14776]: Services Initialized
Jun 25 11:14:25 mb2 clurgmgrd[14776]: Logged in SG "usrm::manager"
Jun 25 11:14:25 mb2 clurgmgrd[14776]: Magma Event: Membership Change
Jun 25 11:14:25 mb2 clurgmgrd[14776]: State change: Local UP
Jun 25 11:14:25 mb2 clurgmgrd[14776]: State change: mb1.ku.edu.kw UP
Jun 25 11:14:25 mb2 clurgmgrd[14776]: State change: mbstandby.ku.edu.kw UP

MBSTANDBY LOGS

Jun 25 11:13:26 mbstandby clurgmgrd[15850]: Resource Group Manager Starting
Jun 25 11:13:26 mbstandby clurgmgrd[15850]: Loading Service Data
Jun 25 11:13:27 mbstandby clurgmgrd[15850]: Initializing Services
Jun 25 11:13:27 mbstandby clurgmgrd: [15850]: /dev/sdl1 is not mounted
Jun 25 11:13:27 mbstandby clurgmgrd: [15850]: /dev/sdp1 is not mounted
Jun 25 11:13:32 mbstandby clurgmgrd: [15850]: /dev/sdk1 is not mounted
Jun 25 11:13:32 mbstandby clurgmgrd: [15850]: /dev/sdn1 is not mounted
Jun 25 11:13:38 mbstandby clurgmgrd: [15850]: /dev/sdj1 is not mounted
Jun 25 11:13:38 mbstandby clurgmgrd: [15850]: /dev/sdo1 is not mounted
Jun 25 11:13:43 mbstandby clurgmgrd: [15850]: /dev/sdi1 is not mounted
Jun 25 11:13:43 mbstandby clurgmgrd: [15850]: /dev/sdm1 is not mounted
Jun 25 11:13:47 mbstandby sshd(pam_unix)[17583]: session opened for user root by (uid=0)
Jun 25 11:13:48 mbstandby clurgmgrd: [15850]: /dev/sdd1 is not mounted
Jun 25 11:13:48 mbstandby clurgmgrd: [15850]: /dev/sdh1 is not mounted
Jun 25 11:13:53 mbstandby clurgmgrd: [15850]: /dev/sdg1 is not mounted
Jun 25 11:13:53 mbstandby clurgmgrd: [15850]: /dev/sdc1 is not mounted
Jun 25 11:13:56 mbstandby rgmanager: clurgmgrd startup failed
Jun 25 11:13:56 mbstandby su(pam_unix)[18378]: session opened for user zimbra by (uid=0)
Jun 25 11:13:56 mbstandby zimbra: -bash: /opt/zimbra/log/startup.log: No such file or directory
Jun 25 11:13:56 mbstandby su(pam_unix)[18378]: session closed for user zimbra
Jun 25 11:13:56 mbstandby rc: Starting zimbra: failed
Jun 25 11:13:58 mbstandby clurgmgrd: [15850]: /dev/sdf1 is not mounted
Jun 25 11:13:58 mbstandby clurgmgrd: [15850]: /dev/sdb1 is not mounted
Jun 25 11:14:04 mbstandby clurgmgrd: [15850]: /dev/sde1 is not mounted
Jun 25 11:14:04 mbstandby clurgmgrd: [15850]: /dev/sda1 is not mounted
Jun 25 11:14:09 mbstandby clurgmgrd[15850]: Services Initialized
Jun 25 11:14:09 mbstandby clurgmgrd[15850]: Logged in SG "usrm::manager"
Jun 25 11:14:09 mbstandby clurgmgrd[15850]: Magma Event: Membership Change
Jun 25 11:14:09 mbstandby clurgmgrd[15850]: State change: Local UP
Jun 25 11:14:12 mbstandby clurgmgrd[15850]: Magma Event: Membership Change
Jun 25 11:14:12 mbstandby clurgmgrd[15850]: State change: mb1.ku.edu.kw UP
Jun 25 11:14:13 mbstandby clurgmgrd[15850]: Resource groups locked; not evaluating
Jun 25 11:14:14 mbstandby clurgmgrd[15850]: Magma Event: Membership Change
Jun 25 11:14:14 mbstandby clurgmgrd[15850]: State change: mb2.ku.edu.kw UP
Jun 25 11:49:22 mbstandby sshd(pam_unix)[9438]: session opened for user root by (uid=0)

I am using e2label labels to mount the filesystems on both the primary
and the failover server. My cluster.conf is also attached.

Right now fencing is not set up properly: I am just using manual
fencing and was testing with HP iLO fencing.

My 1st question is: why does it show "Magma Event: Membership Change"?

Since I initially defined 3 members in the cluster, it should not
give me this. Is it because some package is missing, or do I have to
run up2date?

I have installed the following packages:

ccs-1.0.11-1.x86_64.rpm
ccs-devel-1.0.11-1.x86_64.rpm
cman-1.0.17-0.x86_64.rpm
cman-devel-1.0.17-0.x86_64.rpm
cman-kernel-2.6.9-53.5.x86_64.rpm
cman-kernel-smp-2.6.9-53.5.x86_64.rpm
cman-kernheaders-2.6.9-53.5.x86_64.rpm
dlm-1.0.7-1.x86_64.rpm
dlm-devel-1.0.7-1.x86_64.rpm
dlm-kernel-2.6.9-52.2.x86_64.rpm
dlm-kernel-smp-2.6.9-52.2.x86_64.rpm
fence-1.32.50-2.x86_64.rpm
gulm-1.0.10-0.x86_64.rpm
gulm-devel-1.0.10-0.x86_64.rpm
iddev-2.0.0-4.x86_64.rpm
iddev-devel-2.0.0-4.x86_64.rpm
luci-0.11.0-3.x86_64.rpm
magma-1.0.8-1.x86_64.rpm
magma-plugins-1.0.12-0.x86_64.rpm
perl-Net-Telnet-3.03-3.noarch.rpm
rgmanager-1.9.72-1.x86_64.rpm
system-config-cluster-1.0.51-2.0.noarch.rpm

Am I missing any other important package for the cluster? I
installed the packages using rpm -ivh *.rpm.
I also stopped the lock_gulmd service, as I am using the lock_dlm lock manager.

Later I tried using just an IP in the service section, without mount
points or the application service. Even then it doesn't start up on
reboot. Here are the logs:

Jun 27 19:44:37 mb1 clurgmgrd[12737]: <notice> Resource Group Manager Starting
Jun 27 19:44:37 mb1 clurgmgrd[12737]: <info> Loading Service Data
Jun 27 19:44:37 mb1 fstab-sync[12738]: removed all generated mount points
Jun 27 19:44:38 mb1 clurgmgrd[12737]: <info> Initializing Services
Jun 27 19:44:38 mb1 clurgmgrd[12737]: <info> Services Initialized
Jun 27 19:44:38 mb1 clurgmgrd[12737]: <info> Logged in SG "usrm::manager"
Jun 27 19:44:38 mb1 clurgmgrd[12737]: <info> Magma Event: Membership Change
Jun 27 19:44:38 mb1 clurgmgrd[12737]: <info> State change: Local UP
Jun 27 19:44:38 mb1 rgmanager: clurgmgrd startup succeeded
Jun 27 19:44:41 mb1 clurgmgrd[12737]: <info> Magma Event: Membership Change
Jun 27 19:44:41 mb1 clurgmgrd[12737]: <info> State change: mbstandby.ku.edu.kw UP
Jun 27 19:44:43 mb1 clurgmgrd[12737]: <info> Magma Event: Membership Change
Jun 27 19:44:43 mb1 clurgmgrd[12737]: <info> State change: mb2.ku.edu.kw UP

The cluster.conf for this setup is also attached.

Please advise what the issue could be. Thanks in advance.

Regards,
-Steven
<cluster-with-IP.txt> <cluster-with-service.txt>