
Re: [Linux-cluster] Nodes are getting Down while relocating service



Hello Jose

If you look at the cluster.conf, you can see he isn't using drbd.

Like I said before:
===================================================
[network_problem]
===================================================
Jan 28 15:50:05 ssdgblade2 openais[10324]: [TOTEM] FAILED TO RECEIVE 
Jan 28 15:50:05 ssdgblade2 openais[10324]: [TOTEM] entering GATHER state from 6. 
Jan 28 15:50:05 ssdgblade2 openais[10324]: [TOTEM] FAILED TO RECEIVE 
Jan 28 15:50:05 ssdgblade2 openais[10324]: [TOTEM] entering GATHER state from 6. 
Jan 28 15:50:06 ssdgblade2 openais[10324]: [TOTEM] FAILED TO RECEIVE 
Jan 28 15:50:06 ssdgblade2 openais[10324]: [TOTEM] entering GATHER state from 6. 
Jan 28 15:50:06 ssdgblade2 openais[10324]: [TOTEM] FAILED TO RECEIVE 
Jan 28 15:50:06 ssdgblade2 openais[10324]: [TOTEM] entering GATHER state from 6. 
Jan 28 15:50:07 ssdgblade2 openais[10324]: [TOTEM] FAILED TO RECEIVE 
Jan 28 15:50:07 ssdgblade2 openais[10324]: [TOTEM] entering GATHER state from 6. 
Jan 28 15:50:07 ssdgblade2 openais[10324]: [TOTEM] FAILED TO RECEIVE 
Jan 28 15:50:07 ssdgblade2 openais[10324]: [TOTEM] entering GATHER state from 6. 
Jan 28 15:50:08 ssdgblade2 openais[10324]: [TOTEM] FAILED TO RECEIVE 
Jan 28 15:50:08 ssdgblade2 openais[10324]: [TOTEM] entering GATHER state from 6. 
Jan 28 15:50:08 ssdgblade2 openais[10324]: [TOTEM] FAILED TO RECEIVE 
Jan 28 15:50:08 ssdgblade2 openais[10324]: [TOTEM] entering GATHER state from 6. 
Jan 28 15:50:09 ssdgblade2 openais[10324]: [TOTEM] FAILED TO RECEIVE 
Jan 28 15:50:09 ssdgblade2 openais[10324]: [TOTEM] entering GATHER state from 6. 
Jan 28 15:50:09 ssdgblade2 openais[10324]: [TOTEM] FAILED TO RECEIVE 
Jan 28 15:50:09 ssdgblade2 openais[10324]: [TOTEM] entering GATHER state from 6. 
Jan 28 15:50:10 ssdgblade2 openais[10324]: [TOTEM] FAILED TO RECEIVE 
Jan 28 15:50:10 ssdgblade2 openais[10324]: [TOTEM] entering GATHER state from 6. 
Jan 28 15:50:10 ssdgblade2 openais[10324]: [TOTEM] FAILED TO RECEIVE 
Jan 28 15:50:10 ssdgblade2 openais[10324]: [TOTEM] entering GATHER state from 6. 
Jan 28 15:50:11 ssdgblade2 openais[10324]: [TOTEM] FAILED TO RECEIVE 
Jan 28 15:50:11 ssdgblade2 openais[10324]: [TOTEM] entering GATHER state from 6. 
Jan 28 15:50:11 ssdgblade2 openais[10324]: [TOTEM] FAILED TO RECEIVE 
Jan 28 15:50:11 ssdgblade2 openais[10324]: [TOTEM] entering GATHER state from 6.
==================================================================

The first thing that might help is to stop iptables.
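
To judge whether stopping iptables actually changes anything, it may help to count the "FAILED TO RECEIVE" lines in /var/log/messages before and after. Below is a minimal Python sketch, assuming the syslog format shown in the log excerpt above. The function name count_totem_failures is hypothetical and not part of any cluster tool.

```python
import re
from collections import Counter

# Matches syslog lines such as:
#   Jan 28 15:50:05 ssdgblade2 openais[10324]: [TOTEM] FAILED TO RECEIVE
# Assumption: the log uses the classic syslog layout seen in this thread.
FAILED_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+"
    r"openais\[\d+\]:\s+\[TOTEM\]\s+FAILED TO RECEIVE"
)


def count_totem_failures(lines):
    """Count TOTEM 'FAILED TO RECEIVE' lines per (host, timestamp)."""
    counts = Counter()
    for line in lines:
        match = FAILED_RE.match(line)
        if match:
            counts[(match.group("host"), match.group("ts"))] += 1
    return counts


# Example use (the path is an assumption; adjust it to your system):
#   with open("/var/log/messages") as fh:
#       for (host, ts), n in sorted(count_totem_failures(fh).items()):
#           print(host, ts, n)
```

Run it over the same kind of time window before and after stopping iptables. If the failures disappear, the firewall was probably blocking the openais traffic between the nodes.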

2012/1/31 jose nuno neto <jose neto liber4e com>
Hello

I took a quick look at the messages and see no fence reference. There's a
break in token messages, a recovery, a cluster.conf change, and then
communication is lost again...
It could be the service shutdown, after the cluster.conf update, forcing the
shutdown.

Do you have drbd running too?

Cheers
Jose Neto

> Hi,
>
> We are facing an issue while configuring a cluster on CentOS 5.5.
>
>
> Here is the scenario where we got stuck.
>
> Issue:
>
> All nodes in the cluster turn off if cluster services are restarted,
> disabled or enabled.
>
> Three services should work as a clustered service,
>
> 1.     Postgresql.
> 2.     GFS (1TB SAN space which is mounted on /var/lib/pgsql)
> 3.     Virtual IP (common IP): 10.242.108.42
>
> Even when we tried adding only the Virtual IP as a cluster service, the
> following command:
>
> #clusvcadm -r DBService -m ssdgblade2.db2   (from ssdgblade1.db1)
>
> could not relocate the service, and both nodes got turned off.
>
> Environment
>
> CentOS 5.5
> Postgresql 8.3.3
> Kernel version-2.6.18-194
> CentOS Cluster Suite.
>
> Hardware:
>
> 1.    Chassis: IBM BladeCenter E.
> 2.    IBM HS22 blades (8 units); clustering is done on blade1 and blade2.
> 3.    Blade Management Module IP is 10.242.108.58
> 4.    Fence device: IBM BladeCenter (login successful via telnet and
> web browser to the management module).
> 5.    Cisco Catalyst 2960G Switch.
>
> IP:
>
> 10.242.108.41 (ssdgblade1.db1)
> 10.242.108.43 (ssdgblade2.db2)
>
> Virtual IP 10.242.108.42
> Multicast IP 239.192.247.38
>
>
> Diagnostic Steps followed:
>
> 1.     Removed postgresql and GFS from the cluster service and rebooted
> both servers with only the VIP service. The problem still exists: we
> cannot relocate the service.
> 2.    Tested fencing with:
>
> #fence_node ssdgblade2.db2   (from db1)
> #fence_node ssdgblade1.db1   (from db2)
>
> Each node can fence the other.  But during boot-up, the node fences the other node.
>
> Please find the attachment for your reference.
> --
>
>
> Thanks & Regards,
>
> *Arun K P*
>
> System Administrator
>
> *HCL Infosystems Ltd*.
>
> *Kolkata*
>
> Mob: +91- 9903361422
>
> *www.hclinfosystems.in* <http://www.hclinfosystems.in/>
>
> *Technology that touches lives* *TM*
>
> --
> This message has been scanned for viruses and
> dangerous content by MailScanner, and is
> believed to be clean.



--
This is my life and I live it for as long as God wills.
