[Linux-cluster] Problem in cluster with xen kernel

carlopmart carlopmart at gmail.com
Tue Apr 3 10:49:31 UTC 2007


Nuno Fernandes wrote:
> Hi,
> 
> Just for your information, we've solved it. The problem was in the Xen bridge
> scripts, which restarted the network interfaces while the cluster was active.
> 
> Changing /etc/xen/xend-config.sxp line
> 
> (network-script network-bridge)
> 
> to
> 
> (network-script /bin/true)
> 
> and creating the bridge in /etc/sysconfig/network-scripts/ifcfg-* files
> solved it.
> 
> Thanks
> Nuno Fernandes
> 
> On Tuesday 03 April 2007 10:12:20 Nuno Fernandes wrote:
>> Hi,
>>
>> I'm using the RHEL5 default kernel and everything seems ok.
>>
>> [root@xen1 ~]# clustat
>> Member Status: Quorate
>>
>>   Member Name                        ID   Status
>>   ------ ----                        ---- ------
>>   xen1.dc.server.pt                      1 Online, Local
>>   xen2.dc.server.pt                      2 Online
>>   xen3.dc.server.pt                      3 Online
>>
>> Later on, I reboot xen3 into a Dom0 kernel and get this in the xen1 logs:
>>
>> Dec 19 23:02:47 xen1 openais[2747]: [TOTEM] The token was lost in the
>> OPERATIONAL state.
>> Dec 19 23:02:47 xen1 openais[2747]: [TOTEM] Receive multicast socket recv
>> buffer size (262142 bytes).
>> Dec 19 23:02:47 xen1 openais[2747]: [TOTEM] Transmit multicast socket send
>> buffer size (262142 bytes).
>> Dec 19 23:02:47 xen1 openais[2747]: [TOTEM] entering GATHER state from 2.
>>
>> Dec 19 23:02:52 xen1 openais[2747]: [TOTEM] entering GATHER state from 0.
>> Dec 19 23:02:52 xen1 openais[2747]: [TOTEM] Creating commit token because I
>> am the rep.
>> Dec 19 23:02:52 xen1 openais[2747]: [TOTEM] Saving state aru 2f high seq
>> received 2f
>> Dec 19 23:02:52 xen1 openais[2747]: [TOTEM] entering COMMIT state.
>> Dec 19 23:02:52 xen1 openais[2747]: [TOTEM] entering RECOVERY state.
>> Dec 19 23:02:52 xen1 openais[2747]: [TOTEM] position [0] member
>> 172.16.40.107:
>> Dec 19 23:02:52 xen1 openais[2747]: [TOTEM] previous ring seq 84 rep
>> 172.16.40.107
>> Dec 19 23:02:52 xen1 openais[2747]: [TOTEM] aru 2f high delivered 2f
>> received flag 0
>> Dec 19 23:02:52 xen1 openais[2747]: [TOTEM] position [1] member
>> 172.16.40.108:
>> Dec 19 23:02:52 xen1 openais[2747]: [TOTEM] previous ring seq 84 rep
>> 172.16.40.107
>> Dec 19 23:02:52 xen1 openais[2747]: [TOTEM] aru 2f high delivered 2f
>> received flag 0
>> Dec 19 23:02:52 xen1 openais[2747]: [TOTEM] Did not need to originate any
>> messages in recovery.
>> Dec 19 23:02:52 xen1 openais[2747]: [TOTEM] Storing new sequence id for
>> ring 58
>> Dec 19 23:02:52 xen1 kernel: dlm: closing connection to node 3
>> Dec 19 23:02:52 xen1 openais[2747]: [TOTEM] Sending initial ORF token
>> Dec 19 23:02:52 xen1 openais[2747]: [CLM  ] CLM CONFIGURATION CHANGE
>> Dec 19 23:02:52 xen1 openais[2747]: [CLM  ] New Configuration:
>> Dec 19 23:02:52 xen1 openais[2747]: [CLM  ]     r(0) ip(172.16.40.107)
>> Dec 19 23:02:52 xen1 openais[2747]: [CLM  ]     r(0) ip(172.16.40.108)
>> Dec 19 23:02:52 xen1 openais[2747]: [CLM  ] Members Left:
>> Dec 19 23:02:52 xen1 openais[2747]: [CLM  ]     r(0) ip(172.16.40.116)
>> Dec 19 23:02:52 xen1 openais[2747]: [CLM  ] Members Joined:
>> Dec 19 23:02:52 xen1 openais[2747]: [SYNC ] This node is within the primary
>> component and will provide service.
>> Dec 19 23:02:52 xen1 openais[2747]: [CLM  ] CLM CONFIGURATION CHANGE
>> Dec 19 23:02:52 xen1 openais[2747]: [CLM  ] New Configuration:
>> Dec 19 23:02:52 xen1 openais[2747]: [CLM  ]     r(0) ip(172.16.40.107)
>> Dec 19 23:02:52 xen1 openais[2747]: [CLM  ]     r(0) ip(172.16.40.108)
>> Dec 19 23:02:52 xen1 openais[2747]: [CLM  ] Members Left:
>> Dec 19 23:02:52 xen1 openais[2747]: [CLM  ] Members Joined:
>> Dec 19 23:02:52 xen1 openais[2747]: [SYNC ] This node is within the primary
>> component and will provide service.
>> Dec 19 23:02:52 xen1 openais[2747]: [TOTEM] entering OPERATIONAL state.
>> Dec 19 23:02:52 xen1 openais[2747]: [CLM  ] got nodejoin message
>> 172.16.40.107
>> Dec 19 23:02:53 xen1 openais[2747]: [CLM  ] got nodejoin message
>> 172.16.40.108
>> Dec 19 23:02:53 xen1 openais[2747]: [CPG  ] got joinlist message from
>> node 2
>> Dec 19 23:02:53 xen1 openais[2747]: [CPG  ] got joinlist message from
>> node 1
>>
>> So far so good, xen3 is offline while it reboots...
>>
>> [root@xen1 ~]# clustat
>> Member Status: Quorate
>>
>>   Member Name                        ID   Status
>>   ------ ----                        ---- ------
>>   xen1.dc.server.pt                      1 Online, Local
>>   xen2.dc.server.pt                      2 Online
>>   xen3.dc.server.pt                      3 Offline
>>
>> After it reboots I get a node join in the xen1 server logs:
>>
>> Dec 19 23:05:03 xen1 openais[2747]: [TOTEM] entering GATHER state from 11.
>> Dec 19 23:05:03 xen1 openais[2747]: [TOTEM] Creating commit token because I
>> am the rep.
>> Dec 19 23:05:03 xen1 openais[2747]: [TOTEM] Saving state aru 17 high seq
>> received 17
>> Dec 19 23:05:03 xen1 openais[2747]: [TOTEM] entering COMMIT state.
>> Dec 19 23:05:03 xen1 openais[2747]: [TOTEM] entering RECOVERY state.
>> Dec 19 23:05:03 xen1 openais[2747]: [TOTEM] position [0] member
>> 172.16.40.107:
>> Dec 19 23:05:03 xen1 openais[2747]: [TOTEM] previous ring seq 88 rep
>> 172.16.40.107
>> Dec 19 23:05:03 xen1 openais[2747]: [TOTEM] aru 17 high delivered 17
>> received flag 0
>> Dec 19 23:05:03 xen1 openais[2747]: [TOTEM] position [1] member
>> 172.16.40.108:
>> Dec 19 23:05:03 xen1 openais[2747]: [TOTEM] previous ring seq 88 rep
>> 172.16.40.107
>> Dec 19 23:05:03 xen1 openais[2747]: [TOTEM] aru 17 high delivered 17
>> received flag 0
>> Dec 19 23:05:03 xen1 openais[2747]: [TOTEM] position [2] member
>> 172.16.40.116:
>> Dec 19 23:05:03 xen1 openais[2747]: [TOTEM] previous ring seq 4 rep
>> 172.16.40.116
>> Dec 19 23:05:03 xen1 openais[2747]: [TOTEM] aru 9 high delivered 9 received
>> flag 0
>> Dec 19 23:05:03 xen1 openais[2747]: [TOTEM] Did not need to originate any
>> messages in recovery.
>> Dec 19 23:05:03 xen1 openais[2747]: [TOTEM] Storing new sequence id for
>> ring 5c
>> Dec 19 23:05:03 xen1 openais[2747]: [TOTEM] Sending initial ORF token
>> Dec 19 23:05:03 xen1 openais[2747]: [CLM  ] CLM CONFIGURATION CHANGE
>> Dec 19 23:05:03 xen1 openais[2747]: [CLM  ] New Configuration:
>> Dec 19 23:05:03 xen1 openais[2747]: [CLM  ]     r(0) ip(172.16.40.107)
>> Dec 19 23:05:03 xen1 openais[2747]: [CLM  ]     r(0) ip(172.16.40.108)
>> Dec 19 23:05:04 xen1 openais[2747]: [CLM  ] Members Left:
>> Dec 19 23:05:04 xen1 openais[2747]: [CLM  ] Members Joined:
>> Dec 19 23:05:04 xen1 openais[2747]: [SYNC ] This node is within the primary
>> component and will provide service.
>> Dec 19 23:05:04 xen1 openais[2747]: [CLM  ] CLM CONFIGURATION CHANGE
>> Dec 19 23:05:04 xen1 openais[2747]: [CLM  ] New Configuration:
>> Dec 19 23:05:04 xen1 openais[2747]: [CLM  ]     r(0) ip(172.16.40.107)
>> Dec 19 23:05:04 xen1 openais[2747]: [CLM  ]     r(0) ip(172.16.40.108)
>> Dec 19 23:05:04 xen1 openais[2747]: [CLM  ]     r(0) ip(172.16.40.116)
>> Dec 19 23:05:04 xen1 openais[2747]: [CLM  ] Members Left:
>> Dec 19 23:05:04 xen1 openais[2747]: [CLM  ] Members Joined:
>> Dec 19 23:05:04 xen1 openais[2747]: [CLM  ]     r(0) ip(172.16.40.116)
>> Dec 19 23:05:04 xen1 openais[2747]: [SYNC ] This node is within the primary
>> component and will provide service.
>> Dec 19 23:05:04 xen1 openais[2747]: [TOTEM] entering OPERATIONAL state.
>> Dec 19 23:05:04 xen1 openais[2747]: [CLM  ] got nodejoin message
>> 172.16.40.107
>> Dec 19 23:05:04 xen1 openais[2747]: [CLM  ] got nodejoin message
>> 172.16.40.108
>> Dec 19 23:05:04 xen1 openais[2747]: [CLM  ] got nodejoin message
>> 172.16.40.116
>> Dec 19 23:05:04 xen1 openais[2747]: [CPG  ] got joinlist message from
>> node 1
>> Dec 19 23:05:04 xen1 openais[2747]: [CPG  ] got joinlist message from
>> node 2
>> Dec 19 23:05:12 xen1 kernel: dlm: connecting to 3
>> Dec 19 23:05:12 xen1 kernel: dlm: got connection from 3
>>
>> Clustat also reports ok status:
>>
>> [root@xen1 ~]# clustat
>> Member Status: Quorate
>>
>>   Member Name                        ID   Status
>>   ------ ----                        ---- ------
>>   xen1.dc.server.pt                      1 Online, Local
>>   xen2.dc.server.pt                      2 Online
>>   xen3.dc.server.pt                      3 Online
>>
>> Everything ok so far...
>>
>> Next I reboot xen2. When xen2 leaves, xen1 complains that it can't speak
>> with xen3 and fences it.
>>
>> Dec 19 23:08:48 xen1 openais[2747]: [TOTEM] Retransmit List: 32
>> Dec 19 23:08:48 xen1 openais[2747]: [TOTEM] Retransmit List: 32
>> Dec 19 23:08:48 xen1 openais[2747]: [TOTEM] Retransmit List: 32 33 34
>> Dec 19 23:08:55 xen1 last message repeated 47 times
>> Dec 19 23:08:55 xen1 openais[2747]: [TOTEM] FAILED TO RECEIVE
>> Dec 19 23:08:55 xen1 openais[2747]: [TOTEM] entering GATHER state from 6.
>> Dec 19 23:08:55 xen1 openais[2747]: [TOTEM] Retransmit List: 32 33 34
>> Dec 19 23:08:55 xen1 openais[2747]: [TOTEM] FAILED TO RECEIVE
>> Dec 19 23:08:55 xen1 openais[2747]: [TOTEM] entering GATHER state from 6.
>> Dec 19 23:08:56 xen1 openais[2747]: [TOTEM] Retransmit List: 32 33 34
>> Dec 19 23:08:56 xen1 openais[2747]: [TOTEM] FAILED TO RECEIVE
>> Dec 19 23:08:56 xen1 openais[2747]: [TOTEM] entering GATHER state from 6.
>> Dec 19 23:08:56 xen1 openais[2747]: [TOTEM] Retransmit List: 32 33 34
>> Dec 19 23:08:56 xen1 openais[2747]: [TOTEM] FAILED TO RECEIVE
>> Dec 19 23:08:56 xen1 openais[2747]: [TOTEM] entering GATHER state from 6.
>> Dec 19 23:08:57 xen1 openais[2747]: [TOTEM] Retransmit List: 32 33 34
>> Dec 19 23:08:57 xen1 openais[2747]: [TOTEM] FAILED TO RECEIVE
>> Dec 19 23:08:57 xen1 openais[2747]: [TOTEM] entering GATHER state from 6.
>> Dec 19 23:08:57 xen1 openais[2747]: [TOTEM] Retransmit List: 32 33 34
>> Dec 19 23:08:57 xen1 openais[2747]: [TOTEM] FAILED TO RECEIVE
>> Dec 19 23:08:57 xen1 openais[2747]: [TOTEM] entering GATHER state from 6.
>> Dec 19 23:08:58 xen1 openais[2747]: [TOTEM] Retransmit List: 32 33 34
>> Dec 19 23:08:58 xen1 openais[2747]: [TOTEM] FAILED TO RECEIVE
>> Dec 19 23:08:58 xen1 openais[2747]: [TOTEM] entering GATHER state from 6.
>> Dec 19 23:08:58 xen1 openais[2747]: [TOTEM] Retransmit List: 32 33 34
>> Dec 19 23:08:58 xen1 openais[2747]: [TOTEM] FAILED TO RECEIVE
>> Dec 19 23:08:58 xen1 openais[2747]: [TOTEM] entering GATHER state from 6.
>> Dec 19 23:08:59 xen1 openais[2747]: [TOTEM] Retransmit List: 32 33 34
>> Dec 19 23:08:59 xen1 openais[2747]: [TOTEM] FAILED TO RECEIVE
>> Dec 19 23:08:59 xen1 openais[2747]: [TOTEM] entering GATHER state from 6.
>> Dec 19 23:08:59 xen1 openais[2747]: [TOTEM] Retransmit List: 32 33 34
>> Dec 19 23:08:59 xen1 openais[2747]: [TOTEM] FAILED TO RECEIVE
>> Dec 19 23:08:59 xen1 openais[2747]: [TOTEM] entering GATHER state from 6.
>> Dec 19 23:08:59 xen1 openais[2747]: [TOTEM] entering GATHER state from 11.
>> Dec 19 23:08:59 xen1 openais[2747]: [TOTEM] Creating commit token because I
>> am the rep.
>> Dec 19 23:08:59 xen1 openais[2747]: [TOTEM] Saving state aru 34 high seq
>> received 34
>> Dec 19 23:08:59 xen1 openais[2747]: [TOTEM] entering COMMIT state.
>> Dec 19 23:08:59 xen1 openais[2747]: [TOTEM] entering RECOVERY state.
>> Dec 19 23:08:59 xen1 openais[2747]: [TOTEM] position [0] member
>> 172.16.40.107:
>> Dec 19 23:08:59 xen1 openais[2747]: [TOTEM] previous ring seq 92 rep
>> 172.16.40.107
>> Dec 19 23:08:59 xen1 openais[2747]: [TOTEM] aru 34 high delivered 34
>> received flag 0
>> Dec 19 23:08:59 xen1 openais[2747]: [TOTEM] position [1] member
>> 172.16.40.108:
>> Dec 19 23:08:59 xen1 openais[2747]: [TOTEM] previous ring seq 92 rep
>> 172.16.40.107
>> Dec 19 23:08:59 xen1 openais[2747]: [TOTEM] aru 34 high delivered 34
>> received flag 0
>> Dec 19 23:08:59 xen1 openais[2747]: [TOTEM] Did not need to originate any
>> messages in recovery.
>> Dec 19 23:08:59 xen1 openais[2747]: [TOTEM] Storing new sequence id for
>> ring 60
>> Dec 19 23:08:59 xen1 openais[2747]: [TOTEM] Sending initial ORF token
>> Dec 19 23:08:59 xen1 kernel: dlm: closing connection to node 3
>>
>>
>> Dec 19 23:08:59 xen1 fenced[2763]: xen3.dc.aeiou.pt not a cluster member
>> after 0 sec post_fail_delay
>>
>>
>>
>> Dec 19 23:09:00 xen1 openais[2747]: [CLM  ] CLM CONFIGURATION CHANGE
>> Dec 19 23:09:00 xen1 fenced[2763]: xen2.dc.aeiou.pt not a cluster member
>> after 0 sec post_fail_delay
>> Dec 19 23:09:00 xen1 openais[2747]: [CLM  ] New Configuration:
>> Dec 19 23:09:00 xen1 fenced[2763]: fencing node "xen3.dc.aeiou.pt"
>> Dec 19 23:09:00 xen1 openais[2747]: [CLM  ]     r(0) ip(172.16.40.107)
>> Dec 19 23:09:00 xen1 openais[2747]: [CLM  ]     r(0) ip(172.16.40.108)
>> Dec 19 23:09:00 xen1 openais[2747]: [CLM  ] Members Left:
>> Dec 19 23:09:00 xen1 openais[2747]: [CLM  ]     r(0) ip(172.16.40.116)
>> Dec 19 23:09:00 xen1 openais[2747]: [CLM  ] Members Joined:
>> Dec 19 23:09:00 xen1 openais[2747]: [SYNC ] This node is within the primary
>> component and will provide service.
>> Dec 19 23:09:00 xen1 openais[2747]: [CLM  ] CLM CONFIGURATION CHANGE
>> Dec 19 23:09:00 xen1 openais[2747]: [CLM  ] New Configuration:
>> Dec 19 23:09:00 xen1 openais[2747]: [CLM  ]     r(0) ip(172.16.40.107)
>> Dec 19 23:09:00 xen1 openais[2747]: [CLM  ]     r(0) ip(172.16.40.108)
>> Dec 19 23:09:00 xen1 openais[2747]: [CLM  ] Members Left:
>> Dec 19 23:09:00 xen1 openais[2747]: [CLM  ] Members Joined:
>> Dec 19 23:09:00 xen1 openais[2747]: [SYNC ] This node is within the primary
>> component and will provide service.
>> Dec 19 23:09:00 xen1 openais[2747]: [TOTEM] entering OPERATIONAL state.
>> Dec 19 23:09:00 xen1 openais[2747]: [CLM  ] got nodejoin message
>> 172.16.40.107
>> Dec 19 23:09:00 xen1 openais[2747]: [CLM  ] got nodejoin message
>> 172.16.40.108
>> Dec 19 23:09:00 xen1 openais[2747]: [CPG  ] got joinlist message from
>> node 2
>> Dec 19 23:09:00 xen1 openais[2747]: [CPG  ] got joinlist message from
>> node 1
>> Dec 19 23:09:05 xen1 openais[2747]: [TOTEM] entering GATHER state from 11.
>> Dec 19 23:09:09 xen1 openais[2747]: [TOTEM] entering GATHER state from 0.
>> Dec 19 23:09:09 xen1 openais[2747]: [TOTEM] Creating commit token because I
>> am the rep.
>> Dec 19 23:09:09 xen1 openais[2747]: [TOTEM] Saving state aru 1a high seq
>> received 1a
>> Dec 19 23:09:09 xen1 openais[2747]: [TOTEM] entering COMMIT state.
>> Dec 19 23:09:09 xen1 openais[2747]: [TOTEM] entering RECOVERY state.
>> Dec 19 23:09:09 xen1 openais[2747]: [TOTEM] position [0] member
>> 172.16.40.107:
>> Dec 19 23:09:09 xen1 openais[2747]: [TOTEM] previous ring seq 96 rep
>> 172.16.40.107
>> Dec 19 23:09:09 xen1 openais[2747]: [TOTEM] aru 1a high delivered 1a
>> received flag 0
>> Dec 19 23:09:09 xen1 openais[2747]: [TOTEM] position [1] member
>> 172.16.40.116:
>> Dec 19 23:09:09 xen1 openais[2747]: [TOTEM] previous ring seq 92 rep
>> 172.16.40.107
>> Dec 19 23:09:09 xen1 openais[2747]: [TOTEM] aru 31 high delivered 31
>> received flag 0
>> Dec 19 23:09:09 xen1 openais[2747]: [TOTEM] Did not need to originate any
>> messages in recovery.
>> Dec 19 23:09:09 xen1 openais[2747]: [TOTEM] Storing new sequence id for
>> ring 64
>> Dec 19 23:09:09 xen1 kernel: dlm: closing connection to node 2
>> Dec 19 23:09:09 xen1 openais[2747]: [TOTEM] Sending initial ORF token
>> Dec 19 23:09:09 xen1 openais[2747]: [CLM  ] CLM CONFIGURATION CHANGE
>> Dec 19 23:09:10 xen1 openais[2747]: [CLM  ] New Configuration:
>> Dec 19 23:09:10 xen1 openais[2747]: [CLM  ]     r(0) ip(172.16.40.107)
>> Dec 19 23:09:10 xen1 openais[2747]: [CLM  ] Members Left:
>> Dec 19 23:09:10 xen1 openais[2747]: [CLM  ]     r(0) ip(172.16.40.108)
>> Dec 19 23:09:10 xen1 openais[2747]: [CLM  ] Members Joined:
>> Dec 19 23:09:10 xen1 openais[2747]: [CMAN ] quorum lost, blocking activity
>> Dec 19 23:09:10 xen1 openais[2747]: [SYNC ] This node is within the primary
>> component and will provide service.
>> Dec 19 23:09:10 xen1 openais[2747]: [CLM  ] CLM CONFIGURATION CHANGE
>> Dec 19 23:09:10 xen1 openais[2747]: [CLM  ] New Configuration:
>> Dec 19 23:09:10 xen1 openais[2747]: [CLM  ]     r(0) ip(172.16.40.107)
>> Dec 19 23:09:10 xen1 openais[2747]: [CLM  ]     r(0) ip(172.16.40.116)
>> Dec 19 23:09:10 xen1 openais[2747]: [CLM  ] Members Left:
>> Dec 19 23:09:10 xen1 openais[2747]: [CLM  ] Members Joined:
>> Dec 19 23:09:10 xen1 openais[2747]: [CLM  ]     r(0) ip(172.16.40.116)
>> Dec 19 23:09:10 xen1 openais[2747]: [SYNC ] This node is within the primary
>> component and will provide service.
>> Dec 19 23:09:10 xen1 openais[2747]: [TOTEM] entering OPERATIONAL state.
>> Dec 19 23:09:10 xen1 openais[2747]: [MAIN ] Node xen3.dc.aeiou.pt not
>> joined to cman because it has rejoined an inquorate cluster
>> Dec 19 23:09:10 xen1 openais[2747]: [CLM  ] got nodejoin message
>> 172.16.40.107
>> Dec 19 23:09:10 xen1 openais[2747]: [CLM  ] got nodejoin message
>> 172.16.40.116
>> Dec 19 23:09:10 xen1 openais[2747]: [CPG  ] got joinlist message from
>> node 3
>> Dec 19 23:09:10 xen1 openais[2747]: [CPG  ] got joinlist message from
>> node 1
>> Dec 19 23:09:14 xen1 ccsd[2740]: Cluster is not quorate.  Refusing
>> connection.
>> Dec 19 23:09:14 xen1 ccsd[2740]: Error while processing connect: Connection
>> refused
>> Dec 19 23:09:19 xen1 ccsd[2740]: Cluster is not quorate.  Refusing
>> connection.
>> Dec 19 23:09:19 xen1 ccsd[2740]: Error while processing connect: Connection
>> refused
>> Dec 19 23:09:24 xen1 ccsd[2740]: Cluster is not quorate.  Refusing
>> connection.
>> Dec 19 23:09:24 xen1 ccsd[2740]: Error while processing connect: Connection
>> refused
>> Dec 19 23:09:29 xen1 ccsd[2740]: Cluster is not quorate.  Refusing
>> connection.
>> Dec 19 23:09:29 xen1 ccsd[2740]: Error while processing connect: Connection
>> refused
>> Dec 19 23:09:34 xen1 ccsd[2740]: Cluster is not quorate.  Refusing
>> connection.
>> Dec 19 23:09:34 xen1 ccsd[2740]: Error while processing connect: Connection
>> refused
>> Dec 19 23:09:36 xen1 openais[2747]: [TOTEM] The token was lost in the
>> OPERATIONAL state.
>> Dec 19 23:09:36 xen1 openais[2747]: [TOTEM] Receive multicast socket recv
>> buffer size (262142 bytes).
>> Dec 19 23:09:36 xen1 openais[2747]: [TOTEM] Transmit multicast socket send
>> buffer size (262142 bytes).
>> Dec 19 23:09:36 xen1 openais[2747]: [TOTEM] entering GATHER state from 2.
>> Dec 19 23:09:39 xen1 ccsd[2740]: Cluster is not quorate.  Refusing
>> connection.
>> Dec 19 23:09:39 xen1 ccsd[2740]: Error while processing connect: Connection
>> refused
>> Dec 19 23:09:40 xen1 openais[2747]: [TOTEM] entering GATHER state from 0.
>> Dec 19 23:09:40 xen1 openais[2747]: [TOTEM] Creating commit token because I
>> am the rep.
>> Dec 19 23:09:40 xen1 openais[2747]: [TOTEM] Saving state aru 18 high seq
>> received 18
>> Dec 19 23:09:40 xen1 openais[2747]: [TOTEM] entering COMMIT state.
>> Dec 19 23:09:40 xen1 openais[2747]: [TOTEM] entering RECOVERY state.
>> Dec 19 23:09:40 xen1 openais[2747]: [TOTEM] position [0] member
>> 172.16.40.107:
>> Dec 19 23:09:40 xen1 openais[2747]: [TOTEM] previous ring seq 100 rep
>> 172.16.40.107
>> Dec 19 23:09:40 xen1 openais[2747]: [TOTEM] aru 18 high delivered 18
>> received flag 0
>> Dec 19 23:09:40 xen1 openais[2747]: [TOTEM] Did not need to originate any
>> messages in recovery.
>> Dec 19 23:09:40 xen1 openais[2747]: [TOTEM] Storing new sequence id for
>> ring 68
>> Dec 19 23:09:40 xen1 openais[2747]: [TOTEM] Sending initial ORF token
>> Dec 19 23:09:40 xen1 openais[2747]: [CLM  ] CLM CONFIGURATION CHANGE
>> Dec 19 23:09:40 xen1 openais[2747]: [CLM  ] New Configuration:
>> Dec 19 23:09:41 xen1 openais[2747]: [CLM  ]     r(0) ip(172.16.40.107)
>> Dec 19 23:09:41 xen1 openais[2747]: [CLM  ] Members Left:
>> Dec 19 23:09:41 xen1 openais[2747]: [CLM  ]     r(0) ip(172.16.40.116)
>> Dec 19 23:09:41 xen1 openais[2747]: [CLM  ] Members Joined:
>> Dec 19 23:09:41 xen1 openais[2747]: [SYNC ] This node is within the primary
>> component and will provide service.
>> Dec 19 23:09:41 xen1 openais[2747]: [CLM  ] CLM CONFIGURATION CHANGE
>> Dec 19 23:09:41 xen1 openais[2747]: [CLM  ] New Configuration:
>> Dec 19 23:09:41 xen1 openais[2747]: [CLM  ]     r(0) ip(172.16.40.107)
>> Dec 19 23:09:41 xen1 openais[2747]: [CLM  ] Members Left:
>> Dec 19 23:09:41 xen1 openais[2747]: [CLM  ] Members Joined:
>> Dec 19 23:09:41 xen1 openais[2747]: [SYNC ] This node is within the primary
>> component and will provide service.
>> Dec 19 23:09:41 xen1 openais[2747]: [TOTEM] entering OPERATIONAL state.
>> Dec 19 23:09:41 xen1 openais[2747]: [CLM  ] got nodejoin message
>> 172.16.40.107
>> Dec 19 23:09:41 xen1 openais[2747]: [CPG  ] got joinlist message from
>> node 1
>> Dec 19 23:09:44 xen1 ccsd[2740]: Cluster is not quorate.  Refusing
>> connection.
>> Dec 19 23:09:44 xen1 ccsd[2740]: Error while processing connect: Connection
>> refused
>> Dec 19 23:09:49 xen1 ccsd[2740]: Cluster is not quorate.  Refusing
>> connection.
>> Dec 19 23:09:49 xen1 ccsd[2740]: Error while processing connect: Connection
>> refused
>> Dec 19 23:09:54 xen1 ccsd[2740]: Cluster is not quorate.  Refusing
>> connection.
>> Dec 19 23:09:54 xen1 ccsd[2740]: Error while processing connect: Connection
>> refused
>> Dec 19 23:09:59 xen1 ccsd[2740]: Cluster is not quorate.  Refusing
>> connection.
>> Dec 19 23:09:59 xen1 ccsd[2740]: Error while processing connect: Connection
>> refused
>> Dec 19 23:10:04 xen1 ccsd[2740]: Cluster is not quorate.  Refusing
>> connection.
>> Dec 19 23:10:04 xen1 ccsd[2740]: Error while processing connect: Connection
>> refused
>> Dec 19 23:10:09 xen1 ccsd[2740]: Cluster is not quorate.  Refusing
>> connection.
>> Dec 19 23:10:09 xen1 ccsd[2740]: Error while processing connect: Connection
>> refused
>>
>> The last errors are expected because the cluster isn't quorate anymore. xen2
>> was rebooting and xen3 was fenced, so xen1 was left alone in an inquorate
>> cluster...
>>
>> The unusual thing is that it only happens when one of the nodes is running
>> the RHEL5 xen kernel. Maybe a bug involving bridge-utils and multicast?
>> The problem also happens if I reboot the xen1 or xen2 server into the xen
>> kernel.
>>
>>
>> Any intel?
>>
>> Thanks
>> Nuno Fernandes

Hi Nuno,

  Sorry for this question: how did you create the xen bridges with
ifcfg files? I am trying to do the same, but it doesn't work for me ...
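
For anyone following along, here is a minimal sketch of the kind of
ifcfg-based bridge setup being discussed. It is not Nuno's actual
configuration: the device names eth0 and xenbr0 and the netmask are
assumptions, and the address is simply xen1's from the logs above.

```sh
# /etc/sysconfig/network-scripts/ifcfg-xenbr0
# The bridge carries the host IP. The name "xenbr0" is an assumption.
DEVICE=xenbr0
TYPE=Bridge
BOOTPROTO=static
IPADDR=172.16.40.107
# Netmask is assumed; it is not given anywhere in this thread.
NETMASK=255.255.255.0
ONBOOT=yes

# /etc/sysconfig/network-scripts/ifcfg-eth0
# The physical NIC is enslaved to the bridge and has no IP of its own.
# The name "eth0" is an assumption.
DEVICE=eth0
BRIDGE=xenbr0
BOOTPROTO=none
ONBOOT=yes
```

With the bridge brought up by the normal network service at boot, xend
(with network-script set to /bin/true) should no longer need to tear down
and rebuild the interface after the cluster stack has started.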






-- 
CL Martinez
carlopmart {at} gmail {d0t} com



