[Linux-cluster] fencing loop in a 2-node partitioned cluster

Marc Grimme grimme at atix.de
Tue Feb 24 15:01:41 UTC 2009


We've solved this problem by using fence timeouts that depend on the node ID. 
That means node0 gets timeout=0 and node1 gets timeout=10, so node0 always wins 
the fencing race and survives. It's not the optimal solution, but it works.
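One way to get the same effect with a stock cluster.conf is to delay the
fencing of the node that should win the race. A rough, untested sketch
(it assumes the fence agent in use honours a delay option, which not all
agents do, and "fencedev1"/"fencedev2" are placeholders for your real
fence devices -- check the agent's man page first):

  <clusternode name="node01" nodeid="1" votes="1">
    <fence>
      <method name="1">
        <!-- fencing node01 is delayed 10s, so node01 wins the race -->
        <device name="fencedev1" delay="10"/>
      </method>
    </fence>
  </clusternode>
  <clusternode name="node02" nodeid="2" votes="1">
    <fence>
      <method name="1">
        <!-- node02 is fenced immediately -->
        <device name="fencedev2"/>
      </method>
    </fence>
  </clusternode>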
Alternatively, use qdiskd and let it detect the network partitioning (wherever 
it happens) and decide via a heuristic which side should survive.
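In cluster.conf that is the <quorumd> section plus its <heuristic> child; a
sketch reconstructed from the qdiskd settings visible in your debug output
below (label acsquorum, 3s interval, tko 5, one vote, one ping heuristic):

  <quorumd interval="3" tko="5" votes="1" label="acsquorum">
    <!-- a partition that can no longer reach this address loses its
         heuristic score and, with it, the quorum-disk vote -->
    <heuristic program="ping -c1 -w1 10.4.5.250" score="1" interval="2" tko="3"/>
  </quorumd>

The ping target should be something only the "healthy" side of a split can
still reach -- typically the default gateway on the cluster network.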
Marc.
On Tuesday 24 February 2009 14:50:08 Gianluca Cecchi wrote:
> And these are the logs I see on the two nodes:
> the first node:
> Feb 23 16:26:38 oracs1 openais[6020]: [TOTEM] The token was lost in
> the OPERATIONAL state.
> Feb 23 16:26:38 oracs1 openais[6020]: [TOTEM] Receive multicast socket
> recv buffer size (288000 bytes).
> Feb 23 16:26:38 oracs1 openais[6020]: [TOTEM] Transmit multicast
> socket send buffer size (288000 bytes).
> Feb 23 16:26:38 oracs1 openais[6020]: [TOTEM] entering GATHER state from 2.
> Feb 23 16:26:43 oracs1 openais[6020]: [TOTEM] entering GATHER state from 0.
> Feb 23 16:26:43 oracs1 openais[6020]: [TOTEM] Creating commit token
> because I am the rep.
> Feb 23 16:26:43 oracs1 openais[6020]: [TOTEM] Saving state aru 36 high
> seq received 36
> Feb 23 16:26:43 oracs1 openais[6020]: [TOTEM] Storing new sequence id
> for ring 4d944
> Feb 23 16:26:43 oracs1 openais[6020]: [TOTEM] entering COMMIT state.
> Feb 23 16:26:43 oracs1 openais[6020]: [TOTEM] entering RECOVERY state.
> Feb 23 16:26:43 oracs1 openais[6020]: [TOTEM] position [0] member 192.168.16.1:
> Feb 23 16:26:43 oracs1 openais[6020]: [TOTEM] previous ring seq 317760 rep 192.168.16.1
> Feb 23 16:26:43 oracs1 openais[6020]: [TOTEM] aru 36 high delivered 36
> received flag 1
> Feb 23 16:26:43 oracs1 openais[6020]: [TOTEM] Did not need to
> originate any messages in recovery.
> Feb 23 16:26:43 oracs1 openais[6020]: [TOTEM] Sending initial ORF token
> Feb 23 16:26:43 oracs1 openais[6020]: [CLM  ] CLM CONFIGURATION CHANGE
> Feb 23 16:26:43 oracs1 openais[6020]: [CLM  ] New Configuration:
> Feb 23 16:26:43 oracs1 kernel: dlm: closing connection to node 2
> Feb 23 16:26:43 oracs1 fenced[6078]: node02 not a cluster member after
> 0 sec post_fail_delay
> Feb 23 16:26:43 oracs1 openais[6020]: [CLM  ]   r(0) ip(192.168.16.1)
> Feb 23 16:26:43 oracs1 clurgmgrd[6868]: <info> State change: node02 DOWN
> Feb 23 16:26:43 oracs1 fenced[6078]: fencing node "node02"
> Feb 23 16:26:43 oracs1 openais[6020]: [CLM  ] Members Left:
> Feb 23 16:26:43 oracs1 openais[6020]: [CLM  ]   r(0) ip(192.168.16.8)
> Feb 23 16:26:43 oracs1 openais[6020]: [CLM  ] Members Joined:
> Feb 23 16:26:43 oracs1 openais[6020]: [CLM  ] CLM CONFIGURATION CHANGE
> Feb 23 16:26:43 oracs1 openais[6020]: [CLM  ] New Configuration:
> Feb 23 16:26:43 oracs1 openais[6020]: [CLM  ]   r(0) ip(192.168.16.1)
> Feb 23 16:26:43 oracs1 openais[6020]: [CLM  ] Members Left:
> Feb 23 16:26:43 oracs1 openais[6020]: [CLM  ] Members Joined:
> Feb 23 16:26:43 oracs1 openais[6020]: [SYNC ] This node is within the
> primary component and will provide service.
> Feb 23 16:26:43 oracs1 openais[6020]: [TOTEM] entering OPERATIONAL state.
> Feb 23 16:26:43 oracs1 openais[6020]: [CLM  ] got nodejoin message 192.168.16.1
> Feb 23 16:26:43 oracs1 openais[6020]: [CPG  ] got joinlist message from node 1
> Feb 23 16:26:48 oracs1 clurgmgrd[6868]: <info> Waiting for node #2 to be fenced
>
> The other node:
> Feb 23 16:26:38 oracs2 openais[6027]: [TOTEM] The token was lost in
> the OPERATIONAL state.
> Feb 23 16:26:38 oracs2 openais[6027]: [TOTEM] Receive multicast socket
> recv buffer size (288000 bytes).
> Feb 23 16:26:38 oracs2 openais[6027]: [TOTEM] Transmit multicast
> socket send buffer size (288000 bytes).
> Feb 23 16:26:38 oracs2 openais[6027]: [TOTEM] entering GATHER state from 2.
> Feb 23 16:26:43 oracs2 openais[6027]: [TOTEM] entering GATHER state from 0.
> Feb 23 16:26:43 oracs2 openais[6027]: [TOTEM] Creating commit token
> because I am the rep.
> Feb 23 16:26:43 oracs2 openais[6027]: [TOTEM] Saving state aru 36 high
> seq received 36
> Feb 23 16:26:43 oracs2 openais[6027]: [TOTEM] Storing new sequence id
> for ring 4d944
> Feb 23 16:26:43 oracs2 openais[6027]: [TOTEM] entering COMMIT state.
> Feb 23 16:26:43 oracs2 openais[6027]: [TOTEM] entering RECOVERY state.
> Feb 23 16:26:43 oracs2 openais[6027]: [TOTEM] position [0] member 192.168.16.8:
> Feb 23 16:26:43 oracs2 openais[6027]: [TOTEM] previous ring seq 317760 rep 192.168.16.1
> Feb 23 16:26:43 oracs2 openais[6027]: [TOTEM] aru 36 high delivered 36
> received flag 1
> Feb 23 16:26:43 oracs2 openais[6027]: [TOTEM] Did not need to
> originate any messages in recovery.
> Feb 23 16:26:43 oracs2 openais[6027]: [TOTEM] Sending initial ORF token
> Feb 23 16:26:43 oracs2 openais[6027]: [CLM  ] CLM CONFIGURATION CHANGE
> Feb 23 16:26:43 oracs2 openais[6027]: [CLM  ] New Configuration:
> Feb 23 16:26:43 oracs2 kernel: dlm: closing connection to node 1
> Feb 23 16:26:43 oracs2 fenced[6085]: node01 not a cluster member after
> 0 sec post_fail_delay
> Feb 23 16:26:43 oracs2 openais[6027]: [CLM  ]   r(0) ip(192.168.16.8)
> Feb 23 16:26:43 oracs2 clurgmgrd[6880]: <info> State change: node01 DOWN
> Feb 23 16:26:43 oracs2 fenced[6085]: fencing node "node01"
> Feb 23 16:26:43 oracs2 openais[6027]: [CLM  ] Members Left:
> Feb 23 16:26:43 oracs2 openais[6027]: [CLM  ]   r(0) ip(192.168.16.1)
> Feb 23 16:26:43 oracs2 openais[6027]: [CLM  ] Members Joined:
> Feb 23 16:26:43 oracs2 openais[6027]: [CLM  ] CLM CONFIGURATION CHANGE
> Feb 23 16:26:43 oracs2 openais[6027]: [CLM  ] New Configuration:
> Feb 23 16:26:43 oracs2 openais[6027]: [CLM  ]   r(0) ip(192.168.16.8)
> Feb 23 16:26:43 oracs2 openais[6027]: [CLM  ] Members Left:
> Feb 23 16:26:43 oracs2 openais[6027]: [CLM  ] Members Joined:
> Feb 23 16:26:43 oracs2 openais[6027]: [SYNC ] This node is within the
> primary component and will provide service.
> Feb 23 16:26:44 oracs2 openais[6027]: [TOTEM] entering OPERATIONAL state.
> Feb 23 16:26:44 oracs2 openais[6027]: [CLM  ] got nodejoin message 192.168.16.8
> Feb 23 16:26:44 oracs2 openais[6027]: [CPG  ] got joinlist message from node 2
> Feb 23 16:26:48 oracs2 clurgmgrd[6880]: <info> Waiting for node #1 to be fenced
>
> The result is that both nodes ended up powered off and I had to power
> them back on manually.
>
> As I also have debug enabled for qdisk, these are the logs for it:
> node01:
> Feb 23 09:39:42 oracs1 qdiskd[6062]: <debug> Heuristic: 'ping -c1 -w1
> 10.4.5.250' score=1 interval=2 tko=3
> Feb 23 09:39:42 oracs1 qdiskd[6062]: <debug> 1 heuristics loaded
> Feb 23 09:39:42 oracs1 qdiskd[6062]: <debug> Quorum Daemon: 1
> heuristics, 3 interval, 5 tko, 1 votes
> Feb 23 09:39:42 oracs1 qdiskd[6062]: <debug> Run Flags: 00000031
> Feb 23 09:39:43 oracs1 qdiskd[6062]: <info> Quorum Partition:
> /dev/dm-5 Label: acsquorum
> Feb 23 09:39:43 oracs1 qdiskd[6063]: <info> Quorum Daemon Initializing
> Feb 23 09:39:43 oracs1 qdiskd[6063]: <debug> I/O Size: 512  Page Size: 4096
> Feb 23 09:39:44 oracs1 qdiskd[6063]: <info> Heuristic: 'ping -c1 -w1
> 10.4.5.250' UP
> Feb 23 09:39:58 oracs1 qdiskd[6063]: <info> Initial score 1/1
> Feb 23 09:39:58 oracs1 qdiskd[6063]: <info> Initialization complete
> Feb 23 09:39:58 oracs1 qdiskd[6063]: <notice> Score sufficient for
> master operation (1/1; required=1); upgrading
> Feb 23 09:40:04 oracs1 qdiskd[6063]: <debug> Making bid for master
> Feb 23 09:40:10 oracs1 qdiskd[6063]: <info> Assuming master role
> Feb 23 09:42:52 oracs1 qdiskd[6063]: <debug> Node 2 is UP
>
>
> node02:
> Feb 23 16:12:34 oracs2 qdiskd[6069]: <debug> Heuristic: 'ping -c1 -w1
> 10.4.5.250' score=1 interval=2 tko=3
> Feb 23 16:12:34 oracs2 qdiskd[6069]: <debug> 1 heuristics loaded
> Feb 23 16:12:34 oracs2 qdiskd[6069]: <debug> Quorum Daemon: 1
> heuristics, 3 interval, 5 tko, 1 votes
> Feb 23 16:12:34 oracs2 qdiskd[6069]: <debug> Run Flags: 00000031
> Feb 23 16:12:34 oracs2 qdiskd[6069]: <info> Quorum Partition:
> /dev/dm-10 Label: acsquorum
> Feb 23 16:12:34 oracs2 qdiskd[6070]: <info> Quorum Daemon Initializing
> Feb 23 16:12:34 oracs2 qdiskd[6070]: <debug> I/O Size: 512  Page Size: 4096
> Feb 23 16:12:36 oracs2 qdiskd[6070]: <info> Heuristic: 'ping -c1 -w1
> 10.4.5.250' UP
> Feb 23 16:12:41 oracs2 qdiskd[6070]: <debug> Node 1 is UP
> Feb 23 16:12:44 oracs2 qdiskd[6070]: <info> Node 1 is the master
> Feb 23 16:12:50 oracs2 qdiskd[6070]: <info> Initial score 1/1
> Feb 23 16:12:50 oracs2 qdiskd[6070]: <info> Initialization complete
> Feb 23 16:12:50 oracs2 qdiskd[6070]: <notice> Score sufficient for
> master operation (1/1; required=1); upgrading
>
>
> Gianluca
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster



-- 
Gruss / Regards,

Marc Grimme
Phone: +49-89 452 3538-14
http://www.atix.de/               http://www.open-sharedroot.org/

ATIX Informationstechnologie und Consulting AG | Einsteinstrasse 10 |
85716 Unterschleissheim | www.atix.de | www.open-sharedroot.org

Registergericht: Amtsgericht Muenchen, Registernummer: HRB 168930, USt.-Id.: 
DE209485962 | Vorstand: Marc Grimme, Mark Hlawatschek, Thomas Merz (Vors.) |
Vorsitzender des Aufsichtsrats: Dr. Martin Buss



