
[Linux-cluster] Two-node cluster disconnecting



Hello list

I have a problem with a two-node cluster going split-brain. When I boot the first node, it correctly starts all the services and reports that the cluster is quorate. When I then boot the second node, the cluster software started during boot does not find the node that is already running, and it starts the same services that are already running on node 1! Once the boot is complete I can see that the nodes find each other for a short time, but they then disconnect from each other almost immediately. The cluster was created with Conga with shared disk support, though no shared disks have been created yet. This is on CentOS 5.

cluster.conf:

<?xml version="1.0"?>
<cluster alias="testcluster" config_version="11" name="testcluster">
        <fence_daemon clean_start="0" post_fail_delay="5" post_join_delay="1200"/>
        <clusternodes>
                <clusternode name="hume" nodeid="1" votes="1">
                        <fence>
                                <method name="2">
                                        <device name="ilohume"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="kant" nodeid="2" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="ilokant"/>
                                </method>
                        </fence>
                </clusternode>
        </clusternodes>
        <cman expected_votes="1" two_node="1"/>
        <fencedevices>
                <fencedevice agent="fence_ilo" hostname="x.x.x.x" login="*" name="ilohume" passwd="*"/>
                <fencedevice agent="fence_ilo" hostname="x.x.x.x" login="*" name="ilokant" passwd="*"/>
        </fencedevices>
        <rm>
                <failoverdomains/>
                <resources/>
                <service autostart="1" exclusive="0" name="test" recovery="relocate">
                        <script file="/etc/init.d/pgtest" name="pg"/>
                </service>
                <service autostart="1" exclusive="0" name="test2">
                        <script file="/etc/init.d/pgtest2" name="pg2"/>
                </service>
        </rm>
</cluster>
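
In case it helps anyone double-check the configuration, here is a minimal Python sketch that sanity-checks a cluster.conf like the one above. It only looks at things visible in this file: that every clusternode references a fence device that is actually defined under fencedevices, and that two_node="1" is paired with exactly two nodes and expected_votes="1". The function name check_cluster_conf is my own and is not part of any cluster tool, and the checks are assumptions about what is worth verifying, not an official validator.

```python
import xml.etree.ElementTree as ET


def check_cluster_conf(xml_text):
    """Return a list of human-readable problems found in a cluster.conf string."""
    root = ET.fromstring(xml_text)
    problems = []

    # Names of all fence devices declared under <fencedevices>.
    defined = {dev.get("name") for dev in root.iter("fencedevice")}

    nodes = root.findall("./clusternodes/clusternode")
    for node in nodes:
        # Fence devices this node refers to in its <fence><method> blocks.
        used = [dev.get("name") for dev in node.findall("./fence/method/device")]
        if not used:
            problems.append(f"{node.get('name')}: no fence device configured")
        for name in used:
            if name not in defined:
                problems.append(
                    f"{node.get('name')}: fence device {name!r} is not defined"
                )

    # two_node mode is only meant for clusters with exactly two nodes.
    cman = root.find("cman")
    two_node = cman is not None and cman.get("two_node") == "1"
    if two_node and len(nodes) != 2:
        problems.append("two_node=1 but the cluster does not have exactly two nodes")
    if two_node and cman.get("expected_votes") != "1":
        problems.append("two_node=1 normally goes with expected_votes=1")

    return problems
```

It could be run as check_cluster_conf(open("/etc/cluster/cluster.conf").read()); an empty list means none of these particular checks failed, which says nothing about network or multicast problems.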

Output of clustat, cman_tool status and cman_tool nodes on the node that is already running:

$ sudo clustat
Member Status: Quorate

Member Name                        ID   Status
------ ----                        ---- ------
hume                           1 Online, Local, rgmanager
kant                           2 Offline

Service Name         Owner (Last)                   State
------- ----         ----- ------                   -----
service:test         hume                    started
service:test2        hume                    started


$ sudo cman_tool status
Version: 6.0.1
Config Version: 11
Cluster Name: testcluster
Cluster Id: 31540
Cluster Member: Yes
Cluster Generation: 32
Membership state: Cluster-Member
Nodes: 1
Expected votes: 1
Total votes: 1
Quorum: 1
Active subsystems: 8
Flags: 2node
Ports Bound: 0 11 177
Node name: hume
Node ID: 1
Multicast addresses: 239.192.123.175
Node addresses: 193.166.192.100


$ sudo cman_tool nodes
Node  Sts   Inc   Joined               Name
   1   M      4   2007-10-10 14:58:53  hume
   2   X     28                        kant
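
To spell out why hume reports "Quorate" with only one node online: as far as I understand it, cman normally requires a majority of the expected votes, but two_node="1" is a special case in which a single vote is enough. Below is a minimal Python sketch of that arithmetic. The function names are mine and are not part of cman, and the majority formula is my understanding of the usual rule, not something taken from the output above.

```python
def quorum_threshold(expected_votes, two_node=False):
    """Votes needed for quorum: a simple majority, except in two_node mode."""
    if two_node:
        # Special case: either node may be quorate on its own,
        # and fencing is expected to arbitrate between the two.
        return 1
    return expected_votes // 2 + 1


def is_quorate(total_votes, expected_votes, two_node=False):
    """True if the votes currently present reach the quorum threshold."""
    return total_votes >= quorum_threshold(expected_votes, two_node)
```

With the values shown by cman_tool status (Expected votes: 1, Total votes: 1, Flags: 2node), is_quorate(1, 1, two_node=True) is true. The same holds on kant when it cannot see hume, so each node can consider itself quorate and start the services on its own.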


Here is what gets logged in /var/log/messages on hume:

Oct 11 07:20:15 hume openais[2410]: [TOTEM] entering GATHER state from 9.
Oct 11 07:20:15 hume openais[2410]: [TOTEM] Creating commit token because I am the rep.
Oct 11 07:20:15 hume openais[2410]: [TOTEM] Saving state aru 13bd2 high seq received 13bd2
Oct 11 07:20:15 hume openais[2410]: [TOTEM] entering COMMIT state.
Oct 11 07:20:15 hume openais[2410]: [TOTEM] entering RECOVERY state.
Oct 11 07:20:15 hume openais[2410]: [TOTEM] position [0] member 193.166.192.100:
Oct 11 07:20:15 hume openais[2410]: [TOTEM] previous ring seq 24 rep 193.166.192.100
Oct 11 07:20:15 hume openais[2410]: [TOTEM] aru 13bd2 high delivered 13bd2 received flag 0
Oct 11 07:20:15 hume openais[2410]: [TOTEM] position [1] member 193.166.192.101:
Oct 11 07:20:15 hume openais[2410]: [TOTEM] previous ring seq 4 rep 193.166.192.101
Oct 11 07:20:15 hume openais[2410]: [TOTEM] aru 27 high delivered 27 received flag 0
Oct 11 07:20:15 hume openais[2410]: [TOTEM] Did not need to originate any messages in recovery.
Oct 11 07:20:15 hume openais[2410]: [TOTEM] Storing new sequence id for ring 1c
Oct 11 07:20:15 hume kernel: dlm: connecting to 2
Oct 11 07:20:15 hume openais[2410]: [TOTEM] Sending initial ORF token
Oct 11 07:20:15 hume kernel: dlm: got connection from 2
Oct 11 07:20:15 hume openais[2410]: [CLM  ] CLM CONFIGURATION CHANGE
Oct 11 07:20:15 hume kernel: dlm: lockspace 20002 from 2 type 1 not found
Oct 11 07:20:15 hume openais[2410]: [CLM  ] New Configuration:
Oct 11 07:20:15 hume kernel: dlm: lockspace 30002 from 2 type 1 not found
Oct 11 07:20:15 hume openais[2410]: [CLM  ]     r(0) ip(193.166.192.100)
Oct 11 07:20:15 hume kernel: dlm: lockspace 20002 from 2 type 1 not found
Oct 11 07:20:15 hume openais[2410]: [CLM  ] Members Left:
Oct 11 07:20:15 hume kernel: dlm: lockspace 30002 from 2 type 1 not found
Oct 11 07:20:15 hume openais[2410]: [CLM  ] Members Joined:
Oct 11 07:20:15 hume kernel: dlm: lockspace 30002 from 2 type 1 not found
Oct 11 07:20:15 hume openais[2410]: [SYNC ] This node is within the primary component and will provide service.
Oct 11 07:20:15 hume kernel: dlm: lockspace 20002 from 2 type 1 not found
Oct 11 07:20:15 hume openais[2410]: [CLM  ] CLM CONFIGURATION CHANGE
Oct 11 07:20:15 hume kernel: dlm: lockspace 30002 from 2 type 1 not found
Oct 11 07:20:15 hume openais[2410]: [CLM  ] New Configuration:
Oct 11 07:20:15 hume kernel: dlm: lockspace 20002 from 2 type 1 not found
Oct 11 07:20:15 hume openais[2410]: [CLM  ]     r(0) ip(193.166.192.100)
Oct 11 07:20:15 hume kernel: dlm: lockspace 30002 from 2 type 1 not found
Oct 11 07:20:15 hume openais[2410]: [CLM  ]     r(0) ip(193.166.192.101)
Oct 11 07:20:15 hume kernel: dlm: lockspace 20002 from 2 type 1 not found
Oct 11 07:20:15 hume openais[2410]: [CLM  ] Members Left:
Oct 11 07:20:15 hume kernel: dlm: lockspace 30002 from 2 type 1 not found
Oct 11 07:20:15 hume openais[2410]: [CLM  ] Members Joined:
Oct 11 07:20:15 hume kernel: dlm: lockspace 20002 from 2 type 1 not found
Oct 11 07:20:15 hume openais[2410]: [CLM  ]     r(0) ip(193.166.192.101)
Oct 11 07:20:15 hume kernel: dlm: lockspace 30002 from 2 type 1 not found
Oct 11 07:20:15 hume openais[2410]: [SYNC ] This node is within the primary component and will provide service.
Oct 11 07:20:15 hume kernel: dlm: lockspace 20002 from 2 type 1 not found
Oct 11 07:20:15 hume openais[2410]: [TOTEM] entering OPERATIONAL state.
Oct 11 07:20:15 hume kernel: dlm: lockspace 30002 from 2 type 1 not found
Oct 11 07:20:15 hume openais[2410]: [CLM  ] got nodejoin message 193.166.192.100
Oct 11 07:20:15 hume kernel: dlm: lockspace 20002 from 2 type 1 not found
Oct 11 07:20:15 hume openais[2410]: [CLM  ] got nodejoin message 193.166.192.101
Oct 11 07:20:15 hume openais[2410]: [CPG  ] got joinlist message from node 1
Oct 11 07:20:15 hume kernel: dlm: lockspace 30002 from 2 type 1 not found
Oct 11 07:20:15 hume openais[2410]: [CPG  ] got joinlist message from node 2
Oct 11 07:20:15 hume kernel: dlm: lockspace 20002 from 2 type 1 not found
Oct 11 07:20:15 hume kernel: dlm: lockspace 30002 from 2 type 1 not found
Oct 11 07:20:15 hume kernel: dlm: lockspace 20002 from 2 type 1 not found
Oct 11 07:20:16 hume kernel: dlm: lockspace 30002 from 2 type 1 not found
Oct 11 07:20:16 hume kernel: dlm: lockspace 20002 from 2 type 1 not found
Oct 11 07:20:16 hume kernel: dlm: lockspace 30002 from 2 type 1 not found
Oct 11 07:20:16 hume kernel: dlm: lockspace 20002 from 2 type 1 not found
Oct 11 07:20:16 hume kernel: device eth0 entered promiscuous mode
Oct 11 07:20:16 hume kernel: dlm: lockspace 30002 from 2 type 1 not found
Oct 11 07:20:16 hume kernel: dlm: lockspace 20002 from 2 type 1 not found
Oct 11 07:20:16 hume kernel: dlm: lockspace 30002 from 2 type 1 not found
Oct 11 07:20:16 hume kernel: dlm: lockspace 20002 from 2 type 1 not found
Oct 11 07:20:16 hume kernel: dlm: lockspace 30002 from 2 type 1 not found
Oct 11 07:20:16 hume kernel: dlm: lockspace 20002 from 2 type 1 not found
Oct 11 07:20:17 hume kernel: dlm: lockspace 30002 from 2 type 1 not found
Oct 11 07:20:17 hume kernel: dlm: lockspace 20002 from 2 type 1 not found
Oct 11 07:20:17 hume kernel: dlm: lockspace 30002 from 2 type 1 not found
Oct 11 07:20:17 hume kernel: dlm: lockspace 20002 from 2 type 1 not found
Oct 11 07:20:17 hume kernel: dlm: lockspace 30002 from 2 type 1 not found
Oct 11 07:20:17 hume kernel: dlm: lockspace 20002 from 2 type 1 not found
Oct 11 07:20:18 hume kernel: dlm: lockspace 30002 from 2 type 1 not found
Oct 11 07:20:18 hume kernel: dlm: lockspace 20002 from 2 type 1 not found
Oct 11 07:20:18 hume kernel: dlm: lockspace 30002 from 2 type 1 not found
Oct 11 07:20:18 hume kernel: dlm: lockspace 20002 from 2 type 1 not found
Oct 11 07:20:18 hume kernel: dlm: lockspace 30002 from 2 type 1 not found
Oct 11 07:20:18 hume kernel: dlm: lockspace 20002 from 2 type 1 not found
Oct 11 07:20:19 hume kernel: dlm: lockspace 30002 from 2 type 1 not found
Oct 11 07:20:19 hume kernel: dlm: lockspace 20002 from 2 type 1 not found
Oct 11 07:20:19 hume kernel: dlm: lockspace 30002 from 2 type 1 not found
Oct 11 07:20:19 hume kernel: dlm: lockspace 20002 from 2 type 1 not found
Oct 11 07:20:20 hume kernel: dlm: lockspace 30002 from 2 type 1 not found
Oct 11 07:20:20 hume kernel: dlm: lockspace 20002 from 2 type 1 not found
Oct 11 07:20:20 hume kernel: dlm: lockspace 30002 from 2 type 1 not found
Oct 11 07:20:20 hume kernel: dlm: lockspace 20002 from 2 type 1 not found
Oct 11 07:20:21 hume kernel: dlm: lockspace 30002 from 2 type 1 not found
Oct 11 07:20:21 hume kernel: dlm: lockspace 20002 from 2 type 1 not found
Oct 11 07:20:21 hume kernel: dlm: lockspace 30002 from 2 type 1 not found
Oct 11 07:20:21 hume kernel: dlm: lockspace 20002 from 2 type 1 not found
Oct 11 07:20:22 hume kernel: dlm: lockspace 30002 from 2 type 1 not found
Oct 11 07:20:22 hume kernel: dlm: lockspace 20002 from 2 type 1 not found
Oct 11 07:20:22 hume kernel: dlm: lockspace 30002 from 2 type 1 not found
Oct 11 07:20:22 hume kernel: dlm: lockspace 20002 from 2 type 1 not found
Oct 11 07:20:23 hume kernel: dlm: lockspace 30002 from 2 type 1 not found
Oct 11 07:20:23 hume kernel: dlm: lockspace 20002 from 2 type 1 not found
Oct 11 07:20:23 hume kernel: dlm: lockspace 30002 from 2 type 1 not found
Oct 11 07:20:23 hume kernel: dlm: lockspace 20002 from 2 type 1 not found
Oct 11 07:20:24 hume kernel: dlm: lockspace 30002 from 2 type 1 not found
Oct 11 07:20:24 hume kernel: dlm: lockspace 20002 from 2 type 1 not found
Oct 11 07:20:25 hume kernel: dlm: lockspace 30002 from 2 type 1 not found
Oct 11 07:20:25 hume kernel: dlm: lockspace 20002 from 2 type 1 not found
Oct 11 07:20:25 hume kernel: dlm: lockspace 30002 from 2 type 1 not found
Oct 11 07:20:25 hume kernel: dlm: lockspace 20002 from 2 type 1 not found
Oct 11 07:20:25 hume kernel: device eth0 left promiscuous mode
Oct 11 07:20:26 hume kernel: dlm: lockspace 30002 from 2 type 1 not found
Oct 11 07:20:26 hume kernel: dlm: lockspace 20002 from 2 type 1 not found
Oct 11 07:20:27 hume kernel: dlm: lockspace 30002 from 2 type 1 not found
Oct 11 07:20:27 hume kernel: dlm: lockspace 20002 from 2 type 1 not found
Oct 11 07:20:27 hume kernel: dlm: lockspace 30002 from 2 type 1 not found
Oct 11 07:20:27 hume kernel: dlm: lockspace 20002 from 2 type 1 not found
Oct 11 07:20:28 hume kernel: device eth0 entered promiscuous mode
Oct 11 07:20:28 hume kernel: dlm: lockspace 30002 from 2 type 1 not found
Oct 11 07:20:28 hume kernel: dlm: lockspace 20002 from 2 type 1 not found
Oct 11 07:20:29 hume kernel: dlm: lockspace 30002 from 2 type 1 not found
Oct 11 07:20:29 hume kernel: dlm: lockspace 20002 from 2 type 1 not found
Oct 11 07:20:29 hume kernel: dlm: lockspace 30002 from 2 type 1 not found
Oct 11 07:20:29 hume kernel: dlm: lockspace 20002 from 2 type 1 not found
Oct 11 07:20:30 hume kernel: dlm: lockspace 30002 from 2 type 1 not found
Oct 11 07:20:30 hume kernel: dlm: lockspace 20002 from 2 type 1 not found
Oct 11 07:20:31 hume kernel: dlm: lockspace 30002 from 2 type 1 not found
Oct 11 07:20:31 hume kernel: dlm: lockspace 20002 from 2 type 1 not found
Oct 11 07:20:32 hume openais[2410]: [TOTEM] The token was lost in the OPERATIONAL state.
Oct 11 07:20:32 hume openais[2410]: [TOTEM] Receive multicast socket recv buffer size (262142 bytes).
Oct 11 07:20:32 hume openais[2410]: [TOTEM] Transmit multicast socket send buffer size (262142 bytes).
Oct 11 07:20:32 hume openais[2410]: [TOTEM] entering GATHER state from 2.
Oct 11 07:20:32 hume kernel: dlm: lockspace 30002 from 2 type 1 not found
Oct 11 07:20:32 hume kernel: dlm: lockspace 20002 from 2 type 1 not found
Oct 11 07:20:32 hume kernel: device eth0 left promiscuous mode
Oct 11 07:20:33 hume kernel: dlm: lockspace 30002 from 2 type 1 not found
Oct 11 07:20:33 hume kernel: dlm: lockspace 20002 from 2 type 1 not found
Oct 11 07:20:34 hume kernel: dlm: lockspace 30002 from 2 type 1 not found
Oct 11 07:20:34 hume kernel: dlm: lockspace 20002 from 2 type 1 not found
Oct 11 07:20:34 hume kernel: dlm: lockspace 30002 from 2 type 1 not found
Oct 11 07:20:34 hume kernel: dlm: lockspace 20002 from 2 type 1 not found
Oct 11 07:20:35 hume kernel: dlm: lockspace 30002 from 2 type 1 not found
Oct 11 07:20:35 hume kernel: dlm: lockspace 20002 from 2 type 1 not found
Oct 11 07:20:36 hume kernel: dlm: connecting to 2
Oct 11 07:20:36 hume openais[2410]: [TOTEM] entering GATHER state from 0.
Oct 11 07:20:36 hume openais[2410]: [TOTEM] Creating commit token because I am the rep.
Oct 11 07:20:36 hume openais[2410]: [TOTEM] Saving state aru 1d high seq received 20
Oct 11 07:20:36 hume openais[2410]: [TOTEM] entering COMMIT state.
Oct 11 07:20:36 hume openais[2410]: [TOTEM] entering RECOVERY state.
Oct 11 07:20:36 hume openais[2410]: [TOTEM] position [0] member 193.166.192.100:
Oct 11 07:20:36 hume openais[2410]: [TOTEM] previous ring seq 28 rep 193.166.192.100
Oct 11 07:20:36 hume openais[2410]: [TOTEM] aru 1d high delivered 1d received flag 0
Oct 11 07:20:36 hume openais[2410]: [TOTEM] copying all old ring messages from 1e-20.
Oct 11 07:20:36 hume openais[2410]: [TOTEM] Originated 0 messages in RECOVERY.
Oct 11 07:20:36 hume openais[2410]: [TOTEM] Originated for recovery:
Oct 11 07:20:36 hume openais[2410]: [TOTEM] Not Originated for recovery: 1e 1f 20
Oct 11 07:20:36 hume openais[2410]: [TOTEM] Storing new sequence id for ring 20
Oct 11 07:20:36 hume kernel: dlm: closing connection to node 2
Oct 11 07:20:36 hume openais[2410]: [TOTEM] Sending initial ORF token
Oct 11 07:20:37 hume openais[2410]: [CLM  ] CLM CONFIGURATION CHANGE
Oct 11 07:20:37 hume openais[2410]: [CLM  ] New Configuration:
Oct 11 07:20:37 hume openais[2410]: [CLM  ]     r(0) ip(193.166.192.100)
Oct 11 07:20:37 hume openais[2410]: [CLM  ] Members Left:
Oct 11 07:20:37 hume openais[2410]: [CLM  ]     r(0) ip(193.166.192.101)
Oct 11 07:20:37 hume openais[2410]: [CLM  ] Members Joined:
Oct 11 07:20:37 hume openais[2410]: [SYNC ] This node is within the primary component and will provide service.
Oct 11 07:20:37 hume openais[2410]: [CLM  ] CLM CONFIGURATION CHANGE
Oct 11 07:20:37 hume openais[2410]: [CLM  ] New Configuration:
Oct 11 07:20:37 hume openais[2410]: [CLM  ]     r(0) ip(193.166.192.100)
Oct 11 07:20:37 hume openais[2410]: [CLM  ] Members Left:
Oct 11 07:20:37 hume openais[2410]: [CLM  ] Members Joined:
Oct 11 07:20:37 hume openais[2410]: [SYNC ] This node is within the primary component and will provide service.
Oct 11 07:20:37 hume openais[2410]: [TOTEM] entering OPERATIONAL state.
Oct 11 07:20:37 hume openais[2410]: [CLM  ] got nodejoin message 193.166.192.100
Oct 11 07:20:37 hume openais[2410]: [CPG  ] got joinlist message from node 1


And on the other node (kant):


Oct 11 07:20:16 kant openais[2411]: [TOTEM] entering GATHER state from 11.
Oct 11 07:20:16 kant openais[2411]: [TOTEM] Saving state aru 27 high seq received 27
Oct 11 07:20:16 kant openais[2411]: [TOTEM] entering COMMIT state.
Oct 11 07:20:16 kant openais[2411]: [TOTEM] entering RECOVERY state.
Oct 11 07:20:16 kant openais[2411]: [TOTEM] position [0] member 193.166.192.100:
Oct 11 07:20:16 kant openais[2411]: [TOTEM] previous ring seq 24 rep 193.166.192.100
Oct 11 07:20:16 kant openais[2411]: [TOTEM] aru 13bd2 high delivered 13bd2 received flag 0
Oct 11 07:20:16 kant openais[2411]: [TOTEM] position [1] member 193.166.192.101:
Oct 11 07:20:16 kant openais[2411]: [TOTEM] previous ring seq 4 rep 193.166.192.101
Oct 11 07:20:16 kant openais[2411]: [TOTEM] aru 27 high delivered 27 received flag 0
Oct 11 07:20:16 kant openais[2411]: [TOTEM] Did not need to originate any messages in recovery.
Oct 11 07:20:16 kant openais[2411]: [TOTEM] Storing new sequence id for ring 1c
Oct 11 07:20:16 kant openais[2411]: [CLM  ] CLM CONFIGURATION CHANGE
Oct 11 07:20:16 kant kernel: dlm: connecting to 1
Oct 11 07:20:16 kant openais[2411]: [CLM  ] New Configuration:
Oct 11 07:20:16 kant kernel: dlm: got connection from 1
Oct 11 07:20:16 kant openais[2411]: [CLM  ]     r(0) ip(193.166.192.101)
Oct 11 07:20:16 kant kernel: dlm: lockspace 20001 from 1 type 1 not found
Oct 11 07:20:16 kant openais[2411]: [CLM  ] Members Left:
Oct 11 07:20:16 kant kernel: dlm: lockspace 30001 from 1 type 1 not found
Oct 11 07:20:16 kant openais[2411]: [CLM  ] Members Joined:
Oct 11 07:20:16 kant kernel: dlm: lockspace 20001 from 1 type 1 not found
Oct 11 07:20:16 kant openais[2411]: [SYNC ] This node is within the primary component and will provide service.
Oct 11 07:20:16 kant kernel: dlm: lockspace 30001 from 1 type 1 not found
Oct 11 07:20:16 kant openais[2411]: [CLM  ] CLM CONFIGURATION CHANGE
Oct 11 07:20:16 kant kernel: dlm: lockspace 20001 from 1 type 1 not found
Oct 11 07:20:16 kant openais[2411]: [CLM  ] New Configuration:
Oct 11 07:20:16 kant kernel: dlm: lockspace 30001 from 1 type 1 not found
Oct 11 07:20:16 kant openais[2411]: [CLM  ]     r(0) ip(193.166.192.100)
Oct 11 07:20:16 kant kernel: dlm: lockspace 20001 from 1 type 1 not found
Oct 11 07:20:16 kant openais[2411]: [CLM  ]     r(0) ip(193.166.192.101)
Oct 11 07:20:16 kant kernel: dlm: lockspace 30001 from 1 type 1 not found
Oct 11 07:20:16 kant openais[2411]: [CLM  ] Members Left:
Oct 11 07:20:16 kant kernel: dlm: lockspace 20001 from 1 type 1 not found
Oct 11 07:20:16 kant openais[2411]: [CLM  ] Members Joined:
Oct 11 07:20:16 kant kernel: dlm: lockspace 30001 from 1 type 1 not found
Oct 11 07:20:16 kant openais[2411]: [CLM  ]     r(0) ip(193.166.192.100)
Oct 11 07:20:16 kant kernel: dlm: lockspace 20001 from 1 type 1 not found
Oct 11 07:20:16 kant openais[2411]: [SYNC ] This node is within the primary component and will provide service.
Oct 11 07:20:16 kant kernel: dlm: lockspace 30001 from 1 type 1 not found
Oct 11 07:20:16 kant openais[2411]: [TOTEM] entering OPERATIONAL state.
Oct 11 07:20:16 kant kernel: dlm: lockspace 20001 from 1 type 1 not found
Oct 11 07:20:16 kant openais[2411]: [CLM  ] got nodejoin message 193.166.192.100
Oct 11 07:20:16 kant kernel: dlm: lockspace 30001 from 1 type 1 not found
Oct 11 07:20:16 kant openais[2411]: [CLM  ] got nodejoin message 193.166.192.101
Oct 11 07:20:16 kant kernel: dlm: lockspace 20001 from 1 type 1 not found
Oct 11 07:20:16 kant openais[2411]: [CPG  ] got joinlist message from node 1
Oct 11 07:20:16 kant kernel: dlm: lockspace 30001 from 1 type 1 not found
Oct 11 07:20:16 kant openais[2411]: [CPG  ] got joinlist message from node 2
Oct 11 07:20:16 kant kernel: dlm: lockspace 20001 from 1 type 1 not found
Oct 11 07:20:16 kant kernel: dlm: lockspace 30001 from 1 type 1 not found
Oct 11 07:20:16 kant kernel: dlm: lockspace 20001 from 1 type 1 not found
Oct 11 07:20:16 kant kernel: dlm: lockspace 30001 from 1 type 1 not found
Oct 11 07:20:17 kant kernel: dlm: lockspace 20001 from 1 type 1 not found
Oct 11 07:20:17 kant kernel: dlm: lockspace 30001 from 1 type 1 not found
Oct 11 07:20:17 kant kernel: dlm: lockspace 20001 from 1 type 1 not found
Oct 11 07:20:17 kant kernel: dlm: lockspace 30001 from 1 type 1 not found
Oct 11 07:20:17 kant openais[2411]: [TOTEM] Retransmit List: 1e
Oct 11 07:20:17 kant openais[2411]: [TOTEM] Retransmit List: 1e
Oct 11 07:20:17 kant openais[2411]: [TOTEM] Retransmit List: 1e 1f
Oct 11 07:20:17 kant openais[2411]: [TOTEM] Retransmit List: 1e 1f
Oct 11 07:20:17 kant openais[2411]: [TOTEM] Retransmit List: 1e 1f 20
Oct 11 07:20:17 kant last message repeated 29 times
Oct 11 07:20:17 kant kernel: dlm: lockspace 30001 from 1 type 1 not found
Oct 11 07:20:17 kant kernel: dlm: lockspace 20001 from 1 type 1 not found
Oct 11 07:20:17 kant openais[2411]: [TOTEM] Retransmit List: 1e 1f 20
Oct 11 07:20:17 kant kernel: dlm: lockspace 20001 from 1 type 1 not found
Oct 11 07:20:17 kant kernel: dlm: lockspace 30001 from 1 type 1 not found
Oct 11 07:20:18 kant kernel: dlm: lockspace 20001 from 1 type 1 not found
Oct 11 07:20:18 kant kernel: dlm: lockspace 30001 from 1 type 1 not found
Oct 11 07:20:18 kant openais[2411]: [TOTEM] Retransmit List: 1e 1f 20
Oct 11 07:20:18 kant kernel: dlm: lockspace 20001 from 1 type 1 not found
Oct 11 07:20:18 kant kernel: dlm: lockspace 30001 from 1 type 1 not found
Oct 11 07:20:18 kant openais[2411]: [TOTEM] Retransmit List: 1e 1f 20
Oct 11 07:20:18 kant kernel: dlm: lockspace 20001 from 1 type 1 not found
Oct 11 07:20:18 kant kernel: dlm: lockspace 30001 from 1 type 1 not found
Oct 11 07:20:18 kant openais[2411]: [TOTEM] Retransmit List: 1e 1f 20
Oct 11 07:20:18 kant kernel: dlm: lockspace 20001 from 1 type 1 not found
Oct 11 07:20:18 kant kernel: dlm: lockspace 30001 from 1 type 1 not found
Oct 11 07:20:19 kant kernel: dlm: lockspace 20001 from 1 type 1 not found
Oct 11 07:20:19 kant kernel: dlm: lockspace 30001 from 1 type 1 not found
Oct 11 07:20:19 kant openais[2411]: [TOTEM] Retransmit List: 1e 1f 20
Oct 11 07:20:19 kant kernel: dlm: lockspace 20001 from 1 type 1 not found
Oct 11 07:20:19 kant kernel: dlm: lockspace 30001 from 1 type 1 not found
Oct 11 07:20:19 kant openais[2411]: [TOTEM] Retransmit List: 1e 1f 20
Oct 11 07:20:20 kant kernel: dlm: lockspace 20001 from 1 type 1 not found
Oct 11 07:20:20 kant kernel: dlm: lockspace 30001 from 1 type 1 not found
Oct 11 07:20:20 kant openais[2411]: [TOTEM] Retransmit List: 1e 1f 20
Oct 11 07:20:20 kant kernel: dlm: lockspace 20001 from 1 type 1 not found
Oct 11 07:20:20 kant kernel: dlm: lockspace 30001 from 1 type 1 not found
Oct 11 07:20:20 kant openais[2411]: [TOTEM] Retransmit List: 1e 1f 20
Oct 11 07:20:20 kant kernel: dlm: lockspace 20001 from 1 type 1 not found
Oct 11 07:20:20 kant kernel: dlm: lockspace 30001 from 1 type 1 not found
Oct 11 07:20:20 kant openais[2411]: [TOTEM] Retransmit List: 1e 1f 20
Oct 11 07:20:21 kant openais[2411]: [TOTEM] Retransmit List: 1e 1f 20
Oct 11 07:20:21 kant kernel: dlm: lockspace 20001 from 1 type 1 not found
Oct 11 07:20:21 kant kernel: dlm: lockspace 30001 from 1 type 1 not found
Oct 11 07:20:21 kant openais[2411]: [TOTEM] Retransmit List: 1e 1f 20
Oct 11 07:20:21 kant kernel: dlm: lockspace 20001 from 1 type 1 not found
Oct 11 07:20:21 kant kernel: dlm: lockspace 30001 from 1 type 1 not found
Oct 11 07:20:22 kant openais[2411]: [TOTEM] Retransmit List: 1e 1f 20
Oct 11 07:20:22 kant kernel: dlm: lockspace 20001 from 1 type 1 not found
Oct 11 07:20:22 kant kernel: dlm: lockspace 30001 from 1 type 1 not found
Oct 11 07:20:22 kant openais[2411]: [TOTEM] Retransmit List: 1e 1f 20
Oct 11 07:20:22 kant kernel: dlm: lockspace 20001 from 1 type 1 not found
Oct 11 07:20:22 kant kernel: dlm: lockspace 30001 from 1 type 1 not found
Oct 11 07:20:22 kant openais[2411]: [TOTEM] Retransmit List: 1e 1f 20
Oct 11 07:20:23 kant openais[2411]: [TOTEM] Retransmit List: 1e 1f 20
Oct 11 07:20:23 kant kernel: dlm: lockspace 20001 from 1 type 1 not found
Oct 11 07:20:23 kant kernel: dlm: lockspace 30001 from 1 type 1 not found
Oct 11 07:20:23 kant openais[2411]: [TOTEM] Retransmit List: 1e 1f 20
Oct 11 07:20:23 kant openais[2411]: [TOTEM] FAILED TO RECEIVE
Oct 11 07:20:23 kant openais[2411]: [TOTEM] entering GATHER state from 6.
Oct 11 07:20:23 kant openais[2411]: [TOTEM] Retransmit List: 1e 1f 20
Oct 11 07:20:23 kant openais[2411]: [TOTEM] FAILED TO RECEIVE
Oct 11 07:20:23 kant openais[2411]: [TOTEM] entering GATHER state from 6.
Oct 11 07:20:23 kant kernel: dlm: lockspace 20001 from 1 type 1 not found
Oct 11 07:20:23 kant kernel: dlm: lockspace 30001 from 1 type 1 not found
Oct 11 07:20:24 kant openais[2411]: [TOTEM] Retransmit List: 1e 1f 20
Oct 11 07:20:24 kant openais[2411]: [TOTEM] FAILED TO RECEIVE
Oct 11 07:20:24 kant openais[2411]: [TOTEM] entering GATHER state from 6.
Oct 11 07:20:24 kant kernel: dlm: lockspace 20001 from 1 type 1 not found
Oct 11 07:20:24 kant kernel: dlm: lockspace 30001 from 1 type 1 not found
Oct 11 07:20:24 kant openais[2411]: [TOTEM] Retransmit List: 1e 1f 20
Oct 11 07:20:24 kant openais[2411]: [TOTEM] FAILED TO RECEIVE
Oct 11 07:20:24 kant openais[2411]: [TOTEM] entering GATHER state from 6.
Oct 11 07:20:24 kant kernel: dlm: lockspace 20001 from 1 type 1 not found
Oct 11 07:20:24 kant kernel: dlm: lockspace 30001 from 1 type 1 not found
Oct 11 07:20:25 kant openais[2411]: [TOTEM] Retransmit List: 1e 1f 20
Oct 11 07:20:25 kant openais[2411]: [TOTEM] FAILED TO RECEIVE
Oct 11 07:20:25 kant openais[2411]: [TOTEM] entering GATHER state from 6.
Oct 11 07:20:25 kant kernel: dlm: lockspace 20001 from 1 type 1 not found
Oct 11 07:20:25 kant kernel: dlm: lockspace 30001 from 1 type 1 not found
Oct 11 07:20:25 kant openais[2411]: [TOTEM] Retransmit List: 1e 1f 20
Oct 11 07:20:25 kant openais[2411]: [TOTEM] FAILED TO RECEIVE
Oct 11 07:20:25 kant openais[2411]: [TOTEM] entering GATHER state from 6.
Oct 11 07:20:26 kant openais[2411]: [TOTEM] Retransmit List: 1e 1f 20
Oct 11 07:20:26 kant openais[2411]: [TOTEM] FAILED TO RECEIVE
Oct 11 07:20:26 kant openais[2411]: [TOTEM] entering GATHER state from 6.
Oct 11 07:20:26 kant kernel: dlm: lockspace 20001 from 1 type 1 not found
Oct 11 07:20:26 kant kernel: dlm: lockspace 30001 from 1 type 1 not found
Oct 11 07:20:26 kant openais[2411]: [TOTEM] Retransmit List: 1e 1f 20
Oct 11 07:20:26 kant openais[2411]: [TOTEM] FAILED TO RECEIVE
Oct 11 07:20:26 kant openais[2411]: [TOTEM] entering GATHER state from 6.
Oct 11 07:20:26 kant kernel: dlm: lockspace 20001 from 1 type 1 not found
Oct 11 07:20:26 kant kernel: dlm: lockspace 30001 from 1 type 1 not found
Oct 11 07:20:27 kant openais[2411]: [TOTEM] Retransmit List: 1e 1f 20
Oct 11 07:20:27 kant openais[2411]: [TOTEM] FAILED TO RECEIVE
Oct 11 07:20:27 kant openais[2411]: [TOTEM] entering GATHER state from 6.
Oct 11 07:20:27 kant kernel: dlm: lockspace 20001 from 1 type 1 not found
Oct 11 07:20:27 kant kernel: dlm: lockspace 30001 from 1 type 1 not found
Oct 11 07:20:27 kant openais[2411]: [TOTEM] Retransmit List: 1e 1f 20
Oct 11 07:20:27 kant openais[2411]: [TOTEM] FAILED TO RECEIVE
Oct 11 07:20:27 kant openais[2411]: [TOTEM] entering GATHER state from 6.
Oct 11 07:20:28 kant openais[2411]: [TOTEM] Retransmit List: 1e 1f 20
Oct 11 07:20:28 kant openais[2411]: [TOTEM] FAILED TO RECEIVE
Oct 11 07:20:28 kant openais[2411]: [TOTEM] entering GATHER state from 6.
Oct 11 07:20:28 kant kernel: dlm: lockspace 20001 from 1 type 1 not found
Oct 11 07:20:28 kant kernel: dlm: lockspace 30001 from 1 type 1 not found
Oct 11 07:20:28 kant openais[2411]: [TOTEM] Retransmit List: 1e 1f 20
Oct 11 07:20:28 kant openais[2411]: [TOTEM] FAILED TO RECEIVE
Oct 11 07:20:28 kant openais[2411]: [TOTEM] entering GATHER state from 6.
Oct 11 07:20:28 kant kernel: dlm: lockspace 20001 from 1 type 1 not found
Oct 11 07:20:28 kant kernel: dlm: lockspace 30001 from 1 type 1 not found
Oct 11 07:20:29 kant openais[2411]: [TOTEM] Retransmit List: 1e 1f 20
Oct 11 07:20:29 kant openais[2411]: [TOTEM] FAILED TO RECEIVE
Oct 11 07:20:29 kant openais[2411]: [TOTEM] entering GATHER state from 6.
Oct 11 07:20:29 kant kernel: dlm: lockspace 20001 from 1 type 1 not found
Oct 11 07:20:29 kant kernel: dlm: lockspace 30001 from 1 type 1 not found
Oct 11 07:20:29 kant openais[2411]: [TOTEM] Retransmit List: 1e 1f 20
Oct 11 07:20:29 kant openais[2411]: [TOTEM] FAILED TO RECEIVE
Oct 11 07:20:29 kant openais[2411]: [TOTEM] entering GATHER state from 6.
Oct 11 07:20:30 kant openais[2411]: [TOTEM] Retransmit List: 1e 1f 20
Oct 11 07:20:30 kant openais[2411]: [TOTEM] FAILED TO RECEIVE
Oct 11 07:20:30 kant openais[2411]: [TOTEM] entering GATHER state from 6.
Oct 11 07:20:30 kant kernel: dlm: lockspace 20001 from 1 type 1 not found
Oct 11 07:20:30 kant kernel: dlm: lockspace 30001 from 1 type 1 not found
Oct 11 07:20:30 kant openais[2411]: [TOTEM] Retransmit List: 1e 1f 20
Oct 11 07:20:30 kant openais[2411]: [TOTEM] FAILED TO RECEIVE
Oct 11 07:20:30 kant openais[2411]: [TOTEM] entering GATHER state from 6.
Oct 11 07:20:31 kant kernel: dlm: lockspace 20001 from 1 type 1 not found
Oct 11 07:20:31 kant kernel: dlm: lockspace 30001 from 1 type 1 not found
Oct 11 07:20:31 kant openais[2411]: [TOTEM] Retransmit List: 1e 1f 20
Oct 11 07:20:31 kant openais[2411]: [TOTEM] FAILED TO RECEIVE
Oct 11 07:20:31 kant openais[2411]: [TOTEM] entering GATHER state from 6.
Oct 11 07:20:31 kant openais[2411]: [TOTEM] Retransmit List: 1e 1f 20
Oct 11 07:20:31 kant openais[2411]: [TOTEM] FAILED TO RECEIVE
Oct 11 07:20:31 kant openais[2411]: [TOTEM] entering GATHER state from 6.
Oct 11 07:20:31 kant kernel: dlm: lockspace 20001 from 1 type 1 not found
Oct 11 07:20:31 kant kernel: dlm: lockspace 30001 from 1 type 1 not found
Oct 11 07:20:32 kant openais[2411]: [TOTEM] Retransmit List: 1e 1f 20
Oct 11 07:20:32 kant openais[2411]: [TOTEM] FAILED TO RECEIVE
Oct 11 07:20:32 kant openais[2411]: [TOTEM] entering GATHER state from 6.
Oct 11 07:20:32 kant openais[2411]: [TOTEM] Retransmit List: 1e 1f 20
Oct 11 07:20:32 kant openais[2411]: [TOTEM] FAILED TO RECEIVE
Oct 11 07:20:32 kant openais[2411]: [TOTEM] entering GATHER state from 6.
Oct 11 07:20:32 kant kernel: dlm: lockspace 20001 from 1 type 1 not found
Oct 11 07:20:32 kant kernel: dlm: lockspace 30001 from 1 type 1 not found
Oct 11 07:20:33 kant openais[2411]: [TOTEM] Retransmit List: 1e 1f 20
Oct 11 07:20:33 kant openais[2411]: [TOTEM] FAILED TO RECEIVE
Oct 11 07:20:33 kant openais[2411]: [TOTEM] entering GATHER state from 6.
Oct 11 07:20:33 kant kernel: dlm: lockspace 20001 from 1 type 1 not found
Oct 11 07:20:33 kant kernel: dlm: lockspace 30001 from 1 type 1 not found
Oct 11 07:20:34 kant kernel: dlm: lockspace 20001 from 1 type 1 not found
Oct 11 07:20:34 kant kernel: dlm: lockspace 30001 from 1 type 1 not found
Oct 11 07:20:35 kant kernel: dlm: lockspace 20001 from 1 type 1 not found
Oct 11 07:20:35 kant kernel: dlm: lockspace 30001 from 1 type 1 not found
Oct 11 07:20:36 kant kernel: dlm: lockspace 20001 from 1 type 1 not found
Oct 11 07:20:36 kant kernel: dlm: lockspace 30001 from 1 type 1 not found
Oct 11 07:20:37 kant kernel: dlm: lockspace 20001 from 1 type 1 not found
Oct 11 07:20:37 kant kernel: dlm: lockspace 30001 from 1 type 1 not found
Oct 11 07:20:37 kant openais[2411]: [TOTEM] entering GATHER state from 0.
Oct 11 07:20:37 kant openais[2411]: [TOTEM] Creating commit token because I am the rep.
Oct 11 07:20:37 kant openais[2411]: [TOTEM] Saving state aru 20 high seq received 20
Oct 11 07:20:37 kant openais[2411]: [TOTEM] entering COMMIT state.
Oct 11 07:20:37 kant openais[2411]: [TOTEM] entering RECOVERY state.
Oct 11 07:20:37 kant openais[2411]: [TOTEM] position [0] member 193.166.192.101:
Oct 11 07:20:37 kant openais[2411]: [TOTEM] previous ring seq 28 rep 193.166.192.100
Oct 11 07:20:37 kant openais[2411]: [TOTEM] aru 20 high delivered 20 received flag 0
Oct 11 07:20:37 kant openais[2411]: [TOTEM] Did not need to originate any messages in recovery.
Oct 11 07:20:37 kant openais[2411]: [TOTEM] Storing new sequence id for ring 20
Oct 11 07:20:37 kant openais[2411]: [TOTEM] Sending initial ORF token
Oct 11 07:20:37 kant openais[2411]: [CLM  ] CLM CONFIGURATION CHANGE
Oct 11 07:20:37 kant openais[2411]: [CLM  ] New Configuration:
Oct 11 07:20:37 kant kernel: dlm: closing connection to node 1
Oct 11 07:20:37 kant openais[2411]: [CLM  ]     r(0) ip(193.166.192.101)
Oct 11 07:20:37 kant kernel: dlm: connect from non cluster node
Oct 11 07:20:37 kant openais[2411]: [CLM  ] Members Left:
Oct 11 07:20:37 kant openais[2411]: [CLM  ]     r(0) ip(193.166.192.100)
Oct 11 07:20:38 kant openais[2411]: [CLM  ] Members Joined:
Oct 11 07:20:38 kant openais[2411]: [SYNC ] This node is within the primary component and will provide service.
Oct 11 07:20:38 kant openais[2411]: [CLM  ] CLM CONFIGURATION CHANGE
Oct 11 07:20:38 kant openais[2411]: [CLM  ] New Configuration:
Oct 11 07:20:38 kant openais[2411]: [CLM  ]     r(0) ip(193.166.192.101)
Oct 11 07:20:38 kant openais[2411]: [CLM  ] Members Left:
Oct 11 07:20:38 kant openais[2411]: [CLM  ] Members Joined:
Oct 11 07:20:38 kant openais[2411]: [SYNC ] This node is within the primary component and will provide service.
Oct 11 07:20:38 kant openais[2411]: [TOTEM] entering OPERATIONAL state.
Oct 11 07:20:38 kant openais[2411]: [CLM  ] got nodejoin message 193.166.192.101
Oct 11 07:20:38 kant openais[2411]: [CPG  ] got joinlist message from node 2
Oct 11 07:21:31 kant snmpd[2664]: Connection from UDP: [193.166.218.61]:55646
Oct 11 07:21:31 kant snmpd[2664]: Received SNMP packet(s) from UDP: [193.166.218.61]:55646
Oct 11 07:21:31 kant snmpd[2664]: Connection from UDP: [193.166.218.61]:55646
Oct 11 07:21:31 kant snmpd[2664]: Connection from UDP: [193.166.218.61]:55647
Oct 11 07:21:31 kant snmpd[2664]: Received SNMP packet(s) from UDP: [193.166.218.61]:55647
Oct 11 07:21:31 kant snmpd[2664]: Connection from UDP: [193.166.218.61]:55647
Oct 11 07:21:31 kant last message repeated 2 times
Oct 11 07:21:31 kant snmpd[2664]: Connection from UDP: [193.166.218.61]:55646
Oct 11 07:21:41 kant ntpd[2696]: synchronized to LOCAL(0), stratum 10
Oct 11 07:21:41 kant ntpd[2696]: kernel time sync enabled 0001
Oct 11 07:22:45 kant ntpd[2696]: synchronized to 193.166.211.70, stratum 2
Oct 11 07:26:35 kant snmpd[2664]: Connection from UDP: [193.166.218.61]:56021
Oct 11 07:26:35 kant snmpd[2664]: Received SNMP packet(s) from UDP: [193.166.218.61]:56021
Oct 11 07:26:35 kant snmpd[2664]: Connection from UDP: [193.166.218.61]:56021
Oct 11 07:26:35 kant snmpd[2664]: Connection from UDP: [193.166.218.61]:56022
Oct 11 07:26:35 kant snmpd[2664]: Received SNMP packet(s) from UDP: [193.166.218.61]:56022
Oct 11 07:26:35 kant snmpd[2664]: Connection from UDP: [193.166.218.61]:56022
Oct 11 07:26:35 kant last message repeated 2 times
Oct 11 07:26:35 kant snmpd[2664]: Connection from UDP: [193.166.218.61]:56021
Oct 11 07:30:20 kant openais[2411]: [TOTEM] entering GATHER state from 11.
Oct 11 07:30:20 kant openais[2411]: [TOTEM] Saving state aru 14 high seq received 14
Oct 11 07:30:20 kant openais[2411]: [TOTEM] entering COMMIT state.
Oct 11 07:30:20 kant openais[2411]: [TOTEM] entering RECOVERY state.
Oct 11 07:30:20 kant openais[2411]: [TOTEM] position [0] member 193.166.192.100:
Oct 11 07:30:20 kant openais[2411]: [TOTEM] previous ring seq 32 rep 193.166.192.100
Oct 11 07:30:20 kant openais[2411]: [TOTEM] aru 15 high delivered 15 received flag 0
Oct 11 07:30:20 kant openais[2411]: [TOTEM] position [1] member 193.166.192.101:
Oct 11 07:30:20 kant openais[2411]: [TOTEM] previous ring seq 32 rep 193.166.192.101
Oct 11 07:30:20 kant openais[2411]: [TOTEM] aru 14 high delivered 14 received flag 0
Oct 11 07:30:20 kant openais[2411]: [TOTEM] Did not need to originate any messages in recovery.
Oct 11 07:30:20 kant openais[2411]: [TOTEM] Storing new sequence id for ring 24
Oct 11 07:30:20 kant openais[2411]: [CLM  ] CLM CONFIGURATION CHANGE
Oct 11 07:30:20 kant openais[2411]: [CLM  ] New Configuration:
Oct 11 07:30:20 kant openais[2411]: [CLM  ]     r(0) ip(193.166.192.101)
Oct 11 07:30:20 kant openais[2411]: [CLM  ] Members Left:
Oct 11 07:30:20 kant openais[2411]: [CLM  ] Members Joined:
Oct 11 07:30:20 kant openais[2411]: [SYNC ] This node is within the primary component and will provide service.
Oct 11 07:30:20 kant openais[2411]: [CLM  ] CLM CONFIGURATION CHANGE
Oct 11 07:30:20 kant openais[2411]: [CLM  ] New Configuration:
Oct 11 07:30:20 kant openais[2411]: [CLM  ]     r(0) ip(193.166.192.100)
Oct 11 07:30:20 kant openais[2411]: [CLM  ]     r(0) ip(193.166.192.101)
Oct 11 07:30:20 kant openais[2411]: [CLM  ] Members Left:
Oct 11 07:30:20 kant openais[2411]: [CLM  ] Members Joined:
Oct 11 07:30:20 kant openais[2411]: [CLM  ]     r(0) ip(193.166.192.100)
Oct 11 07:30:20 kant openais[2411]: [SYNC ] This node is within the primary component and will provide service.
Oct 11 07:30:20 kant openais[2411]: [TOTEM] entering OPERATIONAL state.
Oct 11 07:30:20 kant openais[2411]: [MAIN ] Killing node hume because it has rejoined the cluster without cman_tool join
Oct 11 07:30:20 kant openais[2411]: [CMAN ] cman killed by node 1 for reason 3
Oct 11 07:30:20 kant dlm_controld[2433]: cluster is down, exiting
Oct 11 07:30:20 kant kernel: dlm: closing connection to node 2
Oct 11 07:30:20 kant gfs_controld[2439]: groupd_dispatch error -1 errno 11
Oct 11 07:30:20 kant fenced[2427]: cluster is down, exiting
Oct 11 07:30:20 kant gfs_controld[2439]: groupd connection died
Oct 11 07:30:20 kant gfs_controld[2439]: cluster is down, exiting
Oct 11 07:30:47 kant ccsd[2403]: Unable to connect to cluster infrastructure after 30 seconds.

