[Linux-cluster] Cannot make cluster after upgrade

Ian Hayes cthulhucalling at gmail.com
Wed Jul 8 06:59:24 UTC 2009


Sounds a little split-brainish. Have you tried the clean_start=1 option?
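
In case it helps, here is a rough sketch of where that option would go. As far as I know, clean_start is an attribute of the fence_daemon element in cluster.conf, placed inside <cluster> next to <cman>. The post_join_delay value of 6 is only an assumption taken from the "6 sec post_join_delay" in the fenced log lines quoted below; adjust it to whatever you actually run with.

```xml
<!-- Sketch only: add inside <cluster> in cluster.conf, alongside <cman>. -->
<!-- clean_start="1" asks fenced to skip fencing of absent nodes at startup. -->
<!-- post_join_delay="6" is assumed from the quoted fenced log messages. -->
<fence_daemon clean_start="1" post_join_delay="6"/>
```

Remember to bump config_version on both nodes when you change the file. Skipping startup fencing is only reasonable if you are certain neither node still has the GFS volumes mounted, so treat this as a diagnostic step rather than a permanent setting.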

On Jul 7, 2009 11:54 PM, "Abed-nego G. Escobal, Jr." <abednegoyulo at yahoo.com>
wrote:


After an upgrade from 5.2 to 5.3, the cluster, named GFSCluster, seems to
have stopped being a cluster. GFSCluster is a 2-node cluster using iscsi,
cman, clvm, and gfs, and it was working fine on 5.2. The configuration
on both nodes (passwords removed) is:

<?xml version="1.0"?>
<cluster name="GFSCluster" config_version="5">
  <cman expected_votes="1" two_node="1"/>
  <clusternodes>
    <clusternode name="node01.company.com" votes="1" nodeid="1">
      <fence>
        <method name="single">
          <device name="node01_ipmi"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node02.company.com" votes="1" nodeid="2">
      <fence>
        <method name="single">
          <device name="node02_ipmi"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice name="node01_ipmi" agent="fence_ipmilan" ipaddr="10.1.0.5" login="root" passwd="********"/>
    <fencedevice name="node02_ipmi" agent="fence_ipmilan" ipaddr="10.1.0.7" login="root" passwd="********"/>
  </fencedevices>
  <rm>
    <failoverdomains/>
    <resources/>
  </rm>
</cluster>

When the cman service is started, both nodes hang at the "Starting fencing" step:

Starting cluster:
  Loading modules... done
  Mounting configfs... done
  Starting ccsd... done
  Starting cman... done
  Starting daemons... done
  Starting fencing...

After 5 minutes the task finishes with "done", but clustat says:

==== As root on web01.company.com ====
 Cluster Status for GFSCluster @ Wed Jul  8 01:00:24 2009
 Member Status: Quorate

  Member Name                             ID   Status
  ------ ----                             ---- ------
  node01.company.com                         1 Online, Local
  node02.company.com                         2 Offline


==== As root on web02.company.com ====
 Cluster Status for GFSCluster @ Wed Jul  8 01:00:26 2009
 Member Status: Quorate

  Member Name                             ID   Status
  ------ ----                             ---- ------
  node01.company.com                         1 Offline
  node02.company.com                         2 Online, Local

Each node is quorate in its own separate cluster.

In the logs of web01 I found these messages repeating:

Jul  8 00:55:27 web01 fenced[21872]: node02.company.com not a cluster member after 6 sec post_join_delay
Jul  8 00:55:27 web01 fenced[21872]: fencing node "node02.company.com"
Jul  8 00:55:52 web01 fenced[21872]: agent "fence_ipmilan" reports: Rebooting machine @ IPMI:10.1.0.7...ipmilan: Failed to connect after 30 seconds Failed


In the logs of web02 I found the same messages repeating:

Jul  8 00:55:27 web02 fenced[6363]: node01.company.com not a cluster member after 6 sec post_join_delay
Jul  8 00:55:27 web02 fenced[6363]: fencing node "node01.company.com"
Jul  8 00:55:53 web02 fenced[6363]: agent "fence_ipmilan" reports: Rebooting machine @ IPMI:10.1.0.5...ipmilan: Failed to connect after 30 seconds Failed


Is there a bug in 5.3 with regard to clustering?
Are there any workarounds?



