[Linux-cluster] Hang on start fence_tool join with qdisk

Eugene Melnichuk doc at mts.com.ua
Mon Jul 23 13:37:42 UTC 2007



Hi,

I have a problem with my cluster running RHEL5 plus updates from
http://people.redhat.com/lhh/rhel5-test/

I have a 2-node cluster with a shared quorum disk. qdiskd is running, but
when I start the cman service it hangs at "Starting fencing".
The logs show messages about quorum being regained:

Jul 21 15:50:18 arf-web1 qdiskd[7326]: <info> Assuming master role
Jul 21 15:50:19 arf-web1 ccsd[8188]: Cluster is not quorate.  Refusing connection.
Jul 21 15:50:19 arf-web1 ccsd[8188]: Error while processing connect: Connection refused
Jul 21 15:50:19 arf-web1 openais[8200]: [CMAN ] quorum regained, resuming activity
Jul 21 15:50:20 arf-web1 clurgmgrd[7746]: <notice> Quorum formed, starting
Jul 21 15:50:20 arf-web1 kernel: dlm: no local IP address has been set
Jul 21 15:50:20 arf-web1 kernel: dlm: cannot start dlm lowcomms -12


After a few minutes the "Starting fencing" step finished, but I still
have no running services, and group_tool shows that the join to the
fence domain has not completed.

[root@arf-web1 ~]# group_tool
type             level name     id       state
fence            0     default  00010002 JOIN_START_WAIT
[2]
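
To check for this state from a script, here is a minimal Python sketch that parses group_tool output in the layout shown above and lists groups that have not settled. It assumes a fully joined group reports the state "none"; the names parse_group_tool, unsettled_groups and SAMPLE are hypothetical, chosen only for illustration.

```python
# Sketch: parse `group_tool` output in the layout shown in this post.
# Assumption: a fully joined group reports the state "none".

SAMPLE = """\
type             level name     id       state
fence            0     default  00010002 JOIN_START_WAIT
[2]
"""


def parse_group_tool(text):
    """Return a list of dicts, one per group line in the output."""
    groups = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        # A bracketed line lists the member node IDs of the previous group.
        if line.startswith("[") and line.endswith("]"):
            if groups:
                groups[-1]["members"] = line[1:-1].split()
            continue
        fields = line.split()
        # Skip the column header row.
        if fields[:2] == ["type", "level"]:
            continue
        if len(fields) >= 5:
            groups.append({
                "type": fields[0],
                "level": fields[1],
                "name": fields[2],
                "id": fields[3],
                "state": fields[4],
                "members": [],
            })
    return groups


def unsettled_groups(groups):
    """Return the groups whose state is anything other than "none"."""
    return [g for g in groups if g["state"] != "none"]


if __name__ == "__main__":
    for g in unsettled_groups(parse_group_tool(SAMPLE)):
        print(g["type"], g["name"], g["state"], g["members"])
```

In practice the text would come from running group_tool on the node rather than from SAMPLE.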

When I try to run commands like cman_tool or clustat, I get no output:
they hang while connecting to the socket /var/run/cman_client (although
I can interrupt the running command with Ctrl-C).
[root@arf-web1 ~]# strace cman_tool status
execve("/usr/sbin/cman_tool", ["cman_tool", "status"], [/* 21 vars */]) = 0
<skip>
socket(PF_FILE, SOCK_STREAM, 0)         = 3
fcntl(3, F_SETFD, FD_CLOEXEC)           = 0
connect(3, {sa_family=AF_FILE, path="/var/run/cman_client"}, 110 <unfinished ...>

[root@arf-web1 ~]# strace clustat
execve("/usr/sbin/clustat", ["clustat"], [/* 21 vars */]) = 0
socket(PF_FILE, SOCK_STREAM, 0)         = 3
fcntl(3, F_SETFD, FD_CLOEXEC)           = 0
connect(3, {sa_family=AF_FILE, path="/var/run/cman_client"}, 110 <unfinished ...>
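
Since both commands block in connect() indefinitely, a wrapper with a timeout makes it possible to collect status without a stuck shell. The following is a minimal sketch, assuming a Python 3 interpreter is available on the node (stock RHEL5 ships an older Python); run_with_timeout is a hypothetical helper name, and the 10-second limit is an arbitrary choice.

```python
# Sketch: run a diagnostic command but give up after a timeout,
# so a hang on /var/run/cman_client cannot block the caller.
import subprocess
import sys


def run_with_timeout(cmd, timeout=10.0):
    """Run cmd and return (status, output).

    status is one of "ok", "failed", "timeout" or "missing".
    """
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
            timeout=timeout,  # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return ("timeout", "")
    except FileNotFoundError:
        return ("missing", "")
    status = "ok" if proc.returncode == 0 else "failed"
    return (status, proc.stdout)


if __name__ == "__main__":
    hung = False
    for cmd in (["cman_tool", "status"], ["clustat"]):
        status, output = run_with_timeout(cmd, timeout=10.0)
        print(" ".join(cmd), "->", status)
        if status == "timeout":
            hung = True
    # Non-zero exit signals that at least one command hung.
    sys.exit(1 if hung else 0)
```

A "timeout" result here corresponds to the hang shown in the strace output above; it does not by itself say why the socket is not being serviced.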

What can I do to resolve this?


Thanks in advance,
Eugene



--
Eugene Melnichuk
Lead Engineer
email: doc at umc.ua
mob: +380503304043
pbx: +380501105731
CJSC Ukrainian Mobile Communications
49/2 Pobedy ave., room 4.26, 03680, Kyiv, Ukraine


-------------- next part --------------
A non-text attachment was scrubbed...
Name: cluster.conf
Type: text/xml
Size: 2342 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20070723/9ae21293/attachment.xml>

