[Linux-cluster] quorum / hba issues

Daryl Fenton dfenton at ucalgary.ca
Tue Dec 12 17:10:40 UTC 2006


Right now we have two HP blade servers (Blade1 and Blade3) running Red Hat
AS 4 Update 4 and Cluster Suite 4; both access LVM volumes on our EMC CX700
SAN. At the moment we have a 350 GB ext3 volume and a 350 GB GFS volume that
they share out using Cluster Suite and NFS.

The issue shows up when we run tests on the ext3 NFS share. If we take down
one of the HBA connections to Blade1, multipath kicks in and everything
keeps working. But if we disable all of the HBA connections on Blade1,
qdiskd notices that Blade1 can no longer access the quorum disk, and the
cluster fences Blade1, which reboots it. The problem is that when Blade1
comes back up it still can't find its quorum disk, since the HBA is still
down. Because cman has to be running before qdiskd, cman starts up fine and
Blade1 joins the cluster. The next service to start is qdiskd, which fails
since Blade1's HBA is down and it can't see the quorum disk. Once everything
has started, Blade1 tries to take its services back from the cluster and
fails them, because its HBA is down, and then it just sits there in the
failed state until someone intervenes manually.
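One idea we had was a boot-time pre-check that refuses to start cman while
the quorum disk is unreachable. A rough sketch of what we mean (this is our
own script, not anything shipped with Cluster Suite, and /dev/mapper/qdisk
is just a guess at the multipath alias for our quorum LUN):

    #!/bin/sh
    # Run before the cman init script; exit non-zero if the quorum
    # disk is not visible, so cman (and the cluster join) never starts.
    QDISK=/dev/mapper/qdisk   # guess at our multipath alias

    # The device node must exist and be a block device.
    if [ ! -b "$QDISK" ]; then
        echo "$QDISK missing; HBA probably still down" >&2
        exit 1
    fi

    # One 512-byte read to confirm I/O actually works through a live path.
    if ! dd if="$QDISK" of=/dev/null bs=512 count=1 >/dev/null 2>&1; then
        echo "cannot read $QDISK" >&2
        exit 1
    fi
    exit 0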
Is there a supported way to keep Blade1 from joining the cluster while its
HBA is still down, or, if it does join, to make it fence itself / refuse to
take any services?
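Or would a qdiskd heuristic in cluster.conf be the right way to get the
"fence itself" behaviour? Something like the sketch below, where the label
and the test command are just guesses for our setup; as we understand it,
if the heuristic fails the node drops below min_score and qdiskd takes it
out rather than letting it sit on failed services:

    <quorumd interval="1" tko="10" votes="1" label="qdisk">
        <!-- Withhold our vote (and get evicted) if we cannot read
             the quorum LUN; the device path is a guess for our setup. -->
        <heuristic program="dd if=/dev/mapper/qdisk of=/dev/null bs=512 count=1"
                   score="1" interval="2"/>
    </quorumd>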

Thanks,

Daryl Fenton



