[Linux-cluster] Odd cluster problems
Jay Leafey
jleafey at utmem.edu
Thu Aug 2 19:00:13 UTC 2007
Lon Hohberger wrote:
> On Tue, Jul 31, 2007 at 10:48:44AM -0500, Jay Leafey wrote:
>> I've got a 3-node cluster running CentOS 4.5 and I cannot communicate
>> with the resource group manager. When I use the clustat command I get a
>> timeout:
>>
>>> [root@rapier ~]# clustat
>>> Timed out waiting for a response from Resource Group Manager
>>> Member Status: Quorate
>>>
>>> Member Name                  Status
>>> ------ ----                  ------
>>> rapier.utmem.edu             Online, Local, rgmanager
>>> thorax.utmem.edu             Offline
>>> cyclops.utmem.edu            Online, rgmanager
>
>>> Fence Domain: "default" 2 2 recover 4 -
>>> [1 2]
>
> Until fencing completes, rgmanager won't respond.
>
> fence_ack_manual needs to be run.
>
>>> <SNIP>
>>>
>>> User: "usrm::manager" 10 10 recover 2 -
>>> [1 2]
>>>
>
Your reply was a bit confusing at first, but a closer look showed you
were right on the mark. The systems (which use HP ILO fencing) could
barely communicate with each other and could not reach the ILO ports
at all. It turns out some of the ports they were configured on had been
moved to a different VLAN, so the network was split between the ILOs
and the host ports.

Configuring the ports properly seems to have resolved the issue, and
everything is working fine now. I guess I just need to keep the rubber
hose handy for "discussions" with the network guys! (grin!)
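
For anyone who hits the same symptom, a quick reachability check from each
node might have pointed at the split sooner. The following is only a minimal
sketch, not something taken from this cluster's configuration: the ILO
hostnames are hypothetical placeholders, and port 443 is an assumption based
on ILO management normally being reached over HTTPS.

```python
import socket


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def unreachable(targets, timeout=3.0):
    """Return the (host, port) pairs from targets that could not be reached."""
    return [(host, port) for host, port in targets
            if not can_reach(host, port, timeout)]


if __name__ == "__main__":
    # Hypothetical ILO hostnames; substitute the addresses that the fence
    # devices in cluster.conf actually point at. Port 443 is an assumption.
    ilo_targets = [
        ("rapier-ilo.example.com", 443),
        ("thorax-ilo.example.com", 443),
        ("cyclops-ilo.example.com", 443),
    ]
    for host, port in unreachable(ilo_targets):
        print("cannot reach %s:%d" % (host, port))
```

Running it on every node is the useful part, since a VLAN split can leave an
ILO visible from one host and not from another.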
Thanks!
--
Jay Leafey - University of Tennessee
E-Mail: jleafey at utmem.edu Phone: 901-448-6534 FAX: 901-448-8199