[Linux-cluster] Cluster won't come up when T1 is down???

Christopher Barry christopher.barry at qlogic.com
Fri Oct 12 03:04:12 UTC 2007


> From: linux-cluster-bounces at redhat.com on behalf of isplist at logicore.net
> Sent: Thu 10/11/2007 10:12 PM
> To: linux-cluster
> Subject: RE: [Linux-cluster] Cluster won't come up when T1 is down???
 
> Great, list cop. Can't you just stop replying then? I've been on this list and 


WooO00ooooOOOOooo; Pull Over! ;)

isplist, I know you're a great guy, and no doubt a tad stressed with your site down, but you have posted your message the exact same way at least 4 times. No one has answered you. I'm at least replying. Isn't the definition of insanity doing the same thing over and over while expecting different results? I'm simply attempting to engage you to help you - selfishly, I'll admit, because to be honest, I'm tired of this email hitting my inbox. However, you are acting as if I, in attempting to help you understand the traditional, *time-honored* Internet posting techniques (for which there are very good reasons), am somehow the bad guy here.

Please, let's dispense with the vitriol and solve your problem, ok? Great, then let's get on with it. You can thank me later.

Now, on to your computer problem:

>> Here's a very weird one. I have a cluster of web servers outgoing over a
>> T1.
>> When the T1 went down this morning, the cluster, which is all internal,
>> non
>> routable IP's, would not come back. All of the machines locked up around
>> the
>> loading DLM section on bootup.
>>
>> Once the T1 came back, they all booted just fine and went into cluster
>> mode.
>>
>> What in the world would cause that? There aren't any external services
>> required to fire up my local cluster, never were, it's always been fine
>> before.

Obviously, the external link is the dependency we need to examine. So what could depend on it? My gut feeling is that there is a name resolution problem happening. Where are the names your nodes use to find each other stored? I'm guessing in a DNS server that you cannot reach when the T1 drops.

Ensure that each node has an /etc/hosts file that lists all node names and IP addresses, if you have not done so already. This will ensure that names are resolved correctly even if DNS is not available. Understanding your network topology would also be helpful.
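
If it helps, here is a rough sketch of how you might check that every node name is covered by a hosts file. It is only an illustration: the node names, addresses, and the example.internal domain are made up, and on a real node you would read the text from /etc/hosts and take the names from your own cluster configuration.

```python
def parse_hosts(text):
    """Return a dict mapping each hostname or alias to its IP address."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        for name in names:
            # Keep the first address seen for a name.
            mapping.setdefault(name.lower(), ip)
    return mapping


def missing_nodes(hosts_text, node_names):
    """Return the node names that have no entry in the hosts text."""
    known = parse_hosts(hosts_text)
    return [n for n in node_names if n.lower() not in known]


# Hypothetical sample data, not taken from any real cluster.
SAMPLE_HOSTS = """\
127.0.0.1     localhost
192.168.1.11  web1.example.internal web1
192.168.1.12  web2.example.internal web2
"""
SAMPLE_NODES = ["web1", "web2", "web3"]

if __name__ == "__main__":
    # On a real node: hosts_text = open("/etc/hosts").read()
    print(missing_nodes(SAMPLE_HOSTS, SAMPLE_NODES))
```

Any name this reports as missing would have to be resolved some other way, most likely through DNS, which is exactly the dependency I suspect.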

But keep in mind I am simply guessing here - as you have provided me with few details.

Can you please provide:

* a description of your topology
* config data for all interfaces
* contents of /etc/hosts
* IP address of DNS server
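
For the last item, the configured DNS servers are normally the nameserver lines in /etc/resolv.conf. A small sketch for pulling them out, using made-up sample text (the addresses and search domain below are placeholders, not a guess at your setup):

```python
def nameservers(resolv_text):
    """Return the addresses on 'nameserver' lines of resolv.conf-style text."""
    servers = []
    for line in resolv_text.splitlines():
        # Strip comments, which start with '#' or ';'.
        line = line.split("#", 1)[0].split(";", 1)[0].strip()
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


# Hypothetical sample data.
SAMPLE_RESOLV = """\
# example only
search example.internal
nameserver 192.168.1.1
nameserver 203.0.113.53
"""

if __name__ == "__main__":
    # On a real node: resolv_text = open("/etc/resolv.conf").read()
    print(nameservers(SAMPLE_RESOLV))
```

If every address listed is only reachable over the T1, that would fit my guess above.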


Thanks,
-C
