RE: [Linux-cluster] Cluster Crashes


> First of all, is there a way I can test to see if my Brocade 
> switch is actually doing any fencing or not? I get the sense 
> it's doing nothing.

Log in to the switch and check the status of the port connected to the
node that was supposed to be fenced.  If fencing worked, that port
should be down (disabled).
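
If you would rather script the check than eyeball it, here is a rough
sketch in Python.  It assumes you have already captured the text of
Fabric OS "switchshow" from the switch by whatever means you use to
log in, and that each port line looks roughly like "port  N: ..." with
the word "Disabled" on a disabled port.  That layout is an assumption
and varies between firmware versions.  The SAMPLE text and the function
name port_is_disabled are made up for illustration; SAMPLE is not real
output from your switch.

```python
import re

# Hypothetical sample of "switchshow" port lines. The real layout depends
# on the firmware version, so adjust the regex to match what you see.
SAMPLE = """\
port  0: id N2 Online         F-Port 21:00:00:e0:8b:00:00:01
port  1: id N2 No_Light       Disabled
"""

# Assumed line shape: "port <number>: <rest of the status text>"
PORT_LINE = re.compile(r"^\s*port\s+(\d+):\s*(.*)$", re.IGNORECASE)


def port_is_disabled(switchshow_text: str, port: int) -> bool:
    """Return True if the line for `port` contains the word 'Disabled'."""
    for line in switchshow_text.splitlines():
        match = PORT_LINE.match(line)
        if match and int(match.group(1)) == port:
            return "disabled" in match.group(2).lower()
    raise ValueError(f"port {port} not found in switchshow output")


if __name__ == "__main__":
    for p in (0, 1):
        state = "disabled" if port_is_disabled(SAMPLE, p) else "NOT disabled"
        print(f"port {p}: {state}")
```

Run it against output captured right after a node has been fenced.  If
the node's port is reported as not disabled, the fence agent is
probably not reaching the switch at all.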

> I think this because my cluster is terribly unstable. If I 
> reboot a node, that's fine, it works, the cluster stays up. 
> However, if one of the nodes crashes in any manner, it takes 
> down everything to the point of having to shut down every 
> machine and start them back up one at a time.
> If a drive gets moved on my FC storage, the cluster crashes. 
> If the storage is rebooted, the cluster crashes. If I change 
> pretty much anything on the storage, the cluster crashes, 
> it's nuts. The way it seems to start is that one node seems 
> to have a kernel panic which sets off the rest.

You're kidding, right?  Filesystems don't like having storage yanked
out from underneath them, clustered or not.

> I know this is limited information but I need somewhere to 
> start. I can't even begin to think of using this in a 
> production environment, no one would get any sleep watching 
> over this to make sure it's all up :).
> Mike
