[Linux-cluster] AS 4.0 with GFS 6.1 cluster not working

Lon Hohberger lhh at redhat.com
Wed Nov 15 16:59:35 UTC 2006


On Mon, 2006-11-13 at 13:04 -0500, Patel, Tushar wrote:
>  
> Hello,
>  
> We have a lot of AS 4.0 GFS 6.0 clusters in our firm working fine.
> However, we have recently been upgrading our clusters to GFS 6.1.
>  
> From what we have seen so far, once we upgrade to GFS 6.1 our
> failover test fails.
>  
> So far we have conducted failover tests using 2 scenarios:
> 1.) Manually rebooting one of the nodes in the cluster through the
> HP-iLO interface by sending the fence_ilo command from one of the
> other nodes - the command works fine and the host does reboot.
> 2.) Pulling out the network cables of one of the hosts in the cluster.
>  
> We have a 4-node cluster.
>  
> The problem is that with either of the above tests, GFS hangs and
> clustat starts reporting only member information, with no service
> information. Clustat displays the following message:
>
> "Timeout : Resource Manager not responding"
>  
> GFS just keeps hanging until we manually intervene and bring
> the halted node up.
>  
> It seems fencing is not working or rgmanager is flawed and cannot
> process/parse information.

If fencing breaks, rgmanager will hang...

Look at /proc/cluster/services -- if you see the fence domain in the
'recover' state, fencing needs to be fixed.
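
As a rough illustration of that check, here is a minimal Python sketch
that scans the text of /proc/cluster/services for a fence domain line
carrying the 'recover' state. The function name is hypothetical, and the
line layout it relies on (a line beginning with "Fence Domain:" followed
by whitespace-separated fields, one of which is the state) is an
assumption about the file format, so adjust it to what your nodes
actually print.

```python
def fence_domains_in_recover(services_text):
    """Return the 'Fence Domain:' lines that have a field equal to 'recover'.

    Assumes one line per service group, starting with the service type,
    with the state as a separate whitespace-delimited field.
    """
    stuck = []
    for line in services_text.splitlines():
        if line.startswith("Fence Domain:") and "recover" in line.split():
            stuck.append(line.strip())
    return stuck


if __name__ == "__main__":
    try:
        with open("/proc/cluster/services") as f:
            text = f.read()
    except OSError as exc:
        # The file only exists on a node with the cluster software loaded.
        print("cannot read /proc/cluster/services:", exc)
    else:
        stuck = fence_domains_in_recover(text)
        if stuck:
            print("fence domain in 'recover' state, fencing needs fixing:")
            for line in stuck:
                print("  " + line)
        else:
            print("no fence domain in 'recover' state")
```

Run it on each surviving node after the failover test. An empty result
only means the fence domain is not shown in 'recover'; it does not by
itself prove that fencing is configured correctly.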

-- Lon





