[Linux-cluster] Fwd: GFS volume hangs on 3 nodes after gfs_grow

Alan A alan.zg at gmail.com
Fri Sep 26 18:34:07 UTC 2008


Thanks again for the prompt response, Bob.

I restored the nodes to a healthy state and they can access the GFS volumes again.
node3:
service gfs status
Configured GFS mountpoints:
/lvm_test1
/lvm_test2
Active GFS mountpoints:
/lvm_test1
/lvm_test2

node4:
service gfs status
Configured GFS mountpoints:
/lvm_test1
/lvm_test2
Active GFS mountpoints:
/lvm_test1
/lvm_test2

node2 - luci node:
service gfs status
Configured GFS mountpoints:
/lvm_test1
/lvm_test2
Active GFS mountpoints:
/lvm_test1
/lvm_test2
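
A rough sketch of how the per-node check above could be scripted, assuming
the status output always uses the two headings shown here. The function
name gfs_missing_mounts is made up for illustration and is not part of any
GFS tool.

```shell
#!/bin/sh
# Hypothetical helper (not part of the GFS tools): reads the output of
# `service gfs status` on stdin and prints every configured GFS mountpoint
# that does not also appear in the active list.
# Assumption: the output has the "Configured GFS mountpoints:" and
# "Active GFS mountpoints:" headings exactly as shown in this message.
gfs_missing_mounts() {
    awk '
        /^Configured GFS mountpoints:/ { section = "configured"; next }
        /^Active GFS mountpoints:/     { section = "active"; next }
        section == "configured" && NF  { configured[$1] = 1 }
        section == "active" && NF      { active[$1] = 1 }
        END {
            # Report anything configured but not currently mounted.
            for (m in configured)
                if (!(m in active))
                    print m
        }
    ' | sort
}

# Example use on a node:
#   service gfs status | gfs_missing_mounts
```

No output means every configured mountpoint is active on that node, which
is the case for all three nodes above.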


I will try to reproduce the problem with gfs_grow.

One more question regarding GFS: what steps would you recommend (if any)
for growing and shrinking an active GFS volume?

On Fri, Sep 26, 2008 at 12:44 PM, Bob Peterson <rpeterso at redhat.com> wrote:

> ----- "Alan A" <alan.zg at gmail.com> wrote:
> | Thanks again, Bob.
> |
> | No kernel panic on any of the nodes. I had to cold boot all 3 nodes in
> | order to get the cluster going (might have been a fence issue but am not
> | 100% sure, since we use only SCSI fencing until we agree on a secondary
> | fencing method). What is 'scary' is that the gfs_grow command paralyzed
> | that volume on all 3 nodes, and I couldn't access it, unmount it, or run
> | gfs_fsck from any of the nodes. We will do more testing on this. By the
> | way, do you have a suggested "safe" method of growing and shrinking the
> | volume other than what is noted in the 5.2 documentation (we followed
> | the RHEL manual)? If the GFS volume hangs, what is the best way to try
> | and unmount it from the node? Would 'gfs_freeze' have helped?
>
> Hi Alan,
>
> No, gfs_freeze won't help.  In these cases, it's probably best to
> reboot the node that caused the problem, by /sbin/reboot -fin or
> throwing the power switch I think.  I suspect that clvmd status
> hung because of the earlier problem.
>
> I'm not aware of any problems in your version of gfs_grow that can
> cause this kind of lockup.  It's designed to be run seamlessly while
> other processes are using the file system, and that's the kind of
> thing we test regularly.
>
> If you figure out how to recreate the lockup, let me know so I
> can try it out.  Of course, if this is a production cluster, I
> would not take it out of production for a long time to try this.
> But if I can recreate the problem here, I'll file a bugzilla
> record and get it fixed.
>
> Regards,
>
> Bob Peterson
> Red Hat Clustering & GFS
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>



-- 
Alan A.