
Re: [Linux-cluster] Fwd: GFS volume hangs on 3 nodes after gfs_grow



Thanks again for the prompt response, Bob.

I restored the nodes to a healthy state, and all three can access the GFS volumes again.
node3:
service gfs status
Configured GFS mountpoints:
/lvm_test1
/lvm_test2
Active GFS mountpoints:
/lvm_test1
/lvm_test2

node4:
service gfs status
Configured GFS mountpoints:
/lvm_test1
/lvm_test2
Active GFS mountpoints:
/lvm_test1
/lvm_test2

node2 (luci node):
service gfs status
Configured GFS mountpoints:
/lvm_test1
/lvm_test2
Active GFS mountpoints:
/lvm_test1
/lvm_test2
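
The same check can be scripted rather than read by eye. Below is a minimal Python sketch, assuming the `service gfs status` output always has the two-section layout shown above; the function names `parse_gfs_status` and `inactive_mountpoints` are hypothetical helpers, not part of any GFS tool. It reports any configured mountpoint that is not listed as active.

```python
def parse_gfs_status(text):
    """Split `service gfs status` output into configured and active sets.

    Assumes the layout shown in this message: a "Configured GFS mountpoints:"
    header, one mountpoint per line, then an "Active GFS mountpoints:" header.
    """
    sections = {"configured": set(), "active": set()}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Configured GFS mountpoints"):
            current = "configured"
        elif line.startswith("Active GFS mountpoints"):
            current = "active"
        elif line and current is not None:
            # Any non-empty line after a header is treated as a mountpoint.
            sections[current].add(line)
    return sections["configured"], sections["active"]


def inactive_mountpoints(text):
    """Return configured mountpoints that are not currently active, sorted."""
    configured, active = parse_gfs_status(text)
    return sorted(configured - active)


# Sample taken from the node output above.
sample = """Configured GFS mountpoints:
/lvm_test1
/lvm_test2
Active GFS mountpoints:
/lvm_test1
/lvm_test2
"""

print(inactive_mountpoints(sample))  # empty list: everything configured is mounted
```

On a node, the captured output of `service gfs status` would be passed in place of `sample`; a non-empty result means at least one configured volume is not mounted.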


I will try to reproduce the problem with gfs_grow.

One more question regarding GFS: what steps would you recommend (if any) for growing and shrinking an active GFS volume?

On Fri, Sep 26, 2008 at 12:44 PM, Bob Peterson <rpeterso redhat com> wrote:
----- "Alan A" <alan zg gmail com> wrote:
| Thanks again, Bob.
|
| No kernel panic on any of the nodes. I had to cold boot all 3 nodes in
| order to get the cluster going (it might have been a fence issue, but I
| am not 100% sure, since we use only SCSI fencing until we agree on a
| secondary fencing method). What is 'scary' is that the gfs_grow command
| paralyzed that volume on all 3 nodes, and I couldn't access it, unmount
| it, or run gfs_fsck from any of the nodes. We will do more testing on
| this. By the way, do you have a suggested "safe" method of growing and
| shrinking the volume other than what is noted in the 5.2 documentation
| (we followed the RHEL manual)? If the GFS volume hangs, what is the
| best way to try to unmount it from the node? Would 'gfs_freeze' have
| helped?

Hi Alan,

No, gfs_freeze won't help.  In these cases, it's probably best to
reboot the node that caused the problem, either with /sbin/reboot -fin
or by throwing the power switch, I think.  I suspect that clvmd status
hung because of the earlier problem.

I'm not aware of any problems in your version of gfs_grow that can
cause this kind of lockup.  It's designed to be run seamlessly while
other processes are using the file system, and that's the kind of
thing we test regularly.

If you figure out how to recreate the lockup, let me know so I
can try it out.  Of course, if this is a production cluster, I
would not take it out of production for a long time to try this.
But if I can recreate the problem here, I'll file a bugzilla
record and get it fixed.

Regards,

Bob Peterson
Red Hat Clustering & GFS

--
Linux-cluster mailing list
Linux-cluster redhat com
https://www.redhat.com/mailman/listinfo/linux-cluster



--
Alan A.
