[Linux-cluster] GFS volume hangs on 3 nodes after gfs_grow

Thu Sep 25 19:46:15 UTC 2008

----- "Alan A" <alan.zg at gmail.com> wrote:
| Hi all!
| 
| I have 3 node test cluster utilizing SCSI fencing and GFS. I have made
| 2 GFS Logical Volumes - lvm1 and lvm2, both utilizing 5GB on 10GB disks.
| Testing the command line tools I did lvextend -L +1G /devicename to
| bring lvm2 to 6GB. This went fine without any problems. Then I issued
| command gfs_grow /mountpoint and the volume became inaccessible. Any
| command trying to access the volume hangs, and umount returns:
| /sbin/umount.gfs: /lvm2: device is busy.
| 
| Few questions - Since I have two volumes on this cluster and the lvm1
| works just fine, would there be any suggestions to unmounting lvm2 in
| order to try and fix it?
| Is gfs_grow - bug free or not (use/do not use)?
| Is there any other way besides restarting the cluster/ nodes to get
| lvm2 back in operational state?
| -- 
| Alan A.

Hi Alan,

Did you check in dmesg for kernel messages relating to the hang?
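
A rough sketch of that check, in case it helps: the function below keeps
only kernel log lines that mention GFS, DLM, or a lock module. The function
name and the match pattern are my own assumptions, not anything the tools
define, so widen the pattern if your messages look different.

```shell
# Hypothetical helper: keep kernel log lines that mention GFS, DLM,
# or a lock module (lock_dlm, lock_nolock, ...).
# The pattern is an assumption; adjust it to match what your kernel logs.
filter_gfs_messages() {
    grep -Ei 'gfs|dlm|lock_'
}

# Usage, on each node of the cluster:
#   dmesg | filter_gfs_messages
```

Run it on all three nodes, since the node that ran gfs_grow may not be the
only one that logged something.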

I have seen some bugs in gfs_grow, and there are some fixes that
haven't made it out to all users yet, but you did not tell us which
version of the software you're using.  You didn't even say whether
this is RHEL4/CentOS4, RHEL5/CentOS5, or another distro.
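
Something along these lines would collect that information on an RPM-based
system. This is only a sketch: the function name is made up, and the package
name pattern is an assumption, because the GFS and cluster package names
differ between releases.

```shell
# Hypothetical helper: from a list of installed packages on stdin, keep
# the ones that look like GFS or cluster-stack packages.
# The name pattern is an assumption; package names vary by release.
filter_cluster_packages() {
    grep -Ei 'gfs|cman|lvm2-cluster|rgmanager' | sort
}

# Usage:
#   uname -r                         # running kernel version
#   cat /etc/redhat-release          # distro release (RHEL/CentOS)
#   rpm -qa | filter_cluster_packages
```

Posting that output along with the dmesg messages makes it much easier to
tell whether you already have the fixed code.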

I'm not aware of any bugs in the most recent gfs_grow in the cluster
git repository.  If you're not compiling from the source code, those
gfs_grow fixes will trickle out to the various releases over time,
so you may or may not have the fixed code.

If your software is not recent, it's likely that an interrupted or
hung gfs_grow will end up corrupting the GFS file system.  There is
a new, improved version of gfs_fsck that can repair the damage, but
again, you need a recent version of the software.

Regards,

Bob Peterson
Red Hat Clustering & GFS