[Linux-cluster] Re: corrupted GFS filesystem

Bob Peterson rpeterso at redhat.com
Thu Aug 14 13:52:28 UTC 2008


On Wed, 2008-08-13 at 23:37 -0500, David Potterveld wrote:
> I have a corrupted GFS filesystem, and gfs_fsck is unable to fix it.
(snip)
> I have the feeling it's only slightly damaged, but I don't know the
> GFS disk structure, and I haven't a clue what to do next. The GFS
> filesystem was created with the default resource group size.
> 
> Any suggestions how to proceed? Is there any way to recover the
> filesystem? This is the storage for a mail server. I have a backup,
> but I would lose 24 hours of inbound mail, and I'd really hate to do
> that.
> 
> Thanks!
> David Potterveld   (davep at core.com)

Hi David,

Can you post the dmesgs or console messages received that reported
the original file system damage?

Also, you didn't say whether this is RHEL4/CentOS4, RHEL5/CentOS5,
or an equivalent.  That's always helpful to know.

I can't imagine what could have deleted all your resource groups;
I've never seen that happen before.  The only explanations I can
think of are a hardware problem or some other unprotected process
wiping out metadata on the file system.

If the resource groups disappear, that's big trouble.  That's
likely why gfs_fsck thought your root directory was gone.

From the superblock, I can tell that the gfs_fsck quit abnormally.
So before you can mount again, you'll have to do:
gfs_tool sb /dev/VGsan0/lvsan0 proto "lock_dlm"
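To make that step a little safer, here is a minimal sketch that first shows the current lock protocol and then sets it back. It assumes that "gfs_tool sb <device> proto" with no new value prints the current setting, and that the file system is not mounted on any node. The run() helper and the DRY_RUN variable are made up for illustration; they are not part of gfs_tool. gfs_tool may also ask for confirmation before changing the superblock.

```shell
#!/bin/sh
# Sketch only: inspect and reset the lock protocol in a GFS superblock.
# DEVICE defaults to the device named in this thread.
# run() and DRY_RUN are illustrative helpers, not part of gfs_tool.
DEVICE="${DEVICE:-/dev/VGsan0/lvsan0}"

run() {
    # Print the command; execute it only when DRY_RUN is not set to 1.
    echo "+ $*"
    [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

reset_lockproto() {
    # Show the current value (assumed: no new value means "view only").
    run gfs_tool sb "$DEVICE" proto || return 1
    # Put the lock protocol back to lock_dlm so the file system can mount.
    run gfs_tool sb "$DEVICE" proto lock_dlm
}
```

Running "DRY_RUN=1 reset_lockproto" only prints the two commands, which is a cheap way to check the device path before touching the superblock.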

Unfortunately, since you ran gfs_fsck several times, you've probably
wiped out all information about how the file system got into that
condition, so we may never know how this happened.
You might have to restore from backup after doing gfs_mkfs again.

If you get a file system withdraw on gfs due to corruption, I
recommend that, before you run gfs_fsck, you save the metadata by
doing this:

gfs2_edit savemeta /dev/your/device /tmp/devicename.metadata
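
As a rough sketch of "save first, fsck afterwards", the command above can be wrapped so that an earlier save is never overwritten by accident. The device and output paths are the same placeholders as above. The run() and save_metadata() helpers and the DRY_RUN variable are hypothetical names used only for this example.

```shell
#!/bin/sh
# Sketch only: save GFS metadata before any gfs_fsck run.
# DEVICE and OUT are placeholders; run() and DRY_RUN are illustrative.
DEVICE="${DEVICE:-/dev/your/device}"
OUT="${OUT:-/tmp/devicename.metadata}"

run() {
    # Print the command; execute it only when DRY_RUN is not set to 1.
    echo "+ $*"
    [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

save_metadata() {
    # Refuse to overwrite an earlier save: it may be the only copy of
    # the pre-gfs_fsck state.
    if [ -e "$OUT" ]; then
        echo "refusing to overwrite $OUT" >&2
        return 1
    fi
    run gfs2_edit savemeta "$DEVICE" "$OUT"
}
```

Call save_metadata once, before the first gfs_fsck run; with DRY_RUN=1 it only prints the gfs2_edit command it would run.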

There is a RHEL4 version of the gfs2_edit tool on my people page,
which is here:

http://people.redhat.com/rpeterso/Experimental/RHEL4.x/

(That directory contains both the source code and a 32-bit compiled
binary.)

For RHEL5, I recommend compiling the gfs2_edit code from source
code, because it will be more up to date.  (Older versions of
gfs2_edit don't do as good a job saving GFS (1) metadata.)

The advantage of saving the metadata is that if something goes
wrong, you can restore the file system back to its original
pre-gfs_fsck condition.  Also, if you open up a bugzilla record
so we can try to solve the problem, we can use that metadata to
understand what's wrong with the file system and what happened
to corrupt it.  (Please remember to bzip2 the metadata before
sending it in.  Even compressed, it will likely be too big to
attach, so you might need to put it on a web site or FTP server.)
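
For the compression step, a minimal sketch follows. The function name compress_metadata is made up for this example, and the file passed to it is whatever path was given to savemeta. It keeps the uncompressed original, checks the archive, and lists both sizes so you can judge whether the result is small enough to attach.

```shell
#!/bin/sh
# Sketch only: compress a saved metadata file before uploading it.
compress_metadata() {
    meta="$1"
    if [ ! -f "$meta" ]; then
        echo "no such file: $meta" >&2
        return 1
    fi
    # -k keeps the original file, -9 selects the largest block size.
    bzip2 -k -9 "$meta" || return 1
    # -t tests the integrity of the compressed file.
    bzip2 -t "$meta.bz2" || return 1
    # Show both sizes side by side.
    ls -l "$meta" "$meta.bz2"
}
```

For example: compress_metadata /tmp/devicename.metadata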

In this case, saving your metadata is probably useless because
it's been changed so much from the original problem by your
running gfs_fsck so many times.

Regards,

Bob Peterson
Red Hat Clustering & GFS




