[Cluster-devel] [PATCH 4/8] libgfs2: Improve and simplify blk_alloc_in_rg

Mon Jan 27 15:46:55 UTC 2014

----- Original Message -----
| This function included naive implementations of gfs2_setbit and
| gfs2_bitfit so these have been replaced with calls to those functions.
| The 'type' parameter has been replaced with 'state' and the type
| defines removed in favour of the GFS2_BLKST_* values. As these were the
| same for both meta and data block allocations, data_alloc() has been
| removed and its callers updated to use meta_alloc().
| 
| Signed-off-by: Andrew Price <anprice at redhat.com>

Hi,

I've been staring at this patch for a long time now.
The patch looks correct. However, my concern is this:

The fsck.gfs2 tool is designed to repair either GFS2 or GFS1 file systems,
and GFS1 uses bitmaps differently from GFS2. An indirect block, for example,
will be marked as a data block in a GFS2 file system, but as "meta" in GFS1
(which translates to unlinked dinode in a GFS2 bitmap).
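
To make the bitmap point concrete, here is a minimal sketch of how a 2-bit
block state is packed into a bitmap byte, four blocks per byte. The constants
are the GFS2 values from gfs2_ondisk.h; the sketch_* helper names are
hypothetical and are not the libgfs2 functions touched by the patch. The
GFS1 meanings of these values are deliberately not encoded here.

```c
#include <stdint.h>

/* GFS2 values as defined in gfs2_ondisk.h */
#define GFS2_BIT_SIZE       2
#define GFS2_BIT_MASK       3
#define GFS2_NBBY           4   /* blocks described per bitmap byte */
#define GFS2_BLKST_FREE     0
#define GFS2_BLKST_USED     1
#define GFS2_BLKST_UNLINKED 2
#define GFS2_BLKST_DINODE   3

/* Hypothetical helper: read the 2-bit state of block 'blk'. */
static int sketch_getbit(const uint8_t *bitmap, uint32_t blk)
{
	unsigned shift = (blk % GFS2_NBBY) * GFS2_BIT_SIZE;

	return (bitmap[blk / GFS2_NBBY] >> shift) & GFS2_BIT_MASK;
}

/* Hypothetical helper: overwrite the 2-bit state of block 'blk'. */
static void sketch_setbit(uint8_t *bitmap, uint32_t blk, int state)
{
	unsigned shift = (blk % GFS2_NBBY) * GFS2_BIT_SIZE;
	uint8_t *byte = &bitmap[blk / GFS2_NBBY];

	/* Clear the old 2-bit field, then store the new state. */
	*byte = (uint8_t)((*byte & ~(GFS2_BIT_MASK << shift)) |
			  ((state & GFS2_BIT_MASK) << shift));
}

/*
 * Hypothetical helper: naive linear search for the first block in the
 * given state. Returns nblks if no block is in that state.
 */
static uint32_t sketch_bitfit(const uint8_t *bitmap, uint32_t nblks, int state)
{
	uint32_t blk;

	for (blk = 0; blk < nblks; blk++) {
		if (sketch_getbit(bitmap, blk) == state)
			return blk;
	}
	return nblks;
}
```

The bitmap only ever stores one of four raw values per block, so whether a
value of 1 or 2 is "right" for an indirect block depends entirely on which
file system format is interpreting it.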

Of course, since this is upstream code, it's tempting to say that there
won't be any GFS1 file systems hanging around. However, in theory, users
might take a GFS1 file system from a legacy system and try to migrate it to
a newer system, upstream, RHEL7, Fedora, whatever, in which case they
want to use gfs2_convert. But gfs2_convert doesn't do any error checking,
so we recommend running fsck before the convert. However, on newer OSes,
there isn't a gfs_fsck, there's only fsck.gfs2, which should handle both.

So we need to ask some hard questions:

(1) Does it matter? If the fsck is only being run for the sake of sanity
    for gfs2_convert, allocating those "metadata" blocks as "data" blocks
    will likely be all right, since gfs2_convert will need to convert them
    anyway.
(2) From the patch, it looks like "data" and "meta" are being treated the
    same anyway, so is there a bug in today's fsck.gfs2? This might
    show up if there's a GFS1 file system that has enough damage to
    push a pile of directory entries into lost+found, thus increasing its
    size enough to add indirect blocks. Due to that same bug, fsck.gfs2
    might not catch the discrepancy, so we probably should check it by hand
    using gfs2_edit.
(3) If there is a bug with how fsck.gfs2 handles GFS1 bitmaps, do we need
    to fix it? Or is it too much work for too little gain?
(4) I've got a test case I run called fsck.gfs2.nightmare2.sh, which tests
    fsck against my entire collection of metadata sets, both GFS and GFS2.
    It can run for days, depending on the hardware. Do the GFS1 metadata
    sets still pass with this patch?

Regards,

Bob Peterson
Red Hat File Systems