[linux-lvm] Re: ext2resize
- From: Andreas Dilger <adilger enel ucalgary ca>
- To: buytenh dsv nl
- Cc: linux-lvm msede com (Linux LVM mailing list), linux-fsdevel vger rutgers edu (Linux FS development list)
- Subject: [linux-lvm] Re: ext2resize
- Date: Wed, 30 Jun 1999 01:34:34 -0600 (MDT)
Lennert Buytenhek writes:
> Correct. ext2 is divided into block groups, which are 8MB
> big when using 1k blocks. A block group looks like this:
> 1 block superblock
> ? blocks group descriptor table
> 1 block block bitmap
> 1 block inode bitmap
> ? blocks inode table
> ? blocks data blocks (the bulk of the group)
I was emailing with Mike Field about this, and according to the
definition of ext2_group_desc in ext2_fs.h, it should be possible to
set the location of the block bitmap, inode bitmap, and inode table
anywhere in the group, and have the data blocks follow. If you set the
pointers to these structures to start, say, 33 blocks into the group
(1 superblock block plus 32 GDT blocks), this would allow you to grow
the GDT to handle an 8GB filesystem (with 1k blocks) before a reorg
(block moving) is necessary.
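
As a rough check on that arithmetic, here is a minimal Python sketch. It
assumes the classic ext2 layout (32-byte group descriptors, one block
bitmap block per group, so a group holds 8 * block_size blocks); the
function names are made up for illustration and are not from e2fsprogs
or the kernel.

```python
# Sketch: how far a pre-reserved group descriptor table (GDT) lets ext2 grow.
# Assumption: 32-byte group descriptors and one bitmap block per group.

GROUP_DESC_SIZE = 32  # bytes per ext2 group descriptor


def blocks_per_group(block_size: int) -> int:
    """A bitmap block has 8 * block_size bits, one bit per block in the group."""
    return 8 * block_size


def group_bytes(block_size: int) -> int:
    """Size of one full block group in bytes."""
    return blocks_per_group(block_size) * block_size


def max_fs_bytes(gdt_blocks: int, block_size: int = 1024) -> int:
    """Largest filesystem that `gdt_blocks` descriptor blocks can describe."""
    descs_per_block = block_size // GROUP_DESC_SIZE
    max_groups = gdt_blocks * descs_per_block
    return max_groups * group_bytes(block_size)


if __name__ == "__main__":
    # Bitmaps start 33 blocks into the group: 1 superblock + 32 GDT blocks.
    first_bitmap_offset = 33
    gdt_blocks = first_bitmap_offset - 1
    print(max_fs_bytes(gdt_blocks) // 2**30, "GiB with 32 GDT blocks")
    print(max_fs_bytes(1) // 2**20, "MiB per single GDT block")
```

With 1k blocks, one GDT block describes 32 groups of 8MB each, which is
where the 256MB step size and the 8GB limit for 32 GDT blocks come from.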
I looked into the code in e2fsprogs/lib/ext2fs (openfs.c, initialize.c)
and the kernel (fs/ext2/balloc.c). It looks like, while an ext2 reader
will only (currently) calculate desc_blocks based on the number of group
descriptors and the block size, it will gladly use the values supplied
on disk (the group descriptors for the location of the block bitmap,
inode bitmap, and inode table, and the superblock for the block counts).
That means we can leave a "gap" after the GDT for future growth (NB -
need to check e2fsck for what it does). If you "fix" initialize.c to
have a larger number of desc_blocks than the minimum needed, existing
kernels and e2fsck should work OK with this, which is a big plus. Your
ext2resize could also do this without actually "growing" the
filesystem - just get it ready to do so if needed.
When it comes time to grow the filesystem, all you need to do is:
0) expand the LV/partition/md/loopback file/etc. to be larger.
1) userland - write the new FS data (superblock, GDT, inode bitmaps,
   inode blocks, etc.) into the new groups. This is what
   mke2fs + ext2extend from ext2-volume does to a new disk. It
   should be relatively straightforward, maybe a new flag to
   mke2fs which says "start writing X groups into the FS". The
   only real issue is the last group, which appears to be able to
   NOT have a superblock or GDT, which is a BIG problem...
2) userland - write any needed values into the "spare" GDT for each
   existing group. Since this is likely constant, it could
   even be done long in advance (e.g. at FS creation, or by
   ext2_offline_resize). There should be no worry about this
   space being overwritten by the kernel, since it will never
   read or write these blocks.
3) userland - write the new FS configuration into all "extra"
   superblocks, updating blocks_count, free_blocks, r_blocks_count,
   inodes_count, free_inodes_count, groups_count. Again, hopefully
   no worries about overwriting this on a running system because the
   kernel shouldn't touch these on an open filesystem.
4) lock FS in kernel
5) kernel - update kernel superblock data with new FS config as in (3).
   May need to "realloc" the GDT tables in memory, as the kernel
   will only have allocated enough based on the old GDT size (or so it
   looks in my 2.0.36 balloc.c).
6) kernel - write primary superblock to disk. This is the "real"
   copy, and the other superblocks are only estimates that will be
   overwritten when the FS is unmounted, I believe. If the system
   crashes without an FS unmount, then the primary superblock should
   be used on remount anyway, and e2fsck will fix the others?
7) unlock FS in kernel
8) userland - proceed to use new space in FS ;-)
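
To make the bookkeeping in step (3) concrete, here is a small Python
sketch of how the counters might change when whole, empty groups are
appended. Everything in it is an assumption for illustration: the class
and function names are made up, the field names just follow the list in
step (3), overhead_per_group stands for whatever metadata each new group
carries (superblock copy, GDT copy, bitmaps, inode table), and scaling
the reserved block count to keep the same fraction is only one possible
policy.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SuperblockCounts:
    """Hypothetical subset of the counters named in step (3)."""
    blocks_count: int
    free_blocks_count: int
    r_blocks_count: int
    inodes_count: int
    free_inodes_count: int
    groups_count: int


def grow_counts(sb: SuperblockCounts, new_groups: int, blocks_per_group: int,
                inodes_per_group: int, overhead_per_group: int) -> SuperblockCounts:
    """Return the counters after appending `new_groups` full, empty groups."""
    if new_groups < 0:
        raise ValueError("this sketch only handles growing")
    new_blocks = sb.blocks_count + new_groups * blocks_per_group
    # Only the non-metadata blocks of each new group become free space.
    added_free = new_groups * (blocks_per_group - overhead_per_group)
    added_inodes = new_groups * inodes_per_group
    # Assumption: keep the reserved-block fraction the same as before.
    new_reserved = sb.r_blocks_count * new_blocks // sb.blocks_count
    return replace(
        sb,
        blocks_count=new_blocks,
        free_blocks_count=sb.free_blocks_count + added_free,
        r_blocks_count=new_reserved,
        inodes_count=sb.inodes_count + added_inodes,
        free_inodes_count=sb.free_inodes_count + added_inodes,
        groups_count=sb.groups_count + new_groups,
    )


if __name__ == "__main__":
    old = SuperblockCounts(262144, 100000, 13107, 65536, 60000, 32)
    print(grow_counts(old, new_groups=32, blocks_per_group=8192,
                      inodes_per_group=2048, overhead_per_group=260))
```

The same computed values would be written to the backup superblocks in
step (3) and handed to the kernel in step (5), so both sides agree on
the new geometry before the primary superblock is written in step (6).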
> This is what ext2resize basically does (when enlarging).
> But you'll need a way to get this through to the kernel (it
> has its own superblock copy). I haven't really looked at
> the volume patch very well.
As I suggested to Mike, it may be desirable to have two different
implementations - an online resize which will not do much (if any) block
moving, and can only resize up to the next 256MB boundary (or
pre-allocated GDT size), and an offline resize which will do things like
renumber inode and data blocks, remove inodes, add GDT blocks, etc.
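
The online/offline split could be decided by a check like the one
sketched below in Python. It is only an illustration under the usual
assumptions (32-byte group descriptors, 8 * block_size blocks per group,
every partial group counted as a group); the function names are
hypothetical, and allocated_gdt_blocks means however many GDT blocks
were set aside when the filesystem was created or prepared.

```python
GROUP_DESC_SIZE = 32  # bytes per ext2 group descriptor (assumed)


def groups_for(size_bytes: int, block_size: int = 1024) -> int:
    """Number of block groups needed to cover `size_bytes` (rounded up)."""
    group_bytes = 8 * block_size * block_size
    return -(-size_bytes // group_bytes)  # ceiling division


def desc_blocks_needed(groups: int, block_size: int = 1024) -> int:
    """GDT blocks needed to hold `groups` descriptors (rounded up)."""
    descs_per_block = block_size // GROUP_DESC_SIZE
    return -(-groups // descs_per_block)


def can_resize_online(target_bytes: int, allocated_gdt_blocks: int,
                      block_size: int = 1024) -> bool:
    """True if the target size fits in the GDT blocks already set aside,
    so no block moving (and hence no unmount) would be required."""
    groups = groups_for(target_bytes, block_size)
    return desc_blocks_needed(groups, block_size) <= allocated_gdt_blocks


if __name__ == "__main__":
    MiB = 2**20
    print(can_resize_online(256 * MiB, allocated_gdt_blocks=1))
    print(can_resize_online(257 * MiB, allocated_gdt_blocks=1))
```

With a single GDT block and 1k blocks, anything up to 256MB passes the
check, and the first byte beyond it needs a second GDT block, which is
the point where the offline tool would have to take over.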
Mike had also suggested that when we are doing a major FS (offline)
reorg, we could start removing blocks from the inode table instead of
data blocks as there are usually free inodes in each group, but not
always data blocks...
> You can remount an fs RO, ext2resize it, and remount it RW methinks.
This would likely break many programs, as they would fail for the time
it is in RO mode. A more pleasant solution is to only allow growth to a
pre-determined limit online (with a kernel lock), and then force the
user to unmount the FS to do block shuffling.
> About shrinking an existing fs: this would be even
> messier. (Involves moving inodes around, and those
> inodes might be in core. Et cetera. Hell on earth :-)
> But growing an fs might be messy too, because of
> the growing group descriptor table.
I don't think shrinking a FS online is as big a need as growing it, and
this can be left for a utility that works when the FS is unmounted.
Andreas Dilger University of Calgary \ "If a man ate a pound of pasta and
Micronet Research Group \ a pound of antipasto, would they
Dept of Electrical & Computer Engineering \ cancel out, leaving him still
http://www-mddsp.enel.ucalgary.ca/People/adilger/ hungry?" -- Dogbert