[Cluster-devel] [GFS2 Patch] [Try 2] GFS2: Reduce file fragmentation

Bob Peterson rpeterso at redhat.com
Wed Jul 11 18:07:56 UTC 2012


----- Original Message -----
| Hi,
| 
| Generally looking much better, but some further comments are included
| below...
| 
| I'm still not sure that I understand fully the implications of the
| new
| allocation algorithm, but the code surrounding that core part of the
| patch is looking much better now.
(snip)
| Don't we always need to drop this data structure at this point,
| whether
| or not the reservation is active?
Yes we do. Good catch. Fixed in the latest version.

| Does this need to be a mutex? I think a spinlock would do just as
| well
| in this case since we never schedule while holding the lock.

Good suggestion; changed to a spinlock.
 
| Why are we adding an extra field just to pass to the tracepoint?
| Surely
| it is easier just to pass the inode if required, rather than this
| structure, then you can access all the required fields.

Good idea. Changed to pass a parameter to the trace point.
 
| What is the difference between rs_free and rs_blks? Shouldn't these
| two
| always be identical, since there is no point in reserving blocks
| which
| are not free.

I guess I used a misleading variable name. The two variables have
two meanings and both are needed. I renamed variable rs_blks to rs_len
because it represents the length of the reservation.

| Can we split this into two functions? That way we can have a
| __gfs2_tree_del() which doesn't do anything with the lock, and a
| gfs2_tree_del() which grabs and drop the lock and calls the former.

Good idea. Done.

| Use max_t here?

Good idea. Done.

| 
| I'm not sure that I understand this comment at all. Currently with
| directories we never deallocate any blocks at all until the directory
| is
| deallocated when it is unlinked. We will want to extend this to
| directories eventually, even if we don't do that immediately.

I clarified the comment to make it more clear what's going on.
I'm talking about gaps in the _reservation_ not gaps in the blocks.
The current algorithm makes assumptions based on the fact that block
reservations don't have gaps, and the "next" free block will be the
successor to the last claimed. If you use reservations for directories,
what can happen is that two files may be created, which claims two
blocks in the reservation. If the first file is deleted from the
directory, that block becomes a "hole" in the reservation, which breaks
the code with its current assumptions. We either have to:
(a) keep the current assumptions which make block claims faster, or
(b) Make no such assumptions and implement a bitmap-like search of the
    reservation that can fill holes. It wouldn't be too tough to do,
    especially since we already have nicely tuned functions to do it.
    I'm just worried that it's going to hurt performance.
 
| Is there some missing locking here? What protects the rb tree at this
| point?

Good catch; fixed in the latest and greatest.

I'll post the replacement patch as a separate email.

Regards,

Bob Peterson
Red Hat File Systems




More information about the Cluster-devel mailing list