
Re: How to generate a large file allocating space



On Thu, Nov 04, 2010 at 06:29:47PM +0000, Alex Bligh wrote:
> 
> >Well, I would personally not be against an extension to fallocate()
> >where if the caller of the syscall specifies a new flag, that might be
> >named FALLOC_FL_EXPOSE_OLD_DATA, and if the caller either has root
> >privs or (if capabilities are enabled) CAP_DAC_OVERRIDE &&
> >CAP_MAC_OVERRIDE, it would be able to allocate blocks whose extents
> >would be marked as initialized without actually initializing the
> >blocks first.
> 
> That sounds a lot like "send patches" which I just might do, if only
> to gain better understanding as to what is going on.

Patches to do this wouldn't be that hard.  The harder part would
probably be the politics on fs-devel regarding the semantics of
FALLOC_FL_EXPOSE_OLD_DATA.

> I seem to remember (from lwn's summary of lkml) that the proposed
> options for fallocate() got a bit baroque to start with, and people
> then simplified down to zero options. Perhaps that was a simplification
> too far.

It was simplified down to one flag.  But that means we have a flags
field we can use to extend fallocate.

> In the mean time, particularly as I'd ideally like to avoid a kernel
> modification, is there a safe way I could use or modify the ext2
> library to run through the extents of a fallocated() file and clear
> the "unwritten" bit? If I clear that (which from memory is the top
> bit of the extent length), is that alone safe? (on an unmounted
> file system, obviously).

Yes, there most certainly is.  The functions you'd probably want to
use are ext2fs_extent_open(), and then either ext2fs_extent_goto() to
go to a specific extent or ext2fs_extent_get() with the
EXT2_EXTENT_NEXT operation to iterate over the extents, and then
ext2fs_extent_replace() to mutate the extent.  Oh, and then use
ext2fs_extent_free() when you're done looking at and/or changing the
extents of a file.

If you build tst_extents in lib/ext2fs, you can use commands like
"inode" (to open the extents for a particular inode), and "root",
"current", "next", "prev", "next_leaf", "prev_leaf", "next_sibling",
"prev_sibling", "delete_node", "insert_node", "replace_node",
"split_node", "print_all", "goto", etc.  Please don't use this in
production, but it's not a bad way to play with an extent tree, either
for learning purposes or to create test cases.  tst_extents.c is also
a good way of seeing how the various libext2fs extent APIs work.

> I would tend to agree that replicating across commodity disks is
> in almost all cases a better technological solution, but the
> technology is still further away from readiness there. Sadly
> technological arguments don't always win the day, and we need
> something in the mean time...

Well, things like Hadoopfs exist today; Ceph (if you need POSIX-level
access) is admittedly less stable.  But if you're starting from
scratch, wouldn't that be pretty far away from readiness as well?

     	      	       	       	      	  - Ted

