Filesystem fragmentation and scatter-gather DMA

David Schwartz davids at webmaster.com
Mon Mar 17 06:05:48 UTC 2008


Jon Forrest wrote:

> 1) The greatest benefit from having a contiguous file would
> be when the whole file is read (let's stick with reads) in
> one I/O operation. That would result in the minimal amount of
> disk arm movement, which is the slowest part of a disk I/O
> operation. But, this isn't the way most I/Os take place. Instead,
> most I/Os are fairly small. Plus, and this is the kicker, on
> a modern multitasking operating system, those small I/Os are coming
> from different processes reading from different files. Assuming that the
> data to be read isn't in a memory cache, this means that the disk arm is
> going to be flying all over the place, trying to satisfy all
> the seek operations being issued by the operating system.
> Sure, the operating system, and maybe even the disk controller,
> might be trying to re-order I/Os but there's only so much of
> this that can be done. A contiguous file doesn't really help
> much because there's a very good chance that the disk arm is
> going to have to move elsewhere on the disk between the time
> that pieces of a file are read.

That's not really the issue. The issue is whether reading a chunk of a
file requires extra seeks or not. Further, in the vast majority of
cases there is only one I/O stream going on at a time. The disk will
read ahead, and if that read-ahead satisfies even a small fraction of
the subsequent I/Os the OS issues, that's a big win.
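
To make that concrete, here's a minimal sketch (mine, not from the
original post) of the single-stream sequential pattern being
described: one reader issuing small consecutive reads, which is
exactly what lets the drive's read-ahead absorb the cost. It assumes
a POSIX system; the file path is hypothetical.

    /* Sequential chunked reads of one file. If the file is laid out
     * contiguously, reads after the first are likely satisfied from
     * data the drive/kernel already fetched ahead, with no extra
     * seek. */
    #define _POSIX_C_SOURCE 200112L
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("/tmp/example.dat", O_RDONLY); /* hypothetical path */
        if (fd < 0) {
            perror("open");
            return 1;
        }

        /* Advisory hint that access will be sequential, so the
         * kernel can read ahead aggressively. */
        posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);

        char buf[64 * 1024];
        ssize_t n;
        while ((n = read(fd, buf, sizeof buf)) > 0)
            ; /* process buf here */

        if (n < 0)
            perror("read");
        close(fd);
        return 0;
    }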

> 3) Modern disks do all kind of internal block remapping so there's
> no guarantee that what appears to be contiguous to the operating
> system is actually really and truly contiguous on the disk. I have
> no idea how often this possibility occurs, or how bad the skew is
> between "fake" blocks and "real" blocks. But, it could happen.

Not bad enough to make a significant difference on any but a nearly-failing
drive.

> The mystery that really puzzles and sometimes frightens me is
> why an NTFS file system becomes fragmented so easily in the first
> place. Let's say I'm installing Windows 2000 on a newly formatted
> 20GB disk. Let's say that the total amount of space used by the
> new installation is 600MB. Why should I see any fragmented files,
> other than registry files, after such an installation? I have no
> idea. My thinking is that any file that isn't created and then
> later extended ought to be created contiguously to begin with.

Only if you're willing to leave big holes behind, which will rapidly lead to
a full disk and massive fragmentation. As files are being created, files are
also being deleted. There is no way for the OS to know ahead of time which
files are going to be around for a long time, so it has to mix the
short-term files with the long-term files. But, of course, once you
defragment a large chunk of non-changing files, they should stay that way.
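
On the "created contiguously to begin with" point: when an
application does know a file's final size up front, it can at least
hand the allocator the whole picture in one request instead of
growing the file piecemeal. A minimal sketch of that, assuming a
POSIX system with posix_fallocate(); the path and size are
hypothetical, and whether the filesystem actually returns one
contiguous extent is still up to its allocator.

    #define _POSIX_C_SOURCE 200112L
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("/tmp/prealloc.dat", O_WRONLY | O_CREAT, 0644);
        if (fd < 0) {
            perror("open");
            return 1;
        }

        /* Reserve 600 MB in a single call; the allocator can try to
         * pick one large contiguous run rather than extending the
         * file as writes trickle in. Note posix_fallocate() returns
         * an error number directly instead of setting errno. */
        int err = posix_fallocate(fd, 0, (off_t)600 * 1024 * 1024);
        if (err != 0) {
            fprintf(stderr, "posix_fallocate failed: %d\n", err);
            close(fd);
            return 1;
        }

        close(fd);
        return 0;
    }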

DS