Re: ext3 on 2.4.4


On Thu, May 31, 2001 at 12:21:50AM -0400, Bryan J. Smith wrote:

> aspects of the VFS and internals.  I have been warning all
> colleagues to carefully consider the relatively new state of 2.4
> itself before looking at some of the JFS options on 2.4.  It still
> amazes me on how many, supposedly "experienced," administrators
> immediately jump feet first into a new major kernel release, or a
> RedHat (X+1).0 release and expect everything to work perfectly.

Right.  2.4 is actually a big leap forward in many ways.  A lot of the
core networking and VFS code is so much cleaner than the 2.2 code that
it is simply more robust with fewer corner cases to go wrong, and with
all of the effort that the distributions put into getting 2.4 stable,
the core is looking pretty good.

BUT: there are so many new drivers, there is so much new SMP
fine-grained locking, there is so much new VM behaviour that you just
cannot expect to be safe deploying 2.4 yet unless you have qualified
it in your own environment.  I am finding 2.4 very good on the
desktop, and a lot of the VM performance problems of earlier 2.4 seem
to be getting much better in 2.4.5.  

It's still a work in progress (as is 2.2 to a much lesser extent!), in
other words, but 2.4 is enormously better right now than the early 2.2
kernels were at this stage.  2.4 is definitely worth evaluating.  It
is definitely not worth installing blindly on a mission-critical server
with the assumption that it is guaranteed to work right first time.
It might well work flawlessly, but it's a gamble to assume that you
won't have any trouble at all.  

That's not a criticism of 2.4: it is a big change from 2.2 in many
ways and you don't make *any* massive change like that without proper
testing if you value your server.  There are some huge server farms
now running 2.4 very successfully.  It's just a change which needs to
be managed like any other big change.

> Which brings me to my more "political/non-technical/touchy"
> question, probably something that you normally avoid for
> obvious reasons.
> I have heard that you have dropped all intentions to extend Ext3
> beyond the capabilities/features of Ext2/VFS. 

I don't know where you heard that!  We have a number of really
interesting things happening in ext2 right now --- large fast
directories and efficient handling of small files, ACL support, and
beyond that we want to add extended attributes and extent mappings.
I definitely want to keep ext3 stable, which means that working out
how to release the extended versions will be an issue, but I certainly
want to see ext3 taking advantage of these new features.

There are also ext3-specific features.  Ext3 already offers
data-journaling, which allows synchronous data writes to be spooled to
the journal sequentially.  So, synchronous write traffic such as
sendmail spools or NFS servers can obtain fully consistent,
synchronous data and metadata writes without seeking on the disk.
Combine that with ext3 support for journaling to a separate spool disk
and certain workloads will be enormously faster (we've already
demonstrated external spool disks using a prototype ext3 variant which
ran noticeably faster than ext2).
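
To make that workload concrete, here is a minimal sketch in Python of
the kind of synchronous spool write a mail server performs.  The
function name, file naming scheme and permissions are hypothetical,
chosen only for illustration; nothing here is ext3-specific.  The
application issues the same calls on any filesystem.  What differs on
ext3 with data journaling (the data=journal mount option) is where
the fsync is satisfied: the data and metadata can go to the journal
sequentially rather than requiring seeks across the disk.

```python
import os


def spool_message(spool_dir, name, payload):
    """Write payload into spool_dir/name; return once it is on stable storage.

    Hypothetical helper for illustration only: it mimics the
    write-then-fsync pattern of a mail spooler.
    """
    tmp_path = os.path.join(spool_dir, name + ".tmp")
    final_path = os.path.join(spool_dir, name)

    # Create the temporary file exclusively so a stale file is never reused.
    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        view = memoryview(payload)
        while view:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        # Block until the file's data and metadata reach stable storage.
        os.fsync(fd)
    finally:
        os.close(fd)

    # Atomically publish the message under its final name.
    os.rename(tmp_path, final_path)

    # Sync the directory too, so the rename itself is durable.
    dir_fd = os.open(spool_dir, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
    return final_path
```

Each call makes two fsync requests, one for the file and one for the
directory.  That is exactly the sort of traffic where the cost of
seeking dominates.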

> these "misguided" individuals, I find myself trying to break down
> all the capabilities, features, concerns, approaches (e.g., Ext3 =
> evolutionary, ReiserFS = revolutionary, XFS = existing ports) of the
> various JFS options for Linux.

Actually, reiserFS *is* revolutionary.  That doesn't come without a
price: it takes much more CPU than ext2 (we see ext2 being
substantially faster than any of the journaling filesystems on slow
CPUs) and the code is a lot more complex.  Right now all that the core
reiserFS code in 2.4 offers in terms of features for that complexity
is the journaling and the small-file/large-dir efficiency.  That's
only an evolutionary step forward in terms of features.  But the real
change there will be the advent of plugins that Hans is working on for
reiserFS-4: plugins with the ability to take advantage of
database-type semantics in the underlying storage services look
really, really exciting, and certainly qualify as revolutionary
rather than evolutionary.

> So, with that said, where do you see SGI's XFS, amidst your own Ext3
> efforts as well as ReiserFS being in the stock kernel as of 2.4.1?  I
> would be very interested in hearing your opinion from your kernel
> developer viewpoint, your RedHat employee viewpoint and your overall
> technical viewpoint.

I tend just to wear one hat in public: I'm a kernel developer, my
technical viewpoint is coloured by that, and as a Red Hat employee I am
*still* a kernel developer!

I really think that the biggest advantage Linux has to offer its users
is choice.  That means that I don't want Linux to displace all other
systems: we don't want to be everything to everybody so much as to
give people alternatives.  Having multiple different filesystems
available is a good thing if it allows us to try different approaches
and see which works for which environments.  

That means that having evolutionary and revolutionary filesystems at
the same time is a Good Thing.  Stability is my number 1 priority for
ext3.  That means that ext2 filesystems are 100% compatible with
ext3: you never need to reformat.  It means ext3 takes advantage of
the tried-and-tested ext2 fsck code.  It means that I don't want to
break any existing ext2 functionality such as NFS or quotas, and that
I reuse existing tested kernel code wherever possible, and try not to
introduce performance bottlenecks that degrade existing ext2
performance.  Other things such as the hash-indexed directory code are
important but secondary to the stability concern.  

That means that ext3 will *never* be btree-based.  If you want to
explore the possibilities that a radical new filesystem structure
offers, you need to make a different choice.  Choice is good.

> As a kernel developer, what do you see lacking
> with XFS at this time

Feature-wise?  XFS is a pretty good filesystem.  It's a bit
heavyweight for some things, but I don't see it as "lacking" anything
compared to the alternatives on Linux.

