
Re: htree stability and performance issues



> It depends on the workload.  Things which do readdir scans of
> directories followed by a stat or an open of all of the files in the
> directory actually do worse with htree, because readdir() no longer
> returns files in the order they were created.  This means the inodes
> get opened in random order, which means inode lookups that don't make
> the cache will on average require reading in a new inode table block,
> whereas if you read inode 1000, 1001, 1002, 1003, etc., they will all
> be from the same inode table block.  This can be fixed if you modify
> your application to pull all of the filenames using readdir, and then
> sort the files by inode number before trying to open or stat them.   

Makes sense, this is exactly what the app would be doing.
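
For reference, here is a minimal sketch of the reordering described above:
read all the names first, sort them by inode number, and only then stat
them. It is written in Python purely for brevity; the function names
entries_by_inode and stat_all are made up for illustration and are not
taken from any existing tool. It assumes a Linux system, where the inode
number is available from the directory entry itself.

```python
import os


def entries_by_inode(path):
    """Return (inode, name) pairs for everything in path, sorted by inode."""
    with os.scandir(path) as it:
        # On Linux, DirEntry.inode() is taken from the directory entry,
        # so gathering the numbers does not read the inode table.
        pairs = [(entry.inode(), entry.name) for entry in it]
    pairs.sort()
    return pairs


def stat_all(path):
    """lstat every entry in path, visiting them in ascending inode order."""
    results = {}
    for _inode, name in entries_by_inode(path):
        # Neighbouring inode numbers tend to share an inode table block,
        # so this order should need fewer distinct block reads.
        results[name] = os.lstat(os.path.join(path, name))
    return results
```

Calling stat_all("/some/dir") returns a dict mapping each name to its
stat result. The whole listing is held in memory before anything is
stat'ed, which is the cost of doing the sort in userspace.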

> This has to be done in userspace because a directory can be
> arbitrarily big, so we can't do it in the kernel.  However, for people
> who don't want to modify their application, I do have an LD_PRELOAD
> module which you can try using that should also do the trick (see
> attached).

Could you resend please - it didn't come through.

> OK, this is weird.  Was the reboot a clean reboot, or an unclean
> shutdown?  The fact that e2fsck didn't report any errors is very
> curious, since normally both of these errors would be instantly picked
> up by e2fsck.

Definitely a clean reboot - the partition was unmounted. No power
cord pulling occurred.

> The weird errors on non-htree enabled partitions are normally caused
> by unexpected crap in an indirect block or in the inode table, again,
> e2fsck finds those sorts of problems, so if e2fsck didn't find it, the
> corruption was in the cached copy in memory only.  Normally this
> points to hardware problems, but if it was only happening with the
> htree kernel, that is very curious.  I don't see how the htree patches
> could have caused such an effect.

I agree. The error mostly occurred with the 'df' command. It was exiting
with a 'bus error'. I couldn't even ldd it.

The above-mentioned errors occurred whenever I performed any operation
on the file. I haven't experienced any other issues with the server at
all - it has been rock solid. It's only with the htree patch that the
problem occurred.

> Yup, it's the latest that is released.  I have some patches internally
> against 2.4.23, but they don't have any additional changes or bugfixes
> over what was in 2.4.21rc5.  There may have been some additional
> patches that went into 2.6, but I do keep an eye for them and push
> them into the 2.4 backport patches that I maintain.

So htree workloads are best suited for applications that know the name
of the file in advance? From your comments above I would assume that
even 'ls' would perform worse?



