[Fedora-livecd-list] [PATCH] cleanupDeleted and container_size - saves 5.5% output size on f7livecd-i686

Phillip Lougher phillip at lougher.demon.co.uk
Sat Aug 18 03:18:14 UTC 2007


Douglas McClendon wrote:

> 20% speedup of reading data into /dev/null isn't all that useful.

I could write an irritable comment here, but I won't rise to the
bait :-)

Writing to /dev/null factors out any overhead due to writing.  Therefore
any time improvements are purely down to sparse file reading, which
is what my tests were aimed at measuring.  You and anyone
else are welcome to carry out any additional tests in 'real-life'
scenarios, and I'll be interested in the results.
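
As a rough illustration of the kind of measurement described above (a minimal sketch, not the actual test used for the figures in this thread), the following reads a file and discards the data, much like `cat file > /dev/null`.  The helper names `make_sparse_file` and `time_read_discard` are made up for this example.

```python
import os
import time

def make_sparse_file(path, size):
    """Create a file of `size` bytes consisting of a single hole (no data written)."""
    with open(path, "wb") as f:
        f.truncate(size)

def time_read_discard(path, chunk=128 * 1024):
    """Read the whole file, write it to the null device, and time the copy.

    Returns (bytes_read, elapsed_seconds).  Because the sink is the null
    device, the elapsed time is dominated by the read side.
    """
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as src, open(os.devnull, "wb") as sink:
        while True:
            buf = src.read(chunk)
            if not buf:
                break
            sink.write(buf)
            total += len(buf)
    return total, time.perf_counter() - start
```

Timing the same file on two differently built filesystems, with caches dropped in between, would give a comparison of read cost alone.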

> 
> You didn't really mention what you were planning on using it for. 

Personally I have very little use for sparse file handling.  Sparse
file handling is simply part of my on-going improvements to Squashfs.
Correct sparse file handling has been one of my desired improvements for
more than two years, but has always been postponed for other higher
priority tasks.

Downloading a Fedora liveCD and noticing that it used sparse files
made me decide to bump the priority of adding sparse file handling
to Squashfs.  The announcement of a Fedora liveCD was interesting
because for a long time Fedora was badly lacking one.  I was
pleased to discover it used Squashfs, not so pleased to discover
it used sparse files.


> Presumably you were also aware of my turboLiveInst patch, which 
> accomplishes the 20% speedup in an actual copy situation.

I seem to have barged into an on-going argument.  I read this thread about
sparse file handling in Squashfs, and read about your turboLiveInst patch.
It seems a clever and creative solution to the problem, but I am not really
qualified to judge, having no involvement in the Fedora liveCD except insofar
as I wrote Squashfs.

In fact I only came across this thread during a search to see if anyone
was complaining about the lack of sparse file support in Squashfs.

> 
> For reasons that I can only speculate about, Jeremy and everyone else 
> seem to have no interest in my turboLiveInst optimization approach. 
> Perhaps this method will be more palatable for them for some reason.

Politics, not my cup of tea really.

> 
> The other thing that I'm curious about is the performance impact of 
> moving to 128 Kbyte blocks.  I presume that is the compressed data size. 
>  I would like to see whether, in some typical usage scenarios, 
> that has a detrimental impact on desktop and general performance during 
> livecd operation.  (i.e. for every 5 Kbyte text file read, it needs to 
> uncompress 128 Kbytes of compressed data, adding ram and latency penalties).
> 

Yes, spot on.  Larger blocks add greater latency and larger ram usage,
but improve compression ratio.  It is very much a trade-off, and the
right choice depends on what a Squashfs filesystem user values more.
There is unfortunately no silver bullet here.
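
The trade-off can be sketched in a few lines.  This is only a toy model that assumes zlib and independently compressed fixed-size blocks; it is not how mksquashfs is implemented, and the function names are invented for the example.

```python
import zlib

def compressed_total(data, block_size):
    """Total compressed size when `data` is cut into blocks of `block_size`
    bytes and each block is compressed on its own."""
    total = 0
    for start in range(0, len(data), block_size):
        total += len(zlib.compress(data[start:start + block_size]))
    return total

def bytes_decompressed_for_read(offset, length, block_size):
    """Upper bound on the uncompressed bytes that must be produced to satisfy
    a read of `length` bytes at `offset`: every block the read touches has
    to be decompressed in full."""
    first = offset // block_size
    last = (offset + length - 1) // block_size
    return (last - first + 1) * block_size
```

Running `compressed_total` over the same input with 4 Kbyte and 128 Kbyte blocks shows the compression side of the trade-off, while `bytes_decompressed_for_read(0, 5 * 1024, block_size)` shows how much work a small read costs at each block size.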

The latency increases due to larger block sizes are somewhat muddied
by fragment blocks in Squashfs.  Fragment blocks pack multiple files
smaller than a single block into one compressed block.  Paradoxically,
this can minimise seek time if the files in the fragment block
are read at the same time.  Once the first file is read from a
fragment, subsequent file reads are immediately satisfied from the
cached fragment rather than incurring additional seek overhead.  If the
files are sorted in the Squashfs filesystem according to first access
time, then the larger block size will probably improve boot up speed for
many liveCDs.
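
The caching effect described above can be modelled with a small sketch.  It is a toy, assuming zlib and an in-memory store; the class `FragmentStore` and its methods are hypothetical and do not reflect the Squashfs on-disk format or kernel code.

```python
import zlib

class FragmentStore:
    """Packs files smaller than one block into shared compressed blocks."""

    def __init__(self, block_size=128 * 1024):
        self.block_size = block_size
        self.fragments = []        # compressed fragment blocks
        self.index = {}            # name -> (fragment number, offset, length)
        self.cache = {}            # fragment number -> uncompressed bytes
        self.decompressions = 0    # how many times a fragment was unpacked
        self._pending = bytearray()

    def add(self, name, data):
        """Append a small file to the fragment currently being filled."""
        if len(data) >= self.block_size:
            raise ValueError("only files smaller than one block go in fragments")
        if len(self._pending) + len(data) > self.block_size:
            self.flush()
        self.index[name] = (len(self.fragments), len(self._pending), len(data))
        self._pending += data

    def flush(self):
        """Compress the pending fragment; call this before reading."""
        if self._pending:
            self.fragments.append(zlib.compress(bytes(self._pending)))
            self._pending = bytearray()

    def read(self, name):
        """Return a file's data, decompressing its fragment only on a cache miss."""
        frag, offset, length = self.index[name]
        if frag not in self.cache:
            self.cache[frag] = zlib.decompress(self.fragments[frag])
            self.decompressions += 1
        return self.cache[frag][offset:offset + length]
```

Reading two files that were added one after the other costs a single decompression in this model, which is the reason sorting files by first access time can help.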

In any case the move to larger block sizes is encouraged by compression
algorithms other than gzip (cough LZMA), where significant compression
improvements may be achieved.  I have not undertaken any measurements
of this because I pretend not to know about any compression algorithms
other than gzip (politics again).

> But the 3MB saved from the sparse support, is absolutely FREE.  That is 
> positively excellent, and what I had suspected was the case.  Also, I'm 
> guessing that the mksquashfs performance on a 4.0G sparse file with 2.0G 
> of data will be vastly improved.  (by my definition of vast, i.e. ~10%)
> 
Yes, mksquashfs runs a *lot* faster.

Regards

Phillip



