On 24 June 2009, at 17:12, Mark Ruijter wrote:
> Those who need open-source data deduplication today instead of
> tomorrow might take a look at lessfs.
> http://www.lessfs.com
It's a good idea, but given the current traffic on the lessfs
mailing list, I'm not sure much work is being done on it. I have
been a member of that list since June 1, and the only message I have
received is the one I wrote myself.
> I am thinking about starting to work on a data-deduplicating
> block device, a kernel module called blockless.
If done smartly, this may be possible, but the problem is the
filesystem's metadata. Is this going to be dedup'ed? How much
will this take? A simple backup will update atime on all the files
backed up, and although atime isn't always wanted or needed, the
same problem occurs elsewhere.
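
To make the metadata concern concrete, here is a minimal sketch in Python. It is not taken from lessfs or blockless: the 4096-byte block size, the SHA-256 fingerprint, the made-up "metadata block" layout (an 8-byte atime plus padding) and all function names are assumptions for illustration only. It shows that a block-level dedup store keeps a single copy of identical data blocks, while a metadata block whose atime changed has to be stored again.

```python
import hashlib
import struct

BLOCK_SIZE = 4096  # assumed block size, purely illustrative


def dedup_store(device: bytes, store: dict) -> list:
    """Split `device` into fixed-size blocks, keeping one copy per SHA-256 digest.

    Returns the list of digests describing the device, block by block.
    """
    digests = []
    for off in range(0, len(device), BLOCK_SIZE):
        block = device[off:off + BLOCK_SIZE]
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)  # stored only if not seen before
        digests.append(digest)
    return digests


def fake_metadata_block(atime: int) -> bytes:
    """Hypothetical metadata block: an 8-byte atime followed by zero padding."""
    return struct.pack("<Q", atime).ljust(BLOCK_SIZE, b"\0")


data = b"A" * BLOCK_SIZE * 3  # three identical data blocks
store = {}

# First image of the device: one metadata block plus the data blocks.
before = dedup_store(fake_metadata_block(1000) + data, store)
unique_before = len(store)

# A backup run only reads the files, so only the atime changes.
after = dedup_store(fake_metadata_block(2000) + data, store)
unique_after = len(store)

print(unique_before, unique_after)  # prints "2 3"
```

The data blocks cost nothing extra the second time, but the metadata block is stored again after every atime update. On a real filesystem the question is how many such blocks there are and how often they change.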