On 24 June 2009, at 17:12, Mark Ruijter wrote:
> Those who need open source data deduplication today rather than
> tomorrow might take a look at lessfs.
> http://www.lessfs.com
It's a good idea, but given the current traffic on the lessfs mailing
list, I'm not sure much work is being done. I have been a member of that
list since June 1 and have received only one message, which was the one
I wrote myself.
> I am thinking about starting to work on a data deduplicating
> block device, a kernel module called blockless.
Done smartly, this may be possible, but the problem is the filesystem's
metadata. Is it going to be dedup'ed too? How much will that take? A
simple backup updates atime on every file it backs up, and although
atime isn't always wanted or needed, the same problem occurs elsewhere.
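
To make the metadata concern concrete, here is a minimal sketch in Python. It is not how lessfs or blockless works; the DedupStore class, the 4096-byte block size, and the inode_block() layout are all assumptions made up for illustration. It only shows that a block-level deduplicator keyed on content hashes sees every changed metadata block as new data, even when the file contents are untouched.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, for illustration only


class DedupStore:
    """Toy content-addressed block store (hypothetical, not lessfs/blockless)."""

    def __init__(self):
        self.blocks = {}   # digest -> block data, each distinct block kept once
        self.mapping = {}  # logical block number -> digest

    def write(self, lbn, data):
        digest = hashlib.sha256(data).hexdigest()
        self.blocks.setdefault(digest, data)  # store only if not seen before
        self.mapping[lbn] = digest

    def unique_blocks(self):
        # Distinct blocks currently referenced by the logical device.
        return len(set(self.mapping.values()))


def inode_block(atime):
    # Hypothetical metadata block: an 8-byte atime followed by zero padding.
    return atime.to_bytes(8, "little") + bytes(BLOCK_SIZE - 8)


store = DedupStore()
payload = b"x" * BLOCK_SIZE

# Four files with identical data and identical metadata.
for i in range(4):
    store.write(i, payload)             # data blocks at lbn 0..3
    store.write(4 + i, inode_block(0))  # metadata blocks at lbn 4..7
before = store.unique_blocks()

# A backup reads each file at a different second, so each atime differs.
for i in range(4):
    store.write(4 + i, inode_block(1000 + i))
after = store.unique_blocks()

print(before, after)
```

In this toy model the data blocks still collapse into one, but after the simulated backup every metadata block is distinct, so the number of unique blocks grows with the number of files touched.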