Hi all,
I've been reading up a little on data deduplication, and have been
searching for an OSS filesystem with dedup, without much luck. While testing
snapshots and so on in LVM, I started wondering if dedup would be better
off in LVM than in the filesystem. Would it be possible/efficient to add
dedup to the LVM layer, or perhaps a layer above LVM? This could make
dedup work for all or most filesystems. Keep a hash table keyed on the
contents of 4k (or whatever size) blocks, have virtual blocks point to
the physical blocks, and run a remapping/deduping job at night. If a
shared block is written to, copy-on-write could be used, so the write
path stays fast and the other users of that block are left untouched.
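
To make the idea a bit more concrete, here is a rough Python sketch of what I have in mind. It is only a toy in-memory model, not LVM or device-mapper code: the class name DedupVolume, its methods, the use of SHA-256 as the block hash, and the dicts standing in for on-disk metadata are all assumptions made up for illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # the "4k (or whatever)" block size


class DedupVolume:
    """Toy model of a block-level dedup layer (hypothetical, in memory only)."""

    def __init__(self):
        self.physical = {}  # physical block id -> block contents
        self.refcount = {}  # physical block id -> number of virtual blocks using it
        self.vmap = {}      # virtual block number -> physical block id
        self._next_id = 0

    def _release(self, pid):
        # Drop one reference; free the physical block when nobody uses it.
        self.refcount[pid] -= 1
        if self.refcount[pid] == 0:
            del self.physical[pid]
            del self.refcount[pid]

    def write(self, vblock, data):
        if len(data) != BLOCK_SIZE:
            raise ValueError("writes must be exactly one block")
        old = self.vmap.get(vblock)
        if old is not None and self.refcount[old] == 1:
            # Sole owner: overwrite in place, no hashing on the write path.
            self.physical[old] = data
            return
        # Unmapped or shared block: copy-on-write into a fresh physical block.
        new = self._next_id
        self._next_id += 1
        self.physical[new] = data
        self.refcount[new] = 1
        self.vmap[vblock] = new
        if old is not None:
            self._release(old)

    def read(self, vblock):
        return self.physical[self.vmap[vblock]]

    def dedup_pass(self):
        """The nightly job: remap identical blocks onto one physical block."""
        seen = {}  # content hash -> canonical physical block id
        for vblock, pid in sorted(self.vmap.items()):
            data = self.physical[pid]
            digest = hashlib.sha256(data).digest()
            canon = seen.setdefault(digest, pid)
            # Compare the bytes as well, so a hash collision cannot merge
            # two blocks that actually differ.
            if canon != pid and self.physical[canon] == data:
                self.refcount[canon] += 1
                self.vmap[vblock] = canon
                self._release(pid)
```

Writes never hash anything, so all the expensive work is pushed into dedup_pass(). The price is that duplicates take up space until the next pass runs, and a write to a shared block allocates a new physical block instead of changing the data under the other virtual blocks.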
Is this nonsense, or might it be an idea?