Re: [linux-lvm] Data deduplication in LVM?

On 10 June 2009, at 20:41, Roy Sigurd Karlsbakk wrote:

Hi all

I've been reading up a little on data deduplication and have been searching for an OSS filesystem with dedup, without much luck. While testing snapshots and so on in LVM, I started wondering whether dedup would be better off in LVM than in the filesystem. Would it be possible/efficient to add dedup to the LVM layer, or perhaps in a layer above LVM? This could make dedup work for all or most filesystems. Make a hash table of 4k (or whatever) blocks, make virtual blocks pointing to the physical blocks, and run a remapping/deduping job at night. If a shared block is written to, copy-on-write could be used to increase speed.
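
To make the idea above a bit more concrete, here is a rough in-memory sketch in Python. It is only a toy model, not how LVM or device-mapper works: the class DedupStore, its methods, and the choice of SHA-256 are all assumptions made up for illustration. Writes go to their own physical block, a separate pass (the "job at night") hashes blocks and remaps duplicates, and a write to a shared block is redirected to a fresh block.

```python
import hashlib

BLOCK_SIZE = 4096  # 4k blocks, as suggested above


class DedupStore:
    """Toy model of a deduplicating block layer (hypothetical, in-memory)."""

    def __init__(self):
        self.physical = {}  # physical block id -> block contents
        self.refcount = {}  # physical block id -> number of virtual blocks using it
        self.vmap = {}      # virtual block number -> physical block id
        self.next_id = 0

    def _alloc(self, data):
        # Hand out a fresh physical block holding `data`.
        pid = self.next_id
        self.next_id += 1
        self.physical[pid] = data
        self.refcount[pid] = 0
        return pid

    def _unref(self, pid):
        # Drop one reference; free the physical block when nobody uses it.
        self.refcount[pid] -= 1
        if self.refcount[pid] == 0:
            del self.physical[pid]
            del self.refcount[pid]

    def write(self, vblock, data):
        assert len(data) == BLOCK_SIZE
        old = self.vmap.get(vblock)
        if old is not None and self.refcount[old] == 1:
            # Block is not shared: overwrite in place.
            self.physical[old] = data
            return
        # New or shared block: copy-on-write to a fresh physical block.
        pid = self._alloc(data)
        self.refcount[pid] = 1
        self.vmap[vblock] = pid
        if old is not None:
            self._unref(old)

    def read(self, vblock):
        return self.physical[self.vmap[vblock]]

    def dedup_pass(self):
        # The nightly job: hash every mapped block and remap duplicates.
        seen = {}  # digest -> physical block id kept for that content
        for vblock in sorted(self.vmap):
            pid = self.vmap[vblock]
            digest = hashlib.sha256(self.physical[pid]).digest()
            keep = seen.setdefault(digest, pid)
            # Compare the bytes as well, so a hash collision cannot merge
            # two different blocks.
            if keep != pid and self.physical[keep] == self.physical[pid]:
                self.vmap[vblock] = keep
                self.refcount[keep] += 1
                self._unref(pid)
```

In this sketch nothing is hashed on the write path; all hashing happens in dedup_pass(), which matches the idea of deduplicating offline. A real block-layer implementation would also have to persist the mapping and reference counts safely, which this toy ignores.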

Answering myself: it seems this can be a problem without a rather large change in the APIs. If I understand it correctly, deduplicating metadata may have a rather large performance impact on writes, and from the block layer, how do you know what's metadata and what's not?

