On Thu, 24 Feb 2011, Jonathan Tripathy wrote:
Yes. When you make the snapshot, there is only one copy, and the COW table
is empty. AS YOU WRITE to the origin, each chunk written is saved to
*-cow first before being written to *-real.
Got ya. So data that is being written to the origin, while the snapshot
exists, is the data that may leak, as it's saved to the COW first, then copied
over to real.
Hopefully an expert will let me know whether it's safe to zero the COW after
I've finished with the snapshot.
What *is* safe is to zero the snapshot. This will overwrite any blocks
in the COW copied from the origin. The problem is that if the snapshot runs
out of room, it is invalidated, and you may or may not have overwritten
all blocks copied from the origin.
So if you don't hear from an expert, a safe procedure is to allocate
snapshots for backup that are as big as the origin + 1 PE (which should
be enough for the COW table as well unless we are talking terabytes). Then you
can zero the snapshot (not the COW) after making a backup. That will overwrite
any data copied from the origin. The only leftover data will just be the COW
table which is a bunch of block #s, so shouldn't be a security problem.
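
To put rough numbers on whether one extra extent covers the COW table, here
is a back-of-the-envelope sketch in Python. It assumes 16 bytes per table
entry (my understanding of the persistent snapshot store's on-disk format),
a 4 MiB extent, and the worst case where every chunk of the origin gets
copied. The function names are made up for illustration and the header chunk
is ignored, so check the figures against your own kernel and chunk size.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def cow_table_bytes(origin_bytes, chunk_bytes, entry_bytes=16):
    """Estimate the table size if every origin chunk were copied.

    entry_bytes=16 is an assumption about the on-disk entry size.
    """
    chunks = -(-origin_bytes // chunk_bytes)  # ceiling division
    return chunks * entry_bytes


def fits_in_one_extent(origin_bytes, chunk_bytes, extent_bytes=4 * MIB):
    """True if the estimated table fits in one extent (assumed 4 MiB)."""
    return cow_table_bytes(origin_bytes, chunk_bytes) <= extent_bytes


if __name__ == "__main__":
    for origin in (1 * GIB, 10 * GIB, 100 * GIB):
        for chunk in (4 * 1024, 512 * 1024):
            print(origin // GIB, "GiB origin,", chunk // 1024, "KiB chunks:",
                  cow_table_bytes(origin, chunk), "bytes of table,",
                  "fits" if fits_in_one_extent(origin, chunk) else "too big")
```

Under those assumptions, a small chunk size can make the table outgrow one
extent well before the origin reaches terabytes, so a few extra extents may
be cheap insurance.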
This procedure is less efficient than zeroing LVs on allocation, and takes
extra space for worst case snapshot allocation. But if you want allocation
to be "instant", and can pay for it when recycling, that should solve your
problem. You should still zero all free space (by allocating a huge LV
with all remaining space and zeroing it) periodically in case anything
got missed.
IDEA: since you are on RAID1, reads are faster than writes (twice as fast),
and your snapshots will be mostly unused (the COW will only have a few blocks
copied from the origin). So you can write a "clear" utility that scans
a block device for non-zero chunks, and only writes over those with zeros.
This might be a good application for mmap().
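
A minimal sketch of such a "clear" utility, in Python. It uses ordinary
read/seek/write rather than mmap(), which is a simplification on my part. The
function name and the 1 MiB default chunk size are arbitrary choices for
illustration, and it has only been thought through against regular files, so
try it on a scratch file before pointing it at a real device.

```python
import os


def clear_nonzero_chunks(path, chunk_size=1024 * 1024):
    """Overwrite with zeros only those chunks of `path` that hold data.

    Returns (chunks_scanned, chunks_rewritten).
    """
    zeros = bytes(chunk_size)
    scanned = rewritten = 0
    with open(path, "r+b") as dev:
        offset = 0
        while True:
            data = dev.read(chunk_size)
            if not data:
                break  # end of file or device
            scanned += 1
            # The last chunk may be short, so compare against a
            # zero buffer of the same length.
            if data != zeros[:len(data)]:
                dev.seek(offset)
                dev.write(zeros[:len(data)])
                rewritten += 1
            offset += len(data)
        # Push the zeros out to the device before returning.
        dev.flush()
        os.fsync(dev.fileno())
    return scanned, rewritten
```

Calling it on the snapshot device (not the COW) after the backup would return
how many chunks were scanned and how many actually had to be written, which
also gives a feel for how much the snapshot really diverged.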