
[dm-devel] Recommendations for cascaded snapshots



Hi, I've begun looking at using device-mapper on one of my linux servers to store a series of "points in time" of a data volume without chewing up excessive quantities of disk space. I'm currently using LVM for volume management, but the LVM snapshot functionality seemed too simple to handle what I want.

Basically I'm looking for a tiered set of snapshots: one for every day of the last week, one for the Sunday of every week of the last month, one for the first day of every month of the last year, and one for the first of January of every year.
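In shell, that retention rule might look something like the predicate below (a sketch using GNU date; the 7/31/366-day tier boundaries are my own approximation of the schedule above):

```shell
# Decide whether a snapshot dated SNAP_DATE should still be kept as of
# TODAY, under the tiered daily/weekly/monthly/yearly policy.
# Assumes GNU date and YYYY-MM-DD arguments; exits 0 if the snapshot
# should be kept.
keep_snapshot() {  # keep_snapshot SNAP_DATE TODAY
    snap=$1; today=$2
    age=$(( ( $(date -d "$today" +%s) - $(date -d "$snap" +%s) ) / 86400 ))
    dow=$(date -d "$snap" +%u)    # day of week, 7 = Sunday
    dom=$(date -d "$snap" +%d)    # day of month, 01-31
    doy=$(date -d "$snap" +%j)    # day of year, 001-366
    [ "$age" -le 7 ] && return 0                       # daily tier
    [ "$age" -le 31 ] && [ "$dow" = 7 ] && return 0    # weekly tier (Sundays)
    [ "$age" -le 366 ] && [ "$dom" = 01 ] && return 0  # monthly tier (1st)
    [ "$doy" = 001 ] && return 0                       # yearly tier (Jan 1)
    return 1
}
```

A nightly cron job could walk the list of existing snapshot names, call this on each, and delete the ones that fall out of every tier.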

Firstly, this is around 16-20 snapshots; how scalable is the dm-snapshot system in 2.6.18? Would a dual-core home-office fileserver with 1GB of RAM be able to handle that number of snapshots?
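To put rough numbers on the RAM question, here's the back-of-envelope I've been using (the ~64-byte per-exception cost is a guess at the in-core hash-table entry the kernel keeps per copied chunk; the exact figure depends on the kernel build):

```shell
# Back-of-envelope RAM cost of one snapshot's in-core exception table,
# assuming 32-sector (16 KiB) chunks as in the tables below.
vol_gib=100          # origin volume size
churn_pct=5          # fraction of the volume rewritten since the snapshot
chunk_kib=16
bytes_per_exception=64   # guessed in-kernel cost per copied chunk

chunks=$(( vol_gib * 1024 * 1024 / chunk_kib ))
dirty=$(( chunks * churn_pct / 100 ))
ram_mib=$(( dirty * bytes_per_exception / 1024 / 1024 ))
echo "~${ram_mib} MiB of RAM for ${churn_pct}% churn on a ${vol_gib} GiB volume"
```

On those assumptions even 20 mostly-quiet snapshots of a 100 GiB volume stay well inside 1GB; the cost scales with how much data actually changed under each snapshot, not with the volume size.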

Another quickie question: how can I measure the current %use of a given snapshot device? I'm going to have some automated cronjobs which resize my LVM devices based on disk consumption and on whether or not that snapshot-backend device is expected to receive more changes, but I can't figure out how to get at the amount-free information.
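The best guess I have so far is parsing `dmsetup status`: on 2.6.18 the snapshot target's status line seems to carry an `<allocated>/<total>` sector fraction as its fourth field (newer kernels append a metadata count after it, and lvs exposes the same figure as Snap%) -- treat the field position as an assumption:

```shell
# Turn a snapshot target's "dmsetup status" line into a percent-used
# figure.  Reads the status line on stdin; prints "n/a" if the fourth
# field isn't an <allocated>/<total> fraction.
snap_pct_from_status() {
    awk '{
        n = split($4, f, "/")
        if (n == 2 && f[2] > 0)
            printf "%.1f\n", 100 * f[1] / f[2]
        else
            print "n/a"
    }'
}

# Usage (assumes the snapshot device exists):
#   dmsetup status data-back1 | snap_pct_from_status
```

A cron job could compare that percentage against a threshold and lvextend the exception-store LV before it fills (a full COW device invalidates the snapshot).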

My first "solution" was to use stacked snapshots such that the "origin" was the raw block device populated with our current data, and each snapshot held the changes up until the next recorded point-in-time, but I ran into a few problems. First of all, it doesn't seem possible to merge two snapshots together. For example, if I have Nov 30th and Dec 1st snapshots, at some point I won't care about the specific Nov 30th changes anymore and will want to merge the Dec 1st snapshot on top of the Nov 30th one, then drop the now-empty Dec 1st snapshot (renaming the Nov 30th one as Dec 1st).

I was able to find this patchset:
http://www.gnome.org/~markmc/code/lvm-snapshot-merging/

referenced here:
http://fedoraproject.org/wiki/StatelessLinuxCachedClient

which talked about a dm snapshot-merge target which ran a kernel thread to merge changes back into a lower-level device, solving the problem above; but I couldn't determine how stable it was or whether or not it applied to recent kernels.

Potentially I could even implement the merging in userspace without too much difficulty if the on-disk exception format were documented somewhere useful, but I have yet to find anything detailed and concise enough to make this practical for me; perhaps someone on the list has better information?

Another concern is the computational and disk overhead of having 16-20 chained snapshots, as the exceptions would be scattered across that many logical volumes and every read from the _current_ volume would need to scan through every exception table in the chain.

Alternatively I was considering having the "current" volume go through the snapshot-origin target, with a series of snapshot devices stacked on snapshot devices beneath it. For example, when I do my midnight snapshot, I would have the devices laid out like this:

data:
0 $SIZE snapshot-origin /dev/mapper/data-origin

data-back1:
0 $SIZE snapshot /dev/mapper/data-origin /dev/mapper/data-excp1 p 32

data-prev1:
0 $SIZE snapshot-origin /dev/mapper/data-back1

data-back0:
0 $SIZE snapshot /dev/mapper/data-back1 /dev/mapper/data-excp0 p 32

data-prev0:
0 $SIZE snapshot-origin /dev/mapper/data-back0

I would mount as follows:
/dev/mapper/data       => /data (rw)
/dev/mapper/data-prev1 => /data/backup/2006-10-02 (ro)
/dev/mapper/data-prev0 => /data/backup/2006-10-01 (ro)
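For concreteness, the layout above comes out to dmsetup invocations roughly like these (an untested sketch: $SIZE is the origin size in 512-byte sectors, and it assumes /dev/mapper/data-origin and the data-excpN exception-store devices already exist):

```shell
SIZE=209715200   # example origin size in 512-byte sectors

# writable "current" view, routed through snapshot-origin so that
# writes push copy-out exceptions into the snapshots below
echo "0 $SIZE snapshot-origin /dev/mapper/data-origin" | dmsetup create data

# newest snapshot: COW store data-excp1, persistent, 32-sector chunks
echo "0 $SIZE snapshot /dev/mapper/data-origin /dev/mapper/data-excp1 p 32" \
    | dmsetup create data-back1

# read-only view of that point in time
echo "0 $SIZE snapshot-origin /dev/mapper/data-back1" | dmsetup create data-prev1
```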

Then at midnight when chaining a new snapshot, I would:
 1) create a new empty snapshot blockdev
 2) suspend data and data-back000001
 3) insert data-excp000002 under data-back000001 like this:

data-back000002:
0 $SIZE snapshot /dev/mapper/data-origin /dev/mapper/data-excp000002 p 32

data-prev000002:
0 $SIZE snapshot-origin /dev/mapper/data-back000002

data-back000001:
0 $SIZE snapshot /dev/mapper/data-back000002 /dev/mapper/data-excp000001 p 32

 4) Resume data-back000001 and data
 5) Mount data-prev000002 on /data/backup/2006-10-03 (ro)
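Spelled out as commands, I imagine the five steps looking roughly like this (untested against 2.6.18; the suspend/reload/resume dance is the usual dmsetup idiom for swapping in a new table atomically):

```shell
SIZE=209715200   # origin size in sectors, as above

# 1) create the new empty exception-store device data-excp000002
#    (e.g. via lvcreate, sized for the expected day's churn)

# 2) quiesce the writable view and the current top snapshot
dmsetup suspend data
dmsetup suspend data-back000001

# 3) slide the new snapshot layer in underneath the old one
echo "0 $SIZE snapshot /dev/mapper/data-origin /dev/mapper/data-excp000002 p 32" \
    | dmsetup create data-back000002
echo "0 $SIZE snapshot-origin /dev/mapper/data-back000002" \
    | dmsetup create data-prev000002
echo "0 $SIZE snapshot /dev/mapper/data-back000002 /dev/mapper/data-excp000001 p 32" \
    | dmsetup reload data-back000001

# 4) resume I/O
dmsetup resume data-back000001
dmsetup resume data

# 5) expose the new read-only view
mount -o ro /dev/mapper/data-prev000002 /data/backup/2006-10-03
```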

This means that the only performance penalty would be on accesses to data far in the past, although I still need to figure out some way to merge an unneeded snapshot into another one so I can free up the space it uses.

As you might expect I'm writing perl scripts to store the state of my snapshot-tree on-disk and automate the snapshotting process.

I appreciate any advice you can offer; especially pertaining to merging a snapshot into its base device.

Cheers,
Kyle Moffett

