[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

[dm-devel] Snapshot Proposal



Hi Joe,

Back at OLS, we talked a bit about splitting up the dm-snapshot code into 
snapshot + exception-handling code, in order to make it easier to reuse the 
exception-handling code for a bad-block module. I'm not sure if you've had 
any more thoughts about this, or if you've started working on it yet (I know 
you were busy porting some stuff to 2.6, so I'm assuming you haven't gotten 
back to this). I've started looking through the code and seeing how it can be 
split up, and I've included my initial ideas here.

Currently, the snapshot code basically looks like this:

        -----------------
        |  dm-snapshot  |
        -----------------
          /          \
         /            \
------------      ------------------------
|  kcopyd  |      |  dm-exception-store  |
------------      ------------------------


and my current proposal is to change it to look like this:


        -----------------
        |  dm-snapshot  |
        -----------------
                | 
                | 
        ------------------
        |  dm-exception  |
        ------------------
          /           \
         /             \
------------      ------------------------
|  kcopyd  |      |  dm-exception-store  |
------------      ------------------------


Most of the data structures from dm-snapshot would move into dm-exception 
(along with a lot of code, obviously). Here are the structure definitions 
that I'm currently using (most of which are very similar to the current 
ones).

(from dm-exception-table.h)

/**
 * struct exception
 *
 * An exception is used where an old chunk of data has been
 * replaced by a new one.
 **/
struct exception {
	struct list_head hash_list;
	chunk_t old_chunk;
	chunk_t new_chunk;
};

/**
 * struct pending_exception
 *
 * An exception that is in the process of being copied from the
 * source device to the destination device.
 **/
struct pending_exception {
	struct exception e;

	/*
	 * I/O buffers waiting for this copy to complete are held
	 * in a list (using b_reqnext).
	 */
	struct buffer_head *src_bhs;
	struct buffer_head *dest_bhs;

	/*
	 * Other pending_exceptions that are processing this chunk
	 * on the source device. When this list is empty, we know
	 * we can complete the source queue.
	 */
	struct list_head siblings;

	/* Pointer back to the exception-table. */
	struct exception_table *etable;

	/* 1 indicates the exception has already been sent to kcopyd. */
	int started;
};

/**
 * struct exception_hash
 *
 * A hash table for fast storage and lookups of exceptions.
 * Used for both completed and pending exceptions.
 **/
struct exception_hash {
	uint32_t hash_mask;
	struct list_head *hash_table;
};

/*
 * Abstraction to handle the meta/layout of exception stores (the
 * destination device).
 */
struct exception_store {

	/*
	 * Deletes this store from memory when you've finished with it.
	 */
	void (*destroy) (struct exception_store *store);

	/*
	 * Read metadata from the exception device and load all existing
	 * exceptions into memory. Don't perform I/O to the destination
	 * device until this has been called.
	 */
	int (*read_metadata) (struct exception_store *store);

	/*
	 * Find somewhere to store the next exception.
	 */
	int (*prepare_exception) (struct exception_store *store,
				  struct exception *e);

	/*
	 * Update the metadata with this exception.
	 */
	void (*commit_exception) (struct exception_store *store,
				  struct exception *e,
				  void (*callback) (void *, int success),
				  void *callback_context);

	/*
	 * The destination device is invalid; note this in the metadata.
	 */
	void (*invalidate) (struct exception_store *store);

	/*
	 * Return how full the destination device is.
	 */
	void (*fraction_full) (struct exception_store *store,
			       sector_t *numerator,
			       sector_t *denominator);

	struct exception_table *etable;
	void *context;
};

/**
 * struct exception_table
 *
 * Describe one exception-handler instance.
 **/
struct exception_table {
	struct rw_semaphore lock;

	struct dm_dev *src;
	struct dm_dev *dest;

	/* Size of data blocks saved - must be a power of 2 */
	chunk_t chunk_size;
	chunk_t chunk_mask;
	chunk_t chunk_shift;

	/* You can't use the destination device if this is 0 (e.g. if full) */
	int valid;

	/* 1 if metadata has already been read from disk. */
	int have_metadata;

	/* Used for display of table. 'P' or 'N'. */
	char type;

	/* The last percentage we notified */
	int last_percent;

	/* Two hash tables. One for in-progress copies, and one
	 * for completed copies.
	 */
	struct exception_hash pending;
	struct exception_hash complete;

	/* The on disk metadata handler */
	struct exception_store store;

	struct kcopyd_client *kcopyd_client;
};



The exception_table structure is new (renamed the old exception_table to 
exception_hash), containing most of the information that was in the 
dm_snapshot structure. With this, the dm_snapshot structure could be reduced 
to as little as this:

(from dm-snapshot.h)

/**
 * struct dm_snapshot
 *
 * Describe one snapshot device.
 **/
struct dm_snapshot {
	struct dm_table *table;
	struct exception_table *etable;

	/* List of snapshots per origin. */
	struct list_head list;
};


The next step, of course, is to define the interface that dm-exception will 
provide. Some of the APIs are pretty obvious.

struct exception_table *dm_create_exception_table(...);
void dm_delete_exception_table(struct exception_table *et);
int dm_read_exception_table_metadata(struct exception_table *et);
void dm_lock_exception_table(struct exception_table *et, ...);
void dm_unlock_exception_table(struct exception_table *et, ...);
struct exception *lookup_exception(struct exception_table *et, chunk_t chunk);
struct pending_exception *find_pending_exception(struct exception_table *et,
                                                 struct buffer_head *bh);
(more to come...)


So at this point I'm faced with a few decisions.

Should the dm-exception data structures be hidden from dm-snapshot?

This is how kcopyd works, with the kcopyd_client structure not visible to its 
users, but only to the kcopyd code. This is how I'd prefer dm-exception to 
behave, but it also brings up a couple other questions.

Should struct dm_snapshot duplicate some of the data stored in struct 
exception_table?

In order to do things like getting status from the snapshot, you need access 
to some of the info in the exception-table. One way would be to expand the 
dm-snapshot structure to something like this:

struct dm_snapshot {
	struct dm_table *table;
	struct exception_table *etable;

	struct dm_dev *origin;
	struct dm_dev *cow;

	chunk_t chunk_size;
	char type;

	/* List of snapshots per origin. */
	struct list_head list;
};


But, I'm not thrilled about the duplication of this data. It just leaves the 
door open for inconsistencies between dm_snapshot and exception_table.

Should dm-exception provide simple, one-liner APIs just to get information 
about the exception-tables? Things like:

get_source_dev()
get_destination_dev()
get_chunk_size()
get_type()
and others...

This might be the way to go, but it does expand the API set quite a bit. I 
don't want to make it vastly confusing to use dm-exception. :)


In any case, do you think this is the correct overall approach to the split 
between snapshot and the exception-hander?  I know you've also had some ideas 
about creating a true cache of exceptions to help cut down on the memory 
usage for the exception-tables. If we can get this split done first, I would 
think that the exception-cache would then be completely contained within 
dm-exception, and would be transparent to snapshot and bad-block.

I'm about half-way done with converting the code to something that I can 
compile and test. If you're generally happy with this approach, I'll be able 
to finish it off and send in the code by next week. If you have ideas for a 
different approach, let me know, and I'll try to incorporate them before 
submitting the code.


-- 
Kevin Corry
kevcorry us ibm com
http://evms.sourceforge.net/




[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]