[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

Re: [lvm-devel] [PATCH 02/13] Replicator: add libdm support



Dne 8.10.2009 11:42, Petr Rockai napsal(a):
> Hi,
> 
> Alasdair G Kergon <agk redhat com> writes:
>> On Mon, Oct 05, 2009 at 04:00:29PM +0200, Zdenek Kabelac wrote:
>>>  int dm_tree_suspend_children(struct dm_tree_node *dnode,
>>> -				   const char *uuid_prefix,
>>> -				   size_t uuid_prefix_len);
>>> +			     const char *uuid_prefix,
>>> +			     size_t uuid_prefix_len,
>>> +			     int priority);
>>   
>> We can't change the published interface like that.
>>
>> If there really is no alternative to this parameter then you need to introduce
>> a new function instead.
> I have discussed this in person with Zdeněk, and it seems that the main reason
> for this is to allow more atomic suspension of complete replicator.
> 
> The normal semantics of DM tree operations is that suspension of a node is done
> after all its children are suspended. For replicator, it would be beneficial to
> suspend the parent first (to stop the IO stream from the client side) and only
> afterwards start suspending the inferior nodes.
> 
> In practice, this constitutes only a very slight increase in risk of tripping a
> "synchronicity" threshold -- by the amount of data that arrives to the
> replicator head in the time frame between a leg is suspended and the replicator
> as a whole is suspended -- if this extra data exceeds the amount of free space
> in the replicator log, this could trip an extra resync that would be
> costly. But it should be stressed, that in these situations, this was already
> quite likely to happen, and the outlined problem only increases the chance of
> this happening by a small margin.
> 
> In other words, I think it is not worth worrying about just yet. At the point
> when a dmeventd-style leg dropping is implemented for replicator, this may
> become a more pressing problem, but I think the solution should be deferred
> until the work on that aspect has begun -- at that time, we ought to have a
> clearer picture of the problem area, so it may turn out that a better solution
> exists that would not require similar incompatible (and arguably risky) changes
> to the libdm core.


The orginal primary idea for my suspend_children patch was the intent to keep
logic inverse to prioritized activation -  so what is activated last gets
suspended first - but it seems to be a problematic concept.

So this is my second attempt to suspend the replicator control node before
deactivation of the replicator's head.

I think we need to be able to distinguish from the situation when we want to
simply cut off one of replicator's head - and gracefully shutdown whole
replicator.

So I've introduced new variable along the old activation_priority and I use it
as a sign (similar to noflush,skipfs) and before deactivation of node I simply
check whether that node is connected with flagged replicator control node. In
that case the node is suspend in front of deactivation.

This patch should have no influence on other targets.

Zdenek

From 6ab5e354bc3d8b0462df400cfe87b67bc595ab4c Mon Sep 17 00:00:00 2001
From: Zdenek Kabelac <zkabelac redhat com>
Date: Fri, 9 Oct 2009 00:56:46 +0200
Subject: [PATCH 08/22] Replicator: solution for suspend/deactivate

Introducing dm_tree_set_replicator_suspend() to suspend
replicator control device before actual deactivation of replicator
head device.

Signed-off-by: Zdenek Kabelac <zkabelac redhat com>
---
 libdm/.exported_symbols |    1 +
 libdm/libdevmapper.h    |    2 +
 libdm/libdm-deptree.c   |   63 +++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 66 insertions(+), 0 deletions(-)

diff --git a/libdm/.exported_symbols b/libdm/.exported_symbols
index 2960da4..71c8534 100644
--- a/libdm/.exported_symbols
+++ b/libdm/.exported_symbols
@@ -78,6 +78,7 @@ dm_tree_node_add_mirror_target_log
 dm_tree_node_add_target_area
 dm_tree_node_add_replicator_target
 dm_tree_node_add_replicator_dev_target
+dm_tree_set_replicator_suspend
 dm_tree_node_set_read_ahead
 dm_tree_skip_lockfs
 dm_tree_use_no_flush_suspend
diff --git a/libdm/libdevmapper.h b/libdm/libdevmapper.h
index 62b5e80..a6b2957 100644
--- a/libdm/libdevmapper.h
+++ b/libdm/libdevmapper.h
@@ -448,6 +448,8 @@ int dm_tree_node_add_replicator_dev_target(struct dm_tree_node *node,
 					   uint32_t slog_flags,		/* Mirror log flags */
 					   uint32_t slog_size);
 
+int dm_tree_set_replicator_suspend(struct dm_tree *dtree, const char *uuid);
+
 int dm_tree_node_add_target_area(struct dm_tree_node *node,
 				    const char *dev_name,
 				    const char *dlid,
diff --git a/libdm/libdm-deptree.c b/libdm/libdm-deptree.c
index fea1a83..6e18259 100644
--- a/libdm/libdm-deptree.c
+++ b/libdm/libdm-deptree.c
@@ -157,6 +157,7 @@ struct dm_tree_node {
         struct dm_list used_by;    	/* Nodes that use this node */
 
 	int activation_priority;	/* 0 gets activated first */
+	int replicator_suspend;		/* 1 gets suspend first */
 
 	void *context;			/* External supplied context */
 
@@ -653,6 +654,18 @@ void dm_tree_node_set_read_ahead(struct dm_tree_node *dnode,
 	dnode->props.read_ahead_flags = read_ahead_flags;
 }
 
+int dm_tree_set_replicator_suspend(struct dm_tree *dtree, const char *uuid)
+{
+	struct dm_tree_node *dnode;
+
+	if ((dnode = dm_tree_find_node_by_uuid(dtree, uuid))) {
+		log_verbose("Setting replicator suspend for %s", dnode->name);
+		dnode->replicator_suspend = 1;
+	}
+
+	return 1;
+}
+
 int dm_tree_add_dev(struct dm_tree *dtree, uint32_t major, uint32_t minor)
 {
 	return _add_dev(dtree, &dtree->root, major, minor) ? 1 : 0;
@@ -1003,6 +1016,53 @@ static int _suspend_node(const char *name, uint32_t major, uint32_t minor,
 	return r;
 }
 
+static int _suspend_replicator_parent(struct dm_tree_node *dnode,
+				      const char *uuid_prefix,
+				      size_t uuid_prefix_len)
+{
+	struct dm_info info;
+	const struct dm_info *dinfo;
+	const char *name;
+	const char *uuid;
+	struct dm_tree_link *dlink;
+
+	dm_list_iterate_items(dlink, &dnode->uses) {
+		if (!dlink->node->replicator_suspend)
+			continue;
+
+		if (!(uuid = dm_tree_node_get_uuid(dlink->node))) {
+			stack;
+			continue;
+		}
+		/* Ignore if it doesn't belong to this VG */
+		if (!_uuid_prefix_matches(uuid, uuid_prefix, uuid_prefix_len))
+			continue;
+
+		if (!(dinfo = dm_tree_node_get_info(dlink->node))) {
+			stack;	/* FIXME Is this normal? */
+			return 0;
+		}
+
+		if (dinfo->suspended)
+			continue;
+
+		if (!(name = dm_tree_node_get_name(dlink->node))) {
+			stack;
+			continue;
+		}
+
+		if (!_suspend_node(name, dinfo->major, dinfo->minor,
+				   1, 1, &info)) {
+			log_error("Unable to suspend %s (%" PRIu32
+				  ":%" PRIu32 ")", name, dinfo->major,
+				  dinfo->minor);
+			return 0;
+		}
+	}
+
+	return 1;
+}
+
 int dm_tree_deactivate_children(struct dm_tree_node *dnode,
 				   const char *uuid_prefix,
 				   size_t uuid_prefix_len)
@@ -1039,6 +1099,9 @@ int dm_tree_deactivate_children(struct dm_tree_node *dnode,
 		    !info.exists || info.open_count)
 			continue;
 
+		if (!_suspend_replicator_parent(child, uuid_prefix, uuid_prefix_len))
+			continue;
+
 		if (!_deactivate_node(name, info.major, info.minor, &dnode->dtree->cookie)) {
 			log_error("Unable to deactivate %s (%" PRIu32
 				  ":%" PRIu32 ")", name, info.major,
-- 
1.6.5.rc2


[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]