
[lvm-devel] [PATCH] partial metadata handling



Hi,

attached is a patch that does the following things:

- implements an _lv_postorder function (which may need a better name) that
  walks the LV dependency graph depth-first and calls a callback function on
  each LV in postorder, i.e. after all of its dependencies have been visited

- using _lv_postorder, implements _lv_mark_partial, which decides whether a
  given LV needs to be marked as partial -- that is, incomplete, which
  generally means that a PV is missing somewhere down the stack

- reworks the high-level function vg_consolidate_partial so that it flags all
  partial LVs with the PARTIAL_LV status flag

- fixes the error message in the PARTIAL_VG branch of vg_write, which tells the
  user that metadata cannot be updated for a partial VG

- adds a toolcontext flag, handles_partial, which declares that the given tool
  can cope with (and write out) partial metadata; all tools not setting this
  flag will produce the above error message -- and that should be the majority

- the obvious candidate for handles_partial right now is vgreduce
  --removemissing, but see the next patch for more tool changes
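
As a rough illustration of the postorder walk and the partial marking described
above, here is a minimal standalone sketch. It is not the code from the patch:
the toy_* types, the TOY_* flags and the fixed-size dependency arrays are
hypothetical stand-ins for the real LVM structures, and it assumes the LV
dependency graph has no cycles.

```c
#include <assert.h>
#include <stddef.h>

/* All names below are hypothetical; they only mirror the ideas in the patch. */
#define TOY_MISSING_PV 0x1u	/* set on a PV that could not be found */
#define TOY_PARTIAL_LV 0x2u	/* LV depends, directly or not, on a missing PV */
#define TOY_VISITED    0x4u	/* scratch flag used during one walk */
#define TOY_MAX_DEPS   4

struct toy_pv { unsigned status; };

struct toy_lv {
	const char *name;
	unsigned status;
	struct toy_pv *pvs[TOY_MAX_DEPS];	/* PVs used directly, NULL-terminated */
	struct toy_lv *deps[TOY_MAX_DEPS];	/* LVs this LV is stacked on */
};

typedef int (*toy_lv_fn)(struct toy_lv *lv, void *data);

/* Visit all dependencies first, then the LV itself; 0 from fn aborts the walk. */
static int toy_postorder_visit(struct toy_lv *lv, toy_lv_fn fn, void *data)
{
	size_t i;

	assert(lv && fn);
	if (lv->status & TOY_VISITED)
		return 1;
	lv->status |= TOY_VISITED;	/* shared dependencies are walked once */

	for (i = 0; i < TOY_MAX_DEPS && lv->deps[i]; i++)
		if (!toy_postorder_visit(lv->deps[i], fn, data))
			return 0;

	return fn(lv, data);
}

/* Clear the scratch flag again so that the next walk starts clean. */
static void toy_postorder_cleanup(struct toy_lv *lv)
{
	size_t i;

	if (!(lv->status & TOY_VISITED))
		return;
	for (i = 0; i < TOY_MAX_DEPS && lv->deps[i]; i++)
		toy_postorder_cleanup(lv->deps[i]);
	lv->status &= ~TOY_VISITED;
}

static int toy_postorder(struct toy_lv *lv, toy_lv_fn fn, void *data)
{
	int r = toy_postorder_visit(lv, fn, data);

	toy_postorder_cleanup(lv);
	return r;
}

/* Callback: an LV is partial if it uses a missing PV or a partial LV. */
static int toy_mark_partial_single(struct toy_lv *lv, void *data)
{
	size_t i;

	(void) data;
	for (i = 0; i < TOY_MAX_DEPS && lv->pvs[i]; i++)
		if (lv->pvs[i]->status & TOY_MISSING_PV)
			lv->status |= TOY_PARTIAL_LV;
	for (i = 0; i < TOY_MAX_DEPS && lv->deps[i]; i++)
		if (lv->deps[i]->status & TOY_PARTIAL_LV)
			lv->status |= TOY_PARTIAL_LV;
	return 1;
}

static int toy_mark_partial(struct toy_lv *lv)
{
	return toy_postorder(lv, toy_mark_partial_single, NULL);
}
```

Because the callback runs in postorder, every dependency has already been
classified by the time an LV looks at it, so a single pass is enough to make
the marking transitive. Calling toy_mark_partial() on a mirror LV whose second
leg sits on a missing PV marks both that leg and the mirror itself.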

The intention is to have this patch (plus a few tool bits) committed soon, as
it needs some regression testing and shouldn't affect current behaviour in any
major or negative way. Further bits will then add new functionality on top of
these changes, although I believe those still need more work before they are
commit-ready.
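
For the handles_partial gate mentioned in the list above, the intended
behaviour can be sketched roughly as follows. This is a toy model under stated
assumptions, not the patch: toy_cmd, toy_vg, toy_vg_write and
toy_remove_missing are hypothetical names, and the "write" is just a counter.

```c
#include <assert.h>
#include <stdio.h>

#define TOY_PARTIAL_VG 0x1u	/* hypothetical: VG has at least one missing PV */

struct toy_cmd {
	unsigned handles_partial;	/* tool can cope with partial metadata */
};

struct toy_vg {
	const char *name;
	unsigned status;
	struct toy_cmd *cmd;
	int writes;			/* stand-in for real metadata writes */
};

/* Returns 1 if the write was allowed and "performed", 0 if it was refused. */
static int toy_vg_write(struct toy_vg *vg)
{
	assert(vg && vg->cmd);
	if ((vg->status & TOY_PARTIAL_VG) && !vg->cmd->handles_partial) {
		fprintf(stderr, "Cannot update volume group %s while physical "
			"volumes are missing\n", vg->name);
		return 0;
	}
	vg->writes++;
	return 1;
}

/* A tool that handles partial VGs opts in only for the duration of its run. */
static int toy_remove_missing(struct toy_cmd *cmd, struct toy_vg *vg)
{
	int r;

	cmd->handles_partial = 1;
	r = toy_vg_write(vg);
	cmd->handles_partial = 0;	/* don't leak the opt-in to later commands */
	return r;
}
```

The default of 0 means a tool has to opt in explicitly, so any tool that was
never audited for partial metadata keeps refusing to write.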

Sun May  4 23:54:35 CEST 2008  me mornfall net
  * Add a restart-loop encoded as a goto to lvconvert.
  
  Yes, it could have been done as a do-while loop just as well, but:
  a) lvm seems to be goto-happy
  b) do-while would indent a bunch of stuff out of my screen
Sun May  4 23:52:54 CEST 2008  me mornfall net
  * Add a comment to _lv_postorder.
Sun May  4 23:52:31 CEST 2008  me mornfall net
  * Rename all mark_missing bits to mark_partial.
Thu Apr 24 22:14:09 CEST 2008  me mornfall net
  * Exclude MISSING_PVs from create_pv_list in allocatable_only case.
Thu Apr 24 22:13:25 CEST 2008  me mornfall net
  * Even better (more resilient) approach to the activation issue.
Thu Apr 24 21:25:38 CEST 2008  me mornfall net
  * A correct fix for the lvconvert --repair vs activation issue.
Thu Apr 24 19:24:56 CEST 2008  me mornfall net
  * Try to fix the lvconvert --repair vs. activation issue.
Sun Apr 20 10:17:05 CEST 2008  me mornfall net
  * Check for MISSING_PV in vgreduce --removemissing.
Sun Apr 20 10:01:09 CEST 2008  me mornfall net
  * Generalize _lv_mark_missing to _lv_postorder with callback.
Mon Mar 31 18:36:56 CEST 2008  me mornfall net
  * Factor the 2 downconversion cases to one.
Mon Mar 31 18:36:23 CEST 2008  me mornfall net
  * Move the segment count check and adjust comment (lvconvert).
Sat Mar 29 19:36:44 CET 2008  me mornfall net
  * More code shuffling (this however introduces some problems).
Sat Mar 29 19:22:11 CET 2008  me mornfall net
  * Shuffle lvconvert code a little to make remove-and-add in one go easier.
Sun Mar 23 19:53:26 CET 2008  me mornfall net
  * Update comment for _lv_mark_missing.
Sun Mar 23 19:53:12 CET 2008  me mornfall net
  * Improve error message when partial metadata cannot be written.
Thu Mar 13 17:14:44 CET 2008  me mornfall net
  * Let vgcfgrestore overwrite partial metadata.
  
  However, this probably deserves --partial, to ensure that this is what the user
  meant? On the other hand, vgcfgrestore is already a potentially destructive
  command and this doesn't seem to make it worse. It is definitely useful to edit
  metadata by hand.
Thu Mar 13 17:12:41 CET 2008  me mornfall net
  * The log_lv, origin and cow links are in segments, not in LVs.
Thu Mar 13 17:12:17 CET 2008  me mornfall net
  * Don't forget to reset cmd->handles_partial when done.
Thu Mar 13 17:11:44 CET 2008  me mornfall net
  * Actually mark VG with MISSING_PV in it as PARTIAL_VG.
Thu Mar 13 17:08:00 CET 2008  me mornfall net
  * Don't export PARTIAL_VG, we recompute it every time. Enough to persist MISSING_PV.
Thu Mar 13 17:07:21 CET 2008  me mornfall net
  * Don't forget to mark COMPLETE_LV as non-exported flag.
Thu Mar 13 15:16:34 CET 2008  me mornfall net
  * Simplify and optimize _lv_mark_missing.
Fri Feb 15 09:52:45 CET 2008  me mornfall net
  * Resolve conflicts.
Fri Feb 15 09:51:11 CET 2008  me mornfall net
  * First go at lvconvert --repair implementation.
Fri Feb 15 09:50:53 CET 2008  me mornfall net
  * Add repair_ARG to lvconvert's argument list.
Thu Feb 14 16:20:46 CET 2008  me mornfall net
  * Add --repair switch to lvconvert for fixups in partial VGs.
Fri Feb 15 10:07:46 CET 2008  me mornfall net
  * Clean up commented out code.
Fri Feb 15 09:50:19 CET 2008  me mornfall net
  * Make lvremove work on partial VGs.
Fri Feb 15 09:49:42 CET 2008  me mornfall net
  * Fix vgreduce --removemissing for consistent partial VGs.
Fri Feb 15 09:49:02 CET 2008  me mornfall net
  * Fix _lv_mark_missing in metadata.c.
Fri Feb 15 09:47:54 CET 2008  me mornfall net
  * Add PARTIAL_LV to flags.c, as NULL (ie. not stored in metadata).
Thu Feb 14 17:37:06 CET 2008  me mornfall net
  * Make the PARTIAL_LV marking transitive.
Thu Feb 14 16:21:19 CET 2008  me mornfall net
  * Mark partial LVs with PARTIAL_LV status flag.
Thu Feb 14 16:20:07 CET 2008  me mornfall net
  * Add possibility for commands to specify they handle partial VGs.
diff -rN -u -p old-hotspare-prime/lib/commands/toolcontext.c new-hotspare-prime/lib/commands/toolcontext.c
--- old-hotspare-prime/lib/commands/toolcontext.c	2008-05-04 23:59:26.111995976 +0200
+++ new-hotspare-prime/lib/commands/toolcontext.c	2008-05-04 23:59:26.135996064 +0200
@@ -931,6 +931,7 @@ struct cmd_context *create_toolcontext(s
 	cmd->args = the_args;
 	cmd->is_static = is_static;
 	cmd->is_long_lived = is_long_lived;
+	cmd->handles_partial = 0;
 	cmd->hosttags = 0;
 	list_init(&cmd->formats);
 	list_init(&cmd->segtypes);
diff -rN -u -p old-hotspare-prime/lib/commands/toolcontext.h new-hotspare-prime/lib/commands/toolcontext.h
--- old-hotspare-prime/lib/commands/toolcontext.h	2008-05-04 23:59:26.111995976 +0200
+++ new-hotspare-prime/lib/commands/toolcontext.h	2008-05-04 23:59:26.135996064 +0200
@@ -68,6 +68,7 @@ struct cmd_context {
 	char **argv;
 	unsigned is_static;	/* Static binary? */
 	unsigned is_long_lived;	/* Optimises persistent_filter handling */
+	unsigned handles_partial;
 
 	struct dev_filter *filter;
 	int dump_filter;	/* Dump filter when exiting? */
diff -rN -u -p old-hotspare-prime/lib/format_text/flags.c new-hotspare-prime/lib/format_text/flags.c
--- old-hotspare-prime/lib/format_text/flags.c	2008-05-04 23:59:26.111995976 +0200
+++ new-hotspare-prime/lib/format_text/flags.c	2008-05-04 23:59:26.131995957 +0200
@@ -30,19 +30,20 @@ struct flag {
 static struct flag _vg_flags[] = {
 	{EXPORTED_VG, "EXPORTED"},
 	{RESIZEABLE_VG, "RESIZEABLE"},
-	{PARTIAL_VG, "PARTIAL"},
 	{PVMOVE, "PVMOVE"},
 	{LVM_READ, "READ"},
 	{LVM_WRITE, "WRITE"},
 	{CLUSTERED, "CLUSTERED"},
 	{SHARED, "SHARED"},
 	{PRECOMMITTED, NULL},
+	{PARTIAL_VG, NULL},
 	{0, NULL}
 };
 
 static struct flag _pv_flags[] = {
 	{ALLOCATABLE_PV, "ALLOCATABLE"},
 	{EXPORTED_VG, "EXPORTED"},
+	{MISSING_PV, "MISSING"},
 	{0, NULL}
 };
 
@@ -61,6 +62,8 @@ static struct flag _lv_flags[] = {
 	{SNAPSHOT, NULL},
 	{ACTIVATE_EXCL, NULL},
 	{CONVERTING, NULL},
+	{PARTIAL_LV, NULL},
+	{POSTORDER_FLAG, NULL},
 	{0, NULL}
 };
 
diff -rN -u -p old-hotspare-prime/lib/format_text/import_vsn1.c new-hotspare-prime/lib/format_text/import_vsn1.c
--- old-hotspare-prime/lib/format_text/import_vsn1.c	2008-05-04 23:59:26.103995761 +0200
+++ new-hotspare-prime/lib/format_text/import_vsn1.c	2008-05-04 23:59:26.131995957 +0200
@@ -790,7 +790,6 @@ static struct volume_group *_read_vg(str
 	dm_hash_destroy(pv_hash);
 
 	if (vg->status & PARTIAL_VG) {
-		// vg->status &= ~LVM_WRITE;
 		vg->status |= LVM_READ;
 	}
 
diff -rN -u -p old-hotspare-prime/lib/metadata/metadata.c new-hotspare-prime/lib/metadata/metadata.c
--- old-hotspare-prime/lib/metadata/metadata.c	2008-05-04 23:59:26.111995976 +0200
+++ new-hotspare-prime/lib/metadata/metadata.c	2008-05-04 23:59:26.119995703 +0200
@@ -1171,26 +1171,151 @@ int vgs_are_compatible(struct cmd_contex
 	return 1;
 }
 
+struct _lv_postorder_baton {
+	int (*fun)(struct logical_volume *lv, void *d);
+	void *d;
+};
+
+static int _lv_postorder_visit(struct logical_volume *,
+			       int (*fun)(struct logical_volume *, void *),
+			       void *);
+
+static int _lv_postorder_level(struct logical_volume *lv, void *d)
+{
+	struct _lv_postorder_baton *baton = d;
+	return _lv_postorder_visit(lv, baton->fun, baton->d);
+}
+
+static int _lv_each_dependency(struct logical_volume *lv,
+			       int (*fun)(struct logical_volume *lv, void *d),
+			       void *d)
+{
+	unsigned i, s;
+	struct lv_segment *lvseg;
+
+	list_iterate_items(lvseg, &lv->segments) {
+		struct logical_volume *deps[] = {
+			lvseg->log_lv, lvseg->origin, lvseg->cow };
+		for (i = 0; i < sizeof(deps) / sizeof(*deps); ++i) {
+			if (deps[i] && !fun(deps[i], d))
+				return 0;
+		}
+		for (s = 0; s < lvseg->area_count; ++s) {
+			if (seg_type(lvseg, s) == AREA_LV) {
+				if (!fun(seg_lv(lvseg, s), d))
+					return 0;
+			}
+		}
+	}
+
+	return 1;
+}
+
+static int _lv_postorder_cleanup(struct logical_volume *lv, void *d)
+{
+	if (!(lv->status & POSTORDER_FLAG))
+		return 1;
+
+	_lv_each_dependency(lv, _lv_postorder_cleanup, d);
+	lv->status &= ~POSTORDER_FLAG;
+	return 1;
+}
+
+static int _lv_postorder_visit(struct logical_volume *lv,
+			       int (*fun)(struct logical_volume *lv, void *d),
+			       void *d)
+{
+	struct _lv_postorder_baton baton;
+	int r;
+
+	if (lv->status & POSTORDER_FLAG)
+		return 1;
+	lv->status |= POSTORDER_FLAG;	/* visited; reset by _lv_postorder_cleanup */
+	baton.fun = fun;
+	baton.d = d;
+	r = _lv_each_dependency(lv, _lv_postorder_level, &baton);
+	if (r) {
+		r = fun(lv, d);
+		log_verbose("visited %s", lv->name);
+	}
+	return r;
+}
+
+/*
+ * This will walk the LV dependency graph in depth-first order and in the
+ * postorder, call a callback function "fun". The void *d is passed along all
+ * the calls. The callback may return zero to indicate an error and terminate
+ * the DFS walk. The error is propagated to return value of _lv_postorder.
+ */
+static int _lv_postorder(struct logical_volume *lv,
+			       int (*fun)(struct logical_volume *lv, void *d),
+			       void *d)
+{
+	int r;
+	r = _lv_postorder_visit(lv, fun, d);
+	_lv_postorder_cleanup(lv, 0);
+	return r;
+}
+
+struct _lv_mark_partial_baton {
+	int partial;
+};
+
+static int _lv_mark_partial_collect(struct logical_volume *lv, void *d)
+{
+	struct _lv_mark_partial_baton *baton = d;
+	if (lv->status & PARTIAL_LV)
+		baton->partial = 1;
+
+	return 1;
+}
+
+static int _lv_mark_partial_single(struct logical_volume *lv, void *d)
+{
+	int s;
+	struct _lv_mark_partial_baton baton;
+	struct lv_segment *lvseg;
+
+	baton.partial = 0;
+	_lv_each_dependency(lv, _lv_mark_partial_collect, &baton);
+
+	if (baton.partial)
+		lv->status |= PARTIAL_LV;
+
+	list_iterate_items(lvseg, &lv->segments) {
+		for (s = 0; s < lvseg->area_count; ++s) {
+			if (seg_type(lvseg, s) == AREA_PV) {
+				if (seg_pv(lvseg, s)->status & MISSING_PV)
+					lv->status |= PARTIAL_LV;
+			}
+		}
+	}
+
+	return 1;
+}
+
+static int _lv_mark_partial(struct logical_volume *lv)
+{
+	return _lv_postorder(lv, _lv_mark_partial_single, NULL);
+}
+
+/*
+ * Mark LVs with missing PVs using PARTIAL_LV status flag. The flag is
+ * propagated transitively, so LVs referencing other LVs are marked
+ * partial as well, if any of their referenced LVs are marked partial.
+ */
 int vg_consolidate_partial(struct volume_group *vg)
 {
 	struct physical_volume *pv;
+	struct logical_volume *lv;
+	struct lv_list *lvl;
 	struct pv_list *pvl;
 	struct pv_segment *peg;
-	// TODO: we need to walk all segments in the volume group,
-	// replacing any pointing to a MISSING_PV with error segments
-	list_iterate_items(pvl, &vg->pvs) {
-		char buffer[64] __attribute((aligned(8)));
-		pv = pvl->pv;
-		if (pv->status & MISSING_PV) {
-			if (!id_write_format(&pv->id, buffer, sizeof(buffer)))
-				log_verbose("Replacing missing PV segments with error segments.");
-			else
-				log_verbose("Replacing missing PV segments in %s with error segments.",
-					    buffer);
-		}
 
-		list_iterate_items(peg, &pv->segments) {
-		}
+	list_iterate_items(lvl, &vg->lvs) {
+		if (!_lv_mark_partial(lvl->lv))
+			return_0;
 	}
+	return 1;
 }
 
@@ -1280,9 +1405,9 @@ int vg_write(struct volume_group *vg)
 	if (!vg_validate(vg))
 		return_0;
 
-	if (vg->status & PARTIAL_VG) {
-		log_error("Cannot change metadata for partial volume group %s",
-			  vg->name);
+	if ((vg->status & PARTIAL_VG) && (!vg->cmd->handles_partial)) {
+		log_error("Cannot update volume group %s while physical "
+			  "volumes are missing", vg->name);
 		return 0;
 	}
 
@@ -1476,6 +1601,26 @@ static int _update_pv_list(struct list *
 	return 1;
 }
 
+static void _vg_mark_partial(struct volume_group *vg)
+{
+	struct pv_list *pvl;
+	list_iterate_items(pvl, &vg->pvs) {
+		if (pvl->pv->status & MISSING_PV)
+			vg->status |= PARTIAL_VG;
+	}
+}
+
+static int _vg_missing_pv_count(struct volume_group *vg)
+{
+	int ret = 0;
+	struct pv_list *pvl;
+	list_iterate_items(pvl, &vg->pvs) {
+		if (pvl->pv->status & MISSING_PV)
+			++ret;
+	}
+	return ret;
+}
+
 /* Caller sets consistent to 1 if it's safe for vg_read to correct
  * inconsistent metadata on disk (i.e. the VG write lock is held).
  * This guarantees only consistent metadata is returned unless PARTIAL_VG.
@@ -1566,7 +1711,8 @@ static struct volume_group *_vg_read(str
 
 	/* Ensure every PV in the VG was in the cache */
 	if (correct_vg) {
-		if (list_size(&correct_vg->pvs) != list_size(pvids)) {
+		if (list_size(&correct_vg->pvs) != list_size(pvids)
+		    + _vg_missing_pv_count(correct_vg)) {
 			log_debug("Cached VG %s had incorrect PV list",
 				  vgname);
 
@@ -1575,6 +1721,8 @@ static struct volume_group *_vg_read(str
 			else
 				correct_vg = NULL;
 		} else list_iterate_items(pvl, &correct_vg->pvs) {
+			if (pvl->pv->status & MISSING_PV)
+				continue;
 			if (!str_list_match_item(pvids, pvl->pv->dev->pvid)) {
 				log_debug("Cached VG %s had incorrect PV list",
 					  vgname);
@@ -1704,6 +1852,8 @@ static struct volume_group *_vg_read(str
 		}
 	}
 
+	_vg_mark_partial(correct_vg);
+
 	if (correct_vg->status & PARTIAL_VG) {
 		if (*consistent)
 			vg_consolidate_partial(correct_vg);
diff -rN -u -p old-hotspare-prime/lib/metadata/metadata-exported.h new-hotspare-prime/lib/metadata/metadata-exported.h
--- old-hotspare-prime/lib/metadata/metadata-exported.h	2008-05-04 23:59:26.111995976 +0200
+++ new-hotspare-prime/lib/metadata/metadata-exported.h	2008-05-04 23:59:26.119995703 +0200
@@ -71,6 +71,12 @@ struct pv_segment;
 //#define PRECOMMITTED		0x00200000U	/* VG - internal use only */
 #define CONVERTING		0x00400000U	/* LV */
 
+// set on any missing PVs
+#define MISSING_PV              0x00800000U
+#define PARTIAL_LV              0x01000000U
+
+#define POSTORDER_FLAG          0x02000000U
+
 #define LVM_READ              	0x00000100U	/* LV VG */
 #define LVM_WRITE             	0x00000200U	/* LV VG */
 #define CLUSTERED         	0x00000400U	/* VG */
@@ -88,8 +94,6 @@ struct pv_segment;
 #define FMT_UNLIMITED_STRIPESIZE 0x00000100U	/* Unlimited stripe size? */
 #define FMT_RESTRICTED_READAHEAD 0x00000200U	/* Readahead restricted to 2-120? */
 
-#define MISSING_PV               0x00000400U
-
 /* LVM2 external library flags */
 #define CORRECT_INCONSISTENT    0x00000001U /* Correct inconsistent metadata */
 #define FAIL_INCONSISTENT       0x00000002U /* Fail if metadata inconsistent */

Yours,
   Petr.

-- 
Peter Rockai | me()mornfall!net | prockai()redhat!com
 http://blog.mornfall.net | http://web.mornfall.net

"In My Egotistical Opinion, most people's C programs should be
 indented six feet downward and covered with dirt."
     -- Blair P. Houghton on the subject of C program indentation
