
[lvm-devel] [PATCH] (5/6) lvconvert --repair



Not many changes have happened here either. However, given the changes in the
underlying code (mostly in the foundation patches), I no longer consider this
experimental, and I intend to have it merged as well. Maybe not right away
alongside the rest of the patches, as this is apparently the least scrutinised
one, but soon enough, since it opens the path for important dmeventd
functionality (that is, hotspare substitution of mirror devices).

(This is the proposed new way for dmeventd to handle partial mirror failures.
As before, it removes lost legs and the disk log; in addition, if space is
available, it replaces them with freshly allocated areas. This needs to be
more controlled eventually, but I think what we have here is a good start.)
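
To make the two-step idea concrete, here is a toy model of the repair flow
(drop the failed legs first, then re-add as many as free space allows). It is
only a sketch: struct toy_mirror and toy_repair() are hypothetical names made
up for illustration and are not part of LVM; the real code works on
lv_segment areas and PV lists instead of plain counters.

```c
#include <assert.h>

/* Hypothetical toy model, not LVM code: a mirror reduced to counters. */
struct toy_mirror {
	unsigned legs;        /* legs currently in the mirror */
	unsigned failed;      /* how many of those legs sit on missing PVs */
	unsigned free_areas;  /* spare areas that could hold a new leg */
};

/*
 * Pass 1: remove the failed legs (downconversion).
 * Pass 2: add legs back until the original count is reached or the
 *         spare space runs out (upconversion).
 * Returns the number of legs that were actually replaced.
 */
static unsigned toy_repair(struct toy_mirror *m)
{
	unsigned wanted, replaced = 0;

	assert(m && m->failed <= m->legs);
	wanted = m->legs;

	/* pass 1: get rid of everything that is gone */
	m->legs -= m->failed;
	m->failed = 0;

	/* pass 2: replace what we can from freshly allocated areas */
	while (m->legs < wanted && m->free_areas > 0) {
		m->legs++;
		m->free_areas--;
		replaced++;
	}

	return replaced;
}
```

For a 3-legged mirror with one failed leg and two spare areas, toy_repair()
returns 1 and leaves the mirror with 3 legs and one spare area. With no spare
areas it degrades to the old behaviour and just removes the failed leg.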

diff -rN -p -u old-hotspare-everything/dmeventd/mirror/dmeventd_mirror.c new-hotspare-everything/dmeventd/mirror/dmeventd_mirror.c
--- old-hotspare-everything/dmeventd/mirror/dmeventd_mirror.c   2008-07-29 15:26:44.161493573 +0200
+++ new-hotspare-everything/dmeventd/mirror/dmeventd_mirror.c   2008-07-29 15:26:44.185490029 +0200
@@ -152,7 +152,7 @@ static int _remove_failed_devices(const 
        }
 
        /* FIXME Is any sanity-checking required on %s? */
-       if (CMD_SIZE <= snprintf(cmd_str, CMD_SIZE, "vgreduce --config devices{ignore_suspended_devices=1} --removemissing %s", vg)) {
+       if (CMD_SIZE <= snprintf(cmd_str, CMD_SIZE, "lvconvert --config devices{ignore_suspended_devices=1} --repair '%s/%s'", vg, lv)) {
                /* this error should be caught above, but doesn't hurt to check again */
                syslog(LOG_ERR, "Unable to form LVM command: Device name too long");
                dm_pool_empty(_mem_pool);  /* FIXME: not safe with multiple threads */
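
The length check in the hunk above relies on snprintf() returning the length
the full string would have had, so a return value of CMD_SIZE or more means
the command was truncated. A standalone sketch of the same check follows;
form_repair_cmd() is a hypothetical helper and the CMD_SIZE value of 256 is an
assumption chosen for the example, not the value dmeventd uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define CMD_SIZE 256	/* assumed buffer size, for illustration only */

/*
 * Hypothetical helper: build the repair command for vg/lv in buf.
 * Returns 1 on success, 0 if the command would not fit (or on an
 * snprintf error).
 */
static int form_repair_cmd(char *buf, size_t size,
			   const char *vg, const char *lv)
{
	int n;

	assert(buf && size > 0 && vg && lv);

	n = snprintf(buf, size,
		     "lvconvert --config devices{ignore_suspended_devices=1}"
		     " --repair '%s/%s'", vg, lv);

	/* n is the untruncated length; n >= size means it did not fit */
	return n >= 0 && (size_t) n < size;
}
```

Called with a CMD_SIZE buffer and the names "vg0" and "lv0", this returns 1;
called with a 16-byte buffer, it returns 0 and the caller should refuse to run
the truncated command. The sketch does not address the FIXME about
sanity-checking the names themselves.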

The actual patch for lvconvert follows.

Tue Jul 29 15:24:18 CEST 2008  me mornfall net
  tagged handles_missing_pvs base
Mon Jul 28 15:33:42 CEST 2008  me mornfall net
  tagged handles_missing_pvs base
Mon Jul 14 17:56:53 CEST 2008  me mornfall net
  * Remove unused label.
Mon Jul 14 17:56:41 CEST 2008  me mornfall net
  * Fix another gcc warning.
Mon Jul 14 16:39:44 CEST 2008  me mornfall net
  * Better integer types.
Fri Jul 11 15:14:43 CEST 2008  me mornfall net
  * lvconvert: handles_partial -> handles_missing_pvs.
Fri Jul 11 14:50:21 CEST 2008  me mornfall net
  tagged hotspare: lvconvert --repair 1
Fri Jul 11 14:49:00 CEST 2008  me mornfall net
  tagged hotspare: handles_missing_pvs 1
Thu Jul  3 18:46:07 CEST 2008  me mornfall net
  * Resolve conflict in lvconvert.c.
Sun Jun  8 17:50:19 CEST 2008  me mornfall net
  * Note down that the PV argument list in lvconvert is sort of counterintuitive.
Sun Jun  8 17:48:50 CEST 2008  me mornfall net
  * Fix counters used in failed device replacement in lvconvert --repair.
Sun Jun  8 17:44:48 CEST 2008  me mornfall net
  * Do not try to upconvert a linear LV we just obtained from downconversion...
  
  In other words, a forgotten "else" in there.
Sun Jun  8 17:44:15 CEST 2008  me mornfall net
  * Use UUID to probe PV equality. Do not dereference null pointers.
Sun Jun  8 15:23:44 CEST 2008  me mornfall net
  * Handle missing PVs in _is_mirror_image_removable.
Sun Jun  8 15:09:13 CEST 2008  me mornfall net
  * Implement device replacement in lvconvert --repair.
Sun May  4 23:54:35 CEST 2008  me mornfall net
  * Add a restart-loop encoded as a goto to lvconvert.
  
  Yes, it could have been done as a do-while loop just as well, but:
  a) lvm seems to be goto-happy
  b) do-while would indent a bunch of stuff out of my screen
Thu Apr 24 22:14:09 CEST 2008  me mornfall net
  * Exclude MISSING_PVs from create_pv_list in allocatable_only case.
Mon Mar 31 18:36:56 CEST 2008  me mornfall net
  * Factor the 2 downconversion cases to one.
Mon Mar 31 18:36:23 CEST 2008  me mornfall net
  * Move the segment count check and adjust comment (lvconvert).
Sat Mar 29 19:36:44 CET 2008  me mornfall net
  * More code shuffling (this however introduces some problems).
Sat Mar 29 19:22:11 CET 2008  me mornfall net
  * Shuffle lvconvert code a little to make remove-and-add in one go easier.
Fri Feb 15 09:52:45 CET 2008  me mornfall net
  * Resolve conflicts.
Fri Feb 15 09:51:11 CET 2008  me mornfall net
  * First go at lvconvert --repair implementation.
Fri Feb 15 09:50:53 CET 2008  me mornfall net
  * Add repair_ARG to lvconvert's argument list.
Thu Feb 14 16:20:46 CET 2008  me mornfall net
  * Add --repair switch to lvconvert for fixups in partial VGs.
diff -rN -p -u old-lvconvert-repair/lib/metadata/metadata.c new-lvconvert-repair/lib/metadata/metadata.c
--- old-lvconvert-repair/lib/metadata/metadata.c	2008-07-29 15:26:47.169490843 +0200
+++ new-lvconvert-repair/lib/metadata/metadata.c	2008-07-29 15:26:47.253493875 +0200
@@ -1192,7 +1192,8 @@ static int _lv_each_dependency(struct lo
 			       int (*fn)(struct logical_volume *lv, void *data),
 			       void *data)
 {
-	int i, s;
+	size_t i;
+	uint32_t s;
 	struct lv_segment *lvseg;
 
 	struct logical_volume *deps[] = {
diff -rN -p -u old-lvconvert-repair/lib/metadata/mirror.c new-lvconvert-repair/lib/metadata/mirror.c
--- old-lvconvert-repair/lib/metadata/mirror.c	2008-07-29 15:26:47.177493363 +0200
+++ new-lvconvert-repair/lib/metadata/mirror.c	2008-07-29 15:26:47.253493875 +0200
@@ -394,7 +394,12 @@ static int _is_mirror_image_removable(st
 
 			pv_found = 0;
 			list_iterate_items(pvl, removable_pvs) {
-				if (pv->dev->dev == pvl->pv->dev->dev) {
+				if (id_equal(&pv->id, &pvl->pv->id)) {
+					pv_found = 1;
+					break;
+				}
+				if (pvl->pv->dev && pv->dev &&
+				    pv->dev->dev == pvl->pv->dev->dev) {
 					pv_found = 1;
 					break;
 				}
diff -rN -p -u old-lvconvert-repair/tools/args.h new-lvconvert-repair/tools/args.h
--- old-lvconvert-repair/tools/args.h	2008-07-29 15:26:47.185491623 +0200
+++ new-lvconvert-repair/tools/args.h	2008-07-29 15:26:47.225494587 +0200
@@ -49,6 +49,7 @@ arg(nosync_ARG, '\0', "nosync", NULL, 0)
 arg(resync_ARG, '\0', "resync", NULL, 0)
 arg(corelog_ARG, '\0', "corelog", NULL, 0)
 arg(mirrorlog_ARG, '\0', "mirrorlog", string_arg, 0)
+arg(repair_ARG, '\0', "repair", NULL, 0)
 arg(monitor_ARG, '\0', "monitor", yes_no_arg, 0)
 arg(config_ARG, '\0', "config", string_arg, 0)
 arg(trustcache_ARG, '\0', "trustcache", NULL, 0)
diff -rN -p -u old-lvconvert-repair/tools/commands.h new-lvconvert-repair/tools/commands.h
--- old-lvconvert-repair/tools/commands.h	2008-07-29 15:26:47.185491623 +0200
+++ new-lvconvert-repair/tools/commands.h	2008-07-29 15:26:47.225494587 +0200
@@ -94,6 +94,7 @@ xx(lvconvert,
    0,
    "lvconvert "
    "[-m|--mirrors Mirrors [{--mirrorlog {disk|core}|--corelog}]]\n"
+   "\t[--repair]\n"
    "\t[-R|--regionsize MirrorLogRegionSize]\n"
    "\t[--alloc AllocationPolicy]\n"
    "\t[-b|--background]\n"
@@ -115,7 +116,8 @@ xx(lvconvert,
    "\tOriginalLogicalVolume[Path] SnapshotLogicalVolume[Path]\n",
 
    alloc_ARG, background_ARG, chunksize_ARG, corelog_ARG, interval_ARG,
-   mirrorlog_ARG, mirrors_ARG, regionsize_ARG, snapshot_ARG, test_ARG, zero_ARG)
+   mirrorlog_ARG, mirrors_ARG, regionsize_ARG, repair_ARG, snapshot_ARG,
+   test_ARG, zero_ARG)
 
 xx(lvcreate,
    "Create a logical volume",
diff -rN -p -u old-lvconvert-repair/tools/lvconvert.c new-lvconvert-repair/tools/lvconvert.c
--- old-lvconvert-repair/tools/lvconvert.c	2008-07-29 15:26:47.181490188 +0200
+++ new-lvconvert-repair/tools/lvconvert.c	2008-07-29 15:26:47.217492625 +0200
@@ -364,6 +364,61 @@ static int _insert_lvconvert_layer(struc
 	return 1;
 }
 
+static int _area_missing(struct lv_segment *lvseg, int s)
+{
+	if (seg_type(lvseg, s) == AREA_LV) {
+		if (seg_lv(lvseg, s)->status & PARTIAL_LV)
+			return 1;
+	} else if (seg_type(lvseg, s) == AREA_PV) {
+		if (seg_pv(lvseg, s)->status & MISSING_PV)
+			return 1;
+	}
+	return 0;
+}
+
+/* FIXME we want to handle mirror stacks here... */
+static int _count_failed_mirrors(struct logical_volume *lv)
+{
+	struct lv_segment *lvseg;
+	int ret = 0;
+	int s;
+	list_iterate_items(lvseg, &lv->segments) {
+		if (!seg_is_mirrored(lvseg))
+			return -1;
+		for(s = 0; s < lvseg->area_count; ++s) {
+			if (_area_missing(lvseg, s))
+				++ ret;
+		}
+	}
+	return ret;
+}
+
+static struct list *_failed_pv_list(struct cmd_context *cmd,
+				    struct volume_group *vg)
+{
+	struct list *r;
+	struct pv_list *pvl, *new_pvl;
+
+	if (!(r = dm_pool_alloc(cmd->mem, sizeof(*r)))) {
+		log_error("Allocation of list failed");
+		return_0;
+	}
+
+	list_init(r);
+	list_iterate_items(pvl, &vg->pvs) {
+		if (!(pvl->pv->status & MISSING_PV))
+			continue;
+
+		if (!(new_pvl = dm_pool_alloc(cmd->mem, sizeof(*new_pvl)))) {
+			log_err("Unable to allocate physical volume list.");
+			return 0;
+		}
+		new_pvl->pv = pvl->pv;
+		list_add(r, &new_pvl->list);
+	}
+	return r;
+}
+
 /* walk down the stacked mirror LV to the original mirror LV */
 static struct logical_volume *_original_lv(struct logical_volume *lv)
 {
@@ -383,17 +438,26 @@ static int lvconvert_mirrors(struct cmd_
 	const char *mirrorlog;
 	unsigned corelog = 0;
 	struct logical_volume *original_lv;
+	struct logical_volume *log_lv;
+	int failed_mirrors = 0, failed_log = 0;
+	struct list *old_pvh;
 
 	seg = first_seg(lv);
 	existing_mirrors = lv_mirror_count(lv);
 
 	/* If called with no argument, try collapsing the resync layers */
 	if (!arg_count(cmd, mirrors_ARG) && !arg_count(cmd, mirrorlog_ARG) &&
-	    !arg_count(cmd, corelog_ARG) && !arg_count(cmd, regionsize_ARG)) {
+	    !arg_count(cmd, corelog_ARG) && !arg_count(cmd, regionsize_ARG) &&
+	    !arg_count(cmd, repair_ARG)) {
 		lp->need_polling = 1;
 		return 1;
 	}
 
+	if (arg_count(cmd, mirrors_ARG) && arg_count(cmd, repair_ARG)) {
+		log_error("You can only use one of -m, --repair.");
+		return 0;
+	}
+
 	/*
 	 * Adjust required number of mirrors
 	 *
@@ -411,38 +475,56 @@ static int lvconvert_mirrors(struct cmd_
 	else
 		lp->mirrors += 1;
 
-	/*
-	 * Did the user try to subtract more legs than available?
-	 */
-	if (lp->mirrors < 1) {
-		log_error("Logical volume %s only has %" PRIu32 " mirrors.",
-			  lv->name, existing_mirrors);
-		return 0;
-	}
-
-	/*
-	 * Adjust log type
-	 */
-	if (arg_count(cmd, corelog_ARG))
-		corelog = 1;
-
-	mirrorlog = arg_str_value(cmd, mirrorlog_ARG,
-				  corelog ? "core" : DEFAULT_MIRRORLOG);
-	if (!strcmp("disk", mirrorlog)) {
-		if (corelog) {
-			log_error("--mirrorlog disk and --corelog "
-				  "are incompatible");
+	if (arg_count(cmd,repair_ARG)) {
+		cmd->handles_missing_pvs = 1;
+		lp->need_polling = 0;
+		if (!(lv->status & PARTIAL_LV)) {
+			log_error("The mirror is consistent, nothing to repair.");
+			return_0;
+		}
+		failed_mirrors = _count_failed_mirrors(lv);
+		lp->mirrors -= failed_mirrors;
+		log_error("Mirror counts: %d/%d failed.",
+			  failed_mirrors, existing_mirrors);
+		old_pvh = lp->pvh;
+		lp->pvh = _failed_pv_list(cmd, lv->vg);
+		log_lv=first_seg(lv)->log_lv;
+		if (!log_lv || log_lv->status & PARTIAL_LV)
+			failed_log = corelog = 1;
+	} else {
+		/*
+		 * Did the user try to subtract more legs than available?
+		 */
+		if (lp->mirrors < 1) {
+			log_error("Logical volume %s only has %" PRIu32 " mirrors.",
+				  lv->name, existing_mirrors);
+			return 0;
+		}
+		
+		/*
+		 * Adjust log type
+		 */
+		if (arg_count(cmd, corelog_ARG))
+			corelog = 1;
+		
+		mirrorlog = arg_str_value(cmd, mirrorlog_ARG,
+					  corelog ? "core" : DEFAULT_MIRRORLOG);
+		if (!strcmp("disk", mirrorlog)) {
+			if (corelog) {
+				log_error("--mirrorlog disk and --corelog "
+					  "are incompatible");
+				return 0;
+			}
+			corelog = 0;
+		} else if (!strcmp("core", mirrorlog))
+			corelog = 1;
+		else {
+			log_error("Unknown mirrorlog type: %s", mirrorlog);
 			return 0;
 		}
-		corelog = 0;
-	} else if (!strcmp("core", mirrorlog))
-		corelog = 1;
-	else {
-		log_error("Unknown mirrorlog type: %s", mirrorlog);
-		return 0;
-	}
 
-	log_verbose("Setting logging type to %s", mirrorlog);
+		log_verbose("Setting logging type to %s", mirrorlog);
+	}
 
 	/*
 	 * Region size must not change on existing mirrors
@@ -455,6 +537,18 @@ static int lvconvert_mirrors(struct cmd_
 	}
 
 	/*
+	 * FIXME This check used to precede mirror->mirror conversion
+	 * but didn't affect mirror->linear or linear->mirror. I do
+	 * not understand what is its intention, in fact.
+	 */
+	if (list_size(&lv->segments) != 1) {
+		log_error("Logical volume %s has multiple "
+			  "mirror segments.", lv->name);
+		return 0;
+	}
+	
+ restart:
+	/*
 	 * Converting from mirror to linear
 	 */
 	if ((lp->mirrors == 1)) {
@@ -463,17 +557,22 @@ static int lvconvert_mirrors(struct cmd_
 				  lv->name);
 			return 1;
 		}
-
-		if (!lv_remove_mirrors(cmd, lv, existing_mirrors - 1, 1,
-				       lp->pv_count ? lp->pvh : NULL, 0))
-			return_0;
-		goto commit_changes;
 	}
 
 	/*
-	 * Converting from linear to mirror
+	 * Downconversion.
 	 */
-	if (!(lv->status & MIRRORED)) {
+	if (lp->mirrors < existing_mirrors) {
+		/* Reduce number of mirrors */
+		if (!lv_remove_mirrors(cmd, lv, existing_mirrors - lp->mirrors,
+				       (corelog || lp->mirrors == 1) ? 1U : 0U,
+				       lp->pv_count ? lp->pvh : NULL, 0))
+			return_0;
+	} else if (!(lv->status & MIRRORED)) {
+		/*
+		 * Converting from linear to mirror
+		 */
+	
 		/* FIXME Share code with lvcreate */
 
 		/* FIXME Why is this restriction here?  Fix it! */
@@ -484,6 +583,9 @@ static int lvconvert_mirrors(struct cmd_
 			}
 		}
 
+		// FIXME should we give not only lp->pvh, but also all PVs
+		// currently taken by the mirror? Would make more sense from
+		// user perspective.
 		if (!lv_add_mirrors(cmd, lv, lp->mirrors - 1, 1,
 				    adjusted_mirror_region_size(
 						lv->vg->extent_size,
@@ -494,46 +596,7 @@ static int lvconvert_mirrors(struct cmd_
 			return_0;
 		if (lp->wait_completion)
 			lp->need_polling = 1;
-		goto commit_changes;
-	}
-
-	/*
-	 * Converting from mirror to mirror with different leg count,
-	 * or different log type.
-	 */
-	if (list_size(&lv->segments) != 1) {
-		log_error("Logical volume %s has multiple "
-			  "mirror segments.", lv->name);
-		return 0;
-	}
-
-	if (lp->mirrors == existing_mirrors) {
-		/*
-		 * Convert Mirror log type
-		 */
-		original_lv = _original_lv(lv);
-		if (!first_seg(original_lv)->log_lv && !corelog) {
-			if (!add_mirror_log(cmd, original_lv, 1,
-					    adjusted_mirror_region_size(
-							lv->vg->extent_size,
-							lv->le_count,
-							lp->region_size),
-					    lp->pvh, lp->alloc))
-				return_0;
-		} else if (first_seg(original_lv)->log_lv && corelog) {
-			if (!remove_mirror_log(cmd, original_lv,
-					       lp->pv_count ? lp->pvh : NULL))
-				return_0;
-		} else {
-			/* No change */
-			log_error("Logical volume %s already has %"
-				  PRIu32 " mirror(s).", lv->name,
-				  lp->mirrors - 1);
-			if (lv->status & CONVERTING)
-				lp->need_polling = 1;
-			return 1;
-		}
-	} else if (lp->mirrors > existing_mirrors) {
+	} else if (lp->mirrors > existing_mirrors || failed_mirrors) {
 		if (lv->status & MIRROR_NOTSYNCED) {
 			log_error("Not adding mirror to mirrored LV "
 				  "without initial resync");
@@ -575,15 +638,47 @@ static int lvconvert_mirrors(struct cmd_
 			return_0;
 		lv->status |= CONVERTING;
 		lp->need_polling = 1;
-	} else {
-		/* Reduce number of mirrors */
-		if (!lv_remove_mirrors(cmd, lv, existing_mirrors - lp->mirrors,
-				       corelog ? 1U : 0U,
-				       lp->pv_count ? lp->pvh : NULL, 0))
-			return_0;
 	}
 
-commit_changes:
+	if (lp->mirrors == existing_mirrors) {
+		/*
+		 * Convert Mirror log type
+		 */
+		original_lv = _original_lv(lv);
+		if (!first_seg(original_lv)->log_lv && !corelog) {
+			if (!add_mirror_log(cmd, original_lv, 1,
+					    adjusted_mirror_region_size(
+							lv->vg->extent_size,
+							lv->le_count,
+							lp->region_size),
+					    lp->pvh, lp->alloc))
+				return_0;
+		} else if (first_seg(original_lv)->log_lv && corelog) {
+			if (!remove_mirror_log(cmd, original_lv,
+					       lp->pv_count ? lp->pvh : NULL))
+				return_0;
+		} else {
+			/* No change */
+			log_error("Logical volume %s already has %"
+				  PRIu32 " mirror(s).", lv->name,
+				  lp->mirrors - 1);
+			if (lv->status & CONVERTING)
+				lp->need_polling = 1;
+			return 1;
+		}
+	}
+
+	if (failed_log || failed_mirrors) {
+		lp->pvh = old_pvh;
+		if (failed_log)
+			failed_log = corelog = 0;
+		lp->mirrors += failed_mirrors;
+		failed_mirrors = 0;
+		existing_mirrors = lv_mirror_count(lv);
+		// now replace missing devices
+		goto restart;
+	}
+
 	log_very_verbose("Updating logical volume \"%s\" on disk(s)", lv->name);
 
 	if (!vg_write(lv->vg))
diff -rN -p -u old-lvconvert-repair/tools/toollib.c new-lvconvert-repair/tools/toollib.c
--- old-lvconvert-repair/tools/toollib.c	2008-07-29 15:26:47.181490188 +0200
+++ new-lvconvert-repair/tools/toollib.c	2008-07-29 15:26:47.225494587 +0200
@@ -1051,6 +1051,11 @@ static int _create_pv_entry(struct dm_po
 		return 1;
 	}
 
+	if (allocatable_only && (pvl->pv->status & MISSING_PV)) {
+		log_error("Physical volume %s is missing", pvname);
+		return 1;
+	}
+
 	if (allocatable_only &&
 	    (pvl->pv->pe_count == pvl->pv->pe_alloc_count)) {
 		log_err("No free extents on physical volume \"%s\"", pvname);
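
As a side note on the mirror.c hunk: a missing PV has no device attached, so
comparing device numbers alone would dereference a null pointer. The idea is
to compare identifiers first and to fall back to device numbers only when both
device pointers are set. A self-contained sketch of that ordering follows; the
toy_* types and TOY_ID_LEN are hypothetical stand-ins and are not the LVM
structures.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_ID_LEN 32	/* assumed identifier length, for illustration */

/* Hypothetical stand-ins for the device and PV structures. */
struct toy_dev {
	unsigned long devno;
};

struct toy_pv {
	char id[TOY_ID_LEN];	/* identifier, present even if the PV is missing */
	struct toy_dev *dev;	/* NULL when the PV is missing */
};

/* Returns 1 if a and b refer to the same PV, 0 otherwise. */
static int toy_pv_same(const struct toy_pv *a, const struct toy_pv *b)
{
	assert(a && b);

	/* the identifier is always available, so try it first */
	if (!memcmp(a->id, b->id, TOY_ID_LEN))
		return 1;

	/* device numbers are only usable when both devices exist */
	if (a->dev && b->dev && a->dev->devno == b->dev->devno)
		return 1;

	return 0;
}
```

Two entries with the same identifier compare equal even if one of them has no
device, and two entries with different identifiers and no devices compare
unequal without touching a null pointer.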

Yours,
   Petr.

-- 
Peter Rockai | me()mornfall!net | prockai()redhat!com
 http://blog.mornfall.net | http://web.mornfall.net

"In My Egotistical Opinion, most people's C programs should be
 indented six feet downward and covered with dirt."
     -- Blair P. Houghton on the subject of C program indentation
