[fwd] md raid1 chokes when one disk is removed
Danny Howard
dannyman at toldme.com
Fri Nov 11 23:25:53 UTC 2005
On Fri, Nov 11, 2005 at 02:25:35PM -0800, Rick Stevens wrote:
> On Fri, 2005-11-11 at 14:16 -0800, Danny Howard wrote:
> > Sweet! I can "fail" a disk and remove it thus:
> > mdadm --fail /dev/md0 /dev/sdb1
> > mdadm --fail /dev/md1 /dev/sdb2
> > mdadm --fail /dev/md2 /dev/sdb3
> > [ ... physically remove disk, system is fine ... ]
> > [ ... put the disk back in, system is fine ... ]
> > mdadm --remove /dev/md0 /dev/sdb1
> > mdadm --add /dev/md0 /dev/sdb1
> > mdadm --remove /dev/md1 /dev/sdb2
> > mdadm --add /dev/md1 /dev/sdb2
> > mdadm --remove /dev/md2 /dev/sdb3
> > mdadm --add /dev/md2 /dev/sdb3
> > [ ... md2 does a rebuild, but /boot and <swap> are fine -- nice! ... ]
>
> The RAID should go into degraded mode and continue to run. If you
> pulled the disk out while the system was powered up AND the system isn't
> hot swap-compatible (and most built-in SATA stuff isn't), then you've
> confused the SCSI bus badly and I'd be amazed if it worked at all after
> that. Your error messages indicate that's the case here.
Rick,
Given OS support, the drive is hot-swappable. This works fine with
FreeBSD, and it works fine with Linux if I tell md that the disk has
failed. If the disk fails without my telling md, then md reacts
badly.
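
For the archives, the swap procedure above can be scripted as a dry run
that only prints the mdadm commands in order. This is a sketch against my
layout (md0/md1/md2 on sdb1/sdb2/sdb3 as shown in the quoted text); the
swap_plan helper name and the array/partition pairs are assumptions you
would adjust for your own setup:

```shell
#!/bin/sh
# Print the mdadm commands needed to cleanly swap out one disk.
# Dry run only: nothing is executed, each command is echoed so you
# can review the order (fail everything first, then remove/add).
swap_plan() {
    disk=$1
    # md-array:partition-number pairs; matches the layout in this thread
    set -- md0:1 md1:2 md2:3
    for pair in "$@"; do
        md=${pair%:*}
        part=${pair#*:}
        echo "mdadm --fail /dev/$md /dev/${disk}${part}"
    done
    echo "# ... physically replace the disk, then:"
    for pair in "$@"; do
        md=${pair%:*}
        part=${pair#*:}
        echo "mdadm --remove /dev/$md /dev/${disk}${part}"
        echo "mdadm --add /dev/$md /dev/${disk}${part}"
    done
}

swap_plan sdb
```

Piping the output through `sh` would actually run it, but reviewing the
plan first is the whole point when the arrays hold /boot and swap.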
> If, however, the SATA drives were in a hot swap-compatible enclosure and
> you see the same problem, then something else is wrong and we'd need to
> look at that a bit more closely.
What do you suggest?
Sincerely,
-danny
--
http://dannyman.toldme.com/