[linux-lvm] Restore LVM2 RAID-1 without Reboot

Shi Jin jinzishuai at gmail.com
Wed Nov 6 16:25:23 UTC 2013


Hi there,

I have set up a RAID-1 between two PVs on two different hard disks in a
RHEL-6 environment.
I am having a problem restoring the broken mirror without rebooting the OS.

Here is how to reproduce the problem.
1. First, I set up a RAID-1 LV that looks like this:
[root at shi-rhel63 home]# lvs -a -o +seg_pe_ranges|grep home
  lv_home             vg_root rwi-aom-  1.50g 100.00 lv_home_rimage_0:0-47 lv_home_rimage_1:0-47
  [lv_home_rimage_0]  vg_root iwi-aor-  1.50g        /dev/sda2:192-239
  [lv_home_rimage_1]  vg_root iwi-aor-  1.50g        /dev/sdb2:34-81
  [lv_home_rmeta_0]   vg_root ewi-aor- 32.00m        /dev/sda2:177-177
  [lv_home_rmeta_1]   vg_root ewi-aor- 32.00m        /dev/sdb2:33-33
[root at shi-rhel63 home]# pvs
  PV         VG      Fmt  Attr PSize  PFree
  /dev/sda2  vg_root lvm2 a--  17.84g 224.00m
  /dev/sdb2  vg_root lvm2 a--  18.84g   1.22g
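As a quick sanity check on the listing above: each rimage leg spans 48 extents (192-239 and 34-81, inclusive), and the one-extent rmeta LVs show the PE size is 32 MiB, which matches the reported 1.50g LV size. This is just shell arithmetic, nothing LVM-specific:

```shell
# Extent arithmetic from the lvs output above: each rimage leg covers
# 48 physical extents, and the one-extent rmeta LV is 32.00m, so the
# PE size is 32 MiB.
extents=$((239 - 192 + 1))      # 48 extents per mirror leg
size_mib=$((extents * 32))      # 48 * 32 MiB
echo "${extents} extents x 32 MiB = ${size_mib} MiB"
```

which comes out at 1536 MiB = 1.50 GiB, as expected.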

I can test the mirror by performing a dd write and watching iostat on both
physical disks:
[root at shi-rhel63 home]# dd if=/dev/zero of=test bs=1M count=10000 &
[1] 23388
[root at shi-rhel63 home]# iostat -x 1 sda sdb -m
Linux 2.6.32-279.el6.x86_64 (shi-rhel63) 06/11/13 _x86_64_ (1 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.06    0.07    0.35    0.32    0.00   99.20

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.05     0.78    0.38    0.14     0.00     0.00    22.14     0.01   11.07   5.63   0.29
sdb               0.11     0.79    0.14    0.23     0.00     0.00    27.94     0.00    5.60   3.71   0.14

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    7.14   92.86    0.00    0.00

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00  5702.04    1.02   56.12     0.01    22.46   805.25    50.75  911.66  17.86 102.04
sdb               0.00  5173.47    0.00   71.43     0.00    20.42   585.43     1.21   16.97   3.44  24.59

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    8.08   91.92    0.00    0.00

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00  5131.31    0.00   52.53     0.00    19.71   768.50    51.39  894.83  19.23 101.01
sdb               0.00  5121.21    0.00   69.70     0.00    18.20   534.70     1.15   15.96   3.41  23.74

As you can see, sda and sdb receive similar write I/O, so the mirror is in
fact working.
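For longer runs it is easier to compare the two legs by pulling out just the wMB/s column (field 7 of the `iostat -x -m` device lines) with awk; here applied to a pasted sample of the output above:

```shell
# Extract device name and wMB/s (7th field of `iostat -x -m` device lines)
# from a captured sample of the run above, to compare the two mirror legs.
awk '/^sd[ab] / { printf "%s writes %.2f MB/s\n", $1, $7 }' <<'EOF'
sda               0.00  5702.04    1.02   56.12     0.01    22.46   805.25    50.75  911.66  17.86 102.04
sdb               0.00  5173.47    0.00   71.43     0.00    20.42   585.43     1.21   16.97   3.44  24.59
EOF
```

This prints "sda writes 22.46 MB/s" and "sdb writes 20.42 MB/s", i.e. both legs are taking comparable write traffic.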

2. Now I simulate a disk failure on sdb by removing the disk (I use VMware,
so this is easy). Of course, the mirror is now broken, as shown below:

[root at shi-rhel63 home]# lvs -a -o +seg_pe_ranges|grep home
  Couldn't find device with uuid pnsMYs-Ce4t-9KYR-3Zfs-GItC-k5SZ-VidQ30.
  lv_home             vg_root rwi-aom-  1.50g 100.00 lv_home_rimage_0:0-47 lv_home_rimage_1:0-47
  [lv_home_rimage_0]  vg_root iwi-aor-  1.50g        /dev/sda2:192-239
  [lv_home_rimage_1]  vg_root iwi-aor-  1.50g        unknown device:34-81
  [lv_home_rmeta_0]   vg_root ewi-aor- 32.00m        /dev/sda2:177-177
  [lv_home_rmeta_1]   vg_root ewi-aor- 32.00m        unknown device:33-33
[root at shi-rhel63 home]# pvs
  Couldn't find device with uuid pnsMYs-Ce4t-9KYR-3Zfs-GItC-k5SZ-VidQ30.
  PV             VG      Fmt  Attr PSize  PFree
  /dev/sda2      vg_root lvm2 a--  17.84g 224.00m
  unknown device vg_root lvm2 a-m  18.84g   1.22g

There is already a slight problem here, since the mirror status above still
shows 100% under Cpy%Sync, but I accept that as a minor reporting issue.

3. Now I put the removed disk back, and I would like a way to incrementally
resync only the changes written since the mirror broke. Note that the same
disk now shows up as sdc:

[root at shi-rhel63 home]# pvs
  PV         VG      Fmt  Attr PSize  PFree
  /dev/sda2  vg_root lvm2 a--  17.84g 224.00m
  /dev/sdc2  vg_root lvm2 a--  18.84g   1.22g
[root at shi-rhel63 home]# lvs -a -o +seg_pe_ranges|grep home
  lv_home             vg_root rwi-aom-  1.50g 100.00 lv_home_rimage_0:0-47 lv_home_rimage_1:0-47
  [lv_home_rimage_0]  vg_root iwi-aor-  1.50g        /dev/sda2:192-239
  [lv_home_rimage_1]  vg_root iwi-aor-  1.50g        /dev/sdc2:34-81
  [lv_home_rmeta_0]   vg_root ewi-aor- 32.00m        /dev/sda2:177-177
  [lv_home_rmeta_1]   vg_root ewi-aor- 32.00m        /dev/sdc2:33-33

So everything looks perfect, but if I perform the same dd write test, here
is what I get:
[root at shi-rhel63 home]# iostat -x 1 sda sdc -m
Linux 2.6.32-279.el6.x86_64 (shi-rhel63) 06/11/13 _x86_64_ (1 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.06    0.07    0.36    0.42    0.00   99.10

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.05     7.70    0.39    0.21     0.00     0.03   109.24     0.11  174.10   6.77   0.40
sdc               0.02     0.00    0.01    0.00     0.00     0.00     8.30     0.00    3.07   2.85   0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    5.15   94.85    0.00    0.00

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00  6281.44    0.00   49.48     0.00    22.14   916.38   138.42 2932.19  20.83 103.09
sdc               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    5.10   94.90    0.00    0.00

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00  6213.27    0.00   54.08     0.00    23.99   908.30   136.34 2812.40  18.87 102.04
sdc               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    6.06   93.94    0.00    0.00

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00  5141.41    0.00   50.51     0.00    22.23   901.36   133.47 2612.74  20.00 101.01
sdc               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00

So it is clear that the newly re-attached mirror leg is not being written to at all.

What is really interesting is that after a reboot the mirror works properly
again. But is there a way to restore it without rebooting?

Thanks a lot,
Shi
PS. My OS info

[root at shi-rhel63 home]# uname -a
Linux shi-rhel63 2.6.32-279.el6.x86_64 #1 SMP Wed Jun 13 18:24:36 EDT 2012
x86_64 x86_64 x86_64 GNU/Linux
[root at shi-rhel63 home]# lvm version
  LVM version:     2.02.95(2)-RHEL6 (2012-05-16)
  Library version: 1.02.74-RHEL6 (2012-05-16)
  Driver version:  4.22.6

