
[linux-lvm] Restore LVM2 RAID-1 without Reboot



Hi there,

I have set up a RAID-1 between two PVs on two different hard disks in a RHEL-6 environment.
I am having a problem restoring the broken mirror without rebooting the OS.

Here is how to reproduce my problem.
1. First, I set up a RAID-1 LV that looks like this:
[root shi-rhel63 home]# lvs -a -o +seg_pe_ranges|grep home
  lv_home             vg_root rwi-aom-  1.50g                             100.00         lv_home_rimage_0:0-47 lv_home_rimage_1:0-47  
  [lv_home_rimage_0]  vg_root iwi-aor-  1.50g                                            /dev/sda2:192-239                            
  [lv_home_rimage_1]  vg_root iwi-aor-  1.50g                                            /dev/sdb2:34-81                              
  [lv_home_rmeta_0]   vg_root ewi-aor- 32.00m                                            /dev/sda2:177-177                            
  [lv_home_rmeta_1]   vg_root ewi-aor- 32.00m                                            /dev/sdb2:33-33                              
[root shi-rhel63 home]# pvs
  PV         VG      Fmt  Attr PSize  PFree  
  /dev/sda2  vg_root lvm2 a--  17.84g 224.00m
  /dev/sdb2  vg_root lvm2 a--  18.84g   1.22g

I can test the mirror by performing a dd write and watching iostat on both physical disks:
[root shi-rhel63 home]# dd if=/dev/zero of=test bs=1M count=10000 &
[1] 23388
[root shi-rhel63 home]# iostat -x 1 sda sdb -m
Linux 2.6.32-279.el6.x86_64 (shi-rhel63) 06/11/13 _x86_64_ (1 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.06    0.07    0.35    0.32    0.00   99.20

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.05     0.78    0.38    0.14     0.00     0.00    22.14     0.01   11.07   5.63   0.29
sdb               0.11     0.79    0.14    0.23     0.00     0.00    27.94     0.00    5.60   3.71   0.14

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    7.14   92.86    0.00    0.00

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00  5702.04    1.02   56.12     0.01    22.46   805.25    50.75  911.66  17.86 102.04
sdb               0.00  5173.47    0.00   71.43     0.00    20.42   585.43     1.21   16.97   3.44  24.59

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    8.08   91.92    0.00    0.00

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00  5131.31    0.00   52.53     0.00    19.71   768.50    51.39  894.83  19.23 101.01
sdb               0.00  5121.21    0.00   69.70     0.00    18.20   534.70     1.15   15.96   3.41  23.74

As you can see, both sda and sdb get similar write I/O, so the mirror is in fact working.
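To compare the two disks with numbers rather than by eye, the wMB/s column can be averaged per device. This is only a minimal sketch: the function name avg_write_mb is made up for illustration, it assumes wMB/s is the 7th column as in the iostat layout shown above (other sysstat versions may differ), and it skips the first report for each device because that one is the since-boot average.

```shell
#!/bin/sh
# Hypothetical helper: average the wMB/s column for two devices from
# `iostat -x -m` text read on stdin.
# Assumption: wMB/s is field 7, as in the output shown above.
avg_write_mb() {
    awk -v a="$1" -v b="$2" '
        $1 == a || $1 == b {
            seen[$1]++
            # The first report per device is the since-boot average; skip it.
            if (seen[$1] > 1) { sum[$1] += $7; n[$1]++ }
        }
        END {
            printf "%s %.2f\n", a, (n[a] ? sum[a] / n[a] : 0)
            printf "%s %.2f\n", b, (n[b] ? sum[b] / n[b] : 0)
        }'
}
```

It could be used as `iostat -x -m sda sdb 1 5 | avg_write_mb sda sdb` while the dd is running. On a working mirror the two averages should be close to each other.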

2. Now I am going to simulate a disk failure on sdb by removing the disk (I use VMware, so this is very easy). Of course, the mirror is now broken, as shown below:

[root shi-rhel63 home]# lvs -a -o +seg_pe_ranges|grep home
  Couldn't find device with uuid pnsMYs-Ce4t-9KYR-3Zfs-GItC-k5SZ-VidQ30.
  lv_home             vg_root rwi-aom-  1.50g                             100.00         lv_home_rimage_0:0-47 lv_home_rimage_1:0-47  
  [lv_home_rimage_0]  vg_root iwi-aor-  1.50g                                            /dev/sda2:192-239                            
  [lv_home_rimage_1]  vg_root iwi-aor-  1.50g                                            unknown device:34-81                         
  [lv_home_rmeta_0]   vg_root ewi-aor- 32.00m                                            /dev/sda2:177-177                            
  [lv_home_rmeta_1]   vg_root ewi-aor- 32.00m                                            unknown device:33-33                         
[root shi-rhel63 home]# pvs
  Couldn't find device with uuid pnsMYs-Ce4t-9KYR-3Zfs-GItC-k5SZ-VidQ30.
  PV             VG      Fmt  Attr PSize  PFree  
  /dev/sda2      vg_root lvm2 a--  17.84g 224.00m
  unknown device vg_root lvm2 a-m  18.84g   1.22g

I already have a slight problem here, since the mirror status above still shows a Copy% of 100.00, but I accept this as a minor presentation issue.
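Since Copy% does not reflect the failure, the third character of the PV Attr column (the `m` in `a-m` above, which marks a missing PV) is the more reliable sign in this output. A small sketch of a filter for it follows. The function name missing_pvs is hypothetical, and it assumes its input has the PV name first and the attribute string as the last field, for example from `pvs --noheadings -o pv_name,pv_attr`.

```shell
#!/bin/sh
# Hypothetical helper: print the names of PVs whose attr string has
# "m" (missing) as its third character.
# Assumption: stdin holds "<pv name> <pv_attr>" per line; the name may
# contain a space ("unknown device"), so the attr is taken as the last field.
missing_pvs() {
    awk '
        $NF ~ /^..m$/ {
            sub(/[ \t]+[^ \t]+$/, "")   # drop the trailing attr field
            sub(/^[ \t]+/, "")          # drop leading indentation
            print
        }'
}
```

For the state shown in step 2, `pvs --noheadings -o pv_name,pv_attr 2>/dev/null | missing_pvs` would be expected to print `unknown device`.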

3. Now I put the removed disk back, and I would like a way to incrementally resync only the changes made since the mirror was broken. Note that the same disk now shows up as sdc.

[root shi-rhel63 home]# pvs
  PV         VG      Fmt  Attr PSize  PFree  
  /dev/sda2  vg_root lvm2 a--  17.84g 224.00m
  /dev/sdc2  vg_root lvm2 a--  18.84g   1.22g
[root shi-rhel63 home]# lvs -a -o +seg_pe_ranges|grep home
  lv_home             vg_root rwi-aom-  1.50g                             100.00         lv_home_rimage_0:0-47 lv_home_rimage_1:0-47  
  [lv_home_rimage_0]  vg_root iwi-aor-  1.50g                                            /dev/sda2:192-239                            
  [lv_home_rimage_1]  vg_root iwi-aor-  1.50g                                            /dev/sdc2:34-81                              
  [lv_home_rmeta_0]   vg_root ewi-aor- 32.00m                                            /dev/sda2:177-177                            
  [lv_home_rmeta_1]   vg_root ewi-aor- 32.00m                                            /dev/sdc2:33-33  

So everything looks perfect, but if I perform the same dd write test, here is what I get:
[root shi-rhel63 home]# iostat -x 1 sda sdc -m
Linux 2.6.32-279.el6.x86_64 (shi-rhel63) 06/11/13 _x86_64_ (1 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.06    0.07    0.36    0.42    0.00   99.10

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.05     7.70    0.39    0.21     0.00     0.03   109.24     0.11  174.10   6.77   0.40
sdc               0.02     0.00    0.01    0.00     0.00     0.00     8.30     0.00    3.07   2.85   0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    5.15   94.85    0.00    0.00

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00  6281.44    0.00   49.48     0.00    22.14   916.38   138.42 2932.19  20.83 103.09
sdc               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    5.10   94.90    0.00    0.00

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00  6213.27    0.00   54.08     0.00    23.99   908.30   136.34 2812.40  18.87 102.04
sdc               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    6.06   93.94    0.00    0.00

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00  5141.41    0.00   50.51     0.00    22.23   901.36   133.47 2612.74  20.00 101.01
sdc               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00

So it is clear that the re-added mirror leg (sdc) is not being written to at all.
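Because lvs reports everything as healthy here, the kernel's own view of the array may be a better cross-check. The device-mapper raid target reports one health character per device in `dmsetup status` (A for alive, D for dead/failed). The sketch below was not run as part of the test above. The function names are hypothetical, the device-mapper name vg_root-lv_home is assumed from the VG and LV names, and the field position follows the dm-raid status layout `<start> <length> raid <type> <#devices> <health_chars> ...`, which may differ on other kernels.

```shell
#!/bin/sh
# Hypothetical helpers that read one `dmsetup status` line on stdin.
# Assumption: field 3 is the target name "raid" and field 6 holds the
# per-device health characters (A = alive, D = dead/failed).
raid_health() {
    awk '$3 == "raid" { print $6 }'
}

# Succeeds (exit 0) if any device is marked D, fails (exit 1) otherwise.
raid_degraded() {
    case "$(raid_health)" in
        *D*) return 0 ;;
        *)   return 1 ;;
    esac
}
```

Possible usage: `dmsetup status vg_root-lv_home | raid_health`. A result such as `AD` would suggest that the kernel still treats the second leg as failed even though lvs shows it as present.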

What is really interesting is that after a reboot it works properly again. But is there a way to restore the mirror without rebooting?
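One candidate that is not tested anywhere above: the LVM RAID documentation describes `lvchange --refresh` as the way to clear a transient device failure on a RAID LV. Whether LVM 2.02.95 on this kernel handles that case is an assumption, not something the output here confirms. The wrapper below is only a sketch, with a hypothetical function name. By default it prints the command instead of running it.

```shell
#!/bin/sh
# Sketch only: print (DRY_RUN=1, the default) or run an LV refresh.
# Assumption: the installed LVM version supports --refresh for RAID LVs.
refresh_raid_lv() {
    lv="$1"   # VG/LV name, e.g. vg_root/lv_home
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "lvchange --refresh $lv"
    else
        lvchange --refresh "$lv"
    fi
}
```

Calling `refresh_raid_lv vg_root/lv_home` only shows what would be run. With DRY_RUN=0 set, and run as root, it would attempt the refresh, after which the dd and iostat test could be repeated to see whether sdc receives writes.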

Thanks a lot,
Shi
PS. My OS info:

[root shi-rhel63 home]# uname -a
Linux shi-rhel63 2.6.32-279.el6.x86_64 #1 SMP Wed Jun 13 18:24:36 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux
[root shi-rhel63 home]# lvm version
  LVM version:     2.02.95(2)-RHEL6 (2012-05-16)
  Library version: 1.02.74-RHEL6 (2012-05-16)
  Driver version:  4.22.6

