
Re: [linux-lvm] lvm raid failure




------------------------------

Message: 6
Date: Mon, 10 Dec 2007 21:42:45 -0500
From: "Null EXE" <nullexe gmail com>
Subject: [linux-lvm] lvm raid failure
To: linux-lvm redhat com
Message-ID:
       <1f31f75c0712101842q4d71c03bu75926f1deb026a26 mail gmail com>
Content-Type: text/plain; charset="iso-8859-1"

Hi everyone,


I have been running an Edgy fileserver for about 6 months now. When I came
home from work last week I found all of my files inaccessible. After
investigating, I found issues with my PV. I would like to recover my data and
replace the drives if possible. Any help is greatly appreciated!

I have one system drive and 6 storage drives, all in raid1 pairs. The storage
layout is:
2x160GB /dev/md0
2x250GB /dev/md1
2x250GB /dev/md2
After investigation I found problems with my lvm. pvdisplay showed:
Couldn't find device with uuid 'xjWU5M-G3WB-tZjB-Tyrk-Q7rN-Yide-1FEVh3'.

I ran the command:
sudo pvcreate --uuid "xjWU5M-G3WB-tZjB-Tyrk-Q7rN-Yide-1FEVh3" \
    --restorefile /etc/lvm/archive/fileserver_00006.vg /dev/md2

This seemed to recover /dev/md2. I re-ran pvdisplay and got
Couldn't find device with uuid '1ATm8s-oxKG-nz0p-z1QA-a0s4-od9T-AMZRoo'.

I figured I could run the same command on md1,
sudo pvcreate --uuid "1ATm8s-oxKG-nz0p-z1QA-a0s4-od9T-AMZRoo" \
    --restorefile /etc/lvm/archive/fileserver_00006.vg /dev/md1

and got the message:
Couldn't find device with uuid '1ATm8s-oxKG-nz0p-z1QA-a0s4-od9T-AMZRoo'.
Device /dev/md1 not found (or ignored by filtering).

After re-executing the above command with -vvv, at the bottom of the output I
get the following message:
#device/dev-io.c:439         Opened /dev/md1 RO
#device/dev-io.c:264       /dev/md1: size is 0 sectors
#filters/filter.c:106         /dev/md1: Skipping: Too small to hold a PV
#device/dev-io.c:485         Closed /dev/md1
#pvcreate.c:81   Device /dev/md1 not found (or ignored by filtering).
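
The key line in that output is "/dev/md1: size is 0 sectors": the LVM device
filter skips /dev/md1 because the block device reports no capacity, so
pvcreate never gets to use it. A minimal sketch, assuming Python 3 and that
the debug lines keep the layout shown above, of pulling those two facts out of
a longer -vvv dump (summarize and SAMPLE_LOG are illustrative names, not part
of LVM):

```python
import re

# Debug lines look like:
#   #device/dev-io.c:264       /dev/md1: size is 0 sectors
#   #filters/filter.c:106         /dev/md1: Skipping: Too small to hold a PV
SIZE_RE = re.compile(r"^#\S+\s+(?P<dev>/dev/\S+): size is (?P<sectors>\d+) sectors$")
SKIP_RE = re.compile(r"^#\S+\s+(?P<dev>/dev/\S+): Skipping: (?P<reason>.+)$")

SAMPLE_LOG = """\
#device/dev-io.c:439         Opened /dev/md1 RO
#device/dev-io.c:264       /dev/md1: size is 0 sectors
#filters/filter.c:106         /dev/md1: Skipping: Too small to hold a PV
#device/dev-io.c:485         Closed /dev/md1
#pvcreate.c:81   Device /dev/md1 not found (or ignored by filtering).
"""


def summarize(log_text):
    """Return {device: {"sectors": int or None, "skipped": reason or None}}."""
    summary = {}
    for raw in log_text.splitlines():
        line = raw.strip()
        size = SIZE_RE.match(line)
        if size:
            entry = summary.setdefault(size["dev"], {"sectors": None, "skipped": None})
            entry["sectors"] = int(size["sectors"])
            continue
        skip = SKIP_RE.match(line)
        if skip:
            entry = summary.setdefault(skip["dev"], {"sectors": None, "skipped": None})
            entry["skipped"] = skip["reason"]
    return summary


if __name__ == "__main__":
    for dev, info in summarize(SAMPLE_LOG).items():
        print(dev, info)
```

A device that shows up with zero sectors and a skip reason points at the layer
underneath LVM rather than at the LVM metadata itself.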

Here is my pvdisplay. Again, any help is greatly appreciated.
***START***

Couldn't find device with uuid '1ATm8s-oxKG-nz0p-z1QA-a0s4-od9T-AMZRoo'.
Couldn't find device with uuid '1ATm8s-oxKG-nz0p-z1QA-a0s4-od9T-AMZRoo'.
--- Physical volume ---
PV Name /dev/md0
VG Name fileserver
PV Size 149.05 GB / not usable 0
Allocatable yes
PE Size (KByte) 4096
Total PE 38156
Free PE 9996
Allocated PE 28160
PV UUID pZV1Ff-Y7fu-S8m1-tVFn-fOMJ-VRls-fLFEov

--- Physical volume ---
PV Name unknown device
VG Name fileserver
PV Size 232.88 GB / not usable 0
Allocatable yes (but full)
PE Size (KByte) 4096
Total PE 59618
Free PE 0
Allocated PE 59618
PV UUID 1ATm8s-oxKG-nz0p-z1QA-a0s4-od9T-AMZRoo

--- Physical volume ---
PV Name /dev/md2
VG Name fileserver
PV Size 232.88 GB / not usable 0
Allocatable yes
PE Size (KByte) 4096
Total PE 59618
Free PE 9156
Allocated PE 50462
PV UUID xjWU5M-G3WB-tZjB-Tyrk-Q7rN-Yide-1FEVh3
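
The sizes in this listing follow from the extent counts: each PV is Total PE
times the 4096 KByte extent size. A small sketch of that arithmetic, assuming
Python 3 and reading pvdisplay's "GB" as 2^30 bytes (which is what makes the
numbers agree); pe_to_gib is an illustrative helper, not an LVM tool:

```python
PE_SIZE_KIB = 4096  # "PE Size (KByte) 4096" in the pvdisplay output


def pe_to_gib(extents, pe_size_kib=PE_SIZE_KIB):
    """Convert a number of physical extents to GiB."""
    return extents * pe_size_kib / (1024 * 1024)


# (label, Total PE, Allocated PE) copied from the three PV blocks above
PVS = [
    ("/dev/md0", 38156, 28160),
    ("unknown device", 59618, 59618),
    ("/dev/md2", 59618, 50462),
]

if __name__ == "__main__":
    for label, total, allocated in PVS:
        print(f"{label}: {pe_to_gib(total):.2f} GiB total, "
              f"{pe_to_gib(allocated):.2f} GiB allocated")
```

By this calculation the missing PV is fully allocated (all 59618 extents), so
every extent on it is assigned to some logical volume.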

------------------------------

Message: 7
Date: Tue, 11 Dec 2007 09:20:29 +0100
From: Luca Berra <bluca comedia it>
Subject: Re: [linux-lvm] lvm raid failure
To: linux-lvm redhat com
Message-ID: <20071211082029 GA1535 percy comedia it>
Content-Type: text/plain; charset=us-ascii; format=flowed

On Mon, Dec 10, 2007 at 09:42:45PM -0500, Null EXE wrote:
>get the following message:
>#device/dev-io.c:439         Opened /dev/md1 RO
>#device/dev-io.c:264       /dev/md1: size is 0 sectors
>#filters/filter.c:106         /dev/md1: Skipping: Too small to hold a PV
>#device/dev-io.c:485         Closed /dev/md1
>#pvcreate.c:81   Device /dev/md1 not found (or ignored by filtering).
>
>Here is my pvdisplay. Again, any help is Greatly appreciated
You should investigate what's wrong at the md layer; lvm seems to be
just a victim.

check
/proc/mdstat
kernel-logs
mdadm.conf
mdadm -Es
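
The first of those checks can be scripted. A minimal sketch, assuming Python 3
and the usual /proc/mdstat layout (an "mdN : state ..." line followed by a
status line with "[configured/working]"); parse_mdstat and problems are
illustrative names, and the sample text is the /proc/mdstat content posted in
this thread:

```python
import re

SAMPLE_MDSTAT = """\
Personalities : [raid1]
md2 : active raid1 dm-5[1]
      244195904 blocks [2/1] [_U]

md1 : inactive hdj1[0]
      244195904 blocks super non-persistent

md0 : active raid1 hdc1[0]
      156288256 blocks [2/1] [U_]

unused devices: <none>
"""


def parse_mdstat(text):
    """Return {array: {"state", "members", "expected", "up"}}."""
    arrays = {}
    current = None
    for line in text.splitlines():
        header = re.match(r"^(md\d+) : (active|inactive)\b(.*)$", line)
        if header:
            name, state, rest = header.groups()
            # Member entries look like "hdc1[0]"; keep only the device name.
            members = re.findall(r"(\S+)\[\d+\]", rest)
            current = arrays[name] = {
                "state": state, "members": members, "expected": None, "up": None,
            }
            continue
        status = re.search(r"\[(\d+)/(\d+)\]", line)
        if status and current is not None:
            current["expected"] = int(status.group(1))
            current["up"] = int(status.group(2))
    return arrays


def problems(arrays):
    """List arrays that are inactive or running with fewer members than configured."""
    found = []
    for name, info in sorted(arrays.items()):
        if info["state"] != "active":
            found.append(f"{name}: {info['state']}")
        elif info["expected"] is not None and info["up"] < info["expected"]:
            found.append(f"{name}: degraded ({info['up']}/{info['expected']} members up)")
    return found


if __name__ == "__main__":
    for issue in problems(parse_mdstat(SAMPLE_MDSTAT)):
        print(issue)
```

On a live system the same functions could be fed open("/proc/mdstat").read()
instead of the sample string.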

L.

--
Luca Berra -- bluca comedia it
       Communication Media & Services S.r.l.
 /"\
 \ /     ASCII RIBBON CAMPAIGN
 X        AGAINST HTML MAIL
 / \


  ***
/proc/mdstat
Personalities : [raid1]
md2 : active raid1 dm-5[1]
      244195904 blocks [2/1] [_U]
     
md1 : inactive hdj1[0]
      244195904 blocks super non-persistent
      
md0 : active raid1 hdc1[0]
      156288256 blocks [2/1] [U_]
     
unused devices: <none>

Should dm-5 be listed here, or should it be a /dev/hd[a-z] device?

***
mdadm -Es
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=eeb0befe:9e50d574:974f5eae:ccb8e527
ARRAY /dev/md2 level=raid1 num-devices=2 UUID=1e5c1ab5:a5a06f84:d722583d:ca0f8cad
ARRAY /dev/md1 level=raid1 num-devices=2 UUID=41fc507e:7293a47c:7f18baeb:89dd5958

***
mdadm.conf
DEVICE partitions

***
/proc/partitions
   3     0   10022040 hda
   3     1    9574708 hda1
   3     2          1 hda2
   3     5     441756 hda5
  22     0  156290904 hdc
  22     1  156288321 hdc1
  22    64  156290904 hdd
  22    65  156288321 hdd1
  56     0  244198584 hdi
  56     1  244196001 hdi1
  56    64  244198584 hdj
  56    65  244196001 hdj1
  57     0  244198584 hdk
  57     1  244196001 hdk1
  57    64  244198584 hdl
  57    65  244196001 hdl1
   9     0  156288256 md0
   9     2  244195904 md2
 253     0    9574708 dm-0
 253     1     441756 dm-1
 253     2  156288321 dm-2
 253     3  156288321 dm-3
 253     4  244196001 dm-4
 253     5  244196001 dm-5
 253     6  244196001 dm-6
 253     7  244196001 dm-7
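
One thing that stands out in this table is that every dm-N device has exactly
the same block count as one of the hd partitions. A rough sketch, assuming
Python 3, that makes the comparison explicit (parse_partitions and
dm_size_matches are illustrative names). Equal sizes only suggest that
device-mapper has mappings sitting on top of the partitions, which might
explain md2 listing dm-5 as a member; they do not prove it, and
"dmsetup table" would show what each dm device is really built on:

```python
SAMPLE_PARTITIONS = """\
   3     0   10022040 hda
   3     1    9574708 hda1
   3     2          1 hda2
   3     5     441756 hda5
  22     0  156290904 hdc
  22     1  156288321 hdc1
  22    64  156290904 hdd
  22    65  156288321 hdd1
  56     0  244198584 hdi
  56     1  244196001 hdi1
  56    64  244198584 hdj
  56    65  244196001 hdj1
  57     0  244198584 hdk
  57     1  244196001 hdk1
  57    64  244198584 hdl
  57    65  244196001 hdl1
   9     0  156288256 md0
   9     2  244195904 md2
 253     0    9574708 dm-0
 253     1     441756 dm-1
 253     2  156288321 dm-2
 253     3  156288321 dm-3
 253     4  244196001 dm-4
 253     5  244196001 dm-5
 253     6  244196001 dm-6
 253     7  244196001 dm-7
"""


def parse_partitions(text):
    """Return {name: blocks} from rows of the form: major minor blocks name."""
    sizes = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 4 and fields[0].isdigit():
            sizes[fields[3]] = int(fields[2])
    return sizes


def dm_size_matches(sizes):
    """Map each dm-N device to the non-dm, non-md devices of identical size."""
    matches = {}
    for name, blocks in sizes.items():
        if name.startswith("dm-"):
            matches[name] = sorted(
                other for other, other_blocks in sizes.items()
                if other_blocks == blocks and not other.startswith(("dm-", "md"))
            )
    return matches


if __name__ == "__main__":
    for dm, candidates in dm_size_matches(parse_partitions(SAMPLE_PARTITIONS)).items():
        print(dm, "->", ", ".join(candidates))
```

Note also that the table has rows for md0 and md2 but none for md1.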


***
kernel logs
Nov 28 22:58:46 ark-server kernel: [42949384.540000] raid1: raid set md1 active with 1 out of 2 mirrors
Nov 28 22:58:46 ark-server kernel: [42949384.560000] md: md2 stopped.
Nov 28 22:58:46 ark-server kernel: [42949384.560000] md: bind<hdj1>
Nov 28 22:58:46 ark-server kernel: [42949384.560000] md: hdl1 has same UUID but different superblock to hdj1
Nov 28 22:58:46 ark-server kernel: [42949384.560000] md: hdl1 has different UUID to hdj1
Nov 28 22:58:46 ark-server kernel: [42949384.560000] md: export_rdev(hdl1)

***
Looking at all of this: when I set up the array I remember my devices being ordered as hd[cdefgh], but now I'm seeing md1 trying to use hd[jl]. Is it a problem that the drives were automatically assigned new letters?

