[linux-lvm] LVM hangs on SAN fail

Eugene Vilensky evilensky at gmail.com
Thu Apr 15 12:41:17 UTC 2010


Can you show us a pvdisplay or a verbose vgdisplay?
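For example, something along these lines (the VG name here is just one taken from your logs below; substitute whichever VG is affected):

pvdisplay
vgdisplay -v vg_syb_roger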

On 4/15/10, jose nuno neto <jose.neto at liber4e.com> wrote:
> hellos
>
> I spent more time on this, and it seems that since LVM can't write to any PV
> on the volumes it has lost, it cannot record the failure of the devices and
> update the metadata on the other PVs, so it hangs forever.
>
> Is this right?
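>
> (To check that, this is roughly what I'd look at to see how many metadata
> areas each PV and the VG carry; the VG name is just an example and the
> report fields may differ slightly on older LVM2 versions:)
>
> pvs -o pv_name,vg_name,pv_mda_count
> vgs -o vg_name,vg_mda_count vg_syb_roger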
>
>> Good Mornings
>>
>> This is what I have on multipath.conf
>>
>> blacklist {
>>         wwid SSun_VOL0_266DCF4A
>>         wwid SSun_VOL0_5875CF4A
>>         devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
>>         devnode "^hd[a-z]"
>> }
>> defaults {
>>                 user_friendly_names             yes
>> }
>> devices {
>>        device {
>>                 vendor                          "HITACHI"
>>                 product                         "OPEN-V"
>>                 path_grouping_policy            group_by_node_name
>>                 failback                        immediate
>>                 no_path_retry                   fail
>>        }
>>        device {
>>                 vendor                          "IET"
>>                 product                         "VIRTUAL-DISK"
>>                 path_checker                    tur
>>                 path_grouping_policy            failover
>>                 failback                        immediate
>>                 no_path_retry                   fail
>>        }
>> }
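>>
>> (As a cross-check only: the settings multipathd has actually merged with
>> its built-in hardware defaults can be dumped from the running daemon, for
>> example:)
>>
>> multipathd -k"show config" | grep -B2 -A8 "OPEN-V"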
>>
>> As an example, this is one LUN. It shows [features=0], so I'd say it should
>> fail right away:
>>
>> mpath-dc2-a (360060e8004f240000000f24000000502) dm-15 HITACHI,OPEN-V
>> -SU
>> [size=26G][features=0][hwhandler=0][rw]
>> \_ round-robin 0 [prio=4][active]
>>  \_ 5:0:1:0     sdu  65:64  [active][ready]
>>  \_ 5:0:1:16384 sdac 65:192 [active][ready]
>>  \_ 5:0:1:32768 sdas 66:192 [active][ready]
>>  \_ 5:0:1:49152 sdba 67:64  [active][ready]
>> \_ round-robin 0 [prio=4][enabled]
>>  \_ 3:0:1:0     sdaw 67:0   [active][ready]
>>  \_ 3:0:1:16384 sdbe 67:128 [active][ready]
>>  \_ 3:0:1:32768 sdbi 67:192 [active][ready]
>>  \_ 3:0:1:49152 sdbm 68:0   [active][ready]
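>>
>> (Side note, just as a sketch: if a map were still queueing, you would see
>> "queue_if_no_path" in its device-mapper table instead of features=0; that
>> can be checked, and queueing dropped on the fly, with something like:)
>>
>> dmsetup table mpath-dc2-a
>> dmsetup message mpath-dc2-a 0 "fail_if_no_path"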
>>
>> I think they fail, since I see these messages from LVM:
>> Apr 14 16:03:05 dc1-x6250-a lvm[15622]: Device failure in
>> vg_syb_roger-lv_syb_roger_admin
>> Apr 14 16:03:14 dc1-x6250-a lvm[15622]: Failed to remove faulty devices in
>> vg_syb_roger-lv_syb_roger_admin
>>
>> But for some reason LVM can't remove them. Is there any option I should set in
>> lvm.conf?
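>>
>> (The only candidates I can see are the dmeventd mirror policies in the
>> activation section of lvm.conf; I'm not sure these are the right knobs for
>> lvm2-2.02.46, and the option names may differ slightly by version:)
>>
>> activation {
>>     # replace a failed mirror log by allocating a new one
>>     mirror_log_fault_policy = "allocate"
>>     # drop a failed mirror leg instead of trying to reallocate it
>>     mirror_device_fault_policy = "remove"
>> }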
>>
>> Best Regards
>> Jose
>>> Post your multipath.conf file; you may be queuing forever?
>>>
>>>
>>>
>>> On Wed, 2010-04-14 at 15:03 +0000, jose nuno neto wrote:
>>>> Hi2all
>>>>
>>>> I'm on RHEL 5.4 with
>>>> lvm2-2.02.46-8.el5_4.1
>>>> 2.6.18-164.2.1.el5
>>>>
>>>> I have a multipathed SAN connection on which I'm building LVs.
>>>> It's a cluster system, and I want the LVs to switch on failure.
>>>>
>>>> If I simulate a failure through the OS via
>>>> /sys/bus/scsi/devices/$DEVICE/delete,
>>>> I get an LV failure and the service switches to the other node.
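>>>>
>>>> (Concretely, that is just the usual sysfs removal of one path device,
>>>> e.g. something like:
>>>> echo 1 > /sys/bus/scsi/devices/5:0:1:0/delete
>>>> where 5:0:1:0 is one of the H:B:T:L addresses of the multipathed LUN.)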
>>>>
>>>> But if I do a "real" port-down on the SAN switch, multipath reports the
>>>> paths down, but LVM commands hang forever and nothing gets switched.
>>>>
>>>> From the logs I see multipath failing paths, and LVM reporting "Failed to
>>>> remove faulty devices".
>>>>
>>>> Any ideas on how I should "fix" it?
>>>>
>>>> Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Log device, 253:53, has failed.
>>>> Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Device failure in
>>>> vg_ora_scapa-lv_ora_scapa_redo
>>>> Apr 14 16:02:45 dc1-x6250-a lvm[15622]: Another thread is handling an
>>>> event.  Waiting...
>>>>
>>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-a: remaining active
>>>> paths: 0
>>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-a: remaining active
>>>> paths: 0
>>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-b: remaining active
>>>> paths: 0
>>>> Apr 14 16:02:52 dc1-x6250-a multipathd: mpath-dc1-b: remaining active
>>>> paths: 0
>>>>
>>>> Apr 14 16:03:05 dc1-x6250-a lvm[15622]: Device failure in
>>>> vg_syb_roger-lv_syb_roger_admin
>>>> Apr 14 16:03:14 dc1-x6250-a lvm[15622]: Failed to remove faulty devices
>>>> in
>>>> vg_syb_roger-lv_syb_roger_admin
>>>>
>>>> Much Thanks
>>>> Jose
>>>>
>>>
>>>
>>>
>>
>>
>
> _______________________________________________
> linux-lvm mailing list
> linux-lvm at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-lvm
> read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/
>

-- 
Sent from my mobile device

Regards,
Eugene Vilensky
evilensky at gmail.com