
[dm-devel] multipath devices fail with CX-700



Hi,

We have two Altix 3700bx boxes running SLES 9, connected to two Clariion CX-700 arrays. One of the hosts has problems with multipath:


sambo:~ # multipath -l
dm names   N
dm table 360060160f389120074e2fa5c6092da11p1  N
dm table 360060160f389120074e2fa5c6092da11  N
dm table 360060160f389120074e2fa5c6092da11  N
dm status 360060160f389120074e2fa5c6092da11  N
dm info 360060160f389120074e2fa5c6092da11  O
dm table 360060160685510009659bbb697cfda11  N
dm table 360060160685510009659bbb697cfda11  N
dm status 360060160685510009659bbb697cfda11  N
dm info 360060160685510009659bbb697cfda11  O
dm table 350060160b06013a050060160b06013a0  N
dm table 350060160b06013a050060160b06013a0  N
dm status 350060160b06013a050060160b06013a0  N
dm info 350060160b06013a050060160b06013a0  O
dm table 360060160685510002aee0cba1da9da11  N
dm table 360060160685510002aee0cba1da9da11  N
dm status 360060160685510002aee0cba1da9da11  N
dm info 360060160685510002aee0cba1da9da11  O
360060160f389120074e2fa5c6092da11
[size=250 GB][features="1 queue_if_no_path"][hwhandler="1 emc"]
\_ round-robin 0 [active]
\_ 3:0:1:2     sdn  8:208  [active][ready]
\_ 4:0:1:2     sdw  65:96  [active][ready]
\_ round-robin 0 [enabled]
\_ 3:0:0:2     sdl  8:176  [active][faulty]
\_ 4:0:0:2     sdu  65:64  [active][faulty]

360060160685510009659bbb697cfda11
[size=3668 GB][features="1 queue_if_no_path"][hwhandler="1 emc"]
\_ round-robin 0 [active]
\_ 4:0:2:4     sdz  65:144 [active][ready]
\_ 3:0:2:4     sdad 65:208 [active][ready]
\_ round-robin 0 [enabled]
\_ 4:0:3:4     sdac 65:192 [active][faulty]
\_ 3:0:3:4     sds  65:32  [active][faulty]

350060160b06013a050060160b06013a0
[size=1 GB][features="1 queue_if_no_path"][hwhandler="1 emc"]
\_ round-robin 0 [enabled]
\_ 3:0:0:0     sdk  8:160  [failed][faulty]
\_ 3:0:1:0     sdm  8:192  [failed][faulty]
\_ 4:0:0:0     sdt  65:48  [failed][faulty]
\_ 4:0:1:0     sdv  65:80  [failed][faulty]

360060160685510002aee0cba1da9da11
[size=240 GB][features="1 queue_if_no_path"][hwhandler="1 emc"]
\_ round-robin 0 [enabled]
\_ 3:0:2:3     sdp  8:240  [failed][faulty]
\_ 4:0:2:3     sdy  65:128 [failed][faulty]
\_ round-robin 0 [active]
\_ 4:0:3:3     sdab 65:176 [active][ready]
\_ 3:0:3:3     sdr  65:16  [active][ready]


The secondary paths to this last 240 GB LUN have failed. Interestingly, paths to other LUNs on the same target(s) are OK. Is there a command to force multipath to rescan the devices? I have restarted multipathd, but it does not help. Yesterday evening all paths to this device failed, but running multipath fixed the issue.
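To make the failed paths easier to spot in longer listings, the output above can be summarised with a small script. This is only a sketch: `parse_paths` and `failed_devices` are hypothetical helpers, not part of multipath-tools, and treating the first bracket on each path line as the device-mapper path state is my assumption about the output format.

```python
import re

# Two of the four maps, copied from the `multipath -l` output above.
SAMPLE = r"""
350060160b06013a050060160b06013a0
[size=1 GB][features="1 queue_if_no_path"][hwhandler="1 emc"]
\_ round-robin 0 [enabled]
\_ 3:0:0:0     sdk  8:160  [failed][faulty]
\_ 3:0:1:0     sdm  8:192  [failed][faulty]
\_ 4:0:0:0     sdt  65:48  [failed][faulty]
\_ 4:0:1:0     sdv  65:80  [failed][faulty]

360060160685510002aee0cba1da9da11
[size=240 GB][features="1 queue_if_no_path"][hwhandler="1 emc"]
\_ round-robin 0 [enabled]
\_ 3:0:2:3     sdp  8:240  [failed][faulty]
\_ 4:0:2:3     sdy  65:128 [failed][faulty]
\_ round-robin 0 [active]
\_ 4:0:3:3     sdab 65:176 [active][ready]
\_ 3:0:3:3     sdr  65:16  [active][ready]
"""

# A path line looks like: \_ H:C:I:L  sdX  major:minor  [state1][state2]
PATH_RE = re.compile(
    r"^\\_\s+(\d+:\d+:\d+:\d+)\s+(\S+)\s+(\d+:\d+)\s+\[(\w+)\]\[(\w+)\]\s*$"
)
# A map header is a bare WWID (hex digits only) on its own line.
WWID_RE = re.compile(r"^[0-9a-f]{16,}$")


def parse_paths(text):
    """Return {wwid: [(hcil, dev, state1, state2), ...]} in listing order."""
    maps = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if WWID_RE.match(line):
            current = line
            maps.setdefault(current, [])
            continue
        m = PATH_RE.match(line)
        if m and current is not None:
            hcil, dev, _majmin, state1, state2 = m.groups()
            maps[current].append((hcil, dev, state1, state2))
    return maps


def failed_devices(maps):
    """Return {wwid: [dev, ...]} for paths whose first state is 'failed'."""
    return {
        wwid: [dev for _hcil, dev, state1, _state2 in paths if state1 == "failed"]
        for wwid, paths in maps.items()
    }


if __name__ == "__main__":
    for wwid, devs in failed_devices(parse_paths(SAMPLE)).items():
        print(wwid, "failed paths:", ", ".join(devs) or "none")
```

On a live system the text would come from running `multipath -l` rather than from the embedded sample.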

Another issue is the 1 GB "LUN", which is not actually a LUN but the Clariion SP. When the host is booted we need to disconnect the FC; otherwise the system does not come up but stops at "creating devices". My guess is that when multipath finds this device with no active paths it gives up, but the boot does not continue.

The last boot messages:

device-mapper: dm-emc: emc_endio: pg_init error -5
device-mapper: dm-emc: emc_endio: Found valid sense data 052602
device-mapper: dm-multipath: 65:192: Error trying to initialize PG, failing path
device-mapper: dm-multipath: Failing path 65:192

Without FC the system boots up OK, and I can add the SCSI devices manually and run multipath without any problems. I have blacklisted the device in multipath.conf, but for some reason the blacklisting does not work for this device.

Any ideas how to fix this boot issue?

multipath-tools is the version shipped with SUSE. There is a newer package available, but it has similar problems, and in addition it does not read the blacklist from multipath.conf (or maybe the syntax has changed?).


sambo:~ # rpm -qa |grep multipath
multipath-tools-0.4.5-0.11

The kernel is the SGI sn2 kernel.

sambo:~ # uname -a
Linux sambo 2.6.5-7.252-sn2 #1 SMP Tue Feb 14 11:11:04 UTC 2006 ia64 ia64 ia64 GNU/Linux


multipath.conf:

I wrote the defaults out explicitly because multipath did not always recognize the Clariion devices correctly.

sambo:~ # cat /etc/multipath.conf
# multipath config 8.3.2006 mattik
# see /usr/share/doc/packages/multipath-tools/multipath.conf.annotated

defaults {

       multipath_tool  "/sbin/multipath v0"
       udev_dir        /dev
       polling_interval 5
       default_selector "round-robin 0"
       default_path_grouping_policy    multibus
       default_getuid_callout  "/sbin/scsi_id -g -u -s /block/%n"
       default_prio_callout    /bin/true
       rr_min_io       1000
       rr_weight       uniform
       failback        immediate
}

devnode_blacklist {

       #SGI local & JBOD
       wwid SSGI_ST3146707LC_3KS0FX0K00007535HKZN
       wwid SSGI_ST373454LC_3KP0HZTZ000075443N5B
       wwid SSGI_ST373454LC_3KP0J0DH00007535H6SF
       wwid SSGI_ST373454LC_3KP0JJMD000075444QP9
       wwid SSGI_ST373454LC_3KP0HSFL00007544WT2U
       wwid SSGI_ST373454LC_3KP0JB4P00007543R83C
       wwid SSGI_ST373454LC_3KP0JAE8000075443RKN
       wwid SSGI_ST373454LC_3KP0J53H000075431M8Q
       wwid SSGI_ST373454LC_3KP0J64F00007544WRLC
       wwid 360060160f3891200e4d229aefa3ada11

       #dvd
       devnode "/dev/dvd"
       # sp
       wwid    350060160b060139650060160b0601396

       # defaults
       devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
       devnode "^hd[a-z][[0-9]*]"
}

# defaults (hwtable.c)
devices {
       device {
               vendor "DGC"
               product "*"
               path_grouping_policy group_by_prio
               prio_callout    "/sbin/mpath_prio_emc /dev/%n"
               hardware_handler "1 emc"
               features "1 queue_if_no_path"
               checker "emc_clariion"
      }
}
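
One thing worth checking against this config is whether the blacklisted WWIDs actually match what multipath reports. Comparing the strings, the `wwid` listed under `# sp` ends in `...1396`, while the 1 GB device in the `multipath -l` output ends in `...13a0`, so they are not the same string. Whether that explains the blacklist not taking effect is not certain. The sketch below only does an exact string comparison, which is an assumption about how multipath-tools matches entries, and `blacklisted_wwids` and `is_blacklisted` are hypothetical helpers, not part of the package.

```python
import re

# Abridged copy of the devnode_blacklist section from the multipath.conf above.
CONF = """
devnode_blacklist {
       wwid 360060160f3891200e4d229aefa3ada11

       #dvd
       devnode "/dev/dvd"
       # sp
       wwid    350060160b060139650060160b0601396
}
"""

# WWID of the 1 GB device as shown by `multipath -l`.
SP_WWID = "350060160b06013a050060160b06013a0"


def blacklisted_wwids(conf_text):
    """Collect the values of all `wwid` lines, ignoring comments.

    Simplification: the section a line belongs to is not tracked, so this
    only suits a file where `wwid` appears solely in the blacklist.
    """
    wwids = set()
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()
        m = re.match(r'^wwid\s+"?([^"\s]+)"?$', line)
        if m:
            wwids.add(m.group(1))
    return wwids


def is_blacklisted(wwid, conf_text):
    """Exact string match only (assumed, not confirmed, matching rule)."""
    return wwid in blacklisted_wwids(conf_text)


if __name__ == "__main__":
    print("SP WWID blacklisted:", is_blacklisted(SP_WWID, CONF))
```

For the real file, `CONF` would be replaced by the contents of /etc/multipath.conf.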




any help appreciated

cheers

Matti Keranen



