
[dm-devel] Multipathing with RHEL4 U2 on EMC DMX




Hello,

I've recently connected some HP BL20p G3 blades running RHEL4 U2 to a DMX2000 (via McData switches). We didn't get PowerPath and intend to use device-mapper multipathing instead. I was able to get most things working and the devices defined, but only by doing it manually. I've run into a few issues that I was hoping someone here has already come across:

Kernel: 2.6.9-22.ELsmp
Multipath tools: multipath-tools-0.4.6.1-1
Device Mapper: device-mapper-1.01.05-01

1. When running 'multipath -v3' I get errors from the getuid_callout string: "error calling out scsi_id -g -ppre-spc3-83 -u -s /block/sdb". It doesn't like the "-ppre-spc3-83" part. After some research, it appears that for the DMX (Symmetrix) a more appropriate string would be "/sbin/scsi_id -g -p 0x80 -u -s /block/sdb". I tried adding that to the 'defaults' section of /etc/multipath.conf, but it doesn't appear to be picked up, even after restarting multipathd and rebooting. Is there any way to get it to take this string? I believe this is part of the problem I'm having with the next item (#2).
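
One possible explanation is that a device-specific entry (including the one built into multipath-tools for this array) takes precedence over the 'defaults' section, so an override may need to live in a 'devices' section instead. The fragment below is only a sketch of that idea. The vendor and product strings are assumptions and must match what the array actually reports (for example in /sys/block/sdb/device/vendor and /sys/block/sdb/device/model), and the "%n" device-name wildcard is assumed to be supported by this multipath-tools version.

```
# /etc/multipath.conf (sketch, untested on a DMX)
devices {
        device {
                # Assumed inquiry strings; verify against sysfs
                vendor          "EMC"
                product         "SYMMETRIX"
                # Callout from item 1, with the device name left as a wildcard
                getuid_callout  "/sbin/scsi_id -g -p 0x80 -u -s /block/%n"
        }
}
```

Running 'multipath -v3' again should show which callout string is actually being used for each path.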

2. To get multipath working for my EMC devices I have to create the maps manually after every reboot, so I wrote a startup script for this. For each device it basically runs: 'echo "0 17677440 multipath 0 0 2 1 round-robin 0 1 1 8:112 1000 round-robin 0 1 1 8:240 1000" | dmsetup create dm0'. Is it normal to have to do that, or is there a way to have it done automatically? I suspect it has to do with having a hardware_handler. I thought about the dm-emc handler, but that appears to work only for the CX/AX/FC family (i.e. Clariion) of arrays, which work nothing like the Symmetrix. Perhaps getting the getuid_callout string working would help.
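
To make the table string in item 2 easier to read, here is a minimal shell sketch that assembles the same line from a sector count and a list of major:minor numbers. The function name mp_table_failover is hypothetical and not part of any tool. It creates one priority group per path, which is the layout used in the command above.

```shell
# Build a dm-multipath table with one priority group per path.
# usage: mp_table_failover SECTORS MAJ:MIN [MAJ:MIN ...]
mp_table_failover() {
    sectors=$1
    shift
    # start 0, length, target, 0 features, 0 hw-handler args,
    # number of path groups, initial path group 1
    table="0 $sectors multipath 0 0 $# 1"
    for dev in "$@"; do
        # selector, 0 selector args, 1 path, 1 per-path arg,
        # device, repeat count
        table="$table round-robin 0 1 1 $dev 1000"
    done
    echo "$table"
}

# Example (requires root and real devices):
#   mp_table_failover 17677440 8:112 8:240 | dmsetup create dm0
```

The field meanings in the comments follow the dm-multipath table format as I understand it; check them against the kernel documentation before relying on them.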

3. Early this morning there was a problem on one of the multipath devices used for Oracle ASM:

        SCSI error : <0 0 0 6> return code = 0x20000
        end_request: I/O error, dev sdd, sector 64078960
        end_request: I/O error, dev sdd, sector 64078961
        device-mapper: dm-multipath: Failing path 8:48.
        SCSI error : <1 0 0 6> return code = 0x20000
        end_request: I/O error, dev sdl, sector 34888112
        end_request: I/O error, dev sdl, sector 34888113
        device-mapper: dm-multipath: Failing path 8:176.

'multipath -l' showed the device as:

        dm2 ()
        [size=67 GB][features="0"][hwhandler="0"]
        \_ round-robin 0 [enabled]
         \_ 0:0:0:6  sdd 8:48  [failed][ready]
        \_ round-robin 0 [enabled]
         \_ 1:0:0:6  sdl 8:176 [failed][ready]

The LUNs on this server are shared between three servers, and the other two stayed online, so I know neither the LUN nor the paths to the array went down. Since the other LUNs on this server remained active, I know I didn't lose any HBA connectivity either. The DBAs said they were writing a large amount of data to the device when it dropped offline. I ran a few 'multipath' and 'dmsetup status' commands to see what was going on and it came back online (it had been "failed" from ~3am to 7am).
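
As a side note, the "return code = 0x20000" in the SCSI error lines can be split into its four component bytes, which may help narrow down where the error originated. The sketch below assumes the usual Linux layout of the SCSI result word (status in bits 0-7, message in bits 8-15, host in bits 16-23, driver in bits 24-31); the function name decode_scsi_result is hypothetical.

```shell
# Split a SCSI result word into its status/msg/host/driver bytes.
# usage: decode_scsi_result 0x20000
decode_scsi_result() {
    result=$(( $1 ))
    printf 'status=0x%02x msg=0x%02x host=0x%02x driver=0x%02x\n' \
        $(( result & 0xff )) \
        $(( (result >> 8) & 0xff )) \
        $(( (result >> 16) & 0xff )) \
        $(( (result >> 24) & 0xff ))
}

decode_scsi_result 0x20000
```

For 0x20000 only the host byte is non-zero (0x02). As far as I can tell from the kernel's scsi.h, that value is DID_BUS_BUSY, which would point at the HBA/transport layer rather than at a status returned by the LUN itself, but treat that reading as unconfirmed.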

Should I try using "failover" instead of "multibus" for my "path_grouping_policy"? I would like to have it load balance, but failover is more important.
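
For comparison, the table from item 2 and the 'multipath -l' output in item 3 both show two path groups with one path each, which is the layout a "failover" policy would produce. A "multibus" layout would instead put both paths into a single group. The sketch below builds such a table; the function name mp_table_multibus is hypothetical, and the result has not been tested against a DMX.

```shell
# Build a dm-multipath table with all paths in a single priority group.
# usage: mp_table_multibus SECTORS MAJ:MIN [MAJ:MIN ...]
mp_table_multibus() {
    sectors=$1
    shift
    # 0 features, 0 hw-handler args, 1 path group, initial group 1,
    # then: selector, 0 selector args, number of paths, 1 per-path arg
    table="0 $sectors multipath 0 0 1 1 round-robin 0 $# 1"
    for dev in "$@"; do
        # device and its repeat count
        table="$table $dev 1000"
    done
    echo "$table"
}

# Example (requires root and real devices):
#   mp_table_multibus 17677440 8:112 8:240 | dmsetup create dm0
```

With a single group, round-robin spreads I/O across both paths while they are healthy, and a failed path is simply skipped within that group.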

Sorry for the long-winded post.

Any help would be appreciated.

Thanks,
David

David Child
Email  David.Child@ps.net.

