[dm-devel] RE: Problem with multipathd and a blacklisted device

Ransegnola, Lori Lori.Ransegnola at hp.com
Tue Mar 24 13:27:43 UTC 2009


I sent this question last week and have not heard a thing.  Do you have a suggestion of a better mailing list that I should use for this question?

Thank you.

Lori

> -----Original Message-----
> From: Ransegnola, Lori
> Sent: Thursday, March 19, 2009 3:51 PM
> To: dm-devel at redhat.com
> Cc: Ransegnola, Lori
> Subject: Problem with multipathd and a blacklisted device
>
> Configuration:
> --------------
>
> I have a multipath configuration set up where 10 disks are
> multipathed and 2 disks are blacklisted and are not
> multipathed.  Here are the relevant parts of the multipath.conf file:
>
> defaults {
>         udev_dir                /dev
>         polling_interval        10
>         selector                "round-robin 0"
> #       path_grouping_policy    multibus
>         getuid_callout          "/sbin/scsi_id -g -u -s /block/%n"
>         prio_callout            /bin/true
> #       path_checker            readsector0
>         path_checker            tur
>         rr_min_io               100
>         rr_weight               priorities
>         failback                immediate
>         no_path_retry           fail
>         user_friendly_name      yes
> }
>
> blacklist {
> #       wwid 26353900f02796769
> #       devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
> #       devnode "^hd[a-z]"
>         wwid                    35000c5000a41f72b
>         wwid                    35000c5000a41f98b
>         devnode "^cciss!c[0-9]d[0-9]*"
> }
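
A quick way to see which kernel device names the one active devnode pattern above covers. This is only a sketch: as far as I know multipath-tools evaluates these patterns as POSIX-style regular expressions, and Python's re module is used here as a stand-in (the two should agree for a pattern this simple). The helper name devnode_blacklisted is made up for illustration.

```python
import re

# The only uncommented devnode pattern in the blacklist section above.
DEVNODE_BLACKLIST = [r"^cciss!c[0-9]d[0-9]*"]


def devnode_blacklisted(devnode, patterns=DEVNODE_BLACKLIST):
    """Return True if any devnode pattern matches the kernel device name.

    Hypothetical helper: it mimics a devnode blacklist lookup and is
    not part of multipath-tools.
    """
    return any(re.search(p, devnode) for p in patterns)


for name in ("sda", "sdb", "cciss!c0d0"):
    print(name, devnode_blacklisted(name))
```

Under these assumptions the pattern matches cciss!c0d0 but neither sda nor sdb, so the two RAID members appear to be excluded only through the wwid entries.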
>
>
> The two disks that are not multipathed have udev rules and
> are part of a software RAID set.  The devices show up as
>
> /etc/udev/rules.d> ls -l /dev/hpdev
> total 0
> lrwxrwxrwx 1 root root 7 Mar 18 10:50 sda1 -> ../sda1
> lrwxrwxrwx 1 root root 7 Mar 18 03:32 sdb1 -> ../sdb1
>
> /dev/md0 is created from sda1 and sdb1.  Here is the bottom of
> the 'mdadm --detail /dev/md0' output:
>
> /root> mdadm --detail /dev/md0
> /dev/md0:
>         Version
> ...
>  Failed Devices : 0
>   Spare Devices : 0
>
>            UUID : 94813396:a3e341fc:c5d4b8ba:0617a019
>          Events : 0.84
>
>     Number   Major   Minor   RaidDevice State
>        0       8        1        0      active sync   /dev/sda1
>        1       8       17        1      active sync   /dev/sdb1
> /root>
>
> I am running on a Linux RHEL5 U2 system.
> /etc/udev/rules.d> uname -a
> Linux n0 2.6.18-53.el5 #1 SMP Wed Oct 10 16:34:19 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux
> /etc/udev/rules.d> multipath -v
> Missing option arguement
> multipath-tools v0.4.7 (03/12, 2006)
> ----------------------------
>
> Problem:
>
> This configuration works well until I do some failure testing
> with one of the 2 blacklisted devs in the software RAID set.
> I found that if I temporarily remove disk sda and put it back
> a minute later the disk path, /dev/hpdev/sda1, is removed,
> even though it is blacklisted.
>
> Here are some of the pertinent /var/log/messages lines:
>
> Mar 19 09:53:19 n0 mdadm: NewArray /dev/md0
> Mar 19 10:05:12 n0 kernel: mptbase: ioc0: LogInfo(0x31170000): Originator={PL}, Code={IO Device Missing Delay Retry}, SubCode(0x0000)
> Mar 19 10:05:40 n0 last message repeated 5 times
> Mar 19 10:05:42 n0 kernel: mptbase: ioc0: LogInfo(0x31130000): Originator={PL}, Code={IO Not Yet Executed}, SubCode(0x0000)
> Mar 19 10:05:42 n0 kernel: sd 0:0:28:0: SCSI error: return code = 0x00010000
> Mar 19 10:05:42 n0 kernel: end_request: I/O error, dev sda, sector 256119
> Mar 19 10:05:42 n0 kernel: Buffer I/O error on device sda1, logical block 256056
> Mar 19 10:05:42 n0 kernel: Buffer I/O error on device sda1, logical block 256057
> Mar 19 10:05:42 n0 kernel: Buffer I/O error on device sda1, logical block 256058
> Mar 19 10:05:42 n0 kernel: Buffer I/O error on device sda1, logical block 256059
> Mar 19 10:05:42 n0 kernel: Buffer I/O error on device sda1, logical block 256060
> Mar 19 10:05:42 n0 kernel: Buffer I/O error on device sda1, logical block 256061
> Mar 19 10:05:42 n0 kernel: mptbase: ioc0: LogInfo(0x31130000): Originator={PL}, Code={IO Not Yet Executed}, SubCode(0x0000)
> Mar 19 10:05:42 n0 kernel: Buffer I/O error on device sda1, logical block 256062
> Mar 19 10:05:42 n0 kernel: Buffer I/O error on device sda1, logical block 256063
> Mar 19 10:05:42 n0 kernel: sd 0:0:28:0: SCSI error: return code = 0x00010000
> Mar 19 10:05:42 n0 multipathd: sda: remove path (uevent)
> Mar 19 10:05:42 n0 kernel: end_request: I/O error, dev sda, sector 585922495
> Mar 19 10:05:42 n0 kernel: Buffer I/O error on device sda1, logical block 585922432
> Mar 19 10:05:42 n0 xinetd[13096]: START: hacl-cfgudp pid=6034 from=127.0.0.1
> Mar 19 10:05:42 n0 multipathd: uevent trigger error
> Mar 19 10:05:42 n0 kernel: Buffer I/O error on device sda1, logical block 585922433
>
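
To pick the multipathd entries out of the kernel noise in a log like the one above, something along these lines might help. It is a minimal sketch that assumes the classic syslog layout ("Mon DD HH:MM:SS host tag: message"); the names SAMPLE_LOG, LINE_RE and messages_from are made up for illustration, and the sample lines are copied from the excerpt above.

```python
import re

# A few lines copied from the /var/log/messages excerpt above.
SAMPLE_LOG = """\
Mar 19 10:05:40 n0 last message repeated 5 times
Mar 19 10:05:42 n0 kernel: sd 0:0:28:0: SCSI error: return code = 0x00010000
Mar 19 10:05:42 n0 multipathd: sda: remove path (uevent)
Mar 19 10:05:42 n0 kernel: end_request: I/O error, dev sda, sector 585922495
Mar 19 10:05:42 n0 xinetd[13096]: START: hacl-cfgudp pid=6034 from=127.0.0.1
Mar 19 10:05:42 n0 multipathd: uevent trigger error
"""

# Assumed layout: timestamp, host, tag (optionally with [pid]), message.
LINE_RE = re.compile(r"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) (\S+) ([^:\s]+): (.*)$")


def messages_from(log_text, tag):
    """Return (timestamp, message) pairs logged under the given tag."""
    found = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line)
        # Strip a trailing "[pid]" so "xinetd[13096]" compares as "xinetd".
        if m and m.group(3).split("[")[0] == tag:
            found.append((m.group(1), m.group(4)))
    return found


for stamp, text in messages_from(SAMPLE_LOG, "multipathd"):
    print(stamp, text)
```

On the sample this leaves just the two multipathd lines, the "remove path (uevent)" for sda and the "uevent trigger error" that follows it. Lines without a "tag:" field, such as "last message repeated 5 times", are skipped.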
>
> The sda path is gone:
>
> /root> ls -l /dev/hpdev
> total 0
> lrwxrwxrwx 1 root root 7 Mar 18 03:32 sdb1 -> ../sdb1
> /root>
>
> And I cannot reassemble the software RAID set.  While the
> mdstat output looks 'normal', the software RAID set is degraded.
>
> /etc/udev/rules.d> cat /proc/mdstat
> Personalities : [raid1]
> md0 : active raid1 sda1[2](F) sdb1[1]
>       292961216 blocks [2/1] [_U]
>
> unused devices: <none>
> /etc/udev/rules.d> mdadm --detail /dev/md0
> /dev/md0:
>         Version : 00.90.03
>   Creation Time : Wed Feb 18 14:15:03 2009
>      Raid Level : raid1
>      Array Size : 292961216 (279.39 GiB 299.99 GB)
>     Device Size : 292961216 (279.39 GiB 299.99 GB)
>    Raid Devices : 2
>   Total Devices : 2
> Preferred Minor : 0
>     Persistence : Superblock is persistent
>
>     Update Time : Thu Mar 19 10:28:45 2009
>           State : clean, degraded
>  Active Devices : 1
> Working Devices : 1
>  Failed Devices : 1
>   Spare Devices : 0
>
>            UUID : 94813396:a3e341fc:c5d4b8ba:0617a019
>          Events : 0.86
>
>     Number   Major   Minor   RaidDevice State
>        0       0        0        0      removed
>        1       8       17        1      active sync   /dev/sdb1
>
>        2       8        1        -      faulty spare
> /etc/udev/rules.d>
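
For what it is worth, the /proc/mdstat text above already carries the degraded state: "[2/1]" means two members expected and one active, "[_U]" marks the missing slot, and "(F)" flags the faulty member. A rough sketch of checking for that in a script follows. The function name degraded_arrays is made up for illustration, and the parsing only assumes the two-line-per-array layout seen in the output above.

```python
import re

# /proc/mdstat content copied from the output above.
MDSTAT = """\
Personalities : [raid1]
md0 : active raid1 sda1[2](F) sdb1[1]
      292961216 blocks [2/1] [_U]

unused devices: <none>
"""


def degraded_arrays(mdstat_text):
    """Map array name -> (expected, active, failed members) if degraded."""
    result = {}
    current = None
    failed = []
    for line in mdstat_text.splitlines():
        head = re.match(r"^(md\d+) : (.*)$", line)
        if head:
            current = head.group(1)
            # Members marked faulty look like "sda1[2](F)".
            failed = re.findall(r"(\w+)\[\d+\]\(F\)", head.group(2))
            continue
        counts = re.search(r"\[(\d+)/(\d+)\]", line)
        if current and counts:
            expected, active = int(counts.group(1)), int(counts.group(2))
            if active < expected:
                result[current] = (expected, active, failed)
            current = None
    return result


print(degraded_arrays(MDSTAT))
# {'md0': (2, 1, ['sda1'])}
```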
>
> The disk is fine and is back in and ready to go.
>
> So, 1) why does multipathd remove the path for a blacklisted
> device?  If it is blacklisted, shouldn't multipathd just
> leave it alone?
> And 2) what can I do to keep this from happening?
>
> Lori Ransegnola
>
