[dm-devel] IBM x460 with directly attached IBM DS4300 (turbo) multipathd TUR path checker problem
Yury Konovalov
ykonovalov at gmail.com
Mon Jan 15 17:17:48 UTC 2007
Hi!
I have run into a weird problem using multipath-tools with an IBM DS4300 Turbo
storage system.
|   Ctrl A    |--ptp fc--| qla2400 HBA-->IBM x460 (first brick)  |
|DS4300(turbo)|          | IBM x460 (dual brick configuration)   |
|   Ctrl B    |--ptp fc--| qla2400 HBA-->IBM x460 (second brick) |
Operating system: SLES9 SP3 x86 (32-bit)
HBA drivers: Native SuSe kernel driver (qla2400)
DS4300 target type: Linux (AVT is enabled)
The problem: the TUR path checker reports unpredictable path failures, which
results in I/O to the corresponding filesystem being suspended. multipathd
reinstates the failed path the next time it invokes the TUR checker (10 sec).
Increasing the path-checking frequency (by reducing polling_interval to 2, as
shown in the config below) doesn't help. In fact, it makes things worse: I have
seen all paths to a LUN failed by the TUR checker at the same time. Without the
"queue_if_no_path" feature, this leads to an I/O error being reported to the
upper level (FS). It can work quite well for a day or so, and then *boom*.
In the DS4300 controller logs I see numerous AVT events happening on various
LUNs from time to time. The interesting thing is that, according to the Linux
logs, the majority of these volume transfers were not initiated by multipathd
(in fact, multipathd did not even detect them). Another strange thing is that
many of the AVT transfers ended up on the same controller on which they started
(as far as I can tell). I have the DS4300 controller log, but it is too big to
paste here.
What I have already tried:
1) Replaced the 4G HBAs (qla2400) with 2G HBAs (qla2300). The problem remains.
2) IOZONE tests. They work great; no path failures were detected during the tests.
3) Played with polling_interval. Didn't help.
I have a similar configuration working well at another site. The working
installation differs from this one as follows:
1) Single brick configuration of IBM x460
2) Different HBA type (qla2300) installed in the host.
3) One HBA instead of two.
4) There is an FC switch between the DS4300 controllers and the HBA.
5) RHEL4 U4 x86_64 instead of SLES9 SP3 x86
Questions:
1) What else can I try to resolve this problem?
2) Is it true that AVT mode cannot be used in a cluster environment (when
two or more nodes access the same LUNs and can thus trigger AVT)?
3) Is there any hope (or need) of adding an RDAC hw handler to dm-multipath? It
seems that some of the work has already been done by Mike Christie
(http://www.redhat.com/archives/dm-devel/2005-October/msg00020.html).
Are there any plans to include this code? Is it in a usable state?
/var/log/messages
-------------
Dec 27 07:30:08 tpc1 multipathd: 8:48: tur checker reports path is down
Dec 27 07:30:08 tpc1 multipathd: checker failed path 8:48 in map oradata1
Dec 27 07:30:08 tpc1 kernel: device-mapper: dm-multipath: Failing path 8:48
Dec 27 07:30:08 tpc1 multipathd: 8:96: tur checker reports path is down
Dec 27 07:30:08 tpc1 multipathd: checker failed path 8:96 in map oradata1
Dec 27 07:30:08 tpc1 kernel: device-mapper: dm-multipath: Failing path 8:96
Dec 27 07:30:09 tpc1 multipathd: 8:112: tur checker reports path is down
Dec 27 07:30:09 tpc1 multipathd: checker failed path 8:112 in map oraredo
Dec 27 07:30:09 tpc1 kernel: device-mapper: dm-multipath: Failing path 8:112
Dec 27 07:30:09 tpc1 kernel: Buffer I/O error on device dm-9, logical block 27696
Dec 27 07:30:09 tpc1 kernel: lost page write due to I/O error on dm-9
Dec 27 07:30:09 tpc1 kernel: Aborting journal on device dm-9.
Dec 27 07:30:11 tpc1 kernel: ext3_abort called.
Dec 27 07:30:11 tpc1 kernel: EXT3-fs abort (device dm-9): ext3_journal_start: Detected aborted journal
Dec 27 07:30:11 tpc1 kernel: Remounting filesystem read-only
Dec 27 07:30:12 tpc1 multipathd: 8:64: tur checker reports path is down
Dec 27 07:30:12 tpc1 kernel: device-mapper: dm-multipath: Failing path 8:64
Dec 27 07:30:12 tpc1 multipathd: checker failed path 8:64 in map oraredo
Dec 27 07:30:13 tpc1 multipathd: 8:48: tur checker reports path is up
Dec 27 07:30:13 tpc1 multipathd: 8:48: reinstated
Dec 27 07:30:13 tpc1 multipathd: oradata1: switch to path group #2
Dec 27 07:30:13 tpc1 multipathd: oradata1: switch to path group #2
Dec 27 07:30:13 tpc1 multipathd: 8:96: tur checker reports path is up
Dec 27 07:30:13 tpc1 multipathd: 8:96: reinstated
Dec 27 07:30:13 tpc1 multipathd: oradata1: switch to path group #1
Dec 27 07:30:14 tpc1 multipathd: oradata1: switch to path group #1
-------------
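To make the timing in this excerpt easier to read, here is a minimal Python sketch that pairs each "path is down" message with the following "path is up" message for the same device. It is only an illustration, not part of multipath-tools: the function name `path_outages` is made up, the embedded lines are copied from the log above, and the year is an assumption (syslog lines carry no year; 2006 is guessed from the posting date).

```python
import re
from datetime import datetime

# multipathd checker lines copied from the /var/log/messages excerpt above.
LOG = """\
Dec 27 07:30:08 tpc1 multipathd: 8:48: tur checker reports path is down
Dec 27 07:30:08 tpc1 multipathd: 8:96: tur checker reports path is down
Dec 27 07:30:09 tpc1 multipathd: 8:112: tur checker reports path is down
Dec 27 07:30:12 tpc1 multipathd: 8:64: tur checker reports path is down
Dec 27 07:30:13 tpc1 multipathd: 8:48: tur checker reports path is up
Dec 27 07:30:13 tpc1 multipathd: 8:96: tur checker reports path is up
"""

EVENT = re.compile(
    r"^(\w{3} +\d+ \d\d:\d\d:\d\d) \S+ multipathd: "
    r"(\d+:\d+): tur checker reports path is (up|down)$"
)


def path_outages(text, year=2006):
    """Return (closed, still_down).

    closed     -- list of (major:minor, seconds down) for paths that came back
    still_down -- sorted list of paths with no "up" event in the text
    The year is an assumption: syslog timestamps do not include one.
    """
    down_since = {}
    closed = []
    for line in text.splitlines():
        m = EVENT.match(line)
        if not m:
            continue  # ignore kernel lines and other multipathd messages
        stamp, dev, state = m.groups()
        when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
        if state == "down":
            down_since.setdefault(dev, when)  # keep the first "down" time
        elif dev in down_since:
            start = down_since.pop(dev)
            closed.append((dev, (when - start).total_seconds()))
    return closed, sorted(down_since)


if __name__ == "__main__":
    closed, still_down = path_outages(LOG)
    for dev, seconds in closed:
        print(f"{dev} was down for {seconds:.0f} s")
    print("no 'up' event in excerpt:", ", ".join(still_down))
```

On these lines the script reports 8:48 and 8:96 (both in map oradata1) as down for 5 seconds each, while 8:112 and 8:64 (both in map oraredo) have no "up" event within the excerpt.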
multipath.conf :
---------------
defaults {
	udev_dir /dev
	multipath_tool "/sbin/multipath -v 0 -S"
	polling_interval 2
	default_path_grouping_policy multibus
	default_getuid_callout "/sbin/scsi_id -g -u -s /block/%n"
	rr_min_io 100
	failback immediate
	no_path_retry fail
}
devnode_blacklist {
	devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
	devnode "^hd[a-z][[0-9]*]"
	devnode "^cciss!c[0-9]d[0-9]*[p[0-9]*]"
	devnode sda
	devnode fd
	devnode hd
	devnode md
	devnode dm
	devnode sr
	devnode scd
	devnode st
	devnode ram
	devnode raw
	devnode loop
}
devices {
	device {
		vendor "IBM "
		product "1722-600 "
		path_grouping_policy group_by_prio
		path_checker tur
		path_selector "round-robin 0"
		prio_callout "/sbin/mpath_prio_tpc /dev/%n"
		failback immediate
		rr_min_io 1000
		features "1 queue_if_no_path"
		no_path_retry 300
	}
}
multipaths {
	multipath {
		wwid 3600a0b80001ff32a000020c2456bf8a0
		alias oradata1
	}
	multipath {
		wwid 3600a0b80001ff3de000042ba456bfcbc
		alias oradata2
	}
	multipath {
		wwid 3600a0b80001ff32a000020c5456bf952
		alias oraredo
	}
	multipath {
		wwid 3600a0b80001ff32a000020c7456bf980
		alias oraarch1
	}
	multipath {
		wwid 3600a0b80001ff3de000042bc456bfcf0
		alias oraarch2
	}
}
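Several keywords in this config appear both in the defaults section and in the device section. As a rough illustration of which values end up applying to the DS4300 LUNs, the sketch below parses a subset of the config and merges the two sections. It assumes the usual multipath-tools precedence in which a matching device entry overrides defaults; the helper names (`parse_blocks`, `effective_settings`) are hypothetical and this is not how multipathd itself reads the file.

```python
import shlex

# Subset of the multipath.conf above (only the keys relevant to the comparison).
CONF = """\
defaults {
    polling_interval 2
    rr_min_io 100
    failback immediate
    no_path_retry fail
}
devices {
    device {
        vendor "IBM "
        product "1722-600 "
        path_checker tur
        failback immediate
        rr_min_io 1000
        features "1 queue_if_no_path"
        no_path_retry 300
    }
}
"""


def parse_blocks(text):
    """Parse 'name { key value ... }' blocks into nested dicts.

    Each block name maps to a list of dicts (blocks may repeat);
    each 'key value' line maps the key to a string.
    """
    root = {}
    stack = [root]
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.endswith("{"):
            child = {}
            stack[-1].setdefault(line[:-1].strip(), []).append(child)
            stack.append(child)
        elif line == "}":
            stack.pop()
        else:
            parts = shlex.split(line)  # honours the double quotes
            stack[-1][parts[0]] = " ".join(parts[1:])
    return root


def effective_settings(conf):
    """Merge defaults with the first device entry (device wins, by assumption)."""
    defaults = conf["defaults"][0]
    device = conf["devices"][0]["device"][0]
    merged = {**defaults, **device}
    overridden = sorted(
        key for key in device if key in defaults and device[key] != defaults[key]
    )
    return merged, overridden


if __name__ == "__main__":
    merged, overridden = effective_settings(parse_blocks(CONF))
    for key in overridden:
        print(f"{key}: device section sets {merged[key]!r}")
```

Under that precedence assumption, rr_min_io and no_path_retry are the two keys whose device-level values (1000 and 300) differ from the defaults (100 and fail); failback is set to the same value in both places.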
multipath -ll output (with no "queue_if_no_path" feature)
---------------------------------------------------------
dm names N
dm table oraarch2 N
dm table oraarch2 N
dm status oraarch2 N
dm info oraarch2 O
dm table oraredo N
dm table oraredo N
dm status oraredo N
dm info oraredo O
dm table oraarch1 N
dm table oraarch1 N
dm status oraarch1 N
dm info oraarch1 O
dm table oradata2 N
dm table oradata2 N
dm status oradata2 N
dm info oradata2 O
dm table oraarch1p1 N
dm table oradata1 N
dm table oradata1 N
dm status oradata1 N
dm info oradata1 O
dm table oraarch2p1 N
dm table oradata1p1 N
dm table oradata2p1 N
dm table oraredo1 N
oraarch2 (3600a0b80001ff3de000042bc456bfcf0)
[size=136 GB][features="0"][hwhandler="0"]
\_ round-robin 0 [prio=6][active]
 \_ 1:0:0:4 sdc 8:32 [active][ready]
\_ round-robin 0 [prio=1][enabled]
 \_ 4:0:0:4 sdk 8:160 [active][ready]
oraredo (3600a0b80001ff32a000020c5456bf952)
[size=136 GB][features="0"][hwhandler="0"]
\_ round-robin 0 [prio=6][active]
 \_ 3:0:0:1 sdh 8:112 [active][ready]
\_ round-robin 0 [prio=1][enabled]
 \_ 2:0:0:1 sde 8:64 [active][ready]
oraarch1 (3600a0b80001ff32a000020c7456bf980)
[size=136 GB][features="0"][hwhandler="0"]
\_ round-robin 0 [prio=6][active]
 \_ 2:0:0:2 sdf 8:80 [active][ready]
\_ round-robin 0 [prio=1][enabled]
 \_ 3:0:0:2 sdi 8:128 [active][ready]
oradata2 (3600a0b80001ff3de000042ba456bfcbc)
[size=817 GB][features="0"][hwhandler="0"]
\_ round-robin 0 [prio=6][active]
 \_ 4:0:0:3 sdj 8:144 [active][ready]
\_ round-robin 0 [prio=1][enabled]
 \_ 1:0:0:3 sdb 8:16 [active][ready]
oradata1 (3600a0b80001ff32a000020c2456bf8a0)
[size=681 GB][features="0"][hwhandler="0"]
\_ round-robin 0 [prio=6][enabled]
 \_ 3:0:0:0 sdg 8:96 [active][ready]
\_ round-robin 0 [prio=1][enabled]
 \_ 2:0:0:0 sdd 8:48 [active][ready]
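The log messages identify paths by major:minor number (for example 8:48), while this listing also shows the sd names. The following sketch, offered only as an illustration, parses the topology lines for two of the maps, lists maps in which no path group is marked [active], and converts major:minor to an sd name. The regular expressions are assumptions fitted to the output format shown here, the function names are hypothetical, and `sd_name` relies on the Linux convention of 16 minors per disk under major 8.

```python
import re

# Two maps copied from the multipath -ll listing above.
TOPOLOGY = r"""
oraredo (3600a0b80001ff32a000020c5456bf952)
[size=136 GB][features="0"][hwhandler="0"]
\_ round-robin 0 [prio=6][active]
 \_ 3:0:0:1 sdh 8:112 [active][ready]
\_ round-robin 0 [prio=1][enabled]
 \_ 2:0:0:1 sde 8:64 [active][ready]
oradata1 (3600a0b80001ff32a000020c2456bf8a0)
[size=681 GB][features="0"][hwhandler="0"]
\_ round-robin 0 [prio=6][enabled]
 \_ 3:0:0:0 sdg 8:96 [active][ready]
\_ round-robin 0 [prio=1][enabled]
 \_ 2:0:0:0 sdd 8:48 [active][ready]
"""

MAP = re.compile(r"^(\S+) \(([0-9a-f]+)\)$")
GROUP = re.compile(r"^\\_ \S+ \d+ \[prio=(\d+)\]\[(\w+)\]$")
PATH = re.compile(
    r"^\\_ (\d+:\d+:\d+:\d+) (\w+) +(\d+):(\d+) +\[(\w+)\]\[(\w+)\]$"
)


def parse_topology(text):
    """Return {map alias: [ {prio, state, paths: [...]}, ... ]}."""
    maps = {}
    groups = None
    for raw in text.splitlines():
        line = raw.strip()
        m = MAP.match(line)
        if m:
            groups = maps.setdefault(m.group(1), [])
            continue
        m = GROUP.match(line)
        if m and groups is not None:
            groups.append(
                {"prio": int(m.group(1)), "state": m.group(2), "paths": []}
            )
            continue
        m = PATH.match(line)
        if m and groups:
            hctl, dev, major, minor, dm_state, chk_state = m.groups()
            groups[-1]["paths"].append(
                {"hctl": hctl, "dev": dev, "devt": (int(major), int(minor)),
                 "dm_state": dm_state, "checker": chk_state}
            )
    return maps


def maps_without_active_group(maps):
    """Aliases of maps in which no path group is in the 'active' state."""
    return sorted(
        name for name, groups in maps.items()
        if not any(g["state"] == "active" for g in groups)
    )


def sd_name(major, minor):
    """Whole-disk name for SCSI disk major 8 (sda..sdp, 16 minors per disk)."""
    if major != 8 or not 0 <= minor < 256 or minor % 16:
        raise ValueError("not a whole disk under major 8")
    return "sd" + chr(ord("a") + minor // 16)


if __name__ == "__main__":
    maps = parse_topology(TOPOLOGY)
    print("maps with no active path group:", maps_without_active_group(maps))
    print("8:48 is", sd_name(8, 48), "and 8:112 is", sd_name(8, 112))
```

For the two maps embedded here, oradata1 is the only one with both path groups shown as [enabled], and the paths named in the log map to sdd (8:48), sde (8:64), sdg (8:96) and sdh (8:112), matching the listing.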
Best Regards,
Yury.