[dm-devel] two paths remain failed on DS6800 after code upgrade
Gianluca Cecchi
gianluca.cecchi at gmail.com
Fri Jun 13 10:14:15 UTC 2008
Hello,
I have a test server connected to an IBM DS6800 storage system.
It is a BL480c blade with two QLogic HBAs, connected to two FC switches.
RH EL 4.6 x86_64 installed (kernel 2.6.9-67.ELsmp)
device-mapper-1.02.21-1.el4
device-mapper-multipath-0.4.5-27.RHEL4
The boot messages for the HBAs show:
qla2400 0000:0c:00.0: Found an ISP2432, irq 185, iobase 0xffffff000001c000
QLogic Fibre Channel HBA Driver: 8.01.07-d4
QLogic QMH2462 - SBUS to 2Gb FC, Dual Channel
ISP2432: PCIe (2.5Gb/s x4) @ 0000:0c:00.0 hdma+, host#=0, fw=4.00.150 [IP]
Vendor: IBM Model: 1750500 Rev: .155
Type: Direct-Access ANSI SCSI revision: 05
On the storage I have access to two LUNs, each seen through four paths, so in
total I get 8 paths, which show up as disks sda to sdh.
For multipath I'm using the default configuration installed with the OS for the
DS6800 (product 1750500), so it should be:
# device {
#         vendor "IBM"
#         product "1750500"
#         path_grouping_policy group_by_prio
#         getuid_callout "/sbin/scsi_id -g -u -s"
#         prio_callout "/sbin/mpath_prio_alua %d"
#         features "1 queue_if_no_path"
#         path_checker tur
# }
In normal operation the command "multipath -ll" gives:
[root@test-rhel-p ~]# multipath -ll
mpath1 (3600507630efe05800000000000001700)
[size=20 GB][features="1 queue_if_no_path"][hwhandler="0"]
\_ round-robin 0 [prio=100][active]
 \_ 0:0:1:1 sdd 8:48  [active][ready]
 \_ 1:0:1:1 sdh 8:112 [active][ready]
\_ round-robin 0 [prio=20][enabled]
 \_ 0:0:0:1 sdb 8:16  [active][ready]
 \_ 1:0:0:1 sdf 8:80  [active][ready]
mpath0 (3600507630efe05800000000000001600)
[size=20 GB][features="1 queue_if_no_path"][hwhandler="0"]
\_ round-robin 0 [prio=100][active]
 \_ 0:0:0:0 sda 8:0   [active][ready]
 \_ 1:0:0:0 sde 8:64  [active][ready]
\_ round-robin 0 [prio=20][enabled]
 \_ 0:0:1:0 sdc 8:32  [active][ready]
 \_ 1:0:1:0 sdg 8:96  [active][ready]
We had a code update on the storage, so I wanted to test the multipath
behaviour.
The update was done in concurrent mode.
I got a first path change without problems, probably when the first controller
was updated:
mpath1:
\_ round-robin 0 [enabled]
 \_ 0:0:0:1 sdb 8:16 [failed]
 \_ 1:0:0:1 sdf 8:80 [failed]
and
mpath0:
\_ round-robin 0 [enabled]
 \_ 0:0:0:0 sda 8:0  [failed]
 \_ 1:0:0:0 sde 8:64 [failed]
while the other two path groups remained active.
At the end of the upgrade, probably when the second controller was updated, I
got the situation below.
While other servers running Windows and Linux (using SDD) came back with all
paths, this server still has two paths in failed state:
[root@test-rhel-p RPMS]# multipath -l
mpath1 (3600507630efe05800000000000001700)
[size=20 GB][features="1 queue_if_no_path"][hwhandler="0"]
\_ round-robin 0 [enabled]
 \_ 0:0:1:1 sdd 8:48  [failed][faulty]
 \_ 1:0:1:1 sdh 8:112 [active]
\_ round-robin 0 [enabled]
 \_ 0:0:0:1 sdb 8:16  [active]
 \_ 1:0:0:1 sdf 8:80  [active]
mpath0 (3600507630efe05800000000000001600)
[size=20 GB][features="1 queue_if_no_path"][hwhandler="0"]
\_ round-robin 0 [active]
 \_ 0:0:0:0 sda 8:0   [active]
 \_ 1:0:0:0 sde 8:64  [active]
\_ round-robin 0 [enabled]
 \_ 0:0:1:0 sdc 8:32  [failed][faulty]
 \_ 1:0:1:0 sdg 8:96  [active]
together with messages like these every 5 seconds:
error calling out /sbin/mpath_prio_alua /dev/sdc
error calling out /sbin/mpath_prio_alua /dev/sdd
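
To pick the failed paths out of a listing like the one above without reading it
by eye, a small parser is enough. This is only a rough Python sketch: it assumes
the 0.4.5-style layout shown in this mail, and the names failed_paths, PATH_RE,
MAP_RE and SAMPLE are made up for the illustration, not part of multipath-tools.

```python
import re

# Path line, e.g. " \_ 0:0:1:1 sdd 8:48  [failed][faulty]".
PATH_RE = re.compile(
    r"\\_\s+(?P<hctl>\d+:\d+:\d+:\d+)\s+(?P<dev>sd[a-z]+)\s+"
    r"(?P<majmin>\d+:\d+)\s+(?P<states>(?:\[\w+\])+)"
)
# Map header, e.g. "mpath1 (3600507630efe05800000000000001700)".
MAP_RE = re.compile(r"^(?P<name>\S+)\s+\((?P<wwid>[0-9a-f]+)\)")


def failed_paths(listing):
    """Return (map name, H:C:T:L, device) for every path marked [failed]."""
    current_map = None
    failed = []
    for raw in listing.splitlines():
        line = raw.strip()
        header = MAP_RE.match(line)
        if header:
            current_map = header.group("name")
            continue
        path = PATH_RE.search(line)
        if path and "failed" in re.findall(r"\[(\w+)\]", path.group("states")):
            failed.append((current_map, path.group("hctl"), path.group("dev")))
    return failed


# Excerpt of the "multipath -l" output quoted in this mail.
SAMPLE = r"""
mpath1 (3600507630efe05800000000000001700)
[size=20 GB][features="1 queue_if_no_path"][hwhandler="0"]
\_ round-robin 0 [enabled]
 \_ 0:0:1:1 sdd 8:48  [failed][faulty]
 \_ 1:0:1:1 sdh 8:112 [active]
mpath0 (3600507630efe05800000000000001600)
[size=20 GB][features="1 queue_if_no_path"][hwhandler="0"]
\_ round-robin 0 [enabled]
 \_ 0:0:1:0 sdc 8:32  [failed][faulty]
 \_ 1:0:1:0 sdg 8:96  [active]
"""

if __name__ == "__main__":
    for map_name, hctl, dev in failed_paths(SAMPLE):
        print(map_name, hctl, dev)
```

On the excerpt it reports sdd (0:0:1:1) under mpath1 and sdc (0:0:1:0) under
mpath0, the same two devices named in the mpath_prio_alua errors.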
Other information:
[root@test-rhel-p ]# sg_inq /dev/sdc
sg_inq: error opening file: /dev/sdc: No such device or address
[root@test-rhel-p RPMS]# ll /dev/sdc
brw-rw---- 1 root disk 8, 32 Jun 11 19:03 /dev/sdc
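
The node itself looks right: 8, 32 is the same major:minor pair that multipath
shows for sdc, so the open fails even though the device file is present. As a
cross-check, the mapping from disk name to device numbers can be sketched in
Python. It assumes the usual Linux numbering for the first 16 sd disks (major 8,
16 minors per disk), and the helper name sd_dev_numbers is made up for the
illustration.

```python
import string

SD_MAJOR = 8          # block major used by the first 16 SCSI disks
MINORS_PER_DISK = 16  # whole disk plus 15 partitions


def sd_dev_numbers(name):
    """Return (major, minor) for a whole-disk name from sda to sdp."""
    valid_letters = string.ascii_lowercase[:16]
    if len(name) != 3 or not name.startswith("sd") or name[2] not in valid_letters:
        raise ValueError("expected a name between sda and sdp, got %r" % name)
    index = ord(name[2]) - ord("a")
    return SD_MAJOR, index * MINORS_PER_DISK


if __name__ == "__main__":
    for disk in ("sdc", "sdd"):
        print(disk, sd_dev_numbers(disk))
```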
[root@test-rhel-p RPMS]# sg_inq /dev/sda
standard INQUIRY:
PQual=0 Device_type=0 RMB=0 version=0x05 [SPC-3]
[AERC=0] [TrmTsk=0] NormACA=1 HiSUP=1 Resp_data_format=2
SCCS=0 ACC=0 TGPS=1 3PC=0 Protect=0 BQue=0
EncServ=0 MultiP=1 (VS=0) [MChngr=0] [ACKREQQ=0] Addr16=0
[RelAdr=0] WBus16=0 Sync=0 Linked=0 [TranDis=0] CmdQue=1
Clocking=0x0 QAS=0 IUS=0
length=164 (0xa4) Peripheral device type: disk
Vendor identification: IBM
Product identification: 1750500
Product revision level: .441
Unit serial number: 68778501600
Any help on how to bring the paths back up?
Would a SCSI rescan help? What would be the correct command in this
case?
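
For reference, on 2.6 kernels a single path is usually dropped and re-probed
through sysfs: writing 1 to the device's delete file, then writing
"channel target lun" to the host's scan file. The Python sketch below only
builds those writes for a given H:C:T:L and does not touch the system. The
function name rescan_plan is made up for the illustration, and whether this
sequence clears the failed state in this particular case is untested.

```python
def rescan_plan(hctl, dev):
    """Build the sysfs writes for dropping and re-probing one SCSI path.

    Nothing is written here: the function only returns (file, value) pairs,
    so the plan can be reviewed before anything is applied by hand as root.
    """
    host, channel, target, lun = hctl.split(":")
    return [
        # Step 1: remove the stale SCSI device from the kernel.
        ("/sys/block/%s/device/delete" % dev, "1"),
        # Step 2: ask the HBA host to probe that channel/target/LUN again.
        ("/sys/class/scsi_host/host%s/scan" % host,
         "%s %s %s" % (channel, target, lun)),
    ]


if __name__ == "__main__":
    # The two failed paths from the listing: sdc (0:0:1:0) and sdd (0:0:1:1).
    for hctl, dev in (("0:0:1:0", "sdc"), ("0:0:1:1", "sdd")):
        for path, value in rescan_plan(hctl, dev):
            print("echo '%s' > %s" % (value, path))
```

Note that the re-probed disk may come back under a different sd name, so the
multipath maps would need to be checked again afterwards.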
The system is operational and users have had no interruption in disk access,
but I don't understand why the paths don't come back up...
Thanks in advance for help or suggestions.
Gianluca