I've had problems like this on 3PAR too. What kernel version are you
using? It almost always happened when the SAN got an RSCN (usually
when another server was rebooted). I found that, at least in kernel
2.6.11.7, if I changed the line
bio->bi_rw |= (1 << BIO_RW_FAILFAST); to
bio->bi_rw |= (0 << BIO_RW_FAILFAST);
in drivers/md/dm-mpath.c
the problem went away. Now, in the newest kernels, after the big change
to the qla drivers (2.6.12-rc? and beyond, I believe), I no longer need
the above change, but I now sometimes get aborts (these apparently come
from the QLogic card). The aborts recover, but I have been unable to
determine why I am getting them.
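
To spell out what the one-line change above does, here is a minimal
standalone sketch (userspace C, not kernel code). The bit position used
for BIO_RW_FAILFAST and the function names are assumptions made up for
illustration; only the shift-and-OR pattern is taken from the line
quoted above. Since shifting 0 by any amount is still 0, the modified
line ORs in nothing, so the failfast bit is simply never set on the bio.

```c
#include <assert.h>

/* Assumption: illustrative bit position only. The real value comes
 * from the kernel's bio flag definitions for your kernel version. */
#define BIO_RW_FAILFAST 3

/* Mirrors the original line: OR the failfast bit into the flags. */
unsigned long with_failfast(unsigned long bi_rw)
{
    bi_rw |= (1UL << BIO_RW_FAILFAST);
    return bi_rw;
}

/* Mirrors the modified line: (0 << n) is 0, so the OR is a no-op
 * and the flags come back unchanged. */
unsigned long without_failfast(unsigned long bi_rw)
{
    bi_rw |= (0UL << BIO_RW_FAILFAST);
    return bi_rw;
}

/* Hypothetical self-check showing the difference between the two. */
void check_failfast_demo(void)
{
    unsigned long flags = 1UL; /* some other flag already set */

    /* Original line: failfast bit is set, other bits preserved. */
    assert(with_failfast(flags) & (1UL << BIO_RW_FAILFAST));
    assert(with_failfast(flags) & 1UL);

    /* Modified line: nothing changes at all. */
    assert(without_failfast(flags) == flags);
}
```

In other words, the edit is equivalent to deleting the line, which would
be consistent with the problem being tied to failfast handling when the
RSCN arrives, though that is only an inference from the symptom.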
Andy