[Linux-cluster] Shared storage problems with LSI controller


I have a configuration with two servers and a shared storage cabinet
(connected via two *independent* SCSI buses) that produces fatal SCSI
errors when one server is doing a lot of I/O while the other server is
rebooting (i.e. loading the Linux driver and initializing the controller).

This problem is fully reproducible with the latest RHEL4 kernel, but
it is *not* reproducible with RHEL5b2.

When using this shared device with cluster suite and GFS (I only tried
this with RHEL4), the GFS filesystem is damaged beyond repair when one
node reboots!

I see some Bugzilla entries about this driver (although with different
errors), and when Googling I found some more complaints about weak error
handling/recovery in this driver.

I tried to port the MPT Fusion driver from the RHEL5b2 kernel to the
RHEL4 kernel, but this seems to require some non-trivial backporting.

Is this indeed a problem with the LSI driver?  Are there any upgrades
for the driver that can be compiled for the RHEL4 kernels?


Jos Vos
--    X/OS Experts in Open Systems BV   |   Phone: +31 20 6938364
--    Amsterdam, The Netherlands        |     Fax: +31 20 6948204

