[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

Re: [dm-devel] [PATCH 0/4] scsi_dh: Fix for handler attach and code clean up

On 11/01/2011 06:19 PM, Moger, Babu wrote:
These series of patches handles following things.
1. Fixes handler attach issue.
2. Introduces match function for all the handlers
3. cleans up the scsi_dh code

I have noticed the attach issue during our multipath testing. We found that during
the lun discovery there were lots of error messages like below..

Oct 25 08:24:43 kswm-mihosi kernel: sd 0:0:0:7: [sdav]  Result: hostbyte=DID_OK
Oct 25 08:24:43 kswm-mihosi kernel: sd 0:0:0:7: [sdav]  Sense Key : Illegal
Request [current]
Oct 25 08:24:43 kswm-mihosi kernel: sd 0:0:0:7: [sdav]<<vendor>>  ASC=0x94
ASCQ=0x1ASC=0x94 ASCQ=0x1
Oct 25 08:24:43 kswm-mihosi kernel: sd 0:0:0:7: [sdav] CDB: Read(10): 28 00 00
00 00 00 00 00 80 00
Oct 25 08:24:43 kswm-mihosi kernel: end_request: I/O error, dev sdav, sector 0

These messages were coming in spite of having scsi_dh_rdac in initrd.  Reason
for these errors are due to device handler not being attached properly. If the
device handler was attached then the I/O's should not go to passive paths.

Investigating further we found that there errors started with the introduction
of these patches below.


This patch introduces the match function for device handlers. But the match function
was added only to scsi_dh_alua handler.

Reason for the failure is, if the match function is not available then scsi_dh calls
scsi_get_device_flags_keyed(sdev, sdev->vendor, sdev->model, SCSI_DEVINFO_DH)
which compares the exact vendor and product strings from sdev(which comes from inquiry).

While setting up the scsi_dev_info, handlers use the short strings(example below).
Look at scsi_register_device_handler code. If  you look closely the handlers have
only first few characters of the model string. So, scsi_get_device_flags_keyed
fails and handler will not be attached. This applies to all the handlers.

Example :

         Strings in scsi_dh_rdac handler..
                     {"IBM", "1746"},

         Actual string..
                      "IBM", "1746      FAStT"

Couple of ways we can fix this problem.
1. List the complete model strings in handlers for all the products.
2. Introduce match function for all the handler and remove the call to scsi_get_device_flags_keyed.

I think the option 2 is the best way to fix this problem. This removes lot of code in scsi_dh and
simplifies the things..

TESTED the patches on NetApp E series storage.

And this was the main intention of the original patch, introducing the ->match() function.

However, one thing to keep in mind here:
What do we do if two ->match() function trigger?
IE a NetApp E series in ALUA mode would have both match functions from scsi_dh_alua and scsi_dh_rdac matching. As there is no sane way of keeping the module order (having alua always first in the list of modules is very error-prone), I would think we'd need to introduce some checks in the vendor-specific device-handler to have the match() function returning 'false' if the respective device is in ALUA mode.

This doesn't affect multipathing at all, it's just for having a sane default when the system boots up.


Dr. Hannes Reinecke		      zSeries & Storage
hare suse de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)

[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]