
Re: [dm-devel] SCSI's heuristics for enabling WRITE SAME still need work [was: dm mpath: disable WRITE SAME if it fails]

On 13-09-20 06:03 PM, Martin K. Petersen wrote:
"Mike" == Mike Snitzer <snitzer redhat com> writes:


Mike> AFAIK the reason for these heuristics is that devices which do
Mike> support WRITE SAME but cannot properly report as much, because
Mike> they don't support RSOC, are apparently very common?

Only a handful of the very latest and greatest devices support RSOC. The
number of devices that support WRITE SAME is orders of magnitude larger.

Last I checked I had exactly 1 out of about 100 devices in my lab that
supported RSOC.

Mike> I can appreciate the idea behind the current heuristics but I
Mike> think the prevalence of the other side of the spectrum (SCSI
Mike> devices that don't support RSOC or WRITE SAME) was underestimated.

If by "devices" you mean vintage PCI RAID controllers that don't pass
things through correctly, then yes. I don't think I have a single SCSI
drive that doesn't support WRITE SAME. And all the controllers I tested
with here worked fine.

Mike> As I say in that comment: "A proper fix could be to make SCSI's
Mike> default be to disable WRITE SAME for devices that don't properly
Mike> report they support it.  And possibly have a whitelist to opt-in
Mike> to enabling WRITE SAME for select targets."

The problem with the opt-in approach is that there are orders of
magnitude more devices that would need to get it enabled than there are
broken ones that need it disabled.

There are only a couple of handfuls of RAID controller drivers. We've
been working through the issues on these on a case by case basis.

Yes, I totally agree it sucks. And I hate that things broke for people
with Areca and 3ware. But we got those fixed. And it's way easier to
blacklist "all devices hanging off RAID driver xyz" than it is to
whitelist every SCSI drive known to man. It sucks in the short term but
is better long term.
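
To make the opt-out policy argued for above concrete, here is a minimal
sketch in C. It is not the kernel's actual code: the driver names and the
function write_same_enabled_by_default() are hypothetical, chosen only to
show that WRITE SAME stays on unless the host driver is on a short
blacklist.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical blacklist: host drivers known not to pass WRITE SAME
 * through correctly. The names are placeholders, not real drivers. */
static const char *const ws_blacklisted_drivers[] = {
    "raid_driver_xyz",
    "raid_driver_abc",
};

/* Opt-out policy: WRITE SAME is enabled by default and only disabled
 * for devices hanging off a blacklisted driver. */
bool write_same_enabled_by_default(const char *driver)
{
    size_t n = sizeof(ws_blacklisted_drivers) /
               sizeof(ws_blacklisted_drivers[0]);

    for (size_t i = 0; i < n; i++) {
        if (strcmp(driver, ws_blacklisted_drivers[i]) == 0)
            return false;
    }
    return true;
}
```

The list stays short because it names drivers rather than individual
drives, which is the trade-off described above.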

The major headache here of course is that WRITE SAME is inherently
destructive. We can't just fire off one during discovery and see if it
works. For WRITE you can issue a command with a transfer length of 0 to
see if things work. But unfortunately for WRITE SAME a transfer length
of zero means "wipe the entire device". Yikes!

I guess we could read one sector and try to write it back using WRITE
SAME and a block count of one. But it's really icky. And I don't like
the notion of actually writing things during discovery.
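
For reference, a rough sketch in C of what a WRITE SAME(16) CDB with a
block count of one might look like. The function build_write_same16() is
a hypothetical helper, not an existing kernel or library API. It refuses
a zero block count, because on a device that does not set WSNZ a zero
count covers everything from the given LBA to the end of the medium.

```c
#include <stdint.h>
#include <string.h>

#define WRITE_SAME_16_OPCODE 0x93

/* Fill in a 16-byte WRITE SAME(16) CDB. Returns 0 on success, or -1 if
 * nr_blocks is zero, since a zero count may mean "to the end of the
 * medium" rather than "do nothing". Hypothetical helper. */
int build_write_same16(uint8_t cdb[16], uint64_t lba, uint32_t nr_blocks)
{
    if (nr_blocks == 0)
        return -1;

    memset(cdb, 0, 16);
    cdb[0] = WRITE_SAME_16_OPCODE;

    /* Bytes 2..9: logical block address, big-endian. */
    for (int i = 0; i < 8; i++)
        cdb[2 + i] = (uint8_t)(lba >> (8 * (7 - i)));

    /* Bytes 10..13: number of logical blocks, big-endian. */
    for (int i = 0; i < 4; i++)
        cdb[10 + i] = (uint8_t)(nr_blocks >> (8 * (3 - i)));

    return 0;
}
```

A probe along the lines described above would read one sector, then send
this CDB with nr_blocks set to 1 and the same sector as the data-out
buffer. It is still a write to the medium.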

As far as being able to trigger a restacking of the queue limits I think
it's inevitable. We see more and more devices that change properties
after a firmware upgrade. I think we'll just have to bite the bullet and
work on that...

Would a closer examination of the available VPD pages
help? For example support for the Logical Block
Provisioning and Block Limits VPD pages. Given either
of those two pages, even if the WRITE SAME specific
fields in those pages are not set, it is unlikely that
sending a WRITE SAME (when actually required rather
than at discovery) would wedge the disk/controller.

If WSNZ is set in the Block Limits VPD page then it
should be "safe" ** to send a zero length WRITE SAME
command to a LU. And that is another good reason to
check the response of a VPD page request carefully
(e.g. the echo-ed page_code in byte 1 and a sensible
page_length in bytes 2 and 3) since crap devices often
return a standard INQUIRY response to a VPD page request.
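
The checks described above could be sketched in C roughly as follows.
The function names are hypothetical. The sketch assumes that buf holds
at least len valid bytes of the response, and that WSNZ is bit 0 of
byte 4 of the Block Limits VPD page (0xb0), as laid out in SBC-3.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define VPD_BLOCK_LIMITS 0xb0

/* Return true if buf looks like a real response to a request for VPD
 * page `page`, rather than, say, a standard INQUIRY response. */
bool vpd_response_is_sane(const uint8_t *buf, size_t len, uint8_t page)
{
    if (len < 4)
        return false;

    /* Byte 1 must echo the requested page code. */
    if (buf[1] != page)
        return false;

    /* Bytes 2..3: big-endian page length; zero is not sensible. */
    size_t page_len = ((size_t)buf[2] << 8) | buf[3];
    if (page_len == 0)
        return false;

    return true;
}

/* Return true only if the response is a sane Block Limits VPD page and
 * its WSNZ bit (byte 4, bit 0) is set. */
bool block_limits_wsnz(const uint8_t *buf, size_t len)
{
    if (len < 5 || !vpd_response_is_sane(buf, len, VPD_BLOCK_LIMITS))
        return false;

    return (buf[4] & 0x01) != 0;
}
```

A standard INQUIRY response fails the page code check, because its
byte 1 holds the RMB flag rather than 0xb0.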

Doug Gilbert

** there is a departure from the normal "do nothing when
   transfer_length or number_of_LB fields are zero":
   In the case of the WS command with number_of_LBs=0
   and WSNZ=1 in the BLOCK LIMITS VPD page, the response
   should be an error (CHECK CONDITION, ILLEGAL REQUEST).
   I guess if it yielded status=GOOD in that case and
   you heard the disk clicking, you might get quite
   worried :-)
