[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

Re: [dm-devel] [PATCH 5/7] dm core: reject I/O violating new queue limits



Hi Hannes,

I'm very sorry for the huge delay, but please see below.

On 2009/04/28 16:49 +0900, Kiyoshi Ueda wrote:
> Hi Hannes,
> 
> On 2009/04/24 17:59 +0900, Hannes Reinecke wrote:
>> Hi Kiyoshi,
>>
>> Kiyoshi Ueda wrote:
>>> This patch detects requests violating the queue limitations
>>> and rejects them.
>>>
>>> The same limitation checks are done when requests are submitted
>>> to the queue by blk_insert_cloned_request().
>>> However, such violation can happen if a table is swapped and
>>> the queue limitations are shrunk while some requests are
>>> in the queue.
>>>
>>> Since struct request is a reliable one in the block layer and
>>> device drivers, dispatching such requests is pretty dangerous.
>>> (e.g. it may cause kernel panic easily.)
>>> So avoid to dispatch such problematic requests in request-based dm.
>>>
>> This patch actually triggers accidentally during a no-paths scenario;
>> multipathing seems to flush all device details once all paths are gone,
>> so it'll fall back to the system defaults.
>> And then this check will trigger and kill all queued I/Os. Not good.
>>
>> So either we fix device-mapper to keep the device details even for
>> the all-paths down scenario (basically we'd have to copy the device
>> details into the target_io structures and use that for comparison)
>> or we should rather not check here.
> 
> Thank you for your review and the comment.
> I haven't understood the problem correctly yet.
> (e.g. Who flush the device details and how?  Restrictions are
>  changed when the table is swapped, but no table swapping should happen
>  even if the last path is gone while the device is in use.)

I tried to understand/reproduce your scenario, but I can't.
I found a bug (http://marc.info/?l=dm-devel&m=124572577028515&w=2)
during my testing (which lets all paths down at once), but it is
an user-space bug, not a kernel bug.

Could you elaborate your problem related to the patch, if you have
detailed analisys?
(e.g. I can't see the code-path which flushes device details
      on an all-paths down scenario.)

Thanks,
Kiyoshi Ueda


[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]