[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

Re: [dm-devel] [PATCH 8/8] dm-mpath: do not activate failed paths



On 02/28/2014 10:37 AM, Junichi Nomura wrote:
> On 02/28/14 18:32, Hannes Reinecke wrote:
>> On 02/28/2014 09:49 AM, Junichi Nomura wrote:
>>> On 02/27/14 16:30, Hannes Reinecke wrote:
>>>> activate_path() is run without a lock, so the path might be
>>>> set to failed before activate_path() had a chance to run.
>>>>
>>>> Signed-off-by: Hannes Reinecke <hare suse de>
>>>> ---
>>>>  drivers/md/dm-mpath.c | 7 +++++--
>>>>  1 file changed, 5 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
>>>> index 0a40fa9..22e7365 100644
>>>> --- a/drivers/md/dm-mpath.c
>>>> +++ b/drivers/md/dm-mpath.c
>>>> @@ -1196,8 +1196,11 @@ static void activate_path(struct work_struct *work)
>>>>  	struct pgpath *pgpath =
>>>>  		container_of(work, struct pgpath, activate_path.work);
>>>>  
>>>> -	scsi_dh_activate(bdev_get_queue(pgpath->path.dev->bdev),
>>>> -				pg_init_done, pgpath);
>>>> +	if (pgpath->is_active)
>>>> +		scsi_dh_activate(bdev_get_queue(pgpath->path.dev->bdev),
>>>> +				 pg_init_done, pgpath);
>>>> +	else
>>>> +		pg_init_done(pgpath, SCSI_DH_DEV_OFFLINED);
>>>
>>> Hi Hannes,
>>>
>>> What problem are you going to fix with this?
>>> It is still possible that the path is set to failed just after
>>> "if (pgpath->is_active)" and before "scsi_dh_activate".
>>>
>> activate_path() is run from a workqueue, so there is a race window
>> between the check ->is_active in __pg_init_all_paths() and the time
>> the scsi_dh_activate() is executed. And as activate_path)() is run
>> without any locks we don't have any synchronization with fail_path().
>> So it's well possible that a path gets failed in between
>> __pg_init_all_paths() and activate_paths().
> 
> What I pointed is that your patch makes the window very small
> but doesn't close it.
> If the race is a problem, this patch doesn't fix the problem.
> 
True. As said I just want to avoid unnecessary overhead by calling
functions which are known to be failing.
This patch actually makes more sense in combination with the 'accept
failed paths' patch, where it's much more likely to trigger.
So if you insist I could move it over to there.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare suse de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)


[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]