[Date Prev][Date Next] [Thread Prev][Thread Next]
[Thread Index]
[Date Index]
[Author Index]
Re: [dm-devel] [RFC][PATCH 0/3] dm-raid1: fix deadlock at suspend after suspend was interrupted
- From: Kiyoshi Ueda <k-ueda ct jp nec com>
- To: device-mapper development <dm-devel redhat com>
- Subject: Re: [dm-devel] [RFC][PATCH 0/3] dm-raid1: fix deadlock at suspend after suspend was interrupted
- Date: Wed, 20 Jan 2010 11:47:56 +0900
Hi Yasui-san,
On 01/20/2010 05:40 AM +0900, Takahiro Yasui wrote:
> Hi,
>
> This is a patch set to fix deadlock on suspending of mirror device.
>
>
> ISSUE
> =====
>
> Suspend procedure on a dm-mirror device could cause deadlock on recovery_count
> semaphore.
>
> When mirror_presuspend is called, recovery_count semaphore is acquired in
> dm_rh_stop_recovery() to stop recovery routine, but when an signal is caught
> in dm_wait_for_completion() or an error occurred in in dm_suspend(),
> the suspend process is interrupted without releasing recovery_count semaphore
> of a mirror device. This means that another suspend is executed, and then
> the suspend process gets stuck at dm_rh_stop_recovery().
>
> When suspend procedure is interrupted, the device should work properly since
> the status of the device is not "suspended."
>
>
> SOLUTION
> ========
>
> Introduce a target handler, cancel_presuspend, to cancel status changes
> done by a target specific presuspend handler.
How about using ->resume as a cancelling method?
Though you have to audit existing targets' ->resume handler,
I think it's better idea than adding another target handler
just for this purpose.
And in your dm-raid1 patch, cancelling log's presuspend which is used
by dm-log-userspace is missed.
So it seems that dm-raid1 can use ->resume to cancel presuspend.
Thanks,
Kiyoshi Ueda
[Date Prev][Date Next] [Thread Prev][Thread Next]
[Thread Index]
[Date Index]
[Author Index]