[dm-devel] How to unload a module? (Was: rqdm: bad usage of dm_get/dm_put)

Mikulas Patocka mpatocka at redhat.com
Fri Feb 26 20:57:21 UTC 2010



On Thu, 25 Feb 2010, Kiyoshi Ueda wrote:

> Hi Mikulas,
> 
> On 02/25/2010 07:33 AM +0900, Mikulas Patocka wrote:
> >> Indeed, we shouldn't use the current dm_put() in any interrupt-context.
> >> But the "mapped_device" can disappear in request-based dm while there
> >> is a request after all bios complete, so I used dm_get()/dm_put() there.
> >> I'll consider another way to prevent the problem without dm_get()/dm_put().
> >> E.g. wait for request completion in dm_put() instead.
> > 
> > How can a request-in-progress exist when all the bios complete and the 
> > device is closed?
> 
> In the current request-based dm, the device opener can remove
> the mapped_device while the last request is still completing,
> because bios in the last request complete first and then the device
> opener can remove the mapped_device before the last request completes:
>  CPU0                                           CPU1
>  ======================================================================
>  <<INTERRUPT>>
>  blk_end_request_all(clone_rq)
>    blk_update_request(clone_rq)
>      bio_endio(clone_bio) == end_clone_bio
>        blk_update_request(orig_rq)
>          bio_endio(orig_bio)
>                                                 <<I/O completed>>
>                                                 dm_blk_close()
>                                                 dev_remove()
>                                                   dm_put(md)
>                                                     <<Free md>>
>    blk_finish_request(clone_rq)
>      ....
>      dm_end_request(clone_rq)
>        free_rq_clone(clone_rq)
>        blk_end_request_all(orig_rq)
>        rq_completed(md)
> 
> So we need a mechanism to defer the md deletion until the last request
> completes.
> 
> Thanks,
> Kiyoshi Ueda
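
To make the requirement concrete, here is a minimal userspace sketch of 
deferring deletion until the last request completes, using a plain 
reference count. Everything in it (struct md_model, md_model_get(), 
md_model_put(), replay_remove_during_completion()) is a hypothetical 
stand-in, not dm code. It only frees memory on the final put, so it says 
nothing about whether the real dm_put() is usable from interrupt context, 
which is the problem noted above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct mapped_device; not the kernel type. */
struct md_model {
	atomic_int holders;	/* creator + each in-flight request */
	bool *freed;		/* set on release, for demonstration only */
};

static struct md_model *md_model_alloc(bool *freed_flag)
{
	struct md_model *md = malloc(sizeof(*md));

	if (!md)
		return NULL;
	atomic_init(&md->holders, 1);	/* the creator's reference */
	md->freed = freed_flag;
	*freed_flag = false;
	return md;
}

static void md_model_get(struct md_model *md)
{
	atomic_fetch_add(&md->holders, 1);
}

/* Drop one reference; whoever drops the last one frees the object. */
static void md_model_put(struct md_model *md)
{
	if (atomic_fetch_sub(&md->holders, 1) == 1) {
		*md->freed = true;
		free(md);
	}
}

/*
 * Replays the interleaving from the diagram in a single thread.
 * Returns true if md survived until the request finished and was
 * freed afterwards.
 */
static bool replay_remove_during_completion(void)
{
	bool freed;
	bool alive_during_completion;
	struct md_model *md = md_model_alloc(&freed);

	if (!md)
		return false;

	md_model_get(md);	/* request issued: pin md for its lifetime */
	/* CPU0: all bios of the last request complete here */
	md_model_put(md);	/* CPU1: dev_remove() drops the creator's ref */
	alive_during_completion = !freed;
	/* CPU0: the rq_completed(md) step would run here, md still valid */
	md_model_put(md);	/* request finished: last ref, md is freed */

	return alive_during_completion && freed;
}
```

In this model the removal on CPU1 no longer frees md; the free moves to 
whichever side drops the last reference.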

Good point ... but I think this problem may happen even in normal 
non-request-based dm.

I don't know what to do with it.

If one thread does:
- bio_endio
				and another thread does:
				- close the device
				- remove the device
				- unload module
- then the first thread, after bio_endio, executes nonexistent 
instructions from the unloaded module.
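
The interleaving above can be replayed as a small single-threaded model. 
This is only a sketch: struct model and its helper functions are 
hypothetical names, and the assumption that each step checks nothing but 
the state shown here is a simplification, not a description of the real 
unload path.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state; none of these names are kernel APIs. */
struct model {
	bool device_open;
	bool device_exists;
	bool module_loaded;
	bool completion_in_module;	/* thread 1 still has module code to run */
};

/* Thread 2's steps; each checks only the precondition it can see. */
static bool close_device(struct model *m)
{
	m->device_open = false;
	return true;
}

static bool remove_device(struct model *m)
{
	if (m->device_open)
		return false;
	m->device_exists = false;
	return true;
}

static bool unload_module(struct model *m)
{
	if (m->device_exists)
		return false;
	/* Nothing here looks at completion_in_module. */
	m->module_loaded = false;
	return true;
}

/* Returns true if thread 1 resumes inside a module that is already gone. */
static bool replay_unload_race(void)
{
	struct model m = { true, true, true, false };
	bool unloaded;

	/* Thread 1: calls bio_endio from module code, has not returned yet. */
	m.completion_in_module = true;

	/* Thread 2: sees the I/O as finished and tears everything down. */
	unloaded = close_device(&m) && remove_device(&m) && unload_module(&m);

	/* Thread 1: continues after bio_endio. */
	return unloaded && m.completion_in_module && !m.module_loaded;
}
```

Every step taken by the second thread succeeds in the model because none 
of the state it checks records that the first thread is still executing 
module code.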

Any ideas on how this is solved, or how it should be solved?

Module unloading does stop_machine, but AFAIK that only waits for all CPUs 
to exit non-preemptible sections; it doesn't wait for the code to get out 
of the disk request routine...

Mikulas



