[dm-devel] [RFC PATCH 2/2] mm, mempool: do not throttle PF_LESS_THROTTLE tasks

Mikulas Patocka mpatocka at redhat.com
Mon Jul 25 21:52:17 UTC 2016



On Sat, 23 Jul 2016, NeilBrown wrote:

> "dirtying ... from the reclaim context" ??? What does that mean?
> According to
>   Commit: 26eecbf3543b ("[PATCH] vm: pageout throttling")
> From the history tree, the purpose of throttle_vm_writeout() is to
> limit the amount of memory that is concurrently under I/O.
> That seems strange to me because I thought it was the responsibility of
> each backing device to impose a limit - a maximum queue size of some
> sort.

Device mapper doesn't impose any limit for in-flight bios.

Some simple device mapper targets (such as linear or stripe) pass bio 
directly to the underlying device with generic_make_request, so if the 
underlying device's request limit is reached, the target's request routine 
waits.

However, complex dm targets (such as dm-crypt, dm-mirror, dm-thin) pass 
bios to a workqueue that processes them. And since there is no limit on 
the number of workqueue entries, there is no limit on the number of 
in-flight bios.

I've seen a case when I had a HPFS filesystem on dm-crypt. I wrote to the 
filesystem, there was about 2GB dirty data. The HPFS filesystem used 
512-byte bios. dm-crypt allocates one temporary page for each incoming 
bio. So, there were 4M bios in flight, each bio allocated 4k temporary 
page - that is attempted 16GB allocation. It didn't trigger OOM condition 
(because mempool allocations don't ever trigger it), but it temporarily 
exhausted all computer's memory.

I've made some patches that limit in-flight bios for device mapper in the 
past, but there were not integrated into upstream.

> If a thread is only making transient allocations, ones which will be
> freed shortly afterwards (not, for example, put in a cache), then I
> don't think it needs to be throttled at all.  I think this universally
> applies to mempools.
> In the case of dm_crypt, if it is writing too fast it will eventually be
> throttled in generic_make_request when the underlying device has a full
> queue and so blocks waiting for requests to be completed, and thus parts
> of them returned to the mempool.

No, it won't be throttled.

dm-crypt does:
1. pass the bio to the encryption workqueue
2. allocate the outgoing bio and allocate temporary pages for the 
   encrypted data
3. do the encryption
4. pass the bio to the writer thread
5. submit the write request with generic_make_request

So, if the underlying block device is throttled, it stalls the writer 
thread, but it doesn't stall the encryption threads and it doesn't stall 
the caller that submits the bios to dm-crypt.

There can be really high number of in-flight bios for dm-crypt.

Mikulas




More information about the dm-devel mailing list