[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

Re: [dm-devel] [RFC PATCH 00/20] dm-crypt: parallel processing

[cc'ing Chris Mason]

On Tue, Aug 21 2012 at  5:08am -0400,
Milan Broz <mbroz redhat com> wrote:

> Hello,
> I promised to Mikulas that I will post remaining of his dm-crypt parallel
> processing patchset (plus some related changes) with some comments.
> The problem:
> The current implementation (using per cpu struct) always use encryption
> on the same CPU which submitted IO.
> With very fast storage (usually SSD or MD RAID0) one CPU core can
> be limiting and the throughput of encrypted disk is worse in
> comparison with underlying storage.
> Idea here is to distribute encryption to other CPU cores/CPUs.
> (Side effect of patches is nice clean up dmcrypt code. :)
> Mikulas Patocka (20):
>   dm-crypt: remove per-cpu structure
>   dm-crypt: use unbound workqueue for request processing
>   dm-crypt: remove completion restart
>   dm-crypt: use encryption threads
>   dm-crypt: Unify spinlock
>   dm-crypt: Introduce an option that sets the number of threads.
>   dm-crypt: don't use write queue
>   dm-crypt: simplify io queue
>   dm-crypt: unify io_queue and crypt_queue
>   dm-crypt: don't allocate pages for a partial request.
>   dm-crypt: avoid deadlock in mempools
>   dm-crypt: simplify cc_pending
>   dm-crypt merge convert_context and dm_crypt_io
>   dm-crypt: move error handling to crypt_convert.
>   dm-crypt: remove io_pending field
>   dm-crypt: small changes
>   dm-crypt: move temporary values to stack
>   dm-crypt: offload writes to thread
>   dm-crypt: retain write ordering
>   dm-crypt: sort writes
>  drivers/md/dm-crypt.c |  838 +++++++++++++++++++++++++++----------------------
>  1 file changed, 464 insertions(+), 374 deletions(-)
> My notes:
> I extensively tested this (on top of 3.6.0-rc2) and while I like
> simplification and the main logic (if we have enough power why
> not use other CPUs) I see several problems here.
> 1) The implementation is not much useful on modern CPUs with hw accelerated
> AES (with AES-NI even one core can saturate very fast storage).
> (Some optimized crypto behaves similar, like twofish optimized modules.)
> 2) The patchset targets linear access pattern mainly and one
> process generating IOs. (With more processes/CPUs generating IOs
> you get parallel processing even with current code.)
> I can see significant improvement (~30%) for linear read
> (if not using AES-NI) and if underlying storage is zero target
> (ideal situation removing scheduler from the stack).
> For random access pattern (and more IO producers) I cannot find
> reliable improvement except ideal zero target case (where improvement
> is always >30%). For more producers it doesn't help at all.
> I tested RAID0 over SATA disks and very fast SSD on quad core cpu,
> dmcrypt running with 1, 3 or 4 threads (and with cfq and noop scheduler)
> using fio threaded test with 1 or 4 threads.
> Notes to implementation:
> 1) Last two patches (19/20) provides sorting of IO requests, this
> logically should be done in IO scheduler.
> I don't think this should be in dmcrypt, if scheduler doesn't work
> properly, it should be fixed or tuned for crypt access pattern.
> 2) Could be kernel workqueue used/fixed here instead? Basically all it needs
> is to prefer submitting CPU, if it is busy just move work to another CPU.
> 3) It doesn't honour CPU hotplugging. Number of CPU is determined
> in crypt constructor. If hotplugging is used for CPU power saving,
> it will not work properly.
> 4) With debugging options on, everything is extremely slower
> (Perhaps bug in some debug option, I just noticed this on Fedora
> rawhide with all debug on.)
> I posted it mainly because I think this should move forward,
> whatever direction...

I had a conversation with Chris Mason, maybe 2 years ago, where he
expressed concern about dm-crypt growing too much intelligence about
parallel cpu usage.  I seem to recall Chris' concern being relative to
btrfs and the requirement that IOs not get reordered (his point was any
dm-crypt change to improve cpu utilization shouldn't result in the
possibility to reorder IOs).

Could be that IO reordering isn't a concern dm-crypt (before or after
this patchset).  But I just wanted to make this point so that it is kept
in mind (also allows Chris to raise any other new dm-crypt issue/concern
he might have?).


[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]