[dm-devel] dm-crypt: remove per-cpu structure
Mikulas Patocka
mpatocka at redhat.com
Thu Feb 27 01:12:40 UTC 2014
On Sat, 22 Feb 2014, Milan Broz wrote:
> On 02/22/2014 03:25 PM, Mike Snitzer wrote:
> > On Sat, Feb 22 2014 at 5:28am -0500,
> > Milan Broz <gmazyland at gmail.com> wrote:
> >
> >> On 02/21/2014 08:41 AM, Mikulas Patocka wrote:
> >>>>>> Hi Mikulas,
> >>>>>>
> >>>>>> Obviously avoiding crashes is more important than performance.
> >>
> >> Yes, but please keep these changes in linux-next for a while
> >> before submitting this for stable or you will fix one rare bug and
> >> slow down all existing users (with possible another fix to fix later)...
> >>
> >> I know that taking cpu offline becomes more used these days
> >> but still the performance is critical attribute for dmcrypt.
> >>
> >> (And IIRC this cpu work relocation problem is there for years.)
> >
> > Unfortunately, we cannot leave the code in its current state. Sure it
> > has been this way for years but taking cpus offline is _much_ more
> > common.
> >
> > As I said here:
> > https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=for-next&id=fba2d94c5e0f3d848792981977b1f62ce96377ac
> >
> > This change may undermine performance improvements intended by commit
> > c0297721 ("dm crypt: scale to multiple cpus"). But correctness is more
> > important than performance. The use of per-cpu structures may be
> > revisited later (by using cpu hot-plug notifiers to make them safe).
>
> Correctness? Yes. I think Tejun said how to fix it correctly.
>
> I have nothing against removal of percpu structure - but please assess
> what it breaks, how it will slow down dmcrypt and mention it in patch
> header if you want quick fix.
With the patch applied, dm-crypt uses one more mempool_alloc and
mempool_free call per requests.
With 512-byte requests and ramdisk as a backing device, dm-crypt can
process up to 25000 requests per second. On the same machine, you can do
19000000 mempool_alloc+mempool_free calls per second. So, the slowdown
caused by mempool_alloc+mempool_free is negligable - in theory, every
request is slowed down by 1/760.
I and Mike measured it and we didn't see any slowdown caused by the patch.
> I couldn't resist but "may undermine" in patch header sounds like
> "we have no idea what it breaks but it no longer crashes".
>
> Removal of headache by removing head is always 100% solution to
> the particular problem but it has usually also some side effects :-)
>
> ...
>
> > Mikulas does have an alternative approach that should be revisited ASAP
> > given that it also fixes this cpu hotplug issue:
> > http://people.redhat.com/mpatocka/patches/kernel/dm-crypt-paralelizace/current/series.html
>
> Yes, and I spent many hours testing it and while it helped somewhere
> it was not perfect solution in every situation. And I wish it works!
> Unfortunately, the tests output was not convincing.
> But it was described in thread you referenced.
The parallelization patches help in sequential reading.
In other workloads - such as concurrent access by multiple threads or bulk
data writing - encryption is already parallelized by the current dm-crypt
implementation, so there's nothing to gain. The new implementation can't
improve parallelization, if it is already parallelized with the current
imeplemtation.
> And if he just repost the same patchset few month later without any
> more data and no changes, why should I change my opinion?
> I can just shut up of course and just maintain userspace and watch
> people complaining.
>
> > context/timeline:
> > http://www.redhat.com/archives/dm-devel/2011-October/msg00127.html
> > http://www.redhat.com/archives/dm-devel/2012-March/msg00181.html
> > http://www.redhat.com/archives/dm-devel/2013-March/msg00103.html
> >
> > Christoph pointed out there is precendence for sorting:
> > http://www.redhat.com/archives/dm-devel/2013-March/msg00104.html
>
> My opinion is that fixing io scheduler work (sorting) in dmcrypt is
> just ugly hack and could lead to more problems in future because
> it can increase latency and it anticipates some IO access patterns.
> Dm-crypt should be (ideally) completely transparent and "invisible"
> to IO processing...
The problem is that the cfq scheduler doesn't merge adjacent requests
submitted by different thread. You don't have to sort it in dm-crypt, but
you must submit all the requests from a single thread.
Mikulas
More information about the dm-devel
mailing list