[dm-devel] Reworking dm-writeboost [was: Re: staging: Add dm-writeboost]

Sat Oct 5 07:51:16 UTC 2013

Dave,

> That's where arbitrary delays in the storage stack below XFS cause
> problems - if the first FUA log write is delayed, the next log
> buffer will get filled, issued and delayed, and when we run out of
> log buffers (there are 8 maximum) the entire log subsystem will
> stall, stopping *all* log commit operations until log buffer
> IOs complete and become free again. i.e. it can stall modifications
> across the entire filesystem while we wait for batch timeouts to
> expire and issue and complete FUA requests.
To me, this sounds like a design failure in the XFS log subsystem,
or perhaps just a limitation of metadata journaling.
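
To make the scenario you describe concrete, here is a toy model in C.
It is only a sketch under my own assumptions, not XFS code: log buffers
fill at a fixed interval (fill_ms), every FUA log write is held by the
lower layer for a fixed time (delay_ms), and log_stall_ms() is a
hypothetical helper name.

```c
/*
 * Toy model (assumption, not XFS code): buffer k fills and is issued at
 * (k + 1) * fill_ms, and each FUA write completes delay_ms after issue.
 * The log needs to reuse the first buffer right after the last one has
 * filled, i.e. at nr_bufs * fill_ms.
 */
static unsigned long log_stall_ms(unsigned long nr_bufs,
                                  unsigned long fill_ms,
                                  unsigned long delay_ms)
{
    /* Time at which the first log buffer completes and becomes free. */
    unsigned long first_done = fill_ms + delay_ms;
    /* Time at which the log runs out of buffers and needs the first one. */
    unsigned long need_reuse = nr_bufs * fill_ms;

    /* The log stalls only if the first buffer is still in flight. */
    return first_done > need_reuse ? first_done - need_reuse : 0;
}
```

In this model, 8 buffers filling every 1 ms with a 10 ms delay stall the
log for 3 ms, while a 5 ms delay causes no stall at all.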

> IMNSHO, REQ_FUA/REQ_FLUSH optimisations should be done at the
> point where they are issued - any attempt to further optimise them
> by adding delays down in the stack to aggregate FUA operations will
> only increase latency of the operations that the issuer want to have
> complete as fast as possible....
Further optimization in the lower layers of the stack
can benefit any filesystem.
So your opinion is not always correct, although
it is always correct for error handling and memory management.

I have proposed a future plan to use persistent memory.
I believe that with this leap forward,
filesystems will be freed from doing such optimizations
related to write barriers. For more detail, please see my post:
https://lkml.org/lkml/2013/10/4/186

However,
I think I should leave an option to disable the optimization
in case the upper layer doesn't like it.
Maybe writeboost should disable deferring barriers
if the barrier_deadline_ms parameter is set to exactly 0.
The Linux kernel's layered architecture is obviously not always perfect,
so there are similar cases at other boundaries,
such as O_DIRECT to bypass the page cache.
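
A rough sketch of the switch I have in mind, written as plain C rather
than actual dm-writeboost code. Only barrier_deadline_ms is a real
parameter name; the structs, the enum and the functions are hypothetical
and exist only for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical policy holder; only barrier_deadline_ms is a real name. */
struct barrier_policy {
    unsigned int barrier_deadline_ms; /* 0 means: never defer barriers */
};

enum barrier_action {
    BARRIER_SUBMIT_NOW, /* pass the flush/FUA request down immediately */
    BARRIER_DEFER       /* queue it and wait for the deadline to expire */
};

/* Hypothetical queue of deferred barriers sharing one deadline timer. */
struct barrier_queue {
    size_t nr_deferred;          /* barriers waiting for the deadline */
    bool timer_armed;            /* true once the first barrier is queued */
    unsigned long expires_at_ms; /* absolute time the batch is flushed */
};

/* A deadline of 0 turns the deferring optimization off. */
static enum barrier_action barrier_decide(const struct barrier_policy *p)
{
    return p->barrier_deadline_ms == 0 ? BARRIER_SUBMIT_NOW : BARRIER_DEFER;
}

/* Apply the decision; the first deferred barrier arms the timer. */
static enum barrier_action barrier_handle(const struct barrier_policy *p,
                                          struct barrier_queue *q,
                                          unsigned long now_ms)
{
    enum barrier_action act = barrier_decide(p);

    if (act == BARRIER_DEFER) {
        q->nr_deferred++;
        if (!q->timer_armed) {
            q->timer_armed = true;
            q->expires_at_ms = now_ms + p->barrier_deadline_ms;
        }
    }
    return act;
}
```

With barrier_deadline_ms set to 0 the queue is never touched, so an upper
layer that dislikes the added latency sees its barriers submitted at once.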

Maybe dm-thin and dm-cache should add such a switch, too.

Akira