
Re: [dm-devel] [patch] blk-flush: fix flush policy calculation



Tejun Heo <tj kernel org> writes:

> Hello,
>
> On Tue, Aug 02, 2011 at 01:39:46PM -0400, Jeff Moyer wrote:
>> OK, sorry for top-posting here, but I chased the problem down further.
>> 
>> Commit ae1b1539622fb46e51b4d13b3f9e5f4c713f86ae, block: reimplement
>> FLUSH/FUA to support merge, introduced a regression when running any
>> sort of fsyncing workload using dm-multipath and certain storage (in our
>> case, an HP EVA).  It turns out that dm-multipath always advertised
>> flush+fua support, and passed commands on down the stack, where they
>> used to get stripped off.  The above commit, unfortunately, changed that
>> behavior:
> ...
>> So, the flush machinery was bypassed in such cases (q->flush_flags == 0
>> && rq->cmd_flags & (REQ_FLUSH|REQ_FUA)).
>> 
>> Now, however, we don't get into the flush machinery at all (which is why
>> my initial patch didn't help this situation).  Instead,
>> __elv_next_request just hands a request with flush and fua bits set to
>> the scsi_request_fn, even though the underlying request_queue does not
>> support flush or fua.
>> 
>> So, where do we fix this?  We could just accept Mike's patch to not send
>> such requests down from dm-mpath, but that seems short-sighted.  We
>> could reinstate some checks in __elv_next_request.  Or, we could put the
>> checks into blk_insert_cloned_request.
>
> Ah, okay, what changed there was where a request is passed into flush
> machinery.  Before, it was while the request was being dispatched from
> elevator to device.  After, it's de-composed when the request enters
> elevator.  The bug is that there are paths which insert new requests
> to elevator but don't check for REQ_FLUSH|FUA.
>
> I think it would be cleaner to add a wrapper around
> __elv_add_request() which checks for REQ_FLUSH|FUA and enforces
> REQ_INSERT_FLUSH if the request needs it.  Note that this should only
> happen when a request enters the queue for the first time but not on
> requeues - that was the reason why the decision wasn't made inside
> __elv_add_request().

OK, we can do a wrapper, but it probably wouldn't be too horrific to
just fix up blk_insert_cloned_request.
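
To make the wrapper idea concrete, here is a rough userspace model of
the decision it would make.  This is only a sketch, not kernel code:
the flag values, the insert_where enum, struct request, and
choose_insert_where() are all made-up stand-ins, and the first_entry
argument is an assumption about how a caller would tell a fresh
insertion apart from a requeue.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in flag bits; the values are illustrative, not the kernel's. */
#define REQ_FLUSH (1u << 0)
#define REQ_FUA   (1u << 1)

/* Hypothetical insertion positions for this model only. */
enum insert_where { INSERT_BACK, INSERT_FLUSH };

/* Minimal stand-in for the real struct request. */
struct request {
	unsigned int cmd_flags;
};

/*
 * Pick the insertion position for a request.  A request carrying
 * FLUSH or FUA is routed to the flush position, but only when it
 * enters the queue for the first time.  A requeue keeps whatever
 * position the caller asked for.
 */
static enum insert_where choose_insert_where(const struct request *rq,
					     enum insert_where where,
					     bool first_entry)
{
	if (first_entry && (rq->cmd_flags & (REQ_FLUSH | REQ_FUA)))
		return INSERT_FLUSH;
	return where;
}

/* Small usage example covering the three interesting cases. */
static void check_examples(void)
{
	struct request fua = { .cmd_flags = REQ_FUA };
	struct request plain = { .cmd_flags = 0 };

	assert(choose_insert_where(&fua, INSERT_BACK, true) == INSERT_FLUSH);
	assert(choose_insert_where(&fua, INSERT_BACK, false) == INSERT_BACK);
	assert(choose_insert_where(&plain, INSERT_BACK, true) == INSERT_BACK);
}
```

Whether that check lives in a wrapper or directly in
blk_insert_cloned_request, the decision itself would look about the
same.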

Now, the next issue is that flush requests issued from the dm target
down through the stack have no bio associated with them, so we blow up
on the BUG_ON(!req->bio || req->bio != req->biotail).
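
For reference, the condition in that BUG_ON can be modeled in
userspace as below.  Again just a sketch: struct bio and struct
request here are stripped-down stand-ins with only the two fields the
check looks at, and would_trip_bug_on() is a hypothetical helper name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in types holding only what the check needs. */
struct bio {
	int unused;
};

struct request {
	struct bio *bio;
	struct bio *biotail;
};

/* Same expression as the quoted check: !req->bio || req->bio != req->biotail */
static bool would_trip_bug_on(const struct request *req)
{
	return !req->bio || req->bio != req->biotail;
}

/* Usage example: a bio-less request trips it, a single-bio one does not. */
static void check_examples(void)
{
	struct bio b = { 0 };
	struct request empty = { .bio = NULL, .biotail = NULL };
	struct request single = { .bio = &b, .biotail = &b };

	assert(would_trip_bug_on(&empty));
	assert(!would_trip_bug_on(&single));
}
```

So the check only passes for a request with exactly one bio, and a
request with no bio at all fails on the first half of the condition.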

Cheers,
Jeff

