[dm-devel] [PATCHSET block#for-2.6.36-post] block: replace barrier with sequenced flush

Vladislav Bolkhovitin vst at vlnb.net
Fri Aug 13 12:55:33 UTC 2010


Tejun Heo, on 08/12/2010 04:41 PM wrote:
> Each filesystem needs to be updated to enforce request
> ordering themselves and then to use REQ_FLUSH/FUA mechanism.

I generally agree with the patchset, but I believe this particular change 
is a really bad move.

I won't dwell on the obvious point that common functionality (enforcing 
request ordering in this case) should be handled by a common library, 
not separately inside each of the zillion file systems Linux has.

The worst part of this move is that it would hide all the request 
ordering semantics inside the file systems, most likely in a very 
unclear way. As a result, if I or someone else decided to implement 
"hardware offload" of request ordering (ORDERED requests), we would not 
be able to see any improvement until at least one file system had been 
changed to use it. Worse, if the implementor can't demonstrate the 
improvement, how can he or she encourage file system developers to 
update their file systems? Basically, this would mean that only a person 
with *BOTH* deep storage and file system internals knowledge could do 
the job. How many such people do you know? Both storage and file systems 
are very wide and tricky topics, so people nearly always specialize in 
one of them, not both.

Thus, this move would basically mean that proper ordered queuing would 
probably never be implemented in Linux.

I believe it would be much better to create a common interface that file 
systems would use to enforce request order when they need it.

Advantages of this approach:

1. The ordering requirements of file systems would be clear.

2. They would be handled in one place by common code.

3. Any storage-level expert could try to implement ordered queuing 
without a deep dive into file system design and implementation.
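
To make the idea of a common ordering interface more concrete, here is a 
minimal userspace sketch in C. It only models the bookkeeping: requests 
join a group, and nothing queued after the group may start until the 
group is closed and every request in it has completed. All names 
(order_group, og_submit and so on) are hypothetical; they are not kernel 
APIs and are not taken from any posted proposal.

```c
#include <assert.h>

/* Hypothetical model of one ordered group of requests. */
struct order_group {
    unsigned int in_flight; /* submitted but not yet completed */
    int sealed;             /* set once no more requests may join */
};

void og_init(struct order_group *g)
{
    g->in_flight = 0;
    g->sealed = 0;
}

/* Add one request to the group. Returns 0, or -1 if the group is sealed. */
int og_submit(struct order_group *g)
{
    if (g->sealed)
        return -1;
    g->in_flight++;
    return 0;
}

/* Called when one request of the group has finished. */
void og_complete(struct order_group *g)
{
    assert(g->in_flight > 0);
    g->in_flight--;
}

/* The file system declares the group closed. */
void og_seal(struct order_group *g)
{
    g->sealed = 1;
}

/* Requests ordered after this group may start only when this is true. */
int og_next_may_start(const struct order_group *g)
{
    return g->sealed && g->in_flight == 0;
}
```

With bookkeeping like this in one place, a software fallback can simply 
wait until og_next_may_start() becomes true, while a device with ordered 
queuing could, in principle, be handed the group boundary directly.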

I already suggested such an interface in 
http://marc.info/?l=linux-scsi&m=128077574815881&w=2. For the moment it 
can be implemented internally using the existing REQ_FLUSH/FUA/etc. and 
waiting for all the requests in the group to finish. As a nice side 
effect, if a device doesn't support FUA, it would be possible to issue 
SYNC_CACHE command(s) only for the required blocks, not for the whole 
device as is done now.
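
The side effect mentioned above can be illustrated with a small sketch: 
if the common code knows which block ranges a group wrote, it can merge 
them and flush only those ranges. The helper below is a hypothetical 
userspace illustration in C (blk_range and merge_ranges are invented 
names); it assumes each resulting range would map to one ranged cache 
flush command, and it does not issue any command itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical description of blocks written by one request. */
struct blk_range {
    unsigned long long lba; /* first block */
    unsigned long long nr;  /* number of blocks */
};

static int cmp_range(const void *a, const void *b)
{
    const struct blk_range *x = a, *y = b;

    if (x->lba < y->lba)
        return -1;
    if (x->lba > y->lba)
        return 1;
    return 0;
}

/*
 * Sort the ranges in place and merge overlapping or adjacent ones.
 * Returns the number of merged ranges left at the start of the array.
 */
size_t merge_ranges(struct blk_range *r, size_t n)
{
    size_t out = 0, i;

    assert(r != NULL || n == 0);
    if (n == 0)
        return 0;

    qsort(r, n, sizeof(*r), cmp_range);
    for (i = 1; i < n; i++) {
        unsigned long long end = r[out].lba + r[out].nr;

        if (r[i].lba <= end) {
            /* Overlaps or touches the current range: extend it. */
            unsigned long long iend = r[i].lba + r[i].nr;

            if (iend > end)
                r[out].nr = iend - r[out].lba;
        } else {
            /* Gap: start a new merged range. */
            r[++out] = r[i];
        }
    }
    return out + 1;
}
```

Four writes at blocks 100-107, 0-7, 8-15 and 104-119 would collapse into 
two ranges, 0-15 and 100-119, so two ranged flushes could replace a flush 
of the whole device cache.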

If requested, I can develop the interface further.

Vlad



