[dm-devel] Barriers still not passing on simple dm devices...

Jens Axboe jens.axboe at oracle.com
Tue Mar 24 15:05:17 UTC 2009


On Tue, Mar 24 2009, Mikulas Patocka wrote:
> 
> 
> On Tue, 24 Mar 2009, Jens Axboe wrote:
> 
> > On Tue, Mar 24 2009, Mikulas Patocka wrote:
> > > 
> > > 
> > > On Tue, 24 Mar 2009, Jens Axboe wrote:
> > > 
> > > > On Tue, Mar 24 2009, Mikulas Patocka wrote:
> > > > > 
> > > > > 
> > > > > On Mon, 23 Mar 2009, Eric Sandeen wrote:
> > > > > 
> > > > > > I've noticed that on 2.6.29-rcX, with Andi's patch
> > > > > > (ab4c1424882be9cd70b89abf2b484add355712fa, dm: support barriers on
> > > > > > simple devices) barriers are still getting rejected on these simple devices.
> > > > > > 
> > > > > > The problem is in __generic_make_request():
> > > > > > 
> > > > > >                 if (bio_barrier(bio) && bio_has_data(bio) &&
> > > > > >                     (q->next_ordered == QUEUE_ORDERED_NONE)) {
> > > > > >                         err = -EOPNOTSUPP;
> > > > > >                         goto end_io;
> > > > > >                 }
> > > > > > 
> > > > > > and dm isn't flagging its queue as supporting ordered writes, so it's
> > > > > > rejected here.
> > > > > > 
> > > > > > Doing something like this:
> > > > > > 
> > > > > > + if (t->barriers_supported)
> > > > > > +         blk_queue_ordered(q, QUEUE_ORDERED_DRAIN, NULL);
> > > > > > 
> > > > > > somewhere in dm (I stuck it in dm_table_set_restrictions() - almost
> > > > > > certainly the wrong thing to do) did get my dm-linear device to mount
> > > > > > with xfs, w/o xfs complaining that its mount-time barrier tests failed.
> > > > > > 
> > > > > > So what's the right way around this?  What should dm (or md for that
> > > > > > matter) advertise on their queues about ordered-ness?  Should there be
> > > > > > some sort of "QUEUE_ORDERED_PASSTHROUGH" or something to say "this level
> > > > > > doesn't care, ask the next level" or somesuch?  Or should it inherit the
> > > > > > flag from the next level down?  Ideas?
> > > > > > 
> > > > > > Thanks,
> > > > > > -Eric
> > > > > > 
> > > > > 
> > > > > Hi
> > > > > 
> > > > > This is a misdesign in the generic bio layer and it should be fixed there. I
> > > > > think it is blocking barrier support in md-raid1 too. Jens, please apply the
> > > > > attached patch.
> > > > > 
> > > > > Mikulas
> > > > > 
> > > > > ----
> > > > > 
> > > > > Move test for not-supported barriers to __make_request.
> > > > > 
> > > > > This test prevents barriers from being dispatched to device mapper
> > > > > and md.
> > > > > 
> > > > > This test is sensible only for drivers that use requests (such as disk
> > > > > drivers), not for drivers that use bios.
> > > > > 
> > > > > It is better to fix it in generic code than to work around it
> > > > > in device mapper and md.
> > > > 
> > > > So you audited any ->make_request_fn style driver and made sure they
> > > > rejected barriers?
> > > 
> > > I didn't.
> > > 
> > > If you grep for it, you get:
> > > 
> > > ./arch/powerpc/sysdev/axonram.c:
> > > doesn't reject barriers, but it is not needed, it ends all bios in 
> > > make_request routine
> > > 
> > > ./drivers/block/aoe/aoeblk.c:
> > > * doesn't reject barriers, should be modified to do so
> > > 
> > > ./drivers/block/brd.c
> > > doesn't reject barriers, doesn't need to, ends all bios in make_request
> > > 
> > > ./drivers/block/loop.c:
> > > doesn't reject barriers, it's ok because it doesn't reorder requests
> > > 
> > > ./drivers/block/pktcdvd.c
> > > * doesn't reject barriers, should be modified to do so
> > > 
> > > ./drivers/block/umem.c
> > > * doesn't reject barriers, I don't know if it reorders requests or not.
> > > 
> > > ./drivers/s390/block/xpram.c
> > > doesn't reject barriers, doesn't need to, ends bios immediately
> > > 
> > > ./drivers/md/raid0.c
> > > rejects barriers
> > > 
> > > ./drivers/md/raid1.c
> > > supports barriers
> > > 
> > > ./drivers/md/raid10.c
> > > rejects barriers
> > > 
> > > ./drivers/md/raid5.c
> > > rejects barriers
> > > 
> > > ./drivers/md/linear.c
> > > rejects barriers
> > > 
> > > ./drivers/md/dm.c
> > > supports barriers partially
> > 
> > Not reordering is not enough to support the barrier primitive, unless
> > you always go to the same device and pass the barrier flag down with it.
> 
> For single-device drivers (not md/dm), not reordering should be good 
> enough to claim barrier support.

Not reordering is what the barrier is all about; the problem is how far
down you extend that guarantee. For the Linux barrier, it's ALL the way
down to and including the hardware. So it's only good enough if it
includes the device not reordering the write, and signalling completion
only when it's safe. "Good enough" is not an option, it's all or nothing.

> > I think having the check in generic_make_request() is perfectly fine,
> > even if the value doesn't completely apply to stacked devices. Perhaps
> > we can add such a value, then. My main point is that barrier support
> > should be opt-in, not a default thing.
> 
> So make some flag for these bio-based devices, so that they don't have to 
> use one of those request-based options (which are meaningless for 
> non-request-based devices).

Sure, but as I said, I think it's mainly a cosmetic issue. Signalling
simple barrier support is just fine.
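
To make the opt-in point concrete, here is a minimal userspace sketch of the
check Eric quoted from __generic_make_request(), not kernel code. The
model_* names and struct layouts are hypothetical stand-ins, and the enum
values are illustrative rather than the kernel's. It only shows that a queue
left at QUEUE_ORDERED_NONE gets data barriers rejected, and that signalling
simple (drain) support, as in Eric's blk_queue_ordered() experiment, lets
them through.

```c
#include <errno.h>
#include <stdbool.h>

/* Illustrative values only; the kernel's own definitions differ. */
enum queue_ordered {
	QUEUE_ORDERED_NONE = 0,
	QUEUE_ORDERED_DRAIN = 1
};

/* Hypothetical stand-ins holding only the fields the check reads. */
struct model_queue {
	enum queue_ordered next_ordered;
};

struct model_bio {
	bool barrier;	/* models bio_barrier(bio) */
	bool has_data;	/* models bio_has_data(bio) */
};

/*
 * Models the quoted test: a barrier bio that carries data is rejected
 * when the queue has not signalled any ordered mode.
 * Returns 0 if the bio may be passed on, -EOPNOTSUPP otherwise.
 */
int model_check_barrier(const struct model_queue *q,
			const struct model_bio *bio)
{
	if (bio->barrier && bio->has_data &&
	    q->next_ordered == QUEUE_ORDERED_NONE)
		return -EOPNOTSUPP;
	return 0;
}

/*
 * Models the opt-in step: only a driver that has verified barrier
 * support signals an ordered mode; the default stays NONE.
 */
void model_signal_simple_barriers(struct model_queue *q,
				  bool barriers_supported)
{
	if (barriers_supported)
		q->next_ordered = QUEUE_ORDERED_DRAIN;
}
```

In this model the default is rejection, and a driver has to call the opt-in
helper explicitly before a data barrier is let through.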

> > Over time we should have support everywhere, but it needs to be checked, 
> > audited, and trusted.
> 
> BTW. What is the rule for barriers if the device can't prevent the 
> requests from being delayed or reordered? (for example ATA<=3 disks with 
> cache that lack cache-flush command ... or flash cards that do 
> write-caching anyway and it can't be turned off). Should they support 
> barriers and try to make best effort? Or should they reject barriers to 
> inform the caller code that they have no data consistency?

If they can't flush the cache, then they must reject barriers unless they
have write-through caching.
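
As a rough sketch of that rule, again userspace C rather than driver code:
the enum, struct and function names below are hypothetical, and the sketch
assumes the device does not otherwise reorder writes, so the cache is the
only thing standing between a completed write and stable media.

```c
#include <stdbool.h>

/* Hypothetical description of a device's write cache. */
enum cache_mode {
	CACHE_WRITE_THROUGH,	/* a write completes only once it is on media */
	CACHE_WRITE_BACK	/* a write may complete while still in cache */
};

struct device_caps {
	enum cache_mode cache;
	bool can_flush_cache;	/* device has a working cache-flush command */
};

/*
 * Returns true if a driver for this device may accept barriers.
 * Write-through caching needs no flush; a write-back cache is only
 * acceptable when it can be flushed. Otherwise barriers are rejected.
 */
bool may_accept_barriers(const struct device_caps *caps)
{
	if (caps->cache == CACHE_WRITE_THROUGH)
		return true;
	return caps->can_flush_cache;
}
```

A write-back device with no flush command comes out as "reject", which is
the case of the old ATA disks and flash cards asked about above.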

-- 
Jens Axboe



