[dm-devel] Re: [RFD] BIO_RW_BARRIER - what it means for devices, filesystems, and dm/md.

Ric Wheeler ric at emc.com
Tue Jul 10 18:39:41 UTC 2007



Tejun Heo wrote:
> [ cc'ing Ric Wheeler for storage array thingie.  Hi, whole thread is at
> http://thread.gmane.org/gmane.linux.kernel.device-mapper.devel/3344 ]

I am actually on the list, just really, really far behind in the thread ;-)

> 
> Hello,
> 
> david at lang.hm wrote:
>> but when you consider the self-contained disk arrays it's an entirely
>> different story. you can easily have a few gig of cache and a complete
>> OS pretending to be a single drive as far as you are concerned.
>>
>> and the price of such devices is plummeting (in large part thanks to
>> Linux moving into this space), you can now readily buy a 10TB array for
>> $10k that looks like a single drive.
> 
> Don't those thingies usually have NV cache or backed by battery such
> that ORDERED_DRAIN is enough?

All of the high end arrays have non-volatile cache (read, on power loss, it is a 
promise that it will get all of your data out to permanent storage). You don't 
need to ask this kind of array to drain the cache. In fact, it might just ignore 
you if you send it that kind of request ;-)

The size of the NV cache can run from a few gigabytes up to hundreds of 
gigabytes, so you really don't want to invoke cache flushes here if you can 
avoid it.

For this class of device, you can get the required in order completion and data 
integrity semantics as long as we send the IO's to the device in the correct order.

> 
> The problem is that the interface between the host and a storage device
> (ATA or SCSI) is not built to communicate that kind of information
> (grouped flush, relaxed ordering...).  I think battery backed
> ORDERED_DRAIN combined with fine-grained host queue flush would be
> pretty good.  It doesn't require some fancy new interface which isn't
> gonna be used widely anyway and can achieve most of performance gain if
> the storage plays it smart.
> 
> Thanks.
> 

I am not really sure that you need this ORDERED_DRAIN for big arrays...

ric




More information about the dm-devel mailing list