RAID5 gets a bad rap

Gordon Messmer yinyang at eburg.com
Wed Dec 31 08:33:20 UTC 2008


Chris Tyler wrote:
> On Tue, 2008-12-30 at 01:02 -0800, Gordon Messmer wrote:
>> That's not quite it.  RAID 5 performance suffers because every write 
>> requires that the entire block that's being written be read from every 
>> drive in the array, parity calculated, and then the data and parity 
>> written out.  For each block written, the array has to do N reads plus 
>> two writes.
> 
> You don't have to read all of the drives -- just the block you're
> updating and the parity block. XOR the old data you're about to
> overwrite with the parity block and the new data and you'll have the new
> parity block. Total activity: two reads plus two writes.

I've understood that to be the case, but while watching the drive 
activity lights on RAID5 arrays, it seems like I always see the entire 
set flash at the same time.  I guess I'll have to investigate that 
further to find out why.  Thanks.
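
For reference, the read-modify-write update quoted above can be sketched 
in a few lines.  This is only an illustration of the XOR arithmetic, not 
how any particular controller or md implements it; the helper names 
(xor_blocks, update_parity) and the two-byte "blocks" are made up for 
the example.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR any number of equal-length byte blocks together."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def update_parity(old_data: bytes, new_data: bytes, old_parity: bytes) -> bytes:
    """New parity from two reads: the old data block and the old parity block."""
    return xor_blocks(old_data, new_data, old_parity)


# Hypothetical stripe: three data blocks plus one parity block.
stripe = [b"\x0f\xf0", b"\x33\xcc", b"\x55\xaa"]
parity = xor_blocks(*stripe)

# Overwrite the second data block using only that block and the parity.
new_block = b"\xde\xad"
new_parity = update_parity(stripe[1], new_block, parity)
stripe[1] = new_block

# The shortcut agrees with recomputing parity from every data block.
assert new_parity == xor_blocks(*stripe)
```

The other data blocks are never read: XORing the old data out of the 
parity and the new data in gives the same result as a full recomputation.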

>> RAID 5 tends to be most appropriate when you're trying to get as much 
>> disk space as you can with the lowest cost, you won't be running 
>> multiple simultaneous jobs on the same disk array, and when you'll be 
>> collecting data at a rate that's relatively low.
> 
> I'd say the other way around -- RAID 5 is poor at small writes (hence
> the OP's comments about database updates), but very nearly approaches
> RAID-0 speeds when reading or writing large quantities of sequential
> data.

Your assertion overlooks that filesystems are themselves databases.  
Real-world experience with many production systems and many workloads 
has convinced me to use RAID 5 as rarely as possible.  Even when I'm 
forced to use it, I generally choose a RAID 5+0 configuration, as I 
get much better performance from it.
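
To put rough numbers on the small-write versus sequential-write 
distinction quoted above, here is a back-of-the-envelope I/O count.  It 
is a simplified model under stated assumptions: no caching, every small 
write lands in a different stripe, and a sequential write covers a whole 
stripe so parity can be computed without reading anything back.  The 
function names and the five-drive array are hypothetical.

```python
def small_write_ios(blocks_updated: int) -> tuple[int, int]:
    """(reads, writes) for isolated single-block updates.

    Each update reads the old data and old parity, then writes both back.
    """
    return 2 * blocks_updated, 2 * blocks_updated


def full_stripe_ios(drives: int) -> tuple[int, int]:
    """(reads, writes) for writing one complete stripe across `drives` disks.

    Parity comes from the new data alone, so nothing is read first.
    """
    return 0, drives


drives = 5                # assumed array size: four data blocks plus parity
data_blocks = drives - 1

print(small_write_ios(data_blocks))  # four scattered single-block updates
print(full_stripe_ios(drives))       # the same amount of data as one stripe
```

Under these assumptions the scattered updates cost 16 I/Os and the 
full-stripe write costs 5 for the same amount of data, which is the gap 
between the two workloads being argued about here.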



