fake raid vs software raid

Arjan van de Ven arjanv at redhat.com
Fri Jun 24 01:46:31 UTC 2005


On Thu, Jun 23, 2005 at 03:40:28PM -0700, Dan Stromberg wrote:
> On Thu, 2005-06-23 at 12:09 +0200, Arjan van de Ven wrote:
> > On Thu, Jun 23, 2005 at 03:33:41PM +1000, Brock Lessels wrote:
> > > I am currently running dmraid on Gentoo Linux. The system uses a SATA
> > > sil3114 card with 4 drives running raid01. The only issue is that the
> > > install takes a lot of fiddling, and I have 6 more machines to get up
> > > and running.
> > > 
> > > Firstly, could anyone tell me if there are any performance issues with
> > > using software RAID rather than the card's RAID format?
> > 
> > The Linux software RAID is such that it should be able to get higher
> > performance. The RAID card formats all have a "deficiency" caused by
> > them RAIDing a whole disk rather than a partition, and that causes some
> > problems that in turn can lead to slower performance.
> 
> Why would RAID'ing an entire disk be a performance problem relative to
> RAID'ing a partition (for example, a whole-disk partition, except for
> the first block)?

OK, this is a technical detail, and it goes like this:

When you RAID0, you chop the data up into pieces of, say, 64KB (there are
other values, but let's assume 64KB for this argument; it holds for other
values too), and assume a 2-disk setup. That means that every other 64KB of
the "RAID image" comes from the other disk.

So:

Disk A    1 3 5 7 9
Disk B    2 4 6 8

where each number is a 64KB chunk of the RAID image.
So far so good.
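
To make that concrete, here is a minimal sketch of the chunk-to-disk
arithmetic, assuming the same 2-disk, 64KB-chunk example as above (Python,
purely for illustration):

# Minimal sketch of RAID0 chunk mapping; CHUNK and NDISKS match the
# example above and are assumptions, not fixed values.
CHUNK = 64 * 1024
NDISKS = 2

def raid0_map(offset):
    # Map a byte offset in the RAID image to (disk, byte offset on disk).
    chunk = offset // CHUNK            # which 64KB chunk of the image
    disk = chunk % NDISKS              # chunks alternate between the disks
    chunk_on_disk = chunk // NDISKS    # where that chunk sits on its disk
    return disk, chunk_on_disk * CHUNK + offset % CHUNK

# Chunks 1,3,5,7,9 land on disk A and chunks 2,4,6,8 on disk B, as in
# the diagram (chunk numbers here are 1-based to match it).
for n in range(1, 10):
    print("chunk", n, "-> disk", "AB"[raid0_map((n - 1) * CHUNK)[0]])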
Now enter partitioning. A partition starts at a cylinder boundary, and it is
very common for this to be an "odd" sector (since the size of a cylinder is
often a multiple of odd numbers). This means that if you look at the 64KB
chunk where the partition starts, the partition may begin at, say, position
62.5KB within that 64KB chunk.
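
For one concrete instance of such an odd start (an assumption chosen for
illustration; the 62.5KB figure above is another), the classic DOS partition
table puts the first partition at sector 63, which lands it 31.5KB into a
64KB chunk:

# Worked example, assuming the classic DOS layout where the first
# partition starts at sector 63; any start that isn't a multiple of
# 128 sectors (64KB) misaligns it against the chunk grid.
SECTOR = 512
CHUNK = 64 * 1024

start = 63 * SECTOR
print(start % CHUNK)   # 32256 bytes: the partition begins 31.5KB
                       # into a 64KB chunk, not on a chunk boundary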

Now I'll get to the bit where performance suffers: in Linux, filesystems
typically work in units of 4KB *from the start of the partition*, and on
x86(-64), 4KB is also the memory page size, so all IO goes in 4KB chunks.

In the case where you RAID inside the partition, each of those 4KB blocks
fits nicely within one 64KB chunk and all is well: it just gets a small
remapping and goes to its respective disk. In the whole-disk case, where the
partition sits at an odd offset inside the RAID image, there is one 4KB
block per 64KB that actually crosses a 64KB chunk boundary and needs BOTH
disks activated (i.e. 2 seeks). Not only do both disks need to seek, the
kernel also needs to split the IO into 2 pieces and merge them back together
at the end. Both cost performance compared to the case where it was a simple
"send it to one disk or the other".



