[dm-devel] Shell Scripts or Arbitrary Priority Callouts?

John A. Sullivan III jsullivan at opensourcedevel.com
Sun Mar 22 16:50:48 UTC 2009


On Sun, 2009-03-22 at 17:27 +0200, Pasi Kärkkäinen wrote:
> <snip>
> > > Now, even though each disk says it handles concurrent I/O on each
> > > port, my testing indicates that throughput suffers when using multibus
> > > by about 1/2 (from ~60 MB/sec sustained I/O with failover to 35 MB/sec
> > > when using multibus).
> > > 
> > > However, with failover, I am effectively using only one channel on
> > > each card. With my custom priority callout, I more or less give the
> > > even-numbered disks a higher priority on the even-numbered SCSI
> > > channels, and likewise for the odd-numbered disks and odd-numbered
> > > channels. The odd ones are secondary on the even channels, and vice
> > > versa. It seems to work rather well and appears to spread the load
> > > nicely.
> > > 
> > > Thanks again for your help!
> > > 
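For illustration only, a minimal sketch of the even/odd priority callout
described above might look like the shell script below. This is not the
actual script from the thread; the device-name argument, the sysfs parsing,
and the use of the LUN number as the "disk number" are all assumptions.

    #!/bin/sh
    # even_odd_prio.sh -- sketch of an even/odd priority callout (illustrative)
    # multipathd invokes the callout with a device name, e.g. "sdc", and reads
    # a single integer priority from stdout (higher = preferred).

    DEV=$1

    # /sys/block/<dev>/device is a symlink whose target directory is named
    # host:channel:id:lun, e.g. "3:1:0:2"
    HCTL=$(basename "$(readlink "/sys/block/$DEV/device")")
    [ -n "$HCTL" ] || { echo 1; exit 0; }   # fall back to a low priority

    CHANNEL=$(echo "$HCTL" | cut -d: -f2)
    LUN=$(echo "$HCTL" | cut -d: -f4)

    # Prefer even LUNs on even channels and odd LUNs on odd channels;
    # everything else becomes a secondary path.
    if [ $((CHANNEL % 2)) -eq $((LUN % 2)) ]; then
        echo 50
    else
        echo 10
    fi

It would be hooked in through the prio_callout keyword in multipath.conf;
the exact %-specifier used to pass the device name differs between
multipath-tools versions, so check the documentation for your version.
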
> > I'm really glad you brought up the performance problem. I had posted
> > about it a few days ago but it seems to have gotten lost.  We are really
> > struggling with performance issues when attempting to combine multiple
> > paths (in the case of multipath to one big target) or targets (in the
> > case of software RAID0 across several targets) rather than using, in
> > effect, JBODs.  In our case, we are using iSCSI.
> > 
> > Like you, we found that using multibus caused an almost linear drop in
> > performance.  Round-robin across two paths gave half the aggregate
> > throughput of two separate disks; across four paths, one fourth.
> > 
> > We also tried striping across the targets with software RAID0 combined
> > with failover multipath - roughly the same effect.
> > 
> > We really don't want to be forced to treat SAN-attached disks as
> > JBODs.  Has anyone cracked the problem of using them in either multibus
> > or RAID0 so we can present them as a single device to the OS and still
> > load balance across multiple paths?  This is a HUGE problem for us, so
> > any help is greatly appreciated.  Thanks - John
> 
> Hello.
> 
> Hmm.. just a guess, but could this be related to the fact that if your paths
> to the storage are different iSCSI sessions (open-iscsi _doesn't_ support
> multiple connections per session aka MC/s), then there is a separate SCSI
> command queue per path.. and if SCSI requests are split across those queues 
> they can get out of order, and that causes a performance drop?
> 
> See:
> http://www.nabble.com/round-robin-with-vmware-initiator-and-iscsi-target-td21958346.html
> 
> Especially the reply from Ross (CC). Maybe he has some comments :) 
<snip>
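As an aside for readers hitting the same splitting effect described above:
the knob usually pointed at is rr_min_io in multipath.conf, which sets how
many I/Os go down one path before round-robin moves to the next, so larger
values keep longer runs of consecutive requests in a single session's
queue. A minimal multibus stanza, with purely illustrative values rather
than tuning advice, might look like:

    # /etc/multipath.conf -- illustrative fragment only
    defaults {
        path_grouping_policy  multibus
        rr_min_io             100   # I/Os sent to one path before the
                                    # round-robin moves on to the next
    }
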
That makes sense and would explain what we are seeing with multipath, but
why would we see the same thing with mdadm, using RAID0 to stripe across
multiple iSCSI targets? I would think that, just as increasing the number
of physical spindles increases performance, increasing the number of iSCSI
sessions would also increase performance.
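
For concreteness, the kind of stripe set being described - an md RAID0
across block devices that are each a separate iSCSI session - would be
created along these lines (the device names and chunk size are
placeholders, not the actual configuration from the thread):

    # each /dev/sdX below is a different iSCSI target/session
    mdadm --create /dev/md0 --level=0 --raid-devices=4 --chunk=64 \
          /dev/sdb /dev/sdc /dev/sdd /dev/sde

    # confirm the array and its chunk size
    cat /proc/mdstat
    mdadm --detail /dev/md0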

On a side note, we did discover quite a bit about the influence of the
I/O scheduler last night.  We found that there was only a marginal
difference between cfq, deadline, anticipatory, and noop when running a
single thread.  However, when running multiple threads, cfq did not
scale at all; performance for 10 threads was the same as for one - in
our case, roughly 6900 IOPS at a 512-byte block size for sequential
reads.  The other schedulers scaled almost linearly (at least at first):
10 threads shot up to 42000 IOPS, and 40 threads to over 60000 IOPS.
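
For reference, the scheduler can be inspected and switched per block
device at runtime through sysfs; the device name below is a placeholder:

    # show the available schedulers -- the active one is in brackets
    cat /sys/block/sdb/queue/scheduler
    # noop anticipatory deadline [cfq]

    # switch this device to deadline without rebooting
    echo deadline > /sys/block/sdb/queue/scheduler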

We did find that RAID0 was able to scale - at 100 threads we hit around
106000 IOPS on our Nexenta-based Z200 from Pogo Linux - but performance
on a single thread is still less than performance against a single
"spindle", a single session.  Why is that?  Thanks - John
-- 
John A. Sullivan III
Open Source Development Corporation
+1 207-985-7880
jsullivan at opensourcedevel.com

http://www.spiritualoutreach.com
Making Christianity intelligible to secular society




