There are two settings here, and they affect different drivers. The I/O scheduler setting affects the disks that are part of the multipath volume (and only them), while rr_min_io affects the multipath volume itself.
The higher the value of rr_min_io, the more requests are sent down one path before switching to the next in the same path group. While this is good for sequential I/O (because the elevator/scheduler on the underlying device can merge more efficiently), this reduces the amount of I/O that is sent in parallel. With very high rr_min_io settings you will end up using mostly one path at a time, while the others are idle.
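To make the trade-off concrete, here is a minimal sketch of a simplified round-robin model. It is not the actual dm-multipath path selector; it only assumes that exactly rr_min_io requests go to one path before switching to the next. The function names and the "window" idea (a stand-in for the requests that are in flight at the same time) are hypothetical, chosen for illustration.

```python
def assign_paths(num_requests, num_paths, rr_min_io):
    """Path index chosen for each request in a simplified model:
    rr_min_io consecutive requests go to one path, then the next path is used."""
    return [(i // rr_min_io) % num_paths for i in range(num_requests)]


def avg_paths_per_window(num_requests, num_paths, rr_min_io, window):
    """Average number of distinct paths touched by `window` consecutive requests.

    `window` is a rough stand-in for the number of requests in flight at once.
    """
    assignment = assign_paths(num_requests, num_paths, rr_min_io)
    counts = [len(set(assignment[i:i + window]))
              for i in range(num_requests - window + 1)]
    return sum(counts) / len(counts)


if __name__ == "__main__":
    # Compare a small, a medium and a very large rr_min_io on 4 paths.
    for rr in (1, 100, 1000):
        busy = avg_paths_per_window(1000, 4, rr, window=4)
        print(f"rr_min_io={rr}: about {busy:.2f} paths busy per 4 requests")
```

In this model, rr_min_io=1 keeps all four paths busy, while large values leave roughly one path active at any moment, which is the effect described above.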
With small values for rr_min_io, the chance of spreading the requests over all paths is higher, but so is the chance of splitting a long sequential run into smaller pieces that are no longer sequential from the point of view of the disk devices that make up the paths. Here a scheduler setting that copes with that pattern can help.
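The fragmentation effect can be illustrated with a small sketch. It assumes a purely sequential stream in which request i covers block i, and the same simplified round-robin rule (rr_min_io requests per path before switching). The function name is hypothetical, and real elevator merging depends on more than adjacency, so treat this only as a rough picture of what each underlying device gets to see.

```python
def contiguous_runs_per_path(num_requests, num_paths, rr_min_io):
    """For a sequential stream, return {path: [run lengths]} where each run is a
    stretch of adjacent blocks that arrived on that path back to back."""
    runs = {p: [] for p in range(num_paths)}
    last_block = {}  # last block number seen on each path
    for block in range(num_requests):
        path = (block // rr_min_io) % num_paths
        if last_block.get(path) == block - 1:
            runs[path][-1] += 1   # adjacent to the previous block on this path
        else:
            runs[path].append(1)  # gap on this path: a new run starts
        last_block[path] = block
    return runs


if __name__ == "__main__":
    for rr in (1, 4):
        print(f"rr_min_io={rr}:", contiguous_runs_per_path(8, 2, rr))
```

With two paths and rr_min_io=1, each path only ever sees isolated single blocks, so the scheduler on the underlying device has nothing adjacent to merge; with rr_min_io=4, each path sees one run of four adjacent blocks.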
Another approach, which is not in the mainline kernel yet, is to add a queue to the multipath target, merge sequential requests there, and send each I/O down a different path (as rr_min_io=1 would do). Kiyoshi Ueda from NEC gave a presentation about this at last year's OLS (https://ols2006.108.redhat.com/2007/Reprints/ueda-Reprint.pdf). In their evaluation of the current kernel, smaller rr_min_io values improved performance, but the best value was different for reads and writes.