[dm-devel] Substantial performance difference when reading/writing to device-mapper vs. the individual device

Jun'ichi Nomura j-nomura at ce.jp.nec.com
Thu Jul 25 09:38:36 UTC 2013


On 07/24/13 20:49, Kaul wrote:
> Could it be explained by the difference in max_segments between the different devices and the dm device?

It depends on the workload.
Have you already checked the IO pattern with "iostat -xN"?
For mostly sequential IO, where many segments are merged,
"max_segments" might affect performance.
For mostly random, small IO, where merges are infrequent,
it is unlikely to matter.
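
As a rough cross-check on which of the two cases applies, the average
request size can be estimated by dividing bandwidth by IOPS, using the
figures from the table quoted further down. This is only a sketch: the
helper name avg_io_kb is made up for illustration, and it assumes the
BW column is in KB per second.

```shell
# avg_io_kb BW_KBPS IOPS -> average request size in KB (hypothetical helper)
avg_io_kb() {
    awk -v bw="$1" -v iops="$2" 'BEGIN { printf "%.1f\n", bw / iops }'
}

# Figures from the quoted table, 4 LUNs, 100% read (thousands separators removed)
avg_io_kb 2420736 605661.4   # without native multipath
avg_io_kb 1824256 456108.9   # with native multipath
```

Both come out at about 4 KB per IO, which would point to small requests.
If so, merging (and therefore max_segments) is less likely to be the main
factor; the rrqm/s and wrqm/s columns of "iostat -xN" would confirm
whether merges are happening at all.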

> Sounds like https://bugzilla.redhat.com/show_bug.cgi?id=755046 which is supposed to be fixed in 6.4, I reckon:

You could check the other request_queue parameters to see whether any
differ between the dm device and the sd devices (/sys/class/block/*/queue/*).
Many of them can affect performance.
I also think you should check whether the same phenomenon occurs with the
latest upstream kernel, so that you can get feedback from the upstream
mailing list.
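
One way to do that comparison is to walk the queue directory of one device
and print only the attributes whose values differ from the other device.
A minimal sketch follows; the function name diff_queue_params and its
optional third argument (an alternative sysfs root) are illustrative
assumptions, and the device names are the ones from the multipath output
quoted below.

```shell
# diff_queue_params DEV_A DEV_B [SYSFS_ROOT]
# Prints every queue attribute whose value differs between the two devices.
diff_queue_params() {
    root="${3:-/sys/class/block}"
    for f in "$root/$1"/queue/*; do
        [ -f "$f" ] || continue               # skip subdirectories such as iosched/
        name=$(basename "$f")
        other="$root/$2/queue/$name"
        [ -f "$other" ] || continue           # attribute missing on the other device
        a=$(cat "$f" 2>/dev/null) || continue # skip unreadable attributes
        b=$(cat "$other" 2>/dev/null) || continue
        if [ "$a" != "$b" ]; then
            printf '%-24s %s=%s  %s=%s\n' "$name" "$1" "$a" "$2" "$b"
        fi
    done
}

# Compare the dm device with one of its paths
diff_queue_params dm-3 sdi
```

Repeating the call for each path (sdl, sdf, and so on) would show whether
all paths differ from the dm device in the same way.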

The other thing I would check is CPU load, perhaps starting with commands
like top and mpstat, to see whether there are enough idle cycles left for
the application/kernel to submit/process IOs.
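
If mpstat is not at hand, a similar number can be derived from two samples
of the aggregate "cpu" line in /proc/stat. The sketch below is illustrative
only: idle_pct is a made-up helper name, and it treats iowait as non-idle
time, which is one possible convention.

```shell
# idle_pct "CPU_LINE_BEFORE" "CPU_LINE_AFTER"
# Each argument is an aggregate "cpu ..." line from /proc/stat.
# Fields: cpu user nice system idle iowait irq softirq steal ...
idle_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        { total = 0
          for (i = 2; i <= 9 && i <= NF; i++) total += $i   # user..steal
          idle = $5 }
        NR == 1 { t0 = total; i0 = idle }
        NR == 2 { dt = total - t0
                  if (dt > 0) printf "%.1f\n", 100 * (idle - i0) / dt
                  else print "n/a" }'
}

# Sample one second apart while the benchmark is running
if [ -r /proc/stat ]; then
    before=$(grep '^cpu ' /proc/stat); sleep 1
    after=$(grep '^cpu ' /proc/stat)
    echo "idle over the last second: $(idle_pct "$before" "$after")%"
fi
```

An aggregate figure can hide a single saturated core, so a per-CPU view
such as "mpstat -P ALL 1" is worth a look as well.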

> 3514f0c5615a00003 dm-3 XtremIO,XtremApp
> size=1.0T features='0' hwhandler='0' wp=rw
> `-+- policy='queue-length 0' prio=1 status=active
>   |- 0:0:2:2 sdi  8:128  active ready running
>   |- 0:0:3:2 sdl  8:176  active ready running
>   |- 0:0:1:2 sdf  8:80   active ready running
>   |- 0:0:0:2 sdc  8:32   active ready running
>   |- 1:0:0:2 sds  65:32  active ready running
>   |- 1:0:3:2 sdab 65:176 active ready running
>   |- 1:0:2:2 sdy  65:128 active ready running
>   `- 1:0:1:2 sdv  65:80  active ready running
>  
> 
> [root at lg545 ~]# cat /sys/class/block/dm-3/queue/max_segments
> 128
> [root at lg545 ~]# cat /sys/class/block/sdi/queue/max_segments
> 1024
> [root at lg545 ~]# cat /sys/class/block/sdl/queue/max_segments
> 1024
> [root at lg545 ~]# cat /sys/class/block/sdf/queue/max_segments
> 1024
> [root at lg545 ~]# cat /sys/class/block/sdc/queue/max_segments
> 1024
> [root at lg545 ~]# cat /sys/class/block/sds/queue/max_segments
> 1024
> [root at lg545 ~]# cat /sys/class/block/sdab/queue/max_segments
> 1024
> [root at lg545 ~]# cat /sys/class/block/sdy/queue/max_segments
> 1024
> [root at lg545 ~]# cat /sys/class/block/sdv/queue/max_segments
> 1024
> 
> 
> On Mon, Jul 22, 2013 at 2:47 PM, Kaul <mykaul at gmail.com> wrote:
> 
>     We are seeing a substantial difference in performance when we perform a read/write to /dev/mapper/... vs. the specific device (/dev/sdXX)
>     What can we do to further isolate the issue?
> 
>     We are using CentOS 6.4, with all updates, 2 CPUs, 4 FC ports:
>     Here's a table comparing the results:
> 
>     # of LUNs  Paths per device  Native Multipath  IO Pattern  IOPS       Latency (us)  BW (KBps)
>     4          16                No                100% Read   605,661.4  3,381         2,420,736
>     4          16                No                100% Write  477,515.1  4,288         1,908,736
>     8          16                No                100% Read   663,339.4  6,174         2,650,112
>     8          16                No                100% Write  536,936.9  7,628         2,146,304
>     4          16                Yes               100% Read   456,108.9  1,122         1,824,256
>     4          16                Yes               100% Write  371,665.8  1,377         1,486,336
>     8          16                Yes               100% Read   519,450.2  1,971         2,077,696
>     8          16                Yes               100% Write  448,840.4  2,281         1,795,072
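
One more sanity check on the table above: by Little's law, IOPS multiplied
by average latency approximates the number of IOs in flight. The sketch
below assumes the latency column is the mean per-IO latency in
microseconds and that each run was in steady state; in_flight is a
hypothetical helper name.

```shell
# in_flight IOPS LATENCY_US -> estimated number of outstanding IOs
in_flight() {
    awk -v iops="$1" -v lat="$2" 'BEGIN { printf "%.0f\n", iops * lat / 1000000 }'
}

in_flight 605661.4 3381   # 4 LUNs, 100% read, no native multipath
in_flight 456108.9 1122   # 4 LUNs, 100% read, native multipath
```

Under those assumptions the first run works out to about 2048 outstanding
IOs and the second to about 512, and the other rows show a similar
four-to-one ratio. If that reading is right, it may be worth confirming
that both configurations were driven with the same total queue depth
before comparing their IOPS.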

-- 
Jun'ichi Nomura, NEC Corporation



