bad (2 times) gfs read performance under x86_64 compared to i386 WAS: Re: [Linux-cluster] AW: GNBD multipath with devicemapper? -- possible solution
Hansjörg Maurer
hansjoerg.maurer at dlr.de
Sun Apr 24 13:46:01 UTC 2005
Hi
I have done some further testing on this issue, and I noticed an
interesting behavior.
We have an installation with RHEL4 on i386 hardware (old SAN) and one on
new x86_64 hardware (dual CPUs, each with hyperthreading).
On i386, read performance is better than write performance (under ext3
and gfs).
On x86_64, read performance is better than write performance under ext3,
but about two times worse under gfs (filesystem created with lock_nolock).
Here is a short summary (times are min:sec):
i386 ext3
write: 1:16
read: 0:55
read: 0:43 (blockdev --setra 8192)
i386 gfs
write: 1:30
read: 1:08
read: 1:08 (blockdev --setra 8192)
x86_64 ext3
write: 0:35
read: 0:27
read: 0:18 (blockdev --setra 8192)
x86_64 gfs
write: 0:27
read: 1:03
read: 1:03 (blockdev --setra 8192)
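For easier comparison, the min:sec times above can be converted to
throughput. A small helper (hypothetical, not part of the original runs;
it assumes awk is available and uses the file sizes from the transcripts
below, 1000M on i386 and 2000M on x86_64):

```shell
# Convert an elapsed time given as min:sec plus a file size in MB
# into MB/sec.
throughput() {
    min=${1%%:*}
    sec=${1##*:}
    awk -v m="$min" -v s="$sec" -v sz="$2" \
        'BEGIN { printf "%.1f MB/sec\n", sz / (m * 60 + s) }'
}

throughput 0:27 2000   # x86_64 gfs write
throughput 1:03 2000   # x86_64 gfs read
throughput 0:18 2000   # x86_64 ext3 read, readahead 8192
```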
The gnbd tests I did last week were under x86_64 too, so this might not
be a gnbd issue, but a gfs issue under x86_64.
The tests were done under
2.6.9-5.0.3.ELsmp and 2.6.9-6.38.ELsmp with the current GFS from CVS
(RHEL4 tag), with no difference.
Can anyone reproduce this behavior?
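A minimal sketch of the sequence I used, for anyone who wants to try it.
It substitutes dd for mkfile (which is not installed everywhere) and
writes into an ordinary directory by default; WORKDIR and SIZE_MB are
example values. For a real run, point WORKDIR at a freshly created ext3
or gfs mount and raise SIZE_MB well above RAM size so the read cannot be
served from the page cache:

```shell
#!/bin/sh
# Reproduction sketch; defaults are only for a dry run.
WORKDIR=${WORKDIR:-/tmp/gfs-readtest}
SIZE_MB=${SIZE_MB:-10}    # original runs used 1000M (i386) / 2000M (x86_64)

mkdir -p "$WORKDIR"

# Write phase (original: time mkfile 2000M a)
time dd if=/dev/zero of="$WORKDIR/a" bs=1M count="$SIZE_MB" 2>/dev/null
sync

# Read phase (original: time cat a > /dev/null)
time cat "$WORKDIR/a" > /dev/null

# Optionally repeat the read after raising the readahead on the
# underlying device, e.g.: blockdev --setra 8192 /dev/sdb1

rm -rf "$WORKDIR"
```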
Greetings
Hansjörg
Here are the detailed tests:
[root at chianti sda1]# uname -a
Linux chianti.itsd.de 2.6.9-6.38.EL #1 Wed Apr 13 01:36:09 EDT 2005 i686
athlon i386 GNU/Linux
[root at chianti ~]# mkfs.ext3 /dev/sda1
mke2fs 1.35 (28-Feb-2004)
max_blocks 4294967295, rsv_groups = 0, rsv_gdb = 1024
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
5341184 inodes, 10673176 blocks
533658 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=12582912
326 block groups
32768 blocks per group, 32768 fragments per group
16384 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632,
2654208,
4096000, 7962624
Writing inode tables: done
inode.i_blocks = 98312, i_size = 4243456
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done
This filesystem will be automatically checked every 28 mounts or
180 days, whichever comes first.
Use tune2fs -c or -t to override.
[root at chianti ~]# mount /dev/sda1 /mnt/sda1/
[root at chianti ~]# cd /mnt/sda1
[root at chianti sda1]# free
total used free shared buffers cached
Mem: 515736 164504 351232 0 43544 25104
-/+ buffers/cache: 95856 419880
Swap: 786424 152 786272
[root at chianti sda1]# time mkfile 1000M a
real 1m16.242s
user 0m0.009s
sys 0m16.273s
[root at chianti sda1]# time mkfile 1000M b
real 1m17.738s
user 0m0.009s
sys 0m15.511s
[root at chianti sda1]# time cat a > /dev/null
real 0m55.025s
user 0m0.193s
sys 0m6.529s
[root at chianti sda1]# time cat b > /dev/null
real 0m54.241s
user 0m0.220s
sys 0m6.331s
[root at chianti sda1]# blockdev --setra 8192 /dev/sda1
[root at chianti sda1]# time cat a > /dev/null
real 0m43.565s
user 0m0.189s
sys 0m6.014s
[root at chianti sda1]# time cat b > /dev/null
real 0m43.214s
user 0m0.196s
sys 0m6.727s
[root at chianti ~]# gfs_mkfs -p lock_nolock -j 3 /dev/sda1
This will destroy any data on /dev/sda1.
It appears to contain a EXT2/3 filesystem.
Are you sure you want to proceed? [y/n] y
Device: /dev/sda1
Blocksize: 4096
Filesystem Size: 10573560
Journals: 3
Resource Groups: 162
Locking Protocol: lock_nolock
Lock Table:
Syncing...
All Done
[root at chianti sda1]# time mkfile 1000M a
real 1m27.421s
user 0m0.010s
sys 0m12.202s
[root at chianti sda1]# time mkfile 1000M b
real 1m35.009s
user 0m0.006s
sys 0m12.513s
[root at chianti sda1]# time cat a > /dev/null
real 1m12.609s
user 0m0.153s
sys 0m9.980s
[root at chianti sda1]# time cat b > /dev/null
real 1m7.989s
user 0m0.154s
sys 0m10.427s
[root at chianti sda1]# blockdev --setra 256 /dev/sda1
[root at chianti sda1]# time cat a > /dev/null
real 1m8.402s
user 0m0.082s
sys 0m8.841s
[root at chianti sda1]# time cat b > /dev/null
real 1m8.647s
user 0m0.124s
sys 0m9.565s
[root at chianti sda1]# blockdev --setra 8192 /dev/sda1
[root at chianti sda1]# time cat a > /dev/null
real 1m8.419s
user 0m0.115s
sys 0m9.262s
[root at rmvbs02 ~]# uname -a
Linux rmvbs02.cluster.robotic.dlr.de 2.6.9-5.0.3.ELsmp #1 SMP Sat Feb 19
15:45:14 CST 2005 x86_64 x86_64 x86_64 GNU/Linux
[root at rmvbs02 ~]# mkfs.ext3 /dev/sdb1
mke2fs 1.35 (28-Feb-2004)
max_blocks 4294967295, rsv_groups = 131072, rsv_gdb = 977
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
97615872 inodes, 195221872 blocks
9761093 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=4294967296
5958 block groups
32768 blocks per group, 32768 fragments per group
16384 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632,
2654208,
4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
102400000
Writing inode tables: done
inode.i_blocks = 140696, i_size = 4243456
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done
This filesystem will be automatically checked every 38 mounts or
180 days, whichever comes first.
Use tune2fs -c or -t to override.
[root at rmvbs02 tmp]# free
total used free shared buffers cached
Mem: 1026364 65960 960404 0 684 15544
-/+ buffers/cache: 49732 976632
Swap: 2739040 1360 2737680
[root at rmvbs02 tmp]# time mkfile 2000M a
real 0m35.560s
user 0m0.005s
sys 0m7.596s
[root at rmvbs02 tmp]# time mkfile 2000M b
real 0m30.917s
user 0m0.005s
sys 0m7.707s
[root at rmvbs02 tmp]# time mkfile 2000M c
real 0m40.663s
user 0m0.010s
sys 0m7.806s
[root at rmvbs02 tmp]# time cat a > /dev/null
real 0m28.787s
user 0m0.122s
sys 0m2.818s
[root at rmvbs02 tmp]# time cat b > /dev/null
real 0m27.468s
user 0m0.124s
sys 0m2.678s
[root at rmvbs02 tmp]# blockdev --getra /dev/sdb1
128
[root at rmvbs02 tmp]# blockdev --setra 8192 /dev/sdb1
[root at rmvbs02 tmp]# time cat c > /dev/null
real 0m18.541s
user 0m0.105s
sys 0m2.064s
[root at rmvbs02 tmp]# time cat a > /dev/null
real 0m18.464s
user 0m0.117s
sys 0m2.035s
[root at rmvbs02 ~]# gfs_mkfs -p lock_nolock -j 3 /dev/sdb1
This will destroy any data on /dev/sdb1.
It appears to contain a EXT2/3 filesystem.
Are you sure you want to proceed? [y/n] y
Device: /dev/sdb1
Blocksize: 4096
Filesystem Size: 195108656
Journals: 3
Resource Groups: 2978
Locking Protocol: lock_nolock
Lock Table:
Syncing...
All Done
[root at rmvbs02 ~]# mount -t gfs /dev/sdb1 /mnt/tmp/
[root at rmvbs02 ~]# cd /mnt/tmp/
[root at rmvbs02 tmp]# time mkfile 2000M a
real 0m24.616s
user 0m0.004s
sys 0m4.892s
[root at rmvbs02 tmp]# time mkfile 2000M b
real 0m27.023s
user 0m0.005s
sys 0m5.027s
[root at rmvbs02 tmp]# time mkfile 2000M c
real 0m29.205s
user 0m0.005s
sys 0m5.163s
[root at rmvbs02 tmp]# time cat a > /dev/null
real 1m4.698s
user 0m0.120s
sys 0m6.138s
[root at rmvbs02 tmp]# time cat b > /dev/null
real 1m2.958s
user 0m0.132s
sys 0m6.175s
[root at rmvbs02 tmp]# time cat c > /dev/null
real 1m2.867s
user 0m0.109s
sys 0m6.079s
[root at rmvbs02 tmp]# blockdev --getra /dev/sdb1
8192
[root at rmvbs02 tmp]# blockdev --setra 256 /dev/sdb1
[root at rmvbs02 tmp]# time cat a > /dev/null
real 1m2.931s
user 0m0.101s
sys 0m6.073s
Benjamin Marzinski wrote:
>On Thu, Apr 21, 2005 at 07:58:49AM +0200, Hansjörg Maurer wrote:
>
>
>>Hi
>>
>>thank you very much,
>>Do you have any idea what causes the speed difference between vanilla
>>and RHEL4 kernel?
>>Should I officially file a bug in bugzilla, or do you take care of the
>>problem?
>>
>>
>
>Go ahead and file a bug on gnbd if you are interested in tracking the status
>of this. I'll put it on my list of things to do.
>
>-Ben
>
>
>
>>Greetings
>>
>>
>>Hansjörg
>>
>>
>>
>>
>>
>>
>>
>>>I got those numbers on a vanilla 2.6.11 kernel.
>>>Running on a 2.6.9-6.24.EL kernel, I get
>>>45.6608 MB/sec writes
>>>35.8578 MB/sec reads
>>>
>>>Not as pronounced as your numbers, but a noticeable speed difference.
>>>
>>>-Ben
>>>
>>>
>>>
>>>
>>>>Thank you very much
>>>>
>>>>Hansjörg
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>--
>>>>_________________________________________________________________
>>>>
>>>>Dr. Hansjoerg Maurer | LAN- & System-Manager
>>>> |
>>>>Deutsches Zentrum | DLR Oberpfaffenhofen
>>>>f. Luft- und Raumfahrt e.V. |
>>>>Institut f. Robotik |
>>>>Postfach 1116 | Muenchner Strasse 20
>>>>82230 Wessling | 82234 Wessling
>>>>Germany |
>>>> |
>>>>Tel: 08153/28-2431 | E-mail: Hansjoerg.Maurer at dlr.de
>>>>Fax: 08153/28-1134 | WWW: http://www.robotic.dlr.de/
>>>>__________________________________________________________________
>>>>
>>>>
>>>>There are 10 types of people in this world,
>>>>those who understand binary and those who don't.
>>>>
>>>>--
>>>>Linux-cluster mailing list
>>>>Linux-cluster at redhat.com
>>>>http://www.redhat.com/mailman/listinfo/linux-cluster