
Re: disk IO request queue



Magnus Andersen wrote:

A few more questions...

1. What does your /etc/fstab look like?

[mrtg xxxxxx mrtg]$ cat /etc/fstab
LABEL=/ / ext3 defaults 1 1
LABEL=/boot /boot ext3 defaults 1 2
none /dev/pts devpts gid=5,mode=620 0 0
none /proc proc defaults 0 0
LABEL=/u01 /u01 ext3 defaults 1 2
LABEL=/u02 /u02 ext3 defaults,noatime 1 2
LABEL=/u03 /u03 ext3 defaults,noatime 1 2
LABEL=/u04 /u04 ext3 defaults,noatime 1 2
LABEL=/u05 /u05 ext3 defaults,noatime 1 2
LABEL=/u06 /u06 ext3 defaults,noatime 1 2
LABEL=/u07 /u07 ext3 defaults,noatime 1 2
LABEL=/u08 /u08 ext3 defaults,noatime 1 2
LABEL=/u09 /u09 ext3 defaults,noatime 1 2
LABEL=/u10 /u10 ext3 defaults,noatime 1 2
LABEL=/u11 /u11 ext3 defaults,noatime 1 2
LABEL=/var /var ext3 defaults 1 2
/dev/cciss/c0d0p5 swap swap defaults,pri=1 0 0
/dev/cciss/c0d0p6 swap swap defaults,pri=2 0 0
/dev/cciss/c2d1p1 swap swap defaults,pri=3 0 0
/dev/cciss/c2d1p2 swap swap defaults,pri=4 0 0
/dev/cdrom /mnt/cdrom iso9660 noauto,owner,kudzu,ro 0 0
/dev/fd0 /mnt/floppy auto noauto,owner,kudzu 0 0


2. What does the output from a cat of /proc/sys/vm/pagecache look like?

[mrtg xxxxxx mrtg]$ cat /proc/sys/vm/pagecache
2       30      40

3. What does the output from a cat of /proc/meminfo look like?

[mrtg xxxxxx mrtg]$ cat /proc/meminfo
       total:    used:    free:  shared: buffers:  cached:
Mem:  16555655168 16143187968 412467200        0 216465408 7584522240
Swap: 4294819840 758366208 3536453632
MemTotal:     16167632 kB
MemFree:         402800 kB
MemShared:               0 kB
Buffers:             211392 kB
Cached:          6684672 kB
SwapCached:   722088 kB
Active:              224888 kB
Inact_dirty:       784772 kB
Inact_clean:    6608492 kB
Inact_target:   2224276 kB
HighTotal:     15531996 kB
HighFree:          380036 kB
LowTotal:          635636 kB
LowFree:             22764 kB
SwapTotal:      4194160 kB
SwapFree:      3453568 kB
BigPagesFree:    90112 kB
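
To put the figures above in proportion, here is a minimal sketch in Python. The helper name `parse_meminfo` is made up for illustration, and the sample text is only a subset of the output pasted above; on a live system the same function could presumably be fed the contents of /proc/meminfo.

```python
def parse_meminfo(text):
    """Return {field: kB} for lines of the form 'Name:   12345 kB'."""
    values = {}
    for line in text.splitlines():
        parts = line.split()
        # Keep only 'Name: <number> kB' lines; skip the old-style summary rows.
        if len(parts) == 3 and parts[0].endswith(":") and parts[2] == "kB":
            values[parts[0].rstrip(":")] = int(parts[1])
    return values


# Subset of the /proc/meminfo output quoted in this message.
SAMPLE = """\
MemTotal:     16167632 kB
MemFree:         402800 kB
HighTotal:     15531996 kB
HighFree:          380036 kB
LowTotal:          635636 kB
LowFree:             22764 kB
SwapTotal:      4194160 kB
SwapFree:      3453568 kB
"""

info = parse_meminfo(SAMPLE)
low_free_pct = 100.0 * info["LowFree"] / info["LowTotal"]
swap_used_kb = info["SwapTotal"] - info["SwapFree"]

print(f"LowFree is {low_free_pct:.1f}% of LowTotal")
print(f"Swap in use: {swap_used_kb} kB")
```

With these numbers, free low memory is a few percent of LowTotal while about 740 MB of swap is in use. Whether that matters for this workload is a separate question.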

4. Are the kswapd/kscand processes running a lot?

This is an extract from sar. Frankly, I can't tell whether these numbers are too high for a DB server that has 118 Oracle processes running constantly.

[mrtg xxxxxx mrtg]$ sar -B
....
14:00:00     pgpgin/s pgpgout/s  activepg  inadtypg  inaclnpg  inatarpg
14:10:01     12580,92   6087,17     51356    288124   1598086    556069
14:20:01     12869,03   7150,97   1358869    211496    357811    556069
14:30:00     10836,75   4273,70     39755    305119   1639621    556069
14:40:00     11875,15   2157,05     50592    272401   1662126    556069
14:50:01     10840,94   3495,95      5300    466818   1511166    556069
15:00:00     11218,28   6351,94    811833    260201    878285    556069
15:10:00     13609,75   8266,37   1621522    326428     83188    556069
15:20:00     12949,44  12996,30   1617815    349851     78578    556069
15:30:00     14690,19  11610,43    258207   1542660    240570    556069
15:40:00     12655,25   5044,63     21691    426909   1530368    556069
15:50:00     12884,14   2814,72     48942    262341   1650613    556069
16:00:01     12753,04   2805,44    384267     43088   1497163    556069
16:10:00     19176,36   7790,58   1092592     51285    843185    556069
16:20:00    3590138,12   9234,77     46522   1835686    156279    556069
16:30:01     13198,41   4914,28     13847    335651   1635201    556069
16:40:00     13635,15   3062,69     22390    297322   1649422    556069
16:50:00     13038,47   4228,73    606612     97051   1269245    556069
17:00:00     11530,39   4239,44   1310924    350514    273286    556069
17:10:00     15584,85   5776,15     47659   1587050    389788    556069
17:20:00     12372,65   6751,91     65247    330337   1585666    556069
17:30:00     12188,38  11084,01   1219729    219331    433911    556069
17:40:02     14959,71  12696,39   1547112     85828    312153    556069
17:50:01     11087,24  15645,09   1585365    321786    124828    556069
18:00:00     10330,70  15218,37   1587687     62260    358641    556069
18:10:02      9976,87   4021,09     68018    277566   1662482    556069
18:20:00     13582,79   4956,19    830875    136754    993023    556069
18:30:00     14311,43   6323,61    401770    104135   1449782    556069
18:40:00     12212,25  10072,07   1333636    283263    338959    556069
18:50:00     11533,66   8419,48    670885     81454   1206864    556069
19:00:02     11976,42   5428,17     57846    244240   1661806    556069
19:10:02      7181,63    890,05     60445    243424   1662280    556069
19:20:00      9276,39   3165,65    673303    716847    549284    556069
19:30:00     12674,96   7099,56     21974   1414586    485291    556069
Average:     41202,55   4322,33    284646    199764   1376526    556069
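
One thing that stands out in the table is the 16:20:00 sample, whose pgpgin/s is far above its neighbours and likely pulls the reported average up. A rough Python sketch for spotting such samples follows. It assumes the decimal-comma format shown above; the function names and the factor of 10 are arbitrary choices for illustration, not anything sar itself provides.

```python
from statistics import median

# Five rows copied from the sar -B output above (decimal commas as printed).
SAMPLE = """\
16:00:01     12753,04   2805,44    384267     43088   1497163    556069
16:10:00     19176,36   7790,58   1092592     51285    843185    556069
16:20:00    3590138,12   9234,77     46522   1835686    156279    556069
16:30:01     13198,41   4914,28     13847    335651   1635201    556069
16:40:00     13635,15   3062,69     22390    297322   1649422    556069
"""


def parse_pgpgin(text):
    """Return a list of (timestamp, pgpgin/s) pairs."""
    samples = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        # Convert the locale-specific decimal comma before parsing.
        samples.append((fields[0], float(fields[1].replace(",", "."))))
    return samples


def find_outliers(samples, factor=10.0):
    """Timestamps whose value exceeds factor times the median (arbitrary rule)."""
    typical = median(value for _, value in samples)
    return [stamp for stamp, value in samples if value > factor * typical]


samples = parse_pgpgin(SAMPLE)
typical = median(value for _, value in samples)
outliers = find_outliers(samples)

print(f"median pgpgin/s: {typical}")
print(f"samples above 10x median: {outliers}")
```

The median is used rather than the mean because a single extreme sample barely moves it.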


I don't think there is a big difference between hugetlb and bigpages. I do know that I saw similar
behavior before I had this implemented. Since I implemented hugetlb, my server has been running
perfectly. I also did not have a memory issue, but tuning the pagecache
and bdflush vm parameters helped my performance a lot.


Also, I found another thing that might be causing the problem, but I'm not sure. I'm using ext3 with a 4k block size, and the DBAs configured the database block size as 8k. Do you think that could be the cause of the I/O bottleneck? (My thinking is that for one read request, Oracle would invoke two read() system calls.)
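
The arithmetic behind this question can be sketched as below. The function name and the example offsets are hypothetical. The sketch only counts how many filesystem blocks a given byte range touches; a single read() of 8192 bytes is still one system call even when it spans two filesystem blocks, so the count below is about blocks, not calls.

```python
def fs_blocks_spanned(offset, length, fs_block_size=4096):
    """Number of filesystem blocks touched by the byte range [offset, offset + length)."""
    first = offset // fs_block_size
    last = (offset + length - 1) // fs_block_size
    return last - first + 1


DB_BLOCK = 8192  # database block size mentioned in the message

# An 8k database block aligned on an 8k boundary covers exactly two 4k blocks.
aligned = fs_blocks_spanned(5 * DB_BLOCK, DB_BLOCK)

# A hypothetical misaligned 8k range would touch three.
misaligned = fs_blocks_spanned(2048, DB_BLOCK)

print(aligned, misaligned)
```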

Also, are you using AIO?

I don't know. I'll check it out.


Magnus

many thanks again!

dux


On 8/5/05, nasvel <nasvel free fr> wrote:

Magnus Andersen wrote:


This sounds very similar to what I experienced when I went live on a
RHEL 3 / 9i environment. A couple of questions.

1. How are the Oracle shares mounted to the system?



For Oracle, we've got 4 hard disks attached to a SCSI controller. We have
a big tablespace composed of 16 dbf files. The 16 files are
spread across the first 3 disks, and we use the last disk to store
the index tablespace.


2. Have you played with Linux vm?



We've tuned shmmax and the max open files limit. And we're not short of memory;
there is 6G of cached memory.
[mrtg xxxxx mrtg]$ free
            total       used       free     shared    buffers     cached
Mem:      16167632   16105196      62436          0     189340    6684672
-/+ buffers/cache:    9231184    6936448
Swap:      4194160    1196604    2997556


3. Are you using hugetlb?



No, because hugetlb is not available in AS2.1. But bigpages is enabled
in AS2.1. According to Oracle, there is no big difference between them. Do
you think it's important?

http://www.oracle.com/technology/pub/notes/technote_rhel3.html

Enterprise Linux 3 has replaced bigpages with a feature called hugetlb,
a backport of what is also in Linux kernel 2.6. There are a few
differences in how hugetlb works. Hugetlb behavior is similar to that of
bigpages; the pages are backed by large TLB entries, are not pageable,
and are preallocated, which means that once you allocate x megabytes of
hugetlb pages, that amount of physical memory can be used only through
hugetlbfs or shm allocated with SHM_HUGETLB.

Thank you very much!


dux


On 8/5/05, nasvel <nasvel free fr> wrote:



Hi,

For some weeks now, our database server (Red Hat taroon + Oracle 9i)
has suffered from very bad performance. The load avg sometimes climbed to
80% :(, although I think I have a powerful machine (HP, 3 Intel Xeon with
16G memory).

To try to find the problem, I looked at the iostat report. I found
that the await times are pretty high, and the average queue length is about
10. Someone told me that this is normal for a DB server, but I have some
doubts, so I would like to have you guys' opinions about that...

Any suggestions are welcome, thanks.

dux

=== begin output ===

Linux 2.4.9-e.62enterprise    05.08.2005

avg-cpu:  %user   %nice    %sys   %idle
        17,63    0,02   12,31   70,04

Device:         rrqm/s  wrqm/s     r/s    w/s  rsec/s   wsec/s  avgrq-sz  avgqu-sz   await  svctm  %util
cciss/c1d1p1    889,26   11,04  515,44  13,11  623,61   193,26      1,55     10,62   21,30   8,03  42,42
cciss/c1d0p1    273,76   45,13  100,64  48,85  872,20   751,95     10,86     10,62  114,52  26,39  39,45
cciss/c1d2p1    198,26  108,73  107,90  27,83  326,31  1061,51     10,22     10,62  100,75  26,34  35,76
cciss/c1d3p1    233,38   29,04   89,97  30,70  463,79   477,95      7,80      8,29   68,69  26,78  32,32

=== end output ===
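
Since iostat's await covers both the time a request spends queued and the time spent servicing it, while svctm is the service time alone, the difference gives a rough per-device queue wait. The Python sketch below computes it from the figures above. The names `COLUMNS`, `to_float` and `queue_wait_ms` are invented for this example, and the rows are retyped from the report with each device on a single line.

```python
# Column order as printed in the iostat header (after the device name).
COLUMNS = ["rrqm/s", "wrqm/s", "r/s", "w/s", "rsec/s", "wsec/s",
           "avgrq-sz", "avgqu-sz", "await", "svctm", "%util"]

# Values copied from the report above, one device per line.
ROWS = """\
cciss/c1d1p1 889,26 11,04 515,44 13,11 623,61 193,26 1,55 10,62 21,30 8,03 42,42
cciss/c1d0p1 273,76 45,13 100,64 48,85 872,20 751,95 10,86 10,62 114,52 26,39 39,45
cciss/c1d2p1 198,26 108,73 107,90 27,83 326,31 1061,51 10,22 10,62 100,75 26,34 35,76
cciss/c1d3p1 233,38 29,04 89,97 30,70 463,79 477,95 7,80 8,29 68,69 26,78 32,32
"""


def to_float(field):
    """Parse a number printed with a decimal comma."""
    return float(field.replace(",", "."))


def queue_wait_ms(text):
    """Return {device: await - svctm} in milliseconds, rounded to 2 places."""
    waits = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != len(COLUMNS) + 1:
            continue  # not a complete device row
        stats = dict(zip(COLUMNS, map(to_float, fields[1:])))
        waits[fields[0]] = round(stats["await"] - stats["svctm"], 2)
    return waits


waits = queue_wait_ms(ROWS)
for device, wait in waits.items():
    print(f"{device}: ~{wait} ms waiting in queue per request")
```

On these figures, requests to three of the four devices spend noticeably longer queued than being serviced, which is at least consistent with the high await values described above.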

--
Taroon-list mailing list
Taroon-list redhat com
http://www.redhat.com/mailman/listinfo/taroon-list






