Issue #11 September 2005

Performance tuning tools: ps, top, sar, iostat, and vmstat

As a system administrator, part of your daily duties is to monitor systems for performance and to tune them where necessary. While there are expensive software products and benchmarking tools that can hone a machine to optimum efficiency, several basic tools within Linux® let the knowledgeable system administrator gather information and use it to decide where and when to tune a system.

P.S.—I want to see my processes

One of the most basic tools we can use is the utility ps, which provides a snapshot of current processes. This snapshot can range from the active processes of a single user to all the processes on the system. The simplest example is to run the ps command with no options, which produces output similar to:

  PID TTY          TIME CMD
 2873 pts/1    00:00:00 bash
 3002 pts/1    00:00:00 ps
Example 1. Basic output of ps

We see in Example 1, “Basic output of ps” that we get some minimal information about the processes we are running, including ps itself. ps displays the process ID (PID), the terminal associated with the process (TTY), the accumulated CPU time in [dd-]hh:mm:ss format (TIME), and the executable name (CMD). Spectacular, right? Well, ps does this and a whole lot more. I should mention at this point that the version of ps that I am using for this article is something special compared to the ps of yesteryear and of your classic UNIX®. This ps, procps version 3.2.5, accepts several kinds of options: UNIX options, which may be grouped and must be preceded by a dash; BSD options, which may be grouped and must not be used with a dash; and GNU long options, which are preceded by two dashes. For the uninitiated, those who are new to Linux, or refugees from some older BSD or System V variant, this is good news: a system administrator can track down a process using whichever set of options is familiar.

root      2784  2774  0 22:45 pts/2    00:00:00 su - mfrye
mfrye     2785  2784  0 22:45 pts/2    00:00:00 -bash
root      2895  1870  0 23:04 ?        00:00:00 sshd: mfrye [priv]
mfrye     2897  2895  0 23:04 ?        00:00:00 sshd: mfrye@pts/3
mfrye     2898  2897  0 23:04 pts/3    00:00:00 -bash
mfrye     3274  2785  0 23:34 pts/2    00:00:00 ps -ef
mfrye     3275  2785  0 23:34 pts/2    00:00:00 grep mfrye
Example 2. Output of ps -ef | grep mfrye

root      2784  0.0  0.0  71368  1288 pts/2    S    22:45   0:00 su - mfrye
mfrye     2785  0.0  0.0  55124  1536 pts/2    S    22:45   0:00 -bash
root      2895  0.0  0.1  38228  2660 ?        Ss   23:04   0:00 sshd: mfrye [priv]
mfrye     2897  0.0  0.1  38228  2748 ?        S    23:04   0:00 sshd: mfrye@pts/3
mfrye     2898  0.0  0.0  55124  1528 pts/3    Ss   23:04   0:00 -bash
mfrye     3272  0.0  0.0  52948   872 pts/2    R+   23:34   0:00 ps aux
mfrye     3273  0.0  0.0  51192   636 pts/2    S+   23:34   0:00 grep mfrye
Example 3. Output of ps aux | grep mfrye

In Example 2, “Output of ps -ef | grep mfrye” and Example 3, “Output of ps aux | grep mfrye”, we see the output of ps with different arguments. We can use this output to track a particular set of processes (owned by mfrye) via either of two sets of options (UNIX and BSD, respectively; note that the BSD form takes no dash). So what's the big deal, you're thinking? OK, so bash is a pretty tame example. When another process consumes more memory, or some other resource, than you want, ps can be a very quick, easy, and effective way to track that process down. So now we've tracked down a particular process, but all we have is a one-time snapshot: some basic information such as the process's accumulated CPU time, which as you may appreciate, is not ideal. Luckily, there's more.
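To see which user's processes hold the most memory, the ps aux listing can be totalled by user. The following is a minimal sketch rather than anything shipped with procps: the function name rss_by_user is made up for illustration, and it assumes the default ps aux column order, where the first column is the user and the sixth is RSS in kilobytes. It is fed the rows from Example 3 so that it runs on any machine.

```shell
#!/bin/sh
# rss_by_user (hypothetical helper): total the RSS column of `ps aux` per user.
# Assumes the default `ps aux` layout: USER is column 1, RSS (KiB) is column 6.
rss_by_user() {
    awk '$1 != "USER" { rss[$1] += $6 }
         END { for (u in rss) printf "%s %d\n", u, rss[u] }' |
        sort -k2,2nr   # largest total first
}

# Demo input: the rows shown in Example 3.
rss_by_user <<'EOF'
root      2784  0.0  0.0  71368  1288 pts/2    S    22:45   0:00 su - mfrye
mfrye     2785  0.0  0.0  55124  1536 pts/2    S    22:45   0:00 -bash
root      2895  0.0  0.1  38228  2660 ?        Ss   23:04   0:00 sshd: mfrye [priv]
mfrye     2897  0.0  0.1  38228  2748 ?        S    23:04   0:00 sshd: mfrye@pts/3
mfrye     2898  0.0  0.0  55124  1528 pts/3    Ss   23:04   0:00 -bash
mfrye     3272  0.0  0.0  52948   872 pts/2    R+   23:34   0:00 ps aux
mfrye     3273  0.0  0.0  51192   636 pts/2    S+   23:34   0:00 grep mfrye
EOF
# prints:
# mfrye 7320
# root 3948
```

On a live system the same function can be used as ps aux | rss_by_user. Keep in mind that RSS counts shared pages once per process, so these totals overstate the memory actually in use.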

Being on top

To track a process in relation to the system usage, another basic performance monitoring tool is top. To start top, simply run top from the command line. A typical glimpse of top output without any formatting can be seen in Example 4, “Basic output of top”.

top - 23:50:16 up  3:25,  1 user,  load average: 0.00, 0.00, 0.00
Tasks:  88 total,   1 running,  87 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.0% us,  0.0% sy,  0.0% ni, 100.0% id,  0.0% wa,  0.0% hi,  0.0% si
Mem:   2055112k total,   227684k used,  1827428k free,    53556k buffers
Swap:  2096472k total,        0k used,  2096472k free,   100884k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
    1 root      16   0  4876  596  500 S  0.0  0.0   0:00.78 init
    2 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 migration/0
    3 root      34  19     0    0    0 S  0.0  0.0   0:00.00 ksoftirqd/0
    4 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 migration/1
    5 root      34  19     0    0    0 S  0.0  0.0   0:00.00 ksoftirqd/1
    6 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 migration/2
    7 root      34  19     0    0    0 S  0.0  0.0   0:00.00 ksoftirqd/2
    8 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 migration/3
    9 root      34  19     0    0    0 S  0.0  0.0   0:00.00 ksoftirqd/3
   10 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 events/0
   11 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 events/1
   12 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 events/2
   13 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 events/3
   14 root      19  -5     0    0    0 S  0.0  0.0   0:00.00 khelper
   15 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 kthread
   22 root      20  -5     0    0    0 S  0.0  0.0   0:00.00 kacpid
  106 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 kblockd/0
  107 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 kblockd/1
  108 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 kblockd/2
  109 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 kblockd/3
  112 root      15   0     0    0    0 S  0.0  0.0   0:00.00 khubd
  162 root      20   0     0    0    0 S  0.0  0.0   0:00.00 pdflush
  163 root      15   0     0    0    0 S  0.0  0.0   0:00.01 pdflush
  166 root      13  -5     0    0    0 S  0.0  0.0   0:00.00 aio/0
  167 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 aio/1
  168 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 aio/2
  169 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 aio/3
Example 4. Basic output of top

Top is an interactive tool that allows a system administrator to view the process table in order of CPU or memory usage, by user, and at varying refresh rates. For example, a system administrator who wants to monitor the processes running under the user apache (command u, then apache), sorted by memory usage (command M), and updated every half second (command s, then .5) would see output like that in Example 5, “Example of top output sorted by user apache”.

top - 23:58:42 up  3:33,  1 user,  load average: 0.00, 0.00, 0.00
Tasks:  88 total,   1 running,  87 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.0% us,  0.0% sy,  0.0% ni, 100.0% id,  0.0% wa,  0.0% hi,  0.0% si
Mem:   2055112k total,   227436k used,  1827676k free,    53740k buffers
Swap:  2096472k total,        0k used,  2096472k free,   101220k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 1911 apache    16   0  113m  13m 7984 S  0.0  0.7   0:00.00 httpd
 1912 apache    15   0  113m  13m 7980 S  0.0  0.7   0:00.00 httpd
 1913 apache    16   0  113m  12m 7912 S  0.0  0.6   0:00.00 httpd
 1914 apache    20   0  113m  12m 7912 S  0.0  0.6   0:00.00 httpd
 1915 apache    20   0  113m  12m 7912 S  0.0  0.6   0:00.00 httpd
 1916 apache    20   0  113m  12m 7912 S  0.0  0.6   0:00.00 httpd
 1917 apache    20   0  113m  12m 7912 S  0.0  0.6   0:00.00 httpd
 1918 apache    25   0  113m  12m 7912 S  0.0  0.6   0:00.00 httpd
Example 5. Example of top output sorted by user apache
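The same per-user, memory-sorted view can be approximated from a script, since top also has a batch mode (top -b -n 1 prints a single snapshot and exits). The sketch below is only an illustration: top_user_by_mem is a made-up helper name, and it assumes the column order shown in Examples 4 and 5. The demo input is a handful of rows taken from those two examples.

```shell
#!/bin/sh
# top_user_by_mem (hypothetical helper): from `top -b -n 1` output on stdin,
# print PID, %MEM and COMMAND for one user, highest %MEM first.
# Assumes the column order shown in Examples 4 and 5:
# PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
top_user_by_mem() {
    awk -v user="$1" '
        in_table && $2 == user { print $1, $10, $12 }
        $1 == "PID"            { in_table = 1 }   # process rows follow the header
    ' | sort -k2,2nr -k1,1n
}

# Demo input: rows taken from Examples 4 and 5.
top_user_by_mem apache <<'EOF'
top - 23:58:42 up  3:33,  1 user,  load average: 0.00, 0.00, 0.00
Tasks:  88 total,   1 running,  87 sleeping,   0 stopped,   0 zombie

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
    1 root      16   0  4876  596  500 S  0.0  0.0   0:00.78 init
 1913 apache    16   0  113m  12m 7912 S  0.0  0.6   0:00.00 httpd
 1911 apache    16   0  113m  13m 7984 S  0.0  0.7   0:00.00 httpd
 1912 apache    15   0  113m  13m 7980 S  0.0  0.7   0:00.00 httpd
EOF
# prints:
# 1911 0.7 httpd
# 1912 0.7 httpd
# 1913 0.6 httpd
```

On a live system this would be run as top -b -n 1 | top_user_by_mem apache. Filtering in awk rather than with top's own options is a deliberate choice here, so the sketch does not depend on which sorting flags a given top version supports.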

Top is useful for viewing real-time process behavior within the context of system resources. A shorter refresh interval makes brief spikes in load easier to catch. For example, if you have a system running an Oracle® Database, and your startup time for the database is unacceptably slow, you will be able to see which processes consume the greater part of memory while the system is pegged. While top is a good interactive tool, you may not have the time or inclination to sit and watch processes for more than a few minutes. Luckily, there's more.

Sar, yes, sar!

Sar is one of those utilities that conjures up images of UNIX nerds who took Latin in high school (when Latin was still offered in high schools). Because of sar's relative oddness, it is often lumped into the same category as sendmail for ease of configuration. To be fair, there is wonderful documentation for most such utilities. However, looking beyond sar's reputation for obscurity in output as well as syntax reveals a powerful system monitoring tool.

You can install sar by installing the sysstat package with the command yum install sysstat. You also need to initialize sar the first time by running /usr/lib/sa/sa1 1 1 and /usr/lib/sa/sa2 -A, or by letting cron run these commands. The sysstat package places them in /etc/cron.d/sysstat. Until they have run at least once, running sar with no arguments will not give meaningful output.

Running sar with no arguments will give you some pretty obvious output as to what's going on in your system. In Example 6, “Basic output of sar”, we see the day's statistics so far, averaged over each ten-minute interval across all CPUs. You will notice that these are the same pieces of information that we saw in top, except that in this case, sar gives us a time breakdown of when loads occurred.

Linux 2.6.12-1.1398_FC4smp (knuth)     08/28/2005

12:00:01 AM       CPU     %user     %nice   %system   %iowait     %idle
12:10:01 AM       all      0.01      0.00      0.01      0.00     99.98
12:20:01 AM       all      0.01      0.00      0.01      0.00     99.98
12:30:01 AM       all      0.01      0.00      0.01      0.01     99.98
12:40:01 AM       all      0.00      0.00      0.00      0.00    100.00
12:50:01 AM       all      0.00      0.00      0.00      0.01     99.99
01:00:01 AM       all      0.00      0.00      0.00      0.00    100.00
01:10:01 AM       all      0.00      0.00      0.00      0.00    100.00
01:20:01 AM       all      0.00      0.00      0.00      0.00    100.00
01:30:01 AM       all      0.00      0.00      0.00      0.00    100.00
01:40:01 AM       all      0.00      0.00      0.00      0.00    100.00
01:50:01 AM       all      0.00      0.00      0.00      0.00    100.00
Average:          all      0.00      0.00      0.00      0.00     99.99
Example 6. Basic output of sar

Incidentally, these values are stored by running sar in cron. Fedora™ Core 4 has the following entries in /etc/cron.d/sysstat, by default:

# run system activity accounting tool every 10 minutes
*/10 * * * * root /usr/lib/sa/sa1 1 1
# generate a daily summary of process accounting at 23:53
53 23 * * * root /usr/lib/sa/sa2 -A

The sa1 script collects and stores binary data in the system activity daily data file, and sa2 writes a daily report in the /var/log/sa/ directory. Sar can also be invoked to provide real-time statistics on the fly. In Example 7, “Example output of sar 1 10”, I have invoked sar with a one-second interval over 10 iterations. This is a very effective way to evaluate where a bottleneck might lie. If you're having problems with I/O wait when certain reads take place, you'll be able to see it here. Running sar in this fashion offers you the dynamic output of top with the specificity of sar.

Linux 2.6.12-1.1398_FC4smp (knuth)     08/28/2005

02:13:43 AM       CPU     %user     %nice   %system   %iowait     %idle
02:13:44 AM       all      0.00      0.00      0.00      0.00    100.00
02:13:45 AM       all      0.00      0.00      0.00      0.00    100.00
02:13:46 AM       all      0.00      0.00      0.00      0.00    100.00
02:13:47 AM       all      0.00      0.00      0.00      0.00    100.00
02:13:48 AM       all      0.00      0.00      0.00      0.00    100.00
02:13:49 AM       all      0.00      0.00      0.00      0.00    100.00
02:13:50 AM       all      0.00      0.00      0.00      0.00    100.00
02:13:51 AM       all      0.00      0.00      0.00      0.00    100.00
02:13:52 AM       all      0.00      0.00      0.00      0.00    100.00
02:13:53 AM       all      0.00      0.00      0.00      0.00    100.00
Average:          all      0.00      0.00      0.00      0.00    100.00
Example 7. Example output of sar 1 10
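The I/O wait check described above can be automated by filtering sar's report for samples whose %iowait exceeds a limit. What follows is a rough sketch under stated assumptions: sar_iowait_over is a made-up helper name, the %iowait column is found from the header line rather than assumed to be in a fixed position, and the timestamp is assumed to be in the 12-hour form shown in the examples. Because Example 7 is all zeros, the demo input uses rows from Example 6 with a limit of 0.

```shell
#!/bin/sh
# sar_iowait_over (hypothetical helper): print the samples from sar's CPU
# report whose %iowait exceeds the limit given as the first argument.
# Assumes 12-hour timestamps (time in field 1, AM/PM in field 2).
sar_iowait_over() {
    awk -v limit="$1" '
        /%iowait/ {                      # header line: remember the column number
            for (i = 1; i <= NF; i++) if ($i == "%iowait") col = i
            next
        }
        col && NF >= col && $1 != "Average:" && $col + 0 > limit + 0 {
            print $1, $2, $col
        }
    '
}

# Demo input: rows taken from Example 6.
sar_iowait_over 0 <<'EOF'
12:00:01 AM       CPU     %user     %nice   %system   %iowait     %idle
12:10:01 AM       all      0.01      0.00      0.01      0.00     99.98
12:20:01 AM       all      0.01      0.00      0.01      0.00     99.98
12:30:01 AM       all      0.01      0.00      0.01      0.01     99.98
12:40:01 AM       all      0.00      0.00      0.00      0.00    100.00
12:50:01 AM       all      0.00      0.00      0.00      0.01     99.99
Average:          all      0.00      0.00      0.00      0.00     99.99
EOF
# prints:
# 12:30:01 AM 0.01
# 12:50:01 AM 0.01
```

On a live system the equivalent would be sar 1 10 | sar_iowait_over 5, where 5 is an arbitrary threshold chosen for illustration; a sensible limit depends on the workload.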

Sar can also restrict the same report to a particular processor with the -P option; processors are numbered starting from 0. Example 8, “sar -P 1 1 5 output” shows five one-second iterations for CPU 1, and Example 9, “sar -P 2 1 5 output” shows five one-second iterations for CPU 2.

Linux 2.6.12-1.1398_FC4smp (knuth)     08/28/2005

02:28:24 AM       CPU     %user     %nice   %system   %iowait     %idle
02:28:25 AM         1      0.00      0.00      0.00      0.00    100.00

02:28:25 AM       CPU     %user     %nice   %system   %iowait     %idle
02:28:26 AM         1      0.00      0.00      0.00      0.00    100.00

02:28:26 AM       CPU     %user     %nice   %system   %iowait     %idle
02:28:27 AM         1      0.00      0.00      0.00      0.00    100.00

02:28:27 AM       CPU     %user     %nice   %system   %iowait     %idle
02:28:28 AM         1      0.00      0.00      0.00      0.00    100.00

02:28:28 AM       CPU     %user     %nice   %system   %iowait     %idle
02:28:29 AM         1      0.00      0.00      0.00      0.00    100.00

Average:          CPU     %user     %nice   %system   %iowait     %idle
Average:            1      0.00      0.00      0.00      0.00    100.00
Example 8. sar -P 1 1 5 output

Linux 2.6.12-1.1398_FC4smp (knuth)     08/28/2005

02:28:33 AM       CPU     %user     %nice   %system   %iowait     %idle
02:28:34 AM         2      0.00      0.00      0.00      0.00    100.00

02:28:34 AM       CPU     %user     %nice   %system   %iowait     %idle
02:28:35 AM         2      0.00      0.00      0.00      0.00    100.00

02:28:35 AM       CPU     %user     %nice   %system   %iowait     %idle
02:28:36 AM         2      0.00      0.00      0.00      0.00    100.00

02:28:36 AM       CPU     %user     %nice   %system   %iowait     %idle
02:28:37 AM         2      0.00      0.00      0.00      0.00    100.00

02:28:37 AM       CPU     %user     %nice   %system   %iowait     %idle
02:28:38 AM         2      0.00      0.00      0.00      0.00    100.00

Average:          CPU     %user     %nice   %system   %iowait     %idle
Average:            2      0.00      0.00      0.00      0.00    100.00
Example 9. sar -P 2 1 5 output

Check your system, STAT!

There are a number of *stat commands that appear in any given system, and I would like to mention the two that I think are most useful. The first of these is iostat. Iostat reports CPU statistics and input/output statistics for devices and partitions. While it seems that CPU statistics are available in every utility mentioned here so far, it's the I/O part of iostat that makes it useful. Run without any parameters, iostat gives a single report covering all CPUs and devices since boot. This is useful for a quick look at device utilization, and here the CPU figures, %iowait in particular, help put the device numbers in context. In Example 10, “Basic output of iostat”, iostat shows blocks read and written per second and overall.

Linux 2.6.12-1.1398_FC4smp (knuth)     08/28/2005

avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.01    0.00    0.01    0.04   99.93

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda               0.92        12.27         8.27     289810     195288
Example 10. Basic output of iostat

In Example 11, “Output of iostat -p sda 1 3”, iostat displays three reports at one-second intervals for device sda and all its partitions. The first report covers the time since boot, and each later report covers only the preceding interval. It's easy to see how iostat can deliver real-time statistics on the partitions' reads and writes.

Linux 2.6.12-1.1398_FC4smp (knuth)     08/28/2005

avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.01    0.00    0.01    0.04   99.93

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda               0.92        12.08         8.36     289810     200592
sda3              0.01         0.02         0.00        386          0
sda2              1.68        12.01         8.36     288138     200544
sda1              0.02         0.04         0.00       1024         48

avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.00    0.00    0.00    0.00  100.00

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda               0.00         0.00         0.00          0          0
sda3              0.00         0.00         0.00          0          0
sda2              0.00         0.00         0.00          0          0
sda1              0.00         0.00         0.00          0          0

avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.00    0.00    0.00    0.00  100.00

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda               0.00         0.00         0.00          0          0
sda3              0.00         0.00         0.00          0          0
sda2              0.00         0.00         0.00          0          0
sda1              0.00         0.00         0.00          0          0
Example 11. Output of iostat -p sda 1 3
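Because only the first iostat report reflects totals since boot, it can be handy to pull that report out and rank the devices in it. The following is a sketch, not a feature of iostat: iostat_since_boot_by_tps is a made-up helper name, and it assumes the six-column device layout shown in Examples 10 and 11. The demo input is the first report from Example 11 plus the start of the second.

```shell
#!/bin/sh
# iostat_since_boot_by_tps (hypothetical helper): from iostat output on stdin,
# keep only the first report (totals since boot) and list devices by tps.
# Assumes the six-column device layout shown in Examples 10 and 11.
iostat_since_boot_by_tps() {
    awk '
        /^avg-cpu/ { reports++; in_dev = 0 }
        reports > 1 { exit }             # later reports are per-interval
        in_dev && NF == 6 { print $1, $2 }
        /^Device/ { in_dev = 1 }         # device rows follow this header
    ' | sort -k2,2nr
}

# Demo input: the first report of Example 11 and the start of the second.
iostat_since_boot_by_tps <<'EOF'
avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.01    0.00    0.01    0.04   99.93

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda               0.92        12.08         8.36     289810     200592
sda3              0.01         0.02         0.00        386          0
sda2              1.68        12.01         8.36     288138     200544
sda1              0.02         0.04         0.00       1024         48

avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.00    0.00    0.00    0.00  100.00
EOF
# prints:
# sda2 1.68
# sda 0.92
# sda1 0.02
# sda3 0.01
```

On a live system this would be run as iostat -p sda | iostat_since_boot_by_tps.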

The last utility I would like to mention is vmstat. Vmstat reports statistics on virtual memory and can be useful when trying to identify system bottlenecks. Vmstat does not count itself as a running process, and it can be used in a number of modes. Run with no parameters, vmstat displays a single report of averages since boot; the -a option shows active and inactive memory instead of buffers and cache. Like iostat, vmstat can be run in iterations, at a particular interval. In Example 12, “Output of vmstat 1 5”, vmstat is run at one-second intervals for five iterations.

procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 0  0      0 1826368  57028 102352    0    0     1     1  251     6  0  0 100  0
 0  0      0 1826368  57028 102352    0    0     0     0 1008    13  0  0 100  0
 0  0      0 1826368  57028 102352    0    0     0     0 1004    13  0  0 100  0
 0  0      0 1826368  57036 102344    0    0     0    60 1007    25  0  0 100  0
 0  0      0 1826368  57036 102344    0    0     0     0 1004    13  0  0 100  0
Example 12. Output of vmstat 1 5
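The first row of vmstat output reports averages since boot, which is why its in and cs values in Example 12 differ so much from the rows that follow. When summarizing a run it is usually the interval rows that matter. The sketch below averages interrupts (in) and context switches (cs) per second over those rows only. It is an illustration: vmstat_interval_avg is a made-up helper name, and the columns are looked up by the names in the header line. The demo input is Example 12 itself.

```shell
#!/bin/sh
# vmstat_interval_avg (hypothetical helper): average the "in" and "cs" columns
# of `vmstat <interval> <count>` output, skipping the first data row, which
# holds averages since boot rather than an interval sample.
vmstat_interval_avg() {
    awk '
        $1 == "procs" { next }                                     # banner line
        $1 == "r" { for (i = 1; i <= NF; i++) col[$i] = i; next }  # column names
        !("cs" in col) { next }          # no header seen yet
        ++rows == 1 { next }             # since-boot row
        { in_sum += $(col["in"]); cs_sum += $(col["cs"]); n++ }
        END { if (n) printf "in=%.2f cs=%.2f\n", in_sum / n, cs_sum / n }
    '
}

# Demo input: the output shown in Example 12.
vmstat_interval_avg <<'EOF'
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 0  0      0 1826368  57028 102352    0    0     1     1  251     6  0  0 100  0
 0  0      0 1826368  57028 102352    0    0     0     0 1008    13  0  0 100  0
 0  0      0 1826368  57028 102352    0    0     0     0 1004    13  0  0 100  0
 0  0      0 1826368  57036 102344    0    0     0    60 1007    25  0  0 100  0
 0  0      0 1826368  57036 102344    0    0     0     0 1004    13  0  0 100  0
EOF
# prints:
# in=1005.75 cs=16.00
```

On a live system this would be run as vmstat 1 5 | vmstat_interval_avg.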

Vmstat can also provide a quick list of memory-related statistics from the vmstat -s command:

      2055112  total memory
       229240  used memory
        84480  active memory
        91816  inactive memory
      1825872  free memory
        57224  buffer memory
       102156  swap cache
      2096472  total swap
            0  used swap
      2096472  free swap
         1130 non-nice user cpu ticks
          247 nice user cpu ticks
         1110 system cpu ticks
      9995941 idle cpu ticks
         3860 IO-wait cpu ticks
           35 IRQ cpu ticks
           56 softirq cpu ticks
       144945 pages paged in
       108540 pages paged out
            0 pages swapped in
            0 pages swapped out
     25092942 interrupts
       575618 CPU context switches
   1126139091 boot time
         4447 forks

as well as partition information from the vmstat -p sda2:

sda2          reads   read sectors  writes    requested writes
               15200     288218      27285     218280

Many of the functions of the utilities discussed in this article overlap. This is the result of having several authors, each of whom has tried to provide as elegant and powerful a utility as possible. The overlap can cause some confusion, or apathy toward tools that seem redundant or are perceived to be "bloated." However, the system administrator who recognizes each tool for its strengths, and for its ability to report cleanly the characteristics of a running system, will find that their system comes with a rather complete tool set for not only reacting to performance issues but predicting them through proactive monitoring.

About the author

Matt Frye is a UNIX/Linux system administrator living in North Carolina. He is Chairman of the North Carolina System Administrators and is an active member of the Triangle Linux User Group. In his spare time, he enjoys fly fishing and mental Kung Foo.