powernowd and friends

Behdad Esfahbod behdad at cs.toronto.edu
Wed Dec 10 07:42:56 UTC 2003


Hi all,

Perhaps this is not the best place, but since Warren mentioned
powernowd, I'm finally writing it down.  Better let people know.

Anyone with a laptop has probably tried tools that modulate CPU
speed.  They are mostly based on SpeedStep technology from Intel
or PowerNow from AMD.  There is also P4-clockmod from Intel.

The functionality is exported to user space through the cpufreq
interface.  The old interface is around /proc/cpufreq, while the
new one is in /sys/devices/system/cpu/*/cpufreq...
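
For readers who want to look at the new interface directly, here
is a minimal Python sketch that reads a few attribute files from
one CPU's cpufreq directory.  The attribute names are the usual
sysfs cpufreq ones, but which of them exist depends on the kernel
version and driver, so missing files are skipped.  The function
name read_cpufreq and the cpu0 default path are assumptions made
for illustration, not something taken from this post.

```python
from pathlib import Path

# Attribute files commonly found in a cpufreq sysfs directory.
# Frequencies are reported in kHz.  Not every driver exposes all of them.
ATTRIBUTES = (
    "scaling_governor",
    "scaling_min_freq",
    "scaling_max_freq",
    "scaling_cur_freq",
)


def read_cpufreq(cpufreq_dir="/sys/devices/system/cpu/cpu0/cpufreq"):
    """Return {attribute: value} for the attribute files that are readable."""
    values = {}
    for name in ATTRIBUTES:
        try:
            values[name] = (Path(cpufreq_dir) / name).read_text().strip()
        except OSError:
            continue  # file missing or unreadable on this system
    return values


for name, value in read_cpufreq().items():
    print(f"{name}: {value}")
```

On a machine without cpufreq support the loop simply prints
nothing.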

The 2.6 kernel configuration help says:


          Enable this cpufreq governor when you either want to set the
          CPU frequency manually or when an userspace programm shall
          be able to set the CPU dynamically, like on LART
          <http://www.lart.tudelft.nl/>

And people have developed a handful of packages to *do the right
thing*: autospeedstep, cpudynd, cpufreqd, and powernowd, to name
a few.  What all of them do is set the CPU frequency on demand.

Now comes my confusion.  All my experiments have been done on my
Sony Vaio laptop with a Pentium 4 Mobile 2.4GHz CPU.

The ACPI spec is at http://www.acpi.info/DOWNLOADS/ACPIspec-2-0c.pdf

My processor has two P states, P0 and P1 (page 23 of the ACPI
spec), and eight T states.
To describe them briefly, P states are performance states.  When
in P0, the CPU runs at full speed; when in P1, it runs at half
speed, 1.2GHz, and consumes less power because of the lower
voltage.  If I boot on battery power, it starts in P1.  I can put
it in P1 through the ACPI CPU performance interface, but once in
P1, I cannot go back to P0.

[behdad@mces behdad]$ cd /proc/acpi/processor/CPU0/
[behdad@mces CPU0]$ ls
info  limit  performance  power  throttling
[behdad@mces CPU0]$ cat performance
state count:             2
active state:            P0
states:
   *P0:                  2400 MHz, 35000 mW, 250 uS
    P1:                  1200 MHz, 20200 mW, 250 uS
[behdad@mces CPU0]$
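
The listing above is easy to turn into numbers.  A minimal
sketch, assuming the file always uses the line layout shown in
this transcript (other kernels may format it differently); the
function name parse_performance and the dictionary keys are made
up for illustration.

```python
import re

# Matches lines such as "   *P0:                  2400 MHz, 35000 mW, 250 uS".
# The leading "*" marks the currently active state.
STATE_LINE = re.compile(
    r"^\s*(\*?)(P\d+):\s+(\d+)\s*MHz,\s*(\d+)\s*mW,\s*(\d+)\s*uS\s*$"
)


def parse_performance(text):
    """Parse the text of a 'performance' file into a list of P-state dicts."""
    states = []
    for line in text.splitlines():
        match = STATE_LINE.match(line)
        if match:
            active, name, mhz, mw, latency = match.groups()
            states.append({
                "name": name,
                "active": active == "*",
                "mhz": int(mhz),
                "mw": int(mw),
                "latency_us": int(latency),
            })
    return states


# The sample is copied from the transcript above.
SAMPLE = """\
state count:             2
active state:            P0
states:
   *P0:                  2400 MHz, 35000 mW, 250 uS
    P1:                  1200 MHz, 20200 mW, 250 uS
"""

for state in parse_performance(SAMPLE):
    print(state)
```

With the sample above, P1 comes out at half the clock of P0 and
20200 mW against 35000 mW.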


Now come the T states.  For those who don't know, they are
called throttling states, and their main use is to keep the CPU
from overheating and burning out.  As the name suggests,
throttling does not reduce the CPU frequency; it just asks the
processor to spend a percentage of its time idle.

[behdad@mces CPU0]$ cat throttling
state count:             8
active state:            T0
states:
   *T0:                  00%
    T1:                  12%
    T2:                  25%
    T3:                  37%
    T4:                  50%
    T5:                  62%
    T6:                  75%
    T7:                  87%
[behdad@mces CPU0]$


Then, if I'm in P0, throttling gives me 8 different speeds,
ranging from 298MHz to 2.4GHz, and if I'm in P1, it gives from
149MHz to 1.2GHz.  Please note that when you are at 298MHz, it's
true that you only get 298M cycles per second, but your CPU is
still clocked at 2.4GHz.
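
The arithmetic behind these effective speeds can be sketched as
follows.  This assumes the eight T states are evenly spaced
12.5% duty-cycle steps, which the truncated percentages in the
listing suggest but which is not stated anywhere in this post;
the function name effective_mhz is made up for illustration.

```python
def effective_mhz(clock_mhz, t_state, n_states=8):
    """Effective cycle rate when the CPU idles t_state/n_states of the time.

    Assumes evenly spaced throttling steps; the clock itself is unchanged.
    """
    if not 0 <= t_state < n_states:
        raise ValueError("t_state out of range")
    return clock_mhz * (n_states - t_state) / n_states


# Nominal clocks for P0 and P1, taken from the 'performance' listing.
for clock in (2400, 1200):
    print(clock, [effective_mhz(clock, t) for t in range(8)])
```

Under that assumption T7 gives 300 and 150 rather than the 298
and 149 quoted above; the post does not explain the small
difference.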

Now comes cpufreq.  There is enough evidence for me that cpufreq
is nothing more than a wrapper around throttling states, at least
for my processor.

On my system, if I load the speedstep-ich driver, it loads and
shows me through the cpufreq interface a min of 1200000 and a
max of 2400000.  These look like the P states, but I cannot
change the speed whatever I do.  That's because it's the wrong
driver.  Unload that and load p4-clockmod, and I see 8 different
steps in the cpufreq interface (2.6 kernel preferred).  These are
exactly the same as my ACPI T states.
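
One way to check such a match on another machine is to compare
the frequency steps a driver reports against evenly spaced
throttling steps.  A minimal sketch: the function name, the 2%
tolerance and the example list are all hypothetical (the post
does not list the eight values), and the input would typically
come from a file such as scaling_available_frequencies, which
not every driver provides.

```python
def matches_throttling_steps(freqs_khz, n_states=8, tolerance=0.02):
    """Return True if freqs_khz look like n_states evenly spaced duty-cycle
    steps of the highest frequency, within tolerance (fraction of the top)."""
    freqs = sorted(freqs_khz, reverse=True)
    if len(freqs) != n_states:
        return False
    top = freqs[0]
    for t_state, freq in enumerate(freqs):
        expected = top * (n_states - t_state) / n_states
        if abs(freq - expected) > tolerance * top:
            return False
    return True


# Hypothetical step list in kHz; only the endpoints follow the post.
steps = [2400000, 2100000, 1800000, 1500000, 1200000, 900000, 600000, 298000]
print(matches_throttling_steps(steps))
print(matches_throttling_steps([2400000, 1200000]))
```

A two-entry list like the one speedstep-ich shows does not pass
the check, while an eight-step list like the hypothetical one
does.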


Maybe some numbers help:

Compiling a kernel with make -j4 at full CPU speed, my laptop
consumes 40W; when idle it consumes 30W.  Turned down to 1.2GHz
(50%), it consumes 35W under the same load and again 30W when
idle.  Turned down to 298MHz (12%), I couldn't put the load on,
but when idle it again consumes 30W.

So here are two facts:

1. Your CPU does its best to consume less power when idle; that's
because of the acpi_idle or apm_idle routines in the kernel.

2. Throttling states, and therefore cpufreq speeds, do not touch
the CPU frequency, and so not the voltage either; they only add
CPU idle time.

The logical conclusion is that modulating CPU speed *on demand*
means nothing.  No matter what speed you *think* your CPU is
working at, it is smart enough to consume only as much power as
the job requires.
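
The power figures above can be turned into a rough energy
comparison.  A minimal sketch, assuming the build is purely
CPU-bound so that it takes exactly twice as long at 50% (this
was not measured in the post), and using a hypothetical 600
second build time at full speed.

```python
def energy_joules(active_watts, idle_watts, busy_seconds, window_seconds):
    """Energy used over a time window: busy part at active_watts,
    the remainder at idle_watts."""
    idle_seconds = window_seconds - busy_seconds
    if idle_seconds < 0:
        raise ValueError("busy time exceeds the window")
    return active_watts * busy_seconds + idle_watts * idle_seconds


JOB_SECONDS = 600           # hypothetical build time at full speed
WINDOW = 2 * JOB_SECONDS    # compare both cases over the same wall-clock time

# Full speed: 40W while building, then 30W idle for the rest of the window.
full_speed = energy_joules(40, 30, JOB_SECONDS, WINDOW)
# Throttled to 50%: 35W for the whole window (assumed twice as long).
throttled = energy_joules(35, 30, WINDOW, WINDOW)

print(full_speed, throttled)  # prints "42000 42000"
```

Under these assumptions both cases use the same energy; only
the peak power draw differs.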


My conclusion is that CPU speed modulation on demand is as wrong
as turning the lights off during a blackout and turning them on
again when the power is back.

The only reason I can see for people to set the CPU frequency
manually is to put a limit on their power consumption.  For
example, you have a long-running Mozilla compilation and you
want to go home.  You put a limit on the CPU frequency so that
you don't need to stop the compilation job and you don't run out
of battery before you reach home.
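
Putting such a limit in place could look roughly like this.  A
minimal sketch, assuming the driver exposes both
scaling_available_frequencies and a writable scaling_max_freq
(not all drivers do, and writing normally needs root); the
function names and the cpu0 default path are made up for
illustration.

```python
from pathlib import Path


def pick_cap_khz(available_khz, limit_khz):
    """Highest available frequency not above limit_khz; if none qualifies,
    fall back to the lowest available one."""
    candidates = [f for f in available_khz if f <= limit_khz]
    return max(candidates) if candidates else min(available_khz)


def set_max_freq(limit_khz,
                 cpufreq_dir="/sys/devices/system/cpu/cpu0/cpufreq"):
    """Write a frequency cap (kHz) to scaling_max_freq and return it."""
    base = Path(cpufreq_dir)
    text = (base / "scaling_available_frequencies").read_text()
    available = [int(f) for f in text.split()]
    cap = pick_cap_khz(available, limit_khz)
    (base / "scaling_max_freq").write_text(f"{cap}\n")
    return cap
```

For example, set_max_freq(1200000) would ask for a cap of at
most 1.2GHz on cpu0.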

Done.

Please lemme know if I'm wrong.  Experimentation on speedstep and
powernow technologies would be appreciated.  Dave Jones should
have an idea.  And don't shoot me; I'm going to raise it in the
right place myself.

behdad





More information about the fedora-devel-list mailing list