Commit 2409000a authored by Rafael J. Wysocki

Merge branches 'pm-devfreq', 'powercap' and 'pm-docs'

* pm-devfreq:
  PM / devfreq: Get rid of some doc warnings
  PM / devfreq: Fix handling dev_pm_qos_remove_request result
  PM / devfreq: Fix a typo in a comment
  PM / devfreq: Change to DEVFREQ_GOV_UPDATE_INTERVAL event name
  PM / devfreq: Remove unneeded extern keyword
  PM / devfreq: Use constant name of userspace governor

* powercap:
  powercap: idle_inject: Replace zero-length array with flexible-array member

* pm-docs:
  docs: cpu-freq: convert cpufreq-stats.txt to ReST
  docs: cpu-freq: convert cpu-drivers.txt to ReST
  docs: cpu-freq: convert core.txt to ReST
  docs: cpu-freq: convert index.txt to ReST
  docs: cpufreq: fix a broken reference
  Documentation: cpufreq: Move legacy driver documentation
......@@ -11,4 +11,5 @@ Working-State Power Management
intel_idle
cpufreq
intel_pstate
cpufreq_drivers
intel_epb
PowerNow! and Cool'n'Quiet are AMD names for frequency
management capabilities in AMD processors. As the hardware
implementation changes in new generations of the processors,
there is a different cpu-freq driver for each generation.
Note that the drivers will not load on the "wrong" hardware,
so it is safe to try each driver in turn when in doubt as to
which is the correct driver.
Note that the functionality to change frequency (and voltage)
is not available in all processors. The drivers will refuse
to load on processors without this capability. The capability
is detected with the cpuid instruction.
The drivers use BIOS supplied tables to obtain frequency and
voltage information appropriate for a particular platform.
Frequency transitions will be unavailable if the BIOS does
not supply these tables.
6th Generation: powernow-k6
7th Generation: powernow-k7: Athlon, Duron, Geode.
8th Generation: powernow-k8: Athlon, Athlon 64, Opteron, Sempron.
Documentation on this functionality in 8th generation processors
is available in the "BIOS and Kernel Developer's Guide", publication
26094, in chapter 9, available for download from www.amd.com.
BIOS supplied data, for powernow-k7 and for powernow-k8, may be
from either the PSB table or from ACPI objects. The ACPI support
is only available if the kernel config sets CONFIG_ACPI_PROCESSOR.
The powernow-k8 driver will attempt to use ACPI if so configured,
and fall back to PST if that fails.
The powernow-k7 driver will try to use the PSB support first, and
fall back to ACPI if the PSB support fails. A module parameter,
acpi_force, is provided to force ACPI support to be used instead
of PSB support.
CPU frequency and voltage scaling code in the Linux(TM) kernel
.. SPDX-License-Identifier: GPL-2.0
=============================================================
General description of the CPUFreq core and CPUFreq notifiers
=============================================================
L i n u x C P U F r e q
Authors:
- Dominik Brodowski <linux@brodo.de>
- David Kimdon <dwhedon@debian.org>
- Rafael J. Wysocki <rafael.j.wysocki@intel.com>
- Viresh Kumar <viresh.kumar@linaro.org>
C P U F r e q C o r e
.. Contents:
Dominik Brodowski <linux@brodo.de>
David Kimdon <dwhedon@debian.org>
Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Viresh Kumar <viresh.kumar@linaro.org>
Clock scaling allows you to change the clock speed of the CPUs on the
fly. This is a nice method to save battery power, because the lower
the clock speed, the less power the CPU consumes.
Contents:
---------
1. CPUFreq core and interfaces
2. CPUFreq notifiers
3. CPUFreq Table Generation with Operating Performance Point (OPP)
1. CPUFreq core and interfaces
2. CPUFreq notifiers
3. CPUFreq Table Generation with Operating Performance Point (OPP)
1. General Information
=======================
======================
The CPUFreq core code is located in drivers/cpufreq/cpufreq.c. This
cpufreq code offers a standardized interface for the CPUFreq
......@@ -63,7 +55,7 @@ The phase is specified in the second argument to the notifier. The phase is
CPUFREQ_CREATE_POLICY when the policy is first created and it is
CPUFREQ_REMOVE_POLICY when the policy is removed.
The third argument, a void *pointer, points to a struct cpufreq_policy
The third argument, a ``void *pointer``, points to a struct cpufreq_policy
consisting of several values, including min, max (the lower and upper
frequencies (in kHz) of the new policy).
......@@ -80,10 +72,13 @@ CPUFREQ_POSTCHANGE.
The third argument is a struct cpufreq_freqs with the following
values:
cpu - number of the affected CPU
old - old frequency
new - new frequency
flags - flags of the cpufreq driver
===== ===========================
cpu number of the affected CPU
old old frequency
new new frequency
flags flags of the cpufreq driver
===== ===========================
3. CPUFreq Table Generation with Operating Performance Point (OPP)
==================================================================
......@@ -94,9 +89,12 @@ dev_pm_opp_init_cpufreq_table -
the OPP layer's internal information about the available frequencies
into a format readily providable to cpufreq.
WARNING: Do not use this function in interrupt context.
.. Warning::
Do not use this function in interrupt context.
Example::
Example:
soc_pm_init()
{
/* Do things */
......@@ -106,7 +104,10 @@ dev_pm_opp_init_cpufreq_table -
/* Do other things */
}
NOTE: This function is available only if CONFIG_CPU_FREQ is enabled in
addition to CONFIG_PM_OPP.
.. note::
This function is available only if CONFIG_CPU_FREQ is enabled in
addition to CONFIG_PM_OPP.
dev_pm_opp_free_cpufreq_table - Free up the table allocated by dev_pm_opp_init_cpufreq_table
dev_pm_opp_free_cpufreq_table
Free up the table allocated by dev_pm_opp_init_cpufreq_table
CPU frequency and voltage scaling code in the Linux(TM) kernel
.. SPDX-License-Identifier: GPL-2.0
===============================================
How to Implement a new CPUFreq Processor Driver
===============================================
L i n u x C P U F r e q
Authors:
C P U D r i v e r s
- information for developers -
- Dominik Brodowski <linux@brodo.de>
- Rafael J. Wysocki <rafael.j.wysocki@intel.com>
- Viresh Kumar <viresh.kumar@linaro.org>
.. Contents
Dominik Brodowski <linux@brodo.de>
Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Viresh Kumar <viresh.kumar@linaro.org>
Clock scaling allows you to change the clock speed of the CPUs on the
fly. This is a nice method to save battery power, because the lower
the clock speed, the less power the CPU consumes.
Contents:
---------
1. What To Do?
1.1 Initialization
1.2 Per-CPU Initialization
1.3 verify
1.4 target/target_index or setpolicy?
1.5 target/target_index
1.6 setpolicy
1.7 get_intermediate and target_intermediate
2. Frequency Table Helpers
1. What To Do?
1.1 Initialization
1.2 Per-CPU Initialization
1.3 verify
1.4 target/target_index or setpolicy?
1.5 target/target_index
1.6 setpolicy
1.7 get_intermediate and target_intermediate
2. Frequency Table Helpers
......@@ -49,7 +41,7 @@ function check whether this kernel runs on the right CPU and the right
chipset. If so, register a struct cpufreq_driver with the CPUfreq core
using cpufreq_register_driver()
What shall this struct cpufreq_driver contain?
What shall this struct cpufreq_driver contain?
.name - The name of this driver.
......@@ -108,37 +100,42 @@ Whenever a new CPU is registered with the device model, or after the
cpufreq driver registers itself, the per-policy initialization function
cpufreq_driver.init is called if no cpufreq policy existed for the CPU.
Note that the .init() and .exit() routines are called only once for the
policy and not for each CPU managed by the policy. It takes a struct
cpufreq_policy *policy as argument. What to do now?
policy and not for each CPU managed by the policy. It takes a ``struct
cpufreq_policy *policy`` as argument. What to do now?
If necessary, activate the CPUfreq support on your CPU.
Then, the driver must fill in the following values:
policy->cpuinfo.min_freq _and_
policy->cpuinfo.max_freq - the minimum and maximum frequency
(in kHz) which is supported by
this CPU
policy->cpuinfo.transition_latency the time it takes on this CPU to
switch between two frequencies in
nanoseconds (if appropriate, else
specify CPUFREQ_ETERNAL)
policy->cur The current operating frequency of
this CPU (if appropriate)
policy->min,
policy->max,
policy->policy and, if necessary,
policy->governor must contain the "default policy" for
this CPU. A few moments later,
cpufreq_driver.verify and either
cpufreq_driver.setpolicy or
cpufreq_driver.target/target_index is called
with these values.
policy->cpus Update this with the masks of the
(online + offline) CPUs that do DVFS
along with this CPU (i.e. that share
clock/voltage rails with it).
+-----------------------------------+--------------------------------------+
|policy->cpuinfo.min_freq _and_ | |
|policy->cpuinfo.max_freq | the minimum and maximum frequency |
| | (in kHz) which is supported by |
| | this CPU |
+-----------------------------------+--------------------------------------+
|policy->cpuinfo.transition_latency | the time it takes on this CPU to |
| | switch between two frequencies in |
| | nanoseconds (if appropriate, else |
| | specify CPUFREQ_ETERNAL) |
+-----------------------------------+--------------------------------------+
|policy->cur | The current operating frequency of |
| | this CPU (if appropriate) |
+-----------------------------------+--------------------------------------+
|policy->min, | |
|policy->max, | |
|policy->policy and, if necessary, | |
|policy->governor | must contain the "default policy" for|
| | this CPU. A few moments later, |
| | cpufreq_driver.verify and either |
| | cpufreq_driver.setpolicy or |
| | cpufreq_driver.target/target_index is|
| | called with these values. |
+-----------------------------------+--------------------------------------+
|policy->cpus | Update this with the masks of the |
| | (online + offline) CPUs that do DVFS |
| | along with this CPU (i.e. that share|
| | clock/voltage rails with it). |
+-----------------------------------+--------------------------------------+
For setting some of these values (cpuinfo.min[max]_freq, policy->min[max]), the
frequency table helpers might be helpful. See the section 2 for more information
......@@ -151,8 +148,8 @@ on them.
When the user decides a new policy (consisting of
"policy,governor,min,max") shall be set, this policy must be validated
so that incompatible values can be corrected. For verifying these
values cpufreq_verify_within_limits(struct cpufreq_policy *policy,
unsigned int min_freq, unsigned int max_freq) function might be helpful.
values cpufreq_verify_within_limits(``struct cpufreq_policy *policy``,
``unsigned int min_freq``, ``unsigned int max_freq``) function might be helpful.
See section 2 for details on frequency table helpers.
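As a rough userspace model of the clamping that a helper such as cpufreq_verify_within_limits() performs, the sketch below may help. It is not the kernel source: `struct policy_limits` and `verify_within_limits()` are hypothetical stand-ins for `struct cpufreq_policy` and the real helper, and the exact order of the checks in the kernel is assumed rather than quoted.

```c
/* Hypothetical stand-in for the min/max fields of struct cpufreq_policy. */
struct policy_limits {
	unsigned int min;	/* requested lower limit, in kHz */
	unsigned int max;	/* requested upper limit, in kHz */
};

/*
 * Clamp the requested limits into [min_freq, max_freq] and make sure
 * that min never ends up above max.
 */
void verify_within_limits(struct policy_limits *p,
			  unsigned int min_freq, unsigned int max_freq)
{
	if (p->min < min_freq)
		p->min = min_freq;
	if (p->max < min_freq)
		p->max = min_freq;
	if (p->min > max_freq)
		p->min = max_freq;
	if (p->max > max_freq)
		p->max = max_freq;
	if (p->min > p->max)
		p->min = p->max;
}
```

With a request of 500000..4000000 kHz and hardware limits of 800000..3600000 kHz, this model leaves the policy at 800000..3600000 kHz.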
You need to make sure that at least one valid frequency (or operating
......@@ -163,7 +160,7 @@ policy->max first, and only if this is no solution, decrease policy->min.
1.4 target or target_index or setpolicy or fast_switch?
-------------------------------------------------------
Most cpufreq drivers or even most cpu frequency scaling algorithms
Most cpufreq drivers or even most cpu frequency scaling algorithms
only allow the CPU frequency to be set to predefined fixed values. For
these, you use the ->target(), ->target_index() or ->fast_switch()
callbacks.
......@@ -175,8 +172,8 @@ limits on their own. These shall use the ->setpolicy() callback.
1.5. target/target_index
------------------------
The target_index call has two arguments: struct cpufreq_policy *policy,
and unsigned int index (into the exposed frequency table).
The target_index call has two arguments: ``struct cpufreq_policy *policy``,
and ``unsigned int`` index (into the exposed frequency table).
The CPUfreq driver must set the new frequency when called here. The
actual frequency must be determined by freq_table[index].frequency.
......@@ -184,9 +181,9 @@ actual frequency must be determined by freq_table[index].frequency.
It should always restore to earlier frequency (i.e. policy->restore_freq) in
case of errors, even if we switched to intermediate frequency earlier.
Deprecated:
Deprecated
----------
The target call has three arguments: struct cpufreq_policy *policy,
The target call has three arguments: ``struct cpufreq_policy *policy``,
unsigned int target_frequency, unsigned int relation.
The CPUfreq driver must set the new frequency when called here. The
......@@ -210,14 +207,14 @@ Not all drivers are expected to implement it, as sleeping from within
this callback isn't allowed. This callback must be highly optimized to
do switching as fast as possible.
This function has two arguments: struct cpufreq_policy *policy and
unsigned int target_frequency.
This function has two arguments: ``struct cpufreq_policy *policy`` and
``unsigned int target_frequency``.
1.7 setpolicy
-------------
The setpolicy call only takes a struct cpufreq_policy *policy as
The setpolicy call only takes a ``struct cpufreq_policy *policy`` as
argument. You need to set the lower limit of the in-processor or
in-chipset dynamic frequency switching to policy->min, the upper limit
to policy->max, and -if supported- select a performance-oriented
......@@ -278,10 +275,10 @@ table.
cpufreq_for_each_valid_entry(pos, table) - iterates over all entries,
excluding CPUFREQ_ENTRY_INVALID frequencies.
Use arguments "pos" - a cpufreq_frequency_table * as a loop cursor and
"table" - the cpufreq_frequency_table * you want to iterate over.
Use arguments "pos" - a ``cpufreq_frequency_table *`` as a loop cursor and
"table" - the ``cpufreq_frequency_table *`` you want to iterate over.
For example:
For example::
struct cpufreq_frequency_table *pos, *driver_freq_table;
......
The cpufreq-nforce2 driver changes the FSB on nVidia nForce2 platforms.
This works better than on other platforms, because the FSB of the CPU
can be controlled independently from the PCI/AGP clock.
The module has two options:
fid: multiplier * 10 (for example 8.5 = 85)
min_fsb: minimum FSB
If not set, fid is calculated from the current CPU speed and the FSB.
min_fsb defaults to FSB at boot time - 50 MHz.
IMPORTANT: The available range is limited downwards!
Also the minimum available FSB can differ, for systems
booting with 200 MHz, 150 should always work.
.. SPDX-License-Identifier: GPL-2.0
CPU frequency and voltage scaling statistics in the Linux(TM) kernel
==========================================
General Description of sysfs CPUFreq Stats
==========================================
information for users
L i n u x c p u f r e q - s t a t s d r i v e r
- information for users -
Author: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
.. Contents
Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Contents
1. Introduction
2. Statistics Provided (with example)
3. Configuring cpufreq-stats
1. Introduction
2. Statistics Provided (with example)
3. Configuring cpufreq-stats
1. Introduction
===============
cpufreq-stats is a driver that provides CPU frequency statistics for each CPU.
These statistics are provided in /sysfs as a bunch of read_only interfaces. This
......@@ -28,8 +30,10 @@ that may be running on your CPU. So, it will work with any cpufreq_driver.
2. Statistics Provided (with example)
=====================================
cpufreq stats provides following statistics (explained in detail below).
- time_in_state
- total_trans
- trans_table
......@@ -39,53 +43,57 @@ All the statistics will be from the time the stats driver has been inserted
statistic is done. Obviously, stats driver will not have any information
about the frequency transitions before the stats driver insertion.
--------------------------------------------------------------------------------
<mysystem>:/sys/devices/system/cpu/cpu0/cpufreq/stats # ls -l
total 0
drwxr-xr-x 2 root root 0 May 14 16:06 .
drwxr-xr-x 3 root root 0 May 14 15:58 ..
--w------- 1 root root 4096 May 14 16:06 reset
-r--r--r-- 1 root root 4096 May 14 16:06 time_in_state
-r--r--r-- 1 root root 4096 May 14 16:06 total_trans
-r--r--r-- 1 root root 4096 May 14 16:06 trans_table
--------------------------------------------------------------------------------
- reset
::
<mysystem>:/sys/devices/system/cpu/cpu0/cpufreq/stats # ls -l
total 0
drwxr-xr-x 2 root root 0 May 14 16:06 .
drwxr-xr-x 3 root root 0 May 14 15:58 ..
--w------- 1 root root 4096 May 14 16:06 reset
-r--r--r-- 1 root root 4096 May 14 16:06 time_in_state
-r--r--r-- 1 root root 4096 May 14 16:06 total_trans
-r--r--r-- 1 root root 4096 May 14 16:06 trans_table
- **reset**
Write-only attribute that can be used to reset the stat counters. This can be
useful for evaluating system behaviour under different governors without the
need for a reboot.
- time_in_state
- **time_in_state**
This gives the amount of time spent in each of the frequencies supported by
this CPU. The cat output will have "<frequency> <time>" pair in each line, which
will mean this CPU spent <time> usertime units of time at <frequency>. Output
will have one line for each of the supported frequencies. usertime units here
will have one line for each of the supported frequencies. usertime units here
is 10mS (similar to other time exported in /proc).
--------------------------------------------------------------------------------
<mysystem>:/sys/devices/system/cpu/cpu0/cpufreq/stats # cat time_in_state
3600000 2089
3400000 136
3200000 34
3000000 67
2800000 172488
--------------------------------------------------------------------------------
::
<mysystem>:/sys/devices/system/cpu/cpu0/cpufreq/stats # cat time_in_state
3600000 2089
3400000 136
3200000 34
3000000 67
2800000 172488
- total_trans
This gives the total number of frequency transitions on this CPU. The cat
- **total_trans**
This gives the total number of frequency transitions on this CPU. The cat
output will have a single count which is the total number of frequency
transitions.
--------------------------------------------------------------------------------
<mysystem>:/sys/devices/system/cpu/cpu0/cpufreq/stats # cat total_trans
20
--------------------------------------------------------------------------------
::
<mysystem>:/sys/devices/system/cpu/cpu0/cpufreq/stats # cat total_trans
20
- **trans_table**
- trans_table
This will give a fine grained information about all the CPU frequency
transitions. The cat output here is a two dimensional matrix, where an entry
<i,j> (row i, column j) represents the count of number of transitions from
<i,j> (row i, column j) represents the count of number of transitions from
Freq_i to Freq_j. Freq_i rows and Freq_j columns follow the sorting order in
which the driver has provided the frequency table initially to the cpufreq core
and so can be sorted (ascending or descending) or unsorted. The output here
......@@ -95,26 +103,27 @@ readability.
If the transition table is bigger than PAGE_SIZE, reading this will
return an -EFBIG error.
--------------------------------------------------------------------------------
<mysystem>:/sys/devices/system/cpu/cpu0/cpufreq/stats # cat trans_table
From : To
: 3600000 3400000 3200000 3000000 2800000
3600000: 0 5 0 0 0
3400000: 4 0 2 0 0
3200000: 0 1 0 2 0
3000000: 0 0 1 0 3
2800000: 0 0 0 2 0
--------------------------------------------------------------------------------
::
<mysystem>:/sys/devices/system/cpu/cpu0/cpufreq/stats # cat trans_table
From : To
: 3600000 3400000 3200000 3000000 2800000
3600000: 0 5 0 0 0
3400000: 4 0 2 0 0
3200000: 0 1 0 2 0
3000000: 0 0 1 0 3
2800000: 0 0 0 2 0
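To make the time_in_state format described above concrete, here is a minimal userspace sketch that parses one "<frequency> <time>" line and converts the time from 10 ms usertime units to milliseconds. The function name `parse_time_in_state_line()` and the `USERTIME_UNIT_MS` macro are illustrative assumptions, not part of any kernel or library interface.

```c
#include <stdio.h>

/* One usertime unit is 10 ms, as stated in the time_in_state description. */
#define USERTIME_UNIT_MS 10ULL

/*
 * Parse a single "<frequency> <time>" line.
 * Returns 0 on success and -1 if the line does not match the format.
 */
int parse_time_in_state_line(const char *line, unsigned long *freq,
			     unsigned long long *time_ms)
{
	unsigned long long units;

	if (sscanf(line, "%lu %llu", freq, &units) != 2)
		return -1;

	*time_ms = units * USERTIME_UNIT_MS;
	return 0;
}
```

For the sample line "2800000 172488", this yields a frequency of 2800000 and 1724880 ms, that is roughly 29 minutes spent at that frequency.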
3. Configuring cpufreq-stats
============================
To configure cpufreq-stats in your kernel::
To configure cpufreq-stats in your kernel
Config Main Menu
Power management options (ACPI, APM) --->
CPU Frequency scaling --->
[*] CPU Frequency scaling
[*] CPU frequency translation statistics
Config Main Menu
Power management options (ACPI, APM) --->
CPU Frequency scaling --->
[*] CPU Frequency scaling
[*] CPU frequency translation statistics
"CPU Frequency scaling" (CONFIG_CPU_FREQ) should be enabled to configure
......
CPU frequency and voltage scaling code in the Linux(TM) kernel
L i n u x C P U F r e q
Dominik Brodowski <linux@brodo.de>
.. SPDX-License-Identifier: GPL-2.0
==============================================================================
Linux CPUFreq - CPU frequency and voltage scaling code in the Linux(TM) kernel
==============================================================================
Author: Dominik Brodowski <linux@brodo.de>
Clock scaling allows you to change the clock speed of the CPUs on the
fly. This is a nice method to save battery power, because the lower
the clock speed, the less power the CPU consumes.
Documents in this directory:
----------------------------
amd-powernow.txt - AMD powernow driver specific file.
core.txt - General description of the CPUFreq core and
of CPUFreq notifiers.
cpu-drivers.txt - How to implement a new cpufreq processor driver.
cpufreq-nforce2.txt - nVidia nForce2 platform specific file.
cpufreq-stats.txt - General description of sysfs cpufreq stats.
fly. This is a nice method to save battery power, because the lower
the clock speed, the less power the CPU consumes.
index.txt - File index, Mailing list and Links (this document)
pcc-cpufreq.txt - PCC cpufreq driver specific file.
.. toctree::
:maxdepth: 1
core
cpu-drivers
cpufreq-stats
Mailing List
------------
......
/*
* pcc-cpufreq.txt - PCC interface documentation
*
* Copyright (C) 2009 Red Hat, Matthew Garrett <mjg@redhat.com>
* Copyright (C) 2009 Hewlett-Packard Development Company, L.P.
* Nagananda Chumbalkar <nagananda.chumbalkar@hp.com>
*
* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; version 2 of the License.
*
* This program is distributed in the hope that it will be useful, but
* WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or NON
* INFRINGEMENT. See the GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License along
* with this program; if not, write to the Free Software Foundation, Inc.,
* 675 Mass Ave, Cambridge, MA 02139, USA.
*
* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
*/
Processor Clocking Control Driver
---------------------------------
Contents:
---------
1. Introduction
1.1 PCC interface
1.1.1 Get Average Frequency
1.1.2 Set Desired Frequency
1.2 Platforms affected
2. Driver and /sys details
2.1 scaling_available_frequencies
2.2 cpuinfo_transition_latency
2.3 cpuinfo_cur_freq
2.4 related_cpus
3. Caveats
1. Introduction:
----------------
Processor Clocking Control (PCC) is an interface between the platform
firmware and OSPM. It is a mechanism for coordinating processor
performance (ie: frequency) between the platform firmware and the OS.
The PCC driver (pcc-cpufreq) allows OSPM to take advantage of the PCC
interface.
OS utilizes the PCC interface to inform platform firmware what frequency the
OS wants for a logical processor. The platform firmware attempts to achieve
the requested frequency. If the request for the target frequency could not be
satisfied by platform firmware, then it usually means that power budget
conditions are in place, and "power capping" is taking place.
1.1 PCC interface:
------------------
The complete PCC specification is available here:
http://www.acpica.org/download/Processor-Clocking-Control-v1p0.pdf
PCC relies on a shared memory region that provides a channel for communication
between the OS and platform firmware. PCC also implements a "doorbell" that
is used by the OS to inform the platform firmware that a command has been
sent.
The ACPI PCCH() method is used to discover the location of the PCC shared
memory region. The shared memory region header contains the "command" and
"status" interface. PCCH() also contains details on how to access the platform
doorbell.
The following commands are supported by the PCC interface:
* Get Average Frequency
* Set Desired Frequency
The ACPI PCCP() method is implemented for each logical processor and is
used to discover the offsets for the input and output buffers in the shared
memory region.
When PCC mode is enabled, the platform will not expose processor performance
or throttle states (_PSS, _TSS and related ACPI objects) to OSPM. Therefore,
the native P-state driver (such as acpi-cpufreq for Intel, powernow-k8 for
AMD) will not load.
However, OSPM remains in control of policy. The governor (eg: "ondemand")
computes the required performance for each processor based on server workload.
The PCC driver fills in the command interface, and the input buffer and
communicates the request to the platform firmware. The platform firmware is
responsible for delivering the requested performance.
Each PCC command is "global" in scope and can affect all the logical CPUs in
the system. Therefore, PCC is capable of performing "group" updates. With PCC
the OS is capable of getting/setting the frequency of all the logical CPUs in
the system with a single call to the BIOS.
1.1.1 Get Average Frequency:
----------------------------
This command is used by the OSPM to query the running frequency of the
processor since the last time this command was completed. The output buffer
indicates the average unhalted frequency of the logical processor expressed as
a percentage of the nominal (ie: maximum) CPU frequency. The output buffer
also signifies if the CPU frequency is limited by a power budget condition.
1.1.2 Set Desired Frequency:
----------------------------
This command is used by the OSPM to communicate to the platform firmware the
desired frequency for a logical processor. The output buffer is currently
ignored by OSPM. The next invocation of "Get Average Frequency" will inform
OSPM if the desired frequency was achieved or not.
1.2 Platforms affected:
-----------------------
The PCC driver will load on any system where the platform firmware:
* supports the PCC interface, and the associated PCCH() and PCCP() methods
* assumes responsibility for managing the hardware clocking controls in order
to deliver the requested processor performance
Currently, certain HP ProLiant platforms implement the PCC interface. On those
platforms PCC is the "default" choice.
However, it is possible to disable this interface via a BIOS setting. In
such an instance, as is also the case on platforms where the PCC interface
is not implemented, the PCC driver will fail to load silently.
2. Driver and /sys details:
---------------------------
When the driver loads, it merely prints the lowest and the highest CPU
frequencies supported by the platform firmware.
The PCC driver loads with a message such as:
pcc-cpufreq: (v1.00.00) driver loaded with frequency limits: 1600 MHz, 2933
MHz
This means that the OSPM can request the CPU to run at any frequency in
between the limits (1600 MHz, and 2933 MHz) specified in the message.
Internally, there is no need for the driver to convert the "target" frequency
to a corresponding P-state.
The VERSION number for the driver will be of the format v.xy.ab.
eg: 1.00.02
----- --
| |
| -- this will increase with bug fixes/enhancements to the driver
|-- this is the version of the PCC specification the driver adheres to
The following is a brief discussion on some of the fields exported via the
/sys filesystem and how their values are affected by the PCC driver:
2.1 scaling_available_frequencies:
----------------------------------
scaling_available_frequencies is not created in /sys. No intermediate
frequencies need to be listed because the BIOS will try to achieve any
frequency, within limits, requested by the governor. A frequency does not have
to be strictly associated with a P-state.
2.2 cpuinfo_transition_latency:
-------------------------------
The cpuinfo_transition_latency field is 0. The PCC specification does
not include a field to expose this value currently.
2.3 cpuinfo_cur_freq:
---------------------
A) Often cpuinfo_cur_freq will show a value different than what is declared
in the scaling_available_frequencies or scaling_cur_freq, or scaling_max_freq.
This is due to "turbo boost" available on recent Intel processors. If certain
conditions are met the BIOS can achieve a slightly higher speed than requested
by OSPM. An example:
scaling_cur_freq : 2933000
cpuinfo_cur_freq : 3196000
B) There is a round-off error associated with the cpuinfo_cur_freq value.
Since the driver obtains the current frequency as a "percentage" (%) of the
nominal frequency from the BIOS, sometimes, the values displayed by
scaling_cur_freq and cpuinfo_cur_freq may not match. An example:
scaling_cur_freq : 1600000
cpuinfo_cur_freq : 1583000
In this example, the nominal frequency is 2933 MHz. The driver obtains the
current frequency, cpuinfo_cur_freq, as 54% of the nominal frequency:
54% of 2933 MHz = 1583 MHz
Nominal frequency is the maximum frequency of the processor, and it usually
corresponds to the frequency of the P0 P-state.
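The round-off in example B can be reproduced with integer arithmetic. The sketch below assumes the conversion is done in MHz with truncating division and then scaled to kHz, which matches the numbers shown above; `pcc_percent_to_khz()` is a hypothetical name, not a function of the driver.

```c
/*
 * Convert a percentage of the nominal frequency (in MHz) into kHz.
 * The integer division truncates, which is the source of the round-off.
 */
unsigned int pcc_percent_to_khz(unsigned int nominal_mhz, unsigned int percent)
{
	return (nominal_mhz * percent / 100) * 1000;
}
```

With a nominal frequency of 2933 MHz and a reported 54%, the product 158382 is truncated to 1583 MHz, giving 1583000 kHz rather than the requested 1600000 kHz.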
2.4 related_cpus:
-----------------
The related_cpus field is identical to affected_cpus.
affected_cpus : 4
related_cpus : 4
Currently, the PCC driver does not evaluate _PSD. The platforms that support
PCC do not implement SW_ALL. So OSPM doesn't need to perform any coordination
to ensure that the same frequency is requested of all dependent CPUs.
3. Caveats:
-----------
The "cpufreq_stats" module in its present form cannot be loaded and
expected to work with the PCC driver. Since the "cpufreq_stats" module
provides information wrt each P-state, it is not applicable to the PCC driver.
......@@ -99,6 +99,7 @@ needed).
accounting/index
block/index
cdrom/index
cpu-freq/index
ide/index
fb/index
fpga/index
......
......@@ -25,7 +25,7 @@ config X86_PCC_CPUFREQ
This driver adds support for the PCC interface.
For details, take a look at:
<file:Documentation/cpu-freq/pcc-cpufreq.txt>.
<file:Documentation/admin-guide/pm/cpufreq_drivers.rst>.
To compile this driver as a module, choose M here: the
module will be called pcc-cpufreq.
......
......@@ -550,14 +550,14 @@ void devfreq_monitor_resume(struct devfreq *devfreq)
EXPORT_SYMBOL(devfreq_monitor_resume);
/**
* devfreq_interval_update() - Update device devfreq monitoring interval
* devfreq_update_interval() - Update device devfreq monitoring interval
* @devfreq: the devfreq instance.
* @delay: new polling interval to be set.
*
* Helper function to set new load monitoring polling interval. Function
* to be called from governor in response to DEVFREQ_GOV_INTERVAL event.
* to be called from governor in response to DEVFREQ_GOV_UPDATE_INTERVAL event.
*/
void devfreq_interval_update(struct devfreq *devfreq, unsigned int *delay)
void devfreq_update_interval(struct devfreq *devfreq, unsigned int *delay)
{
unsigned int cur_delay = devfreq->profile->polling_ms;
unsigned int new_delay = *delay;
......@@ -597,7 +597,7 @@ void devfreq_interval_update(struct devfreq *devfreq, unsigned int *delay)
out:
mutex_unlock(&devfreq->lock);
}
EXPORT_SYMBOL(devfreq_interval_update);
EXPORT_SYMBOL(devfreq_update_interval);
/**
* devfreq_notifier_call() - Notify that the device frequency requirements
......@@ -705,13 +705,13 @@ static void devfreq_dev_release(struct device *dev)
if (dev_pm_qos_request_active(&devfreq->user_max_freq_req)) {
err = dev_pm_qos_remove_request(&devfreq->user_max_freq_req);
if (err)
if (err < 0)
dev_warn(dev->parent,
"Failed to remove max_freq request: %d\n", err);
}
if (dev_pm_qos_request_active(&devfreq->user_min_freq_req)) {
err = dev_pm_qos_remove_request(&devfreq->user_min_freq_req);
if (err)
if (err < 0)
dev_warn(dev->parent,
"Failed to remove min_freq request: %d\n", err);
}
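The switch from `if (err)` to `if (err < 0)` in this hunk matters because, as far as the PM QoS kernel-doc describes it, dev_pm_qos_remove_request() may return a positive value when the aggregated constraint changed, which is not a failure. A minimal sketch of the intended check, using a hypothetical helper name:

```c
#include <stdbool.h>

/*
 * Hypothetical helper: treat only negative return values as errors.
 * Zero and positive values both indicate that the removal succeeded.
 */
bool qos_remove_failed(int ret)
{
	return ret < 0;
}
```

With the old `if (err)` test, a successful removal that returned 1 would have produced a spurious "Failed to remove" warning.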
......@@ -1424,7 +1424,7 @@ static ssize_t polling_interval_store(struct device *dev,
if (ret != 1)
return -EINVAL;
df->governor->event_handler(df, DEVFREQ_GOV_INTERVAL, &value);
df->governor->event_handler(df, DEVFREQ_GOV_UPDATE_INTERVAL, &value);
ret = count;
return ret;
......
......@@ -18,7 +18,7 @@
/* Devfreq events */
#define DEVFREQ_GOV_START 0x1
#define DEVFREQ_GOV_STOP 0x2
#define DEVFREQ_GOV_INTERVAL 0x3
#define DEVFREQ_GOV_UPDATE_INTERVAL 0x3
#define DEVFREQ_GOV_SUSPEND 0x4
#define DEVFREQ_GOV_RESUME 0x5
......@@ -30,7 +30,7 @@
* @node: list node - contains registered devfreq governors
* @name: Governor's name
* @immutable: Immutable flag for governor. If the value is 1,
* this govenror is never changeable to other governor.
* this governor is never changeable to other governor.
* @interrupt_driven: Devfreq core won't schedule polling work for this
* governor if value is set to 1.
* @get_target_freq: Returns desired operating frequency for the device.
......@@ -57,17 +57,16 @@ struct devfreq_governor {
unsigned int event, void *data);
};
extern void devfreq_monitor_start(struct devfreq *devfreq);
extern void devfreq_monitor_stop(struct devfreq *devfreq);
extern void devfreq_monitor_suspend(struct devfreq *devfreq);
extern void devfreq_monitor_resume(struct devfreq *devfreq);
extern void devfreq_interval_update(struct devfreq *devfreq,
unsigned int *delay);
void devfreq_monitor_start(struct devfreq *devfreq);
void devfreq_monitor_stop(struct devfreq *devfreq);
void devfreq_monitor_suspend(struct devfreq *devfreq);
void devfreq_monitor_resume(struct devfreq *devfreq);
void devfreq_update_interval(struct devfreq *devfreq, unsigned int *delay);
extern int devfreq_add_governor(struct devfreq_governor *governor);
extern int devfreq_remove_governor(struct devfreq_governor *governor);
int devfreq_add_governor(struct devfreq_governor *governor);
int devfreq_remove_governor(struct devfreq_governor *governor);
extern int devfreq_update_status(struct devfreq *devfreq, unsigned long freq);
int devfreq_update_status(struct devfreq *devfreq, unsigned long freq);
static inline int devfreq_update_stats(struct devfreq *df)
{
......
......@@ -96,8 +96,8 @@ static int devfreq_simple_ondemand_handler(struct devfreq *devfreq,
devfreq_monitor_stop(devfreq);
break;
case DEVFREQ_GOV_INTERVAL:
devfreq_interval_update(devfreq, (unsigned int *)data);
case DEVFREQ_GOV_UPDATE_INTERVAL:
devfreq_update_interval(devfreq, (unsigned int *)data);
break;
case DEVFREQ_GOV_SUSPEND:
......
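
Review note on the rename: the event value stays 0x3, so only the spelling changes for governors. A minimal user-space sketch of how an event handler might dispatch on the renamed event is shown here. `struct mock_devfreq` and `mock_event_handler()` are hypothetical stand-ins for illustration; the real handlers call devfreq_update_interval() under the devfreq lock rather than writing the field directly.

```c
#include <assert.h>
#include <stddef.h>

/* Event codes as they appear in the governor header hunk of this diff. */
#define DEVFREQ_GOV_START		0x1
#define DEVFREQ_GOV_STOP		0x2
#define DEVFREQ_GOV_UPDATE_INTERVAL	0x3

/* Hypothetical stand-in for the parts of struct devfreq used here. */
struct mock_devfreq {
	unsigned int polling_ms;
	int monitoring;
};

/* Hypothetical handler: dispatches on the event code like a governor would. */
static int mock_event_handler(struct mock_devfreq *df, unsigned int event,
			      void *data)
{
	switch (event) {
	case DEVFREQ_GOV_START:
		df->monitoring = 1;
		break;
	case DEVFREQ_GOV_STOP:
		df->monitoring = 0;
		break;
	case DEVFREQ_GOV_UPDATE_INTERVAL:
		/* The caller passes a pointer to the new interval as void *. */
		assert(data != NULL);
		df->polling_ms = *(unsigned int *)data;
		break;
	default:
		break;
	}
	return 0;
}
```

A caller would pass the new interval by address, mirroring the `&value` argument in polling_interval_store().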
......@@ -131,7 +131,7 @@ static int devfreq_userspace_handler(struct devfreq *devfreq,
}
static struct devfreq_governor devfreq_userspace = {
.name = "userspace",
.name = DEVFREQ_GOV_USERSPACE,
.get_target_freq = devfreq_userspace_func,
.event_handler = devfreq_userspace_handler,
};
......
......@@ -734,7 +734,7 @@ static int tegra_governor_event_handler(struct devfreq *devfreq,
devfreq_monitor_stop(devfreq);
break;
case DEVFREQ_GOV_INTERVAL:
case DEVFREQ_GOV_UPDATE_INTERVAL:
/*
* ACTMON hardware supports up to 256 milliseconds for the
* sampling period.
......@@ -745,7 +745,7 @@ static int tegra_governor_event_handler(struct devfreq *devfreq,
}
tegra_actmon_pause(tegra);
devfreq_interval_update(devfreq, new_delay);
devfreq_update_interval(devfreq, new_delay);
ret = tegra_actmon_resume(tegra);
break;
......
......@@ -67,7 +67,7 @@ struct idle_inject_device {
struct hrtimer timer;
unsigned int idle_duration_us;
unsigned int run_duration_us;
unsigned long int cpumask[0];
unsigned long cpumask[];
};
static DEFINE_PER_CPU(struct idle_inject_thread, idle_inject_thread);
......
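
Review note on the `cpumask[]` change: a zero-length array (`[0]`) is a GNU extension, while a flexible-array member (`[]`) is standard C99 and lets the compiler reject misuse such as applying sizeof to the member or placing it anywhere but last. The sketch below shows the usual allocation pattern for such a struct in plain user-space C. `struct mask_holder` and its helpers are hypothetical and are not taken from idle_inject.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical miniature of a struct ending in a flexible-array member. */
struct mask_holder {
	unsigned int nbits;
	unsigned long bits[];	/* C99 flexible-array member, must be last */
};

#define BITS_PER_ULONG (8 * sizeof(unsigned long))

/* Allocate the header plus enough trailing longs to hold nbits bits. */
static struct mask_holder *mask_holder_alloc(unsigned int nbits)
{
	size_t nlongs = (nbits + BITS_PER_ULONG - 1) / BITS_PER_ULONG;
	struct mask_holder *m;

	m = calloc(1, sizeof(*m) + nlongs * sizeof(m->bits[0]));
	if (!m)
		return NULL;
	m->nbits = nbits;
	return m;
}

/* Set one bit in the trailing storage. */
static void mask_holder_set(struct mask_holder *m, unsigned int bit)
{
	assert(bit < m->nbits);
	m->bits[bit / BITS_PER_ULONG] |= 1UL << (bit % BITS_PER_ULONG);
}

/* Return 1 if the bit is set, 0 otherwise. */
static int mask_holder_test(const struct mask_holder *m, unsigned int bit)
{
	assert(bit < m->nbits);
	return (int)((m->bits[bit / BITS_PER_ULONG] >> (bit % BITS_PER_ULONG)) & 1UL);
}
```

The caller releases the whole object with a single free(), since the header and the trailing array share one allocation.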
......@@ -158,7 +158,7 @@ struct devfreq_stats {
* functions except for the context of callbacks defined in struct
* devfreq_governor, the governor should protect its access with the
* struct mutex lock in struct devfreq. A governor may use this mutex
* to protect its own private data in void *data as well.
* to protect its own private data in ``void *data`` as well.
*/
struct devfreq {
struct list_head node;
......@@ -201,24 +201,23 @@ struct devfreq_freqs {
};
#if defined(CONFIG_PM_DEVFREQ)
extern struct devfreq *devfreq_add_device(struct device *dev,
struct devfreq_dev_profile *profile,
const char *governor_name,
void *data);
extern int devfreq_remove_device(struct devfreq *devfreq);
extern struct devfreq *devm_devfreq_add_device(struct device *dev,
struct devfreq_dev_profile *profile,
const char *governor_name,
void *data);
extern void devm_devfreq_remove_device(struct device *dev,
struct devfreq *devfreq);
struct devfreq *devfreq_add_device(struct device *dev,
struct devfreq_dev_profile *profile,
const char *governor_name,
void *data);
int devfreq_remove_device(struct devfreq *devfreq);
struct devfreq *devm_devfreq_add_device(struct device *dev,
struct devfreq_dev_profile *profile,
const char *governor_name,
void *data);
void devm_devfreq_remove_device(struct device *dev, struct devfreq *devfreq);
/* Supposed to be called by PM callbacks */
extern int devfreq_suspend_device(struct devfreq *devfreq);
extern int devfreq_resume_device(struct devfreq *devfreq);
int devfreq_suspend_device(struct devfreq *devfreq);
int devfreq_resume_device(struct devfreq *devfreq);
extern void devfreq_suspend(void);
extern void devfreq_resume(void);
void devfreq_suspend(void);
void devfreq_resume(void);
/**
* update_devfreq() - Reevaluate the device and configure frequency
......@@ -226,39 +225,38 @@ extern void devfreq_resume(void);
*
* Note: devfreq->lock must be held
*/
extern int update_devfreq(struct devfreq *devfreq);
int update_devfreq(struct devfreq *devfreq);
/* Helper functions for devfreq user device driver with OPP. */
extern struct dev_pm_opp *devfreq_recommended_opp(struct device *dev,
unsigned long *freq, u32 flags);
extern int devfreq_register_opp_notifier(struct device *dev,
struct devfreq *devfreq);
extern int devfreq_unregister_opp_notifier(struct device *dev,
struct devfreq *devfreq);
extern int devm_devfreq_register_opp_notifier(struct device *dev,
struct devfreq *devfreq);
extern void devm_devfreq_unregister_opp_notifier(struct device *dev,
struct devfreq *devfreq);
extern int devfreq_register_notifier(struct devfreq *devfreq,
struct notifier_block *nb,
unsigned int list);
extern int devfreq_unregister_notifier(struct devfreq *devfreq,
struct notifier_block *nb,
unsigned int list);
extern int devm_devfreq_register_notifier(struct device *dev,
struct dev_pm_opp *devfreq_recommended_opp(struct device *dev,
unsigned long *freq, u32 flags);
int devfreq_register_opp_notifier(struct device *dev,
struct devfreq *devfreq);
int devfreq_unregister_opp_notifier(struct device *dev,
struct devfreq *devfreq);
int devm_devfreq_register_opp_notifier(struct device *dev,
struct devfreq *devfreq);
void devm_devfreq_unregister_opp_notifier(struct device *dev,
struct devfreq *devfreq);
int devfreq_register_notifier(struct devfreq *devfreq,
struct notifier_block *nb,
unsigned int list);
int devfreq_unregister_notifier(struct devfreq *devfreq,
struct notifier_block *nb,
unsigned int list);
int devm_devfreq_register_notifier(struct device *dev,
struct devfreq *devfreq,
struct notifier_block *nb,
unsigned int list);
extern void devm_devfreq_unregister_notifier(struct device *dev,
void devm_devfreq_unregister_notifier(struct device *dev,
struct devfreq *devfreq,
struct notifier_block *nb,
unsigned int list);
extern struct devfreq *devfreq_get_devfreq_by_phandle(struct device *dev,
int index);
struct devfreq *devfreq_get_devfreq_by_phandle(struct device *dev, int index);
#if IS_ENABLED(CONFIG_DEVFREQ_GOV_SIMPLE_ONDEMAND)
/**
* struct devfreq_simple_ondemand_data - void *data fed to struct devfreq
* struct devfreq_simple_ondemand_data - ``void *data`` fed to struct devfreq
* and devfreq_add_device
* @upthreshold: If the load is over this value, the frequency jumps.
* Specify 0 to use the default. Valid value = 0 to 100.
......@@ -278,7 +276,7 @@ struct devfreq_simple_ondemand_data {
#if IS_ENABLED(CONFIG_DEVFREQ_GOV_PASSIVE)
/**
* struct devfreq_passive_data - void *data fed to struct devfreq
* struct devfreq_passive_data - ``void *data`` fed to struct devfreq
* and devfreq_add_device
* @parent: the devfreq instance of parent device.
* @get_target_freq: Optional callback, Returns desired operating frequency
......@@ -311,9 +309,9 @@ struct devfreq_passive_data {
#else /* !CONFIG_PM_DEVFREQ */
static inline struct devfreq *devfreq_add_device(struct device *dev,
struct devfreq_dev_profile *profile,
const char *governor_name,
void *data)
struct devfreq_dev_profile *profile,
const char *governor_name,
void *data)
{
return ERR_PTR(-ENOSYS);
}
......@@ -350,31 +348,31 @@ static inline void devfreq_suspend(void) {}
static inline void devfreq_resume(void) {}
static inline struct dev_pm_opp *devfreq_recommended_opp(struct device *dev,
unsigned long *freq, u32 flags)
unsigned long *freq, u32 flags)
{
return ERR_PTR(-EINVAL);
}
static inline int devfreq_register_opp_notifier(struct device *dev,
struct devfreq *devfreq)
struct devfreq *devfreq)
{
return -EINVAL;
}
static inline int devfreq_unregister_opp_notifier(struct device *dev,
struct devfreq *devfreq)
struct devfreq *devfreq)
{
return -EINVAL;
}
static inline int devm_devfreq_register_opp_notifier(struct device *dev,
struct devfreq *devfreq)
struct devfreq *devfreq)
{
return -EINVAL;
}
static inline void devm_devfreq_unregister_opp_notifier(struct device *dev,
struct devfreq *devfreq)
struct devfreq *devfreq)
{
}
......@@ -393,22 +391,22 @@ static inline int devfreq_unregister_notifier(struct devfreq *devfreq,
}
static inline int devm_devfreq_register_notifier(struct device *dev,
struct devfreq *devfreq,
struct notifier_block *nb,
unsigned int list)
struct devfreq *devfreq,
struct notifier_block *nb,
unsigned int list)
{
return 0;
}
static inline void devm_devfreq_unregister_notifier(struct device *dev,
struct devfreq *devfreq,
struct notifier_block *nb,
unsigned int list)
struct devfreq *devfreq,
struct notifier_block *nb,
unsigned int list)
{
}
static inline struct devfreq *devfreq_get_devfreq_by_phandle(struct device *dev,
int index)
int index)
{
return ERR_PTR(-ENODEV);
}
......