Commit f3db6de5, authored Aug 14, 2020 by Rafael J. Wysocki
Merge branch 'pm-cpufreq'
* pm-cpufreq:
  cpufreq: intel_pstate: Implement passive mode with HWP enabled
Parents: f6235eb1, f6ebbcf0

Showing 4 changed files with 229 additions and 113 deletions:

Documentation/admin-guide/pm/intel_pstate.rst  +43 -46
drivers/cpufreq/cpufreq.c  +2 -4
drivers/cpufreq/intel_pstate.c  +182 -63
include/linux/cpufreq.h  +2 -0

Documentation/admin-guide/pm/intel_pstate.rst
@@ -54,10 +54,13 @@ registered (see `below <status_attr_>`_).
 Operation Modes
 ===============

-``intel_pstate`` can operate in three different modes: in the active mode with
-or without hardware-managed P-states support and in the passive mode.  Which of
-them will be in effect depends on what kernel command line options are used and
-on the capabilities of the processor.
+``intel_pstate`` can operate in two different modes, active or passive.  In the
+active mode, it uses its own internal performance scaling governor algorithm or
+allows the hardware to do preformance scaling by itself, while in the passive
+mode it responds to requests made by a generic ``CPUFreq`` governor implementing
+a certain performance scaling algorithm.  Which of them will be in effect
+depends on what kernel command line options are used and on the capabilities of
+the processor.

 Active Mode
 -----------
@@ -194,10 +197,11 @@ This is the default operation mode of ``intel_pstate`` for processors without
 hardware-managed P-states (HWP) support.  It is always used if the
 ``intel_pstate=passive`` argument is passed to the kernel in the command line
 regardless of whether or not the given processor supports HWP.  [Note that the
-``intel_pstate=no_hwp`` setting implies ``intel_pstate=passive`` if it is used
-without ``intel_pstate=active``.]  Like in the active mode without HWP support,
-in this mode ``intel_pstate`` may refuse to work with processors that are not
-recognized by it.
+``intel_pstate=no_hwp`` setting causes the driver to start in the passive mode
+if it is not combined with ``intel_pstate=active``.]  Like in the active mode
+without HWP support, in this mode ``intel_pstate`` may refuse to work with
+processors that are not recognized by it if HWP is prevented from being enabled
+through the kernel command line.

 If the driver works in this mode, the ``scaling_driver`` policy attribute in
 ``sysfs`` for all ``CPUFreq`` policies contains the string "intel_cpufreq".
@@ -318,10 +322,9 @@ manuals need to be consulted to get to it too.
 For this reason, there is a list of supported processors in ``intel_pstate`` and
 the driver initialization will fail if the detected processor is not in that
-list, unless it supports the `HWP feature <Active Mode_>`_.  [The interface to
-obtain all of the information listed above is the same for all of the processors
-supporting the HWP feature, which is why they all are supported by
-``intel_pstate``.]
+list, unless it supports the HWP feature.  [The interface to obtain all of the
+information listed above is the same for all of the processors supporting the
+HWP feature, which is why ``intel_pstate`` works with all of them.]

 User Space Interface in ``sysfs``
@@ -425,22 +428,16 @@ argument is passed to the kernel in the command line.
 	as well as the per-policy ones) are then reset to their default
 	values, possibly depending on the target operation mode.]

-	That only is supported in some configurations, though (for example, if
-	the `HWP feature is enabled in the processor <Active Mode With HWP_>`_,
-	the operation mode of the driver cannot be changed), and if it is not
-	supported in the current configuration, writes to this attribute will
-	fail with an appropriate error.
-
 ``energy_efficiency``
-	This attribute is only present on platforms, which have CPUs matching
-	Kaby Lake or Coffee Lake desktop CPU model. By default
-	energy efficiency optimizations are disabled on these CPU models in HWP
-	mode by this driver. Enabling energy efficiency may limit maximum
-	operating frequency in both HWP and non HWP mode. In non HWP mode,
-	optimizations are done only in the turbo frequency range. In HWP mode,
-	optimizations are done in the entire frequency range. Setting this
-	attribute to "1" enables energy efficiency optimizations and setting
-	to "0" disables energy efficiency optimizations.
+	This attribute is only present on platforms with CPUs matching the Kaby
+	Lake or Coffee Lake desktop CPU model. By default, energy-efficiency
+	optimizations are disabled on these CPU models if HWP is enabled.
+	Enabling energy-efficiency optimizations may limit maximum operating
+	frequency with or without the HWP feature.  With HWP enabled, the
+	optimizations are done only in the turbo frequency range.  Without it,
+	they are done in the entire available frequency range.  Setting this
+	attribute to "1" enables the energy-efficiency optimizations and setting
+	to "0" disables them.

 Interpretation of Policy Attributes
 -----------------------------------
@@ -484,8 +481,8 @@ Next, the following policy attributes have special meaning if
 	policy for the time interval between the last two invocations of the
 	driver's utilization update callback by the CPU scheduler for that CPU.

-One more policy attribute is present if the `HWP feature is enabled in the
-processor <Active Mode With HWP_>`_:
+One more policy attribute is present if the HWP feature is enabled in the
+processor:

 ``base_frequency``
 	Shows the base frequency of the CPU. Any frequency above this will be
@@ -526,11 +523,11 @@ on the following rules, regardless of the current operation mode of the driver:
  3. The global and per-policy limits can be set independently.

-If the `HWP feature is enabled in the processor <Active Mode With HWP_>`_, the
-resulting effective values are written into its registers whenever the limits
-change in order to request its internal P-state selection logic to always set
-P-states within these limits.  Otherwise, the limits are taken into account by
-scaling governors (in the `passive mode <Passive Mode_>`_) and by the driver
+In the `active mode with the HWP feature enabled <Active Mode With HWP_>`_, the
+resulting effective values are written into hardware registers whenever the
+limits change in order to request its internal P-state selection logic to always
+set P-states within these limits.  Otherwise, the limits are taken into account
+by scaling governors (in the `passive mode <Passive Mode_>`_) and by the driver
 every time before setting a new P-state for a CPU.

 Additionally, if the ``intel_pstate=per_cpu_perf_limits`` command line argument
@@ -541,12 +538,11 @@ at all and the only way to set the limits is by using the policy attributes.
 Energy vs Performance Hints
 ---------------------------

-If ``intel_pstate`` works in the `active mode with the HWP feature enabled
-<Active Mode With HWP_>`_ in the processor, additional attributes are present
-in every ``CPUFreq`` policy directory in ``sysfs``.  They are intended to allow
-user space to help ``intel_pstate`` to adjust the processor's internal P-state
-selection logic by focusing it on performance or on energy-efficiency, or
-somewhere between the two extremes:
+If the hardware-managed P-states (HWP) is enabled in the processor, additional
+attributes, intended to allow user space to help ``intel_pstate`` to adjust the
+processor's internal P-state selection logic by focusing it on performance or on
+energy-efficiency, or somewhere between the two extremes, are present in every
+``CPUFreq`` policy directory in ``sysfs``.  They are:

 ``energy_performance_preference``
 	Current value of the energy vs performance hint for the given policy
@@ -650,12 +646,14 @@ of them have to be prepended with the ``intel_pstate=`` prefix.
 	Do not register ``intel_pstate`` as the scaling driver even if the
 	processor is supported by it.

+``active``
+	Register ``intel_pstate`` in the `active mode <Active Mode_>`_ to start
+	with.
+
 ``passive``
 	Register ``intel_pstate`` in the `passive mode <Passive Mode_>`_ to
 	start with.

-	This option implies the ``no_hwp`` one described below.
-
 ``force``
 	Register ``intel_pstate`` as the scaling driver instead of
 	``acpi-cpufreq`` even if the latter is preferred on the given system.
@@ -670,13 +668,12 @@ of them have to be prepended with the ``intel_pstate=`` prefix.
 	driver is used instead of ``acpi-cpufreq``.

 ``no_hwp``
-	Do not enable the `hardware-managed P-states (HWP) feature
-	<Active Mode With HWP_>`_ even if it is supported by the processor.
+	Do not enable the hardware-managed P-states (HWP) feature even if it is
+	supported by the processor.

 ``hwp_only``
 	Register ``intel_pstate`` as the scaling driver only if the
-	`hardware-managed P-states (HWP) feature <Active Mode With HWP_>`_ is
-	supported by the processor.
+	hardware-managed P-states (HWP) feature is supported by the processor.

 ``support_acpi_ppc``
 	Take ACPI ``_PPC`` performance limits into account.

drivers/cpufreq/cpufreq.c
@@ -73,8 +73,6 @@ static inline bool has_target(void)
 static unsigned int __cpufreq_get(struct cpufreq_policy *policy);
 static int cpufreq_init_governor(struct cpufreq_policy *policy);
 static void cpufreq_exit_governor(struct cpufreq_policy *policy);
-static int cpufreq_start_governor(struct cpufreq_policy *policy);
-static void cpufreq_stop_governor(struct cpufreq_policy *policy);
 static void cpufreq_governor_limits(struct cpufreq_policy *policy);
 static int cpufreq_set_policy(struct cpufreq_policy *policy,
 			      struct cpufreq_governor *new_gov,
@@ -2266,7 +2264,7 @@ static void cpufreq_exit_governor(struct cpufreq_policy *policy)
 	module_put(policy->governor->owner);
 }

-static int cpufreq_start_governor(struct cpufreq_policy *policy)
+int cpufreq_start_governor(struct cpufreq_policy *policy)
 {
 	int ret;
@@ -2293,7 +2291,7 @@ static int cpufreq_start_governor(struct cpufreq_policy *policy)
 	return 0;
 }

-static void cpufreq_stop_governor(struct cpufreq_policy *policy)
+void cpufreq_stop_governor(struct cpufreq_policy *policy)
 {
 	if (cpufreq_suspended || !policy->governor)
 		return;

drivers/cpufreq/intel_pstate.c
@@ -36,6 +36,7 @@
 #define INTEL_PSTATE_SAMPLING_INTERVAL	(10 * NSEC_PER_MSEC)

 #define INTEL_CPUFREQ_TRANSITION_LATENCY	20000
+#define INTEL_CPUFREQ_TRANSITION_DELAY_HWP	5000
 #define INTEL_CPUFREQ_TRANSITION_DELAY		500

 #ifdef CONFIG_ACPI
@@ -220,6 +221,7 @@ struct global_params {
  *			preference/bias
  * @epp_saved:		Saved EPP/EPB during system suspend or CPU offline
  *			operation
+ * @epp_cached		Cached HWP energy-performance preference value
  * @hwp_req_cached:	Cached value of the last HWP Request MSR
  * @hwp_cap_cached:	Cached value of the last HWP Capabilities MSR
  * @last_io_update:	Last time when IO wake flag was set
@@ -257,6 +259,7 @@ struct cpudata {
 	s16 epp_policy;
 	s16 epp_default;
 	s16 epp_saved;
+	s16 epp_cached;
 	u64 hwp_req_cached;
 	u64 hwp_cap_cached;
 	u64 last_io_update;
@@ -639,6 +642,26 @@ static int intel_pstate_get_energy_pref_index(struct cpudata *cpu_data, int *raw
 	return index;
 }

+static int intel_pstate_set_epp(struct cpudata *cpu, u32 epp)
+{
+	/*
+	 * Use the cached HWP Request MSR value, because in the active mode the
+	 * register itself may be updated by intel_pstate_hwp_boost_up() or
+	 * intel_pstate_hwp_boost_down() at any time.
+	 */
+	u64 value = READ_ONCE(cpu->hwp_req_cached);
+
+	value &= ~GENMASK_ULL(31, 24);
+	value |= (u64)epp << 24;
+	/*
+	 * The only other updater of hwp_req_cached in the active mode,
+	 * intel_pstate_hwp_set(), is called under the same lock as this
+	 * function, so it cannot run in parallel with the update below.
+	 */
+	WRITE_ONCE(cpu->hwp_req_cached, value);
+	return wrmsrl_on_cpu(cpu->cpu, MSR_HWP_REQUEST, value);
+}
+
 static int intel_pstate_set_energy_pref_index(struct cpudata *cpu_data,
 					      int pref_index, bool use_raw,
 					      u32 raw_epp)
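
The EPP update in `intel_pstate_set_epp()` is a read-modify-write of bits 31:24 of the cached HWP Request value. A minimal user-space sketch of that bit manipulation follows; the helper names `hwp_req_with_epp()` and `hwp_req_epp()` are hypothetical, and unlike the kernel code this sketch also masks `epp` to 8 bits before merging it.

```c
#include <stdint.h>

/* Field position taken from the hunk above: EPP lives in bits 31:24. */
#define EPP_SHIFT 24
#define EPP_MASK  (UINT64_C(0xff) << EPP_SHIFT)

/* Hypothetical helper: return req with its EPP field replaced by epp. */
static uint64_t hwp_req_with_epp(uint64_t req, uint32_t epp)
{
	req &= ~EPP_MASK;                               /* clear the old EPP */
	req |= ((uint64_t)epp << EPP_SHIFT) & EPP_MASK; /* merge the new EPP */
	return req;
}

/* Hypothetical helper: extract the EPP field from a request value. */
static uint32_t hwp_req_epp(uint64_t req)
{
	return (uint32_t)((req & EPP_MASK) >> EPP_SHIFT);
}
```

All bits outside 31:24 pass through unchanged, which is why the driver can reuse the cached request value instead of re-reading the MSR.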
@@ -650,28 +673,12 @@ static int intel_pstate_set_energy_pref_index(struct cpudata *cpu_data,
 		epp = cpu_data->epp_default;

 	if (boot_cpu_has(X86_FEATURE_HWP_EPP)) {
-		/*
-		 * Use the cached HWP Request MSR value, because the register
-		 * itself may be updated by intel_pstate_hwp_boost_up() or
-		 * intel_pstate_hwp_boost_down() at any time.
-		 */
-		u64 value = READ_ONCE(cpu_data->hwp_req_cached);
-
-		value &= ~GENMASK_ULL(31, 24);
-
 		if (use_raw)
 			epp = raw_epp;
 		else if (epp == -EINVAL)
 			epp = epp_values[pref_index - 1];

-		value |= (u64)epp << 24;
-		/*
-		 * The only other updater of hwp_req_cached in the active mode,
-		 * intel_pstate_hwp_set(), is called under the same lock as this
-		 * function, so it cannot run in parallel with the update below.
-		 */
-		WRITE_ONCE(cpu_data->hwp_req_cached, value);
-		ret = wrmsrl_on_cpu(cpu_data->cpu, MSR_HWP_REQUEST, value);
+		ret = intel_pstate_set_epp(cpu_data, epp);
 	} else {
 		if (epp == -EINVAL)
 			epp = (pref_index - 1) << 2;
@@ -697,10 +704,12 @@ static ssize_t show_energy_performance_available_preferences(

 cpufreq_freq_attr_ro(energy_performance_available_preferences);

+static struct cpufreq_driver intel_pstate;
+
 static ssize_t store_energy_performance_preference(
 		struct cpufreq_policy *policy, const char *buf, size_t count)
 {
-	struct cpudata *cpu_data = all_cpu_data[policy->cpu];
+	struct cpudata *cpu = all_cpu_data[policy->cpu];
 	char str_preference[21];
 	bool raw = false;
 	ssize_t ret;
@@ -725,15 +734,44 @@ static ssize_t store_energy_performance_preference(
 		raw = true;
 	}

+	/*
+	 * This function runs with the policy R/W semaphore held, which
+	 * guarantees that the driver pointer will not change while it is
+	 * running.
+	 */
+	if (!intel_pstate_driver)
+		return -EAGAIN;
+
 	mutex_lock(&intel_pstate_limits_lock);

-	ret = intel_pstate_set_energy_pref_index(cpu_data, ret, raw, epp);
-	if (!ret)
-		ret = count;
+	if (intel_pstate_driver == &intel_pstate) {
+		ret = intel_pstate_set_energy_pref_index(cpu, ret, raw, epp);
+	} else {
+		/*
+		 * In the passive mode the governor needs to be stopped on the
+		 * target CPU before the EPP update and restarted after it,
+		 * which is super-heavy-weight, so make sure it is worth doing
+		 * upfront.
+		 */
+		if (!raw)
+			epp = ret ? epp_values[ret - 1] : cpu->epp_default;
+
+		if (cpu->epp_cached != epp) {
+			int err;
+
+			cpufreq_stop_governor(policy);
+			ret = intel_pstate_set_epp(cpu, epp);
+			err = cpufreq_start_governor(policy);
+			if (!ret) {
+				cpu->epp_cached = epp;
+				ret = err;
+			}
+		}
+	}

 	mutex_unlock(&intel_pstate_limits_lock);

-	return ret;
+	return ret ?: count;
 }

 static ssize_t show_energy_performance_preference(
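
In the passive-mode branch above, the governor is stopped and restarted only when the requested EPP differs from `epp_cached`. The sketch below models just that check in user space; `struct epp_sim` and `passive_set_epp()` are hypothetical names, and the governor stop/start and MSR write are reduced to a counter, so error handling from the real code is not represented.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-CPU state touched by the passive path. */
struct epp_sim {
	int epp_cached;                 /* last EPP value written */
	unsigned int governor_restarts; /* counts stop/start cycles */
};

/* Returns true if an update, and therefore a governor restart, was needed. */
static bool passive_set_epp(struct epp_sim *s, int epp)
{
	if (s->epp_cached == epp)
		return false;           /* nothing to do: skip the costly path */

	s->governor_restarts++;         /* stands in for stop, write, start */
	s->epp_cached = epp;
	return true;
}
```

Writing the same preference twice therefore costs one restart, not two.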
@@ -1145,8 +1183,6 @@ static ssize_t store_no_turbo(struct kobject *a, struct kobj_attribute *b,
 	return count;
 }

-static struct cpufreq_driver intel_pstate;
-
 static void update_qos_request(enum freq_qos_req_type type)
 {
 	int max_state, turbo_max, freq, i, perf_pct;
@@ -1330,9 +1366,10 @@ static const struct attribute_group intel_pstate_attr_group = {

 static const struct x86_cpu_id intel_pstate_cpu_ee_disable_ids[];

+static struct kobject *intel_pstate_kobject;
+
 static void __init intel_pstate_sysfs_expose_params(void)
 {
-	struct kobject *intel_pstate_kobject;
 	int rc;

 	intel_pstate_kobject = kobject_create_and_add("intel_pstate",
@@ -1357,17 +1394,31 @@ static void __init intel_pstate_sysfs_expose_params(void)
 	rc = sysfs_create_file(intel_pstate_kobject, &min_perf_pct.attr);
 	WARN_ON(rc);

-	if (hwp_active) {
-		rc = sysfs_create_file(intel_pstate_kobject,
-				       &hwp_dynamic_boost.attr);
-		WARN_ON(rc);
-	}
-
 	if (x86_match_cpu(intel_pstate_cpu_ee_disable_ids)) {
 		rc = sysfs_create_file(intel_pstate_kobject, &energy_efficiency.attr);
 		WARN_ON(rc);
 	}
 }
+
+static void intel_pstate_sysfs_expose_hwp_dynamic_boost(void)
+{
+	int rc;
+
+	if (!hwp_active)
+		return;
+
+	rc = sysfs_create_file(intel_pstate_kobject, &hwp_dynamic_boost.attr);
+	WARN_ON_ONCE(rc);
+}
+
+static void intel_pstate_sysfs_hide_hwp_dynamic_boost(void)
+{
+	if (!hwp_active)
+		return;
+
+	sysfs_remove_file(intel_pstate_kobject, &hwp_dynamic_boost.attr);
+}
+
 /************************** sysfs end ************************/

 static void intel_pstate_hwp_enable(struct cpudata *cpudata)
@@ -2247,7 +2298,10 @@ static int intel_pstate_verify_policy(struct cpufreq_policy_data *policy)

 static void intel_cpufreq_stop_cpu(struct cpufreq_policy *policy)
 {
-	intel_pstate_set_min_pstate(all_cpu_data[policy->cpu]);
+	if (hwp_active)
+		intel_pstate_hwp_force_min_perf(policy->cpu);
+	else
+		intel_pstate_set_min_pstate(all_cpu_data[policy->cpu]);
 }

 static void intel_pstate_stop_cpu(struct cpufreq_policy *policy)
@@ -2255,12 +2309,10 @@ static void intel_pstate_stop_cpu(struct cpufreq_policy *policy)
 	pr_debug("CPU %d exiting\n", policy->cpu);

 	intel_pstate_clear_update_util_hook(policy->cpu);
-	if (hwp_active) {
+	if (hwp_active)
 		intel_pstate_hwp_save_state(policy);
-		intel_pstate_hwp_force_min_perf(policy->cpu);
-	} else {
-		intel_cpufreq_stop_cpu(policy);
-	}
+
+	intel_cpufreq_stop_cpu(policy);
 }

 static int intel_pstate_cpu_exit(struct cpufreq_policy *policy)
@@ -2390,13 +2442,71 @@ static void intel_cpufreq_trace(struct cpudata *cpu, unsigned int trace_type, in
 		fp_toint(cpu->iowait_boost * 100));
 }

+static void intel_cpufreq_adjust_hwp(struct cpudata *cpu, u32 target_pstate,
+				     bool fast_switch)
+{
+	u64 prev = READ_ONCE(cpu->hwp_req_cached), value = prev;
+
+	value &= ~HWP_MIN_PERF(~0L);
+	value |= HWP_MIN_PERF(target_pstate);
+
+	/*
+	 * The entire MSR needs to be updated in order to update the HWP min
+	 * field in it, so opportunistically update the max too if needed.
+	 */
+	value &= ~HWP_MAX_PERF(~0L);
+	value |= HWP_MAX_PERF(cpu->max_perf_ratio);
+
+	if (value == prev)
+		return;
+
+	WRITE_ONCE(cpu->hwp_req_cached, value);
+	if (fast_switch)
+		wrmsrl(MSR_HWP_REQUEST, value);
+	else
+		wrmsrl_on_cpu(cpu->cpu, MSR_HWP_REQUEST, value);
+}
+
+static void intel_cpufreq_adjust_perf_ctl(struct cpudata *cpu,
+					  u32 target_pstate, bool fast_switch)
+{
+	if (fast_switch)
+		wrmsrl(MSR_IA32_PERF_CTL,
+		       pstate_funcs.get_val(cpu, target_pstate));
+	else
+		wrmsrl_on_cpu(cpu->cpu, MSR_IA32_PERF_CTL,
+			      pstate_funcs.get_val(cpu, target_pstate));
+}
+
+static int intel_cpufreq_update_pstate(struct cpudata *cpu, int target_pstate,
+				       bool fast_switch)
+{
+	int old_pstate = cpu->pstate.current_pstate;
+
+	target_pstate = intel_pstate_prepare_request(cpu, target_pstate);
+	if (target_pstate != old_pstate) {
+		cpu->pstate.current_pstate = target_pstate;
+		if (hwp_active)
+			intel_cpufreq_adjust_hwp(cpu, target_pstate,
+						 fast_switch);
+		else
+			intel_cpufreq_adjust_perf_ctl(cpu, target_pstate,
+						      fast_switch);
+	}
+
+	intel_cpufreq_trace(cpu, fast_switch ? INTEL_PSTATE_TRACE_FAST_SWITCH :
+			    INTEL_PSTATE_TRACE_TARGET, old_pstate);
+
+	return target_pstate;
+}
+
 static int intel_cpufreq_target(struct cpufreq_policy *policy,
 				unsigned int target_freq,
 				unsigned int relation)
 {
 	struct cpudata *cpu = all_cpu_data[policy->cpu];
 	struct cpufreq_freqs freqs;
-	int target_pstate, old_pstate;
+	int target_pstate;

 	update_turbo_state();
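
`intel_cpufreq_adjust_hwp()` rewrites the min and max fields of the cached HWP Request value and skips the MSR write when nothing changed. The following user-space sketch models that logic under two assumptions that are not shown in this diff: that `HWP_MIN_PERF()` covers bits 7:0 and `HWP_MAX_PERF()` covers bits 15:8. `struct fake_cpu`, `adjust_hwp()` and the `writes` counter are hypothetical stand-ins for the driver state and the `wrmsrl()` call.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed field layout: min in bits 7:0, max in bits 15:8. */
#define MIN_PERF(x) ((uint64_t)(x) & 0xff)
#define MAX_PERF(x) (((uint64_t)(x) & 0xff) << 8)

/* Hypothetical stand-in for the relevant part of struct cpudata. */
struct fake_cpu {
	uint64_t hwp_req_cached; /* last value "written" to the register */
	unsigned int writes;     /* counts simulated MSR writes */
};

/* Returns true if the request changed and a write was issued. */
static bool adjust_hwp(struct fake_cpu *cpu, uint32_t min, uint32_t max)
{
	uint64_t prev = cpu->hwp_req_cached, value = prev;

	value &= ~MIN_PERF(~0ULL);
	value |= MIN_PERF(min);
	value &= ~MAX_PERF(~0ULL);
	value |= MAX_PERF(max);

	if (value == prev)
		return false;    /* unchanged: avoid the register write */

	cpu->hwp_req_cached = value;
	cpu->writes++;
	return true;
}
```

Bits above 15, including the EPP field, are carried over from the cached value untouched.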
@@ -2404,6 +2514,7 @@ static int intel_cpufreq_target(struct cpufreq_policy *policy,
 	freqs.new = target_freq;

 	cpufreq_freq_transition_begin(policy, &freqs);
+
 	switch (relation) {
 	case CPUFREQ_RELATION_L:
 		target_pstate = DIV_ROUND_UP(freqs.new, cpu->pstate.scaling);
@@ -2415,15 +2526,11 @@ static int intel_cpufreq_target(struct cpufreq_policy *policy,
 		target_pstate = DIV_ROUND_CLOSEST(freqs.new, cpu->pstate.scaling);
 		break;
 	}
-	target_pstate = intel_pstate_prepare_request(cpu, target_pstate);
-	old_pstate = cpu->pstate.current_pstate;
-	if (target_pstate != cpu->pstate.current_pstate) {
-		cpu->pstate.current_pstate = target_pstate;
-		wrmsrl_on_cpu(policy->cpu, MSR_IA32_PERF_CTL,
-			      pstate_funcs.get_val(cpu, target_pstate));
-	}
+
+	target_pstate = intel_cpufreq_update_pstate(cpu, target_pstate, false);
+
 	freqs.new = target_pstate * cpu->pstate.scaling;
-	intel_cpufreq_trace(cpu, INTEL_PSTATE_TRACE_TARGET, old_pstate);
+
 	cpufreq_freq_transition_end(policy, &freqs, false);

 	return 0;
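
The `switch (relation)` in `intel_cpufreq_target()` turns a target frequency into a P-state by dividing by `cpu->pstate.scaling` with a relation-dependent rounding. The sketch below reproduces the two roundings visible in this diff (round up for `CPUFREQ_RELATION_L`, round to closest in the last case); the round-down branch for the remaining relation sits in the elided context and is an assumption here. `enum rel` and `freq_to_pstate()` are hypothetical names.

```c
/* Hypothetical relation tags mirroring the three rounding choices. */
enum rel { REL_L, REL_H, REL_CLOSEST };

/* Convert a frequency to a P-state number; scaling is frequency per P-state. */
static unsigned int freq_to_pstate(unsigned int freq, unsigned int scaling,
				   enum rel r)
{
	switch (r) {
	case REL_L:                       /* lowest P-state at or above freq */
		return (freq + scaling - 1) / scaling;
	case REL_H:                       /* assumed: highest at or below freq */
		return freq / scaling;
	default:                          /* nearest P-state */
		return (freq + scaling / 2) / scaling;
	}
}
```

With a scaling of 100000, a request for 2450000 maps to P-state 25 under `REL_L` and to 24 under `REL_H`; the driver then reports `target_pstate * scaling` back as the new frequency.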
@@ -2433,15 +2540,14 @@ static unsigned int intel_cpufreq_fast_switch(struct cpufreq_policy *policy,
 					      unsigned int target_freq)
 {
 	struct cpudata *cpu = all_cpu_data[policy->cpu];
-	int target_pstate, old_pstate;
+	int target_pstate;

 	update_turbo_state();

 	target_pstate = DIV_ROUND_UP(target_freq, cpu->pstate.scaling);
-	target_pstate = intel_pstate_prepare_request(cpu, target_pstate);
-	old_pstate = cpu->pstate.current_pstate;
-	intel_pstate_update_pstate(cpu, target_pstate);
-	intel_cpufreq_trace(cpu, INTEL_PSTATE_TRACE_FAST_SWITCH, old_pstate);
+
+	target_pstate = intel_cpufreq_update_pstate(cpu, target_pstate, true);
+
 	return target_pstate * cpu->pstate.scaling;
 }

@@ -2461,7 +2567,6 @@ static int intel_cpufreq_cpu_init(struct cpufreq_policy *policy)
 		return ret;

 	policy->cpuinfo.transition_latency = INTEL_CPUFREQ_TRANSITION_LATENCY;
-	policy->transition_delay_us = INTEL_CPUFREQ_TRANSITION_DELAY;
 	/* This reflects the intel_pstate_get_cpu_pstates() setting. */
 	policy->cur = policy->cpuinfo.min_freq;

@@ -2473,10 +2578,18 @@ static int intel_cpufreq_cpu_init(struct cpufreq_policy *policy)

 	cpu = all_cpu_data[policy->cpu];

-	if (hwp_active)
+	if (hwp_active) {
+		u64 value;
+
 		intel_pstate_get_hwp_max(policy->cpu, &turbo_max, &max_state);
-	else
+		policy->transition_delay_us = INTEL_CPUFREQ_TRANSITION_DELAY_HWP;
+		rdmsrl_on_cpu(cpu->cpu, MSR_HWP_REQUEST, &value);
+		WRITE_ONCE(cpu->hwp_req_cached, value);
+		cpu->epp_cached = (value & GENMASK_ULL(31, 24)) >> 24;
+	} else {
 		turbo_max = cpu->pstate.turbo_pstate;
+		policy->transition_delay_us = INTEL_CPUFREQ_TRANSITION_DELAY;
+	}

 	min_freq = DIV_ROUND_UP(turbo_max * global.min_perf_pct, 100);
 	min_freq *= cpu->pstate.scaling;
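
The last two context lines derive the minimum frequency from the global `min_perf_pct` limit: the turbo P-state is scaled by the percentage, rounded up, and multiplied by the per-P-state scaling factor. A small worked sketch of that arithmetic follows; `min_freq_for()` is a hypothetical helper and the sample numbers are made up for illustration.

```c
/* Hypothetical helper: minimum frequency for a given percentage limit. */
static unsigned int min_freq_for(unsigned int turbo_max,
				 unsigned int min_perf_pct,
				 unsigned int scaling)
{
	/* Same rounding as DIV_ROUND_UP(turbo_max * min_perf_pct, 100). */
	unsigned int min_pstate = (turbo_max * min_perf_pct + 99) / 100;

	return min_pstate * scaling;
}
```

For example, a turbo P-state of 42 with a 30 percent floor gives 12.6, which rounds up to P-state 13, so the floor never lands below the requested percentage.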
@@ -2553,6 +2666,10 @@ static void intel_pstate_driver_cleanup(void)
 		}
 	}
 	put_online_cpus();
+
+	if (intel_pstate_driver == &intel_pstate)
+		intel_pstate_sysfs_hide_hwp_dynamic_boost();
+
 	intel_pstate_driver = NULL;
 }

@@ -2560,6 +2677,9 @@ static int intel_pstate_register_driver(struct cpufreq_driver *driver)
 {
 	int ret;

+	if (driver == &intel_pstate)
+		intel_pstate_sysfs_expose_hwp_dynamic_boost();
+
 	memset(&global, 0, sizeof(global));
 	global.max_perf_pct = 100;

@@ -2577,9 +2697,6 @@ static int intel_pstate_register_driver(struct cpufreq_driver *driver)

 static int intel_pstate_unregister_driver(void)
 {
-	if (hwp_active)
-		return -EBUSY;
-
 	cpufreq_unregister_driver(intel_pstate_driver);
 	intel_pstate_driver_cleanup();

@@ -2835,7 +2952,10 @@ static int __init intel_pstate_init(void)
 			hwp_active++;
 			hwp_mode_bdw = id->driver_data;
 			intel_pstate.attr = hwp_cpufreq_attrs;
-			default_driver = &intel_pstate;
+			intel_cpufreq.attr = hwp_cpufreq_attrs;
+			if (!default_driver)
+				default_driver = &intel_pstate;
+
 			goto hwp_cpu_matched;
 		}
 	} else {
@@ -2906,14 +3026,13 @@ static int __init intel_pstate_setup(char *str)
 	if (!str)
 		return -EINVAL;

-	if (!strcmp(str, "disable")) {
+	if (!strcmp(str, "disable"))
 		no_load = 1;
-	} else if (!strcmp(str, "active")) {
+	else if (!strcmp(str, "active"))
 		default_driver = &intel_pstate;
-	} else if (!strcmp(str, "passive")) {
+	else if (!strcmp(str, "passive"))
 		default_driver = &intel_cpufreq;
-		no_hwp = 1;
-	}
+
 	if (!strcmp(str, "no_hwp")) {
 		pr_info("HWP disabled\n");
 		no_hwp = 1;

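
The behavioral change in `intel_pstate_setup()` is that `passive` no longer forces `no_hwp`. The user-space sketch below mirrors only the branches visible in the hunk; `struct setup`, `enum drv` and `parse_option()` are hypothetical, and any other options handled by the real function are outside this diff and left out.

```c
#include <string.h>

/* Hypothetical stand-ins for default_driver, no_load and no_hwp. */
enum drv { DRV_UNSET, DRV_ACTIVE, DRV_PASSIVE };

struct setup {
	enum drv driver;
	int no_load;
	int no_hwp;
};

/* Apply one intel_pstate= option value, following the patched logic. */
static void parse_option(struct setup *s, const char *str)
{
	if (!strcmp(str, "disable"))
		s->no_load = 1;
	else if (!strcmp(str, "active"))
		s->driver = DRV_ACTIVE;
	else if (!strcmp(str, "passive"))
		s->driver = DRV_PASSIVE;  /* no longer sets no_hwp */

	if (!strcmp(str, "no_hwp"))
		s->no_hwp = 1;
}
```

After this change, selecting the passive driver and disabling HWP are independent choices.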
include/linux/cpufreq.h
@@ -576,6 +576,8 @@ unsigned int cpufreq_driver_resolve_freq(struct cpufreq_policy *policy,
 unsigned int cpufreq_policy_transition_delay_us(struct cpufreq_policy *policy);
 int cpufreq_register_governor(struct cpufreq_governor *governor);
 void cpufreq_unregister_governor(struct cpufreq_governor *governor);
+int cpufreq_start_governor(struct cpufreq_policy *policy);
+void cpufreq_stop_governor(struct cpufreq_policy *policy);

 #define cpufreq_governor_init(__governor)			\
 static int __init __governor##_init(void)			\