Commit 0c181b1d authored by Linus Torvalds

Merge tag 'pm-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management updates from Rafael Wysocki:
 "These are mostly cpufreq updates, including a significant intel-pstate
  driver update and several amd-pstate improvements plus some updates of
  ARM cpufreq drivers, general fixes and cleanups.

  Also included are changes related to system sleep, power capping
  updates adding support for a new platform and a new hardware feature
  (among other things), a Samsung exynos-asv driver update allowing it
  to change its Energy Model after adjusting voltage, minor cpuidle and
  devfreq updates and a small documentation cleanup.

  Specifics:

   - Rework the handling of disabled turbo in the intel_pstate driver
     and make it update the maximum CPU frequency consistently,
     regardless of why turbo is unavailable, on top of a number of
     cleanups (Rafael Wysocki)

   - Add missing checks for NULL .exit() cpufreq driver callback to the
     cpufreq core (Viresh Kumar)

   - Prevent policy->max from going above the frequency QoS maximum
     value when cpufreq_frequency_table_verify() is used (Xuewen Yan)

   - Prevent large CPU numbers and frequency values from being printed
     as negative (Joshua Yeong)

   - Update MAINTAINERS entry for amd-pstate to add two new
     submaintainers and a designated reviewer (Huang Rui)

   - Clean up the amd-pstate driver and update its documentation
     (Gautham Shenoy)

   - Fix the highest frequency issue in the amd-pstate driver which
     limits performance (Perry Yuan)

   - Enable CPPC v2 for certain processors in the family 17H, as
     requested by TR40 processor users who expect improved performance
     and lower system temperature (Perry Yuan)

   - Read latency and delay values from platform firmware first, for
     more accurate timing (Perry Yuan)

   - Introduce a new quirk to support amd-pstate on legacy processors
     which either lack CPPC capability or only have CPPC v2 capability
     (Perry Yuan)

   - Sun50i cpufreq: Add support for opp_supported_hw, H616 platform and
     general cleanups (Andre Przywara, Martin Botka, Brandon Cheo Fusi,
     Dan Carpenter, Viresh Kumar)

   - CPPC cpufreq: Fix possible null pointer dereference (Aleksandr
     Mishin)

   - Eliminate uses of of_node_put() from cpufreq (Javier Carrasco,
     Shivani Gupta)

   - Fix an "ISO C90 forbids mixed declarations" warning in the
     brcmstb-avs driver (Portia Stephens)

   - mediatek cpufreq: Add support for MT7988A (Sam Shih)

   - cpufreq-qcom-hw: Add SM4450 compatibles in DT bindings (Tengfei
     Fan)

   - Fix struct cpudata::epp_cached kernel-doc in the intel_pstate
     cpufreq driver (Jeff Johnson)

   - Fix kerneldoc description of ladder_do_selection() (Jeff Johnson)

   - Convert the cpuidle kirkwood driver to platform remove callback
     returning void (Yangtao Li)

   - Replace deprecated strncpy() with strscpy() in the hibernation core
     code (Justin Stitt)

   - Use %ps to simplify debug output in the core system-wide suspend
     and resume code (Len Brown)

   - Remove unnecessary else from device_init_wakeup() and make
     device_wakeup_disable() return void (Dhruva Gole)

   - Enable PMU support in the Intel TPMI RAPL driver (Zhang Rui)

   - Add support for ArrowLake-H platform to the Intel RAPL driver
     (Zhang Rui)

   - Avoid explicit cpumask allocation on stack in DTPM (Dawei Li)

   - Make the Samsung exynos-asv driver update the Energy Model after
     adjusting voltage, on top of some preliminary changes of the OPP
     and Energy Model generic code (Lukasz Luba)

   - Remove a reference to a function that has been dropped from the
     power management documentation (Bjorn Helgaas)

   - Convert the platform remove callback to .remove_new for the
     exynos-nocp, exynos-ppmu, mtk-cci-devfreq, sun8i-a33-mbus, and
     rk3399_dmc devfreq drivers (Uwe Kleine-König)

   - Use DEFINE_SIMPLE_DEV_PM_OPS for the exynos-bus.c driver (Anand
     Moon)"

* tag 'pm-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (68 commits)
  PM / devfreq: exynos: Use DEFINE_SIMPLE_DEV_PM_OPS for PM functions
  PM / devfreq: rk3399_dmc: Convert to platform remove callback returning void
  PM / devfreq: sun8i-a33-mbus: Convert to platform remove callback returning void
  PM / devfreq: mtk-cci: Convert to platform remove callback returning void
  PM / devfreq: exynos-ppmu: Convert to platform remove callback returning void
  PM / devfreq: exynos-nocp: Convert to platform remove callback returning void
  cpufreq: amd-pstate: fix the highest frequency issue which limits performance
  cpufreq: intel_pstate: fix struct cpudata::epp_cached kernel-doc
  cpuidle: ladder: fix ladder_do_selection() kernel-doc
  powercap: intel_rapl_tpmi: Enable PMU support
  powercap: intel_rapl: Introduce APIs for PMU support
  PM: hibernate: replace deprecated strncpy() with strscpy()
  cpufreq: Fix up printing large CPU numbers and frequency values
  MAINTAINERS: cpufreq: amd-pstate: Add co-maintainers and reviewer
  cpufreq: amd-pstate: remove unused variable lowest_nonlinear_freq
  cpufreq: amd-pstate: fix code format problems
  cpufreq: amd-pstate: Add quirk for the pstate CPPC capabilities missing
  cppc_acpi: print error message if CPPC is unsupported
  cpufreq: amd-pstate: get transition delay and latency value from ACPI tables
  cpufreq: amd-pstate: Bail out if min/max/nominal_freq is 0
  ...
parents f952b6c8 de1c2722
......@@ -38,6 +38,7 @@ properties:
- qcom,sc7280-cpufreq-epss
- qcom,sc8280xp-cpufreq-epss
- qcom,sdx75-cpufreq-epss
- qcom,sm4450-cpufreq-epss
- qcom,sm6375-cpufreq-epss
- qcom,sm8250-cpufreq-epss
- qcom,sm8350-cpufreq-epss
......@@ -133,6 +134,7 @@ allOf:
- qcom,sc8280xp-cpufreq-epss
- qcom,sdm670-cpufreq-hw
- qcom,sdm845-cpufreq-hw
- qcom,sm4450-cpufreq-epss
- qcom,sm6115-cpufreq-hw
- qcom,sm6350-cpufreq-hw
- qcom,sm6375-cpufreq-epss
......
......@@ -13,25 +13,25 @@ maintainers:
description: |
For some SoCs, the CPU frequency subset and voltage value of each
OPP varies based on the silicon variant in use. Allwinner Process
Voltage Scaling Tables defines the voltage and frequency value based
on the speedbin blown in the efuse combination. The
sun50i-cpufreq-nvmem driver reads the efuse value from the SoC to
provide the OPP framework with required information.
Voltage Scaling Tables define the voltage and frequency values based
on the speedbin blown in the efuse combination.
allOf:
- $ref: opp-v2-base.yaml#
properties:
compatible:
const: allwinner,sun50i-h6-operating-points
enum:
- allwinner,sun50i-h6-operating-points
- allwinner,sun50i-h616-operating-points
nvmem-cells:
description: |
A phandle pointing to a nvmem-cells node representing the efuse
registers that has information about the speedbin that is used
register that has information about the speedbin that is used
to select the right frequency/voltage value pair. Please refer
the for nvmem-cells bindings
Documentation/devicetree/bindings/nvmem/nvmem.txt and also
to the nvmem-cells bindings in
Documentation/devicetree/bindings/nvmem/nvmem.yaml and also the
examples below.
opp-shared: true
......@@ -47,15 +47,18 @@ patternProperties:
properties:
opp-hz: true
clock-latency-ns: true
opp-microvolt: true
opp-supported-hw:
maxItems: 1
description:
A single 32 bit bitmap value, representing compatible HW, one
bit per speed bin index.
patternProperties:
"^opp-microvolt-speed[0-9]$": true
required:
- opp-hz
- opp-microvolt-speed0
- opp-microvolt-speed1
- opp-microvolt-speed2
unevaluatedProperties: false
......@@ -77,58 +80,54 @@ examples:
opp-microvolt-speed2 = <800000>;
};
opp-720000000 {
opp-1080000000 {
clock-latency-ns = <244144>; /* 8 32k periods */
opp-hz = /bits/ 64 <720000000>;
opp-hz = /bits/ 64 <1080000000>;
opp-microvolt-speed0 = <880000>;
opp-microvolt-speed1 = <820000>;
opp-microvolt-speed2 = <800000>;
opp-microvolt-speed0 = <1060000>;
opp-microvolt-speed1 = <880000>;
opp-microvolt-speed2 = <840000>;
};
opp-816000000 {
opp-1488000000 {
clock-latency-ns = <244144>; /* 8 32k periods */
opp-hz = /bits/ 64 <816000000>;
opp-hz = /bits/ 64 <1488000000>;
opp-microvolt-speed0 = <880000>;
opp-microvolt-speed1 = <820000>;
opp-microvolt-speed2 = <800000>;
opp-microvolt-speed0 = <1160000>;
opp-microvolt-speed1 = <1000000>;
opp-microvolt-speed2 = <960000>;
};
};
opp-888000000 {
clock-latency-ns = <244144>; /* 8 32k periods */
opp-hz = /bits/ 64 <888000000>;
opp-microvolt-speed0 = <940000>;
opp-microvolt-speed1 = <820000>;
opp-microvolt-speed2 = <800000>;
};
- |
opp-table {
compatible = "allwinner,sun50i-h616-operating-points";
nvmem-cells = <&speedbin_efuse>;
opp-shared;
opp-1080000000 {
opp-480000000 {
clock-latency-ns = <244144>; /* 8 32k periods */
opp-hz = /bits/ 64 <1080000000>;
opp-hz = /bits/ 64 <480000000>;
opp-microvolt-speed0 = <1060000>;
opp-microvolt-speed1 = <880000>;
opp-microvolt-speed2 = <840000>;
opp-microvolt = <900000>;
opp-supported-hw = <0x1f>;
};
opp-1320000000 {
opp-792000000 {
clock-latency-ns = <244144>; /* 8 32k periods */
opp-hz = /bits/ 64 <1320000000>;
opp-hz = /bits/ 64 <792000000>;
opp-microvolt-speed0 = <1160000>;
opp-microvolt-speed1 = <940000>;
opp-microvolt-speed2 = <900000>;
opp-microvolt-speed1 = <900000>;
opp-microvolt-speed4 = <940000>;
opp-supported-hw = <0x12>;
};
opp-1488000000 {
opp-1512000000 {
clock-latency-ns = <244144>; /* 8 32k periods */
opp-hz = /bits/ 64 <1488000000>;
opp-hz = /bits/ 64 <1512000000>;
opp-microvolt-speed0 = <1160000>;
opp-microvolt-speed1 = <1000000>;
opp-microvolt-speed2 = <960000>;
opp-microvolt = <1100000>;
opp-supported-hw = <0x0a>;
};
};
......
......@@ -333,7 +333,7 @@ struct pci_dev.
The PCI subsystem's first task related to device power management is to
prepare the device for power management and initialize the fields of struct
pci_dev used for this purpose. This happens in two functions defined in
drivers/pci/pci.c, pci_pm_init() and platform_pci_wakeup_init().
drivers/pci/, pci_pm_init() and pci_acpi_setup().
The first of these functions checks if the device supports native PCI PM
and if that's the case the offset of its power management capability structure
......
......@@ -1062,6 +1062,9 @@ F: drivers/gpu/drm/amd/pm/
AMD PSTATE DRIVER
M: Huang Rui <ray.huang@amd.com>
M: Gautham R. Shenoy <gautham.shenoy@amd.com>
M: Mario Limonciello <mario.limonciello@amd.com>
R: Perry Yuan <perry.yuan@amd.com>
L: linux-pm@vger.kernel.org
S: Supported
F: Documentation/admin-guide/pm/amd-pstate.rst
......
......@@ -6,6 +6,7 @@
/dts-v1/;
#include "sun50i-h616.dtsi"
#include "sun50i-h616-cpu-opp.dtsi"
#include <dt-bindings/gpio/gpio.h>
#include <dt-bindings/interrupt-controller/arm-gic.h>
......@@ -62,6 +63,10 @@ wifi_pwrseq: wifi-pwrseq {
};
};
&cpu0 {
cpu-supply = <&reg_dcdc2>;
};
&mmc0 {
vmmc-supply = <&reg_dldo1>;
/* Card detection pin is not connected */
......
// SPDX-License-Identifier: (GPL-2.0+ OR MIT)
// Copyright (C) 2023 Martin Botka <martin@somainline.org>
/ {
cpu_opp_table: opp-table-cpu {
compatible = "allwinner,sun50i-h616-operating-points";
nvmem-cells = <&cpu_speed_grade>;
opp-shared;
opp-480000000 {
opp-hz = /bits/ 64 <480000000>;
opp-microvolt = <900000>;
clock-latency-ns = <244144>; /* 8 32k periods */
opp-supported-hw = <0x1f>;
};
opp-600000000 {
opp-hz = /bits/ 64 <600000000>;
opp-microvolt = <900000>;
clock-latency-ns = <244144>; /* 8 32k periods */
opp-supported-hw = <0x12>;
};
opp-720000000 {
opp-hz = /bits/ 64 <720000000>;
opp-microvolt = <900000>;
clock-latency-ns = <244144>; /* 8 32k periods */
opp-supported-hw = <0x0d>;
};
opp-792000000 {
opp-hz = /bits/ 64 <792000000>;
opp-microvolt-speed1 = <900000>;
opp-microvolt-speed4 = <940000>;
clock-latency-ns = <244144>; /* 8 32k periods */
opp-supported-hw = <0x12>;
};
opp-936000000 {
opp-hz = /bits/ 64 <936000000>;
opp-microvolt = <900000>;
clock-latency-ns = <244144>; /* 8 32k periods */
opp-supported-hw = <0x0d>;
};
opp-1008000000 {
opp-hz = /bits/ 64 <1008000000>;
opp-microvolt-speed0 = <950000>;
opp-microvolt-speed1 = <940000>;
opp-microvolt-speed2 = <950000>;
opp-microvolt-speed3 = <950000>;
opp-microvolt-speed4 = <1020000>;
clock-latency-ns = <244144>; /* 8 32k periods */
opp-supported-hw = <0x1f>;
};
opp-1104000000 {
opp-hz = /bits/ 64 <1104000000>;
opp-microvolt-speed0 = <1000000>;
opp-microvolt-speed2 = <1000000>;
opp-microvolt-speed3 = <1000000>;
clock-latency-ns = <244144>; /* 8 32k periods */
opp-supported-hw = <0x0d>;
};
opp-1200000000 {
opp-hz = /bits/ 64 <1200000000>;
opp-microvolt-speed0 = <1050000>;
opp-microvolt-speed1 = <1020000>;
opp-microvolt-speed2 = <1050000>;
opp-microvolt-speed3 = <1050000>;
opp-microvolt-speed4 = <1100000>;
clock-latency-ns = <244144>; /* 8 32k periods */
opp-supported-hw = <0x1f>;
};
opp-1320000000 {
opp-hz = /bits/ 64 <1320000000>;
opp-microvolt = <1100000>;
clock-latency-ns = <244144>; /* 8 32k periods */
opp-supported-hw = <0x1d>;
};
opp-1416000000 {
opp-hz = /bits/ 64 <1416000000>;
opp-microvolt = <1100000>;
clock-latency-ns = <244144>; /* 8 32k periods */
opp-supported-hw = <0x0d>;
};
opp-1512000000 {
opp-hz = /bits/ 64 <1512000000>;
opp-microvolt-speed1 = <1100000>;
opp-microvolt-speed3 = <1100000>;
clock-latency-ns = <244144>; /* 8 32k periods */
opp-supported-hw = <0x0a>;
};
};
};
&cpu0 {
operating-points-v2 = <&cpu_opp_table>;
};
&cpu1 {
operating-points-v2 = <&cpu_opp_table>;
};
&cpu2 {
operating-points-v2 = <&cpu_opp_table>;
};
&cpu3 {
operating-points-v2 = <&cpu_opp_table>;
};
......@@ -6,12 +6,17 @@
/dts-v1/;
#include "sun50i-h616-orangepi-zero.dtsi"
#include "sun50i-h616-cpu-opp.dtsi"
/ {
model = "OrangePi Zero2";
compatible = "xunlong,orangepi-zero2", "allwinner,sun50i-h616";
};
&cpu0 {
cpu-supply = <&reg_dcdca>;
};
&emac0 {
allwinner,rx-delay-ps = <3100>;
allwinner,tx-delay-ps = <700>;
......
......@@ -6,6 +6,7 @@
/dts-v1/;
#include "sun50i-h616.dtsi"
#include "sun50i-h616-cpu-opp.dtsi"
#include <dt-bindings/gpio/gpio.h>
#include <dt-bindings/interrupt-controller/arm-gic.h>
......@@ -32,6 +33,10 @@ reg_vcc5v: vcc5v {
};
};
&cpu0 {
cpu-supply = <&reg_dcdca>;
};
&ehci0 {
status = "okay";
};
......
......@@ -26,6 +26,7 @@ cpu0: cpu@0 {
reg = <0>;
enable-method = "psci";
clocks = <&ccu CLK_CPUX>;
#cooling-cells = <2>;
};
cpu1: cpu@1 {
......@@ -34,6 +35,7 @@ cpu1: cpu@1 {
reg = <1>;
enable-method = "psci";
clocks = <&ccu CLK_CPUX>;
#cooling-cells = <2>;
};
cpu2: cpu@2 {
......@@ -42,6 +44,7 @@ cpu2: cpu@2 {
reg = <2>;
enable-method = "psci";
clocks = <&ccu CLK_CPUX>;
#cooling-cells = <2>;
};
cpu3: cpu@3 {
......@@ -50,6 +53,7 @@ cpu3: cpu@3 {
reg = <3>;
enable-method = "psci";
clocks = <&ccu CLK_CPUX>;
#cooling-cells = <2>;
};
};
......@@ -156,6 +160,10 @@ sid: efuse@3006000 {
ths_calibration: thermal-sensor-calibration@14 {
reg = <0x14 0x8>;
};
cpu_speed_grade: cpu-speed-grade@0 {
reg = <0x0 2>;
};
};
watchdog: watchdog@30090a0 {
......
......@@ -4,6 +4,11 @@
*/
#include "sun50i-h616.dtsi"
#include "sun50i-h616-cpu-opp.dtsi"
&cpu0 {
cpu-supply = <&reg_dcdc2>;
};
&mmc2 {
pinctrl-names = "default";
......
......@@ -6,6 +6,7 @@
/dts-v1/;
#include "sun50i-h616.dtsi"
#include "sun50i-h616-cpu-opp.dtsi"
#include <dt-bindings/gpio/gpio.h>
#include <dt-bindings/interrupt-controller/arm-gic.h>
......@@ -53,6 +54,10 @@ reg_vcc3v3: vcc3v3 {
};
};
&cpu0 {
cpu-supply = <&reg_dcdc2>;
};
&ehci1 {
status = "okay";
};
......
......@@ -6,12 +6,17 @@
/dts-v1/;
#include "sun50i-h616-orangepi-zero.dtsi"
#include "sun50i-h616-cpu-opp.dtsi"
/ {
model = "OrangePi Zero3";
compatible = "xunlong,orangepi-zero3", "allwinner,sun50i-h618";
};
&cpu0 {
cpu-supply = <&reg_dcdc2>;
};
&emac0 {
allwinner,tx-delay-ps = <700>;
phy-mode = "rgmii-rxid";
......
......@@ -6,6 +6,7 @@
/dts-v1/;
#include "sun50i-h616.dtsi"
#include "sun50i-h616-cpu-opp.dtsi"
#include <dt-bindings/gpio/gpio.h>
#include <dt-bindings/interrupt-controller/arm-gic.h>
......@@ -51,6 +52,10 @@ wifi_pwrseq: pwrseq {
};
};
&cpu0 {
cpu-supply = <&reg_dcdc2>;
};
&ehci0 {
status = "okay";
};
......
......@@ -686,8 +686,10 @@ int acpi_cppc_processor_probe(struct acpi_processor *pr)
if (!osc_sb_cppc2_support_acked) {
pr_debug("CPPC v2 _OSC not acked\n");
if (!cpc_supported_by_cpu())
if (!cpc_supported_by_cpu()) {
pr_debug("CPPC is not supported by the CPU\n");
return -ENODEV;
}
}
/* Parse the ACPI _CPC table for this CPU. */
......
......@@ -208,7 +208,7 @@ static ktime_t initcall_debug_start(struct device *dev, void *cb)
if (!pm_print_times_enabled)
return 0;
dev_info(dev, "calling %pS @ %i, parent: %s\n", cb,
dev_info(dev, "calling %ps @ %i, parent: %s\n", cb,
task_pid_nr(current),
dev->parent ? dev_name(dev->parent) : "none");
return ktime_get();
......@@ -223,7 +223,7 @@ static void initcall_debug_report(struct device *dev, ktime_t calltime,
return;
rettime = ktime_get();
dev_info(dev, "%pS returned %d after %Ld usecs\n", cb, error,
dev_info(dev, "%ps returned %d after %Ld usecs\n", cb, error,
(unsigned long long)ktime_us_delta(rettime, calltime));
}
......@@ -1927,7 +1927,7 @@ EXPORT_SYMBOL_GPL(dpm_suspend_start);
void __suspend_report_result(const char *function, struct device *dev, void *fn, int ret)
{
if (ret)
dev_err(dev, "%s(): %pS returns %d\n", function, fn, ret);
dev_err(dev, "%s(): %ps returns %d\n", function, fn, ret);
}
EXPORT_SYMBOL_GPL(__suspend_report_result);
......
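For reference, the %pS specifier prints a symbol with its offset (useful
for return addresses that can land mid-function), while %ps prints the
bare symbol name, which is sufficient for suspend/resume callbacks that
always point at a function's entry. A minimal, hypothetical module
sketch illustrating the difference (names are illustrative, not from the
patch):

    // SPDX-License-Identifier: GPL-2.0
    #include <linux/module.h>
    #include <linux/init.h>

    static int __init ps_demo_init(void)
    {
            void *cb = ps_demo_init;

            /*
             * %pS -> "ps_demo_init+0x0/0x..." (symbol plus offset/size),
             * %ps -> "ps_demo_init" (bare symbol name).
             */
            pr_info("callback: %pS vs %ps\n", cb, cb);
            return 0;
    }
    module_init(ps_demo_init);

    static void __exit ps_demo_exit(void) { }
    module_exit(ps_demo_exit);

    MODULE_DESCRIPTION("Demonstrate the %pS vs %ps printk formats");
    MODULE_LICENSE("GPL");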
......@@ -451,16 +451,15 @@ static struct wakeup_source *device_wakeup_detach(struct device *dev)
* Detach the @dev's wakeup source object from it, unregister this wakeup source
* object and destroy it.
*/
int device_wakeup_disable(struct device *dev)
void device_wakeup_disable(struct device *dev)
{
struct wakeup_source *ws;
if (!dev || !dev->power.can_wakeup)
return -EINVAL;
return;
ws = device_wakeup_detach(dev);
wakeup_source_unregister(ws);
return 0;
}
EXPORT_SYMBOL_GPL(device_wakeup_disable);
......@@ -502,7 +501,11 @@ EXPORT_SYMBOL_GPL(device_set_wakeup_capable);
*/
int device_set_wakeup_enable(struct device *dev, bool enable)
{
return enable ? device_wakeup_enable(dev) : device_wakeup_disable(dev);
if (enable)
return device_wakeup_enable(dev);
device_wakeup_disable(dev);
return 0;
}
EXPORT_SYMBOL_GPL(device_set_wakeup_enable);
......
......@@ -50,7 +50,8 @@
#define AMD_PSTATE_TRANSITION_LATENCY 20000
#define AMD_PSTATE_TRANSITION_DELAY 1000
#define AMD_PSTATE_PREFCORE_THRESHOLD 166
#define CPPC_HIGHEST_PERF_PERFORMANCE 196
#define CPPC_HIGHEST_PERF_DEFAULT 166
/*
* TODO: We need more time to fine tune processors with shared memory solution
......@@ -67,6 +68,7 @@ static struct cpufreq_driver amd_pstate_epp_driver;
static int cppc_state = AMD_PSTATE_UNDEFINED;
static bool cppc_enabled;
static bool amd_pstate_prefcore = true;
static struct quirk_entry *quirks;
/*
* AMD Energy Preference Performance (EPP)
......@@ -111,6 +113,41 @@ static unsigned int epp_values[] = {
typedef int (*cppc_mode_transition_fn)(int);
static struct quirk_entry quirk_amd_7k62 = {
.nominal_freq = 2600,
.lowest_freq = 550,
};
static int __init dmi_matched_7k62_bios_bug(const struct dmi_system_id *dmi)
{
/**
* match the broken bios for family 17h processor support CPPC V2
* broken BIOS lack of nominal_freq and lowest_freq capabilities
* definition in ACPI tables
*/
if (boot_cpu_has(X86_FEATURE_ZEN2)) {
quirks = dmi->driver_data;
pr_info("Overriding nominal and lowest frequencies for %s\n", dmi->ident);
return 1;
}
return 0;
}
static const struct dmi_system_id amd_pstate_quirks_table[] __initconst = {
{
.callback = dmi_matched_7k62_bios_bug,
.ident = "AMD EPYC 7K62",
.matches = {
DMI_MATCH(DMI_BIOS_VERSION, "5.14"),
DMI_MATCH(DMI_BIOS_RELEASE, "12/12/2019"),
},
.driver_data = &quirk_amd_7k62,
},
{}
};
MODULE_DEVICE_TABLE(dmi, amd_pstate_quirks_table);
static inline int get_mode_idx_from_str(const char *str, size_t size)
{
int i;
......@@ -290,6 +327,21 @@ static inline int amd_pstate_enable(bool enable)
return static_call(amd_pstate_enable)(enable);
}
static u32 amd_pstate_highest_perf_set(struct amd_cpudata *cpudata)
{
struct cpuinfo_x86 *c = &cpu_data(0);
/*
* For AMD CPUs with Family ID 19H and Model ID range 0x70 to 0x7f,
* the highest performance level is set to 196.
* https://bugzilla.kernel.org/show_bug.cgi?id=218759
*/
if (c->x86 == 0x19 && (c->x86_model >= 0x70 && c->x86_model <= 0x7f))
return CPPC_HIGHEST_PERF_PERFORMANCE;
return CPPC_HIGHEST_PERF_DEFAULT;
}
static int pstate_init_perf(struct amd_cpudata *cpudata)
{
u64 cap1;
......@@ -306,7 +358,7 @@ static int pstate_init_perf(struct amd_cpudata *cpudata)
* the default max perf.
*/
if (cpudata->hw_prefcore)
highest_perf = AMD_PSTATE_PREFCORE_THRESHOLD;
highest_perf = amd_pstate_highest_perf_set(cpudata);
else
highest_perf = AMD_CPPC_HIGHEST_PERF(cap1);
......@@ -330,7 +382,7 @@ static int cppc_init_perf(struct amd_cpudata *cpudata)
return ret;
if (cpudata->hw_prefcore)
highest_perf = AMD_PSTATE_PREFCORE_THRESHOLD;
highest_perf = amd_pstate_highest_perf_set(cpudata);
else
highest_perf = cppc_perf.highest_perf;
......@@ -604,78 +656,6 @@ static void amd_pstate_adjust_perf(unsigned int cpu,
cpufreq_cpu_put(policy);
}
static int amd_get_min_freq(struct amd_cpudata *cpudata)
{
struct cppc_perf_caps cppc_perf;
int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
if (ret)
return ret;
/* Switch to khz */
return cppc_perf.lowest_freq * 1000;
}
static int amd_get_max_freq(struct amd_cpudata *cpudata)
{
struct cppc_perf_caps cppc_perf;
u32 max_perf, max_freq, nominal_freq, nominal_perf;
u64 boost_ratio;
int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
if (ret)
return ret;
nominal_freq = cppc_perf.nominal_freq;
nominal_perf = READ_ONCE(cpudata->nominal_perf);
max_perf = READ_ONCE(cpudata->highest_perf);
boost_ratio = div_u64(max_perf << SCHED_CAPACITY_SHIFT,
nominal_perf);
max_freq = nominal_freq * boost_ratio >> SCHED_CAPACITY_SHIFT;
/* Switch to khz */
return max_freq * 1000;
}
static int amd_get_nominal_freq(struct amd_cpudata *cpudata)
{
struct cppc_perf_caps cppc_perf;
int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
if (ret)
return ret;
/* Switch to khz */
return cppc_perf.nominal_freq * 1000;
}
static int amd_get_lowest_nonlinear_freq(struct amd_cpudata *cpudata)
{
struct cppc_perf_caps cppc_perf;
u32 lowest_nonlinear_freq, lowest_nonlinear_perf,
nominal_freq, nominal_perf;
u64 lowest_nonlinear_ratio;
int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
if (ret)
return ret;
nominal_freq = cppc_perf.nominal_freq;
nominal_perf = READ_ONCE(cpudata->nominal_perf);
lowest_nonlinear_perf = cppc_perf.lowest_nonlinear_perf;
lowest_nonlinear_ratio = div_u64(lowest_nonlinear_perf << SCHED_CAPACITY_SHIFT,
nominal_perf);
lowest_nonlinear_freq = nominal_freq * lowest_nonlinear_ratio >> SCHED_CAPACITY_SHIFT;
/* Switch to khz */
return lowest_nonlinear_freq * 1000;
}
static int amd_pstate_set_boost(struct cpufreq_policy *policy, int state)
{
struct amd_cpudata *cpudata = policy->driver_data;
......@@ -828,9 +808,93 @@ static void amd_pstate_update_limits(unsigned int cpu)
mutex_unlock(&amd_pstate_driver_lock);
}
/*
* Get pstate transition delay time from ACPI tables that firmware set
* instead of using hardcode value directly.
*/
static u32 amd_pstate_get_transition_delay_us(unsigned int cpu)
{
u32 transition_delay_ns;
transition_delay_ns = cppc_get_transition_latency(cpu);
if (transition_delay_ns == CPUFREQ_ETERNAL)
return AMD_PSTATE_TRANSITION_DELAY;
return transition_delay_ns / NSEC_PER_USEC;
}
/*
* Get pstate transition latency value from ACPI tables that firmware
* set instead of using hardcode value directly.
*/
static u32 amd_pstate_get_transition_latency(unsigned int cpu)
{
u32 transition_latency;
transition_latency = cppc_get_transition_latency(cpu);
if (transition_latency == CPUFREQ_ETERNAL)
return AMD_PSTATE_TRANSITION_LATENCY;
return transition_latency;
}
/*
* amd_pstate_init_freq: Initialize the max_freq, min_freq,
* nominal_freq and lowest_nonlinear_freq for
* the @cpudata object.
*
* Requires: highest_perf, lowest_perf, nominal_perf and
* lowest_nonlinear_perf members of @cpudata to be
* initialized.
*
* Returns 0 on success, non-zero value on failure.
*/
static int amd_pstate_init_freq(struct amd_cpudata *cpudata)
{
int ret;
u32 min_freq;
u32 highest_perf, max_freq;
u32 nominal_perf, nominal_freq;
u32 lowest_nonlinear_perf, lowest_nonlinear_freq;
u32 boost_ratio, lowest_nonlinear_ratio;
struct cppc_perf_caps cppc_perf;
ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
if (ret)
return ret;
if (quirks && quirks->lowest_freq)
min_freq = quirks->lowest_freq * 1000;
else
min_freq = cppc_perf.lowest_freq * 1000;
if (quirks && quirks->nominal_freq)
nominal_freq = quirks->nominal_freq ;
else
nominal_freq = cppc_perf.nominal_freq;
nominal_perf = READ_ONCE(cpudata->nominal_perf);
highest_perf = READ_ONCE(cpudata->highest_perf);
boost_ratio = div_u64(highest_perf << SCHED_CAPACITY_SHIFT, nominal_perf);
max_freq = (nominal_freq * boost_ratio >> SCHED_CAPACITY_SHIFT) * 1000;
lowest_nonlinear_perf = READ_ONCE(cpudata->lowest_nonlinear_perf);
lowest_nonlinear_ratio = div_u64(lowest_nonlinear_perf << SCHED_CAPACITY_SHIFT,
nominal_perf);
lowest_nonlinear_freq = (nominal_freq * lowest_nonlinear_ratio >> SCHED_CAPACITY_SHIFT) * 1000;
WRITE_ONCE(cpudata->min_freq, min_freq);
WRITE_ONCE(cpudata->lowest_nonlinear_freq, lowest_nonlinear_freq);
WRITE_ONCE(cpudata->nominal_freq, nominal_freq);
WRITE_ONCE(cpudata->max_freq, max_freq);
return 0;
}
static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
{
int min_freq, max_freq, nominal_freq, lowest_nonlinear_freq, ret;
int min_freq, max_freq, nominal_freq, ret;
struct device *dev;
struct amd_cpudata *cpudata;
......@@ -855,20 +919,25 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
if (ret)
goto free_cpudata1;
min_freq = amd_get_min_freq(cpudata);
max_freq = amd_get_max_freq(cpudata);
nominal_freq = amd_get_nominal_freq(cpudata);
lowest_nonlinear_freq = amd_get_lowest_nonlinear_freq(cpudata);
ret = amd_pstate_init_freq(cpudata);
if (ret)
goto free_cpudata1;
min_freq = READ_ONCE(cpudata->min_freq);
max_freq = READ_ONCE(cpudata->max_freq);
nominal_freq = READ_ONCE(cpudata->nominal_freq);
if (min_freq < 0 || max_freq < 0 || min_freq > max_freq) {
dev_err(dev, "min_freq(%d) or max_freq(%d) value is incorrect\n",
min_freq, max_freq);
if (min_freq <= 0 || max_freq <= 0 ||
nominal_freq <= 0 || min_freq > max_freq) {
dev_err(dev,
"min_freq(%d) or max_freq(%d) or nominal_freq (%d) value is incorrect, check _CPC in ACPI tables\n",
min_freq, max_freq, nominal_freq);
ret = -EINVAL;
goto free_cpudata1;
}
policy->cpuinfo.transition_latency = AMD_PSTATE_TRANSITION_LATENCY;
policy->transition_delay_us = AMD_PSTATE_TRANSITION_DELAY;
policy->cpuinfo.transition_latency = amd_pstate_get_transition_latency(policy->cpu);
policy->transition_delay_us = amd_pstate_get_transition_delay_us(policy->cpu);
policy->min = min_freq;
policy->max = max_freq;
......@@ -896,13 +965,8 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
goto free_cpudata2;
}
/* Initial processor data capability frequencies */
cpudata->max_freq = max_freq;
cpudata->min_freq = min_freq;
cpudata->max_limit_freq = max_freq;
cpudata->min_limit_freq = min_freq;
cpudata->nominal_freq = nominal_freq;
cpudata->lowest_nonlinear_freq = lowest_nonlinear_freq;
policy->driver_data = cpudata;
......@@ -966,7 +1030,7 @@ static ssize_t show_amd_pstate_max_freq(struct cpufreq_policy *policy,
int max_freq;
struct amd_cpudata *cpudata = policy->driver_data;
max_freq = amd_get_max_freq(cpudata);
max_freq = READ_ONCE(cpudata->max_freq);
if (max_freq < 0)
return max_freq;
......@@ -979,7 +1043,7 @@ static ssize_t show_amd_pstate_lowest_nonlinear_freq(struct cpufreq_policy *poli
int freq;
struct amd_cpudata *cpudata = policy->driver_data;
freq = amd_get_lowest_nonlinear_freq(cpudata);
freq = READ_ONCE(cpudata->lowest_nonlinear_freq);
if (freq < 0)
return freq;
......@@ -1290,7 +1354,7 @@ static bool amd_pstate_acpi_pm_profile_undefined(void)
static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
{
int min_freq, max_freq, nominal_freq, lowest_nonlinear_freq, ret;
int min_freq, max_freq, nominal_freq, ret;
struct amd_cpudata *cpudata;
struct device *dev;
u64 value;
......@@ -1317,13 +1381,18 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
if (ret)
goto free_cpudata1;
min_freq = amd_get_min_freq(cpudata);
max_freq = amd_get_max_freq(cpudata);
nominal_freq = amd_get_nominal_freq(cpudata);
lowest_nonlinear_freq = amd_get_lowest_nonlinear_freq(cpudata);
if (min_freq < 0 || max_freq < 0 || min_freq > max_freq) {
dev_err(dev, "min_freq(%d) or max_freq(%d) value is incorrect\n",
min_freq, max_freq);
ret = amd_pstate_init_freq(cpudata);
if (ret)
goto free_cpudata1;
min_freq = READ_ONCE(cpudata->min_freq);
max_freq = READ_ONCE(cpudata->max_freq);
nominal_freq = READ_ONCE(cpudata->nominal_freq);
if (min_freq <= 0 || max_freq <= 0 ||
nominal_freq <= 0 || min_freq > max_freq) {
dev_err(dev,
"min_freq(%d) or max_freq(%d) or nominal_freq(%d) value is incorrect, check _CPC in ACPI tables\n",
min_freq, max_freq, nominal_freq);
ret = -EINVAL;
goto free_cpudata1;
}
......@@ -1333,12 +1402,6 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
/* It will be updated by governor */
policy->cur = policy->cpuinfo.min_freq;
/* Initial processor data capability frequencies */
cpudata->max_freq = max_freq;
cpudata->min_freq = min_freq;
cpudata->nominal_freq = nominal_freq;
cpudata->lowest_nonlinear_freq = lowest_nonlinear_freq;
policy->driver_data = cpudata;
cpudata->epp_cached = amd_pstate_get_epp(cpudata, 0);
......@@ -1656,6 +1719,11 @@ static int __init amd_pstate_init(void)
if (cpufreq_get_current_driver())
return -EEXIST;
quirks = NULL;
/* check if this machine need CPPC quirks */
dmi_check_system(amd_pstate_quirks_table);
switch (cppc_state) {
case AMD_PSTATE_UNDEFINED:
/* Disable on the following configs by default:
......
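To make the fixed-point arithmetic in amd_pstate_init_freq() above
concrete, here is a hedged userspace mirror of the max_freq computation
with illustrative CPPC numbers (166 and 196 are the perf constants from
the hunk; 2600 MHz matches the 7K62 quirk's nominal frequency):

    #include <stdio.h>
    #include <stdint.h>

    #define SCHED_CAPACITY_SHIFT 10 /* same fixed-point shift as the kernel */

    int main(void)
    {
            /* Illustrative CPPC capabilities, not read from real hardware */
            uint64_t nominal_freq = 2600;   /* MHz */
            uint64_t nominal_perf = 166;
            uint64_t highest_perf = 196;

            /* boost_ratio is a 10-bit fixed-point fraction: highest/nominal */
            uint64_t boost_ratio = (highest_perf << SCHED_CAPACITY_SHIFT) /
                                   nominal_perf;
            uint64_t max_freq_khz = ((nominal_freq * boost_ratio) >>
                                     SCHED_CAPACITY_SHIFT) * 1000;

            /* 196/166 is about 1.18, so 2600 MHz scales to ~3069000 kHz */
            printf("max_freq = %llu kHz\n", (unsigned long long)max_freq_khz);
            return 0;
    }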
......@@ -481,9 +481,12 @@ static bool brcm_avs_is_firmware_loaded(struct private_data *priv)
static unsigned int brcm_avs_cpufreq_get(unsigned int cpu)
{
struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
struct private_data *priv;
if (!policy)
return 0;
struct private_data *priv = policy->driver_data;
priv = policy->driver_data;
cpufreq_cpu_put(policy);
......
......@@ -741,10 +741,15 @@ static unsigned int cppc_cpufreq_get_rate(unsigned int cpu)
{
struct cppc_perf_fb_ctrs fb_ctrs_t0 = {0}, fb_ctrs_t1 = {0};
struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
struct cppc_cpudata *cpu_data = policy->driver_data;
struct cppc_cpudata *cpu_data;
u64 delivered_perf;
int ret;
if (!policy)
return -ENODEV;
cpu_data = policy->driver_data;
cpufreq_cpu_put(policy);
ret = cppc_get_perf_ctrs(cpu, &fb_ctrs_t0);
......@@ -822,10 +827,15 @@ static struct cpufreq_driver cppc_cpufreq_driver = {
static unsigned int hisi_cppc_cpufreq_get_rate(unsigned int cpu)
{
struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
struct cppc_cpudata *cpu_data = policy->driver_data;
struct cppc_cpudata *cpu_data;
u64 desired_perf;
int ret;
if (!policy)
return -ENODEV;
cpu_data = policy->driver_data;
cpufreq_cpu_put(policy);
ret = cppc_get_desired_perf(cpu, &desired_perf);
......
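Both hunks above apply the same fix: take the policy reference first,
bail out if the CPU has no policy (e.g. it is offline), and only then
dereference driver_data. A kernel-context sketch of the pattern, using a
hypothetical driver structure in place of struct cppc_cpudata:

    #include <linux/cpufreq.h>

    /* Hypothetical per-policy state, standing in for struct cppc_cpudata */
    struct example_data {
            unsigned int cached_khz;
    };

    static unsigned int example_get_rate(unsigned int cpu)
    {
            struct cpufreq_policy *policy = cpufreq_cpu_get(cpu);
            struct example_data *data;

            if (!policy)    /* no policy: dereferencing would oops */
                    return 0;

            data = policy->driver_data;
            cpufreq_cpu_put(policy);        /* drop the _get() reference */

            return data->cached_khz;
    }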
......@@ -104,6 +104,9 @@ static const struct of_device_id allowlist[] __initconst = {
*/
static const struct of_device_id blocklist[] __initconst = {
{ .compatible = "allwinner,sun50i-h6", },
{ .compatible = "allwinner,sun50i-h616", },
{ .compatible = "allwinner,sun50i-h618", },
{ .compatible = "allwinner,sun50i-h700", },
{ .compatible = "apple,arm-platform", },
......@@ -195,19 +198,18 @@ static const struct of_device_id blocklist[] __initconst = {
static bool __init cpu0_node_has_opp_v2_prop(void)
{
struct device_node *np = of_cpu_device_node_get(0);
struct device_node *np __free(device_node) = of_cpu_device_node_get(0);
bool ret = false;
if (of_property_present(np, "operating-points-v2"))
ret = true;
of_node_put(np);
return ret;
}
static int __init cpufreq_dt_platdev_init(void)
{
struct device_node *np = of_find_node_by_path("/");
struct device_node *np __free(device_node) = of_find_node_by_path("/");
const struct of_device_id *match;
const void *data = NULL;
......@@ -223,11 +225,9 @@ static int __init cpufreq_dt_platdev_init(void)
if (cpu0_node_has_opp_v2_prop() && !of_match_node(blocklist, np))
goto create_pdev;
of_node_put(np);
return -ENODEV;
create_pdev:
of_node_put(np);
return PTR_ERR_OR_ZERO(platform_device_register_data(NULL, "cpufreq-dt",
-1, data,
sizeof(struct cpufreq_dt_platform_data)));
......
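The __free(device_node) annotations above use the kernel's scope-based
cleanup helpers from linux/cleanup.h: of_node_put() runs automatically
when the pointer goes out of scope, which is why the explicit puts on
each return path could be deleted. A small kernel-context sketch under
that assumption (function name is illustrative):

    #include <linux/cleanup.h>
    #include <linux/of.h>

    static bool example_root_has_prop(const char *prop)
    {
            struct device_node *np __free(device_node) =
                    of_find_node_by_path("/");

            /*
             * of_property_present() tolerates a NULL node, and
             * of_node_put(np) is emitted on every return path.
             */
            return of_property_present(np, prop);
    }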
......@@ -68,12 +68,9 @@ static int set_target(struct cpufreq_policy *policy, unsigned int index)
*/
static const char *find_supply_name(struct device *dev)
{
struct device_node *np;
struct device_node *np __free(device_node) = of_node_get(dev->of_node);
struct property *pp;
int cpu = dev->id;
const char *name = NULL;
np = of_node_get(dev->of_node);
/* This must be valid for sure */
if (WARN_ON(!np))
......@@ -82,22 +79,16 @@ static const char *find_supply_name(struct device *dev)
/* Try "cpu0" for older DTs */
if (!cpu) {
pp = of_find_property(np, "cpu0-supply", NULL);
if (pp) {
name = "cpu0";
goto node_put;
}
if (pp)
return "cpu0";
}
pp = of_find_property(np, "cpu-supply", NULL);
if (pp) {
name = "cpu";
goto node_put;
}
if (pp)
return "cpu";
dev_dbg(dev, "no regulator for cpu%d\n", cpu);
node_put:
of_node_put(np);
return name;
return NULL;
}
static int cpufreq_init(struct cpufreq_policy *policy)
......
......@@ -1679,10 +1679,13 @@ static void __cpufreq_offline(unsigned int cpu, struct cpufreq_policy *policy)
*/
if (cpufreq_driver->offline) {
cpufreq_driver->offline(policy);
} else if (cpufreq_driver->exit) {
cpufreq_driver->exit(policy);
policy->freq_table = NULL;
return;
}
if (cpufreq_driver->exit)
cpufreq_driver->exit(policy);
policy->freq_table = NULL;
}
static int cpufreq_offline(unsigned int cpu)
......@@ -1740,7 +1743,7 @@ static void cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif)
}
/* We did light-weight exit earlier, do full tear down now */
if (cpufreq_driver->offline)
if (cpufreq_driver->offline && cpufreq_driver->exit)
cpufreq_driver->exit(policy);
up_write(&policy->rwsem);
......
......@@ -70,7 +70,7 @@ int cpufreq_frequency_table_verify(struct cpufreq_policy_data *policy,
struct cpufreq_frequency_table *table)
{
struct cpufreq_frequency_table *pos;
unsigned int freq, next_larger = ~0;
unsigned int freq, prev_smaller = 0;
bool found = false;
pr_debug("request for verification of policy (%u - %u kHz) for cpu %u\n",
......@@ -86,12 +86,12 @@ int cpufreq_frequency_table_verify(struct cpufreq_policy_data *policy,
break;
}
if ((next_larger > freq) && (freq > policy->max))
next_larger = freq;
if ((prev_smaller < freq) && (freq <= policy->max))
prev_smaller = freq;
}
if (!found) {
policy->max = next_larger;
policy->max = prev_smaller;
cpufreq_verify_within_cpu_limits(policy);
}
......@@ -194,7 +194,7 @@ int cpufreq_table_index_unsorted(struct cpufreq_policy *policy,
}
if (optimal.driver_data > i) {
if (suboptimal.driver_data > i) {
WARN(1, "Invalid frequency table: %d\n", policy->cpu);
WARN(1, "Invalid frequency table: %u\n", policy->cpu);
return 0;
}
......@@ -254,7 +254,7 @@ static ssize_t show_available_freqs(struct cpufreq_policy *policy, char *buf,
if (show_boost ^ (pos->flags & CPUFREQ_BOOST_FREQ))
continue;
count += sprintf(&buf[count], "%d ", pos->frequency);
count += sprintf(&buf[count], "%u ", pos->frequency);
}
count += sprintf(&buf[count], "\n");
......
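The verify hunk above changes which table entry policy->max falls back
to when the requested maximum is not an exact table frequency: the old
code picked the next larger entry, which could exceed a frequency QoS
cap, while the new code clamps to the closest entry at or below it. A
userspace mirror with illustrative numbers:

    #include <stdio.h>

    int main(void)
    {
            /* Illustrative frequency table (kHz) and a QoS-capped maximum */
            unsigned int table[] = { 400000, 800000, 1200000, 1600000 };
            unsigned int policy_max = 1000000;      /* clamped by freq QoS */
            unsigned int prev_smaller = 0;

            for (int i = 0; i < 4; i++)
                    if (table[i] <= policy_max && table[i] > prev_smaller)
                            prev_smaller = table[i];

            /*
             * The old code would have picked 1200000 kHz, above the QoS
             * cap; the new code clamps policy->max to 800000 kHz instead.
             */
            printf("policy->max = %u kHz\n", prev_smaller);
            return 0;
    }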
......@@ -173,7 +173,6 @@ struct vid_data {
* based on the MSR_IA32_MISC_ENABLE value and whether or
* not the maximum reported turbo P-state is different from
* the maximum reported non-turbo one.
* @turbo_disabled_mf: The @turbo_disabled value reflected by cpuinfo.max_freq.
* @min_perf_pct: Minimum capacity limit in percent of the maximum turbo
* P-state capacity.
* @max_perf_pct: Maximum capacity limit in percent of the maximum turbo
......@@ -182,7 +181,6 @@ struct vid_data {
struct global_params {
bool no_turbo;
bool turbo_disabled;
bool turbo_disabled_mf;
int max_perf_pct;
int min_perf_pct;
};
......@@ -213,7 +211,7 @@ struct global_params {
* @epp_policy: Last saved policy used to set EPP/EPB
* @epp_default: Power on default HWP energy performance
* preference/bias
* @epp_cached Cached HWP energy-performance preference value
* @epp_cached: Cached HWP energy-performance preference value
* @hwp_req_cached: Cached value of the last HWP Request MSR
* @hwp_cap_cached: Cached value of the last HWP Capabilities MSR
* @last_io_update: Last time when IO wake flag was set
......@@ -292,11 +290,11 @@ struct pstate_funcs {
static struct pstate_funcs pstate_funcs __read_mostly;
static int hwp_active __read_mostly;
static int hwp_mode_bdw __read_mostly;
static bool per_cpu_limits __read_mostly;
static bool hwp_active __ro_after_init;
static int hwp_mode_bdw __ro_after_init;
static bool per_cpu_limits __ro_after_init;
static bool hwp_forced __ro_after_init;
static bool hwp_boost __read_mostly;
static bool hwp_forced __read_mostly;
static struct cpufreq_driver *intel_pstate_driver __read_mostly;
......@@ -594,12 +592,13 @@ static void intel_pstate_hybrid_hwp_adjust(struct cpudata *cpu)
cpu->pstate.min_pstate = intel_pstate_freq_to_hwp(cpu, freq);
}
static inline void update_turbo_state(void)
static bool turbo_is_disabled(void)
{
u64 misc_en;
rdmsrl(MSR_IA32_MISC_ENABLE, misc_en);
global.turbo_disabled = misc_en & MSR_IA32_MISC_ENABLE_TURBO_DISABLE;
return !!(misc_en & MSR_IA32_MISC_ENABLE_TURBO_DISABLE);
}
static int min_perf_pct_min(void)
......@@ -1154,12 +1153,15 @@ static void intel_pstate_update_policies(void)
static void __intel_pstate_update_max_freq(struct cpudata *cpudata,
struct cpufreq_policy *policy)
{
policy->cpuinfo.max_freq = global.turbo_disabled_mf ?
intel_pstate_get_hwp_cap(cpudata);
policy->cpuinfo.max_freq = READ_ONCE(global.no_turbo) ?
cpudata->pstate.max_freq : cpudata->pstate.turbo_freq;
refresh_frequency_limits(policy);
}
static void intel_pstate_update_max_freq(unsigned int cpu)
static void intel_pstate_update_limits(unsigned int cpu)
{
struct cpufreq_policy *policy = cpufreq_cpu_acquire(cpu);
......@@ -1171,25 +1173,12 @@ static void intel_pstate_update_max_freq(unsigned int cpu)
cpufreq_cpu_release(policy);
}
static void intel_pstate_update_limits(unsigned int cpu)
static void intel_pstate_update_limits_for_all(void)
{
mutex_lock(&intel_pstate_driver_lock);
update_turbo_state();
/*
* If turbo has been turned on or off globally, policy limits for
* all CPUs need to be updated to reflect that.
*/
if (global.turbo_disabled_mf != global.turbo_disabled) {
global.turbo_disabled_mf = global.turbo_disabled;
arch_set_max_freq_ratio(global.turbo_disabled);
for_each_possible_cpu(cpu)
intel_pstate_update_max_freq(cpu);
} else {
cpufreq_update_policy(cpu);
}
int cpu;
mutex_unlock(&intel_pstate_driver_lock);
for_each_possible_cpu(cpu)
intel_pstate_update_limits(cpu);
}
/************************** sysfs begin ************************/
......@@ -1287,11 +1276,7 @@ static ssize_t show_no_turbo(struct kobject *kobj,
return -EAGAIN;
}
update_turbo_state();
if (global.turbo_disabled)
ret = sprintf(buf, "%u\n", global.turbo_disabled);
else
ret = sprintf(buf, "%u\n", global.no_turbo);
ret = sprintf(buf, "%u\n", global.no_turbo);
mutex_unlock(&intel_pstate_driver_lock);
......@@ -1302,32 +1287,34 @@ static ssize_t store_no_turbo(struct kobject *a, struct kobj_attribute *b,
const char *buf, size_t count)
{
unsigned int input;
int ret;
bool no_turbo;
ret = sscanf(buf, "%u", &input);
if (ret != 1)
if (sscanf(buf, "%u", &input) != 1)
return -EINVAL;
mutex_lock(&intel_pstate_driver_lock);
if (!intel_pstate_driver) {
mutex_unlock(&intel_pstate_driver_lock);
return -EAGAIN;
count = -EAGAIN;
goto unlock_driver;
}
mutex_lock(&intel_pstate_limits_lock);
no_turbo = !!clamp_t(int, input, 0, 1);
if (no_turbo == global.no_turbo)
goto unlock_driver;
update_turbo_state();
if (global.turbo_disabled) {
pr_notice_once("Turbo disabled by BIOS or unavailable on processor\n");
mutex_unlock(&intel_pstate_limits_lock);
mutex_unlock(&intel_pstate_driver_lock);
return -EPERM;
count = -EPERM;
goto unlock_driver;
}
global.no_turbo = clamp_t(int, input, 0, 1);
WRITE_ONCE(global.no_turbo, no_turbo);
if (global.no_turbo) {
mutex_lock(&intel_pstate_limits_lock);
if (no_turbo) {
struct cpudata *cpu = all_cpu_data[0];
int pct = cpu->pstate.max_pstate * 100 / cpu->pstate.turbo_pstate;
......@@ -1338,9 +1325,10 @@ static ssize_t store_no_turbo(struct kobject *a, struct kobj_attribute *b,
mutex_unlock(&intel_pstate_limits_lock);
intel_pstate_update_policies();
arch_set_max_freq_ratio(global.no_turbo);
intel_pstate_update_limits_for_all();
arch_set_max_freq_ratio(no_turbo);
unlock_driver:
mutex_unlock(&intel_pstate_driver_lock);
return count;
......@@ -1621,7 +1609,6 @@ static void intel_pstate_notify_work(struct work_struct *work)
struct cpufreq_policy *policy = cpufreq_cpu_acquire(cpudata->cpu);
if (policy) {
intel_pstate_get_hwp_cap(cpudata);
__intel_pstate_update_max_freq(cpudata, policy);
cpufreq_cpu_release(policy);
......@@ -1636,11 +1623,10 @@ static cpumask_t hwp_intr_enable_mask;
void notify_hwp_interrupt(void)
{
unsigned int this_cpu = smp_processor_id();
struct cpudata *cpudata;
unsigned long flags;
u64 value;
if (!READ_ONCE(hwp_active) || !boot_cpu_has(X86_FEATURE_HWP_NOTIFY))
if (!hwp_active || !boot_cpu_has(X86_FEATURE_HWP_NOTIFY))
return;
rdmsrl_safe(MSR_HWP_STATUS, &value);
......@@ -1652,24 +1638,8 @@ void notify_hwp_interrupt(void)
if (!cpumask_test_cpu(this_cpu, &hwp_intr_enable_mask))
goto ack_intr;
/*
* Currently we never free all_cpu_data. And we can't reach here
* without this allocated. But for safety for future changes, added
* check.
*/
if (unlikely(!READ_ONCE(all_cpu_data)))
goto ack_intr;
/*
* The free is done during cleanup, when cpufreq registry is failed.
* We wouldn't be here if it fails on init or switch status. But for
* future changes, added check.
*/
cpudata = READ_ONCE(all_cpu_data[this_cpu]);
if (unlikely(!cpudata))
goto ack_intr;
schedule_delayed_work(&cpudata->hwp_notify_work, msecs_to_jiffies(10));
schedule_delayed_work(&all_cpu_data[this_cpu]->hwp_notify_work,
msecs_to_jiffies(10));
spin_unlock_irqrestore(&hwp_notify_lock, flags);
......@@ -1682,7 +1652,7 @@ void notify_hwp_interrupt(void)
static void intel_pstate_disable_hwp_interrupt(struct cpudata *cpudata)
{
unsigned long flags;
bool cancel_work;
if (!boot_cpu_has(X86_FEATURE_HWP_NOTIFY))
return;
......@@ -1690,22 +1660,22 @@ static void intel_pstate_disable_hwp_interrupt(struct cpudata *cpudata)
/* wrmsrl_on_cpu has to be outside spinlock as this can result in IPC */
wrmsrl_on_cpu(cpudata->cpu, MSR_HWP_INTERRUPT, 0x00);
spin_lock_irqsave(&hwp_notify_lock, flags);
if (cpumask_test_and_clear_cpu(cpudata->cpu, &hwp_intr_enable_mask))
cancel_delayed_work(&cpudata->hwp_notify_work);
spin_unlock_irqrestore(&hwp_notify_lock, flags);
spin_lock_irq(&hwp_notify_lock);
cancel_work = cpumask_test_and_clear_cpu(cpudata->cpu, &hwp_intr_enable_mask);
spin_unlock_irq(&hwp_notify_lock);
if (cancel_work)
cancel_delayed_work_sync(&cpudata->hwp_notify_work);
}
static void intel_pstate_enable_hwp_interrupt(struct cpudata *cpudata)
{
/* Enable HWP notification interrupt for guaranteed performance change */
if (boot_cpu_has(X86_FEATURE_HWP_NOTIFY)) {
unsigned long flags;
spin_lock_irqsave(&hwp_notify_lock, flags);
spin_lock_irq(&hwp_notify_lock);
INIT_DELAYED_WORK(&cpudata->hwp_notify_work, intel_pstate_notify_work);
cpumask_set_cpu(cpudata->cpu, &hwp_intr_enable_mask);
spin_unlock_irqrestore(&hwp_notify_lock, flags);
spin_unlock_irq(&hwp_notify_lock);
/* wrmsrl_on_cpu has to be outside spinlock as this can result in IPC */
wrmsrl_on_cpu(cpudata->cpu, MSR_HWP_INTERRUPT, 0x01);
......@@ -1791,7 +1761,7 @@ static u64 atom_get_val(struct cpudata *cpudata, int pstate)
u32 vid;
val = (u64)pstate << 8;
if (global.no_turbo && !global.turbo_disabled)
if (READ_ONCE(global.no_turbo) && !global.turbo_disabled)
val |= (u64)1 << 32;
vid_fp = cpudata->vid.min + mul_fp(
......@@ -1956,7 +1926,7 @@ static u64 core_get_val(struct cpudata *cpudata, int pstate)
u64 val;
val = (u64)pstate << 8;
if (global.no_turbo && !global.turbo_disabled)
if (READ_ONCE(global.no_turbo) && !global.turbo_disabled)
val |= (u64)1 << 32;
return val;
......@@ -2029,14 +1999,6 @@ static void intel_pstate_set_min_pstate(struct cpudata *cpu)
intel_pstate_set_pstate(cpu, cpu->pstate.min_pstate);
}
static void intel_pstate_max_within_limits(struct cpudata *cpu)
{
int pstate = max(cpu->pstate.min_pstate, cpu->max_perf_ratio);
update_turbo_state();
intel_pstate_set_pstate(cpu, pstate);
}
static void intel_pstate_get_cpu_pstates(struct cpudata *cpu)
{
int perf_ctl_max_phys = pstate_funcs.get_max_physical(cpu->cpu);
......@@ -2262,7 +2224,7 @@ static inline int32_t get_target_pstate(struct cpudata *cpu)
sample->busy_scaled = busy_frac * 100;
target = global.no_turbo || global.turbo_disabled ?
target = READ_ONCE(global.no_turbo) ?
cpu->pstate.max_pstate : cpu->pstate.turbo_pstate;
target += target >> 2;
target = mul_fp(target, busy_frac);
......@@ -2306,8 +2268,6 @@ static void intel_pstate_adjust_pstate(struct cpudata *cpu)
struct sample *sample;
int target_pstate;
update_turbo_state();
target_pstate = get_target_pstate(cpu);
target_pstate = intel_pstate_prepare_request(cpu, target_pstate);
trace_cpu_frequency(target_pstate * cpu->pstate.scaling, cpu->cpu);
......@@ -2437,6 +2397,7 @@ static const struct x86_cpu_id intel_pstate_cpu_ids[] = {
};
MODULE_DEVICE_TABLE(x86cpu, intel_pstate_cpu_ids);
#ifdef CONFIG_ACPI
static const struct x86_cpu_id intel_pstate_cpu_oob_ids[] __initconst = {
X86_MATCH(BROADWELL_D, core_funcs),
X86_MATCH(BROADWELL_X, core_funcs),
......@@ -2445,6 +2406,7 @@ static const struct x86_cpu_id intel_pstate_cpu_oob_ids[] __initconst = {
X86_MATCH(SAPPHIRERAPIDS_X, core_funcs),
{}
};
#endif
static const struct x86_cpu_id intel_pstate_cpu_ee_disable_ids[] = {
X86_MATCH(KABYLAKE, core_funcs),
......@@ -2526,7 +2488,7 @@ static void intel_pstate_clear_update_util_hook(unsigned int cpu)
static int intel_pstate_get_max_freq(struct cpudata *cpu)
{
return global.turbo_disabled || global.no_turbo ?
return READ_ONCE(global.no_turbo) ?
cpu->pstate.max_freq : cpu->pstate.turbo_freq;
}
......@@ -2611,12 +2573,14 @@ static int intel_pstate_set_policy(struct cpufreq_policy *policy)
intel_pstate_update_perf_limits(cpu, policy->min, policy->max);
if (cpu->policy == CPUFREQ_POLICY_PERFORMANCE) {
int pstate = max(cpu->pstate.min_pstate, cpu->max_perf_ratio);
/*
* NOHZ_FULL CPUs need this as the governor callback may not
* be invoked on them.
*/
intel_pstate_clear_update_util_hook(policy->cpu);
intel_pstate_max_within_limits(cpu);
intel_pstate_set_pstate(cpu, pstate);
} else {
intel_pstate_set_update_util_hook(policy->cpu);
}
......@@ -2659,10 +2623,9 @@ static void intel_pstate_verify_cpu_policy(struct cpudata *cpu,
{
int max_freq;
update_turbo_state();
if (hwp_active) {
intel_pstate_get_hwp_cap(cpu);
max_freq = global.no_turbo || global.turbo_disabled ?
max_freq = READ_ONCE(global.no_turbo) ?
cpu->pstate.max_freq : cpu->pstate.turbo_freq;
} else {
max_freq = intel_pstate_get_max_freq(cpu);
......@@ -2756,9 +2719,7 @@ static int __intel_pstate_cpu_init(struct cpufreq_policy *policy)
/* cpuinfo and default policy values */
policy->cpuinfo.min_freq = cpu->pstate.min_freq;
update_turbo_state();
global.turbo_disabled_mf = global.turbo_disabled;
policy->cpuinfo.max_freq = global.turbo_disabled ?
policy->cpuinfo.max_freq = READ_ONCE(global.no_turbo) ?
cpu->pstate.max_freq : cpu->pstate.turbo_freq;
policy->min = policy->cpuinfo.min_freq;
......@@ -2923,8 +2884,6 @@ static int intel_cpufreq_target(struct cpufreq_policy *policy,
struct cpufreq_freqs freqs;
int target_pstate;
update_turbo_state();
freqs.old = policy->cur;
freqs.new = target_freq;
......@@ -2946,8 +2905,6 @@ static unsigned int intel_cpufreq_fast_switch(struct cpufreq_policy *policy,
struct cpudata *cpu = all_cpu_data[policy->cpu];
int target_pstate;
update_turbo_state();
target_pstate = intel_pstate_freq_to_hwp(cpu, target_freq);
target_pstate = intel_cpufreq_update_pstate(policy, target_pstate, true);
......@@ -2965,9 +2922,9 @@ static void intel_cpufreq_adjust_perf(unsigned int cpunum,
int old_pstate = cpu->pstate.current_pstate;
int cap_pstate, min_pstate, max_pstate, target_pstate;
update_turbo_state();
cap_pstate = global.turbo_disabled ? HWP_GUARANTEED_PERF(hwp_cap) :
HWP_HIGHEST_PERF(hwp_cap);
cap_pstate = READ_ONCE(global.no_turbo) ?
HWP_GUARANTEED_PERF(hwp_cap) :
HWP_HIGHEST_PERF(hwp_cap);
/* Optimization: Avoid unnecessary divisions. */
......@@ -3135,10 +3092,8 @@ static void intel_pstate_driver_cleanup(void)
if (intel_pstate_driver == &intel_pstate)
intel_pstate_clear_update_util_hook(cpu);
spin_lock(&hwp_notify_lock);
kfree(all_cpu_data[cpu]);
WRITE_ONCE(all_cpu_data[cpu], NULL);
spin_unlock(&hwp_notify_lock);
}
}
cpus_read_unlock();
......@@ -3155,6 +3110,10 @@ static int intel_pstate_register_driver(struct cpufreq_driver *driver)
memset(&global, 0, sizeof(global));
global.max_perf_pct = 100;
global.turbo_disabled = turbo_is_disabled();
global.no_turbo = global.turbo_disabled;
arch_set_max_freq_ratio(global.turbo_disabled);
intel_pstate_driver = driver;
ret = cpufreq_register_driver(intel_pstate_driver);
......@@ -3466,7 +3425,7 @@ static int __init intel_pstate_init(void)
* deal with it.
*/
if ((!no_hwp && boot_cpu_has(X86_FEATURE_HWP_EPP)) || hwp_forced) {
WRITE_ONCE(hwp_active, 1);
hwp_active = true;
hwp_mode_bdw = id->driver_data;
intel_pstate.attr = hwp_cpufreq_attrs;
intel_cpufreq.attr = hwp_cpufreq_attrs;
......
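As a side note on turbo_is_disabled() above: it distills the old
update_turbo_state() into a one-shot read of the turbo-disable bit in
MSR_IA32_MISC_ENABLE at driver registration, after which only
global.no_turbo changes. A hedged userspace illustration of the bit test
(bit 38 per the kernel's MSR definitions; the sample value is made up):

    #include <stdio.h>
    #include <stdint.h>

    /* Bit 38 of MSR_IA32_MISC_ENABLE: IDA/turbo disable */
    #define MISC_ENABLE_TURBO_DISABLE (1ULL << 38)

    int main(void)
    {
            uint64_t misc_en = MISC_ENABLE_TURBO_DISABLE; /* BIOS set it */
            int turbo_disabled = !!(misc_en & MISC_ENABLE_TURBO_DISABLE);

            /*
             * With the rework, this is sampled once at registration and
             * seeds both global.turbo_disabled and global.no_turbo.
             */
            printf("turbo disabled: %d\n", turbo_disabled);
            return 0;
    }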
......@@ -707,6 +707,15 @@ static const struct mtk_cpufreq_platform_data mt7623_platform_data = {
.ccifreq_supported = false,
};
static const struct mtk_cpufreq_platform_data mt7988_platform_data = {
.min_volt_shift = 100000,
.max_volt_shift = 200000,
.proc_max_volt = 900000,
.sram_min_volt = 0,
.sram_max_volt = 1150000,
.ccifreq_supported = true,
};
static const struct mtk_cpufreq_platform_data mt8183_platform_data = {
.min_volt_shift = 100000,
.max_volt_shift = 200000,
......@@ -740,6 +749,7 @@ static const struct of_device_id mtk_cpufreq_machines[] __initconst = {
{ .compatible = "mediatek,mt2712", .data = &mt2701_platform_data },
{ .compatible = "mediatek,mt7622", .data = &mt7622_platform_data },
{ .compatible = "mediatek,mt7623", .data = &mt7623_platform_data },
{ .compatible = "mediatek,mt7988a", .data = &mt7988_platform_data },
{ .compatible = "mediatek,mt8167", .data = &mt8516_platform_data },
{ .compatible = "mediatek,mt817x", .data = &mt2701_platform_data },
{ .compatible = "mediatek,mt8173", .data = &mt2701_platform_data },
......
......@@ -10,6 +10,7 @@
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
#include <linux/arm-smccc.h>
#include <linux/cpu.h>
#include <linux/module.h>
#include <linux/nvmem-consumer.h>
......@@ -18,26 +19,155 @@
#include <linux/pm_opp.h>
#include <linux/slab.h>
#define MAX_NAME_LEN 7
#define NVMEM_MASK 0x7
#define NVMEM_SHIFT 5
static struct platform_device *cpufreq_dt_pdev, *sun50i_cpufreq_pdev;
struct sunxi_cpufreq_data {
u32 (*efuse_xlate)(u32 speedbin);
};
static u32 sun50i_h6_efuse_xlate(u32 speedbin)
{
u32 efuse_value;
efuse_value = (speedbin >> NVMEM_SHIFT) & NVMEM_MASK;
/*
* We treat unexpected efuse values as if the SoC was from
* the slowest bin. Expected efuse values are 1-3, slowest
* to fastest.
*/
if (efuse_value >= 1 && efuse_value <= 3)
return efuse_value - 1;
else
return 0;
}
static int get_soc_id_revision(void)
{
#ifdef CONFIG_HAVE_ARM_SMCCC_DISCOVERY
return arm_smccc_get_soc_id_revision();
#else
return SMCCC_RET_NOT_SUPPORTED;
#endif
}
/*
* Judging by the OPP tables in the vendor BSP, the quality order of the
* returned speedbin index is 4 -> 0/2 -> 3 -> 1, from worst to best.
* 0 and 2 seem identical from the OPP tables' point of view.
*/
static u32 sun50i_h616_efuse_xlate(u32 speedbin)
{
int ver_bits = get_soc_id_revision();
u32 value = 0;
switch (speedbin & 0xffff) {
case 0x2000:
value = 0;
break;
case 0x2400:
case 0x7400:
case 0x2c00:
case 0x7c00:
if (ver_bits != SMCCC_RET_NOT_SUPPORTED && ver_bits <= 1) {
/* ic version A/B */
value = 1;
} else {
/* ic version C and later version */
value = 2;
}
break;
case 0x5000:
case 0x5400:
case 0x6000:
value = 3;
break;
case 0x5c00:
value = 4;
break;
case 0x5d00:
value = 0;
break;
default:
pr_warn("sun50i-cpufreq-nvmem: unknown speed bin 0x%x, using default bin 0\n",
speedbin & 0xffff);
value = 0;
break;
}
return value;
}
static struct sunxi_cpufreq_data sun50i_h6_cpufreq_data = {
.efuse_xlate = sun50i_h6_efuse_xlate,
};
static struct sunxi_cpufreq_data sun50i_h616_cpufreq_data = {
.efuse_xlate = sun50i_h616_efuse_xlate,
};
static const struct of_device_id cpu_opp_match_list[] = {
{ .compatible = "allwinner,sun50i-h6-operating-points",
.data = &sun50i_h6_cpufreq_data,
},
{ .compatible = "allwinner,sun50i-h616-operating-points",
.data = &sun50i_h616_cpufreq_data,
},
{}
};
/**
* dt_has_supported_hw() - Check if any OPPs use opp-supported-hw
*
* If we ask the cpufreq framework to use the opp-supported-hw feature, it
* will ignore every OPP node without that DT property. If none of the OPPs
* have it, the driver will fail probing, due to the lack of OPPs.
*
* Returns true if we have at least one OPP with the opp-supported-hw property.
*/
static bool dt_has_supported_hw(void)
{
bool has_opp_supported_hw = false;
struct device_node *np, *opp;
struct device *cpu_dev;
cpu_dev = get_cpu_device(0);
if (!cpu_dev)
return false;
np = dev_pm_opp_of_get_opp_desc_node(cpu_dev);
if (!np)
return false;
for_each_child_of_node(np, opp) {
if (of_find_property(opp, "opp-supported-hw", NULL)) {
has_opp_supported_hw = true;
break;
}
}
of_node_put(np);
return has_opp_supported_hw;
}
/**
* sun50i_cpufreq_get_efuse() - Determine speed grade from efuse value
* @versions: Set to the value parsed from efuse
*
* Returns 0 if success.
* Returns non-negative speed bin index on success, a negative error
* value otherwise.
*/
static int sun50i_cpufreq_get_efuse(u32 *versions)
static int sun50i_cpufreq_get_efuse(void)
{
const struct sunxi_cpufreq_data *opp_data;
struct nvmem_cell *speedbin_nvmem;
const struct of_device_id *match;
struct device_node *np;
struct device *cpu_dev;
u32 *speedbin, efuse_value;
size_t len;
u32 *speedbin;
int ret;
cpu_dev = get_cpu_device(0);
......@@ -48,12 +178,12 @@ static int sun50i_cpufreq_get_efuse(u32 *versions)
if (!np)
return -ENOENT;
ret = of_device_is_compatible(np,
"allwinner,sun50i-h6-operating-points");
if (!ret) {
match = of_match_node(cpu_opp_match_list, np);
if (!match) {
of_node_put(np);
return -ENOENT;
}
opp_data = match->data;
speedbin_nvmem = of_nvmem_cell_get(np, NULL);
of_node_put(np);
......@@ -61,33 +191,25 @@ static int sun50i_cpufreq_get_efuse(u32 *versions)
return dev_err_probe(cpu_dev, PTR_ERR(speedbin_nvmem),
"Could not get nvmem cell\n");
speedbin = nvmem_cell_read(speedbin_nvmem, &len);
speedbin = nvmem_cell_read(speedbin_nvmem, NULL);
nvmem_cell_put(speedbin_nvmem);
if (IS_ERR(speedbin))
return PTR_ERR(speedbin);
efuse_value = (*speedbin >> NVMEM_SHIFT) & NVMEM_MASK;
/*
* We treat unexpected efuse values as if the SoC was from
* the slowest bin. Expected efuse values are 1-3, slowest
* to fastest.
*/
if (efuse_value >= 1 && efuse_value <= 3)
*versions = efuse_value - 1;
else
*versions = 0;
ret = opp_data->efuse_xlate(*speedbin);
kfree(speedbin);
return 0;
return ret;
};
static int sun50i_cpufreq_nvmem_probe(struct platform_device *pdev)
{
int *opp_tokens;
char name[MAX_NAME_LEN];
unsigned int cpu;
u32 speed = 0;
char name[] = "speedXXXXXXXXXXX"; /* Integers can take 11 chars max */
unsigned int cpu, supported_hw;
struct dev_pm_opp_config config = {};
int speed;
int ret;
opp_tokens = kcalloc(num_possible_cpus(), sizeof(*opp_tokens),
......@@ -95,13 +217,24 @@ static int sun50i_cpufreq_nvmem_probe(struct platform_device *pdev)
if (!opp_tokens)
return -ENOMEM;
ret = sun50i_cpufreq_get_efuse(&speed);
if (ret) {
speed = sun50i_cpufreq_get_efuse();
if (speed < 0) {
kfree(opp_tokens);
return ret;
return speed;
}
snprintf(name, MAX_NAME_LEN, "speed%d", speed);
/*
* We need at least one OPP with the "opp-supported-hw" property,
* or else the upper layers will ignore every OPP and will bail out.
*/
if (dt_has_supported_hw()) {
supported_hw = 1U << speed;
config.supported_hw = &supported_hw;
config.supported_hw_count = 1;
}
snprintf(name, sizeof(name), "speed%d", speed);
config.prop_name = name;
for_each_possible_cpu(cpu) {
struct device *cpu_dev = get_cpu_device(cpu);
......@@ -111,12 +244,11 @@ static int sun50i_cpufreq_nvmem_probe(struct platform_device *pdev)
goto free_opp;
}
opp_tokens[cpu] = dev_pm_opp_set_prop_name(cpu_dev, name);
if (opp_tokens[cpu] < 0) {
ret = opp_tokens[cpu];
pr_err("Failed to set prop name\n");
ret = dev_pm_opp_set_config(cpu_dev, &config);
if (ret < 0)
goto free_opp;
}
opp_tokens[cpu] = ret;
}
cpufreq_dt_pdev = platform_device_register_simple("cpufreq-dt", -1,
@@ -131,7 +263,7 @@ static int sun50i_cpufreq_nvmem_probe(struct platform_device *pdev)
free_opp:
for_each_possible_cpu(cpu)
- dev_pm_opp_put_prop_name(opp_tokens[cpu]);
+ dev_pm_opp_clear_config(opp_tokens[cpu]);
kfree(opp_tokens);
return ret;
@@ -145,7 +277,7 @@ static void sun50i_cpufreq_nvmem_remove(struct platform_device *pdev)
platform_device_unregister(cpufreq_dt_pdev);
for_each_possible_cpu(cpu)
- dev_pm_opp_put_prop_name(opp_tokens[cpu]);
+ dev_pm_opp_clear_config(opp_tokens[cpu]);
kfree(opp_tokens);
}
@@ -160,6 +292,9 @@ static struct platform_driver sun50i_cpufreq_driver = {
static const struct of_device_id sun50i_cpufreq_match_list[] = {
{ .compatible = "allwinner,sun50i-h6" },
{ .compatible = "allwinner,sun50i-h616" },
{ .compatible = "allwinner,sun50i-h618" },
{ .compatible = "allwinner,sun50i-h700" },
{}
};
MODULE_DEVICE_TABLE(of, sun50i_cpufreq_match_list);
......
@@ -52,12 +52,15 @@ static int tegra124_cpu_switch_to_dfll(struct tegra124_cpufreq_priv *priv)
static int tegra124_cpufreq_probe(struct platform_device *pdev)
{
+ struct device_node *np __free(device_node) = of_cpu_device_node_get(0);
struct tegra124_cpufreq_priv *priv;
- struct device_node *np;
struct device *cpu_dev;
struct platform_device_info cpufreq_dt_devinfo = {};
int ret;
+ if (!np)
+ return -ENODEV;
priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL);
if (!priv)
return -ENOMEM;
@@ -66,15 +69,9 @@ static int tegra124_cpufreq_probe(struct platform_device *pdev)
if (!cpu_dev)
return -ENODEV;
- np = of_cpu_device_node_get(0);
- if (!np)
- return -ENODEV;
priv->cpu_clk = of_clk_get_by_name(np, "cpu_g");
- if (IS_ERR(priv->cpu_clk)) {
- ret = PTR_ERR(priv->cpu_clk);
- goto out_put_np;
- }
+ if (IS_ERR(priv->cpu_clk))
+ return PTR_ERR(priv->cpu_clk);
priv->dfll_clk = of_clk_get_by_name(np, "dfll");
if (IS_ERR(priv->dfll_clk)) {
@@ -110,8 +107,6 @@ static int tegra124_cpufreq_probe(struct platform_device *pdev)
platform_set_drvdata(pdev, priv);
- of_node_put(np);
return 0;
out_put_pllp_clk:
@@ -122,8 +117,6 @@ static int tegra124_cpufreq_probe(struct platform_device *pdev)
clk_put(priv->dfll_clk);
out_put_cpu_clk:
clk_put(priv->cpu_clk);
-out_put_np:
- of_node_put(np);
return ret;
}
......
@@ -347,12 +347,10 @@ static const struct of_device_id ti_cpufreq_of_match[] = {
static const struct of_device_id *ti_cpufreq_match_node(void)
{
- struct device_node *np;
+ struct device_node *np __free(device_node) = of_find_node_by_path("/");
const struct of_device_id *match;
- np = of_find_node_by_path("/");
match = of_match_node(ti_cpufreq_of_match, np);
- of_node_put(np);
return match;
}
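/*
* Editor's note: the __free(device_node) annotation (scope-based cleanup)
* arranges for of_node_put() to run automatically when np goes out of
* scope, which is why the explicit of_node_put() calls and the error-path
* label could be dropped here and in the tegra124 change above.
*/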
......
@@ -59,15 +59,14 @@ static int kirkwood_cpuidle_probe(struct platform_device *pdev)
return cpuidle_register(&kirkwood_idle_driver, NULL);
}
-static int kirkwood_cpuidle_remove(struct platform_device *pdev)
+static void kirkwood_cpuidle_remove(struct platform_device *pdev)
{
cpuidle_unregister(&kirkwood_idle_driver);
- return 0;
}
static struct platform_driver kirkwood_cpuidle_driver = {
.probe = kirkwood_cpuidle_probe,
- .remove = kirkwood_cpuidle_remove,
+ .remove_new = kirkwood_cpuidle_remove,
.driver = {
.name = "kirkwood_cpuidle",
},
......
@@ -44,6 +44,7 @@ static DEFINE_PER_CPU(struct ladder_device, ladder_devices);
/**
* ladder_do_selection - prepares private data for a state change
* @dev: the CPU
* @ldev: the ladder device
* @old_idx: the current state index
* @new_idx: the new target state index
......
@@ -275,18 +275,16 @@ static int exynos_nocp_probe(struct platform_device *pdev)
return 0;
}
-static int exynos_nocp_remove(struct platform_device *pdev)
+static void exynos_nocp_remove(struct platform_device *pdev)
{
struct exynos_nocp *nocp = platform_get_drvdata(pdev);
clk_disable_unprepare(nocp->clk);
- return 0;
}
static struct platform_driver exynos_nocp_driver = {
.probe = exynos_nocp_probe,
- .remove = exynos_nocp_remove,
+ .remove_new = exynos_nocp_remove,
.driver = {
.name = "exynos-nocp",
.of_match_table = exynos_nocp_id_match,
......
@@ -692,18 +692,16 @@ static int exynos_ppmu_probe(struct platform_device *pdev)
return 0;
}
-static int exynos_ppmu_remove(struct platform_device *pdev)
+static void exynos_ppmu_remove(struct platform_device *pdev)
{
struct exynos_ppmu *info = platform_get_drvdata(pdev);
clk_disable_unprepare(info->ppmu.clk);
- return 0;
}
static struct platform_driver exynos_ppmu_driver = {
.probe = exynos_ppmu_probe,
- .remove = exynos_ppmu_remove,
+ .remove_new = exynos_ppmu_remove,
.driver = {
.name = "exynos-ppmu",
.of_match_table = exynos_ppmu_id_match,
......
@@ -467,7 +467,6 @@ static void exynos_bus_shutdown(struct platform_device *pdev)
devfreq_suspend_device(bus->devfreq);
}
-#ifdef CONFIG_PM_SLEEP
static int exynos_bus_resume(struct device *dev)
{
struct exynos_bus *bus = dev_get_drvdata(dev);
@@ -495,11 +494,9 @@ static int exynos_bus_suspend(struct device *dev)
return 0;
}
-#endif
-static const struct dev_pm_ops exynos_bus_pm = {
- SET_SYSTEM_SLEEP_PM_OPS(exynos_bus_suspend, exynos_bus_resume)
-};
+static DEFINE_SIMPLE_DEV_PM_OPS(exynos_bus_pm,
+ exynos_bus_suspend, exynos_bus_resume);
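/*
* Editor's note: DEFINE_SIMPLE_DEV_PM_OPS() together with pm_sleep_ptr()
* (used in the driver struct below) lets the compiler discard the
* suspend/resume callbacks when CONFIG_PM_SLEEP is disabled, replacing
* the #ifdef guards removed above.
*/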
static const struct of_device_id exynos_bus_of_match[] = {
{ .compatible = "samsung,exynos-bus", },
@@ -512,7 +509,7 @@ static struct platform_driver exynos_bus_platdrv = {
.shutdown = exynos_bus_shutdown,
.driver = {
.name = "exynos-bus",
- .pm = &exynos_bus_pm,
+ .pm = pm_sleep_ptr(&exynos_bus_pm),
.of_match_table = exynos_bus_of_match,
},
};
......
@@ -392,7 +392,7 @@ static int mtk_ccifreq_probe(struct platform_device *pdev)
return ret;
}
-static int mtk_ccifreq_remove(struct platform_device *pdev)
+static void mtk_ccifreq_remove(struct platform_device *pdev)
{
struct device *dev = &pdev->dev;
struct mtk_ccifreq_drv *drv;
@@ -405,8 +405,6 @@ static int mtk_ccifreq_remove(struct platform_device *pdev)
regulator_disable(drv->proc_reg);
if (drv->sram_reg)
regulator_disable(drv->sram_reg);
- return 0;
}
static const struct mtk_ccifreq_platform_data mt8183_platform_data = {
@@ -432,7 +430,7 @@ MODULE_DEVICE_TABLE(of, mtk_ccifreq_machines);
static struct platform_driver mtk_ccifreq_platdrv = {
.probe = mtk_ccifreq_probe,
- .remove = mtk_ccifreq_remove,
+ .remove_new = mtk_ccifreq_remove,
.driver = {
.name = "mtk-ccifreq",
.of_match_table = mtk_ccifreq_machines,
......
@@ -459,13 +459,11 @@ static int rk3399_dmcfreq_probe(struct platform_device *pdev)
return ret;
}
-static int rk3399_dmcfreq_remove(struct platform_device *pdev)
+static void rk3399_dmcfreq_remove(struct platform_device *pdev)
{
struct rk3399_dmcfreq *dmcfreq = dev_get_drvdata(&pdev->dev);
devfreq_event_disable_edev(dmcfreq->edev);
- return 0;
}
static const struct of_device_id rk3399dmc_devfreq_of_match[] = {
@@ -476,7 +474,7 @@ MODULE_DEVICE_TABLE(of, rk3399dmc_devfreq_of_match);
static struct platform_driver rk3399_dmcfreq_driver = {
.probe = rk3399_dmcfreq_probe,
- .remove = rk3399_dmcfreq_remove,
+ .remove_new = rk3399_dmcfreq_remove,
.driver = {
.name = "rk3399-dmc-freq",
.pm = &rk3399_dmcfreq_pm,
......
@@ -458,7 +458,7 @@ static int sun8i_a33_mbus_probe(struct platform_device *pdev)
return dev_err_probe(dev, ret, err);
}
-static int sun8i_a33_mbus_remove(struct platform_device *pdev)
+static void sun8i_a33_mbus_remove(struct platform_device *pdev)
{
struct sun8i_a33_mbus *priv = platform_get_drvdata(pdev);
unsigned long initial_freq = priv->profile.initial_freq;
@@ -475,8 +475,6 @@ static int sun8i_a33_mbus_remove(struct platform_device *pdev)
clk_rate_exclusive_put(priv->clk_mbus);
clk_rate_exclusive_put(priv->clk_dram);
clk_disable_unprepare(priv->clk_bus);
- return 0;
}
static const struct sun8i_a33_mbus_variant sun50i_a64_mbus = {
@@ -497,7 +495,7 @@ static SIMPLE_DEV_PM_OPS(sun8i_a33_mbus_pm_ops,
static struct platform_driver sun8i_a33_mbus_driver = {
.probe = sun8i_a33_mbus_probe,
- .remove = sun8i_a33_mbus_remove,
+ .remove_new = sun8i_a33_mbus_remove,
.driver = {
.name = "sun8i-a33-mbus",
.of_match_table = sun8i_a33_mbus_of_match,
......
@@ -69,6 +69,7 @@ s32 arm_smccc_get_soc_id_revision(void)
{
return smccc_soc_id_revision;
}
EXPORT_SYMBOL_GPL(arm_smccc_get_soc_id_revision);
static int __init smccc_devices_init(void)
{
......
@@ -63,7 +63,7 @@ static int sdhci_pci_init_wakeup(struct sdhci_pci_chip *chip)
if ((pm_flags & MMC_PM_KEEP_POWER) && (pm_flags & MMC_PM_WAKE_SDIO_IRQ))
return device_wakeup_enable(&chip->pdev->dev);
else if (!cap_cd_wake)
- return device_wakeup_disable(&chip->pdev->dev);
+ device_wakeup_disable(&chip->pdev->dev);
+ return 0;
}
......
@@ -1494,20 +1494,26 @@ _get_dt_power(struct device *dev, unsigned long *uW, unsigned long *kHz)
return 0;
}
-/*
- * Callback function provided to the Energy Model framework upon registration.
+/**
+ * dev_pm_opp_calc_power() - Calculate power value for device with EM
+ * @dev : Device for which an Energy Model has to be registered
+ * @uW : New power value that is calculated
+ * @kHz : Frequency for which the new power is calculated
+ *
* This computes the power estimated by @dev at @kHz if it is the frequency
* of an existing OPP, or at the frequency of the first OPP above @kHz otherwise
* (see dev_pm_opp_find_freq_ceil()). This function updates @kHz to the ceiled
* frequency and @uW to the associated power. The power is estimated as
* P = C * V^2 * f with C being the device's capacitance and V and f
* respectively the voltage and frequency of the OPP.
* It is also used as a callback function provided to the Energy Model
* framework upon registration.
*
* Returns -EINVAL if the power calculation failed because of missing
* parameters, 0 otherwise.
*/
-static int __maybe_unused _get_power(struct device *dev, unsigned long *uW,
- unsigned long *kHz)
+int dev_pm_opp_calc_power(struct device *dev, unsigned long *uW,
+ unsigned long *kHz)
{
struct dev_pm_opp *opp;
struct device_node *np;
@@ -1544,6 +1550,7 @@ static int __maybe_unused _get_power(struct device *dev, unsigned long *uW,
return 0;
}
EXPORT_SYMBOL_GPL(dev_pm_opp_calc_power);
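/*
* Editor's usage sketch (hypothetical caller, not from this commit):
* query the estimated power at or above 1 GHz; on return, freq holds the
* ceiled OPP frequency (in kHz) and power the estimate (in uW), per the
* P = C * V^2 * f model described in the kernel-doc above.
*/
static int example_query_power(struct device *dev)
{
unsigned long power, freq = 1000000; /* 1 GHz in kHz */
int ret;

ret = dev_pm_opp_calc_power(dev, &power, &freq);
if (ret)
return ret;

dev_dbg(dev, "estimated %lu uW at %lu kHz\n", power, freq);
return 0;
}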
static bool _of_has_opp_microwatt_property(struct device *dev)
{
@@ -1619,7 +1626,7 @@ int dev_pm_opp_of_register_em(struct device *dev, struct cpumask *cpus)
goto failed;
}
- EM_SET_ACTIVE_POWER_CB(em_cb, _get_power);
+ EM_SET_ACTIVE_POWER_CB(em_cb, dev_pm_opp_calc_power);
register_em:
ret = em_dev_register_perf_domain(dev, nr_opp, &em_cb, cpus, true);
......
@@ -43,13 +43,11 @@ static u64 set_pd_power_limit(struct dtpm *dtpm, u64 power_limit)
struct dtpm_cpu *dtpm_cpu = to_dtpm_cpu(dtpm);
struct em_perf_domain *pd = em_cpu_get(dtpm_cpu->cpu);
struct em_perf_state *table;
- struct cpumask cpus;
unsigned long freq;
u64 power;
int i, nr_cpus;
- cpumask_and(&cpus, cpu_online_mask, to_cpumask(pd->cpus));
- nr_cpus = cpumask_weight(&cpus);
+ nr_cpus = cpumask_weight_and(cpu_online_mask, to_cpumask(pd->cpus));
rcu_read_lock();
table = em_perf_state_from_pd(pd);
@@ -123,11 +121,9 @@ static int update_pd_power_uw(struct dtpm *dtpm)
struct dtpm_cpu *dtpm_cpu = to_dtpm_cpu(dtpm);
struct em_perf_domain *em = em_cpu_get(dtpm_cpu->cpu);
struct em_perf_state *table;
- struct cpumask cpus;
int nr_cpus;
- cpumask_and(&cpus, cpu_online_mask, to_cpumask(em->cpus));
- nr_cpus = cpumask_weight(&cpus);
+ nr_cpus = cpumask_weight_and(cpu_online_mask, to_cpumask(em->cpus));
rcu_read_lock();
table = em_perf_state_from_pd(em);
......
@@ -5,27 +5,29 @@
*/
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+#include <linux/bitmap.h>
+#include <linux/cleanup.h>
+#include <linux/cpu.h>
+#include <linux/delay.h>
+#include <linux/device.h>
+#include <linux/intel_rapl.h>
#include <linux/kernel.h>
-#include <linux/module.h>
#include <linux/list.h>
-#include <linux/types.h>
-#include <linux/device.h>
-#include <linux/slab.h>
#include <linux/log2.h>
-#include <linux/bitmap.h>
-#include <linux/delay.h>
-#include <linux/sysfs.h>
-#include <linux/cpu.h>
+#include <linux/module.h>
+#include <linux/nospec.h>
+#include <linux/perf_event.h>
+#include <linux/platform_device.h>
#include <linux/powercap.h>
-#include <linux/suspend.h>
-#include <linux/intel_rapl.h>
#include <linux/processor.h>
-#include <linux/platform_device.h>
+#include <linux/slab.h>
+#include <linux/suspend.h>
+#include <linux/sysfs.h>
+#include <linux/types.h>
-#include <asm/iosf_mbi.h>
#include <asm/cpu_device_id.h>
#include <asm/intel-family.h>
+#include <asm/iosf_mbi.h>
/* bitmasks for RAPL MSRs, used by primitive access functions */
#define ENERGY_STATUS_MASK 0xffffffff
@@ -1263,6 +1265,7 @@ static const struct x86_cpu_id rapl_ids[] __initconst = {
X86_MATCH_INTEL_FAM6_MODEL(SAPPHIRERAPIDS_X, &rapl_defaults_spr_server),
X86_MATCH_INTEL_FAM6_MODEL(EMERALDRAPIDS_X, &rapl_defaults_spr_server),
X86_MATCH_INTEL_FAM6_MODEL(LUNARLAKE_M, &rapl_defaults_core),
X86_MATCH_INTEL_FAM6_MODEL(ARROWLAKE_H, &rapl_defaults_core),
X86_MATCH_INTEL_FAM6_MODEL(ARROWLAKE, &rapl_defaults_core),
X86_MATCH_INTEL_FAM6_MODEL(LAKEFIELD, &rapl_defaults_core),
@@ -1506,6 +1509,586 @@ static int rapl_detect_domains(struct rapl_package *rp)
return 0;
}
#ifdef CONFIG_PERF_EVENTS
/*
* Support for RAPL PMU
*
* Register a PMU if any of the registered RAPL Packages needs to expose its
* energy counters via the perf PMU.
*
* PMU Name:
* power
*
* Events:
* Name Event id RAPL Domain
* energy_cores 0x01 RAPL_DOMAIN_PP0
* energy_pkg 0x02 RAPL_DOMAIN_PACKAGE
* energy_ram 0x03 RAPL_DOMAIN_DRAM
* energy_gpu 0x04 RAPL_DOMAIN_PP1
* energy_psys 0x05 RAPL_DOMAIN_PLATFORM
*
* Unit:
* Joules
*
* Scale:
* 2.3283064365386962890625e-10
* The same RAPL domain in different RAPL Packages may have different
* energy units. Use 2.3283064365386962890625e-10 (2^-32) Joules as
* the fixed unit for all energy counters, and convert each hardware
* counter increment into N increments of the PMU event counter.
*
* This is fully compatible with the current MSR RAPL PMU. This means that
* userspace programs like turbostat can use the same code to handle RAPL Perf
* PMU, no matter which RAPL Interface driver (MSR/TPMI, etc.) is running
* underneath on the platform.
*
* Note that RAPL Packages can be probed/removed dynamically, and the events
* supported by each TPMI RAPL device can be different. Thus the RAPL PMU
* support is done on demand, which means
* 1. PMU is registered only if it is needed by a RAPL Package. PMU events for
* unsupported counters are not exposed.
* 2. PMU is unregistered and re-registered when a newly probed RAPL Package
* supports counters that the current PMU does not expose.
* 3. PMU is unregistered when no registered RAPL Package needs it anymore.
*/
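/*
* Editor's usage example (illustrative): because the PMU is named "power"
* and exposes the same event encodings as the MSR RAPL PMU, existing
* tooling works unchanged, e.g.:
*
* perf stat -e power/energy-pkg/ -a sleep 1
*
* perf applies the sysfs "scale" attribute (2^-32 J per count), so the
* result is reported in Joules.
*/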
struct rapl_pmu {
struct pmu pmu; /* Perf PMU structure */
u64 timer_ms; /* Maximum expiration time to avoid counter overflow */
unsigned long domain_map; /* Events supported by the currently registered PMU */
bool registered; /* Whether the PMU has been registered or not */
};
static struct rapl_pmu rapl_pmu;
/* PMU helpers */
static int get_pmu_cpu(struct rapl_package *rp)
{
int cpu;
if (!rp->has_pmu)
return nr_cpu_ids;
/* Only TPMI RAPL is supported for now */
if (rp->priv->type != RAPL_IF_TPMI)
return nr_cpu_ids;
/* TPMI RAPL uses any CPU in the package for PMU */
for_each_online_cpu(cpu)
if (topology_physical_package_id(cpu) == rp->id)
return cpu;
return nr_cpu_ids;
}
static bool is_rp_pmu_cpu(struct rapl_package *rp, int cpu)
{
if (!rp->has_pmu)
return false;
/* Only TPMI RAPL is supported for now */
if (rp->priv->type != RAPL_IF_TPMI)
return false;
/* TPMI RAPL uses any CPU in the package for PMU */
return topology_physical_package_id(cpu) == rp->id;
}
static struct rapl_package_pmu_data *event_to_pmu_data(struct perf_event *event)
{
struct rapl_package *rp = event->pmu_private;
return &rp->pmu_data;
}
/* PMU event callbacks */
static u64 event_read_counter(struct perf_event *event)
{
struct rapl_package *rp = event->pmu_private;
u64 val;
int ret;
/* Return 0 for unsupported events */
if (event->hw.idx < 0)
return 0;
ret = rapl_read_data_raw(&rp->domains[event->hw.idx], ENERGY_COUNTER, false, &val);
/* Return 0 for failed read */
if (ret)
return 0;
return val;
}
static void __rapl_pmu_event_start(struct perf_event *event)
{
struct rapl_package_pmu_data *data = event_to_pmu_data(event);
if (WARN_ON_ONCE(!(event->hw.state & PERF_HES_STOPPED)))
return;
event->hw.state = 0;
list_add_tail(&event->active_entry, &data->active_list);
local64_set(&event->hw.prev_count, event_read_counter(event));
if (++data->n_active == 1)
hrtimer_start(&data->hrtimer, data->timer_interval,
HRTIMER_MODE_REL_PINNED);
}
static void rapl_pmu_event_start(struct perf_event *event, int mode)
{
struct rapl_package_pmu_data *data = event_to_pmu_data(event);
unsigned long flags;
raw_spin_lock_irqsave(&data->lock, flags);
__rapl_pmu_event_start(event);
raw_spin_unlock_irqrestore(&data->lock, flags);
}
static u64 rapl_event_update(struct perf_event *event)
{
struct hw_perf_event *hwc = &event->hw;
struct rapl_package_pmu_data *data = event_to_pmu_data(event);
u64 prev_raw_count, new_raw_count;
s64 delta, sdelta;
/*
* Follow the generic code to drain hwc->prev_count.
* The loop is not expected to run multiple times.
*/
prev_raw_count = local64_read(&hwc->prev_count);
do {
new_raw_count = event_read_counter(event);
} while (!local64_try_cmpxchg(&hwc->prev_count,
&prev_raw_count, new_raw_count));
/*
* Now we have the new raw value and have updated the prev
* timestamp already. We can now calculate the elapsed delta
* (event-)time and add that to the generic event.
*/
delta = new_raw_count - prev_raw_count;
/*
* Scale delta to smallest unit (2^-32)
* users must then scale back: count * 2^-32 to get Joules,
* i.e. use ldexp(count, -32).
* Watts = Joules/Time delta
*/
sdelta = delta * data->scale[event->hw.flags];
local64_add(sdelta, &event->count);
return new_raw_count;
}
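/*
* Editor's worked example (illustrative numbers): if a domain's scale[]
* value is 4, a raw hardware delta of 1 adds sdelta = 4 to event->count,
* i.e. 4 * 2^-32 J (roughly 0.93 nJ) of energy.
*/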
static void rapl_pmu_event_stop(struct perf_event *event, int mode)
{
struct rapl_package_pmu_data *data = event_to_pmu_data(event);
struct hw_perf_event *hwc = &event->hw;
unsigned long flags;
raw_spin_lock_irqsave(&data->lock, flags);
/* Mark event as deactivated and stopped */
if (!(hwc->state & PERF_HES_STOPPED)) {
WARN_ON_ONCE(data->n_active <= 0);
if (--data->n_active == 0)
hrtimer_cancel(&data->hrtimer);
list_del(&event->active_entry);
WARN_ON_ONCE(hwc->state & PERF_HES_STOPPED);
hwc->state |= PERF_HES_STOPPED;
}
/* Check if update of sw counter is necessary */
if ((mode & PERF_EF_UPDATE) && !(hwc->state & PERF_HES_UPTODATE)) {
/*
* Drain the remaining delta count out of an event
* that we are disabling:
*/
rapl_event_update(event);
hwc->state |= PERF_HES_UPTODATE;
}
raw_spin_unlock_irqrestore(&data->lock, flags);
}
static int rapl_pmu_event_add(struct perf_event *event, int mode)
{
struct rapl_package_pmu_data *data = event_to_pmu_data(event);
struct hw_perf_event *hwc = &event->hw;
unsigned long flags;
raw_spin_lock_irqsave(&data->lock, flags);
hwc->state = PERF_HES_UPTODATE | PERF_HES_STOPPED;
if (mode & PERF_EF_START)
__rapl_pmu_event_start(event);
raw_spin_unlock_irqrestore(&data->lock, flags);
return 0;
}
static void rapl_pmu_event_del(struct perf_event *event, int flags)
{
rapl_pmu_event_stop(event, PERF_EF_UPDATE);
}
/* RAPL PMU event ids, same as shown in sysfs */
enum perf_rapl_events {
PERF_RAPL_PP0 = 1, /* all cores */
PERF_RAPL_PKG, /* entire package */
PERF_RAPL_RAM, /* DRAM */
PERF_RAPL_PP1, /* gpu */
PERF_RAPL_PSYS, /* psys */
PERF_RAPL_MAX
};
#define RAPL_EVENT_MASK GENMASK(7, 0)
static const int event_to_domain[PERF_RAPL_MAX] = {
[PERF_RAPL_PP0] = RAPL_DOMAIN_PP0,
[PERF_RAPL_PKG] = RAPL_DOMAIN_PACKAGE,
[PERF_RAPL_RAM] = RAPL_DOMAIN_DRAM,
[PERF_RAPL_PP1] = RAPL_DOMAIN_PP1,
[PERF_RAPL_PSYS] = RAPL_DOMAIN_PLATFORM,
};
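/*
* Editor's worked example: a perf event opened with config 0x02 selects
* PERF_RAPL_PKG, which event_to_domain[] maps to RAPL_DOMAIN_PACKAGE;
* rapl_pmu_event_init() below then stores the matching rp->domains[]
* index in event->hw.idx.
*/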
static int rapl_pmu_event_init(struct perf_event *event)
{
struct rapl_package *pos, *rp = NULL;
u64 cfg = event->attr.config & RAPL_EVENT_MASK;
int domain, idx;
/* Only look at RAPL events */
if (event->attr.type != event->pmu->type)
return -ENOENT;
/* Check for supported events only */
if (!cfg || cfg >= PERF_RAPL_MAX)
return -EINVAL;
if (event->cpu < 0)
return -EINVAL;
/* Find out which Package the event belongs to */
list_for_each_entry(pos, &rapl_packages, plist) {
if (is_rp_pmu_cpu(pos, event->cpu)) {
rp = pos;
break;
}
}
if (!rp)
return -ENODEV;
/* Find out which RAPL Domain the event belongs to */
domain = event_to_domain[cfg];
event->event_caps |= PERF_EV_CAP_READ_ACTIVE_PKG;
event->pmu_private = rp; /* Which package */
event->hw.flags = domain; /* Which domain */
event->hw.idx = -1;
/* Find out the index in rp->domains[] to get domain pointer */
for (idx = 0; idx < rp->nr_domains; idx++) {
if (rp->domains[idx].id == domain) {
event->hw.idx = idx;
break;
}
}
return 0;
}
static void rapl_pmu_event_read(struct perf_event *event)
{
rapl_event_update(event);
}
static enum hrtimer_restart rapl_hrtimer_handle(struct hrtimer *hrtimer)
{
struct rapl_package_pmu_data *data =
container_of(hrtimer, struct rapl_package_pmu_data, hrtimer);
struct perf_event *event;
unsigned long flags;
if (!data->n_active)
return HRTIMER_NORESTART;
raw_spin_lock_irqsave(&data->lock, flags);
list_for_each_entry(event, &data->active_list, active_entry)
rapl_event_update(event);
raw_spin_unlock_irqrestore(&data->lock, flags);
hrtimer_forward_now(hrtimer, data->timer_interval);
return HRTIMER_RESTART;
}
/* PMU sysfs attributes */
/*
* There are no default events, but we need to create the "events" group
* (with empty attrs) before updating it with the detected events.
*/
static struct attribute *attrs_empty[] = {
NULL,
};
static struct attribute_group pmu_events_group = {
.name = "events",
.attrs = attrs_empty,
};
static ssize_t cpumask_show(struct device *dev,
struct device_attribute *attr, char *buf)
{
struct rapl_package *rp;
cpumask_var_t cpu_mask;
int cpu;
int ret;
if (!alloc_cpumask_var(&cpu_mask, GFP_KERNEL))
return -ENOMEM;
cpus_read_lock();
cpumask_clear(cpu_mask);
/* Choose a cpu for each RAPL Package */
list_for_each_entry(rp, &rapl_packages, plist) {
cpu = get_pmu_cpu(rp);
if (cpu < nr_cpu_ids)
cpumask_set_cpu(cpu, cpu_mask);
}
cpus_read_unlock();
ret = cpumap_print_to_pagebuf(true, buf, cpu_mask);
free_cpumask_var(cpu_mask);
return ret;
}
static DEVICE_ATTR_RO(cpumask);
static struct attribute *pmu_cpumask_attrs[] = {
&dev_attr_cpumask.attr,
NULL
};
static struct attribute_group pmu_cpumask_group = {
.attrs = pmu_cpumask_attrs,
};
PMU_FORMAT_ATTR(event, "config:0-7");
static struct attribute *pmu_format_attr[] = {
&format_attr_event.attr,
NULL
};
static struct attribute_group pmu_format_group = {
.name = "format",
.attrs = pmu_format_attr,
};
static const struct attribute_group *pmu_attr_groups[] = {
&pmu_events_group,
&pmu_cpumask_group,
&pmu_format_group,
NULL
};
#define RAPL_EVENT_ATTR_STR(_name, v, str) \
static struct perf_pmu_events_attr event_attr_##v = { \
.attr = __ATTR(_name, 0444, perf_event_sysfs_show, NULL), \
.event_str = str, \
}
RAPL_EVENT_ATTR_STR(energy-cores, rapl_cores, "event=0x01");
RAPL_EVENT_ATTR_STR(energy-pkg, rapl_pkg, "event=0x02");
RAPL_EVENT_ATTR_STR(energy-ram, rapl_ram, "event=0x03");
RAPL_EVENT_ATTR_STR(energy-gpu, rapl_gpu, "event=0x04");
RAPL_EVENT_ATTR_STR(energy-psys, rapl_psys, "event=0x05");
RAPL_EVENT_ATTR_STR(energy-cores.unit, rapl_unit_cores, "Joules");
RAPL_EVENT_ATTR_STR(energy-pkg.unit, rapl_unit_pkg, "Joules");
RAPL_EVENT_ATTR_STR(energy-ram.unit, rapl_unit_ram, "Joules");
RAPL_EVENT_ATTR_STR(energy-gpu.unit, rapl_unit_gpu, "Joules");
RAPL_EVENT_ATTR_STR(energy-psys.unit, rapl_unit_psys, "Joules");
RAPL_EVENT_ATTR_STR(energy-cores.scale, rapl_scale_cores, "2.3283064365386962890625e-10");
RAPL_EVENT_ATTR_STR(energy-pkg.scale, rapl_scale_pkg, "2.3283064365386962890625e-10");
RAPL_EVENT_ATTR_STR(energy-ram.scale, rapl_scale_ram, "2.3283064365386962890625e-10");
RAPL_EVENT_ATTR_STR(energy-gpu.scale, rapl_scale_gpu, "2.3283064365386962890625e-10");
RAPL_EVENT_ATTR_STR(energy-psys.scale, rapl_scale_psys, "2.3283064365386962890625e-10");
#define RAPL_EVENT_GROUP(_name, domain) \
static struct attribute *pmu_attr_##_name[] = { \
&event_attr_rapl_##_name.attr.attr, \
&event_attr_rapl_unit_##_name.attr.attr, \
&event_attr_rapl_scale_##_name.attr.attr, \
NULL \
}; \
static umode_t is_visible_##_name(struct kobject *kobj, struct attribute *attr, int event) \
{ \
return rapl_pmu.domain_map & BIT(domain) ? attr->mode : 0; \
} \
static struct attribute_group pmu_group_##_name = { \
.name = "events", \
.attrs = pmu_attr_##_name, \
.is_visible = is_visible_##_name, \
}
RAPL_EVENT_GROUP(cores, RAPL_DOMAIN_PP0);
RAPL_EVENT_GROUP(pkg, RAPL_DOMAIN_PACKAGE);
RAPL_EVENT_GROUP(ram, RAPL_DOMAIN_DRAM);
RAPL_EVENT_GROUP(gpu, RAPL_DOMAIN_PP1);
RAPL_EVENT_GROUP(psys, RAPL_DOMAIN_PLATFORM);
static const struct attribute_group *pmu_attr_update[] = {
&pmu_group_cores,
&pmu_group_pkg,
&pmu_group_ram,
&pmu_group_gpu,
&pmu_group_psys,
NULL
};
static int rapl_pmu_update(struct rapl_package *rp)
{
int ret = 0;
/* Return if the PMU already covers all events supported by the current RAPL Package */
if (rapl_pmu.registered && !(rp->domain_map & (~rapl_pmu.domain_map)))
goto end;
/* Unregister the previously registered PMU */
if (rapl_pmu.registered)
perf_pmu_unregister(&rapl_pmu.pmu);
rapl_pmu.registered = false;
rapl_pmu.domain_map |= rp->domain_map;
memset(&rapl_pmu.pmu, 0, sizeof(struct pmu));
rapl_pmu.pmu.attr_groups = pmu_attr_groups;
rapl_pmu.pmu.attr_update = pmu_attr_update;
rapl_pmu.pmu.task_ctx_nr = perf_invalid_context;
rapl_pmu.pmu.event_init = rapl_pmu_event_init;
rapl_pmu.pmu.add = rapl_pmu_event_add;
rapl_pmu.pmu.del = rapl_pmu_event_del;
rapl_pmu.pmu.start = rapl_pmu_event_start;
rapl_pmu.pmu.stop = rapl_pmu_event_stop;
rapl_pmu.pmu.read = rapl_pmu_event_read;
rapl_pmu.pmu.module = THIS_MODULE;
rapl_pmu.pmu.capabilities = PERF_PMU_CAP_NO_EXCLUDE | PERF_PMU_CAP_NO_INTERRUPT;
ret = perf_pmu_register(&rapl_pmu.pmu, "power", -1);
if (ret) {
pr_info("Failed to register PMU\n");
return ret;
}
rapl_pmu.registered = true;
end:
rp->has_pmu = true;
return ret;
}
int rapl_package_add_pmu(struct rapl_package *rp)
{
struct rapl_package_pmu_data *data = &rp->pmu_data;
int idx;
if (rp->has_pmu)
return -EEXIST;
guard(cpus_read_lock)();
for (idx = 0; idx < rp->nr_domains; idx++) {
struct rapl_domain *rd = &rp->domains[idx];
int domain = rd->id;
u64 val;
if (!test_bit(domain, &rp->domain_map))
continue;
/*
* The RAPL PMU granularity is 2^-32 Joules.
* data->scale[]: number of 2^-32 J units per ENERGY_COUNTER increment.
*/
val = rd->energy_unit * (1ULL << 32);
do_div(val, ENERGY_UNIT_SCALE * 1000000);
data->scale[domain] = val;
if (!rapl_pmu.timer_ms) {
struct rapl_primitive_info *rpi = get_rpi(rp, ENERGY_COUNTER);
/*
* Calculate the timer rate:
* Use a reference of 200 W for scaling the timeout to avoid counter
* overflows.
*
* max_count = (rpi->mask >> rpi->shift) + 1
* max_energy_pj = max_count * rd->energy_unit
* max_time_sec = (max_energy_pj / 1000000000) / 200 W
*
* rapl_pmu.timer_ms = max_time_sec * 1000 / 2
*/
val = (rpi->mask >> rpi->shift) + 1;
val *= rd->energy_unit;
do_div(val, 1000000 * 200 * 2);
rapl_pmu.timer_ms = val;
pr_debug("%llu ms overflow timer\n", rapl_pmu.timer_ms);
}
pr_debug("Domain %s: hw unit %lld * 2^-32 Joules\n", rd->name, data->scale[domain]);
}
/* Initialize per package PMU data */
raw_spin_lock_init(&data->lock);
INIT_LIST_HEAD(&data->active_list);
data->timer_interval = ms_to_ktime(rapl_pmu.timer_ms);
hrtimer_init(&data->hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
data->hrtimer.function = rapl_hrtimer_handle;
return rapl_pmu_update(rp);
}
EXPORT_SYMBOL_GPL(rapl_package_add_pmu);
void rapl_package_remove_pmu(struct rapl_package *rp)
{
struct rapl_package *pos;
if (!rp->has_pmu)
return;
guard(cpus_read_lock)();
list_for_each_entry(pos, &rapl_packages, plist) {
/* PMU is still needed */
if (pos->has_pmu && pos != rp)
return;
}
perf_pmu_unregister(&rapl_pmu.pmu);
memset(&rapl_pmu, 0, sizeof(struct rapl_pmu));
}
EXPORT_SYMBOL_GPL(rapl_package_remove_pmu);
#endif
/* called from CPU hotplug notifier, hotplug lock held */
void rapl_remove_package_cpuslocked(struct rapl_package *rp)
{
......
@@ -302,6 +302,8 @@ static int intel_rapl_tpmi_probe(struct auxiliary_device *auxdev,
goto err;
}
rapl_package_add_pmu(trp->rp);
auxiliary_set_drvdata(auxdev, trp);
return 0;
@@ -314,6 +316,7 @@ static void intel_rapl_tpmi_remove(struct auxiliary_device *auxdev)
{
struct tpmi_rapl_package *trp = auxiliary_get_drvdata(auxdev);
rapl_package_remove_pmu(trp->rp);
rapl_remove_package(trp->rp);
trp_release(trp);
}
......
@@ -11,6 +11,7 @@
#include <linux/cpu.h>
#include <linux/device.h>
#include <linux/energy_model.h>
#include <linux/errno.h>
#include <linux/of.h>
#include <linux/pm_opp.h>
@@ -97,9 +98,16 @@ static int exynos_asv_update_opps(struct exynos_asv *asv)
last_opp_table = opp_table;
ret = exynos_asv_update_cpu_opps(asv, cpu);
- if (ret < 0)
+ if (!ret) {
+ /*
+ * Update EM power values since OPP
+ * voltage values may have changed.
+ */
+ em_dev_update_chip_binning(cpu);
+ } else {
dev_err(asv->dev, "Couldn't update OPPs for cpu%d\n",
cpuid);
+ }
}
dev_pm_opp_put_opp_table(opp_table);
......
@@ -49,13 +49,17 @@ struct amd_aperf_mperf {
* @lowest_perf: the absolute lowest performance level of the processor
* @prefcore_ranking: the preferred core ranking, the higher value indicates a higher
* priority.
- * @max_freq: the frequency that mapped to highest_perf
- * @min_freq: the frequency that mapped to lowest_perf
- * @nominal_freq: the frequency that mapped to nominal_perf
- * @lowest_nonlinear_freq: the frequency that mapped to lowest_nonlinear_perf
+ * @min_limit_perf: Cached value of the performance corresponding to policy->min
+ * @max_limit_perf: Cached value of the performance corresponding to policy->max
+ * @min_limit_freq: Cached value of policy->min (in khz)
+ * @max_limit_freq: Cached value of policy->max (in khz)
+ * @max_freq: the frequency (in khz) that maps to highest_perf
+ * @min_freq: the frequency (in khz) that maps to lowest_perf
+ * @nominal_freq: the frequency (in khz) that maps to nominal_perf
+ * @lowest_nonlinear_freq: the frequency (in khz) that maps to lowest_nonlinear_perf
* @cur: Difference of Aperf/Mperf/tsc count between last and current sample
* @prev: Last Aperf/Mperf/tsc count value read from register
- * @freq: current cpu frequency value
+ * @freq: current cpu frequency value (in khz)
* @boost_supported: check whether the Processor or SBIOS supports boost mode
* @hw_prefcore: check whether HW supports preferred core feature.
* Only when hw_prefcore and early prefcore param are true,
@@ -124,4 +128,10 @@ static const char * const amd_pstate_mode_string[] = {
[AMD_PSTATE_GUIDED] = "guided",
NULL,
};
struct quirk_entry {
u32 nominal_freq;
u32 lowest_freq;
};
#endif /* _LINUX_AMD_PSTATE_H */
@@ -172,6 +172,7 @@ struct em_perf_table __rcu *em_table_alloc(struct em_perf_domain *pd);
void em_table_free(struct em_perf_table __rcu *table);
int em_dev_compute_costs(struct device *dev, struct em_perf_state *table,
int nr_states);
int em_dev_update_chip_binning(struct device *dev);
/**
* em_pd_get_efficient_state() - Get an efficient performance state from the EM
@@ -386,6 +387,10 @@ int em_dev_compute_costs(struct device *dev, struct em_perf_state *table,
{
return -EINVAL;
}
static inline int em_dev_update_chip_binning(struct device *dev)
{
return -EINVAL;
}
#endif
#endif
@@ -158,6 +158,26 @@ struct rapl_if_priv {
void *rpi;
};
#ifdef CONFIG_PERF_EVENTS
/**
* struct rapl_package_pmu_data: Per package data for PMU support
* @scale: Scale of 2^-32 Joules for each energy counter increase.
* @lock: Lock to protect n_active and active_list.
* @n_active: Number of active events.
* @active_list: List of active events.
* @timer_interval: Maximum timer expiration time before counter overflow.
* @hrtimer: Periodically update the counter to prevent overflow.
*/
struct rapl_package_pmu_data {
u64 scale[RAPL_DOMAIN_MAX];
raw_spinlock_t lock;
int n_active;
struct list_head active_list;
ktime_t timer_interval;
struct hrtimer hrtimer;
};
#endif
/* maximum rapl package domain name: package-%d-die-%d */
#define PACKAGE_DOMAIN_NAME_LENGTH 30
@@ -176,6 +196,10 @@ struct rapl_package {
struct cpumask cpumask;
char name[PACKAGE_DOMAIN_NAME_LENGTH];
struct rapl_if_priv *priv;
#ifdef CONFIG_PERF_EVENTS
bool has_pmu;
struct rapl_package_pmu_data pmu_data;
#endif
};
struct rapl_package *rapl_find_package_domain_cpuslocked(int id, struct rapl_if_priv *priv,
@@ -188,4 +212,12 @@ struct rapl_package *rapl_find_package_domain(int id, struct rapl_if_priv *priv,
struct rapl_package *rapl_add_package(int id, struct rapl_if_priv *priv, bool id_is_cpu);
void rapl_remove_package(struct rapl_package *rp);
#ifdef CONFIG_PERF_EVENTS
int rapl_package_add_pmu(struct rapl_package *rp);
void rapl_package_remove_pmu(struct rapl_package *rp);
#else
static inline int rapl_package_add_pmu(struct rapl_package *rp) { return 0; }
static inline void rapl_package_remove_pmu(struct rapl_package *rp) { }
#endif
#endif /* __INTEL_RAPL_H__ */
@@ -476,6 +476,8 @@ struct device_node *dev_pm_opp_get_of_node(struct dev_pm_opp *opp);
int of_get_required_opp_performance_state(struct device_node *np, int index);
int dev_pm_opp_of_find_icc_paths(struct device *dev, struct opp_table *opp_table);
int dev_pm_opp_of_register_em(struct device *dev, struct cpumask *cpus);
int dev_pm_opp_calc_power(struct device *dev, unsigned long *uW,
unsigned long *kHz);
static inline void dev_pm_opp_of_unregister_em(struct device *dev)
{
em_dev_unregister_perf_domain(dev);
@@ -539,6 +541,12 @@ static inline void dev_pm_opp_of_unregister_em(struct device *dev)
{
}
static inline int dev_pm_opp_calc_power(struct device *dev, unsigned long *uW,
unsigned long *kHz)
{
return -EOPNOTSUPP;
}
static inline int of_get_required_opp_performance_state(struct device_node *np, int index)
{
return -EOPNOTSUPP;
......
@@ -107,7 +107,7 @@ extern void wakeup_sources_read_unlock(int idx);
extern struct wakeup_source *wakeup_sources_walk_start(void);
extern struct wakeup_source *wakeup_sources_walk_next(struct wakeup_source *ws);
extern int device_wakeup_enable(struct device *dev);
-extern int device_wakeup_disable(struct device *dev);
+extern void device_wakeup_disable(struct device *dev);
extern void device_set_wakeup_capable(struct device *dev, bool capable);
extern int device_set_wakeup_enable(struct device *dev, bool enable);
extern void __pm_stay_awake(struct wakeup_source *ws);
@@ -154,10 +154,9 @@ static inline int device_wakeup_enable(struct device *dev)
return 0;
}
-static inline int device_wakeup_disable(struct device *dev)
+static inline void device_wakeup_disable(struct device *dev)
{
dev->power.should_wakeup = false;
- return 0;
}
static inline int device_set_wakeup_enable(struct device *dev, bool enable)
@@ -235,11 +234,10 @@ static inline int device_init_wakeup(struct device *dev, bool enable)
if (enable) {
device_set_wakeup_capable(dev, true);
return device_wakeup_enable(dev);
- } else {
- device_wakeup_disable(dev);
- device_set_wakeup_capable(dev, false);
- return 0;
}
+ device_wakeup_disable(dev);
+ device_set_wakeup_capable(dev, false);
+ return 0;
}
#endif /* _LINUX_PM_WAKEUP_H */
@@ -674,23 +674,15 @@ void em_dev_unregister_perf_domain(struct device *dev)
}
EXPORT_SYMBOL_GPL(em_dev_unregister_perf_domain);
-/*
- * Adjustment of CPU performance values after boot, when all CPUs capacites
- * are correctly calculated.
- */
-static void em_adjust_new_capacity(struct device *dev,
- struct em_perf_domain *pd,
- u64 max_cap)
+static struct em_perf_table __rcu *em_table_dup(struct em_perf_domain *pd)
{
struct em_perf_table __rcu *em_table;
struct em_perf_state *ps, *new_ps;
- int ret, ps_size;
+ int ps_size;
em_table = em_table_alloc(pd);
- if (!em_table) {
- dev_warn(dev, "EM: allocation failed\n");
- return;
- }
+ if (!em_table)
+ return NULL;
new_ps = em_table->state;
@@ -702,24 +694,52 @@ static void em_adjust_new_capacity(struct device *dev,
rcu_read_unlock();
- em_init_performance(dev, pd, new_ps, pd->nr_perf_states);
- ret = em_compute_costs(dev, new_ps, NULL, pd->nr_perf_states,
+ return em_table;
}
static int em_recalc_and_update(struct device *dev, struct em_perf_domain *pd,
struct em_perf_table __rcu *em_table)
{
int ret;
ret = em_compute_costs(dev, em_table->state, NULL, pd->nr_perf_states,
pd->flags);
- if (ret) {
- dev_warn(dev, "EM: compute costs failed\n");
- return;
- }
+ if (ret)
+ goto free_em_table;
ret = em_dev_update_perf_domain(dev, em_table);
if (ret)
dev_warn(dev, "EM: update failed %d\n", ret);
goto free_em_table;
/*
* This is a one-time update, so give up the ownership in this updater.
* The EM framework has incremented the usage counter and from now
* will keep the reference (then free the memory when needed).
*/
free_em_table:
em_table_free(em_table);
return ret;
}
/*
* Adjustment of CPU performance values after boot, when all CPUs' capacities
* are correctly calculated.
*/
static void em_adjust_new_capacity(struct device *dev,
struct em_perf_domain *pd,
u64 max_cap)
{
struct em_perf_table __rcu *em_table;
em_table = em_table_dup(pd);
if (!em_table) {
dev_warn(dev, "EM: allocation failed\n");
return;
}
em_init_performance(dev, pd, em_table->state, pd->nr_perf_states);
em_recalc_and_update(dev, pd, em_table);
}
static void em_check_capacity_update(void)
@@ -788,3 +808,51 @@ static void em_update_workfn(struct work_struct *work)
{
em_check_capacity_update();
}
/**
* em_dev_update_chip_binning() - Update Energy Model after the new voltage
* information is present in the OPPs.
* @dev : Device for which the Energy Model has to be updated.
*
* This function makes it easy to update the EM with new values available in
* the OPP framework and DT. It can be used after the chip has been properly
* verified by device drivers and the voltages adjusted for the 'chip binning'.
*/
int em_dev_update_chip_binning(struct device *dev)
{
struct em_perf_table __rcu *em_table;
struct em_perf_domain *pd;
int i, ret;
if (IS_ERR_OR_NULL(dev))
return -EINVAL;
pd = em_pd_get(dev);
if (!pd) {
dev_warn(dev, "Couldn't find Energy Model\n");
return -EINVAL;
}
em_table = em_table_dup(pd);
if (!em_table) {
dev_warn(dev, "EM: allocation failed\n");
return -ENOMEM;
}
/* Update power values which might change due to new voltage in OPPs */
for (i = 0; i < pd->nr_perf_states; i++) {
unsigned long freq = em_table->state[i].frequency;
unsigned long power;
ret = dev_pm_opp_calc_power(dev, &power, &freq);
if (ret) {
em_table_free(em_table);
return ret;
}
em_table->state[i].power = power;
}
return em_recalc_and_update(dev, pd, em_table);
}
EXPORT_SYMBOL_GPL(em_dev_update_chip_binning);
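/*
* Editor's usage sketch (hypothetical helper, not from this commit): a
* driver that has just re-verified the chip and adjusted OPP voltages
* could refresh the EM of a CPU like this:
*/
static int example_refresh_cpu_em(int cpu)
{
struct device *dev = get_cpu_device(cpu);

if (!dev)
return -ENODEV;

return em_dev_update_chip_binning(dev);
}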
@@ -1361,7 +1361,7 @@ static int __init resume_setup(char *str)
if (noresume)
return 1;
- strncpy(resume_file, str, 255);
+ strscpy(resume_file, str);
return 1;
}
......