Commit 045e222d authored by Linus Torvalds's avatar Linus Torvalds

Merge tag 'pm-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management updates from Rafael Wysocki:
 "These include two new drivers (cpufreq driver for Apple SoC CPU
  P-states and the SCMI Powercap based power capping driver), other new
  hardware support and driver extensions (Qualcomm cpufreq driver and
  its DT bindings, TI cpufreq driver, intel_pstate, intel-uncore-freq),
  a bunch of fixes and cleanups all over and a cpupower utility update
  including new features related to RAPL support.

  Specifics:

   - Fix nasty and hard to debug race condition introduced by mistake in
     the runtime PM core code and clean up that code somewhat on top of
     the fix (Rafael Wysocki)

   - Generalize of_perf_domain_get_sharing_cpumask phandle format
     (Hector Martin)

   - Add new cpufreq driver for Apple SoC CPU P-states (Hector Martin)

   - Update Qualcomm cpufreq driver (Manivannan Sadhasivam, Chen Hui):
      - CPU clock provider support
      - Generic cleanups or reorganization
      - Potential memleak fix
      - Fix of the return value of cpufreq_driver->get()

   - Update Qualcomm cpufreq driver's DT bindings (Manivannan
     Sadhasivam, Rob Herring, Melody Olvera):
      - Support for CPU clock provider
      - Missing cache-related properties fixes
      - Support for QDU1000/QRU1000

   - Add support for ti,am625 SoC and enable build of ti-cpufreq for
     ARCH_K3 (Dave Gerlach, and Vibhore Vardhan)

   - Use flexible array to simplify memory allocation in the tegra186
     cpufreq driver (Christophe JAILLET)

   - Convert cpufreq statistics code to use sysfs_emit_at() (ye
     xingchen)

   - Allow intel_pstate to use no-HWP mode on Sapphire Rapids (Giovanni
     Gherdovich)

   - Add missing pci_dev_put() to the amd_freq_sensitivity cpufreq
     driver (Xiongfeng Wang)

   - Initialize the kobj_unregister completion before calling
     kobject_init_and_add() in the cpufreq core code (Yongqiang Liu)

   - Defer setting boost MSRs in the ACPI cpufreq driver (Stuart Hayes,
     Nathan Chancellor)

   - Make intel_pstate accept initial EPP value of 0x80 (Srinivas
     Pandruvada)

   - Make read-only array sys_clk_src in the SPEAr cpufreq driver static
     (Colin Ian King)

   - Make array speeds in the longhaul cpufreq driver static (Colin Ian
     King)

   - Use str_enabled_disabled() helper in the ACPI cpufreq driver (Andy
     Shevchenko)

   - Drop a reference to CVS from cpufreq documentation (Conghui Wang)

   - Improve kernel messages printed by the PSCI cpuidle driver (Ulf
     Hansson)

   - Make the DT cpuidle driver return the correct number of parsed idle
     states, clean it up and clarify a comment in it (Ulf Hansson)

   - Modify the tasks freezing code to avoid using pr_cont() and refine
     an error message printed by it (Rafael Wysocki)

   - Make the hibernation core code complain about memory map mismatches
     during resume to help diagnostics (Xueqin Luo)

   - Fix mistake in a kerneldoc comment in the hibernation code
     (xiongxin)

   - Reverse the order of performance and enabling operations in the
     generic power domains code (Abel Vesa)

   - Power off[on] domains in hibernate .freeze[thaw]_noirq hook of in
     the generic power domains code (Abel Vesa)

   - Consolidate genpd_restore_noirq() and genpd_resume_noirq() (Shawn
     Guo)

   - Pass generic PM noirq hooks to genpd_finish_suspend() (Shawn Guo)

   - Drop generic power domain status manipulation during hibernate
     restore (Shawn Guo)

   - Fix compiler warnings with make W=1 in the idle_inject power
     capping driver (Srinivas Pandruvada)

   - Use kstrtobool() instead of strtobool() in the power capping sysfs
     interface (Christophe JAILLET)

   - Add SCMI Powercap based power capping driver (Cristian Marussi)

   - Add Emerald Rapids support to the intel-uncore-freq driver (Artem
     Bityutskiy)

   - Repair slips in kernel-doc comments in the generic notifier code
     (Lukas Bulwahn)

   - Fix several DT issues in the OPP library reorganize code around
     opp-microvolt-<named> DT property (Viresh Kumar)

   - Allow any of opp-microvolt, opp-microamp, or opp-microwatt
     properties to be present without the others present (James
     Calligeros)

   - Fix clock-latency-ns property in DT example (Serge Semin)

   - Add a private governor_data for devfreq governors (Kant Fan)

   - Reorganize devfreq code to use device_match_of_node() and
     devm_platform_get_and_ioremap_resource() instead of open coding
     them (ye xingchen, Minghao Chi)

   - Make cpupower choose base_cpu to display default cpupower details
     instead of picking CPU 0 (Saket Kumar Bhaskar)

   - Add Georgian translation to cpupower documentation (Zurab
     Kargareteli)

   - Introduce powercap intel-rapl library, powercap-info command, and
     RAPL monitor into cpupower (Thomas Renninger)"

* tag 'pm-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (64 commits)
  PM: runtime: Adjust white space in the core code
  cpufreq: Remove CVS version control contents from documentation
  cpufreq: stats: Convert to use sysfs_emit_at() API
  cpufreq: ACPI: Only set boost MSRs on supported CPUs
  PM: sleep: Refine error message in try_to_freeze_tasks()
  PM: sleep: Avoid using pr_cont() in the tasks freezing code
  PM: runtime: Relocate rpm_callback() right after __rpm_callback()
  PM: runtime: Do not call __rpm_callback() from rpm_idle()
  PM / devfreq: event: use devm_platform_get_and_ioremap_resource()
  PM / devfreq: event: Use device_match_of_node()
  PM / devfreq: Use device_match_of_node()
  powercap: idle_inject: Fix warnings with make W=1
  PM: hibernate: Complain about memory map mismatches during resume
  dt-bindings: cpufreq: cpufreq-qcom-hw: Add QDU1000/QRU1000 cpufreq
  cpufreq: tegra186: Use flexible array to simplify memory allocation
  cpupower: rapl monitor - shows the used power consumption in uj for each rapl domain
  cpupower: Introduce powercap intel-rapl library and powercap-info command
  cpupower: Add Georgian translation
  cpufreq: intel_pstate: Add Sapphire Rapids support in no-HWP mode
  cpufreq: amd_freq_sensitivity: Add missing pci_dev_put()
  ...
parents 631aa744 ed6a0047
......@@ -20,18 +20,15 @@ Author: Dominik Brodowski <linux@brodo.de>
Mailing List
------------
There is a CPU frequency changing CVS commit and general list where
you can report bugs, problems or submit patches. To post a message,
send an email to linux-pm@vger.kernel.org.
There is a CPU frequency general list where you can report bugs,
problems or submit patches. To post a message, send an email to
linux-pm@vger.kernel.org.
Links
-----
the FTP archives:
* ftp://ftp.linux.org.uk/pub/linux/cpufreq/
how to access the CVS repository:
* http://cvs.arm.linux.org.uk/
the CPUFreq Mailing list:
* http://vger.kernel.org/vger-lists.html#linux-pm
......
......@@ -25,6 +25,7 @@ properties:
- description: v2 of CPUFREQ HW (EPSS)
items:
- enum:
- qcom,qdu1000-cpufreq-epss
- qcom,sm6375-cpufreq-epss
- qcom,sm8250-cpufreq-epss
- const: qcom,cpufreq-epss
......@@ -56,6 +57,9 @@ properties:
'#freq-domain-cells':
const: 1
'#clock-cells':
const: 1
required:
- compatible
- reg
......@@ -83,11 +87,16 @@ examples:
enable-method = "psci";
next-level-cache = <&L2_0>;
qcom,freq-domain = <&cpufreq_hw 0>;
clocks = <&cpufreq_hw 0>;
L2_0: l2-cache {
compatible = "cache";
cache-unified;
cache-level = <2>;
next-level-cache = <&L3_0>;
L3_0: l3-cache {
compatible = "cache";
cache-unified;
cache-level = <3>;
};
};
};
......@@ -99,8 +108,11 @@ examples:
enable-method = "psci";
next-level-cache = <&L2_100>;
qcom,freq-domain = <&cpufreq_hw 0>;
clocks = <&cpufreq_hw 0>;
L2_100: l2-cache {
compatible = "cache";
cache-unified;
cache-level = <2>;
next-level-cache = <&L3_0>;
};
};
......@@ -112,8 +124,11 @@ examples:
enable-method = "psci";
next-level-cache = <&L2_200>;
qcom,freq-domain = <&cpufreq_hw 0>;
clocks = <&cpufreq_hw 0>;
L2_200: l2-cache {
compatible = "cache";
cache-unified;
cache-level = <2>;
next-level-cache = <&L3_0>;
};
};
......@@ -125,8 +140,11 @@ examples:
enable-method = "psci";
next-level-cache = <&L2_300>;
qcom,freq-domain = <&cpufreq_hw 0>;
clocks = <&cpufreq_hw 0>;
L2_300: l2-cache {
compatible = "cache";
cache-unified;
cache-level = <2>;
next-level-cache = <&L3_0>;
};
};
......@@ -138,8 +156,11 @@ examples:
enable-method = "psci";
next-level-cache = <&L2_400>;
qcom,freq-domain = <&cpufreq_hw 1>;
clocks = <&cpufreq_hw 1>;
L2_400: l2-cache {
compatible = "cache";
cache-unified;
cache-level = <2>;
next-level-cache = <&L3_0>;
};
};
......@@ -151,8 +172,11 @@ examples:
enable-method = "psci";
next-level-cache = <&L2_500>;
qcom,freq-domain = <&cpufreq_hw 1>;
clocks = <&cpufreq_hw 1>;
L2_500: l2-cache {
compatible = "cache";
cache-unified;
cache-level = <2>;
next-level-cache = <&L3_0>;
};
};
......@@ -164,8 +188,11 @@ examples:
enable-method = "psci";
next-level-cache = <&L2_600>;
qcom,freq-domain = <&cpufreq_hw 1>;
clocks = <&cpufreq_hw 1>;
L2_600: l2-cache {
compatible = "cache";
cache-unified;
cache-level = <2>;
next-level-cache = <&L3_0>;
};
};
......@@ -177,8 +204,11 @@ examples:
enable-method = "psci";
next-level-cache = <&L2_700>;
qcom,freq-domain = <&cpufreq_hw 1>;
clocks = <&cpufreq_hw 1>;
L2_700: l2-cache {
compatible = "cache";
cache-unified;
cache-level = <2>;
next-level-cache = <&L3_0>;
};
};
......@@ -197,6 +227,7 @@ examples:
clock-names = "xo", "alternate";
#freq-domain-cells = <1>;
#clock-cells = <1>;
};
};
...
......@@ -108,7 +108,7 @@ patternProperties:
The power for the OPP in micro-Watts.
Entries for multiple regulators shall be provided in the same field
separated by angular brackets <>. If current values aren't required
separated by angular brackets <>. If power values aren't required
for a regulator, then it shall be filled with 0. If power values
aren't required for any of the regulators, then this field is not
required. The OPP binding doesn't provide any provisions to relate the
......@@ -230,9 +230,9 @@ patternProperties:
minItems: 1
maxItems: 8 # Should be enough regulators
'^opp-microwatt':
'^opp-microwatt-':
description:
Named opp-microwatt property. Similar to opp-microamp property,
Named opp-microwatt property. Similar to opp-microamp-<name> property,
but for microwatt instead.
$ref: /schemas/types.yaml#/definitions/uint32-array
minItems: 1
......
......@@ -155,7 +155,7 @@ examples:
opp-hz = /bits/ 64 <1200000000>;
opp-microvolt = <1025000>;
opp-microamp = <90000>;
lock-latency-ns = <290000>;
clock-latency-ns = <290000>;
turbo-mode;
};
};
......
......@@ -20038,6 +20038,7 @@ F: drivers/clk/clk-sc[mp]i.c
F: drivers/cpufreq/sc[mp]i-cpufreq.c
F: drivers/firmware/arm_scmi/
F: drivers/firmware/arm_scpi.c
F: drivers/powercap/arm_scmi_powercap.c
F: drivers/regulator/scmi-regulator.c
F: drivers/reset/reset-scmi.c
F: include/linux/sc[mp]i_protocol.h
......
......@@ -31,6 +31,15 @@ chosen {
bootargs = "console=ttyS2,115200n8 earlycon=ns16550a,mmio32,0x02800000";
};
opp-table {
/* Add 1.4GHz OPP for am625-sk board. Requires VDD_CORE to be at 0.85V */
opp-1400000000 {
opp-hz = /bits/ 64 <1400000000>;
opp-supported-hw = <0x01 0x0004>;
clock-latency-ns = <6000000>;
};
};
memory@80000000 {
device_type = "memory";
/* 2G RAM */
......
......@@ -48,6 +48,8 @@ cpu0: cpu@0 {
d-cache-line-size = <64>;
d-cache-sets = <128>;
next-level-cache = <&L2_0>;
operating-points-v2 = <&a53_opp_table>;
clocks = <&k3_clks 135 0>;
};
cpu1: cpu@1 {
......@@ -62,6 +64,8 @@ cpu1: cpu@1 {
d-cache-line-size = <64>;
d-cache-sets = <128>;
next-level-cache = <&L2_0>;
operating-points-v2 = <&a53_opp_table>;
clocks = <&k3_clks 136 0>;
};
cpu2: cpu@2 {
......@@ -76,6 +80,8 @@ cpu2: cpu@2 {
d-cache-line-size = <64>;
d-cache-sets = <128>;
next-level-cache = <&L2_0>;
operating-points-v2 = <&a53_opp_table>;
clocks = <&k3_clks 137 0>;
};
cpu3: cpu@3 {
......@@ -90,6 +96,51 @@ cpu3: cpu@3 {
d-cache-line-size = <64>;
d-cache-sets = <128>;
next-level-cache = <&L2_0>;
operating-points-v2 = <&a53_opp_table>;
clocks = <&k3_clks 138 0>;
};
};
a53_opp_table: opp-table {
compatible = "operating-points-v2-ti-cpu";
opp-shared;
syscon = <&wkup_conf>;
opp-200000000 {
opp-hz = /bits/ 64 <200000000>;
opp-supported-hw = <0x01 0x0007>;
clock-latency-ns = <6000000>;
};
opp-400000000 {
opp-hz = /bits/ 64 <400000000>;
opp-supported-hw = <0x01 0x0007>;
clock-latency-ns = <6000000>;
};
opp-600000000 {
opp-hz = /bits/ 64 <600000000>;
opp-supported-hw = <0x01 0x0007>;
clock-latency-ns = <6000000>;
};
opp-800000000 {
opp-hz = /bits/ 64 <800000000>;
opp-supported-hw = <0x01 0x0007>;
clock-latency-ns = <6000000>;
};
opp-1000000000 {
opp-hz = /bits/ 64 <1000000000>;
opp-supported-hw = <0x01 0x0006>;
clock-latency-ns = <6000000>;
};
opp-1250000000 {
opp-hz = /bits/ 64 <1250000000>;
opp-supported-hw = <0x01 0x0004>;
clock-latency-ns = <6000000>;
opp-suspend;
};
};
......
......@@ -964,8 +964,8 @@ static int genpd_runtime_suspend(struct device *dev)
return 0;
genpd_lock(genpd);
gpd_data->rpm_pstate = genpd_drop_performance_state(dev);
genpd_power_off(genpd, true, 0);
gpd_data->rpm_pstate = genpd_drop_performance_state(dev);
genpd_unlock(genpd);
return 0;
......@@ -1003,9 +1003,8 @@ static int genpd_runtime_resume(struct device *dev)
goto out;
genpd_lock(genpd);
genpd_restore_performance_state(dev, gpd_data->rpm_pstate);
ret = genpd_power_on(genpd, 0);
if (!ret)
genpd_restore_performance_state(dev, gpd_data->rpm_pstate);
genpd_unlock(genpd);
if (ret)
......@@ -1043,8 +1042,8 @@ static int genpd_runtime_resume(struct device *dev)
err_poweroff:
if (!pm_runtime_is_irq_safe(dev) || genpd_is_irq_safe(genpd)) {
genpd_lock(genpd);
gpd_data->rpm_pstate = genpd_drop_performance_state(dev);
genpd_power_off(genpd, true, 0);
gpd_data->rpm_pstate = genpd_drop_performance_state(dev);
genpd_unlock(genpd);
}
......@@ -1214,12 +1213,15 @@ static int genpd_prepare(struct device *dev)
* genpd_finish_suspend - Completion of suspend or hibernation of device in an
* I/O pm domain.
* @dev: Device to suspend.
* @poweroff: Specifies if this is a poweroff_noirq or suspend_noirq callback.
* @suspend_noirq: Generic suspend_noirq callback.
* @resume_noirq: Generic resume_noirq callback.
*
* Stop the device and remove power from the domain if all devices in it have
* been stopped.
*/
static int genpd_finish_suspend(struct device *dev, bool poweroff)
static int genpd_finish_suspend(struct device *dev,
int (*suspend_noirq)(struct device *dev),
int (*resume_noirq)(struct device *dev))
{
struct generic_pm_domain *genpd;
int ret = 0;
......@@ -1228,10 +1230,7 @@ static int genpd_finish_suspend(struct device *dev, bool poweroff)
if (IS_ERR(genpd))
return -EINVAL;
if (poweroff)
ret = pm_generic_poweroff_noirq(dev);
else
ret = pm_generic_suspend_noirq(dev);
ret = suspend_noirq(dev);
if (ret)
return ret;
......@@ -1242,10 +1241,7 @@ static int genpd_finish_suspend(struct device *dev, bool poweroff)
!pm_runtime_status_suspended(dev)) {
ret = genpd_stop_dev(genpd, dev);
if (ret) {
if (poweroff)
pm_generic_restore_noirq(dev);
else
pm_generic_resume_noirq(dev);
resume_noirq(dev);
return ret;
}
}
......@@ -1269,16 +1265,20 @@ static int genpd_suspend_noirq(struct device *dev)
{
dev_dbg(dev, "%s()\n", __func__);
return genpd_finish_suspend(dev, false);
return genpd_finish_suspend(dev,
pm_generic_suspend_noirq,
pm_generic_resume_noirq);
}
/**
* genpd_resume_noirq - Start of resume of device in an I/O PM domain.
* genpd_finish_resume - Completion of resume of device in an I/O PM domain.
* @dev: Device to resume.
* @resume_noirq: Generic resume_noirq callback.
*
* Restore power to the device's PM domain, if necessary, and start the device.
*/
static int genpd_resume_noirq(struct device *dev)
static int genpd_finish_resume(struct device *dev,
int (*resume_noirq)(struct device *dev))
{
struct generic_pm_domain *genpd;
int ret;
......@@ -1290,7 +1290,7 @@ static int genpd_resume_noirq(struct device *dev)
return -EINVAL;
if (device_wakeup_path(dev) && genpd_is_active_wakeup(genpd))
return pm_generic_resume_noirq(dev);
return resume_noirq(dev);
genpd_lock(genpd);
genpd_sync_power_on(genpd, true, 0);
......@@ -1307,6 +1307,19 @@ static int genpd_resume_noirq(struct device *dev)
return pm_generic_resume_noirq(dev);
}
/**
* genpd_resume_noirq - Start of resume of device in an I/O PM domain.
* @dev: Device to resume.
*
* Restore power to the device's PM domain, if necessary, and start the device.
*/
static int genpd_resume_noirq(struct device *dev)
{
dev_dbg(dev, "%s()\n", __func__);
return genpd_finish_resume(dev, pm_generic_resume_noirq);
}
/**
* genpd_freeze_noirq - Completion of freezing a device in an I/O PM domain.
* @dev: Device to freeze.
......@@ -1318,24 +1331,11 @@ static int genpd_resume_noirq(struct device *dev)
*/
static int genpd_freeze_noirq(struct device *dev)
{
const struct generic_pm_domain *genpd;
int ret = 0;
dev_dbg(dev, "%s()\n", __func__);
genpd = dev_to_genpd(dev);
if (IS_ERR(genpd))
return -EINVAL;
ret = pm_generic_freeze_noirq(dev);
if (ret)
return ret;
if (genpd->dev_ops.stop && genpd->dev_ops.start &&
!pm_runtime_status_suspended(dev))
ret = genpd_stop_dev(genpd, dev);
return ret;
return genpd_finish_suspend(dev,
pm_generic_freeze_noirq,
pm_generic_thaw_noirq);
}
/**
......@@ -1347,23 +1347,9 @@ static int genpd_freeze_noirq(struct device *dev)
*/
static int genpd_thaw_noirq(struct device *dev)
{
const struct generic_pm_domain *genpd;
int ret = 0;
dev_dbg(dev, "%s()\n", __func__);
genpd = dev_to_genpd(dev);
if (IS_ERR(genpd))
return -EINVAL;
if (genpd->dev_ops.stop && genpd->dev_ops.start &&
!pm_runtime_status_suspended(dev)) {
ret = genpd_start_dev(genpd, dev);
if (ret)
return ret;
}
return pm_generic_thaw_noirq(dev);
return genpd_finish_resume(dev, pm_generic_thaw_noirq);
}
/**
......@@ -1378,7 +1364,9 @@ static int genpd_poweroff_noirq(struct device *dev)
{
dev_dbg(dev, "%s()\n", __func__);
return genpd_finish_suspend(dev, true);
return genpd_finish_suspend(dev,
pm_generic_poweroff_noirq,
pm_generic_restore_noirq);
}
/**
......@@ -1390,40 +1378,9 @@ static int genpd_poweroff_noirq(struct device *dev)
*/
static int genpd_restore_noirq(struct device *dev)
{
struct generic_pm_domain *genpd;
int ret = 0;
dev_dbg(dev, "%s()\n", __func__);
genpd = dev_to_genpd(dev);
if (IS_ERR(genpd))
return -EINVAL;
/*
* At this point suspended_count == 0 means we are being run for the
* first time for the given domain in the present cycle.
*/
genpd_lock(genpd);
if (genpd->suspended_count++ == 0) {
/*
* The boot kernel might put the domain into arbitrary state,
* so make it appear as powered off to genpd_sync_power_on(),
* so that it tries to power it on in case it was really off.
*/
genpd->status = GENPD_STATE_OFF;
}
genpd_sync_power_on(genpd, true, 0);
genpd_unlock(genpd);
if (genpd->dev_ops.stop && genpd->dev_ops.start &&
!pm_runtime_status_suspended(dev)) {
ret = genpd_start_dev(genpd, dev);
if (ret)
return ret;
}
return pm_generic_restore_noirq(dev);
return genpd_finish_resume(dev, pm_generic_restore_noirq);
}
/**
......@@ -2775,17 +2732,6 @@ static int __genpd_dev_pm_attach(struct device *dev, struct device *base_dev,
dev->pm_domain->detach = genpd_dev_pm_detach;
dev->pm_domain->sync = genpd_dev_pm_sync;
if (power_on) {
genpd_lock(pd);
ret = genpd_power_on(pd, 0);
genpd_unlock(pd);
}
if (ret) {
genpd_remove_device(pd, dev);
return -EPROBE_DEFER;
}
/* Set the default performance state */
pstate = of_get_required_opp_performance_state(dev->of_node, index);
if (pstate < 0 && pstate != -ENODEV && pstate != -EOPNOTSUPP) {
......@@ -2797,6 +2743,24 @@ static int __genpd_dev_pm_attach(struct device *dev, struct device *base_dev,
goto err;
dev_gpd_data(dev)->default_pstate = pstate;
}
if (power_on) {
genpd_lock(pd);
ret = genpd_power_on(pd, 0);
genpd_unlock(pd);
}
if (ret) {
/* Drop the default performance state */
if (dev_gpd_data(dev)->default_pstate) {
dev_pm_genpd_set_performance_state(dev, 0);
dev_gpd_data(dev)->default_pstate = 0;
}
genpd_remove_device(pd, dev);
return -EPROBE_DEFER;
}
return 1;
err:
......
......@@ -243,8 +243,7 @@ void pm_runtime_set_memalloc_noio(struct device *dev, bool enable)
* flag was set by any one of the descendants.
*/
if (!dev || (!enable &&
device_for_each_child(dev, NULL,
dev_memalloc_noio)))
device_for_each_child(dev, NULL, dev_memalloc_noio)))
break;
}
mutex_unlock(&dev_hotplug_mutex);
......@@ -265,15 +264,13 @@ static int rpm_check_suspend_allowed(struct device *dev)
retval = -EACCES;
else if (atomic_read(&dev->power.usage_count))
retval = -EAGAIN;
else if (!dev->power.ignore_children &&
atomic_read(&dev->power.child_count))
else if (!dev->power.ignore_children && atomic_read(&dev->power.child_count))
retval = -EBUSY;
/* Pending resume requests take precedence over suspends. */
else if ((dev->power.deferred_resume
&& dev->power.runtime_status == RPM_SUSPENDING)
|| (dev->power.request_pending
&& dev->power.request == RPM_REQ_RESUME))
else if ((dev->power.deferred_resume &&
dev->power.runtime_status == RPM_SUSPENDING) ||
(dev->power.request_pending && dev->power.request == RPM_REQ_RESUME))
retval = -EAGAIN;
else if (__dev_pm_qos_resume_latency(dev) == 0)
retval = -EPERM;
......@@ -404,9 +401,9 @@ static int __rpm_callback(int (*cb)(struct device *), struct device *dev)
*
* Do that if resume fails too.
*/
if (use_links
&& ((dev->power.runtime_status == RPM_SUSPENDING && !retval)
|| (dev->power.runtime_status == RPM_RESUMING && retval))) {
if (use_links &&
((dev->power.runtime_status == RPM_SUSPENDING && !retval) ||
(dev->power.runtime_status == RPM_RESUMING && retval))) {
idx = device_links_read_lock();
__rpm_put_suppliers(dev, false);
......@@ -421,6 +418,38 @@ static int __rpm_callback(int (*cb)(struct device *), struct device *dev)
return retval;
}
/**
* rpm_callback - Run a given runtime PM callback for a given device.
* @cb: Runtime PM callback to run.
* @dev: Device to run the callback for.
*/
static int rpm_callback(int (*cb)(struct device *), struct device *dev)
{
int retval;
if (dev->power.memalloc_noio) {
unsigned int noio_flag;
/*
* Deadlock might be caused if memory allocation with
* GFP_KERNEL happens inside runtime_suspend and
* runtime_resume callbacks of one block device's
* ancestor or the block device itself. Network
* device might be thought as part of iSCSI block
* device, so network device and its ancestor should
* be marked as memalloc_noio too.
*/
noio_flag = memalloc_noio_save();
retval = __rpm_callback(cb, dev);
memalloc_noio_restore(noio_flag);
} else {
retval = __rpm_callback(cb, dev);
}
dev->power.runtime_error = retval;
return retval != -EACCES ? retval : -EIO;
}
/**
* rpm_idle - Notify device bus type if the device can be suspended.
* @dev: Device to notify the bus type about.
......@@ -459,6 +488,7 @@ static int rpm_idle(struct device *dev, int rpmflags)
/* Act as though RPM_NOWAIT is always set. */
else if (dev->power.idle_notification)
retval = -EINPROGRESS;
if (retval)
goto out;
......@@ -484,7 +514,17 @@ static int rpm_idle(struct device *dev, int rpmflags)
dev->power.idle_notification = true;
retval = __rpm_callback(callback, dev);
if (dev->power.irq_safe)
spin_unlock(&dev->power.lock);
else
spin_unlock_irq(&dev->power.lock);
retval = callback(dev);
if (dev->power.irq_safe)
spin_lock(&dev->power.lock);
else
spin_lock_irq(&dev->power.lock);
dev->power.idle_notification = false;
wake_up_all(&dev->power.wait_queue);
......@@ -494,38 +534,6 @@ static int rpm_idle(struct device *dev, int rpmflags)
return retval ? retval : rpm_suspend(dev, rpmflags | RPM_AUTO);
}
/**
* rpm_callback - Run a given runtime PM callback for a given device.
* @cb: Runtime PM callback to run.
* @dev: Device to run the callback for.
*/
static int rpm_callback(int (*cb)(struct device *), struct device *dev)
{
int retval;
if (dev->power.memalloc_noio) {
unsigned int noio_flag;
/*
* Deadlock might be caused if memory allocation with
* GFP_KERNEL happens inside runtime_suspend and
* runtime_resume callbacks of one block device's
* ancestor or the block device itself. Network
* device might be thought as part of iSCSI block
* device, so network device and its ancestor should
* be marked as memalloc_noio too.
*/
noio_flag = memalloc_noio_save();
retval = __rpm_callback(cb, dev);
memalloc_noio_restore(noio_flag);
} else {
retval = __rpm_callback(cb, dev);
}
dev->power.runtime_error = retval;
return retval != -EACCES ? retval : -EIO;
}
/**
* rpm_suspend - Carry out runtime suspend of given device.
* @dev: Device to suspend.
......@@ -564,12 +572,12 @@ static int rpm_suspend(struct device *dev, int rpmflags)
/* Synchronous suspends are not allowed in the RPM_RESUMING state. */
if (dev->power.runtime_status == RPM_RESUMING && !(rpmflags & RPM_ASYNC))
retval = -EAGAIN;
if (retval)
goto out;
/* If the autosuspend_delay time hasn't expired yet, reschedule. */
if ((rpmflags & RPM_AUTO)
&& dev->power.runtime_status != RPM_SUSPENDING) {
if ((rpmflags & RPM_AUTO) && dev->power.runtime_status != RPM_SUSPENDING) {
u64 expires = pm_runtime_autosuspend_expiration(dev);
if (expires != 0) {
......@@ -584,7 +592,7 @@ static int rpm_suspend(struct device *dev, int rpmflags)
* rest.
*/
if (!(dev->power.timer_expires &&
dev->power.timer_expires <= expires)) {
dev->power.timer_expires <= expires)) {
/*
* We add a slack of 25% to gather wakeups
* without sacrificing the granularity.
......@@ -594,9 +602,9 @@ static int rpm_suspend(struct device *dev, int rpmflags)
dev->power.timer_expires = expires;
hrtimer_start_range_ns(&dev->power.suspend_timer,
ns_to_ktime(expires),
slack,
HRTIMER_MODE_ABS);
ns_to_ktime(expires),
slack,
HRTIMER_MODE_ABS);
}
dev->power.timer_autosuspends = 1;
goto out;
......@@ -787,8 +795,8 @@ static int rpm_resume(struct device *dev, int rpmflags)
goto out;
}
if (dev->power.runtime_status == RPM_RESUMING
|| dev->power.runtime_status == RPM_SUSPENDING) {
if (dev->power.runtime_status == RPM_RESUMING ||
dev->power.runtime_status == RPM_SUSPENDING) {
DEFINE_WAIT(wait);
if (rpmflags & (RPM_ASYNC | RPM_NOWAIT)) {
......@@ -815,8 +823,8 @@ static int rpm_resume(struct device *dev, int rpmflags)
for (;;) {
prepare_to_wait(&dev->power.wait_queue, &wait,
TASK_UNINTERRUPTIBLE);
if (dev->power.runtime_status != RPM_RESUMING
&& dev->power.runtime_status != RPM_SUSPENDING)
if (dev->power.runtime_status != RPM_RESUMING &&
dev->power.runtime_status != RPM_SUSPENDING)
break;
spin_unlock_irq(&dev->power.lock);
......@@ -836,9 +844,9 @@ static int rpm_resume(struct device *dev, int rpmflags)
*/
if (dev->power.no_callbacks && !parent && dev->parent) {
spin_lock_nested(&dev->parent->power.lock, SINGLE_DEPTH_NESTING);
if (dev->parent->power.disable_depth > 0
|| dev->parent->power.ignore_children
|| dev->parent->power.runtime_status == RPM_ACTIVE) {
if (dev->parent->power.disable_depth > 0 ||
dev->parent->power.ignore_children ||
dev->parent->power.runtime_status == RPM_ACTIVE) {
atomic_inc(&dev->parent->power.child_count);
spin_unlock(&dev->parent->power.lock);
retval = 1;
......@@ -867,6 +875,7 @@ static int rpm_resume(struct device *dev, int rpmflags)
parent = dev->parent;
if (dev->power.irq_safe)
goto skip_parent;
spin_unlock(&dev->power.lock);
pm_runtime_get_noresume(parent);
......@@ -876,8 +885,8 @@ static int rpm_resume(struct device *dev, int rpmflags)
* Resume the parent if it has runtime PM enabled and not been
* set to ignore its children.
*/
if (!parent->power.disable_depth
&& !parent->power.ignore_children) {
if (!parent->power.disable_depth &&
!parent->power.ignore_children) {
rpm_resume(parent, 0);
if (parent->power.runtime_status != RPM_ACTIVE)
retval = -EBUSY;
......@@ -887,6 +896,7 @@ static int rpm_resume(struct device *dev, int rpmflags)
spin_lock(&dev->power.lock);
if (retval)
goto out;
goto repeat;
}
skip_parent:
......@@ -1291,9 +1301,9 @@ int __pm_runtime_set_status(struct device *dev, unsigned int status)
* not active, has runtime PM enabled and the
* 'power.ignore_children' flag unset.
*/
if (!parent->power.disable_depth
&& !parent->power.ignore_children
&& parent->power.runtime_status != RPM_ACTIVE) {
if (!parent->power.disable_depth &&
!parent->power.ignore_children &&
parent->power.runtime_status != RPM_ACTIVE) {
dev_err(dev, "runtime PM trying to activate child device %s but parent (%s) is not active\n",
dev_name(dev),
dev_name(parent));
......@@ -1358,9 +1368,9 @@ static void __pm_runtime_barrier(struct device *dev)
dev->power.request_pending = false;
}
if (dev->power.runtime_status == RPM_SUSPENDING
|| dev->power.runtime_status == RPM_RESUMING
|| dev->power.idle_notification) {
if (dev->power.runtime_status == RPM_SUSPENDING ||
dev->power.runtime_status == RPM_RESUMING ||
dev->power.idle_notification) {
DEFINE_WAIT(wait);
/* Suspend, wake-up or idle notification in progress. */
......@@ -1445,8 +1455,8 @@ void __pm_runtime_disable(struct device *dev, bool check_resume)
* means there probably is some I/O to process and disabling runtime PM
* shouldn't prevent the device from processing the I/O.
*/
if (check_resume && dev->power.request_pending
&& dev->power.request == RPM_REQ_RESUME) {
if (check_resume && dev->power.request_pending &&
dev->power.request == RPM_REQ_RESUME) {
/*
* Prevent suspends and idle notifications from being carried
* out after we have woken up the device.
......@@ -1606,6 +1616,7 @@ void pm_runtime_irq_safe(struct device *dev)
{
if (dev->parent)
pm_runtime_get_sync(dev->parent);
spin_lock_irq(&dev->power.lock);
dev->power.irq_safe = 1;
spin_unlock_irq(&dev->power.lock);
......
......@@ -41,6 +41,15 @@ config ARM_ALLWINNER_SUN50I_CPUFREQ_NVMEM
To compile this driver as a module, choose M here: the
module will be called sun50i-cpufreq-nvmem.
config ARM_APPLE_SOC_CPUFREQ
tristate "Apple Silicon SoC CPUFreq support"
depends on ARCH_APPLE || (COMPILE_TEST && 64BIT)
select PM_OPP
default ARCH_APPLE
help
This adds the CPUFreq driver for Apple Silicon machines
(e.g. Apple M1).
config ARM_ARMADA_37XX_CPUFREQ
tristate "Armada 37xx CPUFreq support"
depends on ARCH_MVEBU && CPUFREQ_DT
......@@ -340,8 +349,8 @@ config ARM_TEGRA194_CPUFREQ
config ARM_TI_CPUFREQ
bool "Texas Instruments CPUFreq support"
depends on ARCH_OMAP2PLUS
default ARCH_OMAP2PLUS
depends on ARCH_OMAP2PLUS || ARCH_K3
default y
help
This driver enables valid OPPs on the running platform based on
values contained within the SoC in use. Enable this in order to
......
......@@ -52,6 +52,7 @@ obj-$(CONFIG_X86_AMD_FREQ_SENSITIVITY) += amd_freq_sensitivity.o
##################################################################################
# ARM SoC drivers
obj-$(CONFIG_ARM_APPLE_SOC_CPUFREQ) += apple-soc-cpufreq.o
obj-$(CONFIG_ARM_ARMADA_37XX_CPUFREQ) += armada-37xx-cpufreq.o
obj-$(CONFIG_ARM_ARMADA_8K_CPUFREQ) += armada-8k-cpufreq.o
obj-$(CONFIG_ARM_BRCMSTB_AVS_CPUFREQ) += brcmstb-avs-cpufreq.o
......
......@@ -19,6 +19,7 @@
#include <linux/compiler.h>
#include <linux/dmi.h>
#include <linux/slab.h>
#include <linux/string_helpers.h>
#include <linux/acpi.h>
#include <linux/io.h>
......@@ -135,8 +136,8 @@ static int set_boost(struct cpufreq_policy *policy, int val)
{
on_each_cpu_mask(policy->cpus, boost_set_msr_each,
(void *)(long)val, 1);
pr_debug("CPU %*pbl: Core Boosting %sabled.\n",
cpumask_pr_args(policy->cpus), val ? "en" : "dis");
pr_debug("CPU %*pbl: Core Boosting %s.\n",
cpumask_pr_args(policy->cpus), str_enabled_disabled(val));
return 0;
}
......@@ -535,15 +536,6 @@ static void free_acpi_perf_data(void)
free_percpu(acpi_perf_data);
}
static int cpufreq_boost_online(unsigned int cpu)
{
/*
* On the CPU_UP path we simply keep the boost-disable flag
* in sync with the current global state.
*/
return boost_set_msr(acpi_cpufreq_driver.boost_enabled);
}
static int cpufreq_boost_down_prep(unsigned int cpu)
{
/*
......@@ -897,6 +889,9 @@ static int acpi_cpufreq_cpu_init(struct cpufreq_policy *policy)
if (perf->states[0].core_frequency * 1000 != freq_table[0].frequency)
pr_warn(FW_WARN "P-state 0 is not max freq\n");
if (acpi_cpufreq_driver.set_boost)
set_boost(policy, acpi_cpufreq_driver.boost_enabled);
return result;
err_unreg:
......@@ -916,6 +911,7 @@ static int acpi_cpufreq_cpu_exit(struct cpufreq_policy *policy)
pr_debug("%s\n", __func__);
cpufreq_boost_down_prep(policy->cpu);
policy->fast_switch_possible = false;
policy->driver_data = NULL;
acpi_processor_unregister_performance(data->acpi_perf_cpu);
......@@ -958,12 +954,8 @@ static struct cpufreq_driver acpi_cpufreq_driver = {
.attr = acpi_cpufreq_attr,
};
static enum cpuhp_state acpi_cpufreq_online;
static void __init acpi_cpufreq_boost_init(void)
{
int ret;
if (!(boot_cpu_has(X86_FEATURE_CPB) || boot_cpu_has(X86_FEATURE_IDA))) {
pr_debug("Boost capabilities not present in the processor\n");
return;
......@@ -971,24 +963,6 @@ static void __init acpi_cpufreq_boost_init(void)
acpi_cpufreq_driver.set_boost = set_boost;
acpi_cpufreq_driver.boost_enabled = boost_state(0);
/*
* This calls the online callback on all online cpu and forces all
* MSRs to the same value.
*/
ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "cpufreq/acpi:online",
cpufreq_boost_online, cpufreq_boost_down_prep);
if (ret < 0) {
pr_err("acpi_cpufreq: failed to register hotplug callbacks\n");
return;
}
acpi_cpufreq_online = ret;
}
static void acpi_cpufreq_boost_exit(void)
{
if (acpi_cpufreq_online > 0)
cpuhp_remove_state_nocalls(acpi_cpufreq_online);
}
static int __init acpi_cpufreq_init(void)
......@@ -1032,7 +1006,6 @@ static int __init acpi_cpufreq_init(void)
ret = cpufreq_register_driver(&acpi_cpufreq_driver);
if (ret) {
free_acpi_perf_data();
acpi_cpufreq_boost_exit();
}
return ret;
}
......@@ -1041,8 +1014,6 @@ static void __exit acpi_cpufreq_exit(void)
{
pr_debug("%s\n", __func__);
acpi_cpufreq_boost_exit();
cpufreq_unregister_driver(&acpi_cpufreq_driver);
free_acpi_perf_data();
......
......@@ -125,6 +125,8 @@ static int __init amd_freq_sensitivity_init(void)
if (!pcidev) {
if (!boot_cpu_has(X86_FEATURE_PROC_FEEDBACK))
return -ENODEV;
} else {
pci_dev_put(pcidev);
}
if (rdmsrl_safe(MSR_AMD64_FREQ_SENSITIVITY_ACTUAL, &val))
......
// SPDX-License-Identifier: GPL-2.0-only
/*
* Apple SoC CPU cluster performance state driver
*
* Copyright The Asahi Linux Contributors
*
* Based on scpi-cpufreq.c
*/
#include <linux/bitfield.h>
#include <linux/bitops.h>
#include <linux/cpu.h>
#include <linux/cpufreq.h>
#include <linux/cpumask.h>
#include <linux/delay.h>
#include <linux/err.h>
#include <linux/io.h>
#include <linux/iopoll.h>
#include <linux/module.h>
#include <linux/of.h>
#include <linux/of_address.h>
#include <linux/pm_opp.h>
#include <linux/slab.h>
#define APPLE_DVFS_CMD 0x20
#define APPLE_DVFS_CMD_BUSY BIT(31)
#define APPLE_DVFS_CMD_SET BIT(25)
#define APPLE_DVFS_CMD_PS2 GENMASK(16, 12)
#define APPLE_DVFS_CMD_PS1 GENMASK(4, 0)
/* Same timebase as CPU counter (24MHz) */
#define APPLE_DVFS_LAST_CHG_TIME 0x38
/*
* Apple ran out of bits and had to shift this in T8112...
*/
#define APPLE_DVFS_STATUS 0x50
#define APPLE_DVFS_STATUS_CUR_PS_T8103 GENMASK(7, 4)
#define APPLE_DVFS_STATUS_CUR_PS_SHIFT_T8103 4
#define APPLE_DVFS_STATUS_TGT_PS_T8103 GENMASK(3, 0)
#define APPLE_DVFS_STATUS_CUR_PS_T8112 GENMASK(9, 5)
#define APPLE_DVFS_STATUS_CUR_PS_SHIFT_T8112 5
#define APPLE_DVFS_STATUS_TGT_PS_T8112 GENMASK(4, 0)
/*
* Div is +1, base clock is 12MHz on existing SoCs.
* For documentation purposes. We use the OPP table to
* get the frequency.
*/
#define APPLE_DVFS_PLL_STATUS 0xc0
#define APPLE_DVFS_PLL_FACTOR 0xc8
#define APPLE_DVFS_PLL_FACTOR_MULT GENMASK(31, 16)
#define APPLE_DVFS_PLL_FACTOR_DIV GENMASK(15, 0)
#define APPLE_DVFS_TRANSITION_TIMEOUT 100
struct apple_soc_cpufreq_info {
u64 max_pstate;
u64 cur_pstate_mask;
u64 cur_pstate_shift;
};
struct apple_cpu_priv {
struct device *cpu_dev;
void __iomem *reg_base;
const struct apple_soc_cpufreq_info *info;
};
static struct cpufreq_driver apple_soc_cpufreq_driver;
static const struct apple_soc_cpufreq_info soc_t8103_info = {
.max_pstate = 15,
.cur_pstate_mask = APPLE_DVFS_STATUS_CUR_PS_T8103,
.cur_pstate_shift = APPLE_DVFS_STATUS_CUR_PS_SHIFT_T8103,
};
static const struct apple_soc_cpufreq_info soc_t8112_info = {
.max_pstate = 31,
.cur_pstate_mask = APPLE_DVFS_STATUS_CUR_PS_T8112,
.cur_pstate_shift = APPLE_DVFS_STATUS_CUR_PS_SHIFT_T8112,
};
static const struct apple_soc_cpufreq_info soc_default_info = {
.max_pstate = 15,
.cur_pstate_mask = 0, /* fallback */
};
static const struct of_device_id apple_soc_cpufreq_of_match[] = {
{
.compatible = "apple,t8103-cluster-cpufreq",
.data = &soc_t8103_info,
},
{
.compatible = "apple,t8112-cluster-cpufreq",
.data = &soc_t8112_info,
},
{
.compatible = "apple,cluster-cpufreq",
.data = &soc_default_info,
},
{}
};
static unsigned int apple_soc_cpufreq_get_rate(unsigned int cpu)
{
struct cpufreq_policy *policy = cpufreq_cpu_get_raw(cpu);
struct apple_cpu_priv *priv = policy->driver_data;
struct cpufreq_frequency_table *p;
unsigned int pstate;
if (priv->info->cur_pstate_mask) {
u64 reg = readq_relaxed(priv->reg_base + APPLE_DVFS_STATUS);
pstate = (reg & priv->info->cur_pstate_mask) >> priv->info->cur_pstate_shift;
} else {
/*
* For the fallback case we might not know the layout of DVFS_STATUS,
* so just use the command register value (which ignores boost limitations).
*/
u64 reg = readq_relaxed(priv->reg_base + APPLE_DVFS_CMD);
pstate = FIELD_GET(APPLE_DVFS_CMD_PS1, reg);
}
cpufreq_for_each_valid_entry(p, policy->freq_table)
if (p->driver_data == pstate)
return p->frequency;
dev_err(priv->cpu_dev, "could not find frequency for pstate %d\n",
pstate);
return 0;
}
static int apple_soc_cpufreq_set_target(struct cpufreq_policy *policy,
unsigned int index)
{
struct apple_cpu_priv *priv = policy->driver_data;
unsigned int pstate = policy->freq_table[index].driver_data;
u64 reg;
/* Fallback for newer SoCs */
if (index > priv->info->max_pstate)
index = priv->info->max_pstate;
if (readq_poll_timeout_atomic(priv->reg_base + APPLE_DVFS_CMD, reg,
!(reg & APPLE_DVFS_CMD_BUSY), 2,
APPLE_DVFS_TRANSITION_TIMEOUT)) {
return -EIO;
}
reg &= ~(APPLE_DVFS_CMD_PS1 | APPLE_DVFS_CMD_PS2);
reg |= FIELD_PREP(APPLE_DVFS_CMD_PS1, pstate);
reg |= FIELD_PREP(APPLE_DVFS_CMD_PS2, pstate);
reg |= APPLE_DVFS_CMD_SET;
writeq_relaxed(reg, priv->reg_base + APPLE_DVFS_CMD);
return 0;
}
static unsigned int apple_soc_cpufreq_fast_switch(struct cpufreq_policy *policy,
unsigned int target_freq)
{
if (apple_soc_cpufreq_set_target(policy, policy->cached_resolved_idx) < 0)
return 0;
return policy->freq_table[policy->cached_resolved_idx].frequency;
}
static int apple_soc_cpufreq_find_cluster(struct cpufreq_policy *policy,
void __iomem **reg_base,
const struct apple_soc_cpufreq_info **info)
{
struct of_phandle_args args;
const struct of_device_id *match;
int ret = 0;
ret = of_perf_domain_get_sharing_cpumask(policy->cpu, "performance-domains",
"#performance-domain-cells",
policy->cpus, &args);
if (ret < 0)
return ret;
match = of_match_node(apple_soc_cpufreq_of_match, args.np);
of_node_put(args.np);
if (!match)
return -ENODEV;
*info = match->data;
*reg_base = of_iomap(args.np, 0);
if (IS_ERR(*reg_base))
return PTR_ERR(*reg_base);
return 0;
}
static struct freq_attr *apple_soc_cpufreq_hw_attr[] = {
&cpufreq_freq_attr_scaling_available_freqs,
NULL, /* Filled in below if boost is enabled */
NULL,
};
static int apple_soc_cpufreq_init(struct cpufreq_policy *policy)
{
int ret, i;
unsigned int transition_latency;
void __iomem *reg_base;
struct device *cpu_dev;
struct apple_cpu_priv *priv;
const struct apple_soc_cpufreq_info *info;
struct cpufreq_frequency_table *freq_table;
cpu_dev = get_cpu_device(policy->cpu);
if (!cpu_dev) {
pr_err("failed to get cpu%d device\n", policy->cpu);
return -ENODEV;
}
ret = dev_pm_opp_of_add_table(cpu_dev);
if (ret < 0) {
dev_err(cpu_dev, "%s: failed to add OPP table: %d\n", __func__, ret);
return ret;
}
ret = apple_soc_cpufreq_find_cluster(policy, &reg_base, &info);
if (ret) {
dev_err(cpu_dev, "%s: failed to get cluster info: %d\n", __func__, ret);
return ret;
}
ret = dev_pm_opp_set_sharing_cpus(cpu_dev, policy->cpus);
if (ret) {
dev_err(cpu_dev, "%s: failed to mark OPPs as shared: %d\n", __func__, ret);
goto out_iounmap;
}
ret = dev_pm_opp_get_opp_count(cpu_dev);
if (ret <= 0) {
dev_dbg(cpu_dev, "OPP table is not ready, deferring probe\n");
ret = -EPROBE_DEFER;
goto out_free_opp;
}
priv = kzalloc(sizeof(*priv), GFP_KERNEL);
if (!priv) {
ret = -ENOMEM;
goto out_free_opp;
}
ret = dev_pm_opp_init_cpufreq_table(cpu_dev, &freq_table);
if (ret) {
dev_err(cpu_dev, "failed to init cpufreq table: %d\n", ret);
goto out_free_priv;
}
/* Get OPP levels (p-state indexes) and stash them in driver_data */
for (i = 0; freq_table[i].frequency != CPUFREQ_TABLE_END; i++) {
unsigned long rate = freq_table[i].frequency * 1000 + 999;
struct dev_pm_opp *opp = dev_pm_opp_find_freq_floor(cpu_dev, &rate);
if (IS_ERR(opp)) {
ret = PTR_ERR(opp);
goto out_free_cpufreq_table;
}
freq_table[i].driver_data = dev_pm_opp_get_level(opp);
dev_pm_opp_put(opp);
}
priv->cpu_dev = cpu_dev;
priv->reg_base = reg_base;
priv->info = info;
policy->driver_data = priv;
policy->freq_table = freq_table;
transition_latency = dev_pm_opp_get_max_transition_latency(cpu_dev);
if (!transition_latency)
transition_latency = CPUFREQ_ETERNAL;
policy->cpuinfo.transition_latency = transition_latency;
policy->dvfs_possible_from_any_cpu = true;
policy->fast_switch_possible = true;
if (policy_has_boost_freq(policy)) {
ret = cpufreq_enable_boost_support();
if (ret) {
dev_warn(cpu_dev, "failed to enable boost: %d\n", ret);
} else {
apple_soc_cpufreq_hw_attr[1] = &cpufreq_freq_attr_scaling_boost_freqs;
apple_soc_cpufreq_driver.boost_enabled = true;
}
}
return 0;
out_free_cpufreq_table:
dev_pm_opp_free_cpufreq_table(cpu_dev, &freq_table);
out_free_priv:
kfree(priv);
out_free_opp:
dev_pm_opp_remove_all_dynamic(cpu_dev);
out_iounmap:
iounmap(reg_base);
return ret;
}
static int apple_soc_cpufreq_exit(struct cpufreq_policy *policy)
{
struct apple_cpu_priv *priv = policy->driver_data;
dev_pm_opp_free_cpufreq_table(priv->cpu_dev, &policy->freq_table);
dev_pm_opp_remove_all_dynamic(priv->cpu_dev);
iounmap(priv->reg_base);
kfree(priv);
return 0;
}
static struct cpufreq_driver apple_soc_cpufreq_driver = {
.name = "apple-cpufreq",
.flags = CPUFREQ_HAVE_GOVERNOR_PER_POLICY |
CPUFREQ_NEED_INITIAL_FREQ_CHECK | CPUFREQ_IS_COOLING_DEV,
.verify = cpufreq_generic_frequency_table_verify,
.attr = cpufreq_generic_attr,
.get = apple_soc_cpufreq_get_rate,
.init = apple_soc_cpufreq_init,
.exit = apple_soc_cpufreq_exit,
.target_index = apple_soc_cpufreq_set_target,
.fast_switch = apple_soc_cpufreq_fast_switch,
.register_em = cpufreq_register_em_with_opp,
.attr = apple_soc_cpufreq_hw_attr,
};
static int __init apple_soc_cpufreq_module_init(void)
{
if (!of_machine_is_compatible("apple,arm-platform"))
return -ENODEV;
return cpufreq_register_driver(&apple_soc_cpufreq_driver);
}
module_init(apple_soc_cpufreq_module_init);
static void __exit apple_soc_cpufreq_module_exit(void)
{
cpufreq_unregister_driver(&apple_soc_cpufreq_driver);
}
module_exit(apple_soc_cpufreq_module_exit);
MODULE_DEVICE_TABLE(of, apple_soc_cpufreq_of_match);
MODULE_AUTHOR("Hector Martin <marcan@marcan.st>");
MODULE_DESCRIPTION("Apple SoC CPU cluster DVFS driver");
MODULE_LICENSE("GPL");
......@@ -103,6 +103,8 @@ static const struct of_device_id allowlist[] __initconst = {
static const struct of_device_id blocklist[] __initconst = {
{ .compatible = "allwinner,sun50i-h6", },
{ .compatible = "apple,arm-platform", },
{ .compatible = "arm,vexpress", },
{ .compatible = "calxeda,highbank", },
......@@ -160,6 +162,7 @@ static const struct of_device_id blocklist[] __initconst = {
{ .compatible = "ti,am43", },
{ .compatible = "ti,dra7", },
{ .compatible = "ti,omap3", },
{ .compatible = "ti,am625", },
{ .compatible = "qcom,ipq8064", },
{ .compatible = "qcom,apq8064", },
......
......@@ -1207,6 +1207,7 @@ static struct cpufreq_policy *cpufreq_policy_alloc(unsigned int cpu)
if (!zalloc_cpumask_var(&policy->real_cpus, GFP_KERNEL))
goto err_free_rcpumask;
init_completion(&policy->kobj_unregister);
ret = kobject_init_and_add(&policy->kobj, &ktype_cpufreq,
cpufreq_global_kobject, "policy%u", cpu);
if (ret) {
......@@ -1245,7 +1246,6 @@ static struct cpufreq_policy *cpufreq_policy_alloc(unsigned int cpu)
init_rwsem(&policy->rwsem);
spin_lock_init(&policy->transition_lock);
init_waitqueue_head(&policy->transition_wait);
init_completion(&policy->kobj_unregister);
INIT_WORK(&policy->update, handle_update);
policy->cpu = cpu;
......
......@@ -128,25 +128,23 @@ static ssize_t show_trans_table(struct cpufreq_policy *policy, char *buf)
ssize_t len = 0;
int i, j, count;
len += scnprintf(buf + len, PAGE_SIZE - len, " From : To\n");
len += scnprintf(buf + len, PAGE_SIZE - len, " : ");
len += sysfs_emit_at(buf, len, " From : To\n");
len += sysfs_emit_at(buf, len, " : ");
for (i = 0; i < stats->state_num; i++) {
if (len >= PAGE_SIZE)
break;
len += scnprintf(buf + len, PAGE_SIZE - len, "%9u ",
stats->freq_table[i]);
len += sysfs_emit_at(buf, len, "%9u ", stats->freq_table[i]);
}
if (len >= PAGE_SIZE)
return PAGE_SIZE;
len += scnprintf(buf + len, PAGE_SIZE - len, "\n");
len += sysfs_emit_at(buf, len, "\n");
for (i = 0; i < stats->state_num; i++) {
if (len >= PAGE_SIZE)
break;
len += scnprintf(buf + len, PAGE_SIZE - len, "%9u: ",
stats->freq_table[i]);
len += sysfs_emit_at(buf, len, "%9u: ", stats->freq_table[i]);
for (j = 0; j < stats->state_num; j++) {
if (len >= PAGE_SIZE)
......@@ -157,11 +155,11 @@ static ssize_t show_trans_table(struct cpufreq_policy *policy, char *buf)
else
count = stats->trans_table[i * stats->max_state + j];
len += scnprintf(buf + len, PAGE_SIZE - len, "%9u ", count);
len += sysfs_emit_at(buf, len, "%9u ", count);
}
if (len >= PAGE_SIZE)
break;
len += scnprintf(buf + len, PAGE_SIZE - len, "\n");
len += sysfs_emit_at(buf, len, "\n");
}
if (len >= PAGE_SIZE) {
......
......@@ -298,6 +298,7 @@ static int hwp_active __read_mostly;
static int hwp_mode_bdw __read_mostly;
static bool per_cpu_limits __read_mostly;
static bool hwp_boost __read_mostly;
static bool hwp_forced __read_mostly;
static struct cpufreq_driver *intel_pstate_driver __read_mostly;
......@@ -1679,12 +1680,12 @@ static void intel_pstate_update_epp_defaults(struct cpudata *cpudata)
return;
/*
* If powerup EPP is something other than chipset default 0x80 and
* - is more performance oriented than 0x80 (default balance_perf EPP)
* If the EPP is set by firmware, which means that firmware enabled HWP
* - Is equal or less than 0x80 (default balance_perf EPP)
* - But less performance oriented than performance EPP
* then use this as new balance_perf EPP.
*/
if (cpudata->epp_default < HWP_EPP_BALANCE_PERFORMANCE &&
if (hwp_forced && cpudata->epp_default <= HWP_EPP_BALANCE_PERFORMANCE &&
cpudata->epp_default > HWP_EPP_PERFORMANCE) {
epp_values[EPP_INDEX_BALANCE_PERFORMANCE] = cpudata->epp_default;
return;
......@@ -2378,6 +2379,7 @@ static const struct x86_cpu_id intel_pstate_cpu_ids[] = {
X86_MATCH(COMETLAKE, core_funcs),
X86_MATCH(ICELAKE_X, core_funcs),
X86_MATCH(TIGERLAKE, core_funcs),
X86_MATCH(SAPPHIRERAPIDS_X, core_funcs),
{}
};
MODULE_DEVICE_TABLE(x86cpu, intel_pstate_cpu_ids);
......@@ -3384,7 +3386,7 @@ static int __init intel_pstate_init(void)
id = x86_match_cpu(hwp_support_ids);
if (id) {
bool hwp_forced = intel_pstate_hwp_is_enabled();
hwp_forced = intel_pstate_hwp_is_enabled();
if (hwp_forced)
pr_info("HWP enabled by BIOS\n");
......
......@@ -407,10 +407,10 @@ static int guess_fsb(int mult)
{
int speed = cpu_khz / 1000;
int i;
int speeds[] = { 666, 1000, 1333, 2000 };
static const int speeds[] = { 666, 1000, 1333, 2000 };
int f_max, f_min;
for (i = 0; i < 4; i++) {
for (i = 0; i < ARRAY_SIZE(speeds); i++) {
f_max = ((speeds[i] * mult) + 50) / 100;
f_max += (ROUNDING / 2);
f_min = f_max - ROUNDING;
......
......@@ -160,6 +160,7 @@ static int mtk_cpu_resources_init(struct platform_device *pdev,
struct mtk_cpufreq_data *data;
struct device *dev = &pdev->dev;
struct resource *res;
struct of_phandle_args args;
void __iomem *base;
int ret, i;
int index;
......@@ -168,11 +169,14 @@ static int mtk_cpu_resources_init(struct platform_device *pdev,
if (!data)
return -ENOMEM;
index = of_perf_domain_get_sharing_cpumask(policy->cpu, "performance-domains",
"#performance-domain-cells",
policy->cpus);
if (index < 0)
return index;
ret = of_perf_domain_get_sharing_cpumask(policy->cpu, "performance-domains",
"#performance-domain-cells",
policy->cpus, &args);
if (ret < 0)
return ret;
index = args.args[0];
of_node_put(args.np);
res = platform_get_resource(pdev, IORESOURCE_MEM, index);
if (!res) {
......
This diff is collapsed.
......@@ -39,7 +39,7 @@ static struct clk *spear1340_cpu_get_possible_parent(unsigned long newfreq)
* In SPEAr1340, cpu clk's parent sys clk can take input from
* following sources
*/
const char *sys_clk_src[] = {
static const char * const sys_clk_src[] = {
"sys_syn_clk",
"pll1_clk",
"pll2_clk",
......
......@@ -65,8 +65,8 @@ struct tegra186_cpufreq_cluster {
struct tegra186_cpufreq_data {
void __iomem *regs;
struct tegra186_cpufreq_cluster *clusters;
const struct tegra186_cpufreq_cpu *cpus;
struct tegra186_cpufreq_cluster clusters[];
};
static int tegra186_cpufreq_init(struct cpufreq_policy *policy)
......@@ -221,15 +221,12 @@ static int tegra186_cpufreq_probe(struct platform_device *pdev)
struct tegra_bpmp *bpmp;
unsigned int i = 0, err;
data = devm_kzalloc(&pdev->dev, sizeof(*data), GFP_KERNEL);
data = devm_kzalloc(&pdev->dev,
struct_size(data, clusters, TEGRA186_NUM_CLUSTERS),
GFP_KERNEL);
if (!data)
return -ENOMEM;
data->clusters = devm_kcalloc(&pdev->dev, TEGRA186_NUM_CLUSTERS,
sizeof(*data->clusters), GFP_KERNEL);
if (!data->clusters)
return -ENOMEM;
data->cpus = tegra186_cpus;
bpmp = tegra_bpmp_get(&pdev->dev);
......
......@@ -39,6 +39,14 @@
#define OMAP34xx_ProdID_SKUID 0x4830A20C
#define OMAP3_SYSCON_BASE (0x48000000 + 0x2000 + 0x270)
#define AM625_EFUSE_K_MPU_OPP 11
#define AM625_EFUSE_S_MPU_OPP 19
#define AM625_EFUSE_T_MPU_OPP 20
#define AM625_SUPPORT_K_MPU_OPP BIT(0)
#define AM625_SUPPORT_S_MPU_OPP BIT(1)
#define AM625_SUPPORT_T_MPU_OPP BIT(2)
#define VERSION_COUNT 2
struct ti_cpufreq_data;
......@@ -104,6 +112,25 @@ static unsigned long omap3_efuse_xlate(struct ti_cpufreq_data *opp_data,
return BIT(efuse);
}
static unsigned long am625_efuse_xlate(struct ti_cpufreq_data *opp_data,
unsigned long efuse)
{
unsigned long calculated_efuse = AM625_SUPPORT_K_MPU_OPP;
switch (efuse) {
case AM625_EFUSE_T_MPU_OPP:
calculated_efuse |= AM625_SUPPORT_T_MPU_OPP;
fallthrough;
case AM625_EFUSE_S_MPU_OPP:
calculated_efuse |= AM625_SUPPORT_S_MPU_OPP;
fallthrough;
case AM625_EFUSE_K_MPU_OPP:
calculated_efuse |= AM625_SUPPORT_K_MPU_OPP;
}
return calculated_efuse;
}
static struct ti_cpufreq_soc_data am3x_soc_data = {
.efuse_xlate = amx3_efuse_xlate,
.efuse_fallback = AM33XX_800M_ARM_MPU_MAX_FREQ,
......@@ -198,6 +225,14 @@ static struct ti_cpufreq_soc_data am3517_soc_data = {
.multi_regulator = false,
};
static struct ti_cpufreq_soc_data am625_soc_data = {
.efuse_xlate = am625_efuse_xlate,
.efuse_offset = 0x0018,
.efuse_mask = 0x07c0,
.efuse_shift = 0x6,
.rev_offset = 0x0014,
.multi_regulator = false,
};
/**
* ti_cpufreq_get_efuse() - Parse and return efuse value present on SoC
......@@ -301,6 +336,7 @@ static const struct of_device_id ti_cpufreq_of_match[] = {
{ .compatible = "ti,dra7", .data = &dra7_soc_data },
{ .compatible = "ti,omap34xx", .data = &omap34xx_soc_data, },
{ .compatible = "ti,omap36xx", .data = &omap36xx_soc_data, },
{ .compatible = "ti,am625", .data = &am625_soc_data, },
/* legacy */
{ .compatible = "ti,omap3430", .data = &omap34xx_soc_data, },
{ .compatible = "ti,omap3630", .data = &omap36xx_soc_data, },
......
......@@ -181,7 +181,8 @@ static int psci_cpuidle_domain_probe(struct platform_device *pdev)
if (ret)
goto remove_pd;
pr_info("Initialized CPU PM domain topology\n");
pr_info("Initialized CPU PM domain topology using %s mode\n",
use_osi ? "OSI" : "PC");
return 0;
put_node:
......
......@@ -211,18 +211,15 @@ int dt_init_idle_driver(struct cpuidle_driver *drv,
of_node_put(cpu_node);
if (err)
return err;
/*
* Update the driver state count only if some valid DT idle states
* were detected
*/
if (i)
drv->state_count = state_idx;
/* Set the number of total supported idle states. */
drv->state_count = state_idx;
/*
* Return the number of present and valid DT idle states, which can
* also be 0 on platforms with missing DT idle states or legacy DT
* configuration predating the DT idle states bindings.
*/
return i;
return state_idx - start_idx;
}
EXPORT_SYMBOL_GPL(dt_init_idle_driver);
......@@ -233,7 +233,7 @@ struct devfreq_event_dev *devfreq_event_get_edev_by_phandle(struct device *dev,
mutex_lock(&devfreq_event_list_lock);
list_for_each_entry(edev, &devfreq_event_list, node) {
if (edev->dev.parent && edev->dev.parent->of_node == node)
if (edev->dev.parent && device_match_of_node(edev->dev.parent, node))
goto out;
}
......
......@@ -776,8 +776,7 @@ static void remove_sysfs_files(struct devfreq *devfreq,
* @dev: the device to add devfreq feature.
* @profile: device-specific profile to run devfreq.
* @governor_name: name of the policy to choose frequency.
* @data: private data for the governor. The devfreq framework does not
* touch this value.
* @data: devfreq driver pass to governors, governor should not change it.
*/
struct devfreq *devfreq_add_device(struct device *dev,
struct devfreq_dev_profile *profile,
......@@ -1011,8 +1010,7 @@ static void devm_devfreq_dev_release(struct device *dev, void *res)
* @dev: the device to add devfreq feature.
* @profile: device-specific profile to run devfreq.
* @governor_name: name of the policy to choose frequency.
* @data: private data for the governor. The devfreq framework does not
* touch this value.
* @data: devfreq driver pass to governors, governor should not change it.
*
* This function manages automatically the memory of devfreq device using device
* resource management and simplify the free operation for memory of devfreq
......@@ -1059,7 +1057,7 @@ struct devfreq *devfreq_get_devfreq_by_node(struct device_node *node)
mutex_lock(&devfreq_list_lock);
list_for_each_entry(devfreq, &devfreq_list, node) {
if (devfreq->dev.parent
&& devfreq->dev.parent->of_node == node) {
&& device_match_of_node(devfreq->dev.parent, node)) {
mutex_unlock(&devfreq_list_lock);
return devfreq;
}
......
......@@ -214,8 +214,7 @@ static int exynos_nocp_parse_dt(struct platform_device *pdev,
nocp->clk = NULL;
/* Maps the memory mapped IO to control nocp register */
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
base = devm_ioremap_resource(dev, res);
base = devm_platform_get_and_ioremap_resource(pdev, 0, &res);
if (IS_ERR(base))
return PTR_ERR(base);
......
......@@ -21,7 +21,7 @@ struct userspace_data {
static int devfreq_userspace_func(struct devfreq *df, unsigned long *freq)
{
struct userspace_data *data = df->data;
struct userspace_data *data = df->governor_data;
if (data->valid)
*freq = data->user_frequency;
......@@ -40,7 +40,7 @@ static ssize_t set_freq_store(struct device *dev, struct device_attribute *attr,
int err = 0;
mutex_lock(&devfreq->lock);
data = devfreq->data;
data = devfreq->governor_data;
sscanf(buf, "%lu", &wanted);
data->user_frequency = wanted;
......@@ -60,7 +60,7 @@ static ssize_t set_freq_show(struct device *dev,
int err = 0;
mutex_lock(&devfreq->lock);
data = devfreq->data;
data = devfreq->governor_data;
if (data->valid)
err = sprintf(buf, "%lu\n", data->user_frequency);
......@@ -91,7 +91,7 @@ static int userspace_init(struct devfreq *devfreq)
goto out;
}
data->valid = false;
devfreq->data = data;
devfreq->governor_data = data;
err = sysfs_create_group(&devfreq->dev.kobj, &dev_attr_group);
out:
......@@ -107,8 +107,8 @@ static void userspace_exit(struct devfreq *devfreq)
if (devfreq->dev.kobj.sd)
sysfs_remove_group(&devfreq->dev.kobj, &dev_attr_group);
kfree(devfreq->data);
devfreq->data = NULL;
kfree(devfreq->governor_data);
devfreq->governor_data = NULL;
}
static int devfreq_userspace_handler(struct devfreq *devfreq,
......
......@@ -578,169 +578,140 @@ static bool _opp_is_supported(struct device *dev, struct opp_table *opp_table,
return false;
}
static int opp_parse_supplies(struct dev_pm_opp *opp, struct device *dev,
struct opp_table *opp_table)
static u32 *_parse_named_prop(struct dev_pm_opp *opp, struct device *dev,
struct opp_table *opp_table,
const char *prop_type, bool *triplet)
{
u32 *microvolt, *microamp = NULL, *microwatt = NULL;
int supplies = opp_table->regulator_count;
int vcount, icount, pcount, ret, i, j;
struct property *prop = NULL;
char name[NAME_MAX];
int count, ret;
u32 *out;
/* Search for "opp-microvolt-<name>" */
/* Search for "opp-<prop_type>-<name>" */
if (opp_table->prop_name) {
snprintf(name, sizeof(name), "opp-microvolt-%s",
snprintf(name, sizeof(name), "opp-%s-%s", prop_type,
opp_table->prop_name);
prop = of_find_property(opp->np, name, NULL);
}
if (!prop) {
/* Search for "opp-microvolt" */
sprintf(name, "opp-microvolt");
/* Search for "opp-<prop_type>" */
snprintf(name, sizeof(name), "opp-%s", prop_type);
prop = of_find_property(opp->np, name, NULL);
/* Missing property isn't a problem, but an invalid entry is */
if (!prop) {
if (unlikely(supplies == -1)) {
/* Initialize regulator_count */
opp_table->regulator_count = 0;
return 0;
}
if (!supplies)
return 0;
dev_err(dev, "%s: opp-microvolt missing although OPP managing regulators\n",
__func__);
return -EINVAL;
}
if (!prop)
return NULL;
}
if (unlikely(supplies == -1)) {
/* Initialize regulator_count */
supplies = opp_table->regulator_count = 1;
} else if (unlikely(!supplies)) {
dev_err(dev, "%s: opp-microvolt wasn't expected\n", __func__);
return -EINVAL;
count = of_property_count_u32_elems(opp->np, name);
if (count < 0) {
dev_err(dev, "%s: Invalid %s property (%d)\n", __func__, name,
count);
return ERR_PTR(count);
}
vcount = of_property_count_u32_elems(opp->np, name);
if (vcount < 0) {
dev_err(dev, "%s: Invalid %s property (%d)\n",
__func__, name, vcount);
return vcount;
}
/* There can be one or three elements per supply */
if (vcount != supplies && vcount != supplies * 3) {
dev_err(dev, "%s: Invalid number of elements in %s property (%d) with supplies (%d)\n",
__func__, name, vcount, supplies);
return -EINVAL;
/*
* Initialize regulator_count, if regulator information isn't provided
* by the platform. Now that one of the properties is available, fix the
* regulator_count to 1.
*/
if (unlikely(opp_table->regulator_count == -1))
opp_table->regulator_count = 1;
if (count != opp_table->regulator_count &&
(!triplet || count != opp_table->regulator_count * 3)) {
dev_err(dev, "%s: Invalid number of elements in %s property (%u) with supplies (%d)\n",
__func__, prop_type, count, opp_table->regulator_count);
return ERR_PTR(-EINVAL);
}
microvolt = kmalloc_array(vcount, sizeof(*microvolt), GFP_KERNEL);
if (!microvolt)
return -ENOMEM;
out = kmalloc_array(count, sizeof(*out), GFP_KERNEL);
if (!out)
return ERR_PTR(-EINVAL);
ret = of_property_read_u32_array(opp->np, name, microvolt, vcount);
ret = of_property_read_u32_array(opp->np, name, out, count);
if (ret) {
dev_err(dev, "%s: error parsing %s: %d\n", __func__, name, ret);
ret = -EINVAL;
goto free_microvolt;
}
/* Search for "opp-microamp-<name>" */
prop = NULL;
if (opp_table->prop_name) {
snprintf(name, sizeof(name), "opp-microamp-%s",
opp_table->prop_name);
prop = of_find_property(opp->np, name, NULL);
kfree(out);
return ERR_PTR(-EINVAL);
}
if (!prop) {
/* Search for "opp-microamp" */
sprintf(name, "opp-microamp");
prop = of_find_property(opp->np, name, NULL);
}
if (triplet)
*triplet = count != opp_table->regulator_count;
if (prop) {
icount = of_property_count_u32_elems(opp->np, name);
if (icount < 0) {
dev_err(dev, "%s: Invalid %s property (%d)\n", __func__,
name, icount);
ret = icount;
goto free_microvolt;
}
return out;
}
if (icount != supplies) {
dev_err(dev, "%s: Invalid number of elements in %s property (%d) with supplies (%d)\n",
__func__, name, icount, supplies);
ret = -EINVAL;
goto free_microvolt;
}
static u32 *opp_parse_microvolt(struct dev_pm_opp *opp, struct device *dev,
struct opp_table *opp_table, bool *triplet)
{
u32 *microvolt;
microamp = kmalloc_array(icount, sizeof(*microamp), GFP_KERNEL);
if (!microamp) {
ret = -EINVAL;
goto free_microvolt;
}
microvolt = _parse_named_prop(opp, dev, opp_table, "microvolt", triplet);
if (IS_ERR(microvolt))
return microvolt;
ret = of_property_read_u32_array(opp->np, name, microamp,
icount);
if (ret) {
dev_err(dev, "%s: error parsing %s: %d\n", __func__,
name, ret);
ret = -EINVAL;
goto free_microamp;
if (!microvolt) {
/*
* Missing property isn't a problem, but an invalid
* entry is. This property isn't optional if regulator
* information is provided. Check only for the first OPP, as
* regulator_count may get initialized after that to a valid
* value.
*/
if (list_empty(&opp_table->opp_list) &&
opp_table->regulator_count > 0) {
dev_err(dev, "%s: opp-microvolt missing although OPP managing regulators\n",
__func__);
return ERR_PTR(-EINVAL);
}
}
/* Search for "opp-microwatt" */
sprintf(name, "opp-microwatt");
prop = of_find_property(opp->np, name, NULL);
if (prop) {
pcount = of_property_count_u32_elems(opp->np, name);
if (pcount < 0) {
dev_err(dev, "%s: Invalid %s property (%d)\n", __func__,
name, pcount);
ret = pcount;
goto free_microamp;
}
return microvolt;
}
if (pcount != supplies) {
dev_err(dev, "%s: Invalid number of elements in %s property (%d) with supplies (%d)\n",
__func__, name, pcount, supplies);
ret = -EINVAL;
goto free_microamp;
}
static int opp_parse_supplies(struct dev_pm_opp *opp, struct device *dev,
struct opp_table *opp_table)
{
u32 *microvolt, *microamp, *microwatt;
int ret = 0, i, j;
bool triplet;
microwatt = kmalloc_array(pcount, sizeof(*microwatt),
GFP_KERNEL);
if (!microwatt) {
ret = -EINVAL;
goto free_microamp;
}
microvolt = opp_parse_microvolt(opp, dev, opp_table, &triplet);
if (IS_ERR(microvolt))
return PTR_ERR(microvolt);
ret = of_property_read_u32_array(opp->np, name, microwatt,
pcount);
if (ret) {
dev_err(dev, "%s: error parsing %s: %d\n", __func__,
name, ret);
ret = -EINVAL;
goto free_microwatt;
}
microamp = _parse_named_prop(opp, dev, opp_table, "microamp", NULL);
if (IS_ERR(microamp)) {
ret = PTR_ERR(microamp);
goto free_microvolt;
}
for (i = 0, j = 0; i < supplies; i++) {
opp->supplies[i].u_volt = microvolt[j++];
microwatt = _parse_named_prop(opp, dev, opp_table, "microwatt", NULL);
if (IS_ERR(microwatt)) {
ret = PTR_ERR(microwatt);
goto free_microamp;
}
if (vcount == supplies) {
opp->supplies[i].u_volt_min = opp->supplies[i].u_volt;
opp->supplies[i].u_volt_max = opp->supplies[i].u_volt;
} else {
opp->supplies[i].u_volt_min = microvolt[j++];
opp->supplies[i].u_volt_max = microvolt[j++];
/*
* Initialize regulator_count if it is uninitialized and no properties
* are found.
*/
if (unlikely(opp_table->regulator_count == -1)) {
opp_table->regulator_count = 0;
return 0;
}
for (i = 0, j = 0; i < opp_table->regulator_count; i++) {
if (microvolt) {
opp->supplies[i].u_volt = microvolt[j++];
if (triplet) {
opp->supplies[i].u_volt_min = microvolt[j++];
opp->supplies[i].u_volt_max = microvolt[j++];
} else {
opp->supplies[i].u_volt_min = opp->supplies[i].u_volt;
opp->supplies[i].u_volt_max = opp->supplies[i].u_volt;
}
}
if (microamp)
......@@ -750,7 +721,6 @@ static int opp_parse_supplies(struct dev_pm_opp *opp, struct device *dev,
opp->supplies[i].u_watt = microwatt[i];
}
free_microwatt:
kfree(microwatt);
free_microamp:
kfree(microamp);
......
......@@ -203,6 +203,7 @@ static const struct x86_cpu_id intel_uncore_cpu_ids[] = {
X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_X, NULL),
X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_D, NULL),
X86_MATCH_INTEL_FAM6_MODEL(SAPPHIRERAPIDS_X, NULL),
X86_MATCH_INTEL_FAM6_MODEL(EMERALDRAPIDS_X, NULL),
{}
};
MODULE_DEVICE_TABLE(x86cpu, intel_uncore_cpu_ids);
......
......@@ -44,6 +44,19 @@ config IDLE_INJECT
synchronously on a set of specified CPUs or alternatively
on a per CPU basis.
config ARM_SCMI_POWERCAP
tristate "ARM SCMI Powercap driver"
depends on ARM_SCMI_PROTOCOL
help
This enables support for the ARM Powercap based on ARM SCMI
Powercap protocol.
ARM SCMI Powercap protocol allows power limits to be enforced
and monitored against the SCMI Powercap domains advertised as
available by the SCMI platform firmware.
When compiled as module it will be called arm_scmi_powercap.ko.
config DTPM
bool "Power capping for Dynamic Thermal Power Management (EXPERIMENTAL)"
depends on OF
......
......@@ -6,3 +6,4 @@ obj-$(CONFIG_POWERCAP) += powercap_sys.o
obj-$(CONFIG_INTEL_RAPL_CORE) += intel_rapl_common.o
obj-$(CONFIG_INTEL_RAPL) += intel_rapl_msr.o
obj-$(CONFIG_IDLE_INJECT) += idle_inject.o
obj-$(CONFIG_ARM_SCMI_POWERCAP) += arm_scmi_powercap.o
This diff is collapsed.
......@@ -147,6 +147,7 @@ static void idle_inject_fn(unsigned int cpu)
/**
* idle_inject_set_duration - idle and run duration update helper
* @ii_dev: idle injection control device structure
* @run_duration_us: CPU run time to allow in microseconds
* @idle_duration_us: CPU idle time to inject in microseconds
*/
......@@ -162,6 +163,7 @@ void idle_inject_set_duration(struct idle_inject_device *ii_dev,
/**
* idle_inject_get_duration - idle and run duration retrieval helper
* @ii_dev: idle injection control device structure
* @run_duration_us: memory location to store the current CPU run time
* @idle_duration_us: memory location to store the current CPU idle time
*/
......@@ -175,6 +177,7 @@ void idle_inject_get_duration(struct idle_inject_device *ii_dev,
/**
* idle_inject_set_latency - set the maximum latency allowed
* @ii_dev: idle injection control device structure
* @latency_us: set the latency requirement for the idle state
*/
void idle_inject_set_latency(struct idle_inject_device *ii_dev,
......
......@@ -7,6 +7,7 @@
#include <linux/module.h>
#include <linux/device.h>
#include <linux/err.h>
#include <linux/kstrtox.h>
#include <linux/slab.h>
#include <linux/powercap.h>
......@@ -446,7 +447,7 @@ static ssize_t enabled_store(struct device *dev,
{
bool mode;
if (strtobool(buf, &mode))
if (kstrtobool(buf, &mode))
return -EINVAL;
if (dev->parent) {
struct powercap_zone *power_zone = to_powercap_zone(dev);
......
......@@ -1110,10 +1110,10 @@ cpufreq_table_set_inefficient(struct cpufreq_policy *policy,
}
static inline int parse_perf_domain(int cpu, const char *list_name,
const char *cell_name)
const char *cell_name,
struct of_phandle_args *args)
{
struct device_node *cpu_np;
struct of_phandle_args args;
int ret;
cpu_np = of_cpu_device_node_get(cpu);
......@@ -1121,41 +1121,44 @@ static inline int parse_perf_domain(int cpu, const char *list_name,
return -ENODEV;
ret = of_parse_phandle_with_args(cpu_np, list_name, cell_name, 0,
&args);
args);
if (ret < 0)
return ret;
of_node_put(cpu_np);
return args.args[0];
return 0;
}
static inline int of_perf_domain_get_sharing_cpumask(int pcpu, const char *list_name,
const char *cell_name, struct cpumask *cpumask)
const char *cell_name, struct cpumask *cpumask,
struct of_phandle_args *pargs)
{
int target_idx;
int cpu, ret;
struct of_phandle_args args;
ret = parse_perf_domain(pcpu, list_name, cell_name);
ret = parse_perf_domain(pcpu, list_name, cell_name, pargs);
if (ret < 0)
return ret;
target_idx = ret;
cpumask_set_cpu(pcpu, cpumask);
for_each_possible_cpu(cpu) {
if (cpu == pcpu)
continue;
ret = parse_perf_domain(cpu, list_name, cell_name);
ret = parse_perf_domain(cpu, list_name, cell_name, &args);
if (ret < 0)
continue;
if (target_idx == ret)
if (pargs->np == args.np && pargs->args_count == args.args_count &&
!memcmp(pargs->args, args.args, sizeof(args.args[0]) * args.args_count))
cpumask_set_cpu(cpu, cpumask);
of_node_put(args.np);
}
return target_idx;
return 0;
}
#else
static inline int cpufreq_boost_trigger_state(int state)
......@@ -1185,7 +1188,8 @@ cpufreq_table_set_inefficient(struct cpufreq_policy *policy,
}
static inline int of_perf_domain_get_sharing_cpumask(int pcpu, const char *list_name,
const char *cell_name, struct cpumask *cpumask)
const char *cell_name, struct cpumask *cpumask,
struct of_phandle_args *pargs)
{
return -EOPNOTSUPP;
}
......
......@@ -152,8 +152,8 @@ struct devfreq_stats {
* @max_state: count of entry present in the frequency table.
* @previous_freq: previously configured frequency value.
* @last_status: devfreq user device info, performance statistics
* @data: Private data of the governor. The devfreq framework does not
* touch this.
* @data: devfreq driver pass to governors, governor should not change it.
* @governor_data: private data for governors, devfreq core doesn't touch it.
* @user_min_freq_req: PM QoS minimum frequency request from user (via sysfs)
* @user_max_freq_req: PM QoS maximum frequency request from user (via sysfs)
* @scaling_min_freq: Limit minimum frequency requested by OPP interface
......@@ -193,7 +193,8 @@ struct devfreq {
unsigned long previous_freq;
struct devfreq_dev_status last_status;
void *data; /* private data for governors */
void *data;
void *governor_data;
struct dev_pm_qos_request user_min_freq_req;
struct dev_pm_qos_request user_max_freq_req;
......
......@@ -62,7 +62,7 @@ static int notifier_chain_unregister(struct notifier_block **nl,
* value of this parameter is -1.
* @nr_calls: Records the number of notifications sent. Don't care
* value of this field is NULL.
* @returns: notifier_call_chain returns the value returned by the
* Return: notifier_call_chain returns the value returned by the
* last notifier function called.
*/
static int notifier_call_chain(struct notifier_block **nl,
......@@ -105,13 +105,13 @@ NOKPROBE_SYMBOL(notifier_call_chain);
* @val_up: Value passed unmodified to the notifier function
* @val_down: Value passed unmodified to the notifier function when recovering
* from an error on @val_up
* @v Pointer passed unmodified to the notifier function
* @v: Pointer passed unmodified to the notifier function
*
* NOTE: It is important the @nl chain doesn't change between the two
* invocations of notifier_call_chain() such that we visit the
* exact same notifier callbacks; this rules out any RCU usage.
*
* Returns: the return value of the @val_up call.
* Return: the return value of the @val_up call.
*/
static int notifier_call_chain_robust(struct notifier_block **nl,
unsigned long val_up, unsigned long val_down,
......
......@@ -27,6 +27,8 @@ unsigned int __read_mostly freeze_timeout_msecs = 20 * MSEC_PER_SEC;
static int try_to_freeze_tasks(bool user_only)
{
const char *what = user_only ? "user space processes" :
"remaining freezable tasks";
struct task_struct *g, *p;
unsigned long end_time;
unsigned int todo;
......@@ -36,6 +38,8 @@ static int try_to_freeze_tasks(bool user_only)
bool wakeup = false;
int sleep_usecs = USEC_PER_MSEC;
pr_info("Freezing %s\n", what);
start = ktime_get_boottime();
end_time = jiffies + msecs_to_jiffies(freeze_timeout_msecs);
......@@ -82,9 +86,8 @@ static int try_to_freeze_tasks(bool user_only)
elapsed_msecs = ktime_to_ms(elapsed);
if (todo) {
pr_cont("\n");
pr_err("Freezing of tasks %s after %d.%03d seconds "
"(%d tasks refusing to freeze, wq_busy=%d):\n",
pr_err("Freezing %s %s after %d.%03d seconds "
"(%d tasks refusing to freeze, wq_busy=%d):\n", what,
wakeup ? "aborted" : "failed",
elapsed_msecs / 1000, elapsed_msecs % 1000,
todo - wq_busy, wq_busy);
......@@ -101,8 +104,8 @@ static int try_to_freeze_tasks(bool user_only)
read_unlock(&tasklist_lock);
}
} else {
pr_cont("(elapsed %d.%03d seconds) ", elapsed_msecs / 1000,
elapsed_msecs % 1000);
pr_info("Freezing %s completed (elapsed %d.%03d seconds)\n",
what, elapsed_msecs / 1000, elapsed_msecs % 1000);
}
return todo ? -EBUSY : 0;
......@@ -130,14 +133,11 @@ int freeze_processes(void)
static_branch_inc(&freezer_active);
pm_wakeup_clear(0);
pr_info("Freezing user space processes ... ");
pm_freezing = true;
error = try_to_freeze_tasks(true);
if (!error) {
if (!error)
__usermodehelper_set_disable_depth(UMH_DISABLED);
pr_cont("done.");
}
pr_cont("\n");
BUG_ON(in_atomic());
/*
......@@ -166,14 +166,9 @@ int freeze_kernel_threads(void)
{
int error;
pr_info("Freezing remaining freezable tasks ... ");
pm_nosig_freezing = true;
error = try_to_freeze_tasks(false);
if (!error)
pr_cont("done.");
pr_cont("\n");
BUG_ON(in_atomic());
if (error)
......
......@@ -1723,8 +1723,8 @@ static unsigned long minimum_image_size(unsigned long saveable)
* /sys/power/reserved_size, respectively). To make this happen, we compute the
* total number of available page frames and allocate at least
*
* ([page frames total] + PAGES_FOR_IO + [metadata pages]) / 2
* + 2 * DIV_ROUND_UP(reserved_size, PAGE_SIZE)
* ([page frames total] - PAGES_FOR_IO - [metadata pages]) / 2
* - 2 * DIV_ROUND_UP(reserved_size, PAGE_SIZE)
*
* of them, which corresponds to the maximum size of a hibernation image.
*
......@@ -2259,10 +2259,14 @@ static int unpack_orig_pfns(unsigned long *buf, struct memory_bitmap *bm)
if (unlikely(buf[j] == BM_END_OF_MAP))
break;
if (pfn_valid(buf[j]) && memory_bm_pfn_present(bm, buf[j]))
if (pfn_valid(buf[j]) && memory_bm_pfn_present(bm, buf[j])) {
memory_bm_set_bit(bm, buf[j]);
else
} else {
if (!pfn_valid(buf[j]))
pr_err(FW_BUG "Memory map mismatch at 0x%llx after hibernation\n",
(unsigned long long)PFN_PHYS(buf[j]));
return -EFAULT;
}
}
return 0;
......
......@@ -131,9 +131,10 @@ UTIL_OBJS = utils/helpers/amd.o utils/helpers/msr.o \
utils/idle_monitor/hsw_ext_idle.o \
utils/idle_monitor/amd_fam14h_idle.o utils/idle_monitor/cpuidle_sysfs.o \
utils/idle_monitor/mperf_monitor.o utils/idle_monitor/cpupower-monitor.o \
utils/idle_monitor/rapl_monitor.o \
utils/cpupower.o utils/cpufreq-info.o utils/cpufreq-set.o \
utils/cpupower-set.o utils/cpupower-info.o utils/cpuidle-info.o \
utils/cpuidle-set.o
utils/cpuidle-set.o utils/powercap-info.o
UTIL_SRC := $(UTIL_OBJS:.o=.c)
......@@ -143,9 +144,12 @@ UTIL_HEADERS = utils/helpers/helpers.h utils/idle_monitor/cpupower-monitor.h \
utils/helpers/bitmask.h \
utils/idle_monitor/idle_monitors.h utils/idle_monitor/idle_monitors.def
LIB_HEADERS = lib/cpufreq.h lib/cpupower.h lib/cpuidle.h lib/acpi_cppc.h
LIB_SRC = lib/cpufreq.c lib/cpupower.c lib/cpuidle.c lib/acpi_cppc.c
LIB_OBJS = lib/cpufreq.o lib/cpupower.o lib/cpuidle.o lib/acpi_cppc.o
LIB_HEADERS = lib/cpufreq.h lib/cpupower.h lib/cpuidle.h lib/acpi_cppc.h \
lib/powercap.h
LIB_SRC = lib/cpufreq.c lib/cpupower.c lib/cpuidle.c lib/acpi_cppc.c \
lib/powercap.c
LIB_OBJS = lib/cpufreq.o lib/cpupower.o lib/cpuidle.o lib/acpi_cppc.o \
lib/powercap.o
LIB_OBJS := $(addprefix $(OUTPUT),$(LIB_OBJS))
override CFLAGS += -pipe
......@@ -276,6 +280,7 @@ install-lib: libcpupower
$(INSTALL) -d $(DESTDIR)${includedir}
$(INSTALL_DATA) lib/cpufreq.h $(DESTDIR)${includedir}/cpufreq.h
$(INSTALL_DATA) lib/cpuidle.h $(DESTDIR)${includedir}/cpuidle.h
$(INSTALL_DATA) lib/powercap.h $(DESTDIR)${includedir}/powercap.h
install-tools: $(OUTPUT)cpupower
$(INSTALL) -d $(DESTDIR)${bindir}
......@@ -292,6 +297,7 @@ install-man:
$(INSTALL_DATA) -D man/cpupower-set.1 $(DESTDIR)${mandir}/man1/cpupower-set.1
$(INSTALL_DATA) -D man/cpupower-info.1 $(DESTDIR)${mandir}/man1/cpupower-info.1
$(INSTALL_DATA) -D man/cpupower-monitor.1 $(DESTDIR)${mandir}/man1/cpupower-monitor.1
$(INSTALL_DATA) -D man/cpupower-powercap-info.1 $(DESTDIR)${mandir}/man1/cpupower-powercap-info.1
install-gmo: create-gmo
$(INSTALL) -d $(DESTDIR)${localedir}
......@@ -321,6 +327,7 @@ uninstall:
- rm -f $(DESTDIR)${mandir}/man1/cpupower-set.1
- rm -f $(DESTDIR)${mandir}/man1/cpupower-info.1
- rm -f $(DESTDIR)${mandir}/man1/cpupower-monitor.1
- rm -f $(DESTDIR)${mandir}/man1/cpupower-powercap-info.1
- for HLANG in $(LANGUAGES); do \
rm -f $(DESTDIR)${localedir}/$$HLANG/LC_MESSAGES/cpupower.mo; \
done;
......
// SPDX-License-Identifier: GPL-2.0-only
/*
* (C) 2016 SUSE Software Solutions GmbH
* Thomas Renninger <trenn@suse.de>
*/
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <stdio.h>
#include <dirent.h>
#include "powercap.h"
static unsigned int sysfs_read_file(const char *path, char *buf, size_t buflen)
{
int fd;
ssize_t numread;
fd = open(path, O_RDONLY);
if (fd == -1)
return 0;
numread = read(fd, buf, buflen - 1);
if (numread < 1) {
close(fd);
return 0;
}
buf[numread] = '\0';
close(fd);
return (unsigned int) numread;
}
static int sysfs_get_enabled(char *path, int *mode)
{
int fd;
char yes_no;
*mode = 0;
fd = open(path, O_RDONLY);
if (fd == -1)
return -1;
if (read(fd, &yes_no, 1) != 1) {
close(fd);
return -1;
}
if (yes_no == '1') {
*mode = 1;
return 0;
} else if (yes_no == '0') {
return 0;
}
return -1;
}
int powercap_get_enabled(int *mode)
{
char path[SYSFS_PATH_MAX] = PATH_TO_POWERCAP "/intel-rapl/enabled";
return sysfs_get_enabled(path, mode);
}
/*
* Hardcoded, because rapl is the only powercap implementation
- * this needs to get more generic if more powercap implementations
* should show up
*/
int powercap_get_driver(char *driver, int buflen)
{
char file[SYSFS_PATH_MAX] = PATH_TO_RAPL;
struct stat statbuf;
if (stat(file, &statbuf) != 0 || !S_ISDIR(statbuf.st_mode)) {
driver = "";
return -1;
} else if (buflen > 10) {
strcpy(driver, "intel-rapl");
return 0;
} else
return -1;
}
enum powercap_get64 {
GET_ENERGY_UJ,
GET_MAX_ENERGY_RANGE_UJ,
GET_POWER_UW,
GET_MAX_POWER_RANGE_UW,
MAX_GET_64_FILES
};
static const char *powercap_get64_files[MAX_GET_64_FILES] = {
[GET_POWER_UW] = "power_uw",
[GET_MAX_POWER_RANGE_UW] = "max_power_range_uw",
[GET_ENERGY_UJ] = "energy_uj",
[GET_MAX_ENERGY_RANGE_UJ] = "max_energy_range_uj",
};
static int sysfs_powercap_get64_val(struct powercap_zone *zone,
enum powercap_get64 which,
uint64_t *val)
{
char file[SYSFS_PATH_MAX] = PATH_TO_POWERCAP "/";
int ret;
char buf[MAX_LINE_LEN];
strcat(file, zone->sys_name);
strcat(file, "/");
strcat(file, powercap_get64_files[which]);
ret = sysfs_read_file(file, buf, MAX_LINE_LEN);
if (ret < 0)
return ret;
if (ret == 0)
return -1;
*val = strtoll(buf, NULL, 10);
return 0;
}
int powercap_get_max_energy_range_uj(struct powercap_zone *zone, uint64_t *val)
{
return sysfs_powercap_get64_val(zone, GET_MAX_ENERGY_RANGE_UJ, val);
}
int powercap_get_energy_uj(struct powercap_zone *zone, uint64_t *val)
{
return sysfs_powercap_get64_val(zone, GET_ENERGY_UJ, val);
}
int powercap_get_max_power_range_uw(struct powercap_zone *zone, uint64_t *val)
{
return sysfs_powercap_get64_val(zone, GET_MAX_POWER_RANGE_UW, val);
}
int powercap_get_power_uw(struct powercap_zone *zone, uint64_t *val)
{
return sysfs_powercap_get64_val(zone, GET_POWER_UW, val);
}
int powercap_zone_get_enabled(struct powercap_zone *zone, int *mode)
{
char path[SYSFS_PATH_MAX] = PATH_TO_POWERCAP;
if ((strlen(PATH_TO_POWERCAP) + strlen(zone->sys_name)) +
strlen("/enabled") + 1 >= SYSFS_PATH_MAX)
return -1;
strcat(path, "/");
strcat(path, zone->sys_name);
strcat(path, "/enabled");
return sysfs_get_enabled(path, mode);
}
int powercap_zone_set_enabled(struct powercap_zone *zone, int mode)
{
/* To be done if needed */
return 0;
}
int powercap_read_zone(struct powercap_zone *zone)
{
struct dirent *dent;
DIR *zone_dir;
char sysfs_dir[SYSFS_PATH_MAX] = PATH_TO_POWERCAP;
struct powercap_zone *child_zone;
char file[SYSFS_PATH_MAX] = PATH_TO_POWERCAP;
int i, ret = 0;
uint64_t val = 0;
strcat(sysfs_dir, "/");
strcat(sysfs_dir, zone->sys_name);
zone_dir = opendir(sysfs_dir);
if (zone_dir == NULL)
return -1;
strcat(file, "/");
strcat(file, zone->sys_name);
strcat(file, "/name");
sysfs_read_file(file, zone->name, MAX_LINE_LEN);
if (zone->parent)
zone->tree_depth = zone->parent->tree_depth + 1;
ret = powercap_get_energy_uj(zone, &val);
if (ret == 0)
zone->has_energy_uj = 1;
ret = powercap_get_power_uw(zone, &val);
if (ret == 0)
zone->has_power_uw = 1;
while ((dent = readdir(zone_dir)) != NULL) {
struct stat st;
if (strcmp(dent->d_name, ".") == 0 || strcmp(dent->d_name, "..") == 0)
continue;
if (stat(dent->d_name, &st) != 0 || !S_ISDIR(st.st_mode))
if (fstatat(dirfd(zone_dir), dent->d_name, &st, 0) < 0)
continue;
if (strncmp(dent->d_name, "intel-rapl:", 11) != 0)
continue;
child_zone = calloc(1, sizeof(struct powercap_zone));
if (child_zone == NULL)
return -1;
for (i = 0; i < POWERCAP_MAX_CHILD_ZONES; i++) {
if (zone->children[i] == NULL) {
zone->children[i] = child_zone;
break;
}
if (i == POWERCAP_MAX_CHILD_ZONES - 1) {
free(child_zone);
fprintf(stderr, "Reached POWERCAP_MAX_CHILD_ZONES %d\n",
POWERCAP_MAX_CHILD_ZONES);
return -1;
}
}
strcpy(child_zone->sys_name, zone->sys_name);
strcat(child_zone->sys_name, "/");
strcat(child_zone->sys_name, dent->d_name);
child_zone->parent = zone;
if (zone->tree_depth >= POWERCAP_MAX_TREE_DEPTH) {
fprintf(stderr, "Maximum zone hierarchy depth[%d] reached\n",
POWERCAP_MAX_TREE_DEPTH);
ret = -1;
break;
}
powercap_read_zone(child_zone);
}
closedir(zone_dir);
return ret;
}
struct powercap_zone *powercap_init_zones(void)
{
int enabled;
struct powercap_zone *root_zone;
int ret;
char file[SYSFS_PATH_MAX] = PATH_TO_RAPL "/enabled";
ret = sysfs_get_enabled(file, &enabled);
if (ret)
return NULL;
if (!enabled)
return NULL;
root_zone = calloc(1, sizeof(struct powercap_zone));
if (!root_zone)
return NULL;
strcpy(root_zone->sys_name, "intel-rapl/intel-rapl:0");
powercap_read_zone(root_zone);
return root_zone;
}
/* Call function *f on the passed zone and all its children */
int powercap_walk_zones(struct powercap_zone *zone,
int (*f)(struct powercap_zone *zone))
{
int i, ret;
if (!zone)
return -1;
ret = f(zone);
if (ret)
return ret;
for (i = 0; i < POWERCAP_MAX_CHILD_ZONES; i++) {
if (zone->children[i] != NULL)
powercap_walk_zones(zone->children[i], f);
}
return 0;
}
/* SPDX-License-Identifier: GPL-2.0-only */
/*
* (C) 2016 SUSE Software Solutions GmbH
* Thomas Renninger <trenn@suse.de>
*/
#ifndef __CPUPOWER_RAPL_H__
#define __CPUPOWER_RAPL_H__
#define PATH_TO_POWERCAP "/sys/devices/virtual/powercap"
#define PATH_TO_RAPL "/sys/devices/virtual/powercap/intel-rapl"
#define PATH_TO_RAPL_CLASS "/sys/devices/virtual/powercap/intel-rapl"
#define POWERCAP_MAX_CHILD_ZONES 10
#define POWERCAP_MAX_TREE_DEPTH 10
#define MAX_LINE_LEN 4096
#define SYSFS_PATH_MAX 255
#include <stdint.h>
struct powercap_zone {
char name[MAX_LINE_LEN];
/*
* sys_name relative to PATH_TO_POWERCAP,
* do not forget the / in between
*/
char sys_name[SYSFS_PATH_MAX];
int tree_depth;
struct powercap_zone *parent;
struct powercap_zone *children[POWERCAP_MAX_CHILD_ZONES];
/* More possible caps or attributes to be added? */
uint32_t has_power_uw:1,
has_energy_uj:1;
};
int powercap_walk_zones(struct powercap_zone *zone,
int (*f)(struct powercap_zone *zone));
struct powercap_zone *powercap_init_zones(void);
int powercap_get_enabled(int *mode);
int powercap_set_enabled(int mode);
int powercap_get_driver(char *driver, int buflen);
int powercap_get_max_energy_range_uj(struct powercap_zone *zone, uint64_t *val);
int powercap_get_energy_uj(struct powercap_zone *zone, uint64_t *val);
int powercap_get_max_power_range_uw(struct powercap_zone *zone, uint64_t *val);
int powercap_get_power_uw(struct powercap_zone *zone, uint64_t *val);
int powercap_zone_get_enabled(struct powercap_zone *zone, int *mode);
int powercap_zone_set_enabled(struct powercap_zone *zone, int mode);
#endif /* __CPUPOWER_RAPL_H__ */
.TH CPUPOWER\-POWERCAP\-INFO "1" "05/08/2016" "" "cpupower Manual"
.SH NAME
cpupower\-powercap\-info \- Shows powercapping related kernel and hardware configurations
.SH SYNOPSIS
.ft B
.B cpupower powercap-info
.SH DESCRIPTION
\fBcpupower powercap-info \fP shows kernel powercapping subsystem information.
This needs hardware support and a loaded powercapping driver (at this time only
intel_rapl driver exits) exporting hardware values userspace via sysfs.
Some options are platform wide, some affect single cores. By default values
of core zero are displayed only. cpupower --cpu all cpuinfo will show the
settings of all cores, see cpupower(1) how to choose specific cores.
.SH "DOCUMENTATION"
kernel sources:
Documentation/power/powercap/powercap.txt
.SH "SEE ALSO"
cpupower(1)
This diff is collapsed.
......@@ -8,6 +8,8 @@ extern int cmd_freq_set(int argc, const char **argv);
extern int cmd_freq_info(int argc, const char **argv);
extern int cmd_idle_set(int argc, const char **argv);
extern int cmd_idle_info(int argc, const char **argv);
extern int cmd_cap_info(int argc, const char **argv);
extern int cmd_cap_set(int argc, const char **argv);
extern int cmd_monitor(int argc, const char **argv);
#endif
......@@ -572,9 +572,9 @@ int cmd_freq_info(int argc, char **argv)
ret = 0;
/* Default is: show output of CPU 0 only */
/* Default is: show output of base_cpu only */
if (bitmask_isallclear(cpus_chosen))
bitmask_setbit(cpus_chosen, 0);
bitmask_setbit(cpus_chosen, base_cpu);
switch (output_param) {
case -1:
......
......@@ -176,9 +176,9 @@ int cmd_idle_info(int argc, char **argv)
cpuidle_exit(EXIT_FAILURE);
}
/* Default is: show output of CPU 0 only */
/* Default is: show output of base_cpu only */
if (bitmask_isallclear(cpus_chosen))
bitmask_setbit(cpus_chosen, 0);
bitmask_setbit(cpus_chosen, base_cpu);
if (output_param == 0)
cpuidle_general_output();
......
......@@ -67,9 +67,9 @@ int cmd_info(int argc, char **argv)
if (!params.params)
params.params = 0x7;
/* Default is: show output of CPU 0 only */
/* Default is: show output of base_cpu only */
if (bitmask_isallclear(cpus_chosen))
bitmask_setbit(cpus_chosen, 0);
bitmask_setbit(cpus_chosen, base_cpu);
/* Add more per cpu options here */
if (!params.perf_bias)
......
......@@ -54,6 +54,7 @@ static struct cmd_struct commands[] = {
{ "frequency-set", cmd_freq_set, 1 },
{ "idle-info", cmd_idle_info, 0 },
{ "idle-set", cmd_idle_set, 1 },
{ "powercap-info", cmd_cap_info, 0 },
{ "set", cmd_set, 1 },
{ "info", cmd_info, 0 },
{ "monitor", cmd_monitor, 0 },
......
......@@ -459,9 +459,10 @@ int cmd_monitor(int argc, char **argv)
print_results(1, cpu);
}
for (num = 0; num < avail_monitors; num++)
monitors[num]->unregister();
for (num = 0; num < avail_monitors; num++) {
if (monitors[num]->unregister)
monitors[num]->unregister();
}
cpu_topology_release(cpu_top);
return 0;
}
......@@ -4,5 +4,6 @@ DEF(intel_nhm)
DEF(intel_snb)
DEF(intel_hsw_ext)
DEF(mperf)
DEF(rapl)
#endif
DEF(cpuidle_sysfs)
// SPDX-License-Identifier: GPL-2.0-only
/*
* (C) 2016 SUSE Software Solutions GmbH
* Thomas Renninger <trenn@suse.de>
*/
#if defined(__i386__) || defined(__x86_64__)
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <time.h>
#include <string.h>
#include <pci/pci.h>
#include "idle_monitor/cpupower-monitor.h"
#include "helpers/helpers.h"
#include "powercap.h"
#define MAX_RAPL_ZONES 10
int rapl_zone_count;
cstate_t rapl_zones[MAX_RAPL_ZONES];
struct powercap_zone *rapl_zones_pt[MAX_RAPL_ZONES] = { 0 };
unsigned long long rapl_zone_previous_count[MAX_RAPL_ZONES];
unsigned long long rapl_zone_current_count[MAX_RAPL_ZONES];
unsigned long long rapl_max_count;
static int rapl_get_count_uj(unsigned int id, unsigned long long *count,
unsigned int cpu)
{
if (rapl_zones_pt[id] == NULL)
/* error */
return -1;
*count = rapl_zone_current_count[id] - rapl_zone_previous_count[id];
return 0;
}
static int powercap_count_zones(struct powercap_zone *zone)
{
uint64_t val;
int uj;
if (rapl_zone_count >= MAX_RAPL_ZONES)
return -1;
if (!zone->has_energy_uj)
return 0;
printf("%s\n", zone->sys_name);
uj = powercap_get_energy_uj(zone, &val);
printf("%d\n", uj);
strncpy(rapl_zones[rapl_zone_count].name, zone->name, CSTATE_NAME_LEN - 1);
strcpy(rapl_zones[rapl_zone_count].desc, "");
rapl_zones[rapl_zone_count].id = rapl_zone_count;
rapl_zones[rapl_zone_count].range = RANGE_MACHINE;
rapl_zones[rapl_zone_count].get_count = rapl_get_count_uj;
rapl_zones_pt[rapl_zone_count] = zone;
rapl_zone_count++;
return 0;
}
static int rapl_start(void)
{
int i, ret;
uint64_t uj_val;
for (i = 0; i < rapl_zone_count; i++) {
ret = powercap_get_energy_uj(rapl_zones_pt[i], &uj_val);
if (ret)
return ret;
rapl_zone_previous_count[i] = uj_val;
}
return 0;
}
static int rapl_stop(void)
{
int i;
uint64_t uj_val;
for (i = 0; i < rapl_zone_count; i++) {
int ret;
ret = powercap_get_energy_uj(rapl_zones_pt[i], &uj_val);
if (ret)
return ret;
rapl_zone_current_count[i] = uj_val;
if (rapl_max_count < uj_val)
rapl_max_count = uj_val - rapl_zone_previous_count[i];
}
return 0;
}
struct cpuidle_monitor *rapl_register(void)
{
struct powercap_zone *root_zone;
char line[MAX_LINE_LEN] = "";
int ret, val;
ret = powercap_get_driver(line, MAX_LINE_LEN);
if (ret < 0) {
dprint("No powercapping driver loaded\n");
return NULL;
}
dprint("Driver: %s\n", line);
ret = powercap_get_enabled(&val);
if (ret < 0)
return NULL;
if (!val) {
dprint("Powercapping is disabled\n");
return NULL;
}
dprint("Powercap domain hierarchy:\n\n");
root_zone = powercap_init_zones();
if (root_zone == NULL) {
dprint("No powercap info found\n");
return NULL;
}
powercap_walk_zones(root_zone, powercap_count_zones);
rapl_monitor.hw_states_num = rapl_zone_count;
return &rapl_monitor;
}
struct cpuidle_monitor rapl_monitor = {
.name = "RAPL",
.hw_states = rapl_zones,
.hw_states_num = 0,
.start = rapl_start,
.stop = rapl_stop,
.do_register = rapl_register,
.flags.needs_root = 0,
.overflow_s = 60 * 60 * 24 * 100, /* To be implemented */
};
#endif
// SPDX-License-Identifier: GPL-2.0-only
/*
* (C) 2016 SUSE Software Solutions GmbH
* Thomas Renninger <trenn@suse.de>
*/
#include <unistd.h>
#include <stdio.h>
#include <errno.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>
#include <getopt.h>
#include "powercap.h"
#include "helpers/helpers.h"
int powercap_show_all;
static struct option info_opts[] = {
{ "all", no_argument, NULL, 'a'},
{ },
};
static int powercap_print_one_zone(struct powercap_zone *zone)
{
int mode, i, ret = 0;
char pr_prefix[1024] = "";
for (i = 0; i < zone->tree_depth && i < POWERCAP_MAX_TREE_DEPTH; i++)
strcat(pr_prefix, "\t");
printf("%sZone: %s", pr_prefix, zone->name);
ret = powercap_zone_get_enabled(zone, &mode);
if (ret < 0)
return ret;
printf(" (%s)\n", mode ? "enabled" : "disabled");
if (zone->has_power_uw)
printf(_("%sPower can be monitored in micro Jules\n"),
pr_prefix);
if (zone->has_energy_uj)
printf(_("%sPower can be monitored in micro Watts\n"),
pr_prefix);
printf("\n");
if (ret != 0)
return ret;
return ret;
}
static int powercap_show(void)
{
struct powercap_zone *root_zone;
char line[MAX_LINE_LEN] = "";
int ret, val;
ret = powercap_get_driver(line, MAX_LINE_LEN);
if (ret < 0) {
printf(_("No powercapping driver loaded\n"));
return ret;
}
printf("Driver: %s\n", line);
ret = powercap_get_enabled(&val);
if (ret < 0)
return ret;
if (!val) {
printf(_("Powercapping is disabled\n"));
return -1;
}
printf(_("Powercap domain hierarchy:\n\n"));
root_zone = powercap_init_zones();
if (root_zone == NULL) {
printf(_("No powercap info found\n"));
return 1;
}
powercap_walk_zones(root_zone, powercap_print_one_zone);
return 0;
}
int cmd_cap_set(int argc, char **argv)
{
return 0;
};
int cmd_cap_info(int argc, char **argv)
{
int ret = 0, cont = 1;
do {
ret = getopt_long(argc, argv, "a", info_opts, NULL);
switch (ret) {
case '?':
cont = 0;
break;
case -1:
cont = 0;
break;
case 'a':
powercap_show_all = 1;
break;
default:
fprintf(stderr, _("invalid or unknown argument\n"));
return EXIT_FAILURE;
}
} while (cont);
powercap_show();
return 0;
}
......@@ -1462,7 +1462,7 @@ class Data:
'TIMEOUT' : r'(?i).*\bTIMEOUT\b.*',
'ABORT' : r'(?i).*\bABORT\b.*',
'IRQ' : r'.*\bgenirq: .*',
'TASKFAIL': r'.*Freezing of tasks *.*',
'TASKFAIL': r'.*Freezing .*after *.*',
'ACPI' : r'.*\bACPI *(?P<b>[A-Za-z]*) *Error[: ].*',
'DISKFULL': r'.*\bNo space left on device.*',
'USBERR' : r'.*usb .*device .*, error [0-9-]*',
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment