    cpufreq: intel_pstate: Per CPU P-State limits · eae48f04
    Srinivas Pandruvada authored
    Intel P-State offers two interfaces to set performance limits:
    - Intel P-State sysfs
    	/sys/devices/system/cpu/intel_pstate/max_perf_pct
    	/sys/devices/system/cpu/intel_pstate/min_perf_pct
    - cpufreq
    	/sys/devices/system/cpu/cpu*/cpufreq/scaling_max_freq
    	/sys/devices/system/cpu/cpu*/cpufreq/scaling_min_freq
    
    In the current implementation, both of the above methods change the
    limits on every CPU in the system. Moreover, limits placed using the
    cpufreq policy interface are also presented in the Intel P-State sysfs
    via modified max_perf_pct and min_perf_pct values during sysfs reads.
    This allows checking the percentage of reduced/increased performance,
    irrespective of the method used to set the limit.
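
To illustrate the percentage view described above, a cpufreq limit can be
expressed relative to the CPU's maximum frequency. This is a minimal Python
sketch, not the driver's code: the function name perf_pct and the round-up
behaviour are assumptions made for the example.

```python
def perf_pct(limit_khz: int, cpuinfo_max_khz: int) -> int:
    """Express a cpufreq limit as a percentage of the CPU's maximum frequency.

    Assumption for this sketch: the result is rounded up to a whole percent.
    """
    if cpuinfo_max_khz <= 0:
        raise ValueError("cpuinfo_max_khz must be positive")
    # Integer ceiling division: ceil(limit_khz * 100 / cpuinfo_max_khz)
    return -(-limit_khz * 100 // cpuinfo_max_khz)


# A 2.4 GHz cap on a CPU whose maximum is 3.0 GHz reads back as 80 percent.
print(perf_pct(2_400_000, 3_000_000))  # 80
```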
    
    On some newer generations of processors it is possible to place limits
    on individual CPU cores. The cpufreq interface allows setting limits on
    each CPU, but the current processing applies the most recently placed
    limits to all CPUs, so the per-core limit feature of these CPUs can't
    be used.
    
    This change brings in the capability to set P-State limits for each
    CPU, with some limitations. In this case, what should a read of
    max_perf_pct and min_perf_pct return? It could be the most restrictive
    limits placed on any CPU, or the maximum possible performance on any
    CPU on which no limits are placed. In either case someone will have an
    issue.
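
The ambiguity can be made concrete with a small Python sketch. The per-CPU
values and the function name candidate_global_reads are hypothetical; the
point is only that the two plausible answers for a single global
max_perf_pct disagree as soon as CPUs carry different limits.

```python
def candidate_global_reads(per_cpu_max_pct):
    """Return the two plausible values for one global max_perf_pct read.

    per_cpu_max_pct maps a CPU number to that CPU's maximum performance
    percentage (100 means no limit is placed on it).
    """
    most_restrictive = min(per_cpu_max_pct.values())
    least_restrictive = max(per_cpu_max_pct.values())
    return most_restrictive, least_restrictive


# Hypothetical system: CPU 1 capped at 60%, CPU 3 at 80%, others unlimited.
limits = {0: 100, 1: 60, 2: 100, 3: 80}
print(candidate_global_reads(limits))  # (60, 100)
```

Neither 60 nor 100 describes every CPU here, which is why a single global
value stops being meaningful once per-CPU limits are in use.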
    
    So the consensus is that we can't have both sysfs controls present when
    the user wants to use per-core limits.
    - By default the per-core-control feature is not enabled, so no one
    will notice any difference.
    - It is enabled with the kernel command line option
    intel_pstate=per_cpu_perf_limits
    - When the per-core-controls are enabled, the following attributes are
    not presented, for either read or write:
    	/sys/devices/system/cpu/intel_pstate/max_perf_pct
    	/sys/devices/system/cpu/intel_pstate/min_perf_pct
    - User can change limits using
    	/sys/devices/system/cpu/cpu*/cpufreq/scaling_max_freq
    	/sys/devices/system/cpu/cpu*/cpufreq/scaling_min_freq
    	/sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
    - User can still observe turbo percent and number of P-States from
    	/sys/devices/system/cpu/intel_pstate/turbo_pct
    	/sys/devices/system/cpu/intel_pstate/num_pstates
    - User can read and write the system-wide turbo status
    	/sys/devices/system/cpu/intel_pstate/no_turbo
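
As a rough sketch of how the per-CPU controls listed above might be driven
from user space, the following Python helpers check the kernel command line
for the option and write a limit for one CPU. The helper names and the root
parameter are assumptions for illustration. Only the sysfs paths and the
option string come from this commit message, and the check assumes the
option appears as a single exact token. Writing to sysfs requires root
privileges.

```python
from pathlib import Path

SYSFS_CPU = Path("/sys/devices/system/cpu")


def per_cpu_limits_enabled(cmdline: str) -> bool:
    """Check a kernel command line string (for example the contents of
    /proc/cmdline) for the exact token intel_pstate=per_cpu_perf_limits."""
    return "intel_pstate=per_cpu_perf_limits" in cmdline.split()


def set_max_freq(cpu: int, khz: int, root: Path = SYSFS_CPU) -> Path:
    """Write a maximum frequency (in kHz) for a single CPU.

    root is overridable so the helper can be tried against a scratch
    directory instead of the real sysfs tree.
    """
    attr = root / f"cpu{cpu}" / "cpufreq" / "scaling_max_freq"
    attr.write_text(f"{khz}\n")
    return attr
```

For example, set_max_freq(2, 1_800_000) would cap only CPU 2, leaving the
other CPUs at their own limits when the per-CPU option is enabled.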
    
    While at it, BUG_ON is changed to WARN_ON, as these are not fatal
    errors for the system.
    Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
intel_pstate.c 50.1 KB