  1. 20 Apr, 2022 1 commit
    • topology: make core_mask include at least cluster_siblings · db1e5948
      Darren Hart authored
      Ampere Altra defines CPU clusters in the ACPI PPTT. They share a Snoop
      Control Unit, but have no shared CPU-side last level cache.
      
      cpu_coregroup_mask() will return a cpumask with weight 1, while
      cpu_clustergroup_mask() will return a cpumask with weight 2.
      
      As a result, build_sched_domain() will BUG() once per CPU with:
      
      BUG: arch topology borken
      the CLS domain not a subset of the MC domain
      
      The MC level cpumask is then extended to that of the CLS child, and is
      later removed entirely as redundant. This sched domain topology is an
      improvement over previous topologies, or those built without
      SCHED_CLUSTER, particularly for certain latency sensitive workloads.
      With the current scheduler model and heuristics, this is a desirable
      default topology for Ampere Altra and Altra Max systems.
      
      Rather than create a custom sched domains topology structure and
      introduce new logic in arch/arm64 to detect these systems, update the
      core_mask so coregroup is never a subset of clustergroup, extending it
      to cluster_siblings if necessary. Only do this if CONFIG_SCHED_CLUSTER
      is enabled to avoid also changing the topology (MC) when
      CONFIG_SCHED_CLUSTER is disabled.
      
      This has the added benefit over a custom topology of working for both
      symmetric and asymmetric topologies. It does not address systems where
      the CLUSTER topology is above a populated MC topology, but these are not
      considered today and can be addressed separately if and when they
      appear.
      
      The final sched domain topology for a 2 socket Ampere Altra system is
      unchanged with or without CONFIG_SCHED_CLUSTER, and the BUG is avoided:
      
      For CPU0:
      
      CONFIG_SCHED_CLUSTER=y
      CLS  [0-1]
      DIE  [0-79]
      NUMA [0-159]
      
      CONFIG_SCHED_CLUSTER is not set
      DIE  [0-79]
      NUMA [0-159]
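
      The update applies to cpu_coregroup_mask() (mentioned above); a
      simplified sketch of the approach, not the exact upstream diff:

      const struct cpumask *cpu_coregroup_mask(int cpu)
      {
      	const cpumask_t *core_mask = cpumask_of_node(cpu_to_node(cpu));

      	if (cpumask_subset(&cpu_topology[cpu].core_sibling, core_mask))
      		core_mask = &cpu_topology[cpu].core_sibling;

      	if (cpu_topology[cpu].llc_id != -1 &&
      	    cpumask_subset(&cpu_topology[cpu].llc_sibling, core_mask))
      		core_mask = &cpu_topology[cpu].llc_sibling;

      	/*
      	 * For systems with no shared CPU-side LLC but with clusters defined,
      	 * extend core_mask to cluster_siblings so MC never ends up a subset
      	 * of CLS. Only done when CONFIG_SCHED_CLUSTER is enabled.
      	 */
      	if (IS_ENABLED(CONFIG_SCHED_CLUSTER) &&
      	    cpumask_subset(core_mask, &cpu_topology[cpu].cluster_sibling))
      		core_mask = &cpu_topology[cpu].cluster_sibling;

      	return core_mask;
      }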
      
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: "Rafael J. Wysocki" <rafael@kernel.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Vincent Guittot <vincent.guittot@linaro.org>
      Cc: D. Scott Phillips <scott@os.amperecomputing.com>
      Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
      Cc: <stable@vger.kernel.org> # 5.16.x
      Suggested-by: Barry Song <song.bao.hua@hisilicon.com>
      Reviewed-by: Barry Song <song.bao.hua@hisilicon.com>
      Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
      Acked-by: Sudeep Holla <sudeep.holla@arm.com>
      Signed-off-by: Darren Hart <darren@os.amperecomputing.com>
      Link: https://lore.kernel.org/r/c8fe9fce7c86ed56b4c455b8c902982dc2303868.1649696956.git.darren@os.amperecomputing.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  2. 10 Mar, 2022 1 commit
    • arch_topology: obtain cpu capacity using information from CPPC · 9924fbb5
      Ionela Voinescu authored
      Define topology_init_cpu_capacity_cppc() to use highest performance
      values from _CPC objects to obtain and set maximum capacity information
      for each CPU. acpi_cppc_processor_probe() is a good point at which to
      trigger the initialization of CPU (u-arch) capacity values, as at this
      point the highest performance values can be obtained from each CPU's
      _CPC objects. Architectures can therefore use this functionality
      through arch_init_invariance_cppc().
      
      The performance scale used by CPPC is a unified scale for all CPUs in
      the system. Therefore, by obtaining the raw highest performance values
      from the _CPC objects, and normalizing them on the [0, 1024] capacity
      scale, used by the task scheduler, we obtain the CPU capacity of each
      CPU.
      
      While an ACPI Notify(0x85) could alert about a change in the highest
      performance value, which should in turn retrigger the CPU capacity
      computations, this notification is not currently handled by the ACPI
      processor driver. When supported, a call to arch_init_invariance_cppc()
      would perform the update.
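
      As a rough illustration of the normalization described above (a hedged
      sketch, not the exact upstream code; the highest_perf[] storage is
      illustrative):

      #include <acpi/cppc_acpi.h>	/* cppc_get_perf_caps() */

      static u64 highest_perf[NR_CPUS];

      void topology_init_cpu_capacity_cppc(void)
      {
      	struct cppc_perf_caps perf_caps;
      	u64 max_perf = 0;
      	int cpu;

      	/* Collect the raw highest performance value from each CPU's _CPC. */
      	for_each_possible_cpu(cpu) {
      		if (cppc_get_perf_caps(cpu, &perf_caps) ||
      		    !perf_caps.highest_perf)
      			return;	/* keep the default capacities instead */
      		highest_perf[cpu] = perf_caps.highest_perf;
      		max_perf = max(max_perf, highest_perf[cpu]);
      	}

      	/* Normalize onto the scheduler's [0, SCHED_CAPACITY_SCALE] scale. */
      	for_each_possible_cpu(cpu)
      		topology_set_cpu_scale(cpu, div64_u64(highest_perf[cpu] *
      					SCHED_CAPACITY_SCALE, max_perf));
      }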
      Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
      Acked-by: Sudeep Holla <sudeep.holla@arm.com>
      Tested-by: Valentin Schneider <valentin.schneider@arm.com>
      Tested-by: Yicong Yang <yangyicong@hisilicon.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  3. 23 Nov, 2021 2 commits
  4. 11 Nov, 2021 1 commit
    • arch_topology: Fix missing clear cluster_cpumask in remove_cpu_topology() · 4cc4cc28
      Wang ShaoBo authored
      When testing cpu online and offline, warning happened like this:
      
      [  146.746743] WARNING: CPU: 92 PID: 974 at kernel/sched/topology.c:2215 build_sched_domains+0x81c/0x11b0
      [  146.749988] CPU: 92 PID: 974 Comm: kworker/92:2 Not tainted 5.15.0 #9
      [  146.750402] Hardware name: Huawei TaiShan 2280 V2/BC82AMDDA, BIOS 1.79 08/21/2021
      [  146.751213] Workqueue: events cpuset_hotplug_workfn
      [  146.751629] pstate: 00400009 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
      [  146.752048] pc : build_sched_domains+0x81c/0x11b0
      [  146.752461] lr : build_sched_domains+0x414/0x11b0
      [  146.752860] sp : ffff800040a83a80
      [  146.753247] x29: ffff800040a83a80 x28: ffff20801f13a980 x27: ffff20800448ae00
      [  146.753644] x26: ffff800012a858e8 x25: ffff800012ea48c0 x24: 0000000000000000
      [  146.754039] x23: ffff800010ab7d60 x22: ffff800012f03758 x21: 000000000000005f
      [  146.754427] x20: 000000000000005c x19: ffff004080012840 x18: ffffffffffffffff
      [  146.754814] x17: 3661613030303230 x16: 30303078303a3239 x15: ffff800011f92b48
      [  146.755197] x14: ffff20be3f95cef6 x13: 2e6e69616d6f642d x12: 6465686373204c4c
      [  146.755578] x11: ffff20bf7fc83a00 x10: 0000000000000040 x9 : 0000000000000000
      [  146.755957] x8 : 0000000000000002 x7 : ffffffffe0000000 x6 : 0000000000000002
      [  146.756334] x5 : 0000000090000000 x4 : 00000000f0000000 x3 : 0000000000000001
      [  146.756705] x2 : 0000000000000080 x1 : ffff800012f03860 x0 : 0000000000000001
      [  146.757070] Call trace:
      [  146.757421]  build_sched_domains+0x81c/0x11b0
      [  146.757771]  partition_sched_domains_locked+0x57c/0x978
      [  146.758118]  rebuild_sched_domains_locked+0x44c/0x7f0
      [  146.758460]  rebuild_sched_domains+0x2c/0x48
      [  146.758791]  cpuset_hotplug_workfn+0x3fc/0x888
      [  146.759114]  process_one_work+0x1f4/0x480
      [  146.759429]  worker_thread+0x48/0x460
      [  146.759734]  kthread+0x158/0x168
      [  146.760030]  ret_from_fork+0x10/0x20
      [  146.760318] ---[ end trace 82c44aad6900e81a ]---
      
      For architectures such as RISC-V and arm64 that use the common
      clear_cpu_topology() code when shutting down CPUx, the cluster_sibling
      mask in the cpu_topology entry of each sibling adjacent to CPUx is not
      cleared when CONFIG_SCHED_CLUSTER is set. This makes the
      topology_span_sane() check fail and the topology rebuild fail when the
      CPU is brought back online.
      
      The cluster_sibling masks in cpu_topology[] of the different siblings
      while CPU92 is offline (CPUs 92, 93, 94 and 95 are in one cluster):
      
      Before revision:
      CPU                 [92]      [93]      [94]      [95]
      cluster_sibling     [92]     [92-95]   [92-95]   [92-95]
      
      After revision:
      CPU                 [92]      [93]      [94]      [95]
      cluster_sibling     [92]     [93-95]   [93-95]   [93-95]
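
      The fix amounts to also clearing the offlined CPU from its siblings'
      cluster masks in remove_cpu_topology(), alongside the existing core,
      thread and LLC handling; a hedged sketch of the resulting function:

      void remove_cpu_topology(unsigned int cpu)
      {
      	int sibling;

      	for_each_cpu(sibling, topology_core_cpumask(cpu))
      		cpumask_clear_cpu(cpu, topology_core_cpumask(sibling));
      	for_each_cpu(sibling, topology_sibling_cpumask(cpu))
      		cpumask_clear_cpu(cpu, topology_sibling_cpumask(sibling));
      	/* New: also drop this CPU from every sibling's cluster mask. */
      	for_each_cpu(sibling, topology_cluster_cpumask(cpu))
      		cpumask_clear_cpu(cpu, topology_cluster_cpumask(sibling));
      	for_each_cpu(sibling, topology_llc_cpumask(cpu))
      		cpumask_clear_cpu(cpu, topology_llc_cpumask(sibling));

      	clear_cpu_topology(cpu);
      }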
      Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
      Acked-by: Barry Song <song.bao.hua@hisilicon.com>
      Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
      Link: https://lore.kernel.org/r/20211110095856.469360-1-bobo.shaobowang@huawei.com
  5. 15 Oct, 2021 1 commit
    • topology: Represent clusters of CPUs within a die · c5e22fef
      Jonathan Cameron authored
      Both ACPI and DT provide the ability to describe additional layers of
      topology between that of individual cores and higher level constructs
      such as the level at which the last level cache is shared.
      In ACPI this can be represented in PPTT as a Processor Hierarchy
      Node Structure [1] that is the parent of the CPU cores and in turn
      has a parent Processor Hierarchy Node Structure representing
      a higher level of topology.
      
      For example, the Kunpeng 920 has 6 or 8 clusters in each NUMA node, and
      each cluster has 4 CPUs. All clusters share the L3 cache data, but each
      cluster has a local L3 tag. Each cluster also shares some internal
      system bus.
      
      +-----------------------------------+                          +---------+
      |  +------+    +------+             +--------------------------+         |
      |  | CPU0 |    | cpu1 |             |    +-----------+         |         |
      |  +------+    +------+             |    |           |         |         |
      |                                   +----+    L3     |         |         |
      |  +------+    +------+   cluster   |    |    tag    |         |         |
      |  | CPU2 |    | CPU3 |             |    |           |         |         |
      |  +------+    +------+             |    +-----------+         |         |
      |                                   |                          |         |
      +-----------------------------------+                          |         |
      +-----------------------------------+                          |         |
      |  +------+    +------+             +--------------------------+         |
      |  |      |    |      |             |    +-----------+         |         |
      |  +------+    +------+             |    |           |         |         |
      |                                   |    |    L3     |         |         |
      |  +------+    +------+             +----+    tag    |         |         |
      |  |      |    |      |             |    |           |         |         |
      |  +------+    +------+             |    +-----------+         |         |
      |                                   |                          |         |
      +-----------------------------------+                          |   L3    |
                                                                     |   data  |
      +-----------------------------------+                          |         |
      |  +------+    +------+             |    +-----------+         |         |
      |  |      |    |      |             |    |           |         |         |
      |  +------+    +------+             +----+    L3     |         |         |
      |                                   |    |    tag    |         |         |
      |  +------+    +------+             |    |           |         |         |
      |  |      |    |      |             |    +-----------+         |         |
      |  +------+    +------+             +--------------------------+         |
      +-----------------------------------+                          |         |
      +-----------------------------------+                          |         |
      |  +------+    +------+             +--------------------------+         |
      |  |      |    |      |             |    +-----------+         |         |
      |  +------+    +------+             |    |           |         |         |
      |                                   +----+    L3     |         |         |
      |  +------+    +------+             |    |    tag    |         |         |
      |  |      |    |      |             |    |           |         |         |
      |  +------+    +------+             |    +-----------+         |         |
      |                                   |                          |         |
      +-----------------------------------+                          |         |
      +-----------------------------------+                          |         |
      |  +------+    +------+             +--------------------------+         |
      |  |      |    |      |             |   +-----------+          |         |
      |  +------+    +------+             |   |           |          |         |
      |                                   |   |    L3     |          |         |
      |  +------+    +------+             +---+    tag    |          |         |
      |  |      |    |      |             |   |           |          |         |
      |  +------+    +------+             |   +-----------+          |         |
      |                                   |                          |         |
      +-----------------------------------+                          |         |
      +-----------------------------------+                          |         |
      |  +------+    +------+             +--------------------------+         |
      |  |      |    |      |             |  +-----------+           |         |
      |  +------+    +------+             |  |           |           |         |
      |                                   |  |    L3     |           |         |
      |  +------+    +------+             +--+    tag    |           |         |
      |  |      |    |      |             |  |           |           |         |
      |  +------+    +------+             |  +-----------+           |         |
      |                                   |                          +---------+
      +-----------------------------------+
      
      That means spreading tasks among clusters will bring more bandwidth,
      while packing tasks within one cluster will lead to smaller cache
      synchronization latency. So both kernel and userspace have a chance to
      leverage this topology and place tasks accordingly, achieving either
      smaller cache latency within one cluster or an even distribution of
      load among clusters for higher throughput.
      
      This patch exposes the cluster topology to both kernel and userspace.
      Libraries like hwloc will learn about clusters via cluster_cpus and
      related sysfs attributes. A PoC of hwloc support is at [2].
      
      Note this patch only handles the ACPI case.
      
      Special consideration is needed for SMT processors, where it is
      necessary to move 2 levels up the hierarchy from the leaf nodes
      (thus skipping the processor core level).
      
      Note that arm64 / ACPI does not provide any means of identifying a die
      level in the topology, but that may be unrelated to the cluster level.
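
      On the kernel side, the new level is surfaced through cluster accessors
      and a cpu_clustergroup_mask() helper in the arch_topology driver, and
      to userspace as cluster_id, cluster_cpus and cluster_cpus_list under
      /sys/devices/system/cpu/cpuX/topology/. A hedged sketch of the
      accessors (simplified, not the exact upstream hunks):

      /* include/linux/arch_topology.h (sketch) */
      #define topology_cluster_id(cpu)	(cpu_topology[cpu].cluster_id)
      #define topology_cluster_cpumask(cpu)	(&cpu_topology[cpu].cluster_sibling)

      /* drivers/base/arch_topology.c (sketch) */
      const struct cpumask *cpu_clustergroup_mask(int cpu)
      {
      	return &cpu_topology[cpu].cluster_sibling;
      }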
      
      [1] ACPI Specification 6.3 - section 5.2.29.1 processor hierarchy node
          structure (Type 0)
      [2] https://github.com/hisilicon/hwloc/tree/linux-cluster

      Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
      Signed-off-by: Tian Tao <tiantao6@hisilicon.com>
      Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20210924085104.44806-2-21cnbao@gmail.com
  6. 05 Oct, 2021 1 commit
  7. 30 Aug, 2021 1 commit
  8. 01 Jul, 2021 1 commit
    • arch_topology: Avoid use-after-free for scale_freq_data · 83150f5d
      Viresh Kumar authored
      Currently topology_scale_freq_tick() (which gets called from
      scheduler_tick()) may end up using a pointer to "struct
      scale_freq_data", which was previously cleared by
      topology_clear_scale_freq_source(), as there is no protection in place
      here. The users of topology_clear_scale_freq_source(), though, need a
      guarantee that the previously cleared scale_freq_data isn't used
      anymore, so they can free the related resources.
      
      Since topology_scale_freq_tick() is called from scheduler tick, we don't
      want to add locking in there. Use the RCU update mechanism instead
      (which is already used by the scheduler's utilization update path) to
      guarantee race free updates here.
      
      synchronize_rcu() makes sure that all RCU critical sections that
      started before it was called will finish before it returns, so the
      callers of topology_clear_scale_freq_source() don't need to worry about
      their callback getting called anymore.
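
      A hedged sketch of the resulting pattern (simplified from the driver;
      error handling and the registration side are omitted):

      static DEFINE_PER_CPU(struct scale_freq_data __rcu *, sft_data);

      void topology_scale_freq_tick(void)
      {
      	struct scale_freq_data *sfd = rcu_dereference_sched(*this_cpu_ptr(&sft_data));

      	if (sfd)
      		sfd->set_freq_scale();
      }

      void topology_clear_scale_freq_source(enum scale_freq_source source,
      				      const struct cpumask *cpus)
      {
      	struct scale_freq_data *sfd;
      	int cpu;

      	rcu_read_lock();
      	for_each_cpu(cpu, cpus) {
      		sfd = rcu_dereference(*per_cpu_ptr(&sft_data, cpu));
      		if (sfd && sfd->source == source)
      			rcu_assign_pointer(per_cpu(sft_data, cpu), NULL);
      	}
      	rcu_read_unlock();

      	/*
      	 * Wait for all in-flight scheduler-tick RCU read-side sections that
      	 * might still see the old pointer, so callers can safely free the
      	 * scale_freq_data afterwards.
      	 */
      	synchronize_rcu();
      }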
      
      Cc: Paul E. McKenney <paulmck@kernel.org>
      Fixes: 01e055c1 ("arch_topology: Allow multiple entities to provide sched_freq_tick() callback")
      Tested-by: Vincent Guittot <vincent.guittot@linaro.org>
      Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com>
      Tested-by: Qian Cai <quic_qiancai@quicinc.com>
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
  9. 12 Mar, 2021 1 commit
  10. 10 Mar, 2021 2 commits
    • arch_topology: Allow multiple entities to provide sched_freq_tick() callback · 01e055c1
      Viresh Kumar authored
      This patch makes the frequency-invariance tick support generic enough
      that other parts of the kernel can also provide their own
      implementation of the scale_freq_tick() callback, which is called by
      the scheduler periodically to update the per-cpu arch_freq_scale
      variable.
      
      The implementations now need to provide 'struct scale_freq_data' for the
      CPUs for which they have hardware counters available, and a callback
      gets registered for each possible CPU in a per-cpu variable.
      
      The arch specific (or ARM AMU) counters are updated to adapt to this and
      they take the highest priority if they are available, i.e. they will be
      used instead of CPPC based counters for example.
      
      The special code to rebuild the sched domains, in case invariance status
      change for the system, is moved out of arm64 specific code and is added
      to arch_topology.c.
      
      Note that this also defines SCALE_FREQ_SOURCE_CPUFREQ but doesn't use
      it yet; it is added to show that cpufreq also acts as a source of
      information for FIE and will be used by default if no other counters
      are supported for a platform.
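
      A hedged sketch of the interface this introduces (per the driver's
      header, include/linux/arch_topology.h; details may differ slightly):

      enum scale_freq_source {
      	SCALE_FREQ_SOURCE_CPUFREQ = 0,
      	SCALE_FREQ_SOURCE_ARCH,
      };

      struct scale_freq_data {
      	enum scale_freq_source source;
      	void (*set_freq_scale)(void);
      };

      /* Register/unregister a scale_freq_tick() provider for a set of CPUs. */
      void topology_set_scale_freq_source(struct scale_freq_data *data,
      				    const struct cpumask *cpus);
      void topology_clear_scale_freq_source(enum scale_freq_source source,
      				      const struct cpumask *cpus);
      void topology_scale_freq_tick(void);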
      Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com>
      Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
      Acked-by: Will Deacon <will@kernel.org> # for arm64
      Tested-by: Vincent Guittot <vincent.guittot@linaro.org>
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
    • arch_topology: Rename freq_scale as arch_freq_scale · eec73529
      Viresh Kumar authored
      Rename freq_scale to a less generic name, as it will get exported soon
      for modules. Since x86 already names its own implementation of this as
      arch_freq_scale, let's stick to that.
      Suggested-by: Will Deacon <will@kernel.org>
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
  11. 08 Oct, 2020 1 commit
    • cpufreq,arm,arm64: restructure definitions of arch_set_freq_scale() · a20b7053
      Ionela Voinescu authored
      Compared to other arch_* functions, arch_set_freq_scale() has an atypical
      weak definition that can be replaced by a strong architecture specific
      implementation.
      
      The more typical support for architectural functions involves defining
      an empty stub in a header file if the symbol is not already defined in
      architecture code. Some examples include:
       - #define arch_scale_freq_capacity	topology_get_freq_scale
       - #define arch_scale_freq_invariant	topology_scale_freq_invariant
       - #define arch_scale_cpu_capacity	topology_get_cpu_scale
       - #define arch_update_cpu_topology	topology_update_cpu_topology
       - #define arch_scale_thermal_pressure	topology_get_thermal_pressure
       - #define arch_set_thermal_pressure	topology_set_thermal_pressure
      
      Bring arch_set_freq_scale() in line with these functions by renaming it to
      topology_set_freq_scale() in the arch topology driver, and by defining the
      arch_set_freq_scale symbol to point to the new function for arm and arm64.
      
      While there are other users of the arch_topology driver, this patch defines
      arch_set_freq_scale for arm and arm64 only, due to their existing
      definitions of arch_scale_freq_capacity. This is the getter function
      for the frequency invariance scale factor, and without a getter
      function the setter function, arch_set_freq_scale(), has no purpose.
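
      For arm and arm64 the wiring is then a simple define, matching the
      examples above (hedged sketch):

      /* arch/arm64/include/asm/topology.h (and similarly for arm) */
      #define arch_set_freq_scale	topology_set_freq_scale

      /* include/linux/arch_topology.h */
      void topology_set_freq_scale(const struct cpumask *cpus, unsigned long cur_freq,
      			     unsigned long max_freq);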
      Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
      Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Acked-by: Sudeep Holla <sudeep.holla@arm.com> (BL_SWITCHER and topology parts)
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  12. 02 Oct, 2020 1 commit
    • drivers core: Use sysfs_emit and sysfs_emit_at for show(device *...) functions · aa838896
      Joe Perches authored
      Convert the various sprintf family calls in sysfs device show functions
      to sysfs_emit and sysfs_emit_at for PAGE_SIZE buffer safety.
      
      Done with:
      
      $ spatch -sp-file sysfs_emit_dev.cocci --in-place --max-width=80 .
      
      And cocci script:
      
      $ cat sysfs_emit_dev.cocci
      @@
      identifier d_show;
      identifier dev, attr, buf;
      @@
      
      ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
      {
      	<...
      	return
      -	sprintf(buf,
      +	sysfs_emit(buf,
      	...);
      	...>
      }
      
      @@
      identifier d_show;
      identifier dev, attr, buf;
      @@
      
      ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
      {
      	<...
      	return
      -	snprintf(buf, PAGE_SIZE,
      +	sysfs_emit(buf,
      	...);
      	...>
      }
      
      @@
      identifier d_show;
      identifier dev, attr, buf;
      @@
      
      ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
      {
      	<...
      	return
      -	scnprintf(buf, PAGE_SIZE,
      +	sysfs_emit(buf,
      	...);
      	...>
      }
      
      @@
      identifier d_show;
      identifier dev, attr, buf;
      expression chr;
      @@
      
      ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
      {
      	<...
      	return
      -	strcpy(buf, chr);
      +	sysfs_emit(buf, chr);
      	...>
      }
      
      @@
      identifier d_show;
      identifier dev, attr, buf;
      identifier len;
      @@
      
      ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
      {
      	<...
      	len =
      -	sprintf(buf,
      +	sysfs_emit(buf,
      	...);
      	...>
      	return len;
      }
      
      @@
      identifier d_show;
      identifier dev, attr, buf;
      identifier len;
      @@
      
      ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
      {
      	<...
      	len =
      -	snprintf(buf, PAGE_SIZE,
      +	sysfs_emit(buf,
      	...);
      	...>
      	return len;
      }
      
      @@
      identifier d_show;
      identifier dev, attr, buf;
      identifier len;
      @@
      
      ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
      {
      	<...
      	len =
      -	scnprintf(buf, PAGE_SIZE,
      +	sysfs_emit(buf,
      	...);
      	...>
      	return len;
      }
      
      @@
      identifier d_show;
      identifier dev, attr, buf;
      identifier len;
      @@
      
      ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
      {
      	<...
      -	len += scnprintf(buf + len, PAGE_SIZE - len,
      +	len += sysfs_emit_at(buf, len,
      	...);
      	...>
      	return len;
      }
      
      @@
      identifier d_show;
      identifier dev, attr, buf;
      expression chr;
      @@
      
      ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
      {
      	...
      -	strcpy(buf, chr);
      -	return strlen(buf);
      +	return sysfs_emit(buf, chr);
      }
      Signed-off-by: Joe Perches <joe@perches.com>
      Link: https://lore.kernel.org/r/3d033c33056d88bbe34d4ddb62afd05ee166ab9a.1600285923.git.joe@perches.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  13. 18 Sep, 2020 3 commits
  14. 22 Jul, 2020 1 commit
    • arch_topology, sched/core: Cleanup thermal pressure definition · 25980c7a
      Valentin Schneider authored
      The following commit:
      
        14533a16 ("thermal/cpu-cooling, sched/core: Move the arch_set_thermal_pressure() API to generic scheduler code")
      
      moved the definition of arch_set_thermal_pressure() to sched/core.c, but
      kept its declaration in linux/arch_topology.h. When building e.g. an x86
      kernel with CONFIG_SCHED_THERMAL_PRESSURE=y, cpufreq_cooling.c ends up
      getting the declaration of arch_set_thermal_pressure() from
      include/linux/arch_topology.h, which is somewhat awkward.
      
      On top of this, sched/core.c unconditionally defines
      o The thermal_pressure percpu variable
      o arch_set_thermal_pressure()
      
      while arch_scale_thermal_pressure() does nothing unless redefined by the
      architecture.
      
      arch_*() functions are meant to be defined by architectures, so revert the
      aforementioned commit and re-implement it in a way that keeps
      arch_set_thermal_pressure() architecture-definable, and doesn't define the
      thermal pressure percpu variable for kernels that don't need
      it (CONFIG_SCHED_THERMAL_PRESSURE=n).
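
      The end result is roughly the usual arch-overridable pattern (hedged
      sketch of the fallback stubs; the exact header layout may differ):

      /* include/linux/sched/topology.h (sketch) */
      #ifndef arch_scale_thermal_pressure
      static __always_inline unsigned long arch_scale_thermal_pressure(int cpu)
      {
      	return 0;
      }
      #endif

      #ifndef arch_set_thermal_pressure
      static inline void arch_set_thermal_pressure(const struct cpumask *cpus,
      					     unsigned long th_pressure)
      {
      }
      #endif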
      Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/20200712165917.9168-2-valentin.schneider@arm.com
  15. 18 Mar, 2020 1 commit
  16. 11 Mar, 2020 2 commits
  17. 06 Mar, 2020 3 commits
    • arm64: use activity monitors for frequency invariance · cd0ed03a
      Ionela Voinescu authored
      The Frequency Invariance Engine (FIE) provides a frequency
      scaling correction factor that helps achieve more accurate
      load-tracking.
      
      So far, for arm and arm64 platforms, this scale factor has been
      obtained based on the ratio between the current frequency and the
      maximum supported frequency recorded by the cpufreq policy. The
      setting of this scale factor is triggered from cpufreq drivers by
      calling arch_set_freq_scale. The current frequency used in computation
      is the frequency requested by a governor, but it may not be the
      frequency that was implemented by the platform.
      
      This correction factor can also be obtained using a core counter and a
      constant counter to get information on the performance (frequency based
      only) obtained in a period of time. This will more accurately reflect
      the actual current frequency of the CPU, compared with the alternative
      implementation that reflects the request of a performance level from
      the OS.
      
      Therefore, implement arch_scale_freq_tick to use activity monitors, if
      present, for the computation of the frequency scale factor.
      
      The use of AMU counters depends on:
       - CONFIG_ARM64_AMU_EXTN - depends on the AMU extension being present
       - CONFIG_CPU_FREQ - the current frequency obtained using counter
         information is divided by the maximum frequency obtained from the
         cpufreq policy.
      
      While it is possible to have a combination of CPUs in the system with
      and without support for activity monitors, the use of counters for
      frequency invariance is only enabled for a CPU if all related CPUs
      (CPUs in the same frequency domain) support and have enabled the core
      and constant activity monitor counters. In this way, there is a clear
      separation between the policies for which arch_set_freq_scale (cpufreq
      based FIE) is used, and the policies for which arch_scale_freq_tick
      (counter based FIE) is used to set the frequency scale factor. For
      this purpose, a late_initcall_sync is registered to trigger validation
      work for policies that will enable or disable the use of AMU counters
      for frequency invariance. If CONFIG_CPU_FREQ is not defined, the use
      of counters is enabled on all CPUs only if all possible CPUs correctly
      support the necessary counters.
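
      Conceptually, the per-tick scale factor is the ratio of the two counter
      deltas over the last tick, scaled by the constant-to-maximum frequency
      ratio. A hedged sketch (variable and helper names are illustrative, not
      the exact arm64 implementation):

      static DEFINE_PER_CPU(u64, prev_core_cnt);
      static DEFINE_PER_CPU(u64, prev_const_cnt);
      /* (const_freq / max_freq) pre-scaled by SCHED_CAPACITY_SCALE, set at init. */
      static DEFINE_PER_CPU(u64, arch_max_freq_scale);
      /* Per-CPU scale factor consumed by the scheduler's load tracking. */
      static DEFINE_PER_CPU(unsigned long, arch_freq_scale) = SCHED_CAPACITY_SCALE;

      static void amu_scale_freq_tick(void)
      {
      	u64 core_cnt = read_corecnt();		/* AMU core cycles counter */
      	u64 const_cnt = read_constcnt();	/* AMU constant-rate counter */
      	u64 core_delta = core_cnt - this_cpu_read(prev_core_cnt);
      	u64 const_delta = const_cnt - this_cpu_read(prev_const_cnt);
      	u64 scale;

      	this_cpu_write(prev_core_cnt, core_cnt);
      	this_cpu_write(prev_const_cnt, const_cnt);

      	if (!const_delta)
      		return;

      	/* freq_scale = (delta_core / delta_const) * (const_freq / max_freq) */
      	scale = div64_u64(core_delta * this_cpu_read(arch_max_freq_scale),
      			  const_delta);
      	scale = min_t(u64, scale, SCHED_CAPACITY_SCALE);
      	this_cpu_write(arch_freq_scale, scale);
      }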
      Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
      Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
      Acked-by: Sudeep Holla <sudeep.holla@arm.com>
      Cc: Sudeep Holla <sudeep.holla@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • thermal/cpu-cooling, sched/core: Move the arch_set_thermal_pressure() API to generic scheduler code · 14533a16
      Ingo Molnar authored
      drivers/base/arch_topology.c is only built if CONFIG_GENERIC_ARCH_TOPOLOGY=y,
      resulting in such build failures:
      
        cpufreq_cooling.c:(.text+0x1e7): undefined reference to `arch_set_thermal_pressure'
      
      Move it to sched/core.c instead, and keep it enabled on x86 despite
      us not having an arch_scale_thermal_pressure() facility there, to
      build-test this thing.
      
      Cc: Thara Gopinath <thara.gopinath@linaro.org>
      Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • drivers/base/arch_topology: Add infrastructure to store and update instantaneous thermal pressure · ad58cc5c
      Thara Gopinath authored
      Add architecture specific APIs to update and track thermal pressure on a
      per CPU basis. A per CPU variable thermal_pressure is introduced to keep
      track of instantaneous per CPU thermal pressure. Thermal pressure is the
      delta between maximum capacity and capped capacity due to a thermal event.
      
      topology_get_thermal_pressure can be hooked into the scheduler specified
      arch_scale_thermal_pressure to retrieve instantaneous thermal pressure of
      a CPU.
      
      arch_set_thermal_pressure can be used to update the thermal pressure.
      
      Considering topology_get_thermal_pressure reads thermal_pressure and
      arch_set_thermal_pressure writes into thermal_pressure, one can argue for
      some sort of locking mechanism to avoid a stale value. But since
      topology_get_thermal_pressure can be called from a system-critical path
      like the scheduler tick function, a locking mechanism is not ideal.
      This means the thermal_pressure value used to calculate the average
      thermal pressure for a CPU can be stale for up to one tick period.
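
      A hedged sketch of the per-CPU store and its accessors (simplified):

      /* drivers/base/arch_topology.c (sketch) */
      DEFINE_PER_CPU(unsigned long, thermal_pressure);

      void topology_set_thermal_pressure(const struct cpumask *cpus,
      				   unsigned long th_pressure)
      {
      	int cpu;

      	/* Lockless by design: readers may see a value up to one tick stale. */
      	for_each_cpu(cpu, cpus)
      		WRITE_ONCE(per_cpu(thermal_pressure, cpu), th_pressure);
      }

      /* include/linux/arch_topology.h (sketch) */
      static inline unsigned long topology_get_thermal_pressure(int cpu)
      {
      	return READ_ONCE(per_cpu(thermal_pressure, cpu));
      }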
      Signed-off-by: Thara Gopinath <thara.gopinath@linaro.org>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Link: https://lkml.kernel.org/r/20200222005213.3873-4-thara.gopinath@linaro.org
  18. 17 Jan, 2020 1 commit
  19. 26 Aug, 2019 1 commit
  20. 22 Jul, 2019 2 commits
  21. 03 Jul, 2019 1 commit
  22. 24 Jun, 2019 1 commit
  23. 04 Apr, 2019 1 commit
    • arch_topology: Make cpu_capacity sysfs node as read-only · 5d777b18
      Lingutla Chandrasekhar authored
      If user updates any cpu's cpu_capacity, then the new value is going to
      be applied to all its online sibling cpus. But this need not to be correct
      always, as sibling cpus (in ARM, same micro architecture cpus) would have
      different cpu_capacity with different performance characteristics.
      So, updating the user supplied cpu_capacity to all cpu siblings
      is not correct.
      
      Another problem is that the current code assumes all cpus in a cluster,
      or with the same package_id (core_siblings), have the same
      cpu_capacity. But since commit 5bdd2b3f ("arm64: topology: add support
      to remove cpu topology sibling masks"), when a cpu is hotplugged out
      its information is cleared from its sibling cpus. So a user-supplied
      cpu_capacity would be applied only to the sibling cpus online at the
      time; if another cpu is hotplugged in afterwards, it would have a
      different cpu_capacity than its siblings, which breaks the above
      assumption.
      
      So, instead of mucking around with the core sibling mask for a
      user-supplied value, use the device tree to set cpu capacity, and make
      the cpu_capacity sysfs node read-only so that it reports the asymmetry
      between cpus in the system. While at it, remove the cpu_scale_mutex
      usage, which was used for sysfs write protection.
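
      The sysfs side of this boils down to switching the attribute from
      read-write to read-only (a hedged sketch written in terms of today's
      helpers; the code at the time differed in detail):

      static ssize_t cpu_capacity_show(struct device *dev,
      				 struct device_attribute *attr, char *buf)
      {
      	struct cpu *cpu = container_of(dev, struct cpu, dev);

      	return sysfs_emit(buf, "%lu\n", topology_get_cpu_scale(cpu->dev.id));
      }

      /* Previously DEVICE_ATTR_RW(cpu_capacity); the store callback goes away. */
      static DEVICE_ATTR_RO(cpu_capacity);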
      Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
      Tested-by: Quentin Perret <quentin.perret@arm.com>
      Reviewed-by: Quentin Perret <quentin.perret@arm.com>
      Acked-by: Sudeep Holla <sudeep.holla@arm.com>
      Signed-off-by: Lingutla Chandrasekhar <clingutla@codeaurora.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  24. 10 Sep, 2018 1 commit
  25. 15 Mar, 2018 1 commit
    • Revert "base: arch_topology: fix section mismatch build warnings" · 9de9a449
      Gaku Inami authored
      This reverts commit 452562ab ("base: arch_topology: fix section
      mismatch build warnings"). It causes the notifier call to hang in some
      use cases.
      
      In some cases using maxcpus, some of the cpus are booted first and the
      remaining cpus are booted later. As an example, users who want a fast
      boot-up often use the following procedure.
      
        1) Define all CPUs on device tree (CA57x4 + CA53x4)
        2) Add "maxcpus=4" in bootargs
        3) Kernel boot up with CA57x4
        4) After kernel boot up, CA53x4 is booted from user
      
      When kernel init finished, the CPUFREQ_POLICY_NOTIFIER was still not
      unregistered. This means that "__init init_cpu_capacity_callback()"
      can be called after the kernel init sequence. To avoid this problem,
      the __init{,data} annotations need to be removed by reverting this
      commit.
      
      Also, the reverted commit was needed to fix the kernel compile issue
      below. However, this issue was also fixed in v4.15 by another patch:
      commit 82d8ba71 ("arch_topology: Fix section miss match warning due to
      free_raw_capacity()").
      Whereas commit 452562ab added all the missing __init annotations,
      commit 82d8ba71 removed the one from free_raw_capacity().
      
      WARNING: vmlinux.o(.text+0x548f24): Section mismatch in reference
      from the function init_cpu_capacity_callback() to the variable
      .init.text:$x
      The function init_cpu_capacity_callback() references
      the variable __init $x.
      This is often because init_cpu_capacity_callback lacks a __init
      annotation or the annotation of $x is wrong.
      
      Fixes: 82d8ba71 ("arch_topology: Fix section miss match warning due to free_raw_capacity()")
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: Gaku Inami <gaku.inami.xh@renesas.com>
      Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
      Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
      Acked-by: Sudeep Holla <sudeep.holla@arm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  26. 07 Dec, 2017 1 commit
    • drivers: core: arch_topology.c: move SPDX tag to top of the file · 6ee97d35
      Greg Kroah-Hartman authored
      arch_topology.c had an SPDX tag in it, so move it to the top of the file
      like the rest of the kernel files have it.
      
      Also remove the redundant license text as it is not needed if the SPDX
      tag is in the file, as the tag identifies the license in a specific and
      legally-defined manner.
      
      This is done on a quest to remove the 700+ different ways that files in
      the kernel describe the GPL license text.  And there's unneeded stuff
      like the address (sometimes incorrect) for the FSF which is never
      needed.
      
      No copyright headers or other non-license-description text was removed.
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  27. 20 Oct, 2017 1 commit
  28. 03 Oct, 2017 3 commits
  29. 18 Sep, 2017 1 commit
    • base: arch_topology: fix section mismatch build warnings · 452562ab
      Sudeep Holla authored
      Commit 2ef7a295 ("arm, arm64: factorize common cpu capacity default code")
      introduced init_cpu_capacity_callback and init_cpu_capacity_notifier,
      which are referenced from an initcall but are missing __init{,data}
      annotations, resulting in the section mismatch build warnings below.
      
      "WARNING: vmlinux.o(.text+0xbab790): Section mismatch in reference from
      the function init_cpu_capacity_callback() to the variable .init.text:$x
      The function init_cpu_capacity_callback() references the variable
      __init $x. This is often because init_cpu_capacity_callback lacks a
      __init annotation or the annotation of $x is wrong."
      
      This patch fixes the above build warnings by adding the required annotations.
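
      A hedged sketch of the kind of annotation change described (function
      bodies elided, field layout approximate):

      static int __init init_cpu_capacity_callback(struct notifier_block *nb,
      					     unsigned long val, void *data)
      {
      	/* ... parse CPU capacities once cpufreq policies are known ... */
      	return 0;
      }

      static struct notifier_block init_cpu_capacity_notifier __initdata = {
      	.notifier_call = init_cpu_capacity_callback,
      };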
      
      Fixes: 2ef7a295 ("arm, arm64: factorize common cpu capacity default code")
      Cc: Juri Lelli <juri.lelli@arm.com>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  30. 28 Aug, 2017 1 commit