Commit d06fa5a1 authored by Will Deacon

Merge tag 'common/for-v5.4-rc1/cpu-topology' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux into for-next/cpu-topology

Pull in generic CPU topology changes from Paul Walmsley (RISC-V).

* tag 'common/for-v5.4-rc1/cpu-topology' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  MAINTAINERS: Add an entry for generic architecture topology
  base: arch_topology: update Kconfig help description
  RISC-V: Parse cpu topology during boot.
  arm: Use common cpu_topology structure and functions.
  cpu-topology: Move cpu topology code to common code.
  dt-binding: cpu-topology: Move cpu-map to a common binding.
  Documentation: DT: arm: add support for sockets defining package boundaries
parents 98dc1990 f51edcec
=========================================== ===========================================
ARM topology binding description CPU topology binding description
=========================================== ===========================================
=========================================== ===========================================
1 - Introduction 1 - Introduction
=========================================== ===========================================
In an ARM system, the hierarchy of CPUs is defined through three entities that In a SMP system, the hierarchy of CPUs is defined through three entities that
are used to describe the layout of physical CPUs in the system: are used to describe the layout of physical CPUs in the system:
- socket
- cluster - cluster
- core - core
- thread - thread
The cpu nodes (bindings defined in [1]) represent the devices that
correspond to physical CPUs and are to be mapped to the hierarchy levels.
The bottom hierarchy level sits at core or thread level depending on whether The bottom hierarchy level sits at core or thread level depending on whether
symmetric multi-threading (SMT) is supported or not. symmetric multi-threading (SMT) is supported or not.
...@@ -24,33 +22,31 @@ threads existing in the system and map to the hierarchy level "thread" above. ...@@ -24,33 +22,31 @@ threads existing in the system and map to the hierarchy level "thread" above.
In systems where SMT is not supported "cpu" nodes represent all cores present In systems where SMT is not supported "cpu" nodes represent all cores present
in the system and map to the hierarchy level "core" above. in the system and map to the hierarchy level "core" above.
ARM topology bindings allow one to associate cpu nodes with hierarchical groups CPU topology bindings allow one to associate cpu nodes with hierarchical groups
corresponding to the system hierarchy; syntactically they are defined as device corresponding to the system hierarchy; syntactically they are defined as device
tree nodes. tree nodes.
The remainder of this document provides the topology bindings for ARM, based Currently, only ARM/RISC-V intend to use this cpu topology binding but it may be
on the Devicetree Specification, available from: used for any other architecture as well.
https://www.devicetree.org/specifications/ The cpu nodes, as per bindings defined in [4], represent the devices that
correspond to physical CPUs and are to be mapped to the hierarchy levels.
If not stated otherwise, whenever a reference to a cpu node phandle is made its
value must point to a cpu node compliant with the cpu node bindings as
documented in [1].
A topology description containing phandles to cpu nodes that are not compliant A topology description containing phandles to cpu nodes that are not compliant
with bindings standardized in [1] is therefore considered invalid. with bindings standardized in [4] is therefore considered invalid.
=========================================== ===========================================
2 - cpu-map node 2 - cpu-map node
=========================================== ===========================================
The ARM CPU topology is defined within the cpu-map node, which is a direct The ARM/RISC-V CPU topology is defined within the cpu-map node, which is a direct
child of the cpus node and provides a container where the actual topology child of the cpus node and provides a container where the actual topology
nodes are listed. nodes are listed.
- cpu-map node - cpu-map node
Usage: Optional - On ARM SMP systems provide CPUs topology to the OS. Usage: Optional - On SMP systems provide CPUs topology to the OS.
ARM uniprocessor systems do not require a topology Uniprocessor systems do not require a topology
description and therefore should not define a description and therefore should not define a
cpu-map node. cpu-map node.
...@@ -63,21 +59,23 @@ nodes are listed. ...@@ -63,21 +59,23 @@ nodes are listed.
The cpu-map node's child nodes can be: The cpu-map node's child nodes can be:
- one or more cluster nodes - one or more cluster nodes or
- one or more socket nodes in a multi-socket system
Any other configuration is considered invalid. Any other configuration is considered invalid.
The cpu-map node can only contain three types of child nodes: The cpu-map node can only contain 4 types of child nodes:
- socket node
- cluster node - cluster node
- core node - core node
- thread node - thread node
whose bindings are described in paragraph 3. whose bindings are described in paragraph 3.
The nodes describing the CPU topology (cluster/core/thread) can only The nodes describing the CPU topology (socket/cluster/core/thread) can
be defined within the cpu-map node and every core/thread in the system only be defined within the cpu-map node and every core/thread in the
must be defined within the topology. Any other configuration is system must be defined within the topology. Any other configuration is
invalid and therefore must be ignored. invalid and therefore must be ignored.
=========================================== ===========================================
...@@ -85,26 +83,44 @@ invalid and therefore must be ignored. ...@@ -85,26 +83,44 @@ invalid and therefore must be ignored.
=========================================== ===========================================
cpu-map child nodes must follow a naming convention where the node name cpu-map child nodes must follow a naming convention where the node name
must be "clusterN", "coreN", "threadN" depending on the node type (ie must be "socketN", "clusterN", "coreN", "threadN" depending on the node type
cluster/core/thread) (where N = {0, 1, ...} is the node number; nodes which (ie socket/cluster/core/thread) (where N = {0, 1, ...} is the node number; nodes
are siblings within a single common parent node must be given a unique and which are siblings within a single common parent node must be given a unique and
sequential N value, starting from 0). sequential N value, starting from 0).
cpu-map child nodes which do not share a common parent node can have the same cpu-map child nodes which do not share a common parent node can have the same
name (ie same number N as other cpu-map child nodes at different device tree name (ie same number N as other cpu-map child nodes at different device tree
levels) since name uniqueness will be guaranteed by the device tree hierarchy. levels) since name uniqueness will be guaranteed by the device tree hierarchy.
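The naming rule above can be illustrated with a small user-space sketch. It assumes a hypothetical helper, `topo_name_matches()`, which is not part of the kernel; it only mirrors how the parser in this series builds names such as "cluster0" or "thread1" with `snprintf()` before looking the child node up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not a kernel API): check that a cpu-map child
 * name is exactly "<type>N", e.g. "socket0", "cluster1", "core3".
 * The expected name is built the same way the parser probes for
 * children: type prefix followed by the decimal index, no padding.
 */
static bool topo_name_matches(const char *name, const char *type, int n)
{
	char expected[16];

	assert(n >= 0); /* node numbers start from 0 */
	snprintf(expected, sizeof(expected), "%s%d", type, n);
	return strcmp(name, expected) == 0;
}
```

With this sketch, `topo_name_matches("core1", "core", 1)` holds, while a zero-padded name such as "core01" is rejected for index 1.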
=========================================== ===========================================
3 - cluster/core/thread node bindings 3 - socket/cluster/core/thread node bindings
=========================================== ===========================================
Bindings for cluster/cpu/thread nodes are defined as follows: Bindings for socket/cluster/cpu/thread nodes are defined as follows:
- socket node
Description: must be declared within a cpu-map node, one node
per physical socket in the system. A system can
contain a single or multiple physical sockets.
The association of sockets and NUMA nodes is beyond
the scope of this binding; please refer to [2] for
NUMA bindings.
This node is optional for a single socket system.
The socket node name must be "socketN" as described in 2.1 above.
A socket node can not be a leaf node.
A socket node's child nodes must be one or more cluster nodes.
Any other configuration is considered invalid.
- cluster node - cluster node
Description: must be declared within a cpu-map node, one node Description: must be declared within a cpu-map node, one node
per cluster. A system can contain several layers of per cluster. A system can contain several layers of
clustering and cluster nodes can be contained in parent clustering within a single physical socket and cluster
cluster nodes. nodes can be contained in parent cluster nodes.
The cluster node name must be "clusterN" as described in 2.1 above. The cluster node name must be "clusterN" as described in 2.1 above.
A cluster node can not be a leaf node. A cluster node can not be a leaf node.
...@@ -164,90 +180,93 @@ Bindings for cluster/cpu/thread nodes are defined as follows: ...@@ -164,90 +180,93 @@ Bindings for cluster/cpu/thread nodes are defined as follows:
4 - Example dts 4 - Example dts
=========================================== ===========================================
Example 1 (ARM 64-bit, 16-cpu system, two clusters of clusters): Example 1 (ARM 64-bit, 16-cpu system, two clusters of clusters in a single
physical socket):
cpus { cpus {
#size-cells = <0>; #size-cells = <0>;
#address-cells = <2>; #address-cells = <2>;
cpu-map { cpu-map {
cluster0 { socket0 {
cluster0 { cluster0 {
core0 { cluster0 {
thread0 { core0 {
cpu = <&CPU0>; thread0 {
cpu = <&CPU0>;
};
thread1 {
cpu = <&CPU1>;
};
}; };
thread1 {
cpu = <&CPU1>;
};
};
core1 { core1 {
thread0 { thread0 {
cpu = <&CPU2>; cpu = <&CPU2>;
}; };
thread1 { thread1 {
cpu = <&CPU3>; cpu = <&CPU3>;
};
}; };
}; };
};
cluster1 { cluster1 {
core0 { core0 {
thread0 { thread0 {
cpu = <&CPU4>; cpu = <&CPU4>;
}; };
thread1 { thread1 {
cpu = <&CPU5>; cpu = <&CPU5>;
};
}; };
};
core1 { core1 {
thread0 { thread0 {
cpu = <&CPU6>; cpu = <&CPU6>;
}; };
thread1 { thread1 {
cpu = <&CPU7>; cpu = <&CPU7>;
}; };
};
};
};
cluster1 {
cluster0 {
core0 {
thread0 {
cpu = <&CPU8>;
};
thread1 {
cpu = <&CPU9>;
};
};
core1 {
thread0 {
cpu = <&CPU10>;
};
thread1 {
cpu = <&CPU11>;
}; };
}; };
}; };
cluster1 { cluster1 {
core0 { cluster0 {
thread0 { core0 {
cpu = <&CPU12>; thread0 {
cpu = <&CPU8>;
};
thread1 {
cpu = <&CPU9>;
};
}; };
thread1 { core1 {
cpu = <&CPU13>; thread0 {
cpu = <&CPU10>;
};
thread1 {
cpu = <&CPU11>;
};
}; };
}; };
core1 {
thread0 { cluster1 {
cpu = <&CPU14>; core0 {
thread0 {
cpu = <&CPU12>;
};
thread1 {
cpu = <&CPU13>;
};
}; };
thread1 { core1 {
cpu = <&CPU15>; thread0 {
cpu = <&CPU14>;
};
thread1 {
cpu = <&CPU15>;
};
}; };
}; };
}; };
...@@ -470,6 +489,65 @@ cpus { ...@@ -470,6 +489,65 @@ cpus {
}; };
}; };
Example 3: HiFive Unleashed (RISC-V 64 bit, 4 core system)
{
#address-cells = <2>;
#size-cells = <2>;
compatible = "sifive,fu540g", "sifive,fu500";
model = "sifive,hifive-unleashed-a00";
...
cpus {
#address-cells = <1>;
#size-cells = <0>;
cpu-map {
socket0 {
cluster0 {
core0 {
cpu = <&CPU1>;
};
core1 {
cpu = <&CPU2>;
};
core2 {
cpu = <&CPU3>;
};
core3 {
cpu = <&CPU4>;
};
};
};
};
CPU1: cpu@1 {
device_type = "cpu";
compatible = "sifive,rocket0", "riscv";
reg = <0x1>;
};
CPU2: cpu@2 {
device_type = "cpu";
compatible = "sifive,rocket0", "riscv";
reg = <0x2>;
};
CPU3: cpu@3 {
device_type = "cpu";
compatible = "sifive,rocket0", "riscv";
reg = <0x3>;
};
CPU4: cpu@4 {
device_type = "cpu";
compatible = "sifive,rocket0", "riscv";
reg = <0x4>;
};
};
};
=============================================================================== ===============================================================================
[1] ARM Linux kernel documentation [1] ARM Linux kernel documentation
Documentation/devicetree/bindings/arm/cpus.yaml Documentation/devicetree/bindings/arm/cpus.yaml
[2] Devicetree NUMA binding description
Documentation/devicetree/bindings/numa.txt
[3] RISC-V Linux kernel documentation
Documentation/devicetree/bindings/riscv/cpus.txt
[4] https://www.devicetree.org/specifications/
...@@ -6724,6 +6724,13 @@ W: https://linuxtv.org ...@@ -6724,6 +6724,13 @@ W: https://linuxtv.org
S: Maintained S: Maintained
F: drivers/media/radio/radio-gemtek* F: drivers/media/radio/radio-gemtek*
GENERIC ARCHITECTURE TOPOLOGY
M: Sudeep Holla <sudeep.holla@arm.com>
L: linux-kernel@vger.kernel.org
S: Maintained
F: drivers/base/arch_topology.c
F: include/linux/arch_topology.h
GENERIC GPIO I2C DRIVER GENERIC GPIO I2C DRIVER
M: Wolfram Sang <wsa+renesas@sang-engineering.com> M: Wolfram Sang <wsa+renesas@sang-engineering.com>
S: Supported S: Supported
......
...@@ -5,26 +5,6 @@ ...@@ -5,26 +5,6 @@
#ifdef CONFIG_ARM_CPU_TOPOLOGY #ifdef CONFIG_ARM_CPU_TOPOLOGY
#include <linux/cpumask.h> #include <linux/cpumask.h>
struct cputopo_arm {
int thread_id;
int core_id;
int socket_id;
cpumask_t thread_sibling;
cpumask_t core_sibling;
};
extern struct cputopo_arm cpu_topology[NR_CPUS];
#define topology_physical_package_id(cpu) (cpu_topology[cpu].socket_id)
#define topology_core_id(cpu) (cpu_topology[cpu].core_id)
#define topology_core_cpumask(cpu) (&cpu_topology[cpu].core_sibling)
#define topology_sibling_cpumask(cpu) (&cpu_topology[cpu].thread_sibling)
void init_cpu_topology(void);
void store_cpu_topology(unsigned int cpuid);
const struct cpumask *cpu_coregroup_mask(int cpu);
#include <linux/arch_topology.h> #include <linux/arch_topology.h>
/* Replace task scheduler's default frequency-invariant accounting */ /* Replace task scheduler's default frequency-invariant accounting */
......
...@@ -177,17 +177,6 @@ static inline void parse_dt_topology(void) {} ...@@ -177,17 +177,6 @@ static inline void parse_dt_topology(void) {}
static inline void update_cpu_capacity(unsigned int cpuid) {} static inline void update_cpu_capacity(unsigned int cpuid) {}
#endif #endif
/*
* cpu topology table
*/
struct cputopo_arm cpu_topology[NR_CPUS];
EXPORT_SYMBOL_GPL(cpu_topology);
const struct cpumask *cpu_coregroup_mask(int cpu)
{
return &cpu_topology[cpu].core_sibling;
}
/* /*
* The current assumption is that we can power gate each core independently. * The current assumption is that we can power gate each core independently.
* This will be superseded by DT binding once available. * This will be superseded by DT binding once available.
...@@ -197,32 +186,6 @@ const struct cpumask *cpu_corepower_mask(int cpu) ...@@ -197,32 +186,6 @@ const struct cpumask *cpu_corepower_mask(int cpu)
return &cpu_topology[cpu].thread_sibling; return &cpu_topology[cpu].thread_sibling;
} }
static void update_siblings_masks(unsigned int cpuid)
{
struct cputopo_arm *cpu_topo, *cpuid_topo = &cpu_topology[cpuid];
int cpu;
/* update core and thread sibling masks */
for_each_possible_cpu(cpu) {
cpu_topo = &cpu_topology[cpu];
if (cpuid_topo->socket_id != cpu_topo->socket_id)
continue;
cpumask_set_cpu(cpuid, &cpu_topo->core_sibling);
if (cpu != cpuid)
cpumask_set_cpu(cpu, &cpuid_topo->core_sibling);
if (cpuid_topo->core_id != cpu_topo->core_id)
continue;
cpumask_set_cpu(cpuid, &cpu_topo->thread_sibling);
if (cpu != cpuid)
cpumask_set_cpu(cpu, &cpuid_topo->thread_sibling);
}
smp_wmb();
}
/* /*
* store_cpu_topology is called at boot when only one cpu is running * store_cpu_topology is called at boot when only one cpu is running
* and with the mutex cpu_hotplug.lock locked, when several cpus have booted, * and with the mutex cpu_hotplug.lock locked, when several cpus have booted,
...@@ -230,7 +193,7 @@ static void update_siblings_masks(unsigned int cpuid) ...@@ -230,7 +193,7 @@ static void update_siblings_masks(unsigned int cpuid)
*/ */
void store_cpu_topology(unsigned int cpuid) void store_cpu_topology(unsigned int cpuid)
{ {
struct cputopo_arm *cpuid_topo = &cpu_topology[cpuid]; struct cpu_topology *cpuid_topo = &cpu_topology[cpuid];
unsigned int mpidr; unsigned int mpidr;
/* If the cpu topology has been already set, just return */ /* If the cpu topology has been already set, just return */
...@@ -250,12 +213,12 @@ void store_cpu_topology(unsigned int cpuid) ...@@ -250,12 +213,12 @@ void store_cpu_topology(unsigned int cpuid)
/* core performance interdependency */ /* core performance interdependency */
cpuid_topo->thread_id = MPIDR_AFFINITY_LEVEL(mpidr, 0); cpuid_topo->thread_id = MPIDR_AFFINITY_LEVEL(mpidr, 0);
cpuid_topo->core_id = MPIDR_AFFINITY_LEVEL(mpidr, 1); cpuid_topo->core_id = MPIDR_AFFINITY_LEVEL(mpidr, 1);
cpuid_topo->socket_id = MPIDR_AFFINITY_LEVEL(mpidr, 2); cpuid_topo->package_id = MPIDR_AFFINITY_LEVEL(mpidr, 2);
} else { } else {
/* largely independent cores */ /* largely independent cores */
cpuid_topo->thread_id = -1; cpuid_topo->thread_id = -1;
cpuid_topo->core_id = MPIDR_AFFINITY_LEVEL(mpidr, 0); cpuid_topo->core_id = MPIDR_AFFINITY_LEVEL(mpidr, 0);
cpuid_topo->socket_id = MPIDR_AFFINITY_LEVEL(mpidr, 1); cpuid_topo->package_id = MPIDR_AFFINITY_LEVEL(mpidr, 1);
} }
} else { } else {
/* /*
...@@ -265,7 +228,7 @@ void store_cpu_topology(unsigned int cpuid) ...@@ -265,7 +228,7 @@ void store_cpu_topology(unsigned int cpuid)
*/ */
cpuid_topo->thread_id = -1; cpuid_topo->thread_id = -1;
cpuid_topo->core_id = 0; cpuid_topo->core_id = 0;
cpuid_topo->socket_id = -1; cpuid_topo->package_id = -1;
} }
update_siblings_masks(cpuid); update_siblings_masks(cpuid);
...@@ -275,7 +238,7 @@ void store_cpu_topology(unsigned int cpuid) ...@@ -275,7 +238,7 @@ void store_cpu_topology(unsigned int cpuid)
pr_info("CPU%u: thread %d, cpu %d, socket %d, mpidr %x\n", pr_info("CPU%u: thread %d, cpu %d, socket %d, mpidr %x\n",
cpuid, cpu_topology[cpuid].thread_id, cpuid, cpu_topology[cpuid].thread_id,
cpu_topology[cpuid].core_id, cpu_topology[cpuid].core_id,
cpu_topology[cpuid].socket_id, mpidr); cpu_topology[cpuid].package_id, mpidr);
} }
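As a rough illustration of the affinity decoding in the hunks above, the following user-space sketch extracts the three IDs from an MPIDR value. The names `affinity_level()`, `decode_mpidr()`, `struct topo_ids` and the constants are assumptions made for this example; they model 8-bit affinity fields and a multithreading flag at bit 24, and cover only the two multiprocessor branches shown, not the uniprocessor fallback.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout for this sketch: 8-bit affinity fields, MT flag at bit 24. */
#define AFF_LEVEL_BITS 8u
#define AFF_LEVEL_MASK ((1u << AFF_LEVEL_BITS) - 1u)
#define MT_BIT         (1u << 24)

struct topo_ids {
	int thread_id;
	int core_id;
	int package_id;
};

/* Extract affinity field 'level' (0, 1 or 2) from an MPIDR value. */
static unsigned int affinity_level(uint32_t mpidr, unsigned int level)
{
	assert(level < 3);
	return (mpidr >> (AFF_LEVEL_BITS * level)) & AFF_LEVEL_MASK;
}

/* Map affinity fields to thread/core/package IDs as the hunk above does. */
static struct topo_ids decode_mpidr(uint32_t mpidr)
{
	struct topo_ids ids;

	if (mpidr & MT_BIT) {
		/* multithreaded: Aff0 = thread, Aff1 = core, Aff2 = package */
		ids.thread_id = (int)affinity_level(mpidr, 0);
		ids.core_id = (int)affinity_level(mpidr, 1);
		ids.package_id = (int)affinity_level(mpidr, 2);
	} else {
		/* no SMT: Aff0 = core, Aff1 = package, no thread ID */
		ids.thread_id = -1;
		ids.core_id = (int)affinity_level(mpidr, 0);
		ids.package_id = (int)affinity_level(mpidr, 1);
	}
	return ids;
}
```

For instance, a value of 0x00000103 has the MT flag clear, so the sketch reports core 3 in package 1 with `thread_id` set to -1.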
static inline int cpu_corepower_flags(void) static inline int cpu_corepower_flags(void)
...@@ -298,18 +261,7 @@ static struct sched_domain_topology_level arm_topology[] = { ...@@ -298,18 +261,7 @@ static struct sched_domain_topology_level arm_topology[] = {
*/ */
void __init init_cpu_topology(void) void __init init_cpu_topology(void)
{ {
unsigned int cpu; reset_cpu_topology();
/* init core mask and capacity */
for_each_possible_cpu(cpu) {
struct cputopo_arm *cpu_topo = &(cpu_topology[cpu]);
cpu_topo->thread_id = -1;
cpu_topo->core_id = -1;
cpu_topo->socket_id = -1;
cpumask_clear(&cpu_topo->core_sibling);
cpumask_clear(&cpu_topo->thread_sibling);
}
smp_wmb(); smp_wmb();
parse_dt_topology(); parse_dt_topology();
......
...@@ -4,29 +4,6 @@ ...@@ -4,29 +4,6 @@
#include <linux/cpumask.h> #include <linux/cpumask.h>
struct cpu_topology {
int thread_id;
int core_id;
int package_id;
int llc_id;
cpumask_t thread_sibling;
cpumask_t core_sibling;
cpumask_t llc_sibling;
};
extern struct cpu_topology cpu_topology[NR_CPUS];
#define topology_physical_package_id(cpu) (cpu_topology[cpu].package_id)
#define topology_core_id(cpu) (cpu_topology[cpu].core_id)
#define topology_core_cpumask(cpu) (&cpu_topology[cpu].core_sibling)
#define topology_sibling_cpumask(cpu) (&cpu_topology[cpu].thread_sibling)
#define topology_llc_cpumask(cpu) (&cpu_topology[cpu].llc_sibling)
void init_cpu_topology(void);
void store_cpu_topology(unsigned int cpuid);
void remove_cpu_topology(unsigned int cpuid);
const struct cpumask *cpu_coregroup_mask(int cpu);
#ifdef CONFIG_NUMA #ifdef CONFIG_NUMA
struct pci_bus; struct pci_bus;
......
...@@ -14,250 +14,13 @@ ...@@ -14,250 +14,13 @@
#include <linux/acpi.h> #include <linux/acpi.h>
#include <linux/arch_topology.h> #include <linux/arch_topology.h>
#include <linux/cacheinfo.h> #include <linux/cacheinfo.h>
#include <linux/cpu.h>
#include <linux/cpumask.h>
#include <linux/init.h> #include <linux/init.h>
#include <linux/percpu.h> #include <linux/percpu.h>
#include <linux/node.h>
#include <linux/nodemask.h>
#include <linux/of.h>
#include <linux/sched.h>
#include <linux/sched/topology.h>
#include <linux/slab.h>
#include <linux/smp.h>
#include <linux/string.h>
#include <asm/cpu.h> #include <asm/cpu.h>
#include <asm/cputype.h> #include <asm/cputype.h>
#include <asm/topology.h> #include <asm/topology.h>
static int __init get_cpu_for_node(struct device_node *node)
{
struct device_node *cpu_node;
int cpu;
cpu_node = of_parse_phandle(node, "cpu", 0);
if (!cpu_node)
return -1;
cpu = of_cpu_node_to_id(cpu_node);
if (cpu >= 0)
topology_parse_cpu_capacity(cpu_node, cpu);
else
pr_crit("Unable to find CPU node for %pOF\n", cpu_node);
of_node_put(cpu_node);
return cpu;
}
static int __init parse_core(struct device_node *core, int package_id,
int core_id)
{
char name[10];
bool leaf = true;
int i = 0;
int cpu;
struct device_node *t;
do {
snprintf(name, sizeof(name), "thread%d", i);
t = of_get_child_by_name(core, name);
if (t) {
leaf = false;
cpu = get_cpu_for_node(t);
if (cpu >= 0) {
cpu_topology[cpu].package_id = package_id;
cpu_topology[cpu].core_id = core_id;
cpu_topology[cpu].thread_id = i;
} else {
pr_err("%pOF: Can't get CPU for thread\n",
t);
of_node_put(t);
return -EINVAL;
}
of_node_put(t);
}
i++;
} while (t);
cpu = get_cpu_for_node(core);
if (cpu >= 0) {
if (!leaf) {
pr_err("%pOF: Core has both threads and CPU\n",
core);
return -EINVAL;
}
cpu_topology[cpu].package_id = package_id;
cpu_topology[cpu].core_id = core_id;
} else if (leaf) {
pr_err("%pOF: Can't get CPU for leaf core\n", core);
return -EINVAL;
}
return 0;
}
static int __init parse_cluster(struct device_node *cluster, int depth)
{
char name[10];
bool leaf = true;
bool has_cores = false;
struct device_node *c;
static int package_id __initdata;
int core_id = 0;
int i, ret;
/*
* First check for child clusters; we currently ignore any
* information about the nesting of clusters and present the
* scheduler with a flat list of them.
*/
i = 0;
do {
snprintf(name, sizeof(name), "cluster%d", i);
c = of_get_child_by_name(cluster, name);
if (c) {
leaf = false;
ret = parse_cluster(c, depth + 1);
of_node_put(c);
if (ret != 0)
return ret;
}
i++;
} while (c);
/* Now check for cores */
i = 0;
do {
snprintf(name, sizeof(name), "core%d", i);
c = of_get_child_by_name(cluster, name);
if (c) {
has_cores = true;
if (depth == 0) {
pr_err("%pOF: cpu-map children should be clusters\n",
c);
of_node_put(c);
return -EINVAL;
}
if (leaf) {
ret = parse_core(c, package_id, core_id++);
} else {
pr_err("%pOF: Non-leaf cluster with core %s\n",
cluster, name);
ret = -EINVAL;
}
of_node_put(c);
if (ret != 0)
return ret;
}
i++;
} while (c);
if (leaf && !has_cores)
pr_warn("%pOF: empty cluster\n", cluster);
if (leaf)
package_id++;
return 0;
}
static int __init parse_dt_topology(void)
{
struct device_node *cn, *map;
int ret = 0;
int cpu;
cn = of_find_node_by_path("/cpus");
if (!cn) {
pr_err("No CPU information found in DT\n");
return 0;
}
/*
* When topology is provided cpu-map is essentially a root
* cluster with restricted subnodes.
*/
map = of_get_child_by_name(cn, "cpu-map");
if (!map)
goto out;
ret = parse_cluster(map, 0);
if (ret != 0)
goto out_map;
topology_normalize_cpu_scale();
/*
* Check that all cores are in the topology; the SMP code will
* only mark cores described in the DT as possible.
*/
for_each_possible_cpu(cpu)
if (cpu_topology[cpu].package_id == -1)
ret = -EINVAL;
out_map:
of_node_put(map);
out:
of_node_put(cn);
return ret;
}
/*
* cpu topology table
*/
struct cpu_topology cpu_topology[NR_CPUS];
EXPORT_SYMBOL_GPL(cpu_topology);
const struct cpumask *cpu_coregroup_mask(int cpu)
{
const cpumask_t *core_mask = cpumask_of_node(cpu_to_node(cpu));
/* Find the smaller of NUMA, core or LLC siblings */
if (cpumask_subset(&cpu_topology[cpu].core_sibling, core_mask)) {
/* not numa in package, lets use the package siblings */
core_mask = &cpu_topology[cpu].core_sibling;
}
if (cpu_topology[cpu].llc_id != -1) {
if (cpumask_subset(&cpu_topology[cpu].llc_sibling, core_mask))
core_mask = &cpu_topology[cpu].llc_sibling;
}
return core_mask;
}
static void update_siblings_masks(unsigned int cpuid)
{
struct cpu_topology *cpu_topo, *cpuid_topo = &cpu_topology[cpuid];
int cpu;
/* update core and thread sibling masks */
for_each_online_cpu(cpu) {
cpu_topo = &cpu_topology[cpu];
if (cpuid_topo->llc_id == cpu_topo->llc_id) {
cpumask_set_cpu(cpu, &cpuid_topo->llc_sibling);
cpumask_set_cpu(cpuid, &cpu_topo->llc_sibling);
}
if (cpuid_topo->package_id != cpu_topo->package_id)
continue;
cpumask_set_cpu(cpuid, &cpu_topo->core_sibling);
cpumask_set_cpu(cpu, &cpuid_topo->core_sibling);
if (cpuid_topo->core_id != cpu_topo->core_id)
continue;
cpumask_set_cpu(cpuid, &cpu_topo->thread_sibling);
cpumask_set_cpu(cpu, &cpuid_topo->thread_sibling);
}
}
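The sibling-mask update above can be modelled in user space with plain 64-bit masks instead of `cpumask_t`. This is a minimal sketch under stated assumptions: `struct topo`, `update_siblings()`, `MAX_CPUS` and the `online` bitmask are hypothetical stand-ins for the kernel's `cpu_topology[]` array and `for_each_online_cpu()`.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_CPUS 8u /* assumption for the sketch; the kernel uses NR_CPUS */

struct topo {
	int core_id;
	int package_id;
	int llc_id;
	uint64_t thread_sibling; /* bit N set: CPU N shares this core */
	uint64_t core_sibling;   /* bit N set: CPU N shares this package */
	uint64_t llc_sibling;    /* bit N set: CPU N shares the last-level cache */
};

/* Record 'cpuid' in the masks of every online CPU it is related to, and vice versa. */
static void update_siblings(struct topo *t, uint64_t online, unsigned int cpuid)
{
	assert(cpuid < MAX_CPUS);

	for (unsigned int cpu = 0; cpu < MAX_CPUS; cpu++) {
		if (!(online & (UINT64_C(1) << cpu)))
			continue;

		if (t[cpuid].llc_id == t[cpu].llc_id) {
			t[cpuid].llc_sibling |= UINT64_C(1) << cpu;
			t[cpu].llc_sibling |= UINT64_C(1) << cpuid;
		}

		if (t[cpuid].package_id != t[cpu].package_id)
			continue;
		t[cpu].core_sibling |= UINT64_C(1) << cpuid;
		t[cpuid].core_sibling |= UINT64_C(1) << cpu;

		if (t[cpuid].core_id != t[cpu].core_id)
			continue;
		t[cpu].thread_sibling |= UINT64_C(1) << cpuid;
		t[cpuid].thread_sibling |= UINT64_C(1) << cpu;
	}
}
```

Because each comparison is symmetric, calling the function once per CPU as it comes online is enough to keep both sides of every pair consistent, which matches the RISC-V `smp_callin()` hook in this series.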
void store_cpu_topology(unsigned int cpuid) void store_cpu_topology(unsigned int cpuid)
{ {
struct cpu_topology *cpuid_topo = &cpu_topology[cpuid]; struct cpu_topology *cpuid_topo = &cpu_topology[cpuid];
...@@ -296,49 +59,6 @@ void store_cpu_topology(unsigned int cpuid) ...@@ -296,49 +59,6 @@ void store_cpu_topology(unsigned int cpuid)
update_siblings_masks(cpuid); update_siblings_masks(cpuid);
} }
static void clear_cpu_topology(int cpu)
{
struct cpu_topology *cpu_topo = &cpu_topology[cpu];
cpumask_clear(&cpu_topo->llc_sibling);
cpumask_set_cpu(cpu, &cpu_topo->llc_sibling);
cpumask_clear(&cpu_topo->core_sibling);
cpumask_set_cpu(cpu, &cpu_topo->core_sibling);
cpumask_clear(&cpu_topo->thread_sibling);
cpumask_set_cpu(cpu, &cpu_topo->thread_sibling);
}
static void __init reset_cpu_topology(void)
{
unsigned int cpu;
for_each_possible_cpu(cpu) {
struct cpu_topology *cpu_topo = &cpu_topology[cpu];
cpu_topo->thread_id = -1;
cpu_topo->core_id = 0;
cpu_topo->package_id = -1;
cpu_topo->llc_id = -1;
clear_cpu_topology(cpu);
}
}
void remove_cpu_topology(unsigned int cpu)
{
int sibling;
for_each_cpu(sibling, topology_core_cpumask(cpu))
cpumask_clear_cpu(cpu, topology_core_cpumask(sibling));
for_each_cpu(sibling, topology_sibling_cpumask(cpu))
cpumask_clear_cpu(cpu, topology_sibling_cpumask(sibling));
for_each_cpu(sibling, topology_llc_cpumask(cpu))
cpumask_clear_cpu(cpu, topology_llc_cpumask(sibling));
clear_cpu_topology(cpu);
}
#ifdef CONFIG_ACPI #ifdef CONFIG_ACPI
static bool __init acpi_cpu_is_threaded(int cpu) static bool __init acpi_cpu_is_threaded(int cpu)
{ {
...@@ -358,10 +78,13 @@ static bool __init acpi_cpu_is_threaded(int cpu) ...@@ -358,10 +78,13 @@ static bool __init acpi_cpu_is_threaded(int cpu)
* Propagate the topology information of the processor_topology_node tree to the * Propagate the topology information of the processor_topology_node tree to the
* cpu_topology array. * cpu_topology array.
*/ */
static int __init parse_acpi_topology(void) int __init parse_acpi_topology(void)
{ {
int cpu, topology_id; int cpu, topology_id;
if (acpi_disabled)
return 0;
for_each_possible_cpu(cpu) { for_each_possible_cpu(cpu) {
int i, cache_id; int i, cache_id;
...@@ -395,24 +118,6 @@ static int __init parse_acpi_topology(void) ...@@ -395,24 +118,6 @@ static int __init parse_acpi_topology(void)
return 0; return 0;
} }
#else
static inline int __init parse_acpi_topology(void)
{
return -EINVAL;
}
#endif #endif
void __init init_cpu_topology(void)
{
reset_cpu_topology();
/*
* Discard anything that was parsed if we hit an error so we
* don't use partial information.
*/
if (!acpi_disabled && parse_acpi_topology())
reset_cpu_topology();
else if (of_have_populated_dt() && parse_dt_topology())
reset_cpu_topology();
}
...@@ -48,6 +48,7 @@ config RISCV ...@@ -48,6 +48,7 @@ config RISCV
select PCI_MSI if PCI select PCI_MSI if PCI
select RISCV_TIMER select RISCV_TIMER
select GENERIC_IRQ_MULTI_HANDLER select GENERIC_IRQ_MULTI_HANDLER
select GENERIC_ARCH_TOPOLOGY if SMP
select ARCH_HAS_PTE_SPECIAL select ARCH_HAS_PTE_SPECIAL
select ARCH_HAS_MMIOWB select ARCH_HAS_MMIOWB
select HAVE_EBPF_JIT if 64BIT select HAVE_EBPF_JIT if 64BIT
......
...@@ -8,6 +8,7 @@ ...@@ -8,6 +8,7 @@
* Copyright (C) 2017 SiFive * Copyright (C) 2017 SiFive
*/ */
#include <linux/arch_topology.h>
#include <linux/module.h> #include <linux/module.h>
#include <linux/init.h> #include <linux/init.h>
#include <linux/kernel.h> #include <linux/kernel.h>
...@@ -35,6 +36,7 @@ static DECLARE_COMPLETION(cpu_running); ...@@ -35,6 +36,7 @@ static DECLARE_COMPLETION(cpu_running);
void __init smp_prepare_boot_cpu(void) void __init smp_prepare_boot_cpu(void)
{ {
init_cpu_topology();
} }
void __init smp_prepare_cpus(unsigned int max_cpus) void __init smp_prepare_cpus(unsigned int max_cpus)
...@@ -138,6 +140,7 @@ asmlinkage void __init smp_callin(void) ...@@ -138,6 +140,7 @@ asmlinkage void __init smp_callin(void)
trap_init(); trap_init();
notify_cpu_starting(smp_processor_id()); notify_cpu_starting(smp_processor_id());
update_siblings_masks(smp_processor_id());
set_cpu_online(smp_processor_id(), 1); set_cpu_online(smp_processor_id(), 1);
/* /*
* Remote TLB flushes are ignored while the CPU is offline, so emit * Remote TLB flushes are ignored while the CPU is offline, so emit
......
...@@ -202,7 +202,7 @@ config GENERIC_ARCH_TOPOLOGY ...@@ -202,7 +202,7 @@ config GENERIC_ARCH_TOPOLOGY
help help
Enable support for architectures common topology code: e.g., parsing Enable support for architectures common topology code: e.g., parsing
CPU capacity information from DT, usage of such information for CPU capacity information from DT, usage of such information for
appropriate scaling, sysfs interface for changing capacity values at appropriate scaling, sysfs interface for reading capacity values at
runtime. runtime.
endmenu endmenu
...@@ -15,6 +15,11 @@ ...@@ -15,6 +15,11 @@
#include <linux/string.h> #include <linux/string.h>
#include <linux/sched/topology.h> #include <linux/sched/topology.h>
#include <linux/cpuset.h> #include <linux/cpuset.h>
#include <linux/cpumask.h>
#include <linux/init.h>
#include <linux/percpu.h>
#include <linux/sched.h>
#include <linux/smp.h>
DEFINE_PER_CPU(unsigned long, freq_scale) = SCHED_CAPACITY_SCALE; DEFINE_PER_CPU(unsigned long, freq_scale) = SCHED_CAPACITY_SCALE;
...@@ -241,3 +246,296 @@ static void parsing_done_workfn(struct work_struct *work) ...@@ -241,3 +246,296 @@ static void parsing_done_workfn(struct work_struct *work)
#else #else
core_initcall(free_raw_capacity); core_initcall(free_raw_capacity);
#endif #endif
#if defined(CONFIG_ARM64) || defined(CONFIG_RISCV)
static int __init get_cpu_for_node(struct device_node *node)
{
struct device_node *cpu_node;
int cpu;
cpu_node = of_parse_phandle(node, "cpu", 0);
if (!cpu_node)
return -1;
cpu = of_cpu_node_to_id(cpu_node);
if (cpu >= 0)
topology_parse_cpu_capacity(cpu_node, cpu);
else
pr_crit("Unable to find CPU node for %pOF\n", cpu_node);
of_node_put(cpu_node);
return cpu;
}
static int __init parse_core(struct device_node *core, int package_id,
int core_id)
{
char name[10];
bool leaf = true;
int i = 0;
int cpu;
struct device_node *t;
do {
snprintf(name, sizeof(name), "thread%d", i);
t = of_get_child_by_name(core, name);
if (t) {
leaf = false;
cpu = get_cpu_for_node(t);
if (cpu >= 0) {
cpu_topology[cpu].package_id = package_id;
cpu_topology[cpu].core_id = core_id;
cpu_topology[cpu].thread_id = i;
} else {
pr_err("%pOF: Can't get CPU for thread\n",
t);
of_node_put(t);
return -EINVAL;
}
of_node_put(t);
}
i++;
} while (t);
cpu = get_cpu_for_node(core);
if (cpu >= 0) {
if (!leaf) {
pr_err("%pOF: Core has both threads and CPU\n",
core);
return -EINVAL;
}
cpu_topology[cpu].package_id = package_id;
cpu_topology[cpu].core_id = core_id;
} else if (leaf) {
pr_err("%pOF: Can't get CPU for leaf core\n", core);
return -EINVAL;
}
return 0;
}
static int __init parse_cluster(struct device_node *cluster, int depth)
{
char name[10];
bool leaf = true;
bool has_cores = false;
struct device_node *c;
static int package_id __initdata;
int core_id = 0;
int i, ret;
/*
* First check for child clusters; we currently ignore any
* information about the nesting of clusters and present the
* scheduler with a flat list of them.
*/
i = 0;
do {
snprintf(name, sizeof(name), "cluster%d", i);
c = of_get_child_by_name(cluster, name);
if (c) {
leaf = false;
ret = parse_cluster(c, depth + 1);
of_node_put(c);
if (ret != 0)
return ret;
}
i++;
} while (c);
/* Now check for cores */
i = 0;
do {
snprintf(name, sizeof(name), "core%d", i);
c = of_get_child_by_name(cluster, name);
if (c) {
has_cores = true;
if (depth == 0) {
pr_err("%pOF: cpu-map children should be clusters\n",
c);
of_node_put(c);
return -EINVAL;
}
if (leaf) {
ret = parse_core(c, package_id, core_id++);
} else {
pr_err("%pOF: Non-leaf cluster with core %s\n",
cluster, name);
ret = -EINVAL;
}
of_node_put(c);
if (ret != 0)
return ret;
}
i++;
} while (c);
if (leaf && !has_cores)
pr_warn("%pOF: empty cluster\n", cluster);
if (leaf)
package_id++;
return 0;
}
static int __init parse_dt_topology(void)
{
struct device_node *cn, *map;
int ret = 0;
int cpu;
cn = of_find_node_by_path("/cpus");
if (!cn) {
pr_err("No CPU information found in DT\n");
return 0;
}
/*
* When topology is provided cpu-map is essentially a root
* cluster with restricted subnodes.
*/
map = of_get_child_by_name(cn, "cpu-map");
if (!map)
goto out;
ret = parse_cluster(map, 0);
if (ret != 0)
goto out_map;
topology_normalize_cpu_scale();
/*
* Check that all cores are in the topology; the SMP code will
* only mark cores described in the DT as possible.
*/
for_each_possible_cpu(cpu)
if (cpu_topology[cpu].package_id == -1)
ret = -EINVAL;
out_map:
of_node_put(map);
out:
of_node_put(cn);
return ret;
}
#endif
/*
* cpu topology table
*/
struct cpu_topology cpu_topology[NR_CPUS];
EXPORT_SYMBOL_GPL(cpu_topology);
const struct cpumask *cpu_coregroup_mask(int cpu)
{
const cpumask_t *core_mask = cpumask_of_node(cpu_to_node(cpu));
/* Find the smaller of NUMA, core or LLC siblings */
if (cpumask_subset(&cpu_topology[cpu].core_sibling, core_mask)) {
/* not numa in package, lets use the package siblings */
core_mask = &cpu_topology[cpu].core_sibling;
}
if (cpu_topology[cpu].llc_id != -1) {
if (cpumask_subset(&cpu_topology[cpu].llc_sibling, core_mask))
core_mask = &cpu_topology[cpu].llc_sibling;
}
return core_mask;
}
void update_siblings_masks(unsigned int cpuid)
{
struct cpu_topology *cpu_topo, *cpuid_topo = &cpu_topology[cpuid];
int cpu;
/* update core and thread sibling masks */
for_each_online_cpu(cpu) {
cpu_topo = &cpu_topology[cpu];
if (cpuid_topo->llc_id == cpu_topo->llc_id) {
cpumask_set_cpu(cpu, &cpuid_topo->llc_sibling);
cpumask_set_cpu(cpuid, &cpu_topo->llc_sibling);
}
if (cpuid_topo->package_id != cpu_topo->package_id)
continue;
cpumask_set_cpu(cpuid, &cpu_topo->core_sibling);
cpumask_set_cpu(cpu, &cpuid_topo->core_sibling);
if (cpuid_topo->core_id != cpu_topo->core_id)
continue;
cpumask_set_cpu(cpuid, &cpu_topo->thread_sibling);
cpumask_set_cpu(cpu, &cpuid_topo->thread_sibling);
}
}
static void clear_cpu_topology(int cpu)
{
struct cpu_topology *cpu_topo = &cpu_topology[cpu];
cpumask_clear(&cpu_topo->llc_sibling);
cpumask_set_cpu(cpu, &cpu_topo->llc_sibling);
cpumask_clear(&cpu_topo->core_sibling);
cpumask_set_cpu(cpu, &cpu_topo->core_sibling);
cpumask_clear(&cpu_topo->thread_sibling);
cpumask_set_cpu(cpu, &cpu_topo->thread_sibling);
}
void __init reset_cpu_topology(void)
{
unsigned int cpu;
for_each_possible_cpu(cpu) {
struct cpu_topology *cpu_topo = &cpu_topology[cpu];
cpu_topo->thread_id = -1;
cpu_topo->core_id = -1;
cpu_topo->package_id = -1;
cpu_topo->llc_id = -1;
clear_cpu_topology(cpu);
}
}
void remove_cpu_topology(unsigned int cpu)
{
int sibling;
for_each_cpu(sibling, topology_core_cpumask(cpu))
cpumask_clear_cpu(cpu, topology_core_cpumask(sibling));
for_each_cpu(sibling, topology_sibling_cpumask(cpu))
cpumask_clear_cpu(cpu, topology_sibling_cpumask(sibling));
for_each_cpu(sibling, topology_llc_cpumask(cpu))
cpumask_clear_cpu(cpu, topology_llc_cpumask(sibling));
clear_cpu_topology(cpu);
}
__weak int __init parse_acpi_topology(void)
{
return 0;
}
#if defined(CONFIG_ARM64) || defined(CONFIG_RISCV)
void __init init_cpu_topology(void)
{
reset_cpu_topology();
/*
* Discard anything that was parsed if we hit an error so we
* don't use partial information.
*/
if (parse_acpi_topology())
reset_cpu_topology();
else if (of_have_populated_dt() && parse_dt_topology())
reset_cpu_topology();
}
#endif
@@ -33,4 +33,30 @@ unsigned long topology_get_freq_scale(int cpu)
return per_cpu(freq_scale, cpu);
}
struct cpu_topology {
int thread_id;
int core_id;
int package_id;
int llc_id;
cpumask_t thread_sibling;
cpumask_t core_sibling;
cpumask_t llc_sibling;
};
#ifdef CONFIG_GENERIC_ARCH_TOPOLOGY
extern struct cpu_topology cpu_topology[NR_CPUS];
#define topology_physical_package_id(cpu) (cpu_topology[cpu].package_id)
#define topology_core_id(cpu) (cpu_topology[cpu].core_id)
#define topology_core_cpumask(cpu) (&cpu_topology[cpu].core_sibling)
#define topology_sibling_cpumask(cpu) (&cpu_topology[cpu].thread_sibling)
#define topology_llc_cpumask(cpu) (&cpu_topology[cpu].llc_sibling)
void init_cpu_topology(void);
void store_cpu_topology(unsigned int cpuid);
const struct cpumask *cpu_coregroup_mask(int cpu);
void update_siblings_masks(unsigned int cpu);
void remove_cpu_topology(unsigned int cpuid);
void reset_cpu_topology(void);
#endif
#endif /* _LINUX_ARCH_TOPOLOGY_H_ */
@@ -27,6 +27,7 @@
#ifndef _LINUX_TOPOLOGY_H
#define _LINUX_TOPOLOGY_H
#include <linux/arch_topology.h>
#include <linux/cpumask.h>
#include <linux/bitops.h>
#include <linux/mmzone.h>
......