    topology: Represent clusters of CPUs within a die · c5e22fef
    Jonathan Cameron authored
    Both ACPI and DT provide the ability to describe additional layers of
    topology between that of individual cores and higher level constructs
    such as the level at which the last level cache is shared.
    In ACPI this can be represented in PPTT as a Processor Hierarchy
    Node Structure [1] that is the parent of the CPU cores and in turn
    has a parent Processor Hierarchy Nodes Structure representing
    a higher level of topology.
    
    For example, Kunpeng 920 has 6 or 8 clusters in each NUMA node, and each
    cluster has 4 CPUs. All clusters share L3 cache data, but each cluster
    has a local L3 tag. On the other hand, each cluster shares some
    internal system bus.
    
    +-----------------------------------+                          +---------+
    |  +------+    +------+             +--------------------------+         |
    |  | CPU0 |    | CPU1 |             |    +-----------+         |         |
    |  +------+    +------+             |    |           |         |         |
    |                                   +----+    L3     |         |         |
    |  +------+    +------+   cluster   |    |    tag    |         |         |
    |  | CPU2 |    | CPU3 |             |    |           |         |         |
    |  +------+    +------+             |    +-----------+         |         |
    |                                   |                          |         |
    +-----------------------------------+                          |         |
    +-----------------------------------+                          |         |
    |  +------+    +------+             +--------------------------+         |
    |  |      |    |      |             |    +-----------+         |         |
    |  +------+    +------+             |    |           |         |         |
    |                                   |    |    L3     |         |         |
    |  +------+    +------+             +----+    tag    |         |         |
    |  |      |    |      |             |    |           |         |         |
    |  +------+    +------+             |    +-----------+         |         |
    |                                   |                          |         |
    +-----------------------------------+                          |   L3    |
                                                                   |   data  |
    +-----------------------------------+                          |         |
    |  +------+    +------+             |    +-----------+         |         |
    |  |      |    |      |             |    |           |         |         |
    |  +------+    +------+             +----+    L3     |         |         |
    |                                   |    |    tag    |         |         |
    |  +------+    +------+             |    |           |         |         |
    |  |      |    |      |             |    +-----------+         |         |
    |  +------+    +------+             +--------------------------+         |
    +-----------------------------------+                          |         |
    +-----------------------------------+                          |         |
    |  +------+    +------+             +--------------------------+         |
    |  |      |    |      |             |    +-----------+         |         |
    |  +------+    +------+             |    |           |         |         |
    |                                   +----+    L3     |         |         |
    |  +------+    +------+             |    |    tag    |         |         |
    |  |      |    |      |             |    |           |         |         |
    |  +------+    +------+             |    +-----------+         |         |
    |                                   |                          |         |
    +-----------------------------------+                          |         |
    +-----------------------------------+                          |         |
    |  +------+    +------+             +--------------------------+         |
    |  |      |    |      |             |   +-----------+          |         |
    |  +------+    +------+             |   |           |          |         |
    |                                   |   |    L3     |          |         |
    |  +------+    +------+             +---+    tag    |          |         |
    |  |      |    |      |             |   |           |          |         |
    |  +------+    +------+             |   +-----------+          |         |
    |                                   |                          |         |
    +-----------------------------------+                          |         |
    +-----------------------------------+                          |         |
    |  +------+    +------+             +--------------------------+         |
    |  |      |    |      |             |  +-----------+           |         |
    |  +------+    +------+             |  |           |           |         |
    |                                   |  |    L3     |           |         |
    |  +------+    +------+             +--+    tag    |           |         |
    |  |      |    |      |             |  |           |           |         |
    |  +------+    +------+             |  +-----------+           |         |
    |                                   |                          +---------+
    +-----------------------------------+
    
    That means spreading tasks among clusters will bring more bandwidth
    while packing tasks within one cluster will lead to smaller cache
    synchronization latency. So both kernel and userspace will have
    a chance to leverage this topology to deploy tasks accordingly to
    achieve either smaller cache latency within one cluster or an even
    distribution of load among clusters for higher throughput.
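
    The trade-off above can be illustrated with a small sketch. It is not
    the scheduler's algorithm; place_tasks() and its "pack"/"spread"
    policies are hypothetical names, and the CPU numbering assumes a
    NUMA node with 6 clusters of 4 CPUs numbered consecutively.

```python
def place_tasks(clusters, n_tasks, policy):
    """Pick one CPU per task from `clusters` (a list of lists of CPU ids).

    Hypothetical helper for illustration only:
      "pack"   fills one cluster before touching the next (low cache
               synchronization latency between the tasks),
      "spread" takes one CPU from each cluster in turn (more bandwidth).
    """
    if policy == "pack":
        order = [cpu for cluster in clusters for cpu in cluster]
    elif policy == "spread":
        depth = max((len(c) for c in clusters), default=0)
        order = [c[i] for i in range(depth) for c in clusters if i < len(c)]
    else:
        raise ValueError(f"unknown policy: {policy!r}")
    if n_tasks > len(order):
        raise ValueError("more tasks than CPUs")
    return order[:n_tasks]


# Assumed layout: 6 clusters x 4 CPUs, numbered 0..23.
EXAMPLE_CLUSTERS = [list(range(base, base + 4)) for base in range(0, 24, 4)]
```

    With this layout, four packed tasks land on CPUs 0-3 of one cluster,
    while four spread tasks land on the first CPU of four different
    clusters.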
    
    This patch exposes cluster topology to both kernel and userspace.
    Libraries like hwloc can discover clusters through cluster_cpus and
    related sysfs attributes. A PoC of hwloc support is at [2].
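
    As a rough userspace sketch, the cluster siblings of a CPU could be
    read as follows. The path is an assumption: it supposes the list-form
    attribute is named cluster_cpus_list and sits next to the other
    per-CPU topology files. The helper names are hypothetical.

```python
from pathlib import Path


def parse_cpulist(text):
    """Expand a cpulist string such as "0-3,8" into a sorted list of CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        lo, _, hi = part.partition("-")
        cpus.update(range(int(lo), int(hi or lo) + 1))
    return sorted(cpus)


def cluster_siblings(cpu, sysfs_root="/sys/devices/system/cpu"):
    """Return the CPUs in the same cluster as `cpu`.

    Returns None when the attribute cannot be read, for example on a
    kernel or platform that does not expose cluster topology.
    The attribute path below is an assumption, not taken from the patch text.
    """
    path = Path(sysfs_root) / f"cpu{cpu}" / "topology" / "cluster_cpus_list"
    try:
        return parse_cpulist(path.read_text())
    except OSError:
        return None
```

    Calling cluster_siblings(0) on a system without the attribute returns
    None, so callers can fall back to another topology level.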
    
    Note that this patch only handles the ACPI case.
    
    Special consideration is needed for SMT processors, where it is
    necessary to move 2 levels up the hierarchy from the leaf nodes
    (thus skipping the processor core level).
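
    The SMT rule can be sketched on a toy parent-pointer tree. This is not
    the kernel's PPTT parser; Node, is_thread and cluster_of() are
    hypothetical names used only to show the one-level versus two-level
    walk.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)
class Node:
    """Toy stand-in for a processor hierarchy node (illustration only)."""
    name: str
    parent: Optional["Node"] = None
    is_thread: bool = False  # True when the leaf is an SMT thread


def cluster_of(leaf):
    """Return the cluster-level ancestor of a leaf node.

    A non-SMT leaf is a core, so its parent is the cluster.
    An SMT leaf is a thread, so the core level is skipped and the
    cluster is two levels up.
    """
    node = leaf.parent if leaf.is_thread else leaf
    return node.parent if node is not None else None


# Example tree: package -> cluster0 -> core0 -> thread0 (SMT)
#                                   -> core1            (non-SMT leaf)
package = Node("package")
cluster0 = Node("cluster0", parent=package)
core0 = Node("core0", parent=cluster0)
thread0 = Node("thread0", parent=core0, is_thread=True)
core1 = Node("core1", parent=cluster0)
```

    Both cluster_of(thread0) and cluster_of(core1) resolve to cluster0 in
    this example.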
    
    Note that arm64 / ACPI does not provide any means of identifying
    a die level in the topology, but that may be unrelated to the cluster
    level.
    
    [1] ACPI Specification 6.3 - section 5.2.29.1 processor hierarchy node
        structure (Type 0)
    [2] https://github.com/hisilicon/hwloc/tree/linux-cluster

    Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
    Signed-off-by: Tian Tao <tiantao6@hisilicon.com>
    Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Link: https://lore.kernel.org/r/20210924085104.44806-2-21cnbao@gmail.com
cputopology.rst 4.1 KB