    powerpc/smp: Enable CACHE domain for shared processor · 5bf63497
    Srikar Dronamraju authored
    Currently the CACHE domain is not enabled on PowerVM LPARs running in
    shared processor mode. On PowerVM systems, the 'ibm,thread-group'
    device-tree property 2 under the cpu-device-node indicates which CPUs
    share an L2 cache. However, 'ibm,thread-group' property 2 is a
    relatively new property. In its absence, the 'l2-cache' device property
    under the cpu-device-node could help the system identify CPUs sharing an
    L2 cache. However, PhyP does not expose this property in shared
    processor mode configurations.
    
    In the absence of properties that tell the OS which CPUs share an
    L2 cache, fall back on the core boundary.
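The lookup order described above can be modelled in a few lines. This is only an illustrative sketch, not the kernel code: the function name `l2_siblings`, its parameters, and the assumption that the threads of a core have contiguous CPU numbers are all hypothetical. The default of 8 threads per core is taken from the lscpu output below.

```python
from typing import Dict, Iterable, Optional, Set


def l2_siblings(
    cpu: int,
    thread_groups_l2: Optional[Iterable[Iterable[int]]] = None,
    l2_cache_node: Optional[Dict[int, int]] = None,
    threads_per_core: int = 8,
) -> Set[int]:
    """Return the CPUs assumed to share an L2 cache with `cpu`.

    thread_groups_l2: hypothetical stand-in for 'ibm,thread-group'
        property 2, given as groups of CPUs that share an L2 cache.
    l2_cache_node: hypothetical stand-in for the 'l2-cache' property,
        mapping each CPU to an identifier of its L2 cache node.
    """
    # 1. Preferred source: explicit thread groups, if present.
    if thread_groups_l2 is not None:
        for group in thread_groups_l2:
            members = set(group)
            if cpu in members:
                return members

    # 2. Next: CPUs that reference the same L2 cache node.
    if l2_cache_node is not None and cpu in l2_cache_node:
        node = l2_cache_node[cpu]
        return {c for c, n in l2_cache_node.items() if n == node}

    # 3. Fallback: neither property is available, so treat the whole
    #    core as the cache domain (assumes contiguous thread numbering).
    first = cpu - cpu % threads_per_core
    return set(range(first, first + threads_per_core))
```

With neither property supplied, `l2_siblings(5)` returns CPUs 0-7, that is, the core containing CPU 5.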
    
    Here are some stats from a POWER9 shared LPAR with the changes.
    
    $ lscpu
    Architecture:        ppc64le
    Byte Order:          Little Endian
    CPU(s):              32
    On-line CPU(s) list: 0-31
    Thread(s) per core:  8
    Core(s) per socket:  1
    Socket(s):           3
    NUMA node(s):        2
    Model:               2.2 (pvr 004e 0202)
    Model name:          POWER9 (architected), altivec supported
    Hypervisor vendor:   pHyp
    Virtualization type: para
    L1d cache:           32K
    L1i cache:           32K
    NUMA node0 CPU(s):   16-23
    NUMA node1 CPU(s):   0-15,24-31
    Physical sockets:    2
    Physical chips:      1
    Physical cores/chip: 10
    
    $ grep -r . /sys/kernel/debug/sched/domains/cpu0/domain*/name
    Before
    /sys/kernel/debug/sched/domains/cpu0/domain0/name:SMT
    /sys/kernel/debug/sched/domains/cpu0/domain1/name:DIE
    /sys/kernel/debug/sched/domains/cpu0/domain2/name:NUMA
    
    After
    /sys/kernel/debug/sched/domains/cpu0/domain0/name:SMT
    /sys/kernel/debug/sched/domains/cpu0/domain1/name:CACHE
    /sys/kernel/debug/sched/domains/cpu0/domain2/name:DIE
    /sys/kernel/debug/sched/domains/cpu0/domain3/name:NUMA
    
    $ awk '/domain/{print $1, $2}' /proc/schedstat | sort -u | sed -e 's/00000000,//g'
    Before
    domain0 00000055
    domain0 000000aa
    domain0 00005500
    domain0 0000aa00
    domain0 00550000
    domain0 00aa0000
    domain0 55000000
    domain0 aa000000
    domain1 00ff0000
    domain1 ff00ffff
    domain2 ffffffff
    
    After
    domain0 00000055
    domain0 000000aa
    domain0 00005500
    domain0 0000aa00
    domain0 00550000
    domain0 00aa0000
    domain0 55000000
    domain0 aa000000
    domain1 000000ff
    domain1 0000ff00
    domain1 00ff0000
    domain1 ff000000
    domain2 ff00ffff
    domain2 ffffffff
    domain3 ffffffff
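The masks above are hexadecimal CPU bitmaps in which bit N stands for CPU N. A small helper makes them easier to read. This is a sketch for illustration only; `mask_to_cpus` and `cpus_to_ranges` are hypothetical names, not existing tools.

```python
from typing import List


def mask_to_cpus(mask_hex: str) -> List[int]:
    """Decode a hex cpumask (commas allowed) into a sorted CPU list."""
    value = int(mask_hex.replace(",", ""), 16)
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


def cpus_to_ranges(cpus: List[int]) -> str:
    """Format a sorted CPU list in the compact 'a-b,c' style used by lscpu."""
    ranges = []
    start = prev = None
    for cpu in cpus:
        if start is None:
            start = prev = cpu
        elif cpu == prev + 1:
            prev = cpu
        else:
            ranges.append((start, prev))
            start = prev = cpu
    if start is not None:
        ranges.append((start, prev))
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


for mask in ("00000055", "000000ff", "00ff0000", "ff00ffff"):
    print(mask, "->", cpus_to_ranges(mask_to_cpus(mask)))
```

Decoded this way, the new domain1 (CACHE) masks each cover one 8-thread core (for example 000000ff is CPUs 0-7), while the masks 00ff0000 and ff00ffff match the NUMA node0 and node1 CPU lists in the lscpu output (16-23 and 0-15,24-31).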
    
    (Lower is better)
    $ perf stat -a -r 5 -n perf bench sched pipe | tail -n 2
    Before
               153.798 +- 0.142 seconds time elapsed  ( +-  0.09% )
    
    After
               111.545 +- 0.652 seconds time elapsed  ( +-  0.58% )
    
    This is an improvement of 27.47%.

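The quoted figure is the relative reduction in elapsed time, computed from the two perf results above. A quick check, where `improvement_pct` is a hypothetical helper name:

```python
def improvement_pct(before: float, after: float) -> float:
    """Relative reduction in elapsed time, as a percentage of `before`."""
    return (before - after) / before * 100.0


# Elapsed times in seconds, taken from the perf stat output above.
print(f"{improvement_pct(153.798, 111.545):.2f}%")  # prints "27.47%"
```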
    Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
    Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20210826100401.412519-4-srikar@linux.vnet.ibm.com
smp.c 42 KB