    cpufreq: amd-pstate: change cpu freq transition delay for some models · c00d476c
    Xiaojian Du authored
    Some AMD Zen 4 APUs/CPUs can adjust the CPU core clock more
    quickly and precisely in response to the CPU workload.
    This capability is advertised by the Fast CPPC x86 feature.
    This change only takes effect in the *passive mode* of the
    amd-pstate driver. Based on test results with different
    transition delay values, 600us was chosen as a balance
    between performance and power consumption.
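
The selection described above can be sketched in plain userspace C. This is an illustration only, not the driver code: the macro and function names are hypothetical, `has_fast_cppc` stands in for whatever feature check the driver performs, and the 1000us fallback is assumed from the baseline used in the tables below. The second helper shows how a governor-style rate limit would use the delay: a new frequency request is allowed only after the delay has elapsed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values from the commit message; the macro names are hypothetical. */
#define DEFAULT_TRANSITION_DELAY_US    1000u  /* assumed baseline */
#define FAST_CPPC_TRANSITION_DELAY_US   600u  /* value chosen by this change */

/* Pick the transition delay depending on whether Fast CPPC is advertised. */
static uint32_t pick_transition_delay_us(bool has_fast_cppc)
{
    return has_fast_cppc ? FAST_CPPC_TRANSITION_DELAY_US
                         : DEFAULT_TRANSITION_DELAY_US;
}

/*
 * Governor-style rate limit: allow a new frequency request only if at
 * least delay_us microseconds have passed since the previous one.
 */
static bool may_update_freq(uint64_t now_us, uint64_t last_update_us,
                            uint32_t delay_us)
{
    return now_us - last_update_us >= delay_us;
}
```

Under these assumptions, a request arriving 600us after the previous one would pass with the shorter delay but be skipped with the 1000us delay, which is the mechanism by which a shorter delay lets the clock follow the load more closely.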
    
    Test results on an AMD Ryzen 7840HS (Phoenix) APU:
    
    1. Tbench
    (Lower energy is better, higher throughput is better,
    higher PPW (performance per watt) is better)
    ============= =================== ============== =============== ============== =============== ============== =============== ===============
     Trans Delay   Tbench              governor:schedutil, 3-iterations average
    ============= =================== ============== =============== ============== =============== ============== =============== ===============
     1000us        Clients             1              2               4              8              12             16              32
                   Energy/Joules       2010           2804            8768           17171          16170          15132           15027
                   Throughput/(MB/s)   114            259             1041           3010           3135           4851            4605
                   PPW                 0.0567         0.0923          0.1187         0.1752         0.1938         0.3205          0.3064
     600us         Clients             1              2               4              8              12             16              32
                   Energy/Joules       2115  (5.22%)  2388  (-14.84%) 10700(22.03%)  16716 (-2.65%) 15939 (-1.43%) 15053 (-0.52%)  15083 (0.37%)
                   Throughput/(MB/s)   122   (7.02%)  234   (-9.65%)  1188 (14.12%)  3003  (-0.23%) 3143  (0.26%)  4842  (-0.19%)  4603  (-0.04%)
                   PPW                 0.0576(1.59%)  0.0979(6.07%)   0.111(-6.49%)  0.1796(2.51%)  0.1971(1.70%)  0.3216(0.34%)   0.3051(-0.42%)
    ============= =================== ============== =============== ============== =============== ============== =============== ===============
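
The derived columns can be reproduced with two small helpers. This is a sketch based on reading the table: the PPW values appear to be throughput divided by energy, and the percentages appear to be relative changes against the 1000us row. The function names are hypothetical.

```c
/* PPW as it appears in the tables: throughput (MB/s) divided by energy (J). */
static double ppw(double throughput_mb_s, double energy_joules)
{
    return throughput_mb_s / energy_joules;
}

/* Relative change of value against baseline, in percent. */
static double pct_change(double baseline, double value)
{
    return (value - baseline) / baseline * 100.0;
}
```

For the 1-client Tbench column, `ppw(114, 2010)` gives about 0.0567 and `pct_change(2010, 2115)` gives about 5.22, matching the table. The PPW percentages seem to have been computed from the rounded PPW values, so recomputing them from the raw throughput and energy figures can differ slightly.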
    
    2. Dbench
    (Lower energy is better, higher throughput is better,
    higher PPW (performance per watt) is better)
    ============= =================== ============== =============== ============== =============== ============== =============== ===============
     Trans Delay   Dbench              governor:schedutil, 3-iterations average
    ============= =================== ============== =============== ============== =============== ============== =============== ===============
     1000us        Clients             1             2               4              8               12             16              32
                   Energy/Joules       4890          3779            3567           5157            5611           6500            8163
                   Throughput/(MB/s)   327           167             220            577             775            938             1397
                   PPW                 0.0668        0.0441          0.0616         0.1118          0.1381         0.1443          0.1711
     600us         Clients             1             2               4              8               12             16              32
                   Energy/Joules       4915  (0.51%) 4912  (29.98%)  3506  (-1.71%) 4907  (-4.85%)  5011 (-10.69%) 5672  (-12.74%) 8141  (-0.27%)
                   Throughput/(MB/s)   348   (6.42%) 284   (70.06%)  220   (0.00%)  518   (-10.23%) 712  (-8.13%)  854   (-8.96%)  1475  (5.58%)
                   PPW                 0.0708(5.99%) 0.0578(31.07%)  0.0627(1.79%)  0.1055(-5.64%)  0.142(2.82%)   0.1505(4.30%)   0.1811(5.84%)
    ============= =================== ============== =============== ============== =============== ============== =============== ===============
    
    3. Hackbench (less time is better)
    ============= =========================== ==========================
      hackbench     governor:schedutil
    ============= =========================== ==========================
      Trans Delay   Process Mode Ave time(s)    Thread Mode Ave time(s)
      1000us        14.484                      14.484
      600us         14.418(-0.46%)              15.41(+6.39%)
    ============= =========================== ==========================
    
    4. Perf_sched_bench (less time is better)
    ============= =================== ============== ============== ============== =============== =============== =============
     Trans Delay  perf_sched_bench    governor:schedutil
    ============= =================== ============== ============== ============== =============== =============== =============
      1000us        Groups             1             2              4              8               12              24
                    AveTime(s)        1.64          2.851          5.878          11.636          16.093          26.395
      600us         Groups             1             2              4              8               12              24
                    AveTime(s)        1.69(3.05%)   2.845(-0.21%)  5.843(-0.60%)  11.576(-0.52%)  16.092(-0.01%)  26.32(-0.28%)
    ============= =================== ============== ============== ============== =============== =============== =============
    
    5. Sysbench (higher is better)
    ============= ================== ============== ================= ============== ================ =============== =================
      Sysbench    governor:schedutil
    ============= ================== ============== ================= ============== ================ =============== =================
      1000us      Thread             1               2                4              8                12               24
                  Ave events         6020.98         12273.39         24119.82       46171.57         47074.37         47831.72
      600us       Thread             1               2                4              8                12               24
                  Ave events         6154.82(2.22%)  12271.63(-0.01%) 24392.5(1.13%) 46117.64(-0.12%) 46852.19(-0.47%) 47678.92(-0.32%)
    ============= ================== ============== ================= ============== ================ =============== =================
    
    In conclusion, a shorter CPU clock transition delay improves PPW
    on the Dbench test at most client counts, while performance on
    Tbench, Hackbench, Perf_sched_bench and Sysbench stays largely
    stable.
    
    Signed-off-by: Xiaojian Du <Xiaojian.Du@amd.com>
    Reviewed-by: Perry Yuan <perry.yuan@amd.com>
    Acked-by: Mario Limonciello <mario.limonciello@amd.com>
amd-pstate.c 47.5 KB