    perf metrics: Compute unmerged uncore metrics individually · a59fb796
    Ian Rogers authored
    When counts from multiple uncore PMUs are merged, the metric is
    computed only for the metric leader. With merging/aggregation disabled
    (as with -A in the runs below), only the leader's metric was still
    being computed before this patch. Fix this by computing the metric
    for each PMU.
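
As a rough illustration of the behaviour described above, here is a toy Python model. It is not perf's implementation: the function `compute_metrics`, the `(pmu, event, count)` sample tuples, and the `metric` callable are all hypothetical names made up for this sketch. With merging, counts are summed across PMUs and one metric is reported on the leader; without merging, one metric is reported per PMU.

```python
from collections import defaultdict


def compute_metrics(samples, metric, merge):
    """Toy model only. samples is a list of (pmu, event, count) tuples."""
    if merge:
        # Sum each event across all PMUs and report once, on the leader
        # (modelled here as the PMU of the first sample).
        totals = defaultdict(int)
        for _pmu, event, count in samples:
            totals[event] += count
        leader = samples[0][0]
        return {leader: metric(totals)}

    # No merging: keep each PMU's counts apart and compute one metric per PMU.
    by_pmu = defaultdict(dict)
    for pmu, event, count in samples:
        by_pmu[pmu][event] = count
    return {pmu: metric(events) for pmu, events in by_pmu.items()}


# Made-up counts for two PMUs, with a trivial "metric" that adds RD and WR.
samples = [
    ("uncore_imc_0", "RD", 10),
    ("uncore_imc_0", "WR", 5),
    ("uncore_imc_1", "RD", 1),
    ("uncore_imc_1", "WR", 2),
]
total = lambda events: events["RD"] + events["WR"]

print(compute_metrics(samples, total, merge=True))
print(compute_metrics(samples, total, merge=False))
```

In this model the pre-patch behaviour corresponds to the unmerged case returning an entry for `uncore_imc_0` only.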
    
    On a SkylakeX:
    Before:
    ```
    $ perf stat -A -M memory_bandwidth_total -a sleep 1
    
     Performance counter stats for 'system wide':
    
    CPU0               82,217      UNC_M_CAS_COUNT.RD [uncore_imc_0] #      9.2 MB/s  memory_bandwidth_total
    CPU18                   0      UNC_M_CAS_COUNT.RD [uncore_imc_0] #      0.0 MB/s  memory_bandwidth_total
    CPU0               61,395      UNC_M_CAS_COUNT.WR [uncore_imc_0]
    CPU18                   0      UNC_M_CAS_COUNT.WR [uncore_imc_0]
    CPU0                    0      UNC_M_CAS_COUNT.RD [uncore_imc_1]
    CPU18                   0      UNC_M_CAS_COUNT.RD [uncore_imc_1]
    CPU0                    0      UNC_M_CAS_COUNT.WR [uncore_imc_1]
    CPU18                   0      UNC_M_CAS_COUNT.WR [uncore_imc_1]
    CPU0               81,570      UNC_M_CAS_COUNT.RD [uncore_imc_2]
    CPU18             113,886      UNC_M_CAS_COUNT.RD [uncore_imc_2]
    CPU0               62,330      UNC_M_CAS_COUNT.WR [uncore_imc_2]
    CPU18              66,942      UNC_M_CAS_COUNT.WR [uncore_imc_2]
    CPU0               75,489      UNC_M_CAS_COUNT.RD [uncore_imc_3]
    CPU18              27,958      UNC_M_CAS_COUNT.RD [uncore_imc_3]
    CPU0               55,864      UNC_M_CAS_COUNT.WR [uncore_imc_3]
    CPU18              38,727      UNC_M_CAS_COUNT.WR [uncore_imc_3]
    CPU0                    0      UNC_M_CAS_COUNT.RD [uncore_imc_4]
    CPU18                   0      UNC_M_CAS_COUNT.RD [uncore_imc_4]
    CPU0                    0      UNC_M_CAS_COUNT.WR [uncore_imc_4]
    CPU18                   0      UNC_M_CAS_COUNT.WR [uncore_imc_4]
    CPU0               75,423      UNC_M_CAS_COUNT.RD [uncore_imc_5]
    CPU18             104,527      UNC_M_CAS_COUNT.RD [uncore_imc_5]
    CPU0               57,596      UNC_M_CAS_COUNT.WR [uncore_imc_5]
    CPU18              56,777      UNC_M_CAS_COUNT.WR [uncore_imc_5]
    CPU0        1,003,440,851 ns   duration_time
    
           1.003440851 seconds time elapsed
    ```
    
    After:
    ```
    $ perf stat -A -M memory_bandwidth_total -a sleep 1
    
     Performance counter stats for 'system wide':
    
    CPU0               88,968      UNC_M_CAS_COUNT.RD [uncore_imc_0] #      9.5 MB/s  memory_bandwidth_total
    CPU18                   0      UNC_M_CAS_COUNT.RD [uncore_imc_0] #      0.0 MB/s  memory_bandwidth_total
    CPU0               59,498      UNC_M_CAS_COUNT.WR [uncore_imc_0]
    CPU18                   0      UNC_M_CAS_COUNT.WR [uncore_imc_0]
    CPU0                    0      UNC_M_CAS_COUNT.RD [uncore_imc_1] #      0.0 MB/s  memory_bandwidth_total
    CPU18                   0      UNC_M_CAS_COUNT.RD [uncore_imc_1] #      0.0 MB/s  memory_bandwidth_total
    CPU0                    0      UNC_M_CAS_COUNT.WR [uncore_imc_1]
    CPU18                   0      UNC_M_CAS_COUNT.WR [uncore_imc_1]
    CPU0               88,635      UNC_M_CAS_COUNT.RD [uncore_imc_2] #      9.5 MB/s  memory_bandwidth_total
    CPU18             117,975      UNC_M_CAS_COUNT.RD [uncore_imc_2] #     11.5 MB/s  memory_bandwidth_total
    CPU0               60,829      UNC_M_CAS_COUNT.WR [uncore_imc_2]
    CPU18              62,105      UNC_M_CAS_COUNT.WR [uncore_imc_2]
    CPU0               82,238      UNC_M_CAS_COUNT.RD [uncore_imc_3] #      8.7 MB/s  memory_bandwidth_total
    CPU18              22,906      UNC_M_CAS_COUNT.RD [uncore_imc_3] #      3.6 MB/s  memory_bandwidth_total
    CPU0               53,959      UNC_M_CAS_COUNT.WR [uncore_imc_3]
    CPU18              32,990      UNC_M_CAS_COUNT.WR [uncore_imc_3]
    CPU0                    0      UNC_M_CAS_COUNT.RD [uncore_imc_4] #      0.0 MB/s  memory_bandwidth_total
    CPU18                   0      UNC_M_CAS_COUNT.RD [uncore_imc_4] #      0.0 MB/s  memory_bandwidth_total
    CPU0                    0      UNC_M_CAS_COUNT.WR [uncore_imc_4]
    CPU18                   0      UNC_M_CAS_COUNT.WR [uncore_imc_4]
    CPU0               83,595      UNC_M_CAS_COUNT.RD [uncore_imc_5] #      8.9 MB/s  memory_bandwidth_total
    CPU18             110,151      UNC_M_CAS_COUNT.RD [uncore_imc_5] #     10.5 MB/s  memory_bandwidth_total
    CPU0               56,540      UNC_M_CAS_COUNT.WR [uncore_imc_5]
    CPU18              53,816      UNC_M_CAS_COUNT.WR [uncore_imc_5]
    CPU0        1,003,353,416 ns   duration_time
    ```
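
The per-PMU values in the "After" output can be sanity-checked by hand. The sketch below assumes that `memory_bandwidth_total` is (reads + writes) × 64 bytes, divided by 10^6 and by the elapsed time in seconds. That formula is not stated in the commit message; it is inferred here because it reproduces the printed numbers. The helper name `bandwidth_mb_per_s` is hypothetical.

```python
CACHE_LINE_BYTES = 64  # assumption: each CAS read/write moves one 64-byte line


def bandwidth_mb_per_s(rd, wr, duration_ns):
    """Assumed formula: (RD + WR) * 64 bytes / 1e6 / elapsed seconds."""
    bytes_moved = (rd + wr) * CACHE_LINE_BYTES
    seconds = duration_ns / 1e9
    return bytes_moved / 1e6 / seconds


# duration_time and a few (RD, WR) pairs copied from the "After" output.
DURATION_NS = 1_003_353_416
after = {
    ("CPU0", "uncore_imc_0"): (88_968, 59_498),
    ("CPU18", "uncore_imc_2"): (117_975, 62_105),
    ("CPU18", "uncore_imc_3"): (22_906, 32_990),
}

for (cpu, pmu), (rd, wr) in after.items():
    value = bandwidth_mb_per_s(rd, wr, DURATION_NS)
    print(f"{cpu} {pmu}: {value:.1f} MB/s")
```

Under that assumption the three rows round to 9.5, 11.5 and 3.6 MB/s, matching the values perf printed next to the corresponding `UNC_M_CAS_COUNT.RD` lines.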
    
    Signed-off-by: Ian Rogers <irogers@google.com>
    Acked-by: Namhyung Kim <namhyung@kernel.org>
    Cc: K Prateek Nayak <kprateek.nayak@amd.com>
    Cc: Stephane Eranian <eranian@google.com>
    Cc: Kaige Ye <ye@kaige.org>
    Cc: Kajol Jain <kjain@linux.ibm.com>
    Cc: Kan Liang <kan.liang@linux.intel.com>
    Cc: John Garry <john.g.garry@oracle.com>
    Signed-off-by: Namhyung Kim <namhyung@kernel.org>
    Link: https://lore.kernel.org/r/20240221070754.4163916-2-irogers@google.com
metricgroup.c 49.1 KB