Commit f0fabf9c authored by Ravi Bangoria, committed by Arnaldo Carvalho de Melo

perf mem/c2c: Fix perf_mem_events to support powerpc

PowerPC hardware does not have a builtin latency filter (--ldlat) for
the "mem-load" event and perf_mem_events by default includes
"/ldlat=30/" which is causing a failure on PowerPC. Refactor the code to
support "perf mem/c2c" on PowerPC.

This patch depends on kernel side changes done by Madhavan:
https://lists.ozlabs.org/pipermail/linuxppc-dev/2018-December/182596.html

Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Dick Fowles <fowles@inreach.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/20190129132412.771-1-ravi.bangoria@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
parent 489338a7
@@ -19,8 +19,11 @@ C2C stands for Cache To Cache.
 The perf c2c tool provides means for Shared Data C2C/HITM analysis. It allows
 you to track down the cacheline contentions.
-The tool is based on x86's load latency and precise store facility events
-provided by Intel CPUs. These events provide:
+On x86, the tool is based on load latency and precise store facility events
+provided by Intel CPUs. On PowerPC, the tool uses random instruction sampling
+with thresholding feature.
+These events provide:
 - memory address of the access
 - type of the access (load and store details)
 - latency (in cycles) of the load access
@@ -46,7 +49,7 @@ RECORD OPTIONS
 -l::
 --ldlat::
-Configure mem-loads latency.
+Configure mem-loads latency. (x86 only)
 -k::
 --all-kernel::
@@ -119,11 +122,16 @@ Following perf record options are configured by default:
 -W,-d,--phys-data,--sample-cpu
 Unless specified otherwise with '-e' option, following events are monitored by
-default:
+default on x86:
 cpu/mem-loads,ldlat=30/P
 cpu/mem-stores/P
+and following on PowerPC:
+cpu/mem-loads/
+cpu/mem-stores/
 User can pass any 'perf record' option behind '--' mark, like (to enable
 callchains and system wide monitoring):
...
@@ -82,7 +82,7 @@ RECORD OPTIONS
 Be more verbose (show counter open errors, etc)
 --ldlat <n>::
-Specify desired latency for loads event.
+Specify desired latency for loads event. (x86 only)
 In addition, for report all perf report options are valid, and for record
 all perf record options.
...
@@ -2,6 +2,7 @@ libperf-y += header.o
 libperf-y += sym-handling.o
 libperf-y += kvm-stat.o
 libperf-y += perf_regs.o
+libperf-y += mem-events.o
 libperf-$(CONFIG_DWARF) += dwarf-regs.o
 libperf-$(CONFIG_DWARF) += skip-callchain-idx.o
...
+// SPDX-License-Identifier: GPL-2.0
+#include "mem-events.h"
+/* PowerPC does not support 'ldlat' parameter. */
+char *perf_mem_events__name(int i)
+{
+	if (i == PERF_MEM_EVENTS__LOAD)
+		return (char *) "cpu/mem-loads/";
+	return (char *) "cpu/mem-stores/";
+}
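
For context, the string returned by this function is what follows `-e` on the `perf record` command line that `perf mem` and `perf c2c` assemble. The block below is a minimal sketch of that consumption, not the code in the tree. `powerpc_mem_events_name` and `add_mem_event_args` are hypothetical names, the enum is re-declared locally so the block compiles on its own, and the strings are returned as `const char *` instead of being cast to `char *`.

```c
#include <stddef.h>

/* Local stand-in for the enum declared in perf's mem-events.h. */
enum { PERF_MEM_EVENTS__LOAD, PERF_MEM_EVENTS__STORE, PERF_MEM_EVENTS__MAX };

/* Same logic as the PowerPC override: plain event names, no ldlat term. */
static const char *powerpc_mem_events_name(int i)
{
	if (i == PERF_MEM_EVENTS__LOAD)
		return "cpu/mem-loads/";
	return "cpu/mem-stores/";
}

/*
 * Hypothetical helper: append "-e <event>" for every memory event.
 * Returns the number of argv slots written, or 0 if cap is too small.
 */
static size_t add_mem_event_args(const char **argv, size_t cap,
				 const char *(*name)(int))
{
	size_t n = 0;

	if (cap < 2 * (size_t)PERF_MEM_EVENTS__MAX)
		return 0;

	for (int i = 0; i < PERF_MEM_EVENTS__MAX; i++) {
		argv[n++] = "-e";
		argv[n++] = name(i);
	}
	return n;
}
```

Passing the name lookup as a function pointer is only a device for keeping the sketch in one file; in perf itself the selection happens at link time.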
@@ -28,7 +28,7 @@ struct perf_mem_event perf_mem_events[PERF_MEM_EVENTS__MAX] = {
 static char mem_loads_name[100];
 static bool mem_loads_name__init;
-char *perf_mem_events__name(int i)
+char * __weak perf_mem_events__name(int i)
 {
 	if (i == PERF_MEM_EVENTS__LOAD) {
 		if (!mem_loads_name__init) {
...
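
The `__weak` annotation turns the generic function into a fallback: when an architecture object file defines a non-weak `perf_mem_events__name`, the linker picks that definition and the generic one is dropped. The block below sketches what such a weak default might look like. It is an illustration under assumptions, not the elided body of the hunk: `mem_events_name_sketch` and `loads_ldlat` are hypothetical names, the format string is inferred from the documented default `cpu/mem-loads,ldlat=30/P`, and `__attribute__((weak))` (GCC/Clang) is spelled out because perf's `__weak` macro is not available outside the tree.

```c
#include <stdbool.h>
#include <stdio.h>

/* Local stand-in for the enum declared in perf's mem-events.h. */
enum { PERF_MEM_EVENTS__LOAD, PERF_MEM_EVENTS__STORE, PERF_MEM_EVENTS__MAX };

/* Assumed default threshold, matching the documented "ldlat=30". */
static unsigned int loads_ldlat = 30;

static char mem_loads_name[100];
static bool mem_loads_name__init;

/*
 * Weak default: a non-weak definition of the same symbol in another
 * object file would replace this one at link time.
 */
__attribute__((weak)) const char *mem_events_name_sketch(int i)
{
	if (i == PERF_MEM_EVENTS__LOAD) {
		/* Format the load event name once, on first use. */
		if (!mem_loads_name__init) {
			mem_loads_name__init = true;
			snprintf(mem_loads_name, sizeof(mem_loads_name),
				 "cpu/mem-loads,ldlat=%u/P", loads_ldlat);
		}
		return mem_loads_name;
	}
	return "cpu/mem-stores/P";
}
```

Because the name is formatted lazily and cached, a change to `loads_ldlat` after the first call has no effect in this sketch.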