Commit 6673016f authored by Robert Walker's avatar Robert Walker Committed by Arnaldo Carvalho de Melo

coresight: Update documentation for perf usage

Add notes on using perf to collect and analyze CoreSight trace
Signed-off-by: default avatarRobert Walker <robert.walker@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1518607481-4059-4-git-send-email-robert.walker@arm.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
parent 256e751c
...@@ -330,3 +330,54 @@ Details on how to use the generic STM API can be found here [2]. ...@@ -330,3 +330,54 @@ Details on how to use the generic STM API can be found here [2].
[1]. Documentation/ABI/testing/sysfs-bus-coresight-devices-stm [1]. Documentation/ABI/testing/sysfs-bus-coresight-devices-stm
[2]. Documentation/trace/stm.txt [2]. Documentation/trace/stm.txt
Using perf tools
----------------
perf can be used to record and analyze trace of programs.
Execution can be recorded using 'perf record' with the cs_etm event,
specifying the name of the sink to record to, e.g:
perf record -e cs_etm/@20070000.etr/u --per-thread
The 'perf report' and 'perf script' commands can be used to analyze execution,
synthesizing instruction and branch events from the instruction trace.
'perf inject' can be used to replace the trace data with the synthesized events.
The --itrace option controls the type and frequency of synthesized events
(see perf documentation).
Note that only 64-bit programs are currently supported - further work is
required to support instruction decode of 32-bit Arm programs.
Generating coverage files for Feedback Directed Optimization: AutoFDO
---------------------------------------------------------------------
'perf inject' accepts the --itrace option in which case tracing data is
removed and replaced with the synthesized events. e.g.
perf inject --itrace --strip -i perf.data -o perf.data.new
Below is an example of using ARM ETM for autoFDO. It requires autofdo
(https://github.com/google/autofdo) and gcc version 5. The bubble
sort example is from the AutoFDO tutorial (https://gcc.gnu.org/wiki/AutoFDO/Tutorial).
$ gcc-5 -O3 sort.c -o sort
$ taskset -c 2 ./sort
Bubble sorting array of 30000 elements
5910 ms
$ perf record -e cs_etm/@20070000.etr/u --per-thread taskset -c 2 ./sort
Bubble sorting array of 30000 elements
12543 ms
[ perf record: Woken up 35 times to write data ]
[ perf record: Captured and wrote 69.640 MB perf.data ]
$ perf inject -i perf.data -o inj.data --itrace=il64 --strip
$ create_gcov --binary=./sort --profile=inj.data --gcov=sort.gcov -gcov_version=1
$ gcc-5 -O3 -fauto-profile=sort.gcov sort.c -o sort_autofdo
$ taskset -c 2 ./sort_autofdo
Bubble sorting array of 30000 elements
5806 ms
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment