Commit 0513e464 authored by Linus Torvalds

Merge tag 'perf-tools-fixes-for-v5.15-2021-09-27' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull more perf tools fixes from Arnaldo Carvalho de Melo:

 - Fix 'perf test' DWARF unwind for optimized builds.

 - Fix 'perf test' 'Object code reading' when dealing with samples in
   @plt symbols.

 - Fix off-by-one directory paths in the ARM support code.

 - Fix error message to eliminate confusion in 'perf config' when first
   creating a config file.

 - 'perf iostat' fix for system wide operation.

 - Fix printing of metrics when 'perf iostat' is used with one or more
   iio_root_ports and unconnected cpus (using -C).

 - Fix several typos in the documentation files.

 - Fix spelling mistake "icach" -> "icache" in the power8 JSON vendor
   files.

* tag 'perf-tools-fixes-for-v5.15-2021-09-27' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  perf iostat: Fix Segmentation fault from NULL 'struct perf_counts_values *'
  perf iostat: Use system-wide mode if the target cpu_list is unspecified
  perf config: Refine error message to eliminate confusion
  perf doc: Fix typos all over the place
  perf arm: Fix off-by-one directory paths.
  perf vendor events powerpc: Fix spelling mistake "icach" -> "icache"
  perf tests: Fix flaky test 'Object code reading'
  perf test: Fix DWARF unwind for optimized builds.
parents 9cccec2b 4da8b121
@@ -164,7 +164,7 @@ const char unwinding_data[n]: an array of unwinding data, consisting of the EH F
 The EH Frame header follows the Linux Standard Base (LSB) specification as described in the document at https://refspecs.linuxfoundation.org/LSB_1.3.0/gLSB/gLSB/ehframehdr.html
-The EH Frame follows the LSB specicfication as described in the document at https://refspecs.linuxbase.org/LSB_3.0.0/LSB-PDA/LSB-PDA/ehframechpt.html
+The EH Frame follows the LSB specification as described in the document at https://refspecs.linuxbase.org/LSB_3.0.0/LSB-PDA/LSB-PDA/ehframechpt.html
 NOTE: The mapped_size is generally either the same as unwind_data_size (if the unwinding data was mapped in memory by the running process) or zero (if the unwinding data is not mapped by the process). If the unwinding data was not mapped, then only the EH Frame Header will be read, which can be used to specify FP based unwinding for a function which does not have unwinding information.
@@ -261,7 +261,7 @@ COALESCE
 User can specify how to sort offsets for cacheline.
 Following fields are available and governs the final
-output fields set for caheline offsets output:
+output fields set for cacheline offsets output:
 tid - coalesced by process TIDs
 pid - coalesced by process PIDs
......
@@ -883,7 +883,7 @@ and "r" can be combined to get calls and returns.
 "Transactions" events correspond to the start or end of transactions. The
 'flags' field can be used in perf script to determine whether the event is a
-tranasaction start, commit or abort.
+transaction start, commit or abort.
 Note that "instructions", "branches" and "transactions" events depend on code
 flow packets which can be disabled by using the config term "branch=0". Refer
......
@@ -44,7 +44,7 @@ COMMON OPTIONS
 -f::
 --force::
-Don't complan, do it.
+Don't complain, do it.
 REPORT OPTIONS
 --------------
......
@@ -54,7 +54,7 @@ all sched_wakeup events in the system:
 Traces meant to be processed using a script should be recorded with
 the above option: -a to enable system-wide collection.
-The format file for the sched_wakep event defines the following fields
+The format file for the sched_wakeup event defines the following fields
 (see /sys/kernel/debug/tracing/events/sched/sched_wakeup/format):
 ----
......
@@ -448,7 +448,7 @@ all sched_wakeup events in the system:
 Traces meant to be processed using a script should be recorded with
 the above option: -a to enable system-wide collection.
-The format file for the sched_wakep event defines the following fields
+The format file for the sched_wakeup event defines the following fields
 (see /sys/kernel/debug/tracing/events/sched/sched_wakeup/format):
 ----
......
@@ -385,7 +385,7 @@ Aggregate counts per physical processor for system-wide mode measurements.
 Print metrics or metricgroups specified in a comma separated list.
 For a group all metrics from the group are added.
 The events from the metrics are automatically measured.
-See perf list output for the possble metrics and metricgroups.
+See perf list output for the possible metrics and metricgroups.
 -A::
 --no-aggr::
......
@@ -2,7 +2,7 @@ Using TopDown metrics in user space
 -----------------------------------
 Intel CPUs (since Sandy Bridge and Silvermont) support a TopDown
-methology to break down CPU pipeline execution into 4 bottlenecks:
+methodology to break down CPU pipeline execution into 4 bottlenecks:
 frontend bound, backend bound, bad speculation, retiring.
 For more details on Topdown see [1][5]
......
@@ -8,10 +8,10 @@
 #include <linux/coresight-pmu.h>
 #include <linux/zalloc.h>
-#include "../../util/auxtrace.h"
-#include "../../util/debug.h"
-#include "../../util/evlist.h"
-#include "../../util/pmu.h"
+#include "../../../util/auxtrace.h"
+#include "../../../util/debug.h"
+#include "../../../util/evlist.h"
+#include "../../../util/pmu.h"
 #include "cs-etm.h"
 #include "arm-spe.h"
......
@@ -16,19 +16,19 @@
 #include <linux/zalloc.h>
 #include "cs-etm.h"
-#include "../../util/debug.h"
-#include "../../util/record.h"
-#include "../../util/auxtrace.h"
-#include "../../util/cpumap.h"
-#include "../../util/event.h"
-#include "../../util/evlist.h"
-#include "../../util/evsel.h"
-#include "../../util/perf_api_probe.h"
-#include "../../util/evsel_config.h"
-#include "../../util/pmu.h"
-#include "../../util/cs-etm.h"
+#include "../../../util/debug.h"
+#include "../../../util/record.h"
+#include "../../../util/auxtrace.h"
+#include "../../../util/cpumap.h"
+#include "../../../util/event.h"
+#include "../../../util/evlist.h"
+#include "../../../util/evsel.h"
+#include "../../../util/perf_api_probe.h"
+#include "../../../util/evsel_config.h"
+#include "../../../util/pmu.h"
+#include "../../../util/cs-etm.h"
 #include <internal/lib.h> // page_size
-#include "../../util/session.h"
+#include "../../../util/session.h"
 #include <errno.h>
 #include <stdlib.h>
......
 // SPDX-License-Identifier: GPL-2.0
-#include "../../util/perf_regs.h"
+#include "../../../util/perf_regs.h"
 const struct sample_reg sample_reg_masks[] = {
 	SMPL_REG_END
......
@@ -10,7 +10,7 @@
 #include <linux/string.h>
 #include "arm-spe.h"
-#include "../../util/pmu.h"
+#include "../../../util/pmu.h"
 struct perf_event_attr
 *perf_pmu__get_default_config(struct perf_pmu *pmu __maybe_unused)
......
 // SPDX-License-Identifier: GPL-2.0
 #include <elfutils/libdwfl.h>
-#include "../../util/unwind-libdw.h"
-#include "../../util/perf_regs.h"
-#include "../../util/event.h"
+#include "../../../util/unwind-libdw.h"
+#include "../../../util/perf_regs.h"
+#include "../../../util/event.h"
 bool libdw__arch_set_initial_registers(Dwfl_Thread *thread, void *arg)
 {
......
@@ -3,8 +3,8 @@
 #include <errno.h>
 #include <libunwind.h>
 #include "perf_regs.h"
-#include "../../util/unwind.h"
-#include "../../util/debug.h"
+#include "../../../util/unwind.h"
+#include "../../../util/debug.h"
 int libunwind__arch_reg_id(int regnum)
 {
......
@@ -432,7 +432,7 @@ void iostat_print_metric(struct perf_stat_config *config, struct evsel *evsel,
 	u8 die = ((struct iio_root_port *)evsel->priv)->die;
 	struct perf_counts_values *count = perf_counts(evsel->counts, die, 0);
-	if (count->run && count->ena) {
+	if (count && count->run && count->ena) {
 		if (evsel->prev_raw_counts && !out->force_header) {
 			struct perf_counts_values *prev_count =
 				perf_counts(evsel->prev_raw_counts, die, 0);
......
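The hunk above adds a `count &&` test before the pointer returned by `perf_counts()` is dereferenced; the commit title indicates that this pointer can be NULL. The following is a minimal sketch of the same guard pattern, not perf's actual code. The type `counts_values` and the helpers `lookup_counts()` and `counts_usable()` are hypothetical names introduced only for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for perf's 'struct perf_counts_values'. */
struct counts_values {
	uint64_t val;
	uint64_t ena;
	uint64_t run;
};

/* Hypothetical lookup: returns NULL when no slot exists for 'idx'. */
static const struct counts_values *lookup_counts(const struct counts_values *table,
						 size_t nr, size_t idx)
{
	return idx < nr ? &table[idx] : NULL;
}

/*
 * '&&' evaluates left to right and stops at the first false operand,
 * so 'c' is tested before 'c->run' and 'c->ena' are ever read.
 */
static bool counts_usable(const struct counts_values *c)
{
	return c && c->run && c->ena;
}
```

With this shape, a caller can pass the lookup result straight to `counts_usable()` and skip the metric when it returns false, instead of crashing on a missing slot.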
@@ -2408,6 +2408,8 @@ int cmd_stat(int argc, const char **argv)
 			goto out;
 		} else if (verbose)
 			iostat_list(evsel_list, &stat_config);
+		if (iostat_mode == IOSTAT_RUN && !target__has_cpu(&target))
+			target.system_wide = true;
 	}
 	if (add_default_attributes())
......
@@ -1046,7 +1046,7 @@
   {
     "EventCode": "0x4e010",
     "EventName": "PM_GCT_NOSLOT_IC_L3MISS",
-    "BriefDescription": "Gct empty for this thread due to icach l3 miss",
+    "BriefDescription": "Gct empty for this thread due to icache l3 miss",
     "PublicDescription": ""
   },
   {
......
@@ -229,8 +229,8 @@ static int read_object_code(u64 addr, size_t len, u8 cpumode,
 			    struct thread *thread, struct state *state)
 {
 	struct addr_location al;
-	unsigned char buf1[BUFSZ];
-	unsigned char buf2[BUFSZ];
+	unsigned char buf1[BUFSZ] = {0};
+	unsigned char buf2[BUFSZ] = {0};
 	size_t ret_len;
 	u64 objdump_addr;
 	const char *objdump_name;
......
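The hunk above zero-initializes both comparison buffers. This page does not state why that cures the flakiness. One plausible reading, and it is only an assumption, is that a later comparison can cover bytes that a reader never filled. The sketch below illustrates that general C behavior. `fill_prefix()` and `partial_reads_match()` are hypothetical helpers, and the `BUFSZ` value is chosen only for the example.

```c
#include <stddef.h>
#include <string.h>

#define BUFSZ 16 /* hypothetical size, chosen only for this example */

/* Hypothetical reader that fills only the first 'n' bytes of 'buf'. */
static void fill_prefix(unsigned char *buf, size_t n, unsigned char byte)
{
	memset(buf, byte, n < BUFSZ ? n : BUFSZ);
}

/* Returns 1 when two partially filled buffers compare equal over BUFSZ bytes. */
static int partial_reads_match(size_t n)
{
	/* '= {0}' zeroes every element, so the unfilled tails are identical. */
	unsigned char buf1[BUFSZ] = {0};
	unsigned char buf2[BUFSZ] = {0};

	fill_prefix(buf1, n, 0x90);
	fill_prefix(buf2, n, 0x90);
	return memcmp(buf1, buf2, BUFSZ) == 0;
}
```

Without the initializers, the bytes past `n` would hold indeterminate values and the full-length `memcmp()` result could vary from run to run.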
@@ -20,6 +20,23 @@
 /* For bsearch. We try to unwind functions in shared object. */
 #include <stdlib.h>
+/*
+ * The test will assert frames are on the stack but tail call optimizations lose
+ * the frame of the caller. Clang can disable this optimization on a called
+ * function but GCC currently (11/2020) lacks this attribute. The barrier is
+ * used to inhibit tail calls in these cases.
+ */
+#ifdef __has_attribute
+#if __has_attribute(disable_tail_calls)
+#define NO_TAIL_CALL_ATTRIBUTE __attribute__((disable_tail_calls))
+#define NO_TAIL_CALL_BARRIER
+#endif
+#endif
+#ifndef NO_TAIL_CALL_ATTRIBUTE
+#define NO_TAIL_CALL_ATTRIBUTE
+#define NO_TAIL_CALL_BARRIER __asm__ __volatile__("" : : : "memory");
+#endif
 static int mmap_handler(struct perf_tool *tool __maybe_unused,
 			union perf_event *event,
 			struct perf_sample *sample,
@@ -91,7 +108,7 @@ static int unwind_entry(struct unwind_entry *entry, void *arg)
 	return strcmp((const char *) symbol, funcs[idx]);
 }
-noinline int test_dwarf_unwind__thread(struct thread *thread)
+NO_TAIL_CALL_ATTRIBUTE noinline int test_dwarf_unwind__thread(struct thread *thread)
 {
 	struct perf_sample sample;
 	unsigned long cnt = 0;
@@ -122,7 +139,7 @@ noinline int test_dwarf_unwind__thread(struct thread *thread)
 static int global_unwind_retval = -INT_MAX;
-noinline int test_dwarf_unwind__compare(void *p1, void *p2)
+NO_TAIL_CALL_ATTRIBUTE noinline int test_dwarf_unwind__compare(void *p1, void *p2)
 {
 	/* Any possible value should be 'thread' */
 	struct thread *thread = *(struct thread **)p1;
@@ -141,7 +158,7 @@ noinline int test_dwarf_unwind__compare(void *p1, void *p2)
 	return p1 - p2;
 }
-noinline int test_dwarf_unwind__krava_3(struct thread *thread)
+NO_TAIL_CALL_ATTRIBUTE noinline int test_dwarf_unwind__krava_3(struct thread *thread)
 {
 	struct thread *array[2] = {thread, thread};
 	void *fp = &bsearch;
@@ -160,14 +177,22 @@ noinline int test_dwarf_unwind__krava_3(struct thread *thread)
 	return global_unwind_retval;
 }
-noinline int test_dwarf_unwind__krava_2(struct thread *thread)
+NO_TAIL_CALL_ATTRIBUTE noinline int test_dwarf_unwind__krava_2(struct thread *thread)
 {
-	return test_dwarf_unwind__krava_3(thread);
+	int ret;
+	ret = test_dwarf_unwind__krava_3(thread);
+	NO_TAIL_CALL_BARRIER;
+	return ret;
 }
-noinline int test_dwarf_unwind__krava_1(struct thread *thread)
+NO_TAIL_CALL_ATTRIBUTE noinline int test_dwarf_unwind__krava_1(struct thread *thread)
 {
-	return test_dwarf_unwind__krava_2(thread);
+	int ret;
+	ret = test_dwarf_unwind__krava_2(thread);
+	NO_TAIL_CALL_BARRIER;
+	return ret;
 }
 int test__dwarf_unwind(struct test *test __maybe_unused, int subtest __maybe_unused)
......
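The hunk above keeps caller frames on the stack by preventing the compiler from turning a trailing call into a jump. The following standalone sketch shows the same pattern outside perf. The functions `leaf()`, `middle()` and `outer()` are hypothetical. Unlike the macro in the diff, the barrier macro here is defined without a trailing semicolon, and the GCC-style `__attribute__((noinline))` is spelled out because perf's `noinline` shorthand is not available here.

```c
/* Prefer Clang's attribute when the compiler reports support for it. */
#ifdef __has_attribute
#if __has_attribute(disable_tail_calls)
#define NO_TAIL_CALL_ATTRIBUTE __attribute__((disable_tail_calls))
#define NO_TAIL_CALL_BARRIER
#endif
#endif
/* Otherwise fall back to an empty asm statement with a memory clobber. */
#ifndef NO_TAIL_CALL_ATTRIBUTE
#define NO_TAIL_CALL_ATTRIBUTE
#define NO_TAIL_CALL_BARRIER __asm__ __volatile__("" : : : "memory")
#endif

/* Hypothetical innermost function. */
NO_TAIL_CALL_ATTRIBUTE __attribute__((noinline)) static int leaf(int x)
{
	return x + 1;
}

/*
 * The call is no longer the last operation in the function: the barrier
 * sits between the call and the return, so in the fallback case the call
 * cannot be emitted as a tail jump.
 */
NO_TAIL_CALL_ATTRIBUTE __attribute__((noinline)) static int middle(int x)
{
	int ret;

	ret = leaf(x);
	NO_TAIL_CALL_BARRIER;
	return ret;
}

NO_TAIL_CALL_ATTRIBUTE __attribute__((noinline)) static int outer(int x)
{
	int ret;

	ret = middle(x);
	NO_TAIL_CALL_BARRIER;
	return ret;
}
```

The barrier emits no instructions and does not change the return value. Its only purpose is to give an unwinder a frame for each of `outer()`, `middle()` and `leaf()` in optimized builds.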
@@ -801,7 +801,7 @@ int perf_config_set(struct perf_config_set *set,
 				 section->name, item->name);
 			ret = fn(key, value, data);
 			if (ret < 0) {
-				pr_err("Error: wrong config key-value pair %s=%s\n",
+				pr_err("Error in the given config file: wrong config key-value pair %s=%s\n",
 				       key, value);
 				/*
 				 * Can't be just a 'break', as perf_config_set__for_each_entry()
......