1. 24 Nov, 2022 16 commits
  2. 23 Nov, 2022 21 commits
    • Namhyung Kim's avatar
      perf lock contention: Do not use BPF task local storage · c66a36af
      Namhyung Kim authored
      It caused some trouble when a lock inside kmalloc is contended
      because task local storage would allocate memory using kmalloc.
      It'd create a recursion and even crash my system.
      
      There could be a couple of workarounds but I think the simplest
      one is to use a pre-allocated hash map.  We could fix the task
      local storage to use the safe BPF allocator, but it takes time
      so let's make this change until that actually happens.
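As a rough illustration of why a pre-allocated map avoids the recursion, here is a userspace analogy in plain C. It is not the BPF code from this commit: the table size, the `tstamp_*` names and the hash constant are all hypothetical. The point it sketches is that every slot is reserved up front, so the update path never calls an allocator and therefore cannot re-enter a lock held inside one.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capacity; must be a power of two for the index mask below. */
#define TSTAMP_SLOTS 1024u

struct tstamp_slot {
	int32_t  pid;        /* key: thread id, 0 means the slot is free */
	uint64_t timestamp;  /* value: time at which contention began */
};

/* All storage is reserved statically, so no update ever allocates memory. */
static struct tstamp_slot tstamp_table[TSTAMP_SLOTS];

/* Return the slot holding pid, or the free slot where it would go.
 * Returns NULL when the table is full and pid is not present. */
static struct tstamp_slot *tstamp_find(int32_t pid)
{
	uint32_t start = (uint32_t)pid * 2654435761u; /* multiplicative hash */

	for (uint32_t i = 0; i < TSTAMP_SLOTS; i++) {
		struct tstamp_slot *s =
			&tstamp_table[(start + i) & (TSTAMP_SLOTS - 1)];

		if (s->pid == pid || s->pid == 0)
			return s;
	}
	return NULL;
}

/* Insert or overwrite the timestamp for pid (pid must be non-zero). */
int tstamp_update(int32_t pid, uint64_t now)
{
	struct tstamp_slot *s = tstamp_find(pid);

	if (s == NULL)
		return -1; /* table full: drop the sample instead of allocating */
	s->pid = pid;
	s->timestamp = now;
	return 0;
}

/* Fetch the timestamp stored for pid; returns -1 if there is none. */
int tstamp_lookup(int32_t pid, uint64_t *out)
{
	struct tstamp_slot *s = tstamp_find(pid);

	if (s == NULL || s->pid != pid)
		return -1;
	*out = s->timestamp;
	return 0;
}
```

The trade-off in this sketch is a fixed capacity: when the table is full, new entries are dropped rather than allocated. Deletion is left out to keep the linear probing simple.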
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Blake Jones <blakejones@google.com>
      Cc: Chris Li <chriscli@google.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <song@kernel.org>
      Cc: bpf@vger.kernel.org
      Link: https://lore.kernel.org/r/20221118190109.1512674-1-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      c66a36af
    • Michael Petlan's avatar
      perf test: Fix record test on KVM guests · 2e9f5bda
      Michael Petlan authored
      Using the precise flag with br_inst_retired.near_call causes the test
      to fail on KVM guests, even when the guests have PMU forwarding enabled
      and the event itself is supported.
      
      Remove the precise flag in order to make the test work on KVM guests.
      Signed-off-by: Michael Petlan <mpetlan@redhat.com>
      Acked-by: Ian Rogers <irogers@google.com>
      Link: https://lore.kernel.org/r/20221122083121.6012-1-mpetlan@redhat.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      2e9f5bda
    • Namhyung Kim's avatar
      perf inject: Set PERF_RECORD_MISC_BUILD_ID_SIZE · 19030564
      Namhyung Kim authored
      With perf inject -b, it synthesizes build-id events for DSOs.  But it
      failed to set the size, which resulted in trailing zeros.
      
      As perf record sets the size in write_build_id(), let's set the size
      here as well.
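The effect of the missing size can be sketched in plain C. This is a simplified, hypothetical mirror of a build-id event, not perf's real structure: the `demo_` names, the 20-byte field width and the flag value are assumptions made for illustration. When the flag is not set, a reader in this sketch has to assume the full fixed-width field, so a shorter build-id shows up padded with zeros.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_BUILD_ID_MAX 20                /* assumed fixed field width */
#define DEMO_MISC_BUILD_ID_SIZE (1u << 15)  /* stand-in for PERF_RECORD_MISC_BUILD_ID_SIZE */

/* Hypothetical, simplified build-id event. */
struct demo_build_id_event {
	uint16_t misc;                     /* flag bits */
	uint8_t  data[DEMO_BUILD_ID_MAX];  /* build-id bytes, zero padded */
	uint8_t  size;                     /* valid only when the flag is set */
};

/* Fill in an event; set_size selects the fixed (1) or buggy (0) behaviour. */
void demo_synthesize(struct demo_build_id_event *ev,
		     const uint8_t *id, size_t len, int set_size)
{
	if (len > DEMO_BUILD_ID_MAX)
		len = DEMO_BUILD_ID_MAX;

	memset(ev, 0, sizeof(*ev));
	memcpy(ev->data, id, len);
	if (set_size) {
		ev->size = (uint8_t)len;
		ev->misc = (uint16_t)(ev->misc | DEMO_MISC_BUILD_ID_SIZE);
	}
}

/* How many bytes a reader treats as the build-id. */
size_t demo_build_id_len(const struct demo_build_id_event *ev)
{
	if (ev->misc & DEMO_MISC_BUILD_ID_SIZE)
		return ev->size;
	return DEMO_BUILD_ID_MAX; /* no size recorded: assume the whole field */
}
```

With a 16-byte build-id, the variant without the flag reports 20 bytes, the last four being zero padding; with the flag it reports 16.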
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20221119002750.1568027-1-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      19030564
    • Naveen N. Rao's avatar
      perf test: Skip watchpoint tests if no watchpoints available · 7d54a4ac
      Naveen N. Rao authored
      On IBM Power9, perf watchpoint tests fail since no hardware breakpoints
      are available. Detect this by checking the error returned by
      perf_event_open() and skip the tests in that case.
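A minimal sketch of the skip logic in plain C follows. The result codes and the `demo_classify_open()` helper are hypothetical, and treating `ENODEV` as the "no hardware breakpoints" error is an assumption, since the commit message does not name the errno value it checks.

```c
#include <errno.h>

/* Hypothetical test result codes for this sketch. */
enum demo_result {
	DEMO_OK   = 0,
	DEMO_FAIL = -1,
	DEMO_SKIP = -2,
};

/*
 * Classify the outcome of opening a watchpoint event.
 * fd_or_negerr is a file descriptor (>= 0) on success, or a negated
 * errno value on failure, as a wrapper around perf_event_open() might
 * return it.
 */
enum demo_result demo_classify_open(int fd_or_negerr)
{
	if (fd_or_negerr >= 0)
		return DEMO_OK;

	/* Assumption: ENODEV signals that no hardware breakpoint exists. */
	if (fd_or_negerr == -ENODEV)
		return DEMO_SKIP;

	return DEMO_FAIL; /* any other error is a genuine test failure */
}
```

Keeping the classification in one helper means each watchpoint subtest can map the open result to skip or fail in the same way.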
      Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com>
      Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Acked-by: Ian Rogers <irogers@google.com>
      Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
      Tested-by: Kajol Jain <kjain@linux.ibm.com>
      Link: https://lore.kernel.org/r/20221121102747.208289-1-naveen.n.rao@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: linux-kernel@vger.kernel.org
      Cc: linux-perf-users@vger.kernel.org
      7d54a4ac
    • Leo Yan's avatar
      perf trace: Remove unused bpf map 'syscalls' · 8daf87f5
      Leo Yan authored
      augmented_raw_syscalls.c defines the bpf map 'syscalls', which is
      initialized by the perf tool in user space to indicate which system
      calls are enabled for tracing; on the other side, the eBPF program
      relies on the map to filter out the trace events which are not enabled.
      
      The map also includes a field 'string_args_len[6]' which gives the
      string length if the corresponding argument is a string type.
      
      Now the map 'syscalls' is not used: the bpf program doesn't use it as
      a filter anymore, since this is replaced by the function
      bpf_tail_call() and the PROG_ARRAY syscalls map.  And we don't need to
      explicitly set the string length anymore, because
      bpf_probe_read_str() copies the string and returns its length.
      
      Therefore, it's safe to remove the bpf map 'syscalls'.
      
      To consolidate the code, this patch removes the definition of map
      'syscalls' from augmented_raw_syscalls.c and drops code for using
      the map in the perf trace.
      
      Note, since function trace__set_ev_qualifier_bpf_filter() is removed,
      the call to trace__init_syscall_bpf_progs() from it is also removed.
      We don't need to worry about this because
      trace__init_syscall_bpf_progs() is still invoked from
      trace__init_syscalls_bpf_prog_array_maps() to initialize the system
      call's bpf program callback.
      
      After:
      
        # perf trace -e examples/bpf/augmented_raw_syscalls.c,open* --max-events 10 perf stat --quiet sleep 0.001
        openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
        openat(AT_FDCWD, "/lib/aarch64-linux-gnu/libm.so.6", O_RDONLY|O_CLOEXEC) = 3
        openat(AT_FDCWD, "/lib/aarch64-linux-gnu/libelf.so.1", O_RDONLY|O_CLOEXEC) = 3
        openat(AT_FDCWD, "/lib/aarch64-linux-gnu/libdw.so.1", O_RDONLY|O_CLOEXEC) = 3
        openat(AT_FDCWD, "/lib/aarch64-linux-gnu/libunwind.so.8", O_RDONLY|O_CLOEXEC) = 3
        openat(AT_FDCWD, "/lib/aarch64-linux-gnu/libunwind-aarch64.so.8", O_RDONLY|O_CLOEXEC) = 3
        openat(AT_FDCWD, "/lib/aarch64-linux-gnu/libcrypto.so.3", O_RDONLY|O_CLOEXEC) = 3
        openat(AT_FDCWD, "/lib/aarch64-linux-gnu/libslang.so.2", O_RDONLY|O_CLOEXEC) = 3
        openat(AT_FDCWD, "/lib/aarch64-linux-gnu/libperl.so.5.34", O_RDONLY|O_CLOEXEC) = 3
        openat(AT_FDCWD, "/lib/aarch64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
      
        # perf trace -e examples/bpf/augmented_raw_syscalls.c --max-events 10 perf stat --quiet sleep 0.001
        ... [continued]: execve())             = 0
        brk(NULL)                               = 0xaaaab1d28000
        faccessat(-100, "/etc/ld.so.preload", 4) = -1 ENOENT (No such file or directory)
        openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
        close(3</usr/lib/aarch64-linux-gnu/libcrypto.so.3>) = 0
        openat(AT_FDCWD, "/lib/aarch64-linux-gnu/libm.so.6", O_RDONLY|O_CLOEXEC) = 3
        read(3</usr/lib/aarch64-linux-gnu/libcrypto.so.3>, 0xfffff33f70d0, 832) = 832
        munmap(0xffffb5519000, 28672)           = 0
        munmap(0xffffb55b7000, 32880)           = 0
        mprotect(0xffffb55a6000, 61440, PROT_NONE) = 0
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: bpf@vger.kernel.org
      Link: https://lore.kernel.org/r/20221121075237.127706-6-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      8daf87f5
    • Leo Yan's avatar
      perf augmented_raw_syscalls: Remove unused variable 'syscall' · 9bc427a0
      Leo Yan authored
      The local variable 'syscall' is not used anymore, remove it.
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: bpf@vger.kernel.org
      Link: https://lore.kernel.org/r/20221121075237.127706-5-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      9bc427a0
    • Leo Yan's avatar
      perf trace: Handle failure when trace point folder is missed · 03e9a5d8
      Leo Yan authored
      On Arm64, perf can fail to find the corresponding trace point folder
      for system calls listed in the table 'syscalltbl_arm64', e.g. the
      generated system call table contains "lookup_dcookie" but there is no
      matching trace point folder for it.
      
      We need to figure out whether there is an issue with the generated
      system call table; on the other hand, we need to handle the case when
      the trace point folder is missing under sysfs.  This patch sets the
      flag syscall::nonexistent to true and returns the error from
      trace__read_syscall_info().
      
      Another problem is that trace__syscall_info() returns two different
      values if a system call doesn't exist: the first call to
      trace__syscall_info() returns NULL when the system call doesn't exist,
      but a later call for the same missing system call returns a pointer to
      the syscall.  trace__syscall_info() checks the condition
      'syscalls.table[id].name == NULL', but the name is assigned during the
      first invocation even if the system call is not found.
      
      So checking the system call's name in trace__syscall_info() is not the
      right thing to do.  This patch simply checks the flag
      syscall::nonexistent to decide whether a system call exists, so that
      trace__syscall_info() returns a consistent result (NULL) if a system
      call doesn't exist.
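The lookup behaviour described above can be sketched with a small, hypothetical table in plain C. The `demo_` names, the table size and the rule that odd ids have no trace point folder are all made up for illustration; only the idea of testing a `nonexistent` flag instead of the name comes from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_ID 4 /* hypothetical highest syscall id */

struct demo_syscall {
	const char *name;        /* assigned on the first probe, even on failure */
	bool        nonexistent; /* set when no trace point folder was found */
};

static struct demo_syscall demo_table[DEMO_MAX_ID + 1];

/* Hypothetical probe: pretend only even ids have a trace point folder. */
static int demo_read_info(int id)
{
	struct demo_syscall *sc = &demo_table[id];

	sc->name = "demo";
	if (id % 2 != 0) {
		sc->nonexistent = true;
		return -1; /* report the failure to the caller */
	}
	return 0;
}

/* Return the entry for id, or NULL every time the syscall is missing. */
struct demo_syscall *demo_syscall_info(int id)
{
	struct demo_syscall *sc;

	if (id < 0 || id > DEMO_MAX_ID)
		return NULL;

	sc = &demo_table[id];
	if (sc->nonexistent)
		return NULL; /* the flag, not the name, decides existence */

	if (sc->name == NULL && demo_read_info(id) != 0)
		return NULL;

	return sc;
}
```

Without the flag check, a second lookup of a missing id would see a non-NULL name and return the entry, which is the inconsistency the message describes.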
      
      Fixes: b8b1033f ("perf trace: Mark syscall ids that are not allocated to avoid unnecessary error messages")
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: bpf@vger.kernel.org
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20221121075237.127706-4-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      03e9a5d8
    • Leo Yan's avatar
      perf trace: Return error if a system call doesn't exist · d4223e17
      Leo Yan authored
      When a system call is not detected, either because the system call ID
      is out of range or because the corresponding path cannot be found in
      sysfs, trace__read_syscall_info() returns zero.  Not returning an
      error value is confusing for the caller.
      
      This patch makes the function trace__read_syscall_info() return
      -EEXIST when a system call doesn't exist.
      
      Fixes: b8b1033f ("perf trace: Mark syscall ids that are not allocated to avoid unnecessary error messages")
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: bpf@vger.kernel.org
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20221121075237.127706-3-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      d4223e17
    • Leo Yan's avatar
      perf trace: Use macro RAW_SYSCALL_ARGS_NUM to replace number · eadcab4c
      Leo Yan authored
      This patch defines a macro RAW_SYSCALL_ARGS_NUM to replace the open
      coded number '6'.
      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: bpf@vger.kernel.org
      Link: https://lore.kernel.org/r/20221121075237.127706-2-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      eadcab4c
    • Ian Rogers's avatar
      perf list: Add JSON output option · 6ed24944
      Ian Rogers authored
      Output events and metrics in a JSON format by overriding the print
      callbacks. Currently other command line options aren't supported and
      metrics are repeated once per metric group.
      
      Committer testing:
      
        $ perf list cache
      
        List of pre-defined events (to be used in -e or -M):
      
          L1-dcache-load-misses                              [Hardware cache event]
          L1-dcache-loads                                    [Hardware cache event]
          L1-dcache-prefetches                               [Hardware cache event]
          L1-icache-load-misses                              [Hardware cache event]
          L1-icache-loads                                    [Hardware cache event]
          branch-load-misses                                 [Hardware cache event]
          branch-loads                                       [Hardware cache event]
          dTLB-load-misses                                   [Hardware cache event]
          dTLB-loads                                         [Hardware cache event]
          iTLB-load-misses                                   [Hardware cache event]
          iTLB-loads                                         [Hardware cache event]
        $ perf list --json cache
        [
        {
                "Unit": "cache",
                "EventName": "L1-dcache-load-misses",
                "EventType": "Hardware cache event"
        },
        {
                "Unit": "cache",
                "EventName": "L1-dcache-loads",
                "EventType": "Hardware cache event"
        },
        {
                "Unit": "cache",
                "EventName": "L1-dcache-prefetches",
                "EventType": "Hardware cache event"
        },
        {
                "Unit": "cache",
                "EventName": "L1-icache-load-misses",
                "EventType": "Hardware cache event"
        },
        {
                "Unit": "cache",
                "EventName": "L1-icache-loads",
                "EventType": "Hardware cache event"
        },
        {
                "Unit": "cache",
                "EventName": "branch-load-misses",
                "EventType": "Hardware cache event"
        },
        {
                "Unit": "cache",
                "EventName": "branch-loads",
                "EventType": "Hardware cache event"
        },
        {
                "Unit": "cache",
                "EventName": "dTLB-load-misses",
                "EventType": "Hardware cache event"
        },
        {
                "Unit": "cache",
                "EventName": "dTLB-loads",
                "EventType": "Hardware cache event"
        },
        {
                "Unit": "cache",
                "EventName": "iTLB-load-misses",
                "EventType": "Hardware cache event"
        },
        {
                "Unit": "cache",
                "EventName": "iTLB-loads",
                "EventType": "Hardware cache event"
        }
        ]
        $
      Signed-off-by: Ian Rogers <irogers@google.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Caleb Biggers <caleb.biggers@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Perry Taylor <perry.taylor@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Weilin Wang <weilin.wang@intel.com>
      Cc: Xin Gao <gaoxin@cdjrlc.com>
      Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
      Link: http://lore.kernel.org/lkml/20221114210723.2749751-11-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      6ed24944
    • Ian Rogers's avatar
      perf list: Reorganize to use callbacks to allow honouring command line options · e5c6109f
      Ian Rogers authored
      Rather than controlling the list output with passed flags, add
      callbacks that are called when an event or metric is
      encountered. State is passed to the callback so that command line
      options can be respected; alternatively the callbacks can be changed.
      
      Fix a few bugs:
       - word-wrap metric descriptions and expressions to the column width;
       - remove unnecessary whitespace after PMU event names;
       - the metric filter is a glob but matched using strstr which will
         always fail, switch to using a proper globmatch;
       - the detail flag gives details for extra kernel PMU events like
         branch-instructions.
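The glob-versus-substring bug in the list above can be illustrated with a short C sketch. It uses POSIX `fnmatch()` as a stand-in for perf's own glob matcher, and the `demo_` helper names and the example strings are hypothetical.

```c
#include <fnmatch.h>
#include <stdbool.h>
#include <string.h>

/* Substring search: the filter's '*' is treated as a literal character. */
bool demo_match_substring(const char *name, const char *filter)
{
	return strstr(name, filter) != NULL;
}

/* Glob match: '*' and '?' in the filter are interpreted as wildcards. */
bool demo_match_glob(const char *name, const char *filter)
{
	return fnmatch(filter, name, 0) == 0;
}
```

For a name such as "TopdownL1" and a filter such as "Topdown*", the substring version looks for a literal asterisk and finds nothing, while the glob version matches.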
      
      In metricgroup.c switch from struct mep being a rbtree of metricgroups
      containing a list of metrics, to the tree directly containing all the
      metrics. In general the alias for a name is passed to the print
      routine rather than being contained in the name with OR.
      
      Committer notes:
      
      Check the asprintf() return to address this on fedora 36:
      
        util/print-events.c: In function ‘print_sdt_events’:
        util/print-events.c:183:33: error: ignoring return value of ‘asprintf’ declared with attribute ‘warn_unused_result’ [-Werror=unused-result]
          183 |                                 asprintf(&evt_name, "%s@%s(%.12s)", sdt_name->s, path, bid);
              |                                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        cc1: all warnings being treated as errors
      
        $ gcc --version | head -1
        gcc (GCC) 12.2.1 20220819 (Red Hat 12.2.1-2)
        $
      
      Fix ps.pmu_glob setting when dealing with *:* events, it was being left
      with a freed pointer that then at the end of cmd_list() would be double
      freed.
      
      Check if pmu_name is NULL in default_print_event() before calling
      strglobmatch(pmu_name, ...) to avoid a segfault.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Caleb Biggers <caleb.biggers@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Perry Taylor <perry.taylor@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Weilin Wang <weilin.wang@intel.com>
      Cc: Xin Gao <gaoxin@cdjrlc.com>
      Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
      Link: http://lore.kernel.org/lkml/20221114210723.2749751-10-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      e5c6109f
    • Ian Rogers's avatar
      perf build: Fix LIBTRACEEVENT_DYNAMIC · a3720e96
      Ian Rogers authored
      The tools/lib includes fixes break LIBTRACEEVENT_DYNAMIC as the makefile
      erroneously had dependencies on building libtraceevent even when not
      linking with it. This change fixes the issues with LIBTRACEEVENT_DYNAMIC
      by making the built files optional.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masahiro Yamada <masahiroy@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Nicolas Schier <nicolas@fjasle.eu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: bpf@vger.kernel.org
      Link: http://lore.kernel.org/lkml/20221116224631.207631-1-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      a3720e96
    • Namhyung Kim's avatar
      perf test: Replace data symbol test workload with datasym · 0b77fe47
      Namhyung Kim authored
      So that it can get rid of the requirement for a compiler.
      
        $ sudo ./perf test -v 109
        109: Test data symbol                                                :
        --- start ---
        test child forked, pid 844526
        Recording workload...
        [ perf record: Woken up 2 times to write data ]
        [ perf record: Captured and wrote 0.354 MB /tmp/__perf_test.perf.data.GFeZO (4847 samples) ]
        Cleaning up files...
        test child finished with 0
        ---- end ----
        Test data symbol: Ok
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: James Clark <james.clark@arm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com>
      Link: https://lore.kernel.org/r/20221116233854.1596378-13-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      0b77fe47
    • Namhyung Kim's avatar
      perf test: Add 'datasym' test workload · 3dfc01fe
      Namhyung Kim authored
      The datasym workload checks whether the perf mem command gets the data
      addresses precisely.  This is needed for the data symbol test.
      
        $ perf test -w datasym
      
      I had to keep the buf1 in the data section, otherwise it could end
      up in the BSS and was mmaped as a separate //anon region, then it
      was not symbolized at all.  It needs to be fixed separately.
      
      Committer notes:
      
      Add a -U _FORTIFY_SOURCE to the datasym CFLAGS, as the main perf flags
      set it and it requires building with optimization, and this new test has
      a -O0.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com>
      Link: https://lore.kernel.org/r/20221116233854.1596378-12-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      3dfc01fe
    • Namhyung Kim's avatar
      perf test: Replace brstack test workload · 7bc1dd96
      Namhyung Kim authored
      So that it can get rid of the requirement for a compiler.  Also rename
      the symbols to match the perf test workload.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: James Clark <james.clark@arm.com>
      Acked-by: German Gomez <german.gomez@arm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com>
      Link: https://lore.kernel.org/r/20221116233854.1596378-11-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      7bc1dd96
    • Namhyung Kim's avatar
      perf test: Add 'brstack' test workload · a104f0ea
      Namhyung Kim authored
      The brstack workload runs different kinds of branches repeatedly.  This
      is necessary for the brstack test case to verify that it has correct
      branch info.
      
        $ perf test -w brstack
      
      I renamed the internal functions to have a brstack_ prefix, as the
      original names were too generic.
      
      Add a -U_FORTIFY_SOURCE to the brstack CFLAGS, as the main perf flags
      set it and it requires building with optimization, and this new test has
      a -O0.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com>
      Link: https://lore.kernel.org/r/20221116233854.1596378-10-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      a104f0ea
    • Namhyung Kim's avatar
      perf test: Replace arm spe fork test workload with sqrtloop · e011979e
      Namhyung Kim authored
      So that it can get rid of the requirement for a compiler.  I've also
      removed killall, as it would now kill the perf process itself, and run
      the test workload for 10 seconds instead.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: James Clark <james.clark@arm.com>
      Tested-by: Leo Yan <leo.yan@linaro.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com>
      Link: https://lore.kernel.org/r/20221116233854.1596378-9-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      e011979e
    • Namhyung Kim's avatar
      perf test: Add 'sqrtloop' test workload · 39281709
      Namhyung Kim authored
      The sqrtloop creates a child process to run an infinite loop calling
      sqrt() with rand().  This is needed for ARM SPE fork test.
      
        $ perf test -w sqrtloop
      
      It can take an optional argument to specify how long it will run in
      seconds (default: 1).
      
      Committer notes:
      
      Explicitly ignored the sqrt() return to fix the build on systems where
      the compiler complains it isn't being used.
      
      And added a sqrtloop specific CFLAGS to disable optimizations to make
      this a bit more robust wrt dead code elimination.
      
      Doing that a -U_FORTIFY_SOURCE needs to be added, as -O0 is incompatible
      with it.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com>
      Link: https://lore.kernel.org/r/20221116233854.1596378-8-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      39281709
    • Namhyung Kim's avatar
      perf test: Replace arm callgraph fp test workload with leafloop · 7cf0b4a7
      Namhyung Kim authored
      So that it can get rid of the requirement for a compiler.
      Reviewed-by: Leo Yan <leo.yan@linaro.org>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: James Clark <james.clark@arm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com>
      Link: https://lore.kernel.org/r/20221116233854.1596378-7-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      7cf0b4a7
    • Namhyung Kim's avatar
      perf test: Add 'leafloop' test workload · 41522f74
      Namhyung Kim authored
      The leafloop workload runs an infinite loop in the test_leaf
      function.  This is needed for the ARM fp callgraph test to verify that
      it gets the correct callchains.
      
        $ perf test -w leafloop
      
      Committer notes:
      
      Add a:
      
        -U_FORTIFY_SOURCE
      
      to the leafloop CFLAGS as the main perf flags set it and it requires
      building with optimization, and this new test has a -O0.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com>
      Link: https://lore.kernel.org/r/20221116233854.1596378-6-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      41522f74
  3. 20 Nov, 2022 3 commits
    • Namhyung Kim's avatar
      perf test: Replace record test workload with thloop · 0b8ff0ba
      Namhyung Kim authored
      So that it can get rid of the requirement for a compiler.
      
        $ sudo ./perf test -v 92
         92: perf record tests                                               :
        --- start ---
        test child forked, pid 740204
        Basic --per-thread mode test
        Basic --per-thread mode test [Success]
        Register capture test
        Register capture test [Success]
        Basic --system-wide mode test
        Basic --system-wide mode test [Success]
        Basic target workload test
        Basic target workload test [Success]
        test child finished with 0
        ---- end ----
        perf record tests: Ok
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: James Clark <james.clark@arm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com>
      Link: https://lore.kernel.org/r/20221116233854.1596378-5-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf test: Add 'thloop' test workload · 69b35292
      Namhyung Kim authored
      The thloop workload is similar to noploop but runs in two threads.
      It is needed to verify that perf record --per-thread handles
      multi-threaded programs properly.
      
        $ perf test -w thloop
      
      It also takes an optional argument to specify runtime in seconds
      (default: 1).
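
A rough sketch of a two-thread busy loop with an optional runtime argument, under stated assumptions; it is not the code from this commit. The names run_thloop, spin, and thread_main are hypothetical, the stop flag uses the GCC/Clang __atomic builtins, and the argument convention (argv[0] holds the seconds, as it might for arguments following the workload name) is a guess.

```c
#include <pthread.h>
#include <signal.h>
#include <stdlib.h>
#include <unistd.h>

/* Stop flag shared by the signal handler and both threads. */
static int stop;
/* Shared counter, only there to give the loop some work to do. */
static unsigned int spins;

static void on_signal(int sig)
{
	(void)sig;
	__atomic_store_n(&stop, 1, __ATOMIC_RELAXED);
}

/* Busy loop executed by both the main thread and the helper thread. */
static void spin(void)
{
	while (!__atomic_load_n(&stop, __ATOMIC_RELAXED))
		__atomic_fetch_add(&spins, 1, __ATOMIC_RELAXED);
}

static void *thread_main(void *arg)
{
	(void)arg;
	spin();
	return NULL;
}

/*
 * Run the loop in two threads for argv[0] seconds (default: 1).
 * Returns 0 on success, -1 if the second thread could not be created.
 */
static int run_thloop(int argc, const char **argv)
{
	int sec = 1;
	pthread_t th;

	if (argc > 0)
		sec = atoi(argv[0]);
	if (sec <= 0)
		sec = 1;	/* assumption: fall back to the default on bad input */

	__atomic_store_n(&stop, 0, __ATOMIC_RELAXED);
	signal(SIGINT, on_signal);
	signal(SIGALRM, on_signal);
	alarm(sec);

	if (pthread_create(&th, NULL, thread_main, NULL) != 0)
		return -1;
	spin();
	pthread_join(th, NULL);
	return 0;
}
```

SIGALRM may be delivered to either thread; because the handler only sets a shared flag, both loops exit whichever thread receives it. On older C libraries the sketch needs to be linked with -pthread.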
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com>
      Link: https://lore.kernel.org/r/20221116233854.1596378-4-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf test: Replace pipe test workload with noploop · 24e733b2
      Namhyung Kim authored
      This removes the pipe test's requirement for a compiler.
      Also define and use more local symbols to ease future changes.
      
        $ sudo ./perf test -v pipe
         87: perf pipe recording and injection test                          :
        --- start ---
        test child forked, pid 748003
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.000 MB - ]
            748014   748014       -1 |perf
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.000 MB - ]
            99.83%  perf     perf                  [.] noploop
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.000 MB - ]
            99.85%  perf     perf                  [.] noploop
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.160 MB /tmp/perf.data.2XYPdw (4007 samples) ]
            99.83%  perf     perf                  [.] noploop
        test child finished with 0
        ---- end ----
        perf pipe recording and injection test: Ok
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: James Clark <james.clark@arm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com>
      Link: https://lore.kernel.org/r/20221116233854.1596378-3-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>