1. 11 Jul, 2023 6 commits
    • perf vendor events amd: Fix large metrics · 8d40f74e
      Sandipan Das authored
      There are cases where a metric requires more events than the number of
      available counters. E.g. AMD Zen, Zen 2 and Zen 3 processors have four
      data fabric counters but the "nps1_die_to_dram" metric has eight events.
      
      By default, the constituent events are placed in a group and since the
      events cannot be scheduled at the same time, the metric is not computed.
      The "all metrics" test also fails because of this.
      
      Use the NO_GROUP_EVENTS constraint for such metrics which anyway expect
      the user to run perf with "--metric-no-group".
      
      E.g.
      
        $ sudo perf test -v 101
      
      Before:
      
        101: perf all metrics test                                           :
        --- start ---
        test child forked, pid 37131
        Testing branch_misprediction_ratio
        Testing all_remote_links_outbound
        Testing nps1_die_to_dram
        Metric 'nps1_die_to_dram' not printed in:
        Error:
        Invalid event (dram_channel_data_controller_4) in per-thread mode, enable system wide with '-a'.
        Testing macro_ops_dispatched
        Testing all_l2_cache_accesses
        Testing all_l2_cache_hits
        Testing all_l2_cache_misses
        Testing ic_fetch_miss_ratio
        Testing l2_cache_accesses_from_l2_hwpf
        Testing l2_cache_misses_from_l2_hwpf
        Testing op_cache_fetch_miss_ratio
        Testing l3_read_miss_latency
        Testing l1_itlb_misses
        test child finished with -1
        ---- end ----
        perf all metrics test: FAILED!
      
      After:
      
        101: perf all metrics test                                           :
        --- start ---
        test child forked, pid 43766
        Testing branch_misprediction_ratio
        Testing all_remote_links_outbound
        Testing nps1_die_to_dram
        Testing macro_ops_dispatched
        Testing all_l2_cache_accesses
        Testing all_l2_cache_hits
        Testing all_l2_cache_misses
        Testing ic_fetch_miss_ratio
        Testing l2_cache_accesses_from_l2_hwpf
        Testing l2_cache_misses_from_l2_hwpf
        Testing op_cache_fetch_miss_ratio
        Testing l3_read_miss_latency
        Testing l1_itlb_misses
        test child finished with 0
        ---- end ----
        perf all metrics test: Ok
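The counter constraint described in this commit message can be illustrated with a small model. This is a minimal sketch, not perf or kernel code: it assumes a group is all-or-nothing (every event needs its own counter at the same time), while ungrouped events can be time-multiplexed in batches. The function names and the event names `dram_channel_data_controller_0` to `_7` are illustrative assumptions; only `dram_channel_data_controller_4` appears in the log above.

```python
from typing import List

# Number of data fabric counters on Zen, Zen 2 and Zen 3, per the commit message.
DF_COUNTERS = 4


def schedulable_as_group(events: List[str], counters: int) -> bool:
    """A group must fit on the hardware all at once, one counter per event."""
    return len(events) <= counters


def multiplex_batches(events: List[str], counters: int) -> List[List[str]]:
    """Without grouping, events can be counted in turns, one batch at a time."""
    return [events[i:i + counters] for i in range(0, len(events), counters)]


# Hypothetical names for the eight events behind "nps1_die_to_dram".
events = [f"dram_channel_data_controller_{i}" for i in range(8)]

if __name__ == "__main__":
    print(schedulable_as_group(events, DF_COUNTERS))    # False: 8 events, 4 counters
    print(len(multiplex_batches(events, DF_COUNTERS)))  # 2 batches of 4 events
```

In this model, marking the metric with NO_GROUP_EVENTS corresponds to always taking the second path, which is what running perf with "--metric-no-group" requests by hand.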

      Reported-by: Ayush Jain <ayush.jain3@amd.com>
      Suggested-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Sandipan Das <sandipan.das@amd.com>
      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ananth Narayan <ananth.narayan@amd.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Santosh Shukla <santosh.shukla@amd.com>
      Link: https://lore.kernel.org/r/20230706063440.54189-1-sandipan.das@amd.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf build: Fix library not found error when using CSLIBS · 1feece27
      James Clark authored
      -L only specifies the search path for libraries directly provided in the
      link line with -l. Because -lopencsd isn't specified, it's only linked
      because it's a dependency of -lopencsd_c_api. Dependencies like this are
      resolved using the default system search paths or -rpath-link=... rather
      than -L. This means that compilation only works if OpenCSD is installed
      to the system rather than provided with the CSLIBS (-L) option.
      
      This could be fixed by adding -Wl,-rpath-link=$(CSLIBS) but that is less
      conventional than just adding -lopencsd to the link line so that it uses
      -L. -lopencsd seems to have been removed in commit ed17b191
      ("perf tools: Drop requirement for libstdc++.so for libopencsd check")
      because it was thought that there was a chance compilation would work
      even if it didn't exist, but I think that only applies to libstdc++ so
      there is no harm to add it back. libopencsd.so and libopencsd_c_api.so
      would always exist together.
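The search-path behaviour described above can be modelled in a few lines. This is a simplified sketch that follows the commit message's description and is not how any real linker is implemented: libraries named with -l are looked up in the -L directories plus the fallback directories, while libraries pulled in only as dependencies are looked up in the fallback directories alone (standing in for the system paths or -rpath-link). The directory `/opt/csd/lib` and all function and parameter names are hypothetical.

```python
from typing import Dict, List, Set


def missing_libs(requested: List[str], l_dirs: List[str], fallback_dirs: List[str],
                 installed: Dict[str, Set[str]], needed_by: Dict[str, List[str]]) -> List[str]:
    """Return the libraries that this toy link step fails to locate."""
    def found(lib: str, dirs: List[str]) -> bool:
        return any(lib in installed.get(d, set()) for d in dirs)

    missing = []
    for lib in requested:
        # Libraries given with -l may be found through -L.
        if not found(lib, l_dirs + fallback_dirs):
            missing.append(lib)
        for dep in needed_by.get(lib, []):
            if dep in requested:
                continue  # handled as a direct -l library
            # Indirect dependencies do not benefit from -L in this model.
            if not found(dep, fallback_dirs):
                missing.append(dep)
    return missing


# OpenCSD only exists in a private prefix passed via CSLIBS (hypothetical path).
installed = {"/opt/csd/lib": {"opencsd_c_api", "opencsd"}}
needed_by = {"opencsd_c_api": ["opencsd"]}
l_dirs, fallback_dirs = ["/opt/csd/lib"], ["/usr/lib"]

if __name__ == "__main__":
    print(missing_libs(["opencsd_c_api"], l_dirs, fallback_dirs, installed, needed_by))
    print(missing_libs(["opencsd_c_api", "opencsd"], l_dirs, fallback_dirs, installed, needed_by))
```

With only -lopencsd_c_api on the link line the model reports opencsd as missing; adding -lopencsd makes it a direct library, so the -L directory applies and nothing is missing.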
      
      Testing
      =======
      
      The following scenarios now all work:
      
       * Cross build with OpenCSD installed
       * Cross build using CSLIBS=...
       * Native build with OpenCSD installed
       * Native build using CSLIBS=...
       * Static cross build with OpenCSD installed
       * Static cross build with CSLIBS=...
      
      Committer testing:
      
        ⬢[acme@toolbox perf-tools]$ alias m
        alias m='make -k BUILD_BPF_SKEL=1 CORESIGHT=1 O=/tmp/build/perf-tools -C tools/perf install-bin && git status && perf test python ;  perf record -o /dev/null sleep 0.01 ; perf stat --null sleep 0.01'
        ⬢[acme@toolbox perf-tools]$ ldd ~/bin/perf | grep csd
        	libopencsd_c_api.so.1 => /lib64/libopencsd_c_api.so.1 (0x00007fd49c44e000)
        	libopencsd.so.1 => /lib64/libopencsd.so.1 (0x00007fd49bd56000)
        ⬢[acme@toolbox perf-tools]$ cat /etc/redhat-release
        Fedora release 36 (Thirty Six)
        ⬢[acme@toolbox perf-tools]$
      
      Fixes: ed17b191 ("perf tools: Drop requirement for libstdc++.so for libopencsd check")
      Reported-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com>
      Signed-off-by: James Clark <james.clark@arm.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Uwe Kleine-König <uwe@kleine-koenig.org>
      Cc: coresight@lists.linaro.org
      Closes: https://lore.kernel.org/linux-arm-kernel/56905d7a-a91e-883a-b707-9d5f686ba5f1@arm.com/
      Link: https://lore.kernel.org/all/36cc4dc6-bf4b-1093-1c0a-876e368af183@kleine-koenig.org/
      Link: https://lore.kernel.org/r/20230707154546.456720-1-james.clark@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • tools headers UAPI: Sync files changed by new cachestat syscall with the kernel sources · 9350a917
      Arnaldo Carvalho de Melo authored
      To pick the changes in this cset:
      
        cf264e13 ("cachestat: implement cachestat syscall")
      
      That adds support for this new syscall in tools such as 'perf trace'.
      
      For instance, this is now possible:
      
        # perf trace -e cachestat
        ^C[root@five ~]#
        # perf trace -v -e cachestat
        Using CPUID AuthenticAMD-25-21-0
        event qualifier tracepoint filter: (common_pid != 3163687 && common_pid != 3147) && (id == 451)
        mmap size 528384B
        ^C[root@five ~]
      
        # perf trace -v -e *stat* --max-events=10
        Using CPUID AuthenticAMD-25-21-0
        event qualifier tracepoint filter: (common_pid != 3163713 && common_pid != 3147) && (id == 4 || id == 5 || id == 6 || id == 136 || id == 137 || id == 138 || id == 262 || id == 332 || id == 451)
        mmap size 528384B
             0.000 ( 0.009 ms): Cache2 I/O/4544 statfs(pathname: 0x45635288, buf: 0x7f8745725b60)                     = 0
             0.012 ( 0.003 ms): Cache2 I/O/4544 newfstatat(dfd: CWD, filename: 0x45635288, statbuf: 0x7f874569d250)   = 0
             0.036 ( 0.002 ms): Cache2 I/O/4544 newfstatat(dfd: 138, filename: 0x541b7093, statbuf: 0x7f87457256f0, flag: 4096) = 0
             0.372 ( 0.006 ms): Cache2 I/O/4544 statfs(pathname: 0x45635288, buf: 0x7f8745725b10)                     = 0
             0.379 ( 0.003 ms): Cache2 I/O/4544 newfstatat(dfd: CWD, filename: 0x45635288, statbuf: 0x7f874569d250)   = 0
             0.390 ( 0.002 ms): Cache2 I/O/4544 newfstatat(dfd: 138, filename: 0x541b7093, statbuf: 0x7f87457256a0, flag: 4096) = 0
             0.609 ( 0.005 ms): Cache2 I/O/4544 statfs(pathname: 0x45635288, buf: 0x7f8745725b60)                     = 0
             0.615 ( 0.003 ms): Cache2 I/O/4544 newfstatat(dfd: CWD, filename: 0x45635288, statbuf: 0x7f874569d250)   = 0
             0.625 ( 0.002 ms): Cache2 I/O/4544 newfstatat(dfd: 138, filename: 0x541b7093, statbuf: 0x7f87457256f0, flag: 4096) = 0
             0.826 ( 0.005 ms): Cache2 I/O/4544 statfs(pathname: 0x45635288, buf: 0x7f8745725b10)                     = 0
        #
      
      That is the filter expression attached to the raw_syscalls:sys_{enter,exit}
      tracepoints.
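The "id == ..." part of the filter expressions shown above can be reproduced with a short sketch. It assumes the syscall names given to -e are matched as globs against a table mapping numbers to names, and that the matching numbers are joined with "||". The table is a hand-picked excerpt of the x86-64 syscall numbers, and `id_filter` is a hypothetical helper, not a perf function.

```python
from fnmatch import fnmatchcase
from typing import Dict

# Excerpt of x86-64 syscall numbers; the real table has several hundred entries.
SYSCALLS: Dict[int, str] = {
    0: "read", 1: "write", 2: "open",
    4: "stat", 5: "fstat", 6: "lstat",
    136: "ustat", 137: "statfs", 138: "fstatfs",
    262: "newfstatat", 332: "statx", 451: "cachestat",
}


def id_filter(pattern: str, table: Dict[int, str]) -> str:
    """Build an 'id == N || ...' clause for every syscall name matching the glob."""
    ids = sorted(nr for nr, name in table.items() if fnmatchcase(name, pattern))
    return " || ".join(f"id == {nr}" for nr in ids)


if __name__ == "__main__":
    print(id_filter("cachestat", SYSCALLS))  # id == 451
    print(id_filter("*stat*", SYSCALLS))
```

For "*stat*" this yields the same nine ids as the transcript (4, 5, 6, 136, 137, 138, 262, 332 and 451), the last of which only matches once the synced tables know about cachestat.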
      
        $ find tools/perf/arch/ -name "syscall*tbl" | xargs grep -w sys_cachestat
        tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl:451	n64	cachestat			sys_cachestat
        tools/perf/arch/powerpc/entry/syscalls/syscall.tbl:451	common	cachestat			sys_cachestat
        tools/perf/arch/s390/entry/syscalls/syscall.tbl:451  common	cachestat		sys_cachestat			sys_cachestat
        tools/perf/arch/x86/entry/syscalls/syscall_64.tbl:451	common	cachestat		sys_cachestat
        $
      
        $ grep -w cachestat /tmp/build/perf-tools/arch/x86/include/generated/asm/syscalls_64.c
        	[451] = "cachestat",
        $
      
      This addresses these perf build warnings:
      
      Warning: Kernel ABI header differences:
        diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h
        diff -u tools/include/uapi/linux/mman.h include/uapi/linux/mman.h
        diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
        diff -u tools/perf/arch/powerpc/entry/syscalls/syscall.tbl arch/powerpc/kernel/syscalls/syscall.tbl
        diff -u tools/perf/arch/s390/entry/syscalls/syscall.tbl arch/s390/kernel/syscalls/syscall.tbl
        diff -u tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl arch/mips/kernel/syscalls/syscall_n64.tbl
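A warning list like the one above can be produced by comparing each copy under tools/ with its kernel counterpart. The following is only a rough sketch of that idea, not the check that the perf build actually runs: `out_of_sync` is a hypothetical helper, it compares files byte for byte, and it silently skips pairs where either file is absent, which is an assumption made here to keep the example small.

```python
import filecmp
from pathlib import Path
from typing import List, Tuple


def out_of_sync(root: str, pairs: List[Tuple[str, str]]) -> List[str]:
    """Return a 'diff -u A B' hint for every (tools copy, kernel copy) pair that differs."""
    base = Path(root)
    stale = []
    for tools_copy, kernel_copy in pairs:
        a, b = base / tools_copy, base / kernel_copy
        # Assumption: pairs with a missing file are ignored rather than reported.
        if a.is_file() and b.is_file() and not filecmp.cmp(a, b, shallow=False):
            stale.append(f"diff -u {tools_copy} {kernel_copy}")
    return stale


# Two of the pairs named in the warning above.
PAIRS = [
    ("tools/include/uapi/linux/mman.h", "include/uapi/linux/mman.h"),
    ("tools/perf/arch/x86/entry/syscalls/syscall_64.tbl",
     "arch/x86/entry/syscalls/syscall_64.tbl"),
]

if __name__ == "__main__":
    # Run from the top of a kernel source tree.
    for hint in out_of_sync(".", PAIRS):
        print(hint)
```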
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Nhat Pham <nphamcs@gmail.com>
      Link: https://lore.kernel.org/lkml/ZK1pVBJpbjujJNJW@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • tools headers UAPI: Sync drm/i915_drm.h with the kernel sources · 142256d2
      Arnaldo Carvalho de Melo authored
      To pick the changes in:
      
        81b1b599 ("drm/i915: Allow user to set cache at BO creation")
        98d2722a ("drm/i915/huc: differentiate the 2 steps of the MTL HuC auth flow")
        bc4be0a3 ("drm/i915/pmu: Prepare for multi-tile non-engine counters")
        d1da138f ("drm/i915/uapi/pxp: Add a GET_PARAM for PXP")
      
      That add some ioctls, but these use the __I915_PMU_OTHER() macro, which
      is not yet supported by the tools/perf/trace/beauty/drm_ioctl.sh
      conversion script.
      
      This silences this perf build warning:
      
        Warning: Kernel ABI header differences:
          diff -u tools/include/uapi/drm/i915_drm.h include/uapi/drm/i915_drm.h
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
      Cc: Andi Shyti <andi.shyti@linux.intel.com>
      Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
      Cc: Fei Yang <fei.yang@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
      Link: https://lore.kernel.org/lkml/ZK1R%2FIyWcUKYQbQV@kernel.org/
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf probe: Read DWARF files from the correct CU · c66e1c68
      Georg Müller authored
      After switching from dwarf_decl_file() to die_get_decl_file(), it is not
      possible to add probes for certain functions:
      
        $ perf probe -x /usr/lib/systemd/systemd-logind match_unit_removed
        A function DIE doesn't have decl_line. Maybe broken DWARF?
        A function DIE doesn't have decl_line. Maybe broken DWARF?
        Probe point 'match_unit_removed' not found.
           Error: Failed to add events.
      
      The problem is that die_get_decl_file() uses the wrong CU to search for
      the file. elfutils commit e1db5cdc9f has a good explanation for this:
      
          dwarf_decl_file uses dwarf_attr_integrate to get the DW_AT_decl_file
          attribute. This means the attribute might come from a different DIE
          in a different CU. If so, we need to use the CU associated with the
          attribute, not the original DIE, to resolve the file name.
      
      This patch uses the same source of information as elfutils: use attribute
      DW_AT_decl_file and use this CU to search for the file.
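The difference between the two lookups can be shown with a toy model. It is a sketch written for this explanation and does not use the elfutils API: every CU carries its own file table, a DIE may inherit its decl_file attribute from an origin DIE in another CU (as dwarf_attr_integrate allows), and the file index is only meaningful in the table of the CU that owns the attribute. All class, function and file names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class CU:
    name: str
    files: List[str]  # per-CU file table, indexed by decl_file


@dataclass
class DIE:
    name: str
    cu: CU
    decl_file: Optional[int] = None
    origin: Optional["DIE"] = None  # e.g. an abstract origin, possibly in another CU


def find_decl_file(die: Optional[DIE]) -> Optional[Tuple[int, CU]]:
    """Follow origin links until a DIE carries decl_file; return index and owning CU."""
    while die is not None:
        if die.decl_file is not None:
            return die.decl_file, die.cu
        die = die.origin
    return None


def decl_file_buggy(die: DIE) -> Optional[str]:
    """Resolve the index in the CU of the DIE we started from (the regression)."""
    found = find_decl_file(die)
    if found is None:
        return None
    idx, _ = found
    return die.cu.files[idx] if idx < len(die.cu.files) else None


def decl_file_fixed(die: DIE) -> Optional[str]:
    """Resolve the index in the CU that actually owns the attribute."""
    found = find_decl_file(die)
    if found is None:
        return None
    idx, attr_cu = found
    return attr_cu.files[idx] if idx < len(attr_cu.files) else None


cu_a = CU("a", ["", "a.c"])
cu_b = CU("b", ["", "b.h", "b.c"])
abstract = DIE("match_unit_removed", cu_b, decl_file=2)
concrete = DIE("match_unit_removed", cu_a, origin=abstract)

if __name__ == "__main__":
    print(decl_file_buggy(concrete))  # None: index 2 does not exist in CU "a"
    print(decl_file_fixed(concrete))  # b.c
```

In a less fortunate layout the buggy lookup would not fail but return an unrelated file name from the wrong table, which is why the index and the CU have to travel together.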
      
      Fixes: dc9a5d2c ("perf probe: Fix to get declared file name from clang DWARF5")
      Signed-off-by: Georg Müller <georgmueller@gmx.net>
      Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: regressions@lists.linux.dev
      Cc: stable@vger.kernel.org
      Link: https://lore.kernel.org/r/20230628084551.1860532-6-georgmueller@gmx.net
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf probe: Add test for regression introduced by switch to die_get_decl_file() · 56cbeacf
      Georg Müller authored
      This patch adds a test to validate that 'perf probe' works for binaries
      where DWARF info is split into multiple CUs.
      
      Signed-off-by: Georg Müller <georgmueller@gmx.net>
      Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: regressions@lists.linux.dev
      Cc: stable@vger.kernel.org
      Link: https://lore.kernel.org/r/20230628084551.1860532-5-georgmueller@gmx.net
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  2. 10 Jul, 2023 2 commits
  3. 09 Jul, 2023 10 commits
  4. 08 Jul, 2023 22 commits