1. 18 Sep, 2023 3 commits
    • perf build: Default BUILD_BPF_SKEL, warn/disable for missing deps · 9925495d
      Ian Rogers authored
      LIBBPF is dependent on zlib so move the NO_ZLIB and feature check
      early to avoid statically building when zlib is disabled. This avoids
      a linkage failure with perf and static libbpf when zlib isn't
      specified.
      
      Move BUILD_BPF_SKEL logic to one place and if not defined set
      BUILD_BPF_SKEL to 1. Detect dependencies of building with BPF
      skeletons and warn/disable if the dependencies aren't present.
      
      Change Makefile.perf to contain BPF skeleton logic dependent on the
      Makefile.config result and refresh the comment about BUILD_BPF_SKEL.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Tested-by: Namhyung Kim <namhyung@kernel.org>
      Cc: James Clark <james.clark@arm.com>
      Cc: Nick Terrell <terrelln@fb.com>
      Cc: Patrice Duroux <patrice.duroux@gmail.com>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Andrii Nakryiko <andrii@kernel.org>
      Cc: Nathan Chancellor <nathan@kernel.org>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
      Cc: Tom Rix <trix@redhat.com>
      Cc: llvm@lists.linux.dev
      Cc: bpf@vger.kernel.org
      Link: https://lore.kernel.org/r/20230914211948.814999-3-irogers@google.com
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
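The default-then-disable behaviour described in this commit can be sketched roughly as follows. This is only an illustration in Python, not the Makefile logic itself: the function name, the `missing_deps` list, and the warning text are all hypothetical.

```python
def resolve_build_bpf_skel(requested, missing_deps):
    """Decide whether BPF skeletons get built.

    requested: None when BUILD_BPF_SKEL was not defined, else "0" or "1".
    missing_deps: names of dependencies that were not detected
                  (hypothetical names, supplied by the caller).
    Returns (enabled, warnings).
    """
    warnings = []
    # Not defined means "default to 1", as the commit message states.
    enabled = True if requested is None else requested == "1"
    if enabled and missing_deps:
        # Warn and disable rather than failing the build.
        for dep in missing_deps:
            warnings.append(f"Warning: disabling BPF skeletons, missing {dep}")
        enabled = False
    return enabled, warnings


print(resolve_build_bpf_skel(None, []))
print(resolve_build_bpf_skel(None, ["clang"]))  # "clang" is only an example name
```

Returning the warnings instead of printing them keeps the decision separate from how it is reported.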
    • perf version: Add status of bpf skeletons · 727e4314
      Ian Rogers authored
      Add status for BPF skeletons, to see if a build has them enabled:
      ```
      $ perf version --build-options
      perf version 6.6.rc1.g0381ae36d1a6
                       dwarf: [ OFF ]  # HAVE_DWARF_SUPPORT
          dwarf_getlocations: [ OFF ]  # HAVE_DWARF_GETLOCATIONS_SUPPORT
               syscall_table: [ on  ]  # HAVE_SYSCALL_TABLE_SUPPORT
                      libbfd: [ OFF ]  # HAVE_LIBBFD_SUPPORT
                  debuginfod: [ OFF ]  # HAVE_DEBUGINFOD_SUPPORT
                      libelf: [ OFF ]  # HAVE_LIBELF_SUPPORT
                     libnuma: [ OFF ]  # HAVE_LIBNUMA_SUPPORT
      numa_num_possible_cpus: [ OFF ]  # HAVE_LIBNUMA_SUPPORT
                     libperl: [ on  ]  # HAVE_LIBPERL_SUPPORT
                   libpython: [ on  ]  # HAVE_LIBPYTHON_SUPPORT
                    libslang: [ on  ]  # HAVE_SLANG_SUPPORT
                   libcrypto: [ on  ]  # HAVE_LIBCRYPTO_SUPPORT
                   libunwind: [ OFF ]  # HAVE_LIBUNWIND_SUPPORT
          libdw-dwarf-unwind: [ OFF ]  # HAVE_DWARF_SUPPORT
                        zlib: [ on  ]  # HAVE_ZLIB_SUPPORT
                        lzma: [ on  ]  # HAVE_LZMA_SUPPORT
                   get_cpuid: [ on  ]  # HAVE_AUXTRACE_SUPPORT
                         bpf: [ OFF ]  # HAVE_LIBBPF_SUPPORT
                         aio: [ on  ]  # HAVE_AIO_SUPPORT
                        zstd: [ on  ]  # HAVE_ZSTD_SUPPORT
                     libpfm4: [ on  ]  # HAVE_LIBPFM
               libtraceevent: [ on  ]  # HAVE_LIBTRACEEVENT
               bpf_skeletons: [ OFF ]  # HAVE_BPF_SKEL
      ```
      Signed-off-by: Ian Rogers <irogers@google.com>
      Tested-by: Namhyung Kim <namhyung@kernel.org>
      Cc: James Clark <james.clark@arm.com>
      Cc: Nick Terrell <terrelln@fb.com>
      Cc: Patrice Duroux <patrice.duroux@gmail.com>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Andrii Nakryiko <andrii@kernel.org>
      Cc: Nathan Chancellor <nathan@kernel.org>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
      Cc: Tom Rix <trix@redhat.com>
      Cc: llvm@lists.linux.dev
      Cc: bpf@vger.kernel.org
      Link: https://lore.kernel.org/r/20230914211948.814999-2-irogers@google.com
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
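To check the new `bpf_skeletons` status from a script, the `--build-options` listing shown above can be parsed line by line. The following is a minimal sketch that assumes the `name: [ on/OFF ]  # DEFINE` layout seen in that output. The helper name `parse_build_options` is hypothetical, and the sample text is a shortened copy of the listing.

```python
import re

# Matches lines such as "bpf_skeletons: [ OFF ]  # HAVE_BPF_SKEL".
_OPTION_RE = re.compile(r"^\s*(\S+):\s*\[\s*(on|OFF)\s*\]\s*#\s*(\S+)\s*$")


def parse_build_options(text):
    """Map each feature name to True (on) or False (OFF)."""
    features = {}
    for line in text.splitlines():
        match = _OPTION_RE.match(line)
        if match:
            name, state, _define = match.groups()
            features[name] = state == "on"
    return features


sample = """\
perf version 6.6.rc1.g0381ae36d1a6
                 zlib: [ on  ]  # HAVE_ZLIB_SUPPORT
                  bpf: [ OFF ]  # HAVE_LIBBPF_SUPPORT
        bpf_skeletons: [ OFF ]  # HAVE_BPF_SKEL
"""
options = parse_build_options(sample)
print(options["bpf_skeletons"])
```

Lines that do not follow the layout, such as the version banner, are skipped.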
    • perf kwork top: Simplify bool conversion · 3ecf87b2
      Yang Li authored
      ./tools/perf/util/bpf_kwork_top.c:120:53-58: WARNING: conversion to bool not needed here
      Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
      Cc: bpf@vger.kernel.org
      Link: https://lore.kernel.org/r/20230915063832.120274-1-yang.lee@linux.alibaba.com
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
  2. 17 Sep, 2023 1 commit
    • perf test: Fix test-record-dummy-C0 failure for supported PERF_FORMAT_LOST feature kernel · a132b784
      Yang Jihong authored
      On kernels that support PERF_FORMAT_LOST, attr->read_format has the
      PERF_FORMAT_LOST bit set. Update the expected value of
      attr->read_format in test-record-dummy-C0 for this scenario.
      
      Before:
      
        # ./perf test 17 -vv
         17: Setup struct perf_event_attr                                    :
        --- start ---
        test child forked, pid 1609441
        <SNIP>
        running './tests/attr/test-record-dummy-C0'
          'PERF_TEST_ATTR=/tmp/tmpm3s60aji ./perf record -o /tmp/tmpm3s60aji/perf.data --no-bpf-event -e dummy -C 0 kill >/dev/null 2>&1' ret '1', expected '1'
        expected read_format=4, got 20
        FAILED './tests/attr/test-record-dummy-C0' - match failure
        test child finished with -1
        ---- end ----
        Setup struct perf_event_attr: FAILED!
      
      After:
      
        # ./perf test 17 -vv
         17: Setup struct perf_event_attr                                    :
        --- start ---
        test child forked, pid 1609441
        <SNIP>
        running './tests/attr/test-record-dummy-C0'
          'PERF_TEST_ATTR=/tmp/tmppa9vxcb7 ./perf record -o /tmp/tmppa9vxcb7/perf.data --no-bpf-event -e dummy -C 0 kill >/dev/null 2>&1' ret '1', expected '1'
        <SNIP>
        test child finished with 0
        ---- end ----
        Setup struct perf_event_attr: Ok
      Reported-and-Tested-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
      Link: https://lore.kernel.org/r/20230916091641.776031-1-yangjihong1@huawei.com
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
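The "expected read_format=4, got 20" mismatch in the failing run is easier to read once the value is split into its flag bits. The sketch below uses the bit positions of `enum perf_event_read_format` from the perf_event_open UAPI header. The helper name `decode_read_format` is hypothetical.

```python
# Bit values of enum perf_event_read_format (perf_event_open UAPI).
PERF_FORMAT_BITS = {
    "PERF_FORMAT_TOTAL_TIME_ENABLED": 1 << 0,
    "PERF_FORMAT_TOTAL_TIME_RUNNING": 1 << 1,
    "PERF_FORMAT_ID": 1 << 2,
    "PERF_FORMAT_GROUP": 1 << 3,
    "PERF_FORMAT_LOST": 1 << 4,
}


def decode_read_format(value):
    """Return the names of the flags set in a read_format value."""
    return [name for name, bit in PERF_FORMAT_BITS.items() if value & bit]


print(decode_read_format(4))   # value the test used to expect
print(decode_read_format(20))  # value reported on a PERF_FORMAT_LOST kernel
```

The difference between the two values is exactly the PERF_FORMAT_LOST bit (16).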
  3. 15 Sep, 2023 6 commits
  4. 12 Sep, 2023 30 commits
    • perf bpf-filter: Add YYDEBUG · 999b81b9
      Ian Rogers authored
      YYDEBUG enables line numbers and other error helpers in the generated
      bpf-filter-bison.c. Conditionally enabled only for debug builds.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rob Herring <robh@kernel.org>
      Cc: bpf@vger.kernel.org
      Link: https://lore.kernel.org/r/20230911170559.4037734-5-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf pmu: Add YYDEBUG · f0f4cd10
      Ian Rogers authored
      YYDEBUG enables line numbers and other error helpers in the generated
      pmu-bison.c. Conditionally enabled only for debug builds.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rob Herring <robh@kernel.org>
      Cc: bpf@vger.kernel.org
      Link: https://lore.kernel.org/r/20230911170559.4037734-4-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf expr: Make YYDEBUG dependent on doing a debug build · 1344a707
      Ian Rogers authored
      YYDEBUG enables line numbers and other error helpers in the generated
      expr-bison.c. These shouldn't be generated when debugging
      isn't enabled.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rob Herring <robh@kernel.org>
      Cc: bpf@vger.kernel.org
      Link: https://lore.kernel.org/r/20230911170559.4037734-3-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf parse-events: Make YYDEBUG dependent on doing a debug build · d4ce6019
      Ian Rogers authored
      YYDEBUG enables line numbers and other error helpers in the generated
      parse-events-bison.c. These shouldn't be generated when debugging
      isn't enabled.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rob Herring <robh@kernel.org>
      Cc: bpf@vger.kernel.org
      Link: https://lore.kernel.org/r/20230911170559.4037734-2-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf parse-events: Remove unused header files · dc2cfef9
      Ian Rogers authored
      The fnmatch header is now used in the PMU matching logic in pmu.c.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Gaosheng Cui <cuigaosheng1@huawei.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rob Herring <robh@kernel.org>
      Cc: bpf@vger.kernel.org
      Link: https://lore.kernel.org/r/20230911170559.4037734-1-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Add includes for detected configs in Makefile.perf · f5d98b8b
      Athira Rajeev authored
      Makefile.perf uses "CONFIG_*" checks in the code. For example, the config for
      libtraceevent is used to set PYTHON_EXT_SRCS:
      
      	ifeq ($(CONFIG_LIBTRACEEVENT),y)
      	  PYTHON_EXT_SRCS := $(shell grep -v ^\# util/python-ext-sources)
      	else
      	  PYTHON_EXT_SRCS := $(shell grep -v '^\#\|util/trace-event.c' util/python-ext-sources)
      	endif
      
      But this does not pick up the value of CONFIG_LIBTRACEEVENT that is set
      by the settings in Makefile.config. Include the file
      ".config-detected" so that make will use the system detected
      configuration in the CONFIG checks.
      
      This will also fix issues that could arise when other "CONFIG_*" checks are
      added to Makefile.perf in future.
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: linuxppc-dev@lists.ozlabs.org
      Link: https://lore.kernel.org/r/20230912063807.74250-1-atrajeev@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
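The two `grep -v` invocations in the Makefile fragment above differ only in whether `util/trace-event.c` is filtered out along with the comment lines. A rough Python equivalent of that selection, assuming a plain list of source lines, is sketched here. The function name and all sample file names other than `util/trace-event.c` are hypothetical.

```python
def python_ext_sources(lines, have_libtraceevent):
    """Mimic the PYTHON_EXT_SRCS selection from the Makefile fragment.

    Comment lines (starting with '#') are always dropped. Without
    libtraceevent, util/trace-event.c is dropped as well.
    """
    sources = [line for line in lines if not line.startswith("#")]
    if not have_libtraceevent:
        sources = [line for line in sources if "util/trace-event.c" not in line]
    return sources


sample_lines = ["# comment", "util/python.c", "util/trace-event.c", "util/evsel.c"]
print(python_ext_sources(sample_lines, have_libtraceevent=True))
print(python_ext_sources(sample_lines, have_libtraceevent=False))
```

The point of the commit is that the `have_libtraceevent` input has to come from the detected configuration. Otherwise the first branch is never taken.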
    • perf test: Update cs_etm testcase for Arm ETE · bb350847
      Ruidong Tian authored
      Add ETE as one of the supported device types in perf cs_etm testcase.
      Reviewed-by: James Clark <james.clark@arm.com>
      Signed-off-by: Ruidong Tian <tianruidong@linux.alibaba.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: coresight@lists.linaro.org
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20230911065541.91293-1-tianruidong@linux.alibaba.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf vendor events arm64: Add V1 metrics using Arm telemetry repo · 5cdb51ba
      James Clark authored
      Metrics for V1 weren't previously included in the Perf Jsons, so add
      them using the telemetry source [1].
      
      After generation any parts identical to the default metrics in sbsa.json
      were manually removed.
      
      [1]: https://gitlab.arm.com/telemetry-solution/telemetry-solution/-/blob/main/data/pmu/cpu/neoverse/neoverse-v1.json
      Signed-off-by: James Clark <james.clark@arm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Nick Forrington <nick.forrington@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Will Deacon <will@kernel.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20230831161618.134738-3-james.clark@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf vendor events arm64: Update V1 events using Arm telemetry repo · a484e645
      James Clark authored
      The new data [1] includes descriptions that may have product specific
      details and new groupings that will be consistent with other products.
      
      The following command was used to generate the jsons:
      
       $ telemetry-solution/tools/perf_json_generator/generate.py \
         linux/tools/perf/ --telemetry-files \
         telemetry-solution/data/pmu/cpu/neoverse/neoverse-v1.json
      
      [1]: https://gitlab.arm.com/telemetry-solution/telemetry-solution/-/blob/main/data/pmu/cpu/neoverse/neoverse-v1.json
      Signed-off-by: James Clark <james.clark@arm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Nick Forrington <nick.forrington@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Will Deacon <will@kernel.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20230831161618.134738-2-james.clark@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf test: Add a test for strcmp_cpuid_str() expression · a1ebf771
      James Clark authored
      Test that the new expression builtin returns a match when the current
      escaped CPU ID is given, and that it doesn't match when "0x0" is given.
      
      The CPU ID in test__expr() has to be changed to perf_pmu__getcpuid()
      which returns the CPU ID string, rather than the raw CPU ID that
      get_cpuid() returns because that can't be used with strcmp_cpuid_str().
      It doesn't affect the is_intel test because both versions contain
      "Intel".
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: James Clark <james.clark@arm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Chen Zhongjin <chenzhongjin@huawei.com>
      Cc: Eduard Zingerman <eddyz87@gmail.com>
      Cc: Haixin Yu <yuhaixin.yhx@linux.alibaba.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Liam Howlett <liam.howlett@oracle.com>
      Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20230904095104.1162928-5-james.clark@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf util: Add a function for replacing characters in a string · 8a55c1e2
      James Clark authored
      It finds all occurrences of a single character and replaces them with
      a multi character string. This will be used in a test in a following
      commit.
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: James Clark <james.clark@arm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Chen Zhongjin <chenzhongjin@huawei.com>
      Cc: Eduard Zingerman <eddyz87@gmail.com>
      Cc: Haixin Yu <yuhaixin.yhx@linux.alibaba.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Liam Howlett <liam.howlett@oracle.com>
      Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20230904095104.1162928-4-james.clark@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
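The behaviour this commit describes, replacing every occurrence of one character with a multi-character string, can be illustrated in a few lines. This is a Python sketch of the idea only, not the C helper added to perf. The name `replace_chars` and the sample input are hypothetical.

```python
def replace_chars(text, needle, replacement):
    """Replace every occurrence of the single character `needle` in `text`
    with the (possibly longer) string `replacement`."""
    if len(needle) != 1:
        raise ValueError("needle must be exactly one character")
    pieces = []
    for ch in text:
        # Copy characters through unchanged unless they match the needle.
        pieces.append(replacement if ch == needle else ch)
    return "".join(pieces)


# Hypothetical use: escape '-' so the result can be embedded in an expression.
print(replace_chars("a-b-c", "-", "\\-"))
```

Because the replacement may be longer than the character it replaces, a C version has to compute the new length and allocate a fresh buffer, which is why a dedicated helper is useful there.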
    • perf jevents: Remove unused keyword · f561fc78
      James Clark authored
      'cpuid_not_more_than' was the working title of the new
      'strcmp_cpuid_str' keyword and was accidentally left in. It was never
      used so tidying it up has no effect.
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: James Clark <james.clark@arm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Chen Zhongjin <chenzhongjin@huawei.com>
      Cc: Eduard Zingerman <eddyz87@gmail.com>
      Cc: Haixin Yu <yuhaixin.yhx@linux.alibaba.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Liam Howlett <liam.howlett@oracle.com>
      Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Miguel Ojeda <ojeda@kernel.org>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20230904095104.1162928-3-james.clark@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf test: Check result of has_event(cycles) test · d19a353c
      James Clark authored
      Currently the function always returns 0, so even when the has_event()
      test fails, the test still passes. Fix it by returning ret instead.
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: James Clark <james.clark@arm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Chen Zhongjin <chenzhongjin@huawei.com>
      Cc: Eduard Zingerman <eddyz87@gmail.com>
      Cc: Haixin Yu <yuhaixin.yhx@linux.alibaba.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Garry <john.g.garry@oracle.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Liam Howlett <liam.howlett@oracle.com>
      Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Miguel Ojeda <ojeda@kernel.org>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: https://lore.kernel.org/r/20230904095104.1162928-2-james.clark@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
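The bug pattern fixed here, computing a result and then returning a constant, is shown in miniature below. Both functions and the `has_event` callback are hypothetical stand-ins written in Python, not the perf test code.

```python
def check_has_event_buggy(has_event):
    """Computes a result but always reports success."""
    ret = 0 if has_event("cycles") else -1
    return 0  # bug: ret is discarded, so a failure can never be reported


def check_has_event_fixed(has_event):
    """Propagates the computed result to the caller."""
    ret = 0 if has_event("cycles") else -1
    return ret


def never_available(name):
    """Stand-in for a failing has_event() check."""
    return False


print(check_has_event_buggy(never_available))
print(check_has_event_fixed(never_available))
```

With the failing stand-in, only the fixed variant returns a non-zero value.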
    • perf list pfm: Retry supported test with exclude_kernel · 6bd8c2ea
      Ian Rogers authored
      With paranoia set at 2 evsel__open will fail with EACCES for non-root
      users. To avoid this stopping libpfm4 events from being printed, retry
      with exclude_kernel enabled - copying the regular is_event_supported
      test.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Kang Minchul <tegongkang@gmail.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
      Link: https://lore.kernel.org/r/20230906234416.3472339-2-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
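The retry described in this commit follows a common pattern: try to open the event, and on EACCES try once more with kernel sampling excluded. The sketch below models that control flow in Python. `open_event` is a hypothetical callable standing in for evsel__open, and the attribute dictionary is a simplification of the real event attributes.

```python
import errno


def is_event_supported(open_event, attr):
    """Return True if the event opens, retrying with exclude_kernel on EACCES.

    open_event is a hypothetical callable that raises OSError on failure.
    """
    try:
        open_event(attr)
        return True
    except OSError as err:
        if err.errno != errno.EACCES:
            return False
    # Permission problem: retry without kernel-space sampling.
    retry_attr = dict(attr, exclude_kernel=True)
    try:
        open_event(retry_attr)
        return True
    except OSError:
        return False


def fake_restricted_open(attr):
    """Simulates a non-root user who may only open user-space events."""
    if not attr.get("exclude_kernel"):
        raise OSError(errno.EACCES, "kernel profiling not permitted")


print(is_event_supported(fake_restricted_open, {"exclude_kernel": False}))
```

Errors other than EACCES are not retried, since excluding the kernel would not help with them.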
    • perf list: Avoid a hardcoded cpu PMU name · 4f19fc18
      Ian Rogers authored
      Use the first core PMU instead.
      
      On a Raspberry Pi, before:
      
        $ perf list
        ...
          cpu/t1=v1[,t2=v2,t3 ...]/modifier                  [Raw hardware event descriptor]
               [(see 'man perf-list' on how to encode it)]
        ...
      
      After:
      
        $ perf list
        ...
          armv8_cortex_a72/t1=v1[,t2=v2,t3 ...]/modifier     [Raw hardware event descriptor]
               [(see 'man perf-list' on how to encode it)]
        ...
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Kang Minchul <tegongkang@gmail.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
      Link: https://lore.kernel.org/r/20230906234416.3472339-1-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf test shell lock_contention: Add cgroup aggregation and filter tests · e44b47b9
      Namhyung Kim authored
      Add cgroup aggregation and filter tests.
      
        $ sudo ./perf test -v contention
         84: kernel lock contention analysis test                            :
        --- start ---
        test child forked, pid 222423
        Testing perf lock record and perf lock contention
        Testing perf lock contention --use-bpf
        Testing perf lock record and perf lock contention at the same time
        Testing perf lock contention --threads
        Testing perf lock contention --lock-addr
        Testing perf lock contention --lock-cgroup
        Testing perf lock contention --type-filter (w/ spinlock)
        Testing perf lock contention --lock-filter (w/ tasklist_lock)
        Testing perf lock contention --callstack-filter (w/ unix_stream)
        Testing perf lock contention --callstack-filter with task aggregation
        Testing perf lock contention --cgroup-filter
        Testing perf lock contention CSV output
        test child finished with 0
        ---- end ----
        kernel lock contention analysis test: Ok
      
      Committer testing:
      
        [root@quaco ~]# uname -a
        Linux quaco 6.4.10-200.fc38.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Aug 11 12:20:29 UTC 2023 x86_64 GNU/Linux
        [root@quaco ~]# perf test -v contention
         84: kernel lock contention analysis test                            :
        --- start ---
        test child forked, pid 452625
        Testing perf lock record and perf lock contention
        Testing perf lock contention --use-bpf
        Testing perf lock record and perf lock contention at the same time
        Testing perf lock contention --threads
        Testing perf lock contention --lock-addr
        Testing perf lock contention --lock-cgroup
        Testing perf lock contention --type-filter (w/ spinlock)
        Testing perf lock contention --lock-filter (w/ tasklist_lock)
        Testing perf lock contention --callstack-filter (w/ unix_stream)
        Testing perf lock contention --callstack-filter with task aggregation
        Testing perf lock contention --cgroup-filter
        Testing perf lock contention CSV output
        test child finished with 0
        ---- end ----
        kernel lock contention analysis test: Ok
        [root@quaco ~]#
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Hao Luo <haoluo@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <song@kernel.org>
      Cc: bpf@vger.kernel.org
      Link: https://lore.kernel.org/r/20230906174903.346486-6-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf lock contention: Add -G/--cgroup-filter option · 4fd06bd2
      Namhyung Kim authored
      The -G/--cgroup-filter option limits lock contention collection to
      tasks in the specified cgroups.
      
        $ sudo ./perf lock con -abt -G /user.slice/.../vte-spawn-52221fb8-b33f-4a52-b5c3-e35d1e6fc0e0.scope \
          ./perf bench sched messaging
        # Running 'sched/messaging' benchmark:
        # 20 sender and receiver processes per group
        # 10 groups == 400 processes run
      
             Total time: 0.174 [sec]
         contended   total wait     max wait     avg wait          pid   comm
      
                 4    114.45 us     60.06 us     28.61 us       214847   sched-messaging
                 2    111.40 us     60.84 us     55.70 us       214848   sched-messaging
                 2    106.09 us     59.42 us     53.04 us       214837   sched-messaging
                 1     81.70 us     81.70 us     81.70 us       214709   sched-messaging
                68     78.44 us      6.83 us      1.15 us       214633   sched-messaging
                69     73.71 us      2.69 us      1.07 us       214632   sched-messaging
                 4     72.62 us     60.83 us     18.15 us       214850   sched-messaging
                 2     71.75 us     67.60 us     35.88 us       214840   sched-messaging
                 2     69.29 us     67.53 us     34.65 us       214804   sched-messaging
                 2     69.00 us     68.23 us     34.50 us       214826   sched-messaging
        ...
      
      Export cgroup__new() function as it's needed from outside.
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Hao Luo <haoluo@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <song@kernel.org>
      Cc: bpf@vger.kernel.org
      Link: https://lore.kernel.org/r/20230906174903.346486-5-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf lock contention: Add --lock-cgroup option · 4d1792d0
      Namhyung Kim authored
      The --lock-cgroup option shows lock contention stats broken down by
      cgroup.
      
      Add LOCK_AGGR_CGROUP mode and use it instead of use_cgroup field.
      
        $ sudo ./perf lock con -ab --lock-cgroup sleep 1
         contended   total wait     max wait     avg wait   cgroup
      
                 8     15.70 us      6.34 us      1.96 us   /
                 2      1.48 us       747 ns       738 ns   /user.slice/.../app.slice/app-gnome-google\x2dchrome-6442.scope
                 1       848 ns       848 ns       848 ns   /user.slice/.../session.slice/org.gnome.Shell@x11.service
                 1       220 ns       220 ns       220 ns   /user.slice/.../session.slice/pipewire-pulse.service
      
      For now, the cgroup mode only works with BPF (-b).
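
As a rough illustration of what aggregating by cgroup means for the columns above, the sketch below keeps one record per cgroup id and updates the count, total, and maximum on each contention event. The names `cgroup_table`, `cgroup_stat_add`, and `cgroup_stat_avg_ns` are hypothetical, and the fixed-size array merely stands in for whatever map the BPF program uses; this is not perf's implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-cgroup contention record (not perf's actual struct). */
struct cgroup_stat {
	uint64_t cgroup_id;  /* aggregation key */
	uint64_t contended;  /* number of contention events */
	uint64_t total_ns;   /* sum of wait times */
	uint64_t max_ns;     /* longest single wait */
};

#define MAX_CGROUPS 16

/* Assumption: a small fixed array stands in for a real hash map. */
struct cgroup_table {
	struct cgroup_stat stats[MAX_CGROUPS];
	size_t nr;
};

/* Find the record for cgroup_id, creating it if needed; NULL when full. */
static struct cgroup_stat *cgroup_stat_get(struct cgroup_table *t,
					   uint64_t cgroup_id)
{
	for (size_t i = 0; i < t->nr; i++)
		if (t->stats[i].cgroup_id == cgroup_id)
			return &t->stats[i];
	if (t->nr == MAX_CGROUPS)
		return NULL;
	t->stats[t->nr] = (struct cgroup_stat){ .cgroup_id = cgroup_id };
	return &t->stats[t->nr++];
}

/* Account one contention event; returns 0 on success, -1 if the table is full. */
static int cgroup_stat_add(struct cgroup_table *t, uint64_t cgroup_id,
			   uint64_t wait_ns)
{
	struct cgroup_stat *s = cgroup_stat_get(t, cgroup_id);

	if (!s)
		return -1;
	s->contended++;
	s->total_ns += wait_ns;
	if (wait_ns > s->max_ns)
		s->max_ns = wait_ns;
	return 0;
}

/* Average wait is total wait divided by the number of events. */
static uint64_t cgroup_stat_avg_ns(const struct cgroup_stat *s)
{
	return s->contended ? s->total_ns / s->contended : 0;
}
```

Feeding two waits of 747 ns and 729 ns for the same cgroup id would give a record with 2 contended events, a 747 ns maximum, and a 738 ns average; these inputs are illustrative and are not taken from the trace above.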
      
      Committer notes:
      
      Remove -g as it is used in the other tools with a clear meaning of
      collect/show callchains. As agreed with Namhyung off list.
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Hao Luo <haoluo@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <song@kernel.org>
      Cc: bpf@vger.kernel.org
      Link: https://lore.kernel.org/r/20230906174903.346486-4-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      4d1792d0
    • Namhyung Kim's avatar
      perf lock contention: Prepare to handle cgroups · d0c502e4
      Namhyung Kim authored
      Save cgroup info and display cgroup names if requested.  This is a
      preparation for the next patch.
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Hao Luo <haoluo@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <song@kernel.org>
      Cc: bpf@vger.kernel.org
      Link: https://lore.kernel.org/r/20230906174903.346486-3-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      d0c502e4
    • Namhyung Kim's avatar
      perf tools: Add read_all_cgroups() and __cgroup_find() · 2bc12abc
      Namhyung Kim authored
      read_all_cgroups() builds a tree of the cgroups in the system, and
      users can look up a cgroup in it using __cgroup_find().
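
The build-then-look-up pattern described here can be sketched as follows. To keep the example short, a sorted array searched with `bsearch` stands in for the tree, and the names `cgroup_entry`, `sort_cgroups`, and `find_cgroup` are hypothetical. It illustrates lookup by cgroup id only and is not the code added by this patch.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical entry: a cgroup id paired with its path name. */
struct cgroup_entry {
	uint64_t id;
	const char *name;
};

/* Order entries by id; avoids overflow from subtracting 64-bit values. */
static int cmp_entry(const void *a, const void *b)
{
	const struct cgroup_entry *x = a, *y = b;

	return (x->id > y->id) - (x->id < y->id);
}

/* "Build" step: sort once so that later lookups are O(log n). */
static void sort_cgroups(struct cgroup_entry *entries, size_t nr)
{
	qsort(entries, nr, sizeof(*entries), cmp_entry);
}

/* "Find" step: binary search by id; returns NULL when the id is unknown. */
static const struct cgroup_entry *find_cgroup(const struct cgroup_entry *entries,
					      size_t nr, uint64_t id)
{
	struct cgroup_entry key = { .id = id, .name = NULL };

	return bsearch(&key, entries, nr, sizeof(*entries), cmp_entry);
}
```

A caller would fill the array once, call `sort_cgroups()`, and then resolve each id seen in the samples with `find_cgroup()`, treating a NULL result as an unknown cgroup.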
      
      Committer notes:
      
      Had to do this to cover that #else block:
      
        -static inline u64 __read_cgroup_id(const char *path) { return -1ULL; }
        +static inline u64 __read_cgroup_id(const char *path __maybe_unused) { return -1ULL; }
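
For readers unfamiliar with the annotation, the snippet below shows the same idea in a standalone form: a stub whose parameter is intentionally unused is marked so that the compiler does not warn about it. `MAYBE_UNUSED` and `read_cgroup_id_stub` are hypothetical names chosen for this sketch; it assumes a GCC- or Clang-compatible compiler and that perf's `__maybe_unused` expands to the same attribute.

```c
#include <stdint.h>

/* Hypothetical local macro; assumed equivalent to perf's __maybe_unused. */
#define MAYBE_UNUSED __attribute__((unused))

/*
 * Stub for a build without cgroup support: the path is never read, so the
 * parameter is annotated to keep unused-parameter warnings quiet.
 */
static inline uint64_t read_cgroup_id_stub(const char *path MAYBE_UNUSED)
{
	return (uint64_t)-1; /* all 64 bits set, used as "no id" */
}
```
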
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Hao Luo <haoluo@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <song@kernel.org>
      Cc: bpf@vger.kernel.org
      Link: https://lore.kernel.org/r/20230906174903.346486-2-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      2bc12abc
    • Yang Jihong's avatar
      perf kwork top: Add BPF-based statistics on softirq event support · 36019dff
      Yang Jihong authored
      Use BPF to collect statistics on softirq events based on perf BPF skeletons.
      
      Example usage:
      
        # perf kwork top -b
        Starting trace, Hit <Ctrl+C> to stop and report
        ^C
        Total  : 135445.704 ms, 8 cpus
        %Cpu(s):  28.35% id,   0.00% hi,   0.25% si
        %Cpu0   [||||||||||||||||||||            69.85%]
        %Cpu1   [||||||||||||||||||||||          74.10%]
        %Cpu2   [|||||||||||||||||||||           71.18%]
        %Cpu3   [||||||||||||||||||||            69.61%]
        %Cpu4   [||||||||||||||||||||||          74.05%]
        %Cpu5   [||||||||||||||||||||            69.33%]
        %Cpu6   [||||||||||||||||||||            69.71%]
        %Cpu7   [||||||||||||||||||||||          73.77%]
      
              PID     SPID    %CPU           RUNTIME  COMMMAND
          -------------------------------------------------------------
                0        0   30.43       5271.005 ms  [swapper/5]
                0        0   30.17       5226.644 ms  [swapper/3]
                0        0   30.08       5210.257 ms  [swapper/6]
                0        0   29.89       5177.177 ms  [swapper/0]
                0        0   28.51       4938.672 ms  [swapper/2]
                0        0   25.93       4223.464 ms  [swapper/7]
                0        0   25.69       4181.411 ms  [swapper/4]
                0        0   25.63       4173.804 ms  [swapper/1]
            16665    16265    2.16        360.600 ms  sched-messaging
            16537    16265    2.05        356.275 ms  sched-messaging
            16503    16265    2.01        343.063 ms  sched-messaging
            16424    16265    1.97        336.876 ms  sched-messaging
            16580    16265    1.94        323.658 ms  sched-messaging
            16515    16265    1.92        321.616 ms  sched-messaging
            16659    16265    1.91        325.538 ms  sched-messaging
            16634    16265    1.88        327.766 ms  sched-messaging
            16454    16265    1.87        326.843 ms  sched-messaging
            16382    16265    1.87        322.591 ms  sched-messaging
            16642    16265    1.86        320.506 ms  sched-messaging
            16582    16265    1.86        320.164 ms  sched-messaging
            16315    16265    1.86        326.872 ms  sched-messaging
            16637    16265    1.85        323.766 ms  sched-messaging
            16506    16265    1.82        311.688 ms  sched-messaging
            16512    16265    1.81        304.643 ms  sched-messaging
            16560    16265    1.80        314.751 ms  sched-messaging
            16320    16265    1.80        313.405 ms  sched-messaging
            16442    16265    1.80        314.403 ms  sched-messaging
            16626    16265    1.78        295.380 ms  sched-messaging
            16600    16265    1.77        309.444 ms  sched-messaging
            16550    16265    1.76        301.161 ms  sched-messaging
            16525    16265    1.75        296.560 ms  sched-messaging
            16314    16265    1.75        298.338 ms  sched-messaging
            16595    16265    1.74        304.390 ms  sched-messaging
            16555    16265    1.74        287.564 ms  sched-messaging
            16520    16265    1.74        295.734 ms  sched-messaging
            16507    16265    1.73        293.956 ms  sched-messaging
            16593    16265    1.72        296.443 ms  sched-messaging
            16531    16265    1.72        299.950 ms  sched-messaging
            16281    16265    1.72        301.339 ms  sched-messaging
        <SNIP>
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Link: https://lore.kernel.org/r/20230812084917.169338-17-yangjihong1@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      36019dff
    • Yang Jihong's avatar
      perf kwork top: Add BPF-based statistics on hardirq event support · d2956b3a
      Yang Jihong authored
      Use BPF to collect statistics on hardirq events based on perf BPF skeletons.
      
      Example usage:
      
        # perf kwork top -k sched,irq -b
        Starting trace, Hit <Ctrl+C> to stop and report
        ^C
        Total  : 136717.945 ms, 8 cpus
        %Cpu(s):  17.10% id,   0.01% hi,   0.00% si
        %Cpu0   [|||||||||||||||||||||||||       84.26%]
        %Cpu1   [|||||||||||||||||||||||||       84.77%]
        %Cpu2   [||||||||||||||||||||||||        83.22%]
        %Cpu3   [||||||||||||||||||||||||        80.37%]
        %Cpu4   [||||||||||||||||||||||||        81.49%]
        %Cpu5   [|||||||||||||||||||||||||       84.68%]
        %Cpu6   [|||||||||||||||||||||||||       84.48%]
        %Cpu7   [||||||||||||||||||||||||        80.21%]
      
              PID     SPID    %CPU           RUNTIME  COMMMAND
          -------------------------------------------------------------
                0        0   19.78       3482.833 ms  [swapper/7]
                0        0   19.62       3454.219 ms  [swapper/3]
                0        0   18.50       3258.339 ms  [swapper/4]
                0        0   16.76       2842.749 ms  [swapper/2]
                0        0   15.71       2627.905 ms  [swapper/0]
                0        0   15.51       2598.206 ms  [swapper/6]
                0        0   15.31       2561.820 ms  [swapper/5]
                0        0   15.22       2548.708 ms  [swapper/1]
            13253    13018    2.95        513.108 ms  sched-messaging
            13092    13018    2.67        454.167 ms  sched-messaging
            13401    13018    2.66        454.790 ms  sched-messaging
            13240    13018    2.64        454.587 ms  sched-messaging
            13251    13018    2.61        442.273 ms  sched-messaging
            13075    13018    2.61        438.932 ms  sched-messaging
            13220    13018    2.60        443.245 ms  sched-messaging
            13235    13018    2.59        443.268 ms  sched-messaging
            13222    13018    2.50        426.344 ms  sched-messaging
            13410    13018    2.49        426.191 ms  sched-messaging
            13228    13018    2.46        425.121 ms  sched-messaging
            13379    13018    2.38        409.950 ms  sched-messaging
            13236    13018    2.37        413.159 ms  sched-messaging
            13095    13018    2.36        396.572 ms  sched-messaging
            13325    13018    2.35        408.089 ms  sched-messaging
            13242    13018    2.32        394.750 ms  sched-messaging
            13386    13018    2.31        396.997 ms  sched-messaging
            13046    13018    2.29        383.833 ms  sched-messaging
            13109    13018    2.28        388.482 ms  sched-messaging
            13388    13018    2.28        393.576 ms  sched-messaging
            13238    13018    2.26        388.487 ms  sched-messaging
        <SNIP>
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Link: https://lore.kernel.org/r/20230812084917.169338-16-yangjihong1@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      d2956b3a
    • Yang Jihong's avatar
      perf kwork top: Implements BPF-based cpu usage statistics · 8c984209
      Yang Jihong authored
      Use BPF to collect statistics on the CPU usage based on perf BPF skeletons.
      
      Example usage:
      
        # perf kwork top -h
      
         Usage: perf kwork top [<options>]
      
            -b, --use-bpf         Use BPF to measure task cpu usage
            -C, --cpu <cpu>       list of cpus to profile
            -i, --input <file>    input file name
            -n, --name <name>     event name to profile
            -s, --sort <key[,key2...]>
                                  sort by key(s): rate, runtime, tid
                --time <str>      Time span for analysis (start,stop)
      
        #
        # perf kwork -k sched top -b
        Starting trace, Hit <Ctrl+C> to stop and report
        ^C
        Total  : 160702.425 ms, 8 cpus
        %Cpu(s):  36.00% id,   0.00% hi,   0.00% si
        %Cpu0   [||||||||||||||||||              61.66%]
        %Cpu1   [||||||||||||||||||              61.27%]
        %Cpu2   [|||||||||||||||||||             66.40%]
        %Cpu3   [||||||||||||||||||              61.28%]
        %Cpu4   [||||||||||||||||||              61.82%]
        %Cpu5   [|||||||||||||||||||||||         77.41%]
        %Cpu6   [||||||||||||||||||              61.73%]
        %Cpu7   [||||||||||||||||||              63.25%]
      
              PID     SPID    %CPU           RUNTIME  COMMMAND
          -------------------------------------------------------------
                0        0   38.72       8089.463 ms  [swapper/1]
                0        0   38.71       8084.547 ms  [swapper/3]
                0        0   38.33       8007.532 ms  [swapper/0]
                0        0   38.26       7992.985 ms  [swapper/6]
                0        0   38.17       7971.865 ms  [swapper/4]
                0        0   36.74       7447.765 ms  [swapper/7]
                0        0   33.59       6486.942 ms  [swapper/2]
                0        0   22.58       3771.268 ms  [swapper/5]
             9545     9351    2.48        447.136 ms  sched-messaging
             9574     9351    2.09        418.583 ms  sched-messaging
             9724     9351    2.05        372.407 ms  sched-messaging
             9531     9351    2.01        368.804 ms  sched-messaging
             9512     9351    2.00        362.250 ms  sched-messaging
             9514     9351    1.95        357.767 ms  sched-messaging
             9538     9351    1.86        384.476 ms  sched-messaging
             9712     9351    1.84        386.490 ms  sched-messaging
             9723     9351    1.83        380.021 ms  sched-messaging
             9722     9351    1.82        382.738 ms  sched-messaging
             9517     9351    1.81        354.794 ms  sched-messaging
             9559     9351    1.79        344.305 ms  sched-messaging
             9725     9351    1.77        365.315 ms  sched-messaging
        <SNIP>
      
        # perf kwork -k sched top -b -n perf
        Starting trace, Hit <Ctrl+C> to stop and report
        ^C
        Total  : 151563.332 ms, 8 cpus
        %Cpu(s):  26.49% id,   0.00% hi,   0.00% si
        %Cpu0   [                                 0.01%]
        %Cpu1   [                                 0.00%]
        %Cpu2   [                                 0.00%]
        %Cpu3   [                                 0.00%]
        %Cpu4   [                                 0.00%]
        %Cpu5   [                                 0.00%]
        %Cpu6   [                                 0.00%]
        %Cpu7   [                                 0.00%]
      
              PID     SPID    %CPU           RUNTIME  COMMMAND
          -------------------------------------------------------------
             9754     9754    0.01          2.303 ms  perf
      
        #
        # perf kwork -k sched top -b -C 2,3,4
        Starting trace, Hit <Ctrl+C> to stop and report
        ^C
        Total  :  48016.721 ms, 3 cpus
        %Cpu(s):  27.82% id,   0.00% hi,   0.00% si
        %Cpu2   [||||||||||||||||||||||          74.68%]
        %Cpu3   [|||||||||||||||||||||           71.06%]
        %Cpu4   [|||||||||||||||||||||           70.91%]
      
              PID     SPID    %CPU           RUNTIME  COMMMAND
          -------------------------------------------------------------
                0        0   29.08       4734.998 ms  [swapper/4]
                0        0   28.93       4710.029 ms  [swapper/3]
                0        0   25.31       3912.363 ms  [swapper/2]
            10248    10158    1.62        264.931 ms  sched-messaging
            10253    10158    1.62        265.136 ms  sched-messaging
            10158    10158    1.60        263.013 ms  bash
            10360    10158    1.49        243.639 ms  sched-messaging
            10413    10158    1.48        238.604 ms  sched-messaging
            10531    10158    1.47        234.067 ms  sched-messaging
            10400    10158    1.47        240.631 ms  sched-messaging
            10355    10158    1.47        230.586 ms  sched-messaging
            10377    10158    1.43        234.835 ms  sched-messaging
            10526    10158    1.42        232.045 ms  sched-messaging
            10298    10158    1.41        222.396 ms  sched-messaging
            10410    10158    1.38        221.853 ms  sched-messaging
            10364    10158    1.38        226.042 ms  sched-messaging
            10480    10158    1.36        213.633 ms  sched-messaging
            10370    10158    1.36        223.620 ms  sched-messaging
            10553    10158    1.34        217.169 ms  sched-messaging
            10291    10158    1.34        211.516 ms  sched-messaging
            10251    10158    1.34        218.813 ms  sched-messaging
            10522    10158    1.33        218.498 ms  sched-messaging
            10288    10158    1.33        216.787 ms  sched-messaging
        <SNIP>
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Link: https://lore.kernel.org/r/20230812084917.169338-15-yangjihong1@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      8c984209
    • Yang Jihong's avatar
      perf kwork top: Add -C/--cpu -i/--input -n/--name -s/--sort --time options · aa172a5a
      Yang Jihong authored
      Provide the following options for perf kwork top:
      
      1. -C, --cpu <cpu>		list of cpus to profile
      2. -i, --input <file>		input file name
      3. -n, --name <name>		event name to profile
      4. -s, --sort <key[,key2...]>	sort by key(s): rate, runtime, tid
      5. --time <str>		Time span for analysis (start,stop)
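
To make the `start,stop` form of the --time argument concrete, here is a minimal parsing sketch in which either side may be left empty, so a string with a trailing comma means "from start to the end of the data". The struct `time_span` and the function `parse_time_span` are hypothetical names for this example; perf has its own parser, which this does not reproduce.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type: each bound is optional. */
struct time_span {
	double start;
	double stop;
	bool has_start;
	bool has_stop;
};

/*
 * Parse "start,stop", where either side may be empty.
 * Returns 0 on success and -1 on malformed input.
 */
static int parse_time_span(const char *str, struct time_span *span)
{
	const char *comma = strchr(str, ',');
	char *end;

	if (!comma)
		return -1;
	memset(span, 0, sizeof(*span));

	if (comma != str) {
		/* Text before the comma must be exactly one number. */
		span->start = strtod(str, &end);
		if (end != comma)
			return -1;
		span->has_start = true;
	}
	if (comma[1] != '\0') {
		/* Text after the comma must be exactly one number. */
		span->stop = strtod(comma + 1, &end);
		if (end == comma + 1 || *end != '\0')
			return -1;
		span->has_stop = true;
	}
	/* Reject a window that ends before it starts. */
	if (span->has_start && span->has_stop && span->stop < span->start)
		return -1;
	return 0;
}
```

With this sketch, `parse_time_span("128800,", &span)` succeeds with only the start bound set, while a string without a comma is rejected.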
      
      Example usage:
      
        # perf kwork top -h
      
         Usage: perf kwork top [<options>]
      
            -C, --cpu <cpu>       list of cpus to profile
            -i, --input <file>    input file name
            -n, --name <name>     event name to profile
            -s, --sort <key[,key2...]>
                                  sort by key(s): rate, runtime, tid
                --time <str>      Time span for analysis (start,stop)
      
        # perf kwork top -C 2,4,5
      
        Total  :  51226.940 ms, 3 cpus
        %Cpu(s):  92.59% id,   0.00% hi,   0.09% si
        %Cpu2   [|                                4.61%]
        %Cpu4   [                                 0.01%]
        %Cpu5   [|||||                           17.31%]
      
              PID    %CPU           RUNTIME  COMMMAND
          ----------------------------------------------------
                0   99.98      17073.515 ms  swapper/4
                0   95.17      16250.874 ms  swapper/2
                0   82.62      14108.577 ms  swapper/5
             4342   21.70       3708.358 ms  perf
               16    0.13         22.296 ms  rcu_preempt
               75    0.02          4.261 ms  kworker/2:1
               98    0.01          2.540 ms  jbd2/sda-8
               61    0.01          3.404 ms  kcompactd0
               87    0.00          0.145 ms  kworker/5:1H
               73    0.00          0.596 ms  kworker/5:1
               41    0.00          0.041 ms  ksoftirqd/5
               40    0.00          0.718 ms  migration/5
               64    0.00          0.115 ms  kworker/4:1
               35    0.00          0.556 ms  migration/4
              353    0.00          1.143 ms  sshd
               26    0.00          1.665 ms  ksoftirqd/2
               25    0.00          0.662 ms  migration/2
      
        # perf kwork top -i perf.data
      
        Total  : 136601.588 ms, 8 cpus
        %Cpu(s):  95.66% id,   0.04% hi,   0.05% si
        %Cpu0   [                                 0.02%]
        %Cpu1   [                                 0.01%]
        %Cpu2   [|                                4.61%]
        %Cpu3   [                                 0.04%]
        %Cpu4   [                                 0.01%]
        %Cpu5   [|||||                           17.31%]
        %Cpu6   [                                 0.51%]
        %Cpu7   [|||                             11.42%]
      
              PID    %CPU           RUNTIME  COMMMAND
          ----------------------------------------------------
                0   99.98      17073.515 ms  swapper/4
                0   99.98      17072.173 ms  swapper/1
                0   99.93      17064.229 ms  swapper/3
                0   99.62      17011.013 ms  swapper/0
                0   99.47      16985.180 ms  swapper/6
                0   95.17      16250.874 ms  swapper/2
                0   88.51      15111.684 ms  swapper/7
                0   82.62      14108.577 ms  swapper/5
             4342   33.00       5644.045 ms  perf
             4344    0.43         74.351 ms  perf
               16    0.13         22.296 ms  rcu_preempt
             4345    0.05         10.093 ms  perf
             4343    0.05          8.769 ms  perf
             4341    0.02          4.882 ms  perf
             4095    0.02          4.605 ms  kworker/7:1
               75    0.02          4.261 ms  kworker/2:1
              120    0.01          1.909 ms  systemd-journal
               98    0.01          2.540 ms  jbd2/sda-8
               61    0.01          3.404 ms  kcompactd0
              667    0.01          2.542 ms  kworker/u16:2
             4340    0.00          1.052 ms  kworker/7:2
               97    0.00          0.489 ms  kworker/7:1H
               51    0.00          0.209 ms  ksoftirqd/7
               50    0.00          0.646 ms  migration/7
               76    0.00          0.753 ms  kworker/6:1
               45    0.00          0.572 ms  migration/6
               87    0.00          0.145 ms  kworker/5:1H
               73    0.00          0.596 ms  kworker/5:1
               41    0.00          0.041 ms  ksoftirqd/5
               40    0.00          0.718 ms  migration/5
               64    0.00          0.115 ms  kworker/4:1
               35    0.00          0.556 ms  migration/4
              353    0.00          2.600 ms  sshd
               74    0.00          0.205 ms  kworker/3:1
               33    0.00          1.576 ms  kworker/3:0H
               30    0.00          0.996 ms  migration/3
               26    0.00          1.665 ms  ksoftirqd/2
               25    0.00          0.662 ms  migration/2
              397    0.00          0.057 ms  kworker/1:1
               20    0.00          1.005 ms  migration/1
             2909    0.00          1.053 ms  kworker/0:2
               17    0.00          0.720 ms  migration/0
               15    0.00          0.039 ms  ksoftirqd/0
      
        # perf kwork top -n perf
      
        Total  : 136601.588 ms, 8 cpus
        %Cpu(s):  95.66% id,   0.04% hi,   0.05% si
        %Cpu0   [                                 0.01%]
        %Cpu1   [                                 0.00%]
        %Cpu2   [|                                4.44%]
        %Cpu3   [                                 0.00%]
        %Cpu4   [                                 0.00%]
        %Cpu5   [                                 0.00%]
        %Cpu6   [                                 0.49%]
        %Cpu7   [|||                             11.38%]
      
              PID    %CPU           RUNTIME  COMMMAND
          ----------------------------------------------------
             4342   15.74       2695.516 ms  perf
             4344    0.43         74.351 ms  perf
             4345    0.05         10.093 ms  perf
             4343    0.05          8.769 ms  perf
             4341    0.02          4.882 ms  perf
      
        # perf kwork top -s tid
      
        Total  : 136601.588 ms, 8 cpus
        %Cpu(s):  95.66% id,   0.04% hi,   0.05% si
        %Cpu0   [                                 0.02%]
        %Cpu1   [                                 0.01%]
        %Cpu2   [|                                4.61%]
        %Cpu3   [                                 0.04%]
        %Cpu4   [                                 0.01%]
        %Cpu5   [|||||                           17.31%]
        %Cpu6   [                                 0.51%]
        %Cpu7   [|||                             11.42%]
      
              PID    %CPU           RUNTIME  COMMMAND
          ----------------------------------------------------
                0   99.62      17011.013 ms  swapper/0
                0   99.98      17072.173 ms  swapper/1
                0   95.17      16250.874 ms  swapper/2
                0   99.93      17064.229 ms  swapper/3
                0   99.98      17073.515 ms  swapper/4
                0   82.62      14108.577 ms  swapper/5
                0   99.47      16985.180 ms  swapper/6
                0   88.51      15111.684 ms  swapper/7
               15    0.00          0.039 ms  ksoftirqd/0
               16    0.13         22.296 ms  rcu_preempt
               17    0.00          0.720 ms  migration/0
               20    0.00          1.005 ms  migration/1
               25    0.00          0.662 ms  migration/2
               26    0.00          1.665 ms  ksoftirqd/2
               30    0.00          0.996 ms  migration/3
               33    0.00          1.576 ms  kworker/3:0H
               35    0.00          0.556 ms  migration/4
               40    0.00          0.718 ms  migration/5
               41    0.00          0.041 ms  ksoftirqd/5
               45    0.00          0.572 ms  migration/6
               50    0.00          0.646 ms  migration/7
               51    0.00          0.209 ms  ksoftirqd/7
               61    0.01          3.404 ms  kcompactd0
               64    0.00          0.115 ms  kworker/4:1
               73    0.00          0.596 ms  kworker/5:1
               74    0.00          0.205 ms  kworker/3:1
               75    0.02          4.261 ms  kworker/2:1
               76    0.00          0.753 ms  kworker/6:1
               87    0.00          0.145 ms  kworker/5:1H
               97    0.00          0.489 ms  kworker/7:1H
               98    0.01          2.540 ms  jbd2/sda-8
              120    0.01          1.909 ms  systemd-journal
              353    0.00          2.600 ms  sshd
              397    0.00          0.057 ms  kworker/1:1
              667    0.01          2.542 ms  kworker/u16:2
             2909    0.00          1.053 ms  kworker/0:2
             4095    0.02          4.605 ms  kworker/7:1
             4340    0.00          1.052 ms  kworker/7:2
             4341    0.02          4.882 ms  perf
             4342   33.00       5644.045 ms  perf
             4343    0.05          8.769 ms  perf
             4344    0.43         74.351 ms  perf
             4345    0.05         10.093 ms  perf
      
        # perf kwork top --time 128800,
      
        Total  :  53495.122 ms, 8 cpus
        %Cpu(s):  94.71% id,   0.09% hi,   0.09% si
        %Cpu0   [                                 0.07%]
        %Cpu1   [                                 0.04%]
        %Cpu2   [||                               8.49%]
        %Cpu3   [                                 0.09%]
        %Cpu4   [                                 0.02%]
        %Cpu5   [                                 0.06%]
        %Cpu6   [                                 0.12%]
        %Cpu7   [||||||                          21.24%]
      
              PID    %CPU           RUNTIME  COMMMAND
          ----------------------------------------------------
                0   99.96       3981.363 ms  swapper/4
                0   99.94       3978.955 ms  swapper/1
                0   99.91       9329.375 ms  swapper/5
                0   99.87       4906.829 ms  swapper/3
                0   99.86       9028.064 ms  swapper/6
                0   98.67       3928.161 ms  swapper/0
                0   91.17       8388.432 ms  swapper/2
                0   78.65       7125.602 ms  swapper/7
             4342   29.42       2675.198 ms  perf
               16    0.18         16.817 ms  rcu_preempt
             4345    0.09          8.183 ms  perf
             4344    0.04          4.290 ms  perf
             4343    0.03          2.844 ms  perf
              353    0.03          2.600 ms  sshd
             4095    0.02          2.702 ms  kworker/7:1
              120    0.02          1.909 ms  systemd-journal
               98    0.02          2.540 ms  jbd2/sda-8
               61    0.02          1.886 ms  kcompactd0
              667    0.02          1.011 ms  kworker/u16:2
               75    0.02          2.693 ms  kworker/2:1
             4341    0.01          1.838 ms  perf
               30    0.01          0.788 ms  migration/3
               26    0.01          1.665 ms  ksoftirqd/2
               20    0.01          0.752 ms  migration/1
             2909    0.01          0.604 ms  kworker/0:2
             4340    0.00          0.635 ms  kworker/7:2
               97    0.00          0.214 ms  kworker/7:1H
               51    0.00          0.209 ms  ksoftirqd/7
               50    0.00          0.646 ms  migration/7
               76    0.00          0.602 ms  kworker/6:1
               45    0.00          0.366 ms  migration/6
               87    0.00          0.145 ms  kworker/5:1H
               40    0.00          0.446 ms  migration/5
               35    0.00          0.318 ms  migration/4
               74    0.00          0.205 ms  kworker/3:1
               33    0.00          0.080 ms  kworker/3:0H
               25    0.00          0.448 ms  migration/2
              397    0.00          0.057 ms  kworker/1:1
               17    0.00          0.365 ms  migration/0
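
As a reading aid for the summary line, the numbers in the last table are consistent with the idle percentage being the summed swapper runtime divided by the total: the eight swapper rows add up to 50666.781 ms, and 50666.781 / 53495.122 is about 94.71%. This is an observation about the printed output rather than a statement of how perf kwork computes the value, and `idle_percent` below is a hypothetical helper.

```c
#include <stddef.h>

/*
 * Share of the measured time spent in the idle (swapper) tasks,
 * as a percentage. Returns 0 when total_ms is not positive.
 */
static double idle_percent(const double *idle_ms, size_t nr_cpus,
			   double total_ms)
{
	double idle = 0.0;

	for (size_t i = 0; i < nr_cpus; i++)
		idle += idle_ms[i];
	return total_ms > 0.0 ? 100.0 * idle / total_ms : 0.0;
}
```
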
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Link: https://lore.kernel.org/r/20230812084917.169338-14-yangjihong1@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      aa172a5a
    • Yang Jihong's avatar
      perf kwork top: Add statistics on softirq event support · e29090d2
      Yang Jihong authored
      Calculate the runtime of the softirq events and subtract it from
      the corresponding task runtime to improve the precision.
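
The adjustment described above amounts to a clamped subtraction, sketched here with a hypothetical helper `task_runtime_ns`; the clamp at zero is an assumption of this sketch, added so that an inconsistent pair of inputs cannot wrap around.

```c
#include <stdint.h>

/*
 * Task runtime with the time spent in softirq handlers removed.
 * sched_ns:   time the task was scheduled in
 * softirq_ns: softirq time that fell inside that window
 */
static uint64_t task_runtime_ns(uint64_t sched_ns, uint64_t softirq_ns)
{
	return sched_ns > softirq_ns ? sched_ns - softirq_ns : 0;
}
```

For instance, a task scheduled in for 5,000,000 ns during which 250,000 ns of softirq work ran would be charged 4,750,000 ns.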
      
      Example usage:
      
        # perf kwork -k sched,irq,softirq record -- perf record -e cpu-clock -o perf_record.data -a sleep 10
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.467 MB perf_record.data (7154 samples) ]
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 2.152 MB perf.data (22846 samples) ]
        # perf kwork top
      
        Total  : 136601.588 ms, 8 cpus
        %Cpu(s):  95.66% id,   0.04% hi,   0.05% si
        %Cpu0   [                                 0.02%]
        %Cpu1   [                                 0.01%]
        %Cpu2   [|                                4.61%]
        %Cpu3   [                                 0.04%]
        %Cpu4   [                                 0.01%]
        %Cpu5   [|||||                           17.31%]
        %Cpu6   [                                 0.51%]
        %Cpu7   [|||                             11.42%]
      
              PID    %CPU           RUNTIME  COMMMAND
          ----------------------------------------------------
                0   99.98      17073.515 ms  swapper/4
                0   99.98      17072.173 ms  swapper/1
                0   99.93      17064.229 ms  swapper/3
                0   99.62      17011.013 ms  swapper/0
                0   99.47      16985.180 ms  swapper/6
                0   95.17      16250.874 ms  swapper/2
                0   88.51      15111.684 ms  swapper/7
                0   82.62      14108.577 ms  swapper/5
             4342   33.00       5644.045 ms  perf
             4344    0.43         74.351 ms  perf
               16    0.13         22.296 ms  rcu_preempt
             4345    0.05         10.093 ms  perf
             4343    0.05          8.769 ms  perf
             4341    0.02          4.882 ms  perf
             4095    0.02          4.605 ms  kworker/7:1
               75    0.02          4.261 ms  kworker/2:1
              120    0.01          1.909 ms  systemd-journal
               98    0.01          2.540 ms  jbd2/sda-8
               61    0.01          3.404 ms  kcompactd0
              667    0.01          2.542 ms  kworker/u16:2
             4340    0.00          1.052 ms  kworker/7:2
               97    0.00          0.489 ms  kworker/7:1H
               51    0.00          0.209 ms  ksoftirqd/7
               50    0.00          0.646 ms  migration/7
               76    0.00          0.753 ms  kworker/6:1
               45    0.00          0.572 ms  migration/6
               87    0.00          0.145 ms  kworker/5:1H
               73    0.00          0.596 ms  kworker/5:1
               41    0.00          0.041 ms  ksoftirqd/5
               40    0.00          0.718 ms  migration/5
               64    0.00          0.115 ms  kworker/4:1
               35    0.00          0.556 ms  migration/4
              353    0.00          2.600 ms  sshd
               74    0.00          0.205 ms  kworker/3:1
               33    0.00          1.576 ms  kworker/3:0H
               30    0.00          0.996 ms  migration/3
               26    0.00          1.665 ms  ksoftirqd/2
               25    0.00          0.662 ms  migration/2
              397    0.00          0.057 ms  kworker/1:1
               20    0.00          1.005 ms  migration/1
             2909    0.00          1.053 ms  kworker/0:2
               17    0.00          0.720 ms  migration/0
               15    0.00          0.039 ms  ksoftirqd/0
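As a quick sanity check on the summary line, the idle figure in this sample output is consistent with the swapper runtimes divided by the reported total. The sketch below only reproduces that arithmetic from the numbers printed above; the function name `idle_percent` is hypothetical, and this is an observation about the sample, not a description of how perf computes the value internally.

```python
# Swapper (idle task) runtimes in ms, copied from the table above.
swapper_ms = [
    17073.515, 17072.173, 17064.229, 17011.013,
    16985.180, 16250.874, 15111.684, 14108.577,
]
# "Total" line of the output above, summed over the 8 CPUs.
total_ms = 136601.588


def idle_percent(idle_runtimes_ms, total_runtime_ms):
    """Return the share of total CPU time spent in the idle tasks, in percent."""
    return 100.0 * sum(idle_runtimes_ms) / total_runtime_ms


print(f"{idle_percent(swapper_ms, total_ms):.2f}% id")
```

Rounded to two decimals, the result agrees with the `id` value on the `%Cpu(s)` line.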
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Link: https://lore.kernel.org/r/20230812084917.169338-13-yangjihong1@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      e29090d2
    • perf kwork top: Add statistics on hardirq event support · 2f21f5e4
      Yang Jihong authored
      Calculate the runtime of the hardirq events and subtract it from
      the corresponding task runtime to improve the precision.
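The subtraction described here can be illustrated with a minimal sketch. It assumes task run slices and interrupt slices are already available as simple tuples; the tuple layouts and the names `overlap` and `task_runtime_without_irq` are hypothetical and are not taken from the perf sources.

```python
from collections import defaultdict


def overlap(a_start, a_end, b_start, b_end):
    """Length of the intersection of two time intervals (0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def task_runtime_without_irq(task_slices, irq_slices):
    """Sum per-task runtime, minus interrupt time that ran on the same CPU.

    task_slices: iterable of (cpu, pid, start_ms, end_ms)
    irq_slices:  iterable of (cpu, start_ms, end_ms)
    """
    runtime = defaultdict(float)
    for cpu, pid, start, end in task_slices:
        runtime[pid] += end - start
        for irq_cpu, irq_start, irq_end in irq_slices:
            if irq_cpu == cpu:
                # Time spent in the interrupt handler is not task time.
                runtime[pid] -= overlap(start, end, irq_start, irq_end)
    return dict(runtime)


task_slices = [(0, 100, 0.0, 10.0), (0, 200, 10.0, 15.0), (1, 100, 0.0, 4.0)]
irq_slices = [(0, 2.0, 3.5), (0, 11.0, 11.5), (1, 20.0, 21.0)]
print(task_runtime_without_irq(task_slices, irq_slices))
```

In this made-up input, pid 100 loses the 1.5 ms interrupt that interrupted it on CPU 0 and pid 200 loses 0.5 ms, while the interrupt on CPU 1 falls outside any task slice and changes nothing.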
      
      Example usage:
      
        # perf kwork -k sched,irq record -- perf record -o perf_record.data -a sleep 10
        [ perf record: Woken up 2 times to write data ]
        [ perf record: Captured and wrote 1.054 MB perf_record.data (18019 samples) ]
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 1.798 MB perf.data (16334 samples) ]
        #
        # perf kwork top
      
        Total  : 139240.869 ms, 8 cpus
        %Cpu(s):  94.91% id,   0.05% hi
        %Cpu0   [                                 0.05%]
        %Cpu1   [|                                5.00%]
        %Cpu2   [                                 0.43%]
        %Cpu3   [                                 0.57%]
        %Cpu4   [                                 1.19%]
        %Cpu5   [||||||                          20.46%]
        %Cpu6   [                                 0.48%]
        %Cpu7   [|||                             12.10%]
      
              PID    %CPU           RUNTIME  COMMMAND
          ----------------------------------------------------
                0   99.54      17325.622 ms  swapper/2
                0   99.54      17327.527 ms  swapper/0
                0   99.51      17319.909 ms  swapper/6
                0   99.42      17304.934 ms  swapper/3
                0   98.80      17197.385 ms  swapper/4
                0   94.99      16534.991 ms  swapper/1
                0   87.89      15295.264 ms  swapper/7
                0   79.53      13843.182 ms  swapper/5
             4252   36.50       6361.768 ms  perf
             4256    1.17        205.215 ms  bash
              151    0.53         93.298 ms  systemd-resolve
             4254    0.39         69.468 ms  perf
              423    0.34         59.368 ms  bash
              412    0.29         51.204 ms  sshd
              249    0.20         35.288 ms  sd-resolve
               16    0.17         30.287 ms  rcu_preempt
              153    0.09         17.266 ms  systemd-timesyn
                1    0.09         17.078 ms  systemd
             4253    0.07         12.457 ms  perf
             4255    0.06         11.559 ms  perf
             4234    0.03          6.105 ms  kworker/u16:1
               69    0.03          6.259 ms  kworker/1:1H
             4251    0.02          4.615 ms  perf
             4095    0.02          4.890 ms  kworker/7:1
               61    0.02          4.005 ms  kcompactd0
               75    0.02          3.546 ms  kworker/2:1
               97    0.01          3.106 ms  kworker/7:1H
               98    0.01          1.995 ms  jbd2/sda-8
             4088    0.01          1.779 ms  kworker/u16:3
             2909    0.01          1.795 ms  kworker/0:2
             4246    0.00          1.117 ms  kworker/7:2
               51    0.00          0.327 ms  ksoftirqd/7
               50    0.00          0.369 ms  migration/7
              102    0.00          0.160 ms  kworker/6:1H
               76    0.00          0.609 ms  kworker/6:1
               45    0.00          0.779 ms  migration/6
               87    0.00          0.504 ms  kworker/5:1H
               73    0.00          1.130 ms  kworker/5:1
               41    0.00          0.152 ms  ksoftirqd/5
               40    0.00          0.702 ms  migration/5
               64    0.00          0.316 ms  kworker/4:1
               35    0.00          0.791 ms  migration/4
              353    0.00          2.211 ms  sshd
               74    0.00          0.272 ms  kworker/3:1
               30    0.00          0.819 ms  migration/3
               25    0.00          0.784 ms  migration/2
              397    0.00          0.539 ms  kworker/1:1
               21    0.00          1.600 ms  ksoftirqd/1
               20    0.00          0.773 ms  migration/1
               17    0.00          1.682 ms  migration/0
               15    0.00          0.076 ms  ksoftirqd/0
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Link: https://lore.kernel.org/r/20230812084917.169338-12-yangjihong1@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      2f21f5e4
    • perf evsel: Add evsel__intval_common() helper · a8792242
      Yang Jihong authored
      Add evsel__intval_common() helper to search for common_field in
      tracepoint format.
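To illustrate what looking up a common field involves, here is a rough Python sketch, not the C helper itself. It assumes a field table mapping each common field name to an offset, size and signedness within the raw sample; the layout in `COMMON_FIELDS` and the function name `intval_common` are assumptions for illustration, since the real offsets come from the event's format description.

```python
import struct

# Hypothetical layout of the common fields: name -> (offset, size, signed).
# Real values must be taken from the tracepoint's format description.
COMMON_FIELDS = {
    "common_type": (0, 2, False),
    "common_flags": (2, 1, False),
    "common_preempt_count": (3, 1, False),
    "common_pid": (4, 4, True),
}


def intval_common(raw, name, fields=COMMON_FIELDS, byteorder="little"):
    """Read the integer value of a common field from raw sample bytes.

    Raises KeyError if the field is not in the table.
    """
    offset, size, signed = fields[name]
    return int.from_bytes(raw[offset:offset + size], byteorder, signed=signed)


# Build a fake raw sample: type=316, flags=1, preempt_count=0, pid=4113.
raw = struct.pack("<HBBi", 316, 1, 0, 4113) + b"event-specific payload"
print(intval_common(raw, "common_pid"))
```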
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Link: https://lore.kernel.org/r/20230812084917.169338-11-yangjihong1@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      a8792242
    • perf kwork top: Introduce new top utility · 55c40e50
      Yang Jihong authored
      Some common tools for collecting statistics on CPU usage, such as top,
      obtain statistics from timer interrupt sampling, and then periodically
      read statistics from /proc/stat.
      
      This method has two sources of inaccuracy:
      
      1. In the tick interrupt, the whole interval between the last tick and
         the current tick is charged to the current task, even though the task
         may have been running for only part of that time.
      2. For each task, the top tool periodically reads the /proc/{PID}/status
         information, so tasks with a short life cycle may be missed.
      
      As a result, the top tool cannot accurately collect statistics on the
      CPU usage and running time of tasks.
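The first point can be made concrete with a toy simulation. The sketch below is a simplified model, not kernel code: it charges each full tick period to whichever task is running at the tick instant and compares that with the exact runtime. The interval data, the 10 ms tick and the function names are all made up for illustration.

```python
def tick_accounting(run_intervals, tick_ms, duration_ms):
    """Charge each whole tick period to the task running at the tick instant."""
    charged = {}
    t = tick_ms
    while t <= duration_ms:
        for pid, start, end in run_intervals:
            if start < t <= end:  # this task is on the CPU when the tick fires
                charged[pid] = charged.get(pid, 0) + tick_ms
                break
        t += tick_ms
    return charged


def exact_accounting(run_intervals):
    """Sum the real on-CPU time of each task."""
    exact = {}
    for pid, start, end in run_intervals:
        exact[pid] = exact.get(pid, 0) + (end - start)
    return exact


# (pid, start_ms, end_ms) on one CPU: pid 1 runs briefly around most ticks,
# pid 2 runs the rest of the time.
intervals = [
    (2, 0, 9), (1, 9, 11), (2, 11, 19), (1, 19, 21),
    (2, 21, 29), (1, 29, 31), (2, 31, 40),
]
print("tick :", tick_accounting(intervals, tick_ms=10, duration_ms=40))
print("exact:", exact_accounting(intervals))
```

With this input, pid 1 is charged 30 ms by tick sampling although it ran for only 6 ms, and pid 2 is charged 10 ms although it ran for 34 ms.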
      
      A statistical method based on the sched_switch tracepoint can accurately
      calculate the CPU usage of all tasks. It suits scenarios that require
      high-precision performance comparison data.
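A minimal sketch of the idea, assuming the trace has already been reduced to time-ordered `(timestamp_ms, cpu, prev_pid, next_pid)` tuples: each switch closes the slice of the outgoing task on that CPU. The tuple layout and the name `runtime_from_switches` are hypothetical; this is not the perf kwork implementation.

```python
from collections import defaultdict


def runtime_from_switches(events):
    """Accumulate per-task runtime from sched_switch-like events.

    events: iterable of (timestamp_ms, cpu, prev_pid, next_pid),
            ordered by timestamp within each CPU.
    """
    last_switch = {}  # cpu -> timestamp of the previous switch on that CPU
    runtime = defaultdict(float)
    for ts, cpu, prev_pid, next_pid in events:
        if cpu in last_switch:
            # prev_pid has been running on this CPU since the last switch.
            runtime[prev_pid] += ts - last_switch[cpu]
        # The first event on a CPU has no known start time, so it only
        # establishes the reference point.
        last_switch[cpu] = ts
    return dict(runtime)


events = [
    (0.0, 0, 0, 4113),
    (3.0, 0, 4113, 4105),
    (5.0, 0, 4105, 4113),
    (9.0, 0, 4113, 0),
]
print(runtime_from_switches(events))
```

Because every context switch is observed, short-lived tasks and partial tick periods are accounted for exactly in this model.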
      
      Example usage:
      
        # perf kwork
      
         Usage: perf kwork [<options>] {record|report|latency|timehist|top}
      
            -D, --dump-raw-trace  dump raw trace in ASCII
            -f, --force           don't complain, do it
            -k, --kwork <kwork>   list of kwork to profile (irq, softirq, workqueue, sched, etc)
            -v, --verbose         be more verbose (show symbol address, etc)
      
        # perf kwork -k sched record -- perf bench sched messaging -g 1 -l 10000
        # Running 'sched/messaging' benchmark:
        # 20 sender and receiver processes per group
        # 1 groups == 40 processes run
      
             Total time: 14.074 [sec]
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 15.886 MB perf.data (129472 samples) ]
        # perf kwork top
      
        Total  : 115708.178 ms, 8 cpus
        %Cpu(s):   9.78% id
        %Cpu0   [|||||||||||||||||||||||||||     90.55%]
        %Cpu1   [|||||||||||||||||||||||||||     90.51%]
        %Cpu2   [||||||||||||||||||||||||||      88.57%]
        %Cpu3   [|||||||||||||||||||||||||||     91.18%]
        %Cpu4   [|||||||||||||||||||||||||||     91.09%]
        %Cpu5   [|||||||||||||||||||||||||||     90.88%]
        %Cpu6   [||||||||||||||||||||||||||      88.64%]
        %Cpu7   [|||||||||||||||||||||||||||     90.28%]
      
              PID    %CPU           RUNTIME  COMMMAND
          ----------------------------------------------------
             4113   22.23       3221.547 ms  sched-messaging
             4105   21.61       3131.495 ms  sched-messaging
             4119   21.53       3120.937 ms  sched-messaging
             4103   21.39       3101.614 ms  sched-messaging
             4106   21.37       3095.209 ms  sched-messaging
             4104   21.25       3077.269 ms  sched-messaging
             4115   21.21       3073.188 ms  sched-messaging
             4109   21.18       3069.022 ms  sched-messaging
             4111   20.78       3010.033 ms  sched-messaging
             4114   20.74       3007.073 ms  sched-messaging
             4108   20.73       3002.137 ms  sched-messaging
             4107   20.47       2967.292 ms  sched-messaging
             4117   20.39       2955.335 ms  sched-messaging
             4112   20.34       2947.080 ms  sched-messaging
             4118   20.32       2942.519 ms  sched-messaging
             4121   20.23       2929.865 ms  sched-messaging
             4110   20.22       2930.078 ms  sched-messaging
             4122   20.15       2919.542 ms  sched-messaging
             4120   19.77       2866.032 ms  sched-messaging
             4116   19.72       2857.660 ms  sched-messaging
             4127   16.19       2346.334 ms  sched-messaging
             4142   15.86       2297.600 ms  sched-messaging
             4141   15.62       2262.646 ms  sched-messaging
             4136   15.41       2231.408 ms  sched-messaging
             4130   15.38       2227.008 ms  sched-messaging
             4129   15.31       2217.692 ms  sched-messaging
             4126   15.21       2201.711 ms  sched-messaging
             4139   15.19       2200.722 ms  sched-messaging
             4137   15.10       2188.633 ms  sched-messaging
             4134   15.06       2182.082 ms  sched-messaging
             4132   15.02       2177.530 ms  sched-messaging
             4131   14.73       2131.973 ms  sched-messaging
             4125   14.68       2125.439 ms  sched-messaging
             4128   14.66       2122.255 ms  sched-messaging
             4123   14.65       2122.113 ms  sched-messaging
             4135   14.56       2107.144 ms  sched-messaging
             4133   14.51       2103.549 ms  sched-messaging
             4124   14.27       2066.671 ms  sched-messaging
             4140   14.17       2052.251 ms  sched-messaging
             4138   13.81       2000.361 ms  sched-messaging
                0   11.42       1652.009 ms  swapper/2
                0   11.35       1641.694 ms  swapper/6
                0    9.71       1405.108 ms  swapper/7
                0    9.48       1372.338 ms  swapper/1
                0    9.44       1366.013 ms  swapper/0
                0    9.11       1318.382 ms  swapper/5
                0    8.90       1287.582 ms  swapper/4
                0    8.81       1274.356 ms  swapper/3
             4100    2.61        379.328 ms  perf
             4101    1.16        169.487 ms  perf-exec
              151    0.65         94.741 ms  systemd-resolve
              249    0.36         53.030 ms  sd-resolve
              153    0.14         21.405 ms  systemd-timesyn
                1    0.10         16.200 ms  systemd
               16    0.09         15.785 ms  rcu_preempt
             4102    0.06          9.727 ms  perf
             4095    0.03          5.464 ms  kworker/7:1
               98    0.02          3.231 ms  jbd2/sda-8
              353    0.02          4.115 ms  sshd
               75    0.02          3.889 ms  kworker/2:1
               73    0.01          1.552 ms  kworker/5:1
               64    0.01          1.591 ms  kworker/4:1
               74    0.01          1.952 ms  kworker/3:1
               61    0.01          2.608 ms  kcompactd0
              397    0.01          1.602 ms  kworker/1:1
               69    0.01          1.817 ms  kworker/1:1H
               10    0.01          2.553 ms  kworker/u16:0
             2909    0.01          2.684 ms  kworker/0:2
             1211    0.00          0.426 ms  kworker/7:0
               97    0.00          0.153 ms  kworker/7:1H
               51    0.00          0.100 ms  ksoftirqd/7
              120    0.00          0.856 ms  systemd-journal
               76    0.00          1.414 ms  kworker/6:1
               46    0.00          0.246 ms  ksoftirqd/6
               45    0.00          0.164 ms  migration/6
               41    0.00          0.098 ms  ksoftirqd/5
               40    0.00          0.207 ms  migration/5
               86    0.00          1.339 ms  kworker/4:1H
               36    0.00          0.252 ms  ksoftirqd/4
               35    0.00          0.090 ms  migration/4
               31    0.00          0.156 ms  ksoftirqd/3
               30    0.00          0.073 ms  migration/3
               26    0.00          0.180 ms  ksoftirqd/2
               25    0.00          0.085 ms  migration/2
               21    0.00          0.106 ms  ksoftirqd/1
               20    0.00          0.118 ms  migration/1
              302    0.00          1.440 ms  systemd-logind
               17    0.00          0.132 ms  migration/0
               15    0.00          0.255 ms  ksoftirqd/0
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Link: https://lore.kernel.org/r/20230812084917.169338-10-yangjihong1@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      55c40e50
    • perf kwork: Add `root` parameter to work_sort() · b83b5071
      Yang Jihong authored
      Add a `struct rb_root_cached *root` parameter to work_sort() to sort the
      specified rb tree elements.
      
      No functional change.
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Link: https://lore.kernel.org/r/20230812084917.169338-9-yangjihong1@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      b83b5071
    • perf kwork: Add sched record support · 38d8d013
      Yang Jihong authored
      Add the sched kwork_class type to support recording and parsing of
      sched_switch events.
      
      Example usage:
      
        # perf kwork -h
      
         Usage: perf kwork [<options>] {record|report|latency|timehist}
      
            -D, --dump-raw-trace  dump raw trace in ASCII
            -f, --force           don't complain, do it
            -k, --kwork <kwork>   list of kwork to profile (irq, softirq, workqueue, sched, etc)
            -v, --verbose         be more verbose (show symbol address, etc)
      
        # perf kwork -k sched record true
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.083 MB perf.data (47 samples) ]
        # perf evlist
        sched:sched_switch
        dummy:HG
        # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Link: https://lore.kernel.org/r/20230812084917.169338-8-yangjihong1@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      38d8d013