1. 10 Nov, 2023 3 commits
    • perf annotate: Move raw_comment and raw_func_start fields out of 'struct ins_operands' · fb7fd2a1
      Namhyung Kim authored
      Those two fields are used only for the jump_ops, so move them into the
      union to save some bytes.  Also add a jump__delete() callback so that the
      fields are not freed, as they do not point to newly allocated strings.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Huacai Chen <chenhuacai@kernel.org>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: WANG Rui <wangrui@loongson.cn>
      Cc: linux-toolchains@vger.kernel.org
      Cc: linux-trace-devel@vger.kernel.org
      Link: https://lore.kernel.org/r/20231110000012.3538610-3-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      fb7fd2a1
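The effect of moving jump-only fields into a union can be sketched as follows. This is a minimal illustration, not the real 'struct ins_operands': the struct layouts, member names other than raw_comment and raw_func_start, and the `_sketch` helpers are all assumptions. It needs C11 for the anonymous unions.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical "before" layout: the jump-only fields sit outside the union. */
struct ops_before {
	char *raw;
	char *raw_comment;	/* used only by jump instructions */
	char *raw_func_start;	/* used only by jump instructions */
	union {
		struct { char *name; unsigned long addr; } source;
		struct { void *ops; } locked;
	};
};

/* Hypothetical "after" layout: the jump-only fields share the union storage. */
struct ops_after {
	char *raw;
	union {
		struct { char *name; unsigned long addr; } source;
		struct { void *ops; } locked;
		struct { char *raw_comment; char *raw_func_start; } jump;
	};
};

/* Parse: both jump fields point into the single 'raw' allocation. */
static int jump_parse_sketch(struct ops_after *ops, const char *text)
{
	size_t len = strlen(text) + 1;

	ops->raw = malloc(len);
	if (ops->raw == NULL)
		return -1;
	memcpy(ops->raw, text, len);
	ops->jump.raw_comment = strchr(ops->raw, '#');
	ops->jump.raw_func_start = strchr(ops->raw, '<');
	return 0;
}

/* Delete: free only 'raw'; the jump fields are borrowed pointers. */
static void jump_delete_sketch(struct ops_after *ops)
{
	free(ops->raw);
	ops->raw = NULL;
	ops->jump.raw_comment = NULL;
	ops->jump.raw_func_start = NULL;
}
```

Because the two pointers only borrow from 'raw', a dedicated delete routine that skips them avoids freeing memory that was never separately allocated.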
    • perf annotate: Pass "-l" option to objdump conditionally · ded8c484
      Namhyung Kim authored
      The "-l" option makes objdump print line numbers in its output.  Only
      the perf annotate TUI can show these line numbers later, but the option
      causes big slowdowns for the kernel binary.
      
      Similarly, showing source code also takes a long time and it already has
      an option to control it.
      
        $ time objdump ... -d -S -C vmlinux > /dev/null
        real	0m3.474s
        user	0m3.047s
        sys	0m0.428s
      
        $ time objdump ... -d -l -C vmlinux > /dev/null
        real	0m1.796s
        user	0m1.459s
        sys	0m0.338s
      
        $ time objdump ... -d -C vmlinux > /dev/null
        real	0m0.051s
        user	0m0.036s
        sys	0m0.016s
      
      As it's not needed for data type profiling, let's make it conditional so
      that it can skip the unnecessary work.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: linux-toolchains@vger.kernel.org
      Cc: linux-trace-devel@vger.kernel.org
      Link: https://lore.kernel.org/r/20231110000012.3538610-2-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      ded8c484
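Making the expensive flags conditional could look roughly like the sketch below. The struct and function names are hypothetical and do not come from the perf sources; only the objdump flags (-d, -l, -S, -C) are taken from the timings above.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical option set; the names are illustrative, not perf's. */
struct objdump_opts_sketch {
	bool print_lines;	/* add "-l" only when line numbers are wanted */
	bool show_source;	/* add "-S" only when source code is wanted */
};

/*
 * Build the objdump command line into buf.
 * Returns the snprintf() result: the length needed, or negative on error.
 */
static int build_objdump_cmd(char *buf, size_t size,
			     const struct objdump_opts_sketch *opts,
			     const char *path)
{
	return snprintf(buf, size, "objdump -d%s%s -C %s",
			opts->print_lines ? " -l" : "",
			opts->show_source ? " -S" : "",
			path);
}
```

A caller that only needs disassembly, such as data type profiling, would leave both flags false and get the cheapest command line.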
    • perf header: Additional note on AMD IBS for max_precise pmu cap · dd678532
      Arnaldo Carvalho de Melo authored
      The x86 core PMU exposes its supported maximum precision level via the
      max_precise PMU capability. Although the AMD core PMU does not support
      precise mode, certain core PMU events with precise_ip > 0 are allowed
      and forwarded to the IBS OP PMU.
      
      Display a note about this in the 'perf report' header output and
      document the details in the perf-list man page.
      Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ananth Narayan <ananth.narayan@amd.com>
      Cc: Changbin Du <changbin.du@huawei.com>
      Cc: Huacai Chen <chenhuacai@kernel.org>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: K Prateek Nayak <kprateek.nayak@amd.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Kim Phillips <kim.phillips@amd.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Ming Wang <wangming01@loongson.cn>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ross Zwisler <zwisler@chromium.org>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Cc: Santosh Shukla <santosh.shukla@amd.com>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Link: https://lore.kernel.org/r/20231107083331.901-2-ravi.bangoria@amd.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      dd678532
  2. 09 Nov, 2023 17 commits
    • tools: Disable __packed attribute compiler warning due to -Werror=attributes · a399ee67
      Arnaldo Carvalho de Melo authored
      Noticed on several perf tools cross build test containers:
      
        [perfbuilder@five ~]$ grep FAIL ~/dm.log/summary
          19    10.18 debian:experimental-x-mips    : FAIL gcc version 12.3.0 (Debian 12.3.0-6)
          20    11.21 debian:experimental-x-mips64  : FAIL gcc version 12.3.0 (Debian 12.3.0-6)
          21    11.30 debian:experimental-x-mipsel  : FAIL gcc version 12.3.0 (Debian 12.3.0-6)
          37    12.07 ubuntu:18.04-x-arm            : FAIL gcc version 7.5.0 (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04)
          42    11.91 ubuntu:18.04-x-riscv64        : FAIL gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)
          44    13.17 ubuntu:18.04-x-sh4            : FAIL gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)
          45    12.09 ubuntu:18.04-x-sparc64        : FAIL gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)
        [perfbuilder@five ~]$
      
        In file included from util/intel-pt-decoder/intel-pt-pkt-decoder.c:10:
        /tmp/perf-6.6.0-rc1/tools/include/asm-generic/unaligned.h: In function 'get_unaligned_le16':
        /tmp/perf-6.6.0-rc1/tools/include/asm-generic/unaligned.h:13:29: error: packed attribute causes inefficient alignment for 'x' [-Werror=attributes]
           13 |         const struct { type x; } __packed *__pptr = (typeof(__pptr))(ptr);      \
              |                             ^
        /tmp/perf-6.6.0-rc1/tools/include/asm-generic/unaligned.h:27:28: note: in expansion of macro '__get_unaligned_t'
           27 |         return le16_to_cpu(__get_unaligned_t(__le16, p));
              |                            ^~~~~~~~~~~~~~~~~
      
      This code comes from the kernel, where -Wattributes and -Wpacked are
      not used.  -Wpacked is already disabled, so disable -Wattributes as well.
      
      Fixes: a91c9872 ("perf tools: Add get_unaligned_leNN()")
      Suggested-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/lkml/7c5b626c-1de9-4c12-a781-e44985b4a797@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      a399ee67
    • perf bpf: Don't synthesize BPF events when disabled · 6512b6aa
      Ian Rogers authored
      If BPF sideband events are disabled on the command line, don't
      synthesize BPF events either.
      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Song Liu <song@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Changbin Du <changbin.du@huawei.com>
      Cc: Colin Ian King <colin.i.king@gmail.com>
      Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: Huacai Chen <chenhuacai@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: K Prateek Nayak <kprateek.nayak@amd.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Li Dong <lidong@vivo.com>
      Cc: Liam Howlett <liam.howlett@oracle.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Miguel Ojeda <ojeda@kernel.org>
      Cc: Ming Wang <wangming01@loongson.cn>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Nick Terrell <terrelln@fb.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: Steinar H. Gunderson <sesse@google.com>
      Cc: Vincent Whitchurch <vincent.whitchurch@axis.com>
      Cc: Wenyu Liu <liuwenyu7@huawei.com>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Link: https://lore.kernel.org/r/20231102175735.2272696-13-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      6512b6aa
    • perf test: Add support for setting objdump binary via perf config · 6aad765d
      James Clark authored
      Add a 'perf config' variable that does the same thing as "perf test
      --objdump <x>".
      
      Also update the man page.
      
      Committer testing:
      
        # perf config test.objdump
        # perf test "object code reading"
         26: Object code reading                                             : Ok
        # perf config test.objdump=blah
        # perf config test.objdump
        test.objdump=blah
        # perf test "object code reading"
         26: Object code reading                                             : FAILED!
        # perf test -v "object code reading"
         26: Object code reading                                             :
        --- start ---
        test child forked, pid 600599
        Looking at the vmlinux_path (8 entries long)
        Using /proc/kcore for kernel data
        Using /proc/kallsyms for symbols
        Parsing event 'cycles'
        Using CPUID AuthenticAMD-25-21-0
        mmap size 528384B
        Reading object code for memory address: 0x4d9a02
        File is: /home/acme/bin/perf
        On file address is: 0xd9a02
        Objdump command is: blah -z -d --start-address=0x4d9a02 --stop-address=0x4d9a82 /home/acme/bin/perf
        objdump read too few bytes: 128
        Bytes read differ from those read by objdump
        buf1 (dso):
        0x48 0x85 0xff 0x74 0x29 0xe8 0x94 0xdf 0x07 0x00 0x8b 0x73 0x1c 0x48 0x8b 0x43
        0x08 0xeb 0xa5 0x0f 0x1f 0x00 0x48 0x8b 0x45 0xe8 0x64 0x48 0x2b 0x04 0x25 0x28
        0x00 0x00 0x00 0x75 0x0f 0x48 0x8b 0x5d 0xf8 0xc9 0xc3 0x0f 0x1f 0x00 0x48 0x8b
        0x43 0x08 0xeb 0x84 0xe8 0xc5 0x3e 0xf3 0xff 0x0f 0x1f 0x44 0x00 0x00 0x55 0x48
        0x89 0xe5 0x41 0x56 0x41 0x55 0x49 0x89 0xd5 0x41 0x54 0x49 0x89 0xfc 0x53 0x48
        0x89 0xf3 0x48 0x83 0xec 0x30 0x48 0x8b 0x7e 0x20 0x64 0x48 0x8b 0x04 0x25 0x28
        0x00 0x00 0x00 0x48 0x89 0x45 0xd8 0x31 0xc0 0x48 0x89 0x75 0xb0 0x48 0xc7 0x45
        0xb8 0x00 0x00 0x00 0x00 0x48 0xc7 0x45 0xc0 0x00 0x00 0x00 0x00 0xe8 0xad 0xfa
      
        buf2 (objdump):
        0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
        0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
        0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
        0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
        0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
        0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
        0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
        0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
      
        test child finished with -1
        ---- end ----
        Object code reading: FAILED!
        # perf config test.objdump=/usr/bin/objdump
        # perf config test.objdump
        test.objdump=/usr/bin/objdump
        # perf test "object code reading"
         26: Object code reading                                             : Ok
        #
      Signed-off-by: James Clark <james.clark@arm.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Fangrui Song <maskray@google.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Nathan Chancellor <nathan@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Tom Rix <trix@redhat.com>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Cc: Yonghong Song <yhs@fb.com>
      Cc: llvm@lists.linux.dev
      Link: https://lore.kernel.org/r/20231106151051.129440-3-james.clark@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      6aad765d
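With both a command-line option and a config variable available, the test has to settle on one objdump binary. A minimal sketch of that choice follows; the function name and the precedence (command line first, then config, then plain "objdump") are assumptions, as the commit messages do not state the order.

```c
#include <stddef.h>

/*
 * Pick the objdump binary to run.
 * Assumed precedence (not confirmed by the commit messages):
 * command-line value, then config value, then plain "objdump".
 */
static const char *pick_objdump_sketch(const char *cli_value,
				       const char *config_value)
{
	if (cli_value != NULL && cli_value[0] != '\0')
		return cli_value;
	if (config_value != NULL && config_value[0] != '\0')
		return config_value;
	return "objdump";
}
```

Under these assumptions, passing "llvm-objdump" as the command-line value would win over a configured "/usr/bin/objdump".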
    • perf test: Add option to change objdump binary · 33ce9fc4
      James Clark authored
      All of the other Perf subcommands that use objdump have an option to
      specify the binary, so add the same option to 'perf test'.
      
      This is useful if you have built the kernel with a different toolchain
      to the system one, where the system objdump may fail to disassemble
      vmlinux.
      
      Now this can be fixed with something like this:
      
        $ perf test --objdump llvm-objdump "object code reading"
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: James Clark <james.clark@arm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Fangrui Song <maskray@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Nathan Chancellor <nathan@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Tom Rix <trix@redhat.com>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Cc: llvm@lists.linux.dev
      Link: https://lore.kernel.org/r/20231106151051.129440-2-james.clark@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      33ce9fc4
    • perf tests offcpu: Adjust test case perf record offcpu profiling tests for s390 · b861fd7e
      Thomas Richter authored
      On s390 using linux-next the test case:
      
          87: perf record offcpu profiling tests
      
      fails. The root cause is this command:
      
        # ./perf  record --off-cpu -e dummy -- ./perf bench sched messaging -l 10
        # Running 'sched/messaging' benchmark:
        # 20 sender and receiver processes per group
        # 10 groups == 400 processes run
      
           Total time: 0.231 [sec]
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.077 MB perf.data (401 samples) ]
        #
      
      It does not generate 800+ sample entries. On s390 there are usually
      around 40[1-9], sometimes a few more, but never more than 450. The
      higher the number of CPUs, the lower the number of samples.
      
      In the function chain:
      
        bench_sched_messaging()
        +--> group()
      
      the sender and receiver threads are created. The senders and receivers
      call the function ready(), which writes one byte and waits for a reply
      using the poll() system call.
      
      As context switches are counted, the function ready() will trigger a
      context switch when no input data is available after the write system
      call. The write system call does not trigger context switches when the
      data size is small. And writing 1000 bytes (10 iterations with
      100 bytes) is not much and certainly won't block.
      
      The 400+ context switches on s390 occur when some receiver/sender
      threads call ready() and wait for the response from the function
      bench_sched_messaging() being kicked off.
      
      Lower the number of expected context switches to 400 to succeed on s390.
      Suggested-by: Namhyung Kim <namhyung@kernel.org>
      Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
      Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
      Cc: Sven Schnelle <svens@linux.ibm.com>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Co-developed-by: Ilya Leoshkevich <iii@linux.ibm.com>
      Link: https://lore.kernel.org/r/20231106091627.2022530-1-tmricht@linux.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      b861fd7e
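The 400 figure follows from the benchmark parameters shown in the output above. As a small worked sketch (the function names are hypothetical, and the "one context switch per process" lower bound is an assumption drawn from the ready() explanation):

```c
/* Processes created: each group has `pairs` senders and `pairs` receivers. */
static unsigned int messaging_processes(unsigned int groups, unsigned int pairs)
{
	return groups * 2u * pairs;
}

/* Assumed lower bound: one context switch per process waiting in ready(). */
static unsigned int min_expected_switches(unsigned int groups,
					  unsigned int pairs)
{
	return messaging_processes(groups, pairs);
}
```

With 10 groups of 20 senders and 20 receivers, this gives 10 * 2 * 20 = 400, matching the "400 processes" line and the lowered threshold.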
    • perf tools: Add the python_ext_build directory to .gitignore · 36c70e44
      Yang Jihong authored
      `python_ext_build` is the build directory for python.so, ignore it for
      cleaner git status.
      Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20231030111438.1357962-2-yangjihong1@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      36c70e44
    • perf tests attr: Fix spelling mistake "whic" to "which" · 4a5aaaf3
      zhaimingbing authored
      There is a spelling mistake; fix it.
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: zhaimingbing <zhaimingbing@cmss.chinamobile.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: kernel-janitors@vger.kernel.org
      Link: https://lore.kernel.org/r/20231030075825.3701-1-zhaimingbing@cmss.chinamobile.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      4a5aaaf3
    • perf annotate: Move offsets array from 'struct annotation' to 'struct annotated_source' · b753d48f
      Namhyung Kim authored
      The offsets array keeps pointers to 'struct annotation_line' entries
      which are available in the 'struct annotated_source'.  Let's move it
      there.
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20231103191907.54531-6-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      b753d48f
    • perf annotate: Move some source code related fields from 'struct annotation' to 'struct annotated_source' · 0aae4c99
      Namhyung Kim authored
      Some fields in the 'struct annotation' are only used with 'struct
      annotated_source', so it is better to move them there in order to reduce
      memory consumption for other symbols.
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20231103191907.54531-5-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      0aae4c99
    • perf annotate: Move max_coverage from 'struct annotation' to 'struct annotated_branch' · 2b215ec7
      Namhyung Kim authored
      The max_coverage field is only used when branch stack info is available,
      so it is natural to move it to 'struct annotated_branch'.
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20231103191907.54531-4-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      2b215ec7
    • perf annotate: Split branch stack cycles info from 'struct annotation' · b7f87e32
      Namhyung Kim authored
      The cycles info is only meaningful when the sample has branch stacks.  To
      save memory in the normal case, move those fields to a new 'struct
      annotated_branch' and dynamically allocate it when needed.  Also move
      cycles_hist from annotated_source, as it belongs with this data.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20231103191907.54531-3-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      b7f87e32
    • perf annotate: Split branch stack cycles information out of 'struct annotation_line' · de2c7eb5
      Namhyung Kim authored
      The cycles info is used only when a branch stack is provided.  Move it
      from 'struct annotation_line' into a separate struct and lazily
      allocate it to save some memory.
      
      Committer notes:
      
      Make annotation__compute_ipc() check whether the lazy allocation worked,
      bailing out if it did not; its callers already do error checking and
      propagation.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lore.kernel.org/r/20231103191907.54531-2-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      de2c7eb5
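The lazy allocation pattern described in this commit can be sketched as below. All type, field, and function names here are hypothetical stand-ins and are not the ones used in perf; the point is only the shape: a pointer that stays NULL until branch data is first needed, and a caller that bails out when allocation fails.

```c
#include <stdlib.h>

/* Hypothetical per-line branch data, needed only with branch stacks. */
struct line_branch_sketch {
	unsigned long long cycles;
	unsigned int ipc_x100;
};

struct line_sketch {
	long offset;
	struct line_branch_sketch *branch;	/* NULL until first needed */
};

/* Return the branch data, allocating it zeroed on first use; NULL on failure. */
static struct line_branch_sketch *line_branch_get(struct line_sketch *line)
{
	if (line->branch == NULL)
		line->branch = calloc(1, sizeof(*line->branch));
	return line->branch;
}

/* Caller pattern: bail out when allocation failed, so the error propagates. */
static int line_add_cycles(struct line_sketch *line, unsigned long long cycles)
{
	struct line_branch_sketch *br = line_branch_get(line);

	if (br == NULL)
		return -1;
	br->cycles += cycles;
	return 0;
}

/* Release the lazily allocated data, if any. */
static void line_exit(struct line_sketch *line)
{
	free(line->branch);
	line->branch = NULL;
}
```

Lines that never see branch stack data pay only for one pointer, which is where the memory saving comes from.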
    • perf test: Simplify "object code reading" test · 89d5c48c
      Namhyung Kim authored
      It tries to open the cycles (or cpu-clock on s390) event with the
      exclude_kernel bit set.  But other architectures on a VM can fail with
      the hardware event and need to fall back to the software event in the
      same way.
      
      So let's get rid of the cpuid check and use a generic fallback mechanism
      based on an array of event candidates.  Now the events at odd indexes
      exclude the kernel, so use that for the return value.
      Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: James Clark <james.clark@arm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Link: https://lore.kernel.org/r/20231103195541.67788-1-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      89d5c48c
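A fallback over an array of event candidates, with odd indexes excluding the kernel, might be sketched like this. The candidate list, the callback type, and every function name are assumptions for illustration; the real test opens events through perf's own evsel API, which is replaced here by a caller-supplied opener.

```c
#include <stddef.h>
#include <string.h>

/* Caller-supplied opener: returns 0 if the event could be opened. */
typedef int (*open_event_fn)(const char *event);

/*
 * Assumed candidate layout: even index = event including the kernel,
 * odd index = the same event restricted to user space.
 */
static const char *const event_candidates[] = {
	"cycles", "cycles:u", "cpu-clock", "cpu-clock:u",
};

/* Try each candidate in order; return the first index that opens, or -1. */
static int open_first_event(open_event_fn try_open)
{
	size_t i;
	size_t n = sizeof(event_candidates) / sizeof(event_candidates[0]);

	for (i = 0; i < n; i++)
		if (try_open(event_candidates[i]) == 0)
			return (int)i;
	return -1;
}

/* An odd index means the opened event excludes the kernel. */
static int excludes_kernel(int idx)
{
	return idx >= 0 && (idx % 2) == 1;
}

/*
 * Stand-in opener for the demo: pretends hardware events are unavailable
 * (as on some VMs) and that kernel profiling is not permitted.
 */
static int demo_open_sw_user_only(const char *event)
{
	return strcmp(event, "cpu-clock:u") == 0 ? 0 : -1;
}
```

With the demo opener, the loop falls through the first three candidates and settles on the last one, whose odd index tells the caller that kernel samples are excluded.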
    • perf machine thread: Remove exited threads by default · 9ffa6c75
      Ian Rogers authored
      'struct thread' values hold onto references to mmaps, DSOs, etc. When a
      thread exits it is necessary to clean all of this memory up by removing
      the thread from the machine's threads. Some tools require that this
      doesn't happen, such as auxtrace events, 'perf report' if offcpu events
      exist or if a task list is being generated, so add a 'struct symbol_conf'
      member to make the behavior optional. When an exited thread is left in
      the machine's threads, mark it as exited.
      
      This change relates to commit 40826c45 ("perf thread: Remove
      notion of dead threads"). Dead threads were removed as they had a
      reference count of 0 and were difficult to reason about with the
      reference count checker. Here a thread is removed from threads when it
      exits, unless symbol_conf requests that the exited thread isn't removed,
      in which case it is marked as exited. Reference counting behaves as it
      normally does.
      Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Changbin Du <changbin.du@huawei.com>
      Cc: Colin Ian King <colin.i.king@gmail.com>
      Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: Huacai Chen <chenhuacai@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: K Prateek Nayak <kprateek.nayak@amd.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Li Dong <lidong@vivo.com>
      Cc: Liam Howlett <liam.howlett@oracle.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Miguel Ojeda <ojeda@kernel.org>
      Cc: Ming Wang <wangming01@loongson.cn>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Nick Terrell <terrelln@fb.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: Steinar H. Gunderson <sesse@google.com>
      Cc: Vincent Whitchurch <vincent.whitchurch@axis.com>
      Cc: Wenyu Liu <liuwenyu7@huawei.com>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Link: https://lore.kernel.org/r/20231102175735.2272696-6-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      9ffa6c75
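The optional behavior can be pictured with a toy thread table. Everything below is a hypothetical sketch: the fixed-size array, the struct names, and the `keep_exited_threads` flag (standing in for the new 'struct symbol_conf' member, whose real name is not given above) are assumptions, and reference counting is left out entirely.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_THREADS_SKETCH 8

struct thread_sketch {
	int tid;
	bool in_use;
	bool exited;
};

struct machine_sketch {
	struct thread_sketch threads[MAX_THREADS_SKETCH];
	bool keep_exited_threads;	/* stands in for the symbol_conf member */
};

/* Look up a live table entry by tid; NULL if it is not present. */
static struct thread_sketch *machine_find(struct machine_sketch *m, int tid)
{
	size_t i;

	for (i = 0; i < MAX_THREADS_SKETCH; i++)
		if (m->threads[i].in_use && m->threads[i].tid == tid)
			return &m->threads[i];
	return NULL;
}

/* Add a thread in the first free slot; NULL if the table is full. */
static struct thread_sketch *machine_add(struct machine_sketch *m, int tid)
{
	size_t i;

	for (i = 0; i < MAX_THREADS_SKETCH; i++) {
		if (!m->threads[i].in_use) {
			m->threads[i].tid = tid;
			m->threads[i].in_use = true;
			m->threads[i].exited = false;
			return &m->threads[i];
		}
	}
	return NULL;
}

/* On exit: drop the thread by default, or keep it and mark it as exited. */
static void machine_thread_exit(struct machine_sketch *m, int tid)
{
	struct thread_sketch *t = machine_find(m, tid);

	if (t == NULL)
		return;
	if (m->keep_exited_threads)
		t->exited = true;
	else
		t->in_use = false;
}
```

A tool that still needs exited threads later would set the flag before processing events; all other tools get the memory back as soon as the thread exits.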
    • perf record: Lazy load kernel symbols · 1a27fc01
      Ian Rogers authored
      Commit 5b7ba82a ("perf symbols: Load kernel maps before using")
      changed it so that loading a kernel DSO would cause the symbols for the
      DSO to be eagerly loaded.
      
      For 'perf record' this is overhead as the symbols won't be used. Add a
      field to 'struct symbol_conf' to control the behavior and disable it for
      'perf record' and 'perf inject'.
      Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
      Signed-off-by: Ian Rogers <irogers@google.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Changbin Du <changbin.du@huawei.com>
      Cc: Colin Ian King <colin.i.king@gmail.com>
      Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: Huacai Chen <chenhuacai@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: K Prateek Nayak <kprateek.nayak@amd.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Li Dong <lidong@vivo.com>
      Cc: Liam Howlett <liam.howlett@oracle.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Miguel Ojeda <ojeda@kernel.org>
      Cc: Ming Wang <wangming01@loongson.cn>
      Cc: Nick Terrell <terrelln@fb.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@amd.com>
      Cc: Sandipan Das <sandipan.das@amd.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: Steinar H. Gunderson <sesse@google.com>
      Cc: Vincent Whitchurch <vincent.whitchurch@axis.com>
      Cc: Wenyu Liu <liuwenyu7@huawei.com>
      Cc: Yang Jihong <yangjihong1@huawei.com>
      Link: https://lore.kernel.org/r/20231102175735.2272696-3-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      1a27fc01
    • perf tools: Fix spelling mistake "parametrized" -> "parameterized" · 7ff7b7af
      Colin Ian King authored
      There are spelling mistakes in comments and a pr_debug message. Fix them.
      Reviewed-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: kernel-janitors@vger.kernel.org
      Link: https://lore.kernel.org/r/20231003074911.220216-1-colin.i.king@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      7ff7b7af
    • perf tools: Add branch counter knob · 9fbb4b02
      Kan Liang authored
      Add a new branch filter, "counter", for the branch counter option. It is
      used to mark the events which should be logged in the branch. If it is
      applied with the -j option, the counters of all the events should be
      logged in the branch. If a legacy kernel doesn't support the new
      branch sample type, switch off the branch counter filter.
      
      The stored counter values in each branch are displayed right after the
      regular branch stack information via perf report -D.
      
      Usage examples:
      
        # perf record -e "{branch-instructions,branch-misses}:S" -j any,counter
      
      Only the first event, branch-instructions, collects the LBR. Both
      branch-instructions and branch-misses are marked as logged events.  Their
      occurrence information can be found in the branch stack extension space
      of each branch.
      
        # perf record -e "{cpu/branch-instructions,branch_type=any/,cpu/branch-misses,branch_type=counter/}"
      
      Only the first event, branch-instructions, collects the LBR. Only the
      branch-misses event is marked as a logged event.
      
      Committer notes:
      
      I noticed 'perf test "Sample parsing"' failing and reported it to the
      list. Kan provided a patch that checks whether the evsel has a leader
      and that evsel->evlist is set; the comment in the source code explains
      it further.
      Reviewed-by: Ian Rogers <irogers@google.com>
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Tinghao Zhang <tinghao.zhang@intel.com>
      Link: https://lore.kernel.org/r/20231025201626.3000228-8-kan.liang@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      9fbb4b02
  3. 06 Nov, 2023 2 commits
  4. 03 Nov, 2023 2 commits
  5. 31 Oct, 2023 1 commit
  6. 30 Oct, 2023 1 commit
  7. 28 Oct, 2023 13 commits
  8. 26 Oct, 2023 1 commit