1. 11 Mar, 2019 30 commits
  2. 09 Mar, 2019 4 commits
    • Merge tag 'perf-core-for-mingo-5.1-20190307' of... · b339da48
      Ingo Molnar authored
      Merge tag 'perf-core-for-mingo-5.1-20190307' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
      
      Pull perf/core changes from Arnaldo Carvalho de Melo:
      
      perf bpf:
      
        Arnaldo Carvalho de Melo:
      
        - Automatically add BTF ELF markers to 'perf trace' BPF programs, so that
          tools such as 'bpftool map dump' can pretty print map keys and values.
      
      perf c2c:
      
        Jiri Olsa:
      
        - Fix report for empty NUMA node.
      
      perf diff:
      
        Jin Yao:
      
        - Support --time, --cpu, --pid and --tid filter options.
      
      perf probe:
      
        Arnaldo Carvalho de Melo:
      
        - Clarify error message about not finding kernel modules debuginfo.
      
      perf record:
      
        Jiri Olsa:
      
        - Fixup probing for max attr.precise_ip.
      
      perf trace:
      
        Arnaldo Carvalho de Melo:
      
        - Add missing %s lost in the 'msg_flags' recvmmsg arg when adding prefix suppression logic.
      
      perf annotate:
      
        Arnaldo Carvalho de Melo:
      
        - Calculate the max instruction name, align column to that, removing the
          hardcoded max 6 chars and cope with instructions with names longer than that,
          such as vpmovmskb, vpcmpeqb, etc.
      
      kernel:
      
        Song Liu:
      
        - Consider events with attr.bpf_event set as side-band.
      
        Gustavo A. R. Silva:
      
        - Mark expected switch fall-through in perf_event_parse_addr_filter().
      
      Libraries:
      
        Jiri Olsa:
      
        - Fix leaks and double frees on error paths.
      
      libtraceevent:
      
        Tony Jones:
      
        - Fix buffer overflow in arg_eval().
      
      python scripting:
      
        Tony Jones:
      
        - More python3 fixes.
      
      Trivial:
      
        Yang Wei:
      
        - Remove needless extra semicolon in clang C++ glue code.
      
      Intel PT/BTS:
      
        Adrian Hunter:
      
        - Improve auxtrace address filter error message when there is no DSO.
      
        - Fix divide by zero when TSC is not available.
      
        - Further improvements to the export to sqlite/postgresql python scripts
          and to the GUI sqlviewer, exporting 'parent_id' to enable the
          creation of call trees.
      
        Andi Kleen:
      
        - Generalize function to copy from thread addr space from intel-bts code.
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
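The 'perf annotate' item above describes replacing a hardcoded column width with one computed from the longest instruction name. A minimal sketch of that idea follows; the function names `max_name_len()` and `format_insn()` are hypothetical and are not taken from the perf sources.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: length of the longest instruction name in the list. */
size_t max_name_len(const char *const *names, size_t n)
{
    size_t max = 0;

    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(names[i]);

        if (len > max)
            max = len;
    }
    return max;
}

/*
 * Hypothetical helper: left-align the name in a column of `width`
 * characters, then append a space and the operands.
 */
int format_insn(char *buf, size_t size, const char *name,
                const char *operands, size_t width)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%-*s %s", (int)width, name, operands);
}
```

With the names "mov", "vpmovmskb" and "vpcmpeqb", the computed width is 9, so a short name such as "mov" is padded to the same column as the long ones instead of being cut off at 6 characters.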
    • perf/core: Mark expected switch fall-through · 43aa378b
      Gustavo A. R. Silva authored
      In preparation for enabling -Wimplicit-fallthrough, mark switch cases
      where we are expecting to fall through.
      
      This patch fixes the following warning:
      
        kernel/events/core.c: In function ‘perf_event_parse_addr_filter’:
        kernel/events/core.c:9154:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
            kernel = 1;
            ~~~~~~~^~~
        kernel/events/core.c:9156:3: note: here
           case IF_SRC_FILEADDR:
           ^~~~
      
      Warning level 3 was used: -Wimplicit-fallthrough=3
      
      This patch is part of the ongoing efforts to enable -Wimplicit-fallthrough.
      Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: https://lkml.kernel.org/r/20190212205430.GA8446@embeddedor
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
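To illustrate the kind of annotation this commit refers to, here is a minimal sketch of an intentional fall-through marked with a comment that GCC recognizes at -Wimplicit-fallthrough=3. The enum, struct and function names are hypothetical and only loosely modelled on the diagnostic quoted above; this is not the code from kernel/events/core.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical token kinds, for illustration only. */
enum src_kind { SRC_KERNELADDR, SRC_FILEADDR, SRC_OTHER };

struct parsed {
    bool kernel;    /* filter applies to kernel addresses */
    bool has_addr;  /* an address argument is expected */
};

void classify(enum src_kind kind, struct parsed *out)
{
    assert(out != NULL);
    out->kernel = false;
    out->has_addr = false;

    switch (kind) {
    case SRC_KERNELADDR:
        out->kernel = true;
        /* fall through */
    case SRC_FILEADDR:
        /* Shared handling for both address-style kinds. */
        out->has_addr = true;
        break;
    default:
        break;
    }
}
```

Without the comment, the first case would trigger the "this statement may fall through" warning; with it, the compiler treats the missing break as deliberate.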
    • perf/x86/intel/uncore: Fix client IMC events return huge result · 8041ffd3
      Kan Liang authored
      The client IMC bandwidth events currently return very large values:
      
        $ perf stat -e uncore_imc/data_reads/ -e uncore_imc/data_writes/ -I 10000 -a
      
        10.000117222 34,788.76 MiB uncore_imc/data_reads/
        10.000117222 8.26 MiB uncore_imc/data_writes/
        20.000374584 34,842.89 MiB uncore_imc/data_reads/
        20.000374584 10.45 MiB uncore_imc/data_writes/
        30.000633299 37,965.29 MiB uncore_imc/data_reads/
        30.000633299 323.62 MiB uncore_imc/data_writes/
        40.000891548 41,012.88 MiB uncore_imc/data_reads/
        40.000891548 6.98 MiB uncore_imc/data_writes/
        50.001142480 1,125,899,906,621,494.75 MiB uncore_imc/data_reads/
        50.001142480 6.97 MiB uncore_imc/data_writes/
      
      The client IMC events are freerunning counters. They still use the
      old event encoding format (0x1 for data_reads and 0x2 for data_writes).
      The counter bit width is calculated by common code, which assumes that
      the standard encoding format is used for the freerunning counters.
      As a result, the wrong bit width is calculated.
      
      This patch converts the old client IMC event encoding to the
      standard encoding format.
      
      The current common code uses event->attr.config, which is copied
      directly from user space. We should not implicitly modify it for a
      converted event, so event->hw.config replaces event->attr.config in
      common code.
      
      For client IMC events, event->attr.config is used to calculate a
      converted event with the standard encoding format in the custom
      event_init(). The converted event is stored in event->hw.config.
      Other freerunning counter events already use the standard
      encoding format, so the same value as event->attr.config is assigned to
      event->hw.config in the common event_init().
      Reported-by: Jin Yao <yao.jin@linux.intel.com>
      Tested-by: Jin Yao <yao.jin@linux.intel.com>
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: stable@kernel.org # v4.18+
      Fixes: 9aae1780 ("perf/x86/intel/uncore: Clean up client IMC uncore")
      Link: https://lkml.kernel.org/r/20190227165729.1861-1-kan.liang@linux.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
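The effect of a wrong bit width can be sketched with one common way of differencing a wrapping freerunning counter: shift both raw reads to the top of a 64-bit word, subtract, and shift back. The function name `counter_delta()` and the 32-bit width used in the usage note are assumptions for illustration; the commit message does not state the actual counter width.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Delta between two raw reads of a freerunning counter that is
 * `bits` wide. Shifting both values up by (64 - bits) makes the
 * unsigned subtraction wrap at the counter's width instead of at
 * 64 bits; shifting back recovers the delta.
 */
uint64_t counter_delta(uint64_t prev, uint64_t now, unsigned int bits)
{
    assert(bits >= 1 && bits <= 64);

    unsigned int shift = 64 - bits;

    return ((now << shift) - (prev << shift)) >> shift;
}
```

Under the assumed 32-bit width, a counter that wraps from 0xFFFFFF00 to 0x10 yields a delta of 0x110. If the same reads are differenced with a width of 64, the result is 0xFFFFFFFF00000110, a value close to 2^64, which is the kind of outlier seen in the last data_reads sample above.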
    • perf/ring_buffer: Use high order allocations for AUX buffers optimistically · 5768402f
      Alexander Shishkin authored
      Currently, the AUX buffer allocator will use high-order allocations
      for PMUs that don't support hardware scatter-gather chaining to ensure
      large contiguous blocks of pages, and always use an array of single
      pages otherwise.
      
      There is, however, a tangible performance benefit in using larger chunks
      of contiguous memory even in the latter case, that comes from not having
      to fetch the next page's address at every page boundary. In particular,
      a task running under Intel PT on an Atom CPU shows 1.5%-2% less runtime
      penalty with a single multi-page output region in snapshot mode (no PMI)
      than with multiple single-page output regions, from ~6% down to ~4%. For
      the snapshot mode it does make a difference as it is intended to run over
      long periods of time.
      
      For this reason, change the allocation policy to always optimistically
      start with the highest possible order when allocating pages for the AUX
      buffer, descending until the allocation succeeds or the order-zero
      allocation fails.
      Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Link: https://lkml.kernel.org/r/20190215114727.62648-2-alexander.shishkin@linux.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
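The allocation policy described in this commit can be sketched in user space as a loop that starts at the highest usable order and descends until a block is obtained or order zero fails. Everything here is hypothetical: `try_alloc_fn`, `start_order()`, `alloc_chunk()` and the toy hook `succeed_up_to()` stand in for the kernel's page allocator and are not the functions used in the ring buffer code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hook: nonzero if a block of 2^order pages was obtained. */
typedef int (*try_alloc_fn)(unsigned int order, void *ctx);

/* Largest order with 2^order <= nr_pages, capped at max_order. */
unsigned int start_order(size_t nr_pages, unsigned int max_order)
{
    assert(nr_pages > 0);

    unsigned int order = 0;

    while (order < max_order && ((size_t)2 << order) <= nr_pages)
        order++;
    return order;
}

/*
 * Try the highest order first and descend on failure.
 * Returns the order that succeeded, or -1 if order 0 failed too.
 */
int alloc_chunk(size_t nr_pages, unsigned int max_order,
                try_alloc_fn try_alloc, void *ctx)
{
    int order = (int)start_order(nr_pages, max_order);

    for (; order >= 0; order--) {
        if (try_alloc((unsigned int)order, ctx))
            return order;
    }
    return -1;
}

/* Toy hook for demonstration: succeeds only up to the order in *ctx. */
int succeed_up_to(unsigned int order, void *ctx)
{
    return (int)order <= *(const int *)ctx;
}
```

For an 8-page request with a toy limit of order 1, the loop tries orders 3 and 2, fails, and settles on order 1. A real allocator would repeat this for the remaining pages; the sketch covers a single chunk only.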
  3. 06 Mar, 2019 6 commits