1. 31 Jul, 2020 1 commit
    • tools build feature: Use CC and CXX from parent · e3232c2f
      Thomas Hebb authored
      commit c8c18867 ("tools build: Use the same CC for feature detection
      and actual build") changed these assignments from unconditional (:=) to
      conditional (?=) so that they wouldn't clobber values from the
      environment. However, conditional assignment does not work properly for
      variables that Make implicitly sets, among which are CC and CXX. To
      quote tools/scripts/Makefile.include, which handles this properly:
      
        # Makefiles suck: This macro sets a default value of $(2) for the
        # variable named by $(1), unless the variable has been set by
        # environment or command line. This is necessary for CC and AR
        # because make sets default values, so the simpler ?= approach
        # won't work as expected.
      
      In other words, the conditional assignments will not run even if the
      variables are not overridden in the environment; Make will set CC to
      "cc" and CXX to "g++" when it starts[1], meaning the variables are not
      empty by the time the conditional assignments are evaluated. This breaks
      cross-compilation when CROSS_COMPILE is set but CC isn't, since "cc"
      gets used for feature detection instead of the cross compiler (and
      likewise for CXX).
      
      To fix the issue, just pass down the values of CC and CXX computed by
      the parent Makefile, which gets included by the Makefile that actually
      builds whatever we're detecting features for and so is guaranteed to
      have good values. This is a better solution anyway, since it means we
      aren't trying to replicate the logic of the parent build system and so
      don't risk it getting out of sync.
      
      Leave PKG_CONFIG alone, since 1) there's no common logic to compute it
      in Makefile.include, and 2) it's not an implicit variable, so
      conditional assignment works properly.
      
      [1] https://www.gnu.org/software/make/manual/html_node/Implicit-Variables.html
      
      Fixes: c8c18867 ("tools build: Use the same CC for feature detection and actual build")
      Signed-off-by: Thomas Hebb <tommyhebb@gmail.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: David Carrillo-Cisneros <davidcc@google.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Igor Lubashev <ilubashe@akamai.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Quentin Monnet <quentin@isovalent.com>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: thomas hebb <tommyhebb@gmail.com>
      Link: http://lore.kernel.org/lkml/0a6e69d1736b0fa231a648f50b0cce5d8a6734ef.1595822871.git.tommyhebb@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  2. 30 Jul, 2020 22 commits
  3. 29 Jul, 2020 2 commits
    • perf tools: No need to cache the PMUs in ARM SPE auxtrace init routine · 3e43d79d
      Wei Li authored
      - auxtrace_record__init() is called only once, so there is no point in
        using a static variable to cache the results of
        find_all_arm_spe_pmus(). Make it local and free the results after use.

      - Also, although SPE is micro-architecture dependent, so far it only
        supports "statistical-profiling-extension-v1", and there is no way to
        use the PMU events of multiple SPEs in a single perf command.

      So remove the useless check code to make this clear.
      Signed-off-by: Wei Li <liwei391@huawei.com>
      Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Hanjun Guo <guohanjun@huawei.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kim Phillips <kim.phillips@arm.com>
      Cc: Leo Yan <leo.yan@linaro.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lore.kernel.org/lkml/20200724071111.35593-3-liwei391@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Fix record failure when mixed with ARM SPE event · 31e81e0b
      Wei Li authored
      When recording with cache-misses and an arm_spe_x event, I found that it
      fails silently, without showing any error info, if I put cache-misses
      after the 'arm_spe_x' event.
      
        [root@localhost 0620]# perf record -e cache-misses \
      				-e arm_spe_0/ts_enable=1,pct_enable=1,pa_enable=1,load_filter=1,jitter=1,store_filter=1,min_latency=0/ sleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.067 MB perf.data ]
        [root@localhost 0620]#
        [root@localhost 0620]# perf record -e arm_spe_0/ts_enable=1,pct_enable=1,pa_enable=1,load_filter=1,jitter=1,store_filter=1,min_latency=0/ \
      				     -e  cache-misses sleep 1
        [root@localhost 0620]#
      
      The current code can only work if the only event to be traced is an
      'arm_spe_x', or if it is the last event to be specified. Otherwise the
      last event type will be checked against all the arm_spe_pmus[i]->types,
      none will match and an out of bound 'i' index will be used in
      arm_spe_recording_init().
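
      A simplified sketch of the failure mode described above, and of a
      lookup that avoids it. The struct and function names are hypothetical
      stand-ins chosen for illustration, not the actual perf code:

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for perf's PMU and event types. */
struct pmu   { unsigned int type; };
struct event { unsigned int type; };

/*
 * Failure mode: the inner loop restarts 'i' for every event, so after the
 * outer loop 'i' only reflects the LAST event. If that event is not an SPE
 * event, 'i' ends up equal to nr_pmus, one past the end of the array.
 */
static size_t last_event_pmu_index(const struct event *events, size_t nr_events,
                                   const struct pmu *pmus, size_t nr_pmus)
{
    size_t i = 0;

    for (size_t e = 0; e < nr_events; e++) {
        for (i = 0; i < nr_pmus; i++) {
            if (events[e].type == pmus[i].type)
                break;
        }
    }
    return i;
}

/* Safer: remember the first matching PMU at the moment it is found. */
static const struct pmu *first_matching_pmu(const struct event *events,
                                            size_t nr_events,
                                            const struct pmu *pmus,
                                            size_t nr_pmus)
{
    for (size_t e = 0; e < nr_events; e++) {
        for (size_t i = 0; i < nr_pmus; i++) {
            if (events[e].type == pmus[i].type)
                return &pmus[i];
        }
    }
    return NULL; /* no event matched any PMU */
}
```

      With one SPE PMU and the events ordered "SPE, then cache-misses", the
      first function returns an index equal to the array length, while the
      second still returns the SPE PMU.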
      
      We currently don't support multiple concurrent arm_spe_x events; that
      is checked in arm_spe_recording_options(), which shows the relevant
      info. So check for, and record, the first 'arm_spe_pmu' found to fix
      this issue here.
      
      Fixes: ffd3d18c ("perf tools: Add ARM Statistical Profiling Extensions (SPE) support")
      Signed-off-by: Wei Li <liwei391@huawei.com>
      Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
      Tested-by: Leo Yan <leo.yan@linaro.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Hanjun Guo <guohanjun@huawei.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kim Phillips <kim.phillips@arm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lore.kernel.org/lkml/20200724071111.35593-2-liwei391@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  4. 28 Jul, 2020 1 commit
    • perf bench: Add basic syscall benchmark · c2a08203
      Davidlohr Bueso authored
      The usefulness of having a standard way of testing syscall performance
      has come up from time to time[0]. Furthermore, some of our testing
      machinery (such as 'mmtests') already makes use of a simplified version
      of the microbenchmark. This patch mainly takes the same idea to measure
      syscall throughput compatible with 'perf-bench' via getppid(2), yet
      without any of the additional template stuff from Ingo's version (based
      on numa.c). The code is identical to what mmtests uses.
      
      [0] https://lore.kernel.org/lkml/20160201074156.GA27156@gmail.com/
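
      The idea of timing a tight loop of getppid(2) calls can be sketched
      roughly as follows. This is an illustration only, not the code added
      by the patch: the bench_result type, the bench_getppid() helper and
      the use of clock_gettime() are assumptions of this sketch.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>
#include <unistd.h>

/* Hypothetical result type for this sketch; not taken from perf. */
struct bench_result {
    double total_sec;
    double usecs_per_op;
    double ops_per_sec;
};

/* Time 'loops' back-to-back getppid(2) calls and derive per-call figures. */
static struct bench_result bench_getppid(unsigned long loops)
{
    struct timespec start, stop;
    struct bench_result res;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (unsigned long i = 0; i < loops; i++)
        getppid();
    clock_gettime(CLOCK_MONOTONIC, &stop);

    /* Elapsed wall-clock time in seconds. */
    res.total_sec = (double)(stop.tv_sec - start.tv_sec) +
                    (double)(stop.tv_nsec - start.tv_nsec) / 1e9;
    /* Guard against division by zero for empty or unmeasurably short runs. */
    res.usecs_per_op = loops ? res.total_sec * 1e6 / (double)loops : 0.0;
    res.ops_per_sec = res.total_sec > 0.0 ? (double)loops / res.total_sec : 0.0;
    return res;
}
```

      Calling bench_getppid(10000000) from a main() and printing the three
      fields would give figures of the same kind as in the committer
      testing output; the actual numbers depend on the machine.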
      
      Committer notes:
      
      Add missing stdlib.h and unistd.h to get the prototypes for exit() and
      getppid().
      
      Committer testing:
      
        $ perf bench
        Usage:
        	perf bench [<common options>] <collection> <benchmark> [<options>]
      
                # List of all available benchmark collections:
      
                 sched: Scheduler and IPC benchmarks
               syscall: System call benchmarks
                   mem: Memory access benchmarks
                  numa: NUMA scheduling and MM benchmarks
                 futex: Futex stressing benchmarks
                 epoll: Epoll stressing benchmarks
             internals: Perf-internals benchmarks
                   all: All benchmarks
      
        $
        $ perf bench syscall
      
                # List of available benchmarks for collection 'syscall':
      
                 basic: Benchmark for basic getppid(2) calls
                   all: Run all syscall benchmarks
      
        $ perf bench syscall basic
        # Running 'syscall/basic' benchmark:
        # Executed 10000000 getppid() calls
             Total time: 3.679 [sec]
      
               0.367957 usecs/op
                2717708 ops/sec
        $ perf bench syscall all
        # Running syscall/basic benchmark...
        # Executed 10000000 getppid() calls
             Total time: 3.644 [sec]
      
               0.364456 usecs/op
                2743815 ops/sec
      
        $
      Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
      Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Acked-by: Mel Gorman <mgorman@techsingularity.net>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lore.kernel.org/lkml/20190308181747.l36zqz2avtivrr3c@linux-r8p5
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  5. 22 Jul, 2020 8 commits
  6. 21 Jul, 2020 3 commits
  7. 17 Jul, 2020 3 commits
    • perf metric: Add 'struct expr_id_data' to keep expr value · 070b3b5a
      Jiri Olsa authored
      Add 'struct expr_id_data' to keep an expr value instead of just a simple
      double pointer, so we can store more data for ID in the following
      changes.
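
      As a rough illustration of why a wrapper struct helps: once the value
      lives inside a struct, later changes can add fields without touching
      every user of the pointer. Only the struct name comes from this
      commit; the single 'val' field and the expr_id_data__new() helper are
      assumptions of this sketch.

```c
#include <stdlib.h>

/* Struct name from the commit; holding only 'val' is an assumption here. */
struct expr_id_data {
    double val;
};

/* Hypothetical helper: allocate a record holding the value for one ID. */
static struct expr_id_data *expr_id_data__new(double val)
{
    struct expr_id_data *data = malloc(sizeof(*data));

    if (data)
        data->val = val;
    return data; /* caller frees; NULL on allocation failure */
}
```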
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lore.kernel.org/lkml/20200712132634.138901-3-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf metric: Rename expr__add_id() to expr__add_val() · 2c46f542
      Jiri Olsa authored
      Rename expr__add_id() to expr__add_val() so we can use expr__add_id() to
      actually add just the id without any value in following changes.
      
      There's no functional change.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: John Garry <john.garry@huawei.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lore.kernel.org/lkml/20200712132634.138901-2-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf probe: Warn if the target function is a GNU indirect function · 3de2bf9d
      Masami Hiramatsu authored
      Warn if the probe target function is a GNU indirect function (GNU_IFUNC)
      because it may not be what the user wants to probe.
      
      The GNU indirect function ( https://sourceware.org/glibc/wiki/GNU_IFUNC )
      is a dynamic symbol resolved at runtime. An IFUNC function is a selector
      which is invoked from the ELF loader, but in the symbol table the
      address of the function that the IFUNC will select is the same as that
      of the IFUNC itself. This can confuse users trying to probe such
      functions.
      
      For example, memcpy is an IFUNC.
      
        probe_libc:memcpy    (on __new_memcpy_ifunc@x86_64/multiarch/memcpy.c in /usr/lib64/libc-2.30.so)
      
      the probe is put on an IFUNC.
      
        perf  1742 [000] 26201.715632: probe_libc:memcpy: (7fdaa53824c0)
                    7fdaa53824c0 __new_memcpy_ifunc+0x0 (inlined)
                    7fdaa5d4a980 elf_machine_rela+0x6c0 (inlined)
                    7fdaa5d4a980 elf_dynamic_do_Rela+0x6c0 (inlined)
                    7fdaa5d4a980 _dl_relocate_object+0x6c0 (/usr/lib64/ld-2.30.so)
                    7fdaa5d42155 dl_main+0x1cc5 (/usr/lib64/ld-2.30.so)
                    7fdaa5d5831a _dl_sysdep_start+0x54a (/usr/lib64/ld-2.30.so)
                    7fdaa5d3ffeb _dl_start_final+0x25b (inlined)
                    7fdaa5d3ffeb _dl_start+0x25b (/usr/lib64/ld-2.30.so)
                    7fdaa5d3f117 .annobin_rtld.c+0x7 (inlined)
      
      And the event is invoked from the ELF loader instead of the target
      program's main code.
      
      Moreover, at this moment, we can not probe on the function which will
      be selected by the IFUNC, because it is determined at runtime. But
      uprobe will be prepared before running the target binary.
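
      To make the runtime selection concrete, here is a minimal sketch of
      an IFUNC definition. It assumes GCC or Clang on an ELF target whose
      loader supports IFUNC (for example glibc on Linux); the my_strlen
      names are hypothetical and unrelated to glibc's own implementations.

```c
#include <stddef.h>

/* One candidate implementation; real libraries ship several per CPU type. */
static size_t my_strlen_generic(const char *s)
{
    size_t n = 0;

    while (s[n] != '\0')
        n++;
    return n;
}

/*
 * Resolver (the "selector"): invoked by the loader while it processes
 * relocations. Whatever function pointer it returns becomes the target of
 * every later call to my_strlen(). A real resolver would typically inspect
 * CPU features here; this sketch always picks the generic version.
 */
static size_t (*resolve_my_strlen(void))(const char *)
{
    return my_strlen_generic;
}

/* 'my_strlen' is the IFUNC symbol that callers (and symbol tables) see. */
size_t my_strlen(const char *s) __attribute__((ifunc("resolve_my_strlen")));
```

      A probe placed on the my_strlen symbol would land on the resolver,
      which runs once in the loader, rather than on my_strlen_generic(),
      which is what later calls actually execute.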
      
      Thus, I decided to warn the user when 'perf probe' detects that the
      probe point is on a GNU IFUNC symbol. Someone who wants to probe an
      IFUNC symbol to debug the IFUNC function can ignore this warning.
      
      Committer notes:
      
      I.e., this warning will be emitted if the probe point is an IFUNC:
      
        "Warning: The probe function (%s) is a GNU indirect function.\n"
        "Consider identifying the final function used at run time and set the probe directly on that.\n"
      
      Complete set of steps:
      
        # readelf -sW /lib64/libc-2.29.so  | grep IFUNC | tail
         22196: 0000000000109a80   183 IFUNC   GLOBAL DEFAULT   14 __memcpy_chk
         22214: 00000000000b7d90   191 IFUNC   GLOBAL DEFAULT   14 __gettimeofday
         22336: 000000000008b690    60 IFUNC   GLOBAL DEFAULT   14 memchr
         22350: 000000000008b9b0    89 IFUNC   GLOBAL DEFAULT   14 __stpcpy
         22420: 000000000008bb10    76 IFUNC   GLOBAL DEFAULT   14 __strcasecmp_l
         22582: 000000000008a970    60 IFUNC   GLOBAL DEFAULT   14 strlen
         22585: 00000000000a54d0    92 IFUNC   WEAK   DEFAULT   14 wmemset
         22600: 000000000010b030    92 IFUNC   GLOBAL DEFAULT   14 __wmemset_chk
         22618: 000000000008b8a0   183 IFUNC   GLOBAL DEFAULT   14 __mempcpy
         22675: 000000000008ba70    76 IFUNC   WEAK   DEFAULT   14 strcasecmp
        #
        # perf probe -x /lib64/libc-2.29.so strlen
        Warning: The probe function (strlen) is a GNU indirect function.
        Consider identifying the final function used at run time and set the probe directly on that.
        Added new event:
          probe_libc:strlen    (on strlen in /usr/lib64/libc-2.29.so)
      
        You can now use it in all perf tools, such as:
      
        	perf record -e probe_libc:strlen -aR sleep 1
      
        #
      Reported-by: Andi Kleen <andi@firstfloor.org>
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Link: http://lore.kernel.org/lkml/159438669349.62703.5978345670436126948.stgit@devnote2
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>