1. 27 Dec, 2017 33 commits
    • perf tool: Improve bash command line auto-complete for multiple events with comma · 74cd5815
      Jin Yao authored
      perf has perf-completion.sh to define command line auto-completion in
      bash/zsh.
      
      For 'record/stat -e' it works for a single event, but not when
      multiple events are specified, separated by commas.
      
      It would be very useful to fix this so that multiple comma-separated
      events can be completed as well.
      
      With this patch, the result can be like this:
      
      1. Support the events returned from 'perf list --raw-dump'
      
      root@skl:/tmp# perf stat -e cpu/cache<TAB>
      cpu/cache-misses/      cpu/cache-references/
      
      root@skl:/tmp# perf stat -e cpu/cache-misses/,cpu/branch-<TAB>
      cpu/branch-instructions/  cpu/branch-misses/
      
      root@skl:/tmp# perf stat -e cpu/cache-misses/,cpu/branch-i<TAB>
      root@skl:/tmp# perf stat -e cpu/cache-misses/,cpu/branch-instructions/
      
      2. Support the events listed in /sys/bus/event_source/devices/cpu/events
      
      root@skl:/tmp# perf stat -e cycle<TAB>
      cycle_activity.cycles_l1d_miss  cycle_activity.stalls_l3_miss
      cycle_activity.cycles_l2_miss   cycle_activity.stalls_mem_any
      cycle_activity.cycles_l3_miss   cycle_activity.stalls_total
      cycle_activity.cycles_mem_any   cycles-ct
      cycle_activity.stalls_l1d_miss  cycles-t
      cycle_activity.stalls_l2_miss
      
      root@skl:/tmp# perf stat -e cycles-<TAB>
      cycles-ct  cycles-t
      
      root@skl:/tmp# perf stat -e cycles-t,cpu/c<TAB>
      cpu/cache-misses/      cpu/cpu-cycles/        cpu/cycles-t/
      cpu/cache-references/  cpu/cycles-ct/
      
      root@skl:/tmp# perf stat -e cycles-t,cpu/cache-<TAB>
      cpu/cache-misses/      cpu/cache-references/
      
      root@skl:/tmp# perf stat -e cycles-t,cpu/cache-misses/
      
      3. Support uppercase events prefixed with "cpu/"
      
      root@skl:/tmp# perf stat -e cpu/c<TAB>
      cpu/cache-misses/      cpu/cpu-cycles/        cpu/cycles-t/
      cpu/cache-references/  cpu/cycles-ct/
      
      root@skl:/tmp# perf stat -e cpu/cache-misses/,cpu/C<TAB>
      cpu/CACHE-MISSES/      cpu/CPU-CYCLES/        cpu/CYCLES-T/
      cpu/CACHE-REFERENCES/  cpu/CYCLES-CT/
      
      root@skl:/tmp# perf stat -e cpu/cache-misses/,cpu/CACHE-REFERENCES/
      
      Note that:
      
      a) This patch only supports bash.
      
      b) It doesn't support the cases like {},{} or {...,...}.
      
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1513848370-8098-1-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf probe arm64: Fix symbol fixup issues due to ELF type · f1031c8d
      Kim Phillips authored
      On an arm64 machine running a CONFIG_RANDOMIZE_BASE=y kernel, perf
      kernel symbol resolution fails.  Debugging showed symsrc_init() calling
      the default elf__needs_adjust_symbols(), where checks for an ET_DYN (3)
      ehdr.e_type failed when they should have succeeded.
      
      Fix by adopting the powerpc version of the weak
      elf__needs_adjust_symbols() function, as done in commit d2332098
      ("perf probe ppc: Fix symbol fixup issues due to ELF type").
      
      Prior to this patch, perf test 1 would fail:
      
        $ sudo oldperf test -v 1 |& head
         1: vmlinux symtab matches kallsyms                       :
        test child forked, pid 33374
        Looking at the vmlinux_path (8 entries long)
        Using /usr/lib/debug/boot/vmlinux for symbols
        ERR : 0xfffe0000100f1000: do_undefinstr not on kallsyms
        ERR : 0xfffe0000100f1320: do_sysinstr not on kallsyms
        ERR : 0xfffe0000100f13b0: do_debug_exception not on kallsyms
        ERR : 0xfffe0000100f1498: do_mem_abort not on kallsyms
        ERR : 0xfffe0000100f1580: do_sp_pc_abort not on kallsyms
        ...
      
      After applying this patch, perf test 1 now succeeds:
      
        $ sudo ./newperf test -v 1 |& head
         1: vmlinux symtab matches kallsyms                       :
        test child forked, pid 33378
        Looking at the vmlinux_path (8 entries long)
        Using /usr/lib/debug/boot/vmlinux for symbols
        WARN: 0xffff000008081000: diff name v: do_undefinstr k: __exception_text_start
        WARN: 0xffff0000080819e8: diff name v: __irqentry_text_end k: __softirqentry_text_start
        WARN: 0xffff000008081d08: diff name v: __entry_text_start k: __softirqentry_text_end
        WARN: 0xffff00000809db5c: diff name v: flush_icache_range k: __flush_cache_user_range
        WARN: 0xffff000008101908: diff name v: sys_ni_syscall k: sys_vm86old
        ...
      
      Signed-off-by: Kim Phillips <kim.phillips@arm.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lkml.kernel.org/r/20171214175242.e30450f17f93ad675d968fa3@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
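
As a rough illustration of the check described in the arm64 commit above, the sketch below treats ET_DYN as an ELF type whose symbols need adjusting. The function name `needs_adjust_symbols_sketch` and the use of plain `<elf.h>` types are assumptions made for this example. The commit message names only ET_DYN, so including ET_EXEC and ET_REL here is also an assumption, not a quote from the perf tree.

```c
#include <elf.h>
#include <stdbool.h>

/*
 * Sketch only: report whether symbol addresses read from an ELF image of
 * this type should be adjusted. ET_DYN (3) is the case the commit message
 * says was being missed; ET_EXEC and ET_REL are assumed for illustration.
 */
static bool needs_adjust_symbols_sketch(const Elf64_Ehdr *ehdr)
{
	return ehdr->e_type == ET_EXEC ||
	       ehdr->e_type == ET_REL ||
	       ehdr->e_type == ET_DYN;
}
```

A kernel built with CONFIG_RANDOMIZE_BASE=y is reported by the commit message as ET_DYN, which is why a check lacking that case would skip the adjustment.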
    • perf evsel: Enable ignore_missing_thread for pid option · ca800068
      Mengting Zhang authored
      While monitoring a multithreaded process with the pid option, perf may
      sometimes get a sys_perf_event_open failure with 3 (No such process) if
      any of the process's threads die before we open the event. However, we
      want perf to continue monitoring the remaining threads and not exit
      with an error.
      
      Here, the patch enables perf_evsel::ignore_missing_thread for the -p
      option to avoid a complete failure if any of the threads die before we
      open the event. But it may still get a sys_perf_event_open failure with
      22 (Invalid argument) if we monitor several event groups.
      
              sys_perf_event_open: pid 28960  cpu 40  group_fd 118202  flags 0x8
              sys_perf_event_open: pid 28961  cpu 40  group_fd 118203  flags 0x8
              WARNING: Ignored open failure for pid 28962
              sys_perf_event_open: pid 28962  cpu 40  group_fd [118203]  flags 0x8
              sys_perf_event_open failed, error -22
      
      That is because when we ignore a missing thread, we change the thread_idx
      without dealing with its fds, FD(evsel, cpu, thread). Then get_group_fd()
      may return a wrong group_fd for the next thread and sys_perf_event_open()
      returns with 22.
      
              sys_perf_event_open(){
                 ...
                 if (group_fd != -1)
                     perf_fget_light()//to get corresponding group_leader by group_fd
                 ...
                 if (group_leader)
                    if (group_leader->ctx->task != ctx->task)//should on the same task
                         goto err_context
                 ...
              }
      
      This patch also fixes this bug by introducing perf_evsel__remove_fd() and
      update_fds to allow removing fds for the missing thread.
      
      Changes since v1:
      - Change group_fd__remove() into a more generic way without changing code logic
      - Remove redundant condition
      
      Changes since v2:
      - Use a proper function name and add some comment.
      - Multiline comment style fixes.
      
      Committer testing:
      
      Before this patch the recently added 'perf stat --per-thread' for system
      wide counting would race while enumerating all threads using /proc:
      
        [root@jouet ~]# perf stat --per-thread
        failed to parse CPUs map: No such file or directory
      
         Usage: perf stat [<options>] [<command>]
      
            -C, --cpu <cpu>       list of cpus to monitor in system-wide
            -a, --all-cpus        system-wide collection from all CPUs
        [root@jouet ~]# perf stat --per-thread
        failed to parse CPUs map: No such file or directory
      
         Usage: perf stat [<options>] [<command>]
      
            -C, --cpu <cpu>       list of cpus to monitor in system-wide
            -a, --all-cpus        system-wide collection from all CPUs
        [root@jouet ~]#
      
      This happened when, say, the kernel was being built, i.e. with lots of
      short-lived threads. After this patch it no longer happens.
      
      Signed-off-by: Mengting Zhang <zhangmengting@huawei.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Cheng Jian <cj.chengjian@huawei.com>
      Cc: Li Bin <huawei.libin@huawei.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1513148513-6974-1-git-send-email-zhangmengting@huawei.com
      [ Remove one use 'evlist' alias variable ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
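
The idea of dropping a missing thread's fds, as described in the commit above, might be sketched as follows. This is only an illustration: the flat `fds[cpu * nr_threads + thread]` layout and the name `remove_thread_fds` are assumptions for this example, standing in for perf's FD(evsel, cpu, thread) storage and perf_evsel__remove_fd(), whose real code is not shown in the commit message.

```c
#include <stddef.h>

/*
 * Sketch only: remove the column for thread_idx from a per-cpu, per-thread
 * fd table so that later threads keep lining up with their own fds.
 * Hypothetical layout: fds[cpu * nr_threads + thread].
 */
static void remove_thread_fds(int *fds, int nr_cpus, int nr_threads,
			      int thread_idx)
{
	for (int cpu = 0; cpu < nr_cpus; cpu++) {
		int *row = fds + (size_t)cpu * nr_threads;

		/* Shift every later thread's fd one slot to the left. */
		for (int t = thread_idx; t < nr_threads - 1; t++)
			row[t] = row[t + 1];

		/* The last slot no longer refers to any thread. */
		row[nr_threads - 1] = -1;
	}
}
```

Without such a shift, a group leader lookup for the next thread could pick up an fd that belongs to a different task, which matches the -22 failure shown in the commit message.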
    • perf s390: Always build with -fPIC · a9a3f1d1
      Hendrik Brueckner authored
      On s390, object files must be compiled as position-independent code in
      order to be incrementally linked or linked to shared libraries.
      
      Therefore, add -fPIC to the CFLAGS for s390 to ensure each object file
      is built properly.
      
      Reported-by: Jonathan Hermann <jonathan.hermann@de.ibm.com>
      Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Cc: linux s390 list <linux-s390@vger.kernel.org>
      Link: https://lkml.kernel.org/r/20171207080951.GC4889@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • Revert "perf s390: Always build with -fPIC" · 922991c2
      Arnaldo Carvalho de Melo authored
      This one made x86 always build with -fPIC, when the intention was for
      s390 to be built that way, due to a rebase mistake.
      
      This reverts commit 1dc4ddf1.
      
      Reported-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf test shell: Fix check open filename arg using 'perf trace' · 69b5c953
      Michael Petlan authored
      Commit f231af78 ("perf test shell: Fix check open filename arg using
      'perf trace' on s390x") added an exception for s390x to use openat()
      instead of open() in the test that intercepts an open syscall to look for
      the filename argument as obtained by the vfs_getname 'perf probe' it
      puts in place at the getname_flags kernel function.
      
      It's not just s390x that uses openat() instead of open(), so use 'perf
      list' to look for the syscall:sys_enter_open(at)? present in the system
      being tested instead of checking if the system is s390x.
      
      In fact Namhyung pointed out that glibc 2.26 changed this behaviour, as
      described in https://lwn.net/Articles/738694/, so systems where glibc is
      >= 2.26 will need this patch for this test to work, which already took
      place in some distros for architectures such as s390x, while Fedora 26
      x86_64 is at glibc 2.25, i.e. still uses open().
      
      Signed-off-by: Michael Petlan <mpetlan@redhat.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Link: https://lkml.kernel.org/r/ab23fe42-1080-a46b-503e-744e097f414f@linux.vnet.ibm.com
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      LPU-Reference: 1275675985.12835754.1513095723265.JavaMail.zimbra@redhat.com
      Link: https://lkml.kernel.org/n/tip-j2wbz9av1rw3thr3t0g4dtuk@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf evsel: Fix swap for samples with raw data · f9d8adb3
      Jiri Olsa authored
      When we detect a different endianness we swap the event before processing.
      It's tricky for samples because we have no idea what's inside. We treat
      it as an array of u64s, swap them and later on we swap back parts which
      are different.
      
      This way we also mangle the tracepoint raw data, which ends up with
      'perf report' showing wrong data:
      
        1.95%  comm=Q^B pid=29285 prio=16777216 target_cpu=000
        1.67%  comm=l^B pid=0 prio=16777216 target_cpu=000
      
      Luckily the traceevent library handles the endianness by itself (thank
      you Steven!), so we can pass the RAW data directly in the other
      endianness.
      
        2.51%  comm=beah-rhts-task pid=1175 prio=120 target_cpu=002
        2.23%  comm=kworker/0:0 pid=11566 prio=120 target_cpu=000
      
      The fix is basically to swap back the raw data if a different endianness
      is detected.
      
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/20171129184346.3656-1-jolsa@kernel.org
      [ Add util/memswap.c to python-ext-sources to link missing mem_bswap_64() ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
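
Why swapping the raw data a second time restores it can be seen from a small sketch of a 64-bit byte swap applied over a buffer. The helper names `bswap64_sketch` and `swap_u64_words` are made up for this example and are not perf's mem_bswap_64(); the sketch also assumes the raw payload is handled in whole 8-byte words.

```c
#include <stddef.h>
#include <stdint.h>

/* Reverse the byte order of one 64-bit word. */
static uint64_t bswap64_sketch(uint64_t v)
{
	v = ((v & 0x00ff00ff00ff00ffULL) << 8) |
	    ((v >> 8) & 0x00ff00ff00ff00ffULL);
	v = ((v & 0x0000ffff0000ffffULL) << 16) |
	    ((v >> 16) & 0x0000ffff0000ffffULL);
	return (v << 32) | (v >> 32);
}

/*
 * Swap every 64-bit word of a buffer in place. Calling this twice on the
 * same buffer returns it to its original contents, which is the property
 * a "swap back" step relies on.
 */
static void swap_u64_words(uint64_t *words, size_t nr_words)
{
	for (size_t i = 0; i < nr_words; i++)
		words[i] = bswap64_sketch(words[i]);
}
```

Because the swap is its own inverse, data that was mangled by a blanket u64 swap can be handed on unchanged after one more pass.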
    • perf probe: Support escaped character in parser · c588d158
      Masami Hiramatsu authored
      Support special characters escaped by '\' in the parser.  This allows
      the user to specify versions directly, as below.
      
        =====
        # ./perf probe -x /lib64/libc-2.25.so malloc_get_state\\@GLIBC_2.2.5
        Added new event:
          probe_libc:malloc_get_state (on malloc_get_state@GLIBC_2.2.5 in /usr/lib64/libc-2.25.so)
      
        You can now use it in all perf tools, such as:
      
      	  perf record -e probe_libc:malloc_get_state -aR sleep 1
      
        =====
      
      Or, you can use separators in source filename, e.g.
      
        =====
        # ./perf probe -x /opt/test/a.out foo+bar.c:3
        Semantic error :There is non-digit character in offset.
          Error: Command Parse Error.
        =====
      
      Usually a "+" in the source file name causes a parser error, but
      
        =====
        # ./perf probe -x /opt/test/a.out foo\\+bar.c:4
        Added new event:
          probe_a:main         (on @foo+bar.c:4 in /opt/test/a.out)
      
        You can now use it in all perf tools, such as:
      
      	  perf record -e probe_a:main -aR sleep 1
        =====
      
      an escaped "\+" allows you to specify that.
      
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Acked-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: bhargavb <bhargavaramudu@gmail.com>
      Cc: linux-rt-users@vger.kernel.org
      Link: http://lkml.kernel.org/r/151309111236.18107.5634753157435343410.stgit@devbox
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf string: Add {strdup,strpbrk}_esc() · 1e9f9e8a
      Masami Hiramatsu authored
      To support the special characters escaped by '\' in 'perf probe' event parser.
      
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Acked-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: bhargavb <bhargavaramudu@gmail.com>
      Cc: linux-rt-users@vger.kernel.org
      Link: http://lkml.kernel.org/r/151275052163.24652.18205979384585484358.stgit@devbox
      [ Split from a larger patch ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
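
One way to picture backslash-aware helpers like the ones named in the commit above is the sketch below. The functions `find_unescaped` and `dup_unescaped` are hypothetical stand-ins written for this example. They are not the perf implementations of strpbrk_esc() and strdup_esc(), whose exact behaviour is not described in the commit message.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Sketch only: return the first character of str that is in stopset and
 * is not escaped by a preceding backslash, or NULL if there is none.
 */
static char *find_unescaped(char *str, const char *stopset)
{
	for (char *p = str; *p; p++) {
		if (*p == '\\' && p[1] != '\0') {
			p++;		/* skip the escaped character */
			continue;
		}
		if (strchr(stopset, *p))
			return p;
	}
	return NULL;
}

/*
 * Sketch only: duplicate str, dropping each backslash that escapes the
 * following character. The caller frees the result.
 */
static char *dup_unescaped(const char *str)
{
	char *out = malloc(strlen(str) + 1);
	char *dst = out;

	if (!out)
		return NULL;
	for (const char *p = str; *p; p++) {
		if (*p == '\\' && p[1] != '\0')
			p++;	/* copy the escaped character itself */
		*dst++ = *p;
	}
	*dst = '\0';
	return out;
}
```

With helpers of this kind, a probe spec such as `foo\+bar.c:4` can be split at the unescaped `:` while the `+` stays part of the file name.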
    • perf probe: Find versioned symbols from map · 4b3a2716
      Masami Hiramatsu authored
      Commit d8040645 ("perf symbols: Allow user probes on versioned
      symbols") allows the user to find default versioned symbols (with "@@")
      in the map. However, it did not enable normal versioned symbols (with
      "@") for perf-probe.  E.g.
      
        =====
        # ./perf probe -x /lib64/libc-2.25.so malloc_get_state
        Failed to find symbol malloc_get_state in /usr/lib64/libc-2.25.so
          Error: Failed to add events.
        =====
      
      This solves the above issue by improving the perf-probe symbol search
      function, as below.
      
        =====
        # ./perf probe -x /lib64/libc-2.25.so malloc_get_state
        Added new event:
          probe_libc:malloc_get_state (on malloc_get_state in /usr/lib64/libc-2.25.so)
      
        You can now use it in all perf tools, such as:
      
      	  perf record -e probe_libc:malloc_get_state -aR sleep 1
      
        # ./perf probe -l
          probe_libc:malloc_get_state (on malloc_get_state@GLIBC_2.2.5 in /usr/lib64/libc-2.25.so)
        =====
      
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Acked-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: bhargavb <bhargavaramudu@gmail.com>
      Cc: linux-rt-users@vger.kernel.org
      Link: http://lkml.kernel.org/r/151275049269.24652.1639103455496216255.stgit@devbox
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf probe: Add __return suffix for return events · e63c625a
      Masami Hiramatsu authored
      Add the __return suffix to function return events automatically. Without
      this, the user has to give the --force option and will see a number
      suffix on each event, like "function_1", which is not easy to recognize.
      Instead, this adds the __return suffix automatically.  E.g.
      
        =====
        # ./perf probe -x /lib64/libc-2.25.so 'malloc*%return'
        Added new events:
          probe_libc:malloc_printerr__return (on malloc*%return in /usr/lib64/libc-2.25.so)
          probe_libc:malloc_consolidate__return (on malloc*%return in /usr/lib64/libc-2.25.so)
          probe_libc:malloc_check__return (on malloc*%return in /usr/lib64/libc-2.25.so)
          probe_libc:malloc_hook_ini__return (on malloc*%return in /usr/lib64/libc-2.25.so)
          probe_libc:malloc__return (on malloc*%return in /usr/lib64/libc-2.25.so)
          probe_libc:malloc_trim__return (on malloc*%return in /usr/lib64/libc-2.25.so)
          probe_libc:malloc_usable_size__return (on malloc*%return in /usr/lib64/libc-2.25.so)
          probe_libc:malloc_stats__return (on malloc*%return in /usr/lib64/libc-2.25.so)
          probe_libc:malloc_info__return (on malloc*%return in /usr/lib64/libc-2.25.so)
          probe_libc:mallochook__return (on malloc*%return in /usr/lib64/libc-2.25.so)
          probe_libc:malloc_get_state__return (on malloc*%return in /usr/lib64/libc-2.25.so)
          probe_libc:malloc_set_state__return (on malloc*%return in /usr/lib64/libc-2.25.so)
      
        You can now use it in all perf tools, such as:
      
      	  perf record -e probe_libc:malloc_set_state__return -aR sleep 1
      
        =====
      
      Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Acked-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: bhargavb <bhargavaramudu@gmail.com>
      Cc: linux-rt-users@vger.kernel.org
      Link: http://lkml.kernel.org/r/151275046418.24652.6696011972866498489.stgit@devbox
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf probe: Cut off the version suffix from event name · a3110cd9
      Masami Hiramatsu authored
      Cut off the version suffix (e.g. @GLIBC_2.2.5 etc.) from the automatically
      generated event name. This fixes adding wildcard events, as in the case
      below:
      
        =====
        # perf probe -x /lib64/libc-2.25.so malloc*
        Internal error: "malloc_get_state@GLIBC_2" is wrong event name.
          Error: Failed to add events.
        =====
      
      This failure was caused by a symbol with a version suffix.
      
      With this fix, perf probe automatically cuts the suffix after @, as
      below.
      
        =====
        # ./perf probe -x /lib64/libc-2.25.so malloc*
        Added new events:
          probe_libc:malloc_printerr (on malloc* in /usr/lib64/libc-2.25.so)
          probe_libc:malloc_consolidate (on malloc* in /usr/lib64/libc-2.25.so)
          probe_libc:malloc_check (on malloc* in /usr/lib64/libc-2.25.so)
          probe_libc:malloc_hook_ini (on malloc* in /usr/lib64/libc-2.25.so)
          probe_libc:malloc    (on malloc* in /usr/lib64/libc-2.25.so)
          probe_libc:malloc_trim (on malloc* in /usr/lib64/libc-2.25.so)
          probe_libc:malloc_usable_size (on malloc* in /usr/lib64/libc-2.25.so)
          probe_libc:malloc_stats (on malloc* in /usr/lib64/libc-2.25.so)
          probe_libc:malloc_info (on malloc* in /usr/lib64/libc-2.25.so)
          probe_libc:mallochook (on malloc* in /usr/lib64/libc-2.25.so)
          probe_libc:malloc_get_state (on malloc* in /usr/lib64/libc-2.25.so)
          probe_libc:malloc_set_state (on malloc* in /usr/lib64/libc-2.25.so)
      
        You can now use it in all perf tools, such as:
      
      	  perf record -e probe_libc:malloc_set_state -aR sleep 1
      
        =====
      
      Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
      Reported-by: bhargavb <bhargavaramudu@gmail.com>
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Acked-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: linux-rt-users@vger.kernel.org
      Link: http://lkml.kernel.org/r/None
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
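
The naming rules from the two commits above (cut everything from '@' onward, and append `__return` for return probes) could be combined in a sketch like this one. The function `make_event_name` and its signature are invented for this example; the real perf probe code builds event names differently and applies further validation that is not reproduced here.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Sketch only: derive an event name from a symbol name by dropping any
 * version suffix that starts at '@' and, for return probes, appending
 * "__return". Returns 0 on success, -1 if buf is too small.
 */
static int make_event_name(char *buf, size_t len, const char *symbol,
			   bool is_return)
{
	size_t base_len = strcspn(symbol, "@");	/* length before any '@' */
	int n = snprintf(buf, len, "%.*s%s", (int)base_len, symbol,
			 is_return ? "__return" : "");

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

For `malloc_get_state@GLIBC_2.2.5` this yields `malloc_get_state`, or `malloc_get_state__return` for a return probe, in line with the event names listed in the commit messages.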
    • perf probe: Add warning message if there is unexpected event name · 9f5c6d87
      Masami Hiramatsu authored
      This improves the error message so that the user can learn about an
      event-name error before new events are written to the kprobe-events
      interface.
      
      E.g.
         ======
         #./perf probe -x /lib64/libc-2.25.so malloc_get_state*
         Internal error: "malloc_get_state@GLIBC_2" is an invalid event name.
           Error: Failed to add events.
         ======
      
      Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Acked-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Clarke <pc@us.ibm.com>
      Cc: bhargavb <bhargavaramudu@gmail.com>
      Cc: linux-rt-users@vger.kernel.org
      Link: http://lkml.kernel.org/r/151275040665.24652.5188568529237584489.stgit@devbox
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf env: Adopt perf_env__arch() from the annotate code · 4e8fbc1c
      Arnaldo Carvalho de Melo authored
      And use it in the libunwind case, with both passing a valid perf_env to
      extract the arch to be normalized from and passing NULL with the same
      semantic as in the annotate code: to get it from uname() uts.machine.
      
      Now the code to generate per arch errno translation tables (int/string)
      can use it to decode perf.data files recorded in a different arch than
      that where 'perf trace' (or any other analysis tool) runs.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-p2epffgash69w38kvj3ntpc9@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
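
The "use the recorded arch if there is an environment, otherwise fall back to uname()" behaviour described in the commit above might look roughly like the sketch below. Everything named here is hypothetical: `struct example_env`, `normalize_arch_sketch` and `env_arch_sketch` are stand-ins for this example, and the small mapping table is an assumed subset rather than the list perf actually uses.

```c
#include <stddef.h>
#include <string.h>
#include <sys/utsname.h>

/* Hypothetical stand-in for perf's environment struct; one field only. */
struct example_env {
	const char *arch;
};

/* Map a raw machine name to a family name (assumed, illustrative subset). */
static const char *normalize_arch_sketch(const char *arch)
{
	if (!strcmp(arch, "x86_64"))
		return "x86";
	if (strlen(arch) == 4 && arch[0] == 'i' && arch[2] == '8' &&
	    arch[3] == '6')
		return "x86";
	if (!strcmp(arch, "aarch64"))
		return "arm64";
	if (!strncmp(arch, "s390", 4))
		return "s390";
	return arch;
}

/* With no env (or no recorded arch), ask the running system via uname(). */
static const char *env_arch_sketch(const struct example_env *env)
{
	static struct utsname uts;	/* static: result may point into it */

	if (env && env->arch)
		return normalize_arch_sketch(env->arch);
	if (uname(&uts) < 0)
		return NULL;
	return normalize_arch_sketch(uts.machine);
}
```

Passing an environment taken from a perf.data file recorded elsewhere would then give the recording machine's arch rather than the arch of the machine running the analysis.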
    • perf annotate: Use perf_env when obtaining the arch name · 3285deba
      Arnaldo Carvalho de Melo authored
      Paving the way to reuse these routines in other areas, like when
      generating errno tables.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-rh1qv051vb8gfdcswskrn53h@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf annotate: Get the cpuid from evsel->evlist->env in symbol__annotate() · 5449f13c
      Arnaldo Carvalho de Melo authored
      To reduce its function signature, since we get this from 'evsel' which
      is already one of its arguments.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lkml.kernel.org/n/tip-070eap7t6uicg9c3w086xy2z@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf trace: Use generated syscall table on s390 too · 901bb028
      Hendrik Brueckner authored
      This should speed up accessing new system calls introduced with the
      kernel rather than waiting for libaudit updates to include them.
      
      It also enables users to specify wildcards, for example, perf trace -e
      'open*', just like was already possible on x86.
      
      Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: linux-s390@vger.kernel.org
      LPU-Reference: 1512635281-20733-2-git-send-email-brueckner@linux.vnet.ibm.com
      Link: https://lkml.kernel.org/n/tip-htplh3nbrivi7g3cffbh4fsu@git.kernel.org
      [ split from a larger patch ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf s390: Generate system call table from asm/unistd.h · 164a747f
      Hendrik Brueckner authored
      This should speed up accessing new system calls introduced with
      the kernel rather than waiting for libaudit updates to include
      them.
      
      Committer testing:
      
        $ rm -rf /tmp/build/perf
        $ mkdir /tmp/build/perf
        $ make srctree=/home/acme/git/perf -C tools/perf/arch/s390 OUTPUT=/tmp/build/perf/ archheaders
        make: Entering directory '/home/acme/git/perf/tools/perf/arch/s390'
        /bin/sh '/home/acme/git/perf/tools/perf/arch/s390/entry/syscalls//mksyscalltbl' 'cc' /home/acme/git/perf/tools/arch/s390/include/uapi/asm/unistd.h > /tmp/build/perf/arch/s390/include/generated/asm/syscalls_64.c
        make: Leaving directory '/home/acme/git/perf/tools/perf/arch/s390'
        $ head -5 /tmp/build/perf/arch/s390/include/generated/asm/syscalls_64.c
        static const char *syscalltbl_s390_64[] = {
      	[1] = "exit",
      	[2] = "fork",
      	[3] = "read",
      	[4] = "write",
        $ tail -5 /tmp/build/perf/arch/s390/include/generated/asm/syscalls_64.c
      	[378] = "s390_guarded_storage",
      	[379] = "statx",
      	[380] = "s390_sthyi",
        };
        #define SYSCALLTBL_S390_64_MAX_ID 380
        $
      
      Now to plug this into 'perf trace' proper.
      
      Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: linux-s390@vger.kernel.org
      LPU-Reference: 1512635281-20733-2-git-send-email-brueckner@linux.vnet.ibm.com
      Link: https://lkml.kernel.org/n/tip-h5km60rdg3rqxvsys85q50l3@git.kernel.org
      [ split from a larger patch ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
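
How a generated table of this shape can serve both id-to-name lookup and wildcard selection is sketched below. The four entries are the ones visible in the committer's `head -5` output above; the names `example_syscalltbl`, `example_syscall_name` and `example_syscall_glob_id` are made up for this example, and POSIX fnmatch() is used here as a stand-in for whatever glob matcher 'perf trace' really uses.

```c
#include <fnmatch.h>
#include <stddef.h>

/* Four entries copied from the excerpt above; a real table is far longer. */
static const char *example_syscalltbl[] = {
	[1] = "exit",
	[2] = "fork",
	[3] = "read",
	[4] = "write",
};
#define EXAMPLE_SYSCALLTBL_MAX_ID 4

/* Translate a syscall id to its name, or NULL if the id is not in the table. */
static const char *example_syscall_name(int id)
{
	if (id < 0 || id > EXAMPLE_SYSCALLTBL_MAX_ID)
		return NULL;
	return example_syscalltbl[id];	/* NULL for unassigned slots such as 0 */
}

/* Return the first id whose name matches a shell-style pattern, or -1. */
static int example_syscall_glob_id(const char *pattern)
{
	for (int id = 0; id <= EXAMPLE_SYSCALLTBL_MAX_ID; id++) {
		const char *name = example_syscalltbl[id];

		if (name && fnmatch(pattern, name, 0) == 0)
			return id;
	}
	return -1;
}
```

Designated initializers leave unassigned ids as NULL, so gaps in the syscall numbering need no special handling beyond the NULL check.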
    • tools include s390: Grab a copy of arch/s390/include/uapi/asm/unistd.h · 7af7919f
      Hendrik Brueckner authored
      Will be used for generating the syscall id/string translation table.
      
      Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: linux-s390@vger.kernel.org
      LPU-Reference: 1512635281-20733-2-git-send-email-brueckner@linux.vnet.ibm.com
      Link: https://lkml.kernel.org/n/tip-vjfbfvgjrnqnbdluqd7leo98@git.kernel.org
      [ split from a larger patch ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf perf: Remove duplicate includes · 3315d14f
      Pravin Shedge authored
      These duplicate includes have been found with scripts/checkincludes.pl
      but they have been removed manually to avoid removing false positives.
      Signed-off-by: default avatarPravin Shedge <pravin.shedge4linux@gmail.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1512582204-6493-1-git-send-email-pravin.shedge4linux@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • Jiri Olsa's avatar
      perf test: Handle properly readdir DT_UNKNOWN · 378811ac
      Jiri Olsa authored
      Some systems can return DT_UNKNOWN in readdir's struct dirent::d_type and
      we must handle it properly. In this case we can directly check whether the
      entry we found is a directory and skip it.
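
The DT_UNKNOWN fallback described above can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: the helper names entry_is_directory() and count_non_directories() are hypothetical, and the sketch assumes a stat() call is an acceptable fallback when d_type is not filled in.

```c
#define _DEFAULT_SOURCE		/* expose DT_* constants on glibc */
#include <dirent.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: true if 'dir/d->d_name' is a directory.
 * Trusts d_type when the filesystem provides it and falls back to
 * stat() when it reports DT_UNKNOWN.
 */
static bool entry_is_directory(const char *dir, const struct dirent *d)
{
	char path[4096];
	struct stat st;

	if (d->d_type != DT_UNKNOWN)
		return d->d_type == DT_DIR;

	snprintf(path, sizeof(path), "%s/%s", dir, d->d_name);
	if (stat(path, &st))
		return false;	/* cannot tell, treat as not a directory */
	return S_ISDIR(st.st_mode);
}

/* Count the entries of 'dir' that are not directories; -1 on error. */
static int count_non_directories(const char *dir)
{
	DIR *dp = opendir(dir);
	struct dirent *d;
	int n = 0;

	if (!dp)
		return -1;
	while ((d = readdir(dp)) != NULL) {
		if (entry_is_directory(dir, d))
			continue;	/* skip directories, including "." and ".." */
		n++;
	}
	closedir(dp);
	return n;
}
```

Checking d_type first keeps the common case free of extra system calls; stat() is only paid for on filesystems that do not report the entry type.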
      Reported-by: Michael Petlan <mpetlan@redhat.com>
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20171206174535.25380-1-jolsa@kernel.org
      [ Split from a larger patch ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • Jiri Olsa's avatar
      perf utils: Move is_directory() to path.h · 06c3f2aa
      Jiri Olsa authored
      So that it can be used more widely, like in the next patch, when it will
      be used to fix a bug in 'perf test' handling of dirent.d_type ==
      DT_UNKNOWN.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20171206174535.25380-1-jolsa@kernel.org
      [ Split from a larger patch, removed needless includes in path.h ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • Jin Yao's avatar
      perf stat: Resort '--per-thread' result · 29734550
      Jin Yao authored
      Many threads are reported if we enable '--per-thread'
      globally.
      
      1. Most of the threads are not counted or have a count of 0.
      This patch removes these threads from the output.
      
      2. We also re-sort the displayed threads in descending order of
      their count value, so that the user can easily see the hottest
      threads.
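
The two steps above (drop uncounted threads, then sort by count) can be sketched as below. This is an illustrative sketch only: struct thread_count and filter_and_sort() are hypothetical names, not perf's own data structures, and qsort() stands in for whatever ordering mechanism the patch really uses.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-thread record; not a perf type. */
struct thread_count {
	const char *comm;	/* thread name */
	int tid;		/* thread id */
	uint64_t val;		/* counter value for one event */
};

/* qsort() comparator: larger counts first. */
static int cmp_val_desc(const void *a, const void *b)
{
	const struct thread_count *ta = a, *tb = b;

	if (ta->val == tb->val)
		return 0;
	return ta->val < tb->val ? 1 : -1;
}

/*
 * Step 1: compact the array in place, dropping entries with a zero count.
 * Step 2: sort what is left in descending order of count.
 * Returns the number of entries kept.
 */
static size_t filter_and_sort(struct thread_count *v, size_t n)
{
	size_t i, kept = 0;

	for (i = 0; i < n; i++)
		if (v[i].val != 0)
			v[kept++] = v[i];

	qsort(v, kept, sizeof(*v), cmp_val_desc);
	return kept;
}
```

Each event would be filtered and sorted on its own, which matches the example output, where the thread order differs between the 'cycles' and 'instructions' groups.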
      
      For example, the new results would be:
      
      root@skl:/tmp# perf stat --per-thread
      ^C
       Performance counter stats for 'system wide':
      
                  perf-24165              4.302433      cpu-clock (msec)          #    0.001 CPUs utilized
                vmstat-23127              1.562215      cpu-clock (msec)          #    0.000 CPUs utilized
            irqbalance-2780               0.827851      cpu-clock (msec)          #    0.000 CPUs utilized
                  sshd-23111              0.278308      cpu-clock (msec)          #    0.000 CPUs utilized
              thermald-2841               0.230880      cpu-clock (msec)          #    0.000 CPUs utilized
                  sshd-23058              0.207306      cpu-clock (msec)          #    0.000 CPUs utilized
           kworker/0:2-19991              0.133983      cpu-clock (msec)          #    0.000 CPUs utilized
         kworker/u16:1-18249              0.125636      cpu-clock (msec)          #    0.000 CPUs utilized
             rcu_sched-8                  0.085533      cpu-clock (msec)          #    0.000 CPUs utilized
         kworker/u16:2-23146              0.077139      cpu-clock (msec)          #    0.000 CPUs utilized
                 gmain-2700               0.041789      cpu-clock (msec)          #    0.000 CPUs utilized
           kworker/4:1-15354              0.028370      cpu-clock (msec)          #    0.000 CPUs utilized
           kworker/6:0-17528              0.023895      cpu-clock (msec)          #    0.000 CPUs utilized
          kworker/4:1H-1887               0.013209      cpu-clock (msec)          #    0.000 CPUs utilized
           kworker/5:2-31362              0.011627      cpu-clock (msec)          #    0.000 CPUs utilized
            watchdog/0-11                 0.010892      cpu-clock (msec)          #    0.000 CPUs utilized
           kworker/3:2-12870              0.010220      cpu-clock (msec)          #    0.000 CPUs utilized
           ksoftirqd/0-7                  0.008869      cpu-clock (msec)          #    0.000 CPUs utilized
            watchdog/1-14                 0.008476      cpu-clock (msec)          #    0.000 CPUs utilized
            watchdog/7-50                 0.002944      cpu-clock (msec)          #    0.000 CPUs utilized
            watchdog/3-26                 0.002893      cpu-clock (msec)          #    0.000 CPUs utilized
            watchdog/4-32                 0.002759      cpu-clock (msec)          #    0.000 CPUs utilized
            watchdog/2-20                 0.002429      cpu-clock (msec)          #    0.000 CPUs utilized
            watchdog/6-44                 0.001491      cpu-clock (msec)          #    0.000 CPUs utilized
            watchdog/5-38                 0.001477      cpu-clock (msec)          #    0.000 CPUs utilized
             rcu_sched-8                        10      context-switches          #    0.117 M/sec
         kworker/u16:1-18249                     7      context-switches          #    0.056 M/sec
                  sshd-23111                     4      context-switches          #    0.014 M/sec
                vmstat-23127                     4      context-switches          #    0.003 M/sec
                  perf-24165                     4      context-switches          #    0.930 K/sec
           kworker/0:2-19991                     3      context-switches          #    0.022 M/sec
         kworker/u16:2-23146                     3      context-switches          #    0.039 M/sec
           kworker/4:1-15354                     2      context-switches          #    0.070 M/sec
           kworker/6:0-17528                     2      context-switches          #    0.084 M/sec
                  sshd-23058                     2      context-switches          #    0.010 M/sec
           ksoftirqd/0-7                         1      context-switches          #    0.113 M/sec
            watchdog/0-11                        1      context-switches          #    0.092 M/sec
            watchdog/1-14                        1      context-switches          #    0.118 M/sec
            watchdog/2-20                        1      context-switches          #    0.412 M/sec
            watchdog/3-26                        1      context-switches          #    0.346 M/sec
            watchdog/4-32                        1      context-switches          #    0.362 M/sec
            watchdog/5-38                        1      context-switches          #    0.677 M/sec
            watchdog/6-44                        1      context-switches          #    0.671 M/sec
            watchdog/7-50                        1      context-switches          #    0.340 M/sec
          kworker/4:1H-1887                      1      context-switches          #    0.076 M/sec
              thermald-2841                      1      context-switches          #    0.004 M/sec
                 gmain-2700                      1      context-switches          #    0.024 M/sec
            irqbalance-2780                      1      context-switches          #    0.001 M/sec
           kworker/3:2-12870                     1      context-switches          #    0.098 M/sec
           kworker/5:2-31362                     1      context-switches          #    0.086 M/sec
         kworker/u16:1-18249                     2      cpu-migrations            #    0.016 M/sec
         kworker/u16:2-23146                     2      cpu-migrations            #    0.026 M/sec
             rcu_sched-8                         1      cpu-migrations            #    0.012 M/sec
                  sshd-23058                     1      cpu-migrations            #    0.005 M/sec
                  perf-24165             8,833,385      cycles                    #    2.053 GHz
                vmstat-23127             1,702,699      cycles                    #    1.090 GHz
            irqbalance-2780                739,847      cycles                    #    0.894 GHz
                  sshd-23111               269,506      cycles                    #    0.968 GHz
              thermald-2841                204,556      cycles                    #    0.886 GHz
                  sshd-23058               158,780      cycles                    #    0.766 GHz
           kworker/0:2-19991               112,981      cycles                    #    0.843 GHz
         kworker/u16:1-18249               100,926      cycles                    #    0.803 GHz
             rcu_sched-8                    74,024      cycles                    #    0.865 GHz
         kworker/u16:2-23146                55,984      cycles                    #    0.726 GHz
                 gmain-2700                 34,278      cycles                    #    0.820 GHz
           kworker/4:1-15354                20,665      cycles                    #    0.728 GHz
           kworker/6:0-17528                16,445      cycles                    #    0.688 GHz
           kworker/5:2-31362                 9,492      cycles                    #    0.816 GHz
            watchdog/3-26                    8,695      cycles                    #    3.006 GHz
          kworker/4:1H-1887                  8,238      cycles                    #    0.624 GHz
            watchdog/4-32                    7,580      cycles                    #    2.747 GHz
           kworker/3:2-12870                 7,306      cycles                    #    0.715 GHz
            watchdog/2-20                    7,274      cycles                    #    2.995 GHz
            watchdog/0-11                    6,988      cycles                    #    0.642 GHz
           ksoftirqd/0-7                     6,376      cycles                    #    0.719 GHz
            watchdog/1-14                    5,340      cycles                    #    0.630 GHz
            watchdog/5-38                    4,061      cycles                    #    2.749 GHz
            watchdog/6-44                    3,976      cycles                    #    2.667 GHz
            watchdog/7-50                    3,418      cycles                    #    1.161 GHz
                vmstat-23127             2,511,699      instructions              #    1.48  insn per cycle
                  perf-24165             1,829,908      instructions              #    0.21  insn per cycle
            irqbalance-2780              1,190,204      instructions              #    1.61  insn per cycle
              thermald-2841                143,544      instructions              #    0.70  insn per cycle
                  sshd-23111               128,138      instructions              #    0.48  insn per cycle
                  sshd-23058                57,654      instructions              #    0.36  insn per cycle
             rcu_sched-8                    44,063      instructions              #    0.60  insn per cycle
         kworker/u16:1-18249                42,551      instructions              #    0.42  insn per cycle
           kworker/0:2-19991                25,873      instructions              #    0.23  insn per cycle
         kworker/u16:2-23146                21,407      instructions              #    0.38  insn per cycle
                 gmain-2700                 13,691      instructions              #    0.40  insn per cycle
           kworker/4:1-15354                12,964      instructions              #    0.63  insn per cycle
           kworker/6:0-17528                10,034      instructions              #    0.61  insn per cycle
           kworker/5:2-31362                 5,203      instructions              #    0.55  insn per cycle
           kworker/3:2-12870                 4,866      instructions              #    0.67  insn per cycle
          kworker/4:1H-1887                  3,586      instructions              #    0.44  insn per cycle
           ksoftirqd/0-7                     3,463      instructions              #    0.54  insn per cycle
            watchdog/0-11                    3,135      instructions              #    0.45  insn per cycle
            watchdog/1-14                    3,135      instructions              #    0.59  insn per cycle
            watchdog/2-20                    3,135      instructions              #    0.43  insn per cycle
            watchdog/3-26                    3,135      instructions              #    0.36  insn per cycle
            watchdog/4-32                    3,135      instructions              #    0.41  insn per cycle
            watchdog/5-38                    3,135      instructions              #    0.77  insn per cycle
            watchdog/6-44                    3,135      instructions              #    0.79  insn per cycle
            watchdog/7-50                    3,135      instructions              #    0.92  insn per cycle
                vmstat-23127               539,181      branches                  #  345.139 M/sec
                  perf-24165               375,364      branches                  #   87.245 M/sec
            irqbalance-2780                262,092      branches                  #  316.593 M/sec
              thermald-2841                 31,611      branches                  #  136.915 M/sec
                  sshd-23111                21,874      branches                  #   78.596 M/sec
                  sshd-23058                10,682      branches                  #   51.528 M/sec
             rcu_sched-8                     8,693      branches                  #  101.633 M/sec
         kworker/u16:1-18249                 7,891      branches                  #   62.808 M/sec
           kworker/0:2-19991                 5,761      branches                  #   42.998 M/sec
         kworker/u16:2-23146                 4,099      branches                  #   53.138 M/sec
           kworker/4:1-15354                 2,755      branches                  #   97.110 M/sec
                 gmain-2700                  2,638      branches                  #   63.127 M/sec
           kworker/6:0-17528                 2,216      branches                  #   92.739 M/sec
           kworker/5:2-31362                 1,132      branches                  #   97.360 M/sec
           kworker/3:2-12870                 1,081      branches                  #  105.773 M/sec
          kworker/4:1H-1887                    725      branches                  #   54.887 M/sec
           ksoftirqd/0-7                       707      branches                  #   79.716 M/sec
            watchdog/0-11                      652      branches                  #   59.860 M/sec
            watchdog/1-14                      652      branches                  #   76.923 M/sec
            watchdog/2-20                      652      branches                  #  268.423 M/sec
            watchdog/3-26                      652      branches                  #  225.372 M/sec
            watchdog/4-32                      652      branches                  #  236.318 M/sec
            watchdog/5-38                      652      branches                  #  441.435 M/sec
            watchdog/6-44                      652      branches                  #  437.290 M/sec
            watchdog/7-50                      652      branches                  #  221.467 M/sec
                vmstat-23127                 8,960      branch-misses             #    1.66% of all branches
            irqbalance-2780                  3,047      branch-misses             #    1.16% of all branches
                  perf-24165                 2,876      branch-misses             #    0.77% of all branches
                  sshd-23111                 1,843      branch-misses             #    8.43% of all branches
              thermald-2841                  1,444      branch-misses             #    4.57% of all branches
                  sshd-23058                 1,379      branch-misses             #   12.91% of all branches
         kworker/u16:1-18249                   982      branch-misses             #   12.44% of all branches
             rcu_sched-8                       893      branch-misses             #   10.27% of all branches
         kworker/u16:2-23146                   578      branch-misses             #   14.10% of all branches
           kworker/0:2-19991                   376      branch-misses             #    6.53% of all branches
                 gmain-2700                    280      branch-misses             #   10.61% of all branches
           kworker/6:0-17528                   196      branch-misses             #    8.84% of all branches
           kworker/4:1-15354                   187      branch-misses             #    6.79% of all branches
           kworker/5:2-31362                   123      branch-misses             #   10.87% of all branches
            watchdog/0-11                       95      branch-misses             #   14.57% of all branches
            watchdog/4-32                       89      branch-misses             #   13.65% of all branches
           kworker/3:2-12870                    80      branch-misses             #    7.40% of all branches
            watchdog/3-26                       61      branch-misses             #    9.36% of all branches
          kworker/4:1H-1887                     60      branch-misses             #    8.28% of all branches
            watchdog/2-20                       52      branch-misses             #    7.98% of all branches
           ksoftirqd/0-7                        47      branch-misses             #    6.65% of all branches
            watchdog/1-14                       46      branch-misses             #    7.06% of all branches
            watchdog/7-50                       13      branch-misses             #    1.99% of all branches
            watchdog/5-38                        8      branch-misses             #    1.23% of all branches
            watchdog/6-44                        7      branch-misses             #    1.07% of all branches
      
             3.695150786 seconds time elapsed
      
      root@skl:/tmp# perf stat --per-thread -M IPC,CPI
      ^C
      
       Performance counter stats for 'system wide':
      
                vmstat-23127             2,000,783      inst_retired.any          #      1.5 IPC
              thermald-2841              1,472,670      inst_retired.any          #      1.3 IPC
                  sshd-23111               977,374      inst_retired.any          #      1.2 IPC
                  perf-24163               483,779      inst_retired.any          #      0.2 IPC
                 gmain-2700                341,213      inst_retired.any          #      0.9 IPC
                  sshd-23058               148,891      inst_retired.any          #      0.8 IPC
          rtkit-daemon-3288                 71,210      inst_retired.any          #      0.7 IPC
         kworker/u16:1-18249                39,562      inst_retired.any          #      0.3 IPC
             rcu_sched-8                    14,474      inst_retired.any          #      0.8 IPC
           kworker/0:2-19991                 7,659      inst_retired.any          #      0.2 IPC
           kworker/4:1-15354                 6,714      inst_retired.any          #      0.8 IPC
          rtkit-daemon-3289                  4,839      inst_retired.any          #      0.3 IPC
           kworker/6:0-17528                 3,321      inst_retired.any          #      0.6 IPC
           kworker/5:2-31362                 3,215      inst_retired.any          #      0.5 IPC
           kworker/7:2-23145                 3,173      inst_retired.any          #      0.7 IPC
          kworker/4:1H-1887                  1,719      inst_retired.any          #      0.3 IPC
            watchdog/0-11                    1,479      inst_retired.any          #      0.3 IPC
            watchdog/1-14                    1,479      inst_retired.any          #      0.3 IPC
            watchdog/2-20                    1,479      inst_retired.any          #      0.4 IPC
            watchdog/3-26                    1,479      inst_retired.any          #      0.4 IPC
            watchdog/4-32                    1,479      inst_retired.any          #      0.3 IPC
            watchdog/5-38                    1,479      inst_retired.any          #      0.3 IPC
            watchdog/6-44                    1,479      inst_retired.any          #      0.7 IPC
            watchdog/7-50                    1,479      inst_retired.any          #      0.7 IPC
         kworker/u16:2-23146                 1,408      inst_retired.any          #      0.5 IPC
                  perf-24163             2,249,872      cpu_clk_unhalted.thread
                vmstat-23127             1,352,455      cpu_clk_unhalted.thread
              thermald-2841              1,161,140      cpu_clk_unhalted.thread
                  sshd-23111               807,827      cpu_clk_unhalted.thread
                 gmain-2700                375,535      cpu_clk_unhalted.thread
                  sshd-23058               194,071      cpu_clk_unhalted.thread
         kworker/u16:1-18249               114,306      cpu_clk_unhalted.thread
          rtkit-daemon-3288                103,547      cpu_clk_unhalted.thread
           kworker/0:2-19991                46,550      cpu_clk_unhalted.thread
             rcu_sched-8                    18,855      cpu_clk_unhalted.thread
          rtkit-daemon-3289                 17,549      cpu_clk_unhalted.thread
           kworker/4:1-15354                 8,812      cpu_clk_unhalted.thread
           kworker/5:2-31362                 6,812      cpu_clk_unhalted.thread
          kworker/4:1H-1887                  5,270      cpu_clk_unhalted.thread
           kworker/6:0-17528                 5,111      cpu_clk_unhalted.thread
           kworker/7:2-23145                 4,667      cpu_clk_unhalted.thread
            watchdog/0-11                    4,663      cpu_clk_unhalted.thread
            watchdog/1-14                    4,663      cpu_clk_unhalted.thread
            watchdog/4-32                    4,626      cpu_clk_unhalted.thread
            watchdog/5-38                    4,403      cpu_clk_unhalted.thread
            watchdog/3-26                    3,936      cpu_clk_unhalted.thread
            watchdog/2-20                    3,850      cpu_clk_unhalted.thread
         kworker/u16:2-23146                 2,654      cpu_clk_unhalted.thread
            watchdog/6-44                    2,017      cpu_clk_unhalted.thread
            watchdog/7-50                    2,017      cpu_clk_unhalted.thread
                vmstat-23127             2,000,783      inst_retired.any          #      0.7 CPI
              thermald-2841              1,472,670      inst_retired.any          #      0.8 CPI
                  sshd-23111               977,374      inst_retired.any          #      0.8 CPI
                  perf-24163               495,037      inst_retired.any          #      4.7 CPI
                 gmain-2700                341,213      inst_retired.any          #      1.1 CPI
                  sshd-23058               148,891      inst_retired.any          #      1.3 CPI
          rtkit-daemon-3288                 71,210      inst_retired.any          #      1.5 CPI
         kworker/u16:1-18249                39,562      inst_retired.any          #      2.9 CPI
             rcu_sched-8                    14,474      inst_retired.any          #      1.3 CPI
           kworker/0:2-19991                 7,659      inst_retired.any          #      6.1 CPI
           kworker/4:1-15354                 6,714      inst_retired.any          #      1.3 CPI
          rtkit-daemon-3289                  4,839      inst_retired.any          #      3.6 CPI
           kworker/6:0-17528                 3,321      inst_retired.any          #      1.5 CPI
           kworker/5:2-31362                 3,215      inst_retired.any          #      2.1 CPI
           kworker/7:2-23145                 3,173      inst_retired.any          #      1.5 CPI
          kworker/4:1H-1887                  1,719      inst_retired.any          #      3.1 CPI
            watchdog/0-11                    1,479      inst_retired.any          #      3.2 CPI
            watchdog/1-14                    1,479      inst_retired.any          #      3.2 CPI
            watchdog/2-20                    1,479      inst_retired.any          #      2.6 CPI
            watchdog/3-26                    1,479      inst_retired.any          #      2.7 CPI
            watchdog/4-32                    1,479      inst_retired.any          #      3.1 CPI
            watchdog/5-38                    1,479      inst_retired.any          #      3.0 CPI
            watchdog/6-44                    1,479      inst_retired.any          #      1.4 CPI
            watchdog/7-50                    1,479      inst_retired.any          #      1.4 CPI
         kworker/u16:2-23146                 1,408      inst_retired.any          #      1.9 CPI
                  perf-24163             2,302,323      cycles
                vmstat-23127             1,352,455      cycles
              thermald-2841              1,161,140      cycles
                  sshd-23111               807,827      cycles
                 gmain-2700                375,535      cycles
                  sshd-23058               194,071      cycles
         kworker/u16:1-18249               114,306      cycles
          rtkit-daemon-3288                103,547      cycles
           kworker/0:2-19991                46,550      cycles
             rcu_sched-8                    18,855      cycles
          rtkit-daemon-3289                 17,549      cycles
           kworker/4:1-15354                 8,812      cycles
           kworker/5:2-31362                 6,812      cycles
          kworker/4:1H-1887                  5,270      cycles
           kworker/6:0-17528                 5,111      cycles
           kworker/7:2-23145                 4,667      cycles
            watchdog/0-11                    4,663      cycles
            watchdog/1-14                    4,663      cycles
            watchdog/4-32                    4,626      cycles
            watchdog/5-38                    4,403      cycles
            watchdog/3-26                    3,936      cycles
            watchdog/2-20                    3,850      cycles
         kworker/u16:2-23146                 2,654      cycles
            watchdog/6-44                    2,017      cycles
            watchdog/7-50                    2,017      cycles
      
             2.175726600 seconds time elapsed
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1512482591-4646-12-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • Jin Yao's avatar
      perf stat: Remove --per-thread pid/tid limitation · 1d9f8d1b
      Jin Yao authored
      Currently, if we execute 'perf stat --per-thread' without specifying
      a pid/tid, perf returns an error.
      
      root@skl:/tmp# perf stat --per-thread
      The --per-thread option is only available when monitoring via -p -t options.
          -p, --pid <pid>       stat events on existing process id
          -t, --tid <tid>       stat events on existing thread id
      
      This patch removes this limitation. If no pid/tid is specified, it
      reports all threads (obtained from /proc).
      
      Note that cpu_list is not supported yet, so the cpu_list case is
      skipped.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1512482591-4646-11-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • Jin Yao's avatar
      perf thread_map: Enumerate all threads from /proc · 73c0ca1e
      Jin Yao authored
      This patch calls thread_map__new_all_cpus() to enumerate all threads
      from /proc if the per-thread flag is enabled.
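
For readers unfamiliar with how threads can be enumerated from /proc, here is a rough standalone sketch. It is not the implementation of thread_map__new_all_cpus(); the function collect_all_tids() and the helper is_number() are hypothetical, and the sketch assumes the usual Linux layout where each /proc/<pid>/task/ directory holds one entry per thread id.

```c
#define _DEFAULT_SOURCE
#include <ctype.h>
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* True if 's' is a non-empty string of decimal digits. */
static int is_number(const char *s)
{
	if (!*s)
		return 0;
	for (; *s; s++)
		if (!isdigit((unsigned char)*s))
			return 0;
	return 1;
}

/*
 * Hypothetical helper: walk /proc/<pid>/task/<tid> for every process and
 * collect all thread ids. On success returns the number of ids and stores
 * a malloc()ed array in *out (caller frees). Returns -1 on error.
 */
static int collect_all_tids(pid_t **out)
{
	DIR *proc = opendir("/proc");
	struct dirent *p, *t;
	pid_t *tids = NULL;
	int n = 0, cap = 0;

	*out = NULL;
	if (!proc)
		return -1;

	while ((p = readdir(proc)) != NULL) {
		char path[512];
		DIR *task;

		if (!is_number(p->d_name))
			continue;	/* not a process directory */

		snprintf(path, sizeof(path), "/proc/%s/task", p->d_name);
		task = opendir(path);
		if (!task)
			continue;	/* process may have exited meanwhile */

		while ((t = readdir(task)) != NULL) {
			if (!is_number(t->d_name))
				continue;	/* skip "." and ".." */
			if (n == cap) {
				int ncap = cap ? cap * 2 : 64;
				pid_t *tmp = realloc(tids, ncap * sizeof(*tids));

				if (!tmp) {
					closedir(task);
					closedir(proc);
					free(tids);
					return -1;
				}
				tids = tmp;
				cap = ncap;
			}
			tids[n++] = (pid_t)atoi(t->d_name);
		}
		closedir(task);
	}
	closedir(proc);
	*out = tids;
	return n;
}
```

The snapshot is inherently racy: threads can appear or exit while /proc is being scanned, so a caller should tolerate ids that no longer exist.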
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1512482591-4646-10-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • Jin Yao's avatar
      perf stat: Update or print per-thread stats · 14e72a21
      Jin Yao authored
      If the stats pointer in the stat_config structure is not NULL, the
      per-thread stats are updated in, or printed from, the buffer it
      points to.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1512482591-4646-9-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • Jin Yao's avatar
      perf stat: Allocate shadow stats buffer for threads · 56739444
      Jin Yao authored
      After perf_evlist__create_maps() has been executed, we can get all threads
      from /proc. And via thread_map__nr(), we can also get the number of
      threads.
      
      With the number of threads, the patch allocates a buffer which will
      record the shadow stats for these threads.
      
      The buffer pointer is saved in stat_config.
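
A simplified sketch of this allocation pattern is shown below. The types struct thread_shadow_stats and struct config_sketch, and both function names, are hypothetical stand-ins chosen for illustration; they are not the structures used by perf, and the thread count is passed in directly rather than obtained from thread_map__nr().

```c
#include <stdlib.h>

/* Hypothetical, much simplified per-thread shadow stats record. */
struct thread_shadow_stats {
	double cycles;
	double instructions;
};

/* Hypothetical stand-in for the relevant part of stat_config. */
struct config_sketch {
	struct thread_shadow_stats *stats;	/* NULL when per-thread stats are off */
	int stats_num;				/* number of entries in 'stats' */
};

/* Allocate one zeroed record per thread; returns 0 on success, -1 on error. */
static int shadow_stats_alloc(struct config_sketch *cfg, int nthreads)
{
	if (nthreads <= 0)
		return -1;

	cfg->stats = calloc((size_t)nthreads, sizeof(*cfg->stats));
	if (!cfg->stats)
		return -1;

	cfg->stats_num = nthreads;
	return 0;
}

/* Release the buffer and reset the config so later NULL checks work. */
static void shadow_stats_free(struct config_sketch *cfg)
{
	free(cfg->stats);
	cfg->stats = NULL;
	cfg->stats_num = 0;
}
```

Keeping the pointer NULL when per-thread mode is off lets the update and print paths decide with a single pointer check whether to use the per-thread buffer.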
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1512482591-4646-8-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • Jin Yao's avatar
      perf stat: Remove a set of shadow stats static variables · 6a1e2c5c
      Jin Yao authored
      Previous patches restructured the code so that it no longer
      accesses the static variables directly.
      
      This patch removes these static variables.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1512482591-4646-7-git-send-email-yao.jin@linux.intel.com
      [ Rename 'stat' variables to 'st' to build on centos:{5,6} and others where it shadows a global declaration ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      6a1e2c5c
    • Jin Yao's avatar
      perf stat: Print per-thread shadow stats · e0128b30
      Jin Yao authored
      The function perf_stat__print_shadow_stats() is called to print the
      shadow stats kept in a set of static variables.

      But these static variables prevent us from supporting per-thread
      shadow stats.

      This patch lets perf_stat__print_shadow_stats() print the shadow
      stats from an input parameter 'st'.

      It no longer reads the values directly from the static variables.
      Instead, it uses runtime_stat_avg() and runtime_stat_n() to get and
      compute the values.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1512482591-4646-6-git-send-email-yao.jin@linux.intel.com
      [ Rename 'stat' variables to 'st' to build on centos:{5,6} and others where it shadows a global declaration ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      e0128b30
    • Jin Yao's avatar
      perf stat: Update per-thread shadow stats · 1fcd0394
      Jin Yao authored
      The function perf_stat__update_shadow_stats() is called to update the
      shadow stats kept in a set of static variables.

      But these static variables prevent the code from being extended to
      support per-thread shadow stats.

      This patch lets perf_stat__update_shadow_stats() update the shadow
      stats through an input parameter 'st', using update_runtime_stat() to
      do the update. It no longer updates the static variables directly, as
      it did before.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1512482591-4646-5-git-send-email-yao.jin@linux.intel.com
      [ Rename 'stat' variables to 'st' to build on centos:{5,6} and others where it shadows a global declaration ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      1fcd0394
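The move from file-level static variables to an explicit 'st' parameter, as described in e0128b30 and 1fcd0394, can be sketched roughly as follows. This is a simplified illustration only: the `sketch_*` names, the fixed-size arrays, and the function signatures are assumptions for the example and do not match perf's real runtime_stat, which is backed by an rblist.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stat kinds; perf's real list is much longer. */
enum sketch_stat_type {
	SKETCH_CYCLES,
	SKETCH_INSTRUCTIONS,
	SKETCH_STAT_MAX
};

/* One instance per thread instead of one set of static variables. */
struct sketch_runtime_stat {
	double sum[SKETCH_STAT_MAX];
	unsigned long n[SKETCH_STAT_MAX];
};

/* Update the stats owned by 'st' rather than any global state. */
static void sketch_update_stat(struct sketch_runtime_stat *st,
			       enum sketch_stat_type type, double value)
{
	assert(st != NULL && type < SKETCH_STAT_MAX);
	st->sum[type] += value;
	st->n[type]++;
}

/* Read back the running average from the same 'st'. */
static double sketch_stat_avg(const struct sketch_runtime_stat *st,
			      enum sketch_stat_type type)
{
	assert(st != NULL && type < SKETCH_STAT_MAX);
	return st->n[type] ? st->sum[type] / (double)st->n[type] : 0.0;
}
```

Because every call names the instance it operates on, two threads can each own a `struct sketch_runtime_stat` and their values never mix, which is the property the static variables could not provide.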
    • Jin Yao's avatar
      perf stat: Create the runtime_stat init/exit function · 8efb2df1
      Jin Yao authored
      These functions mainly initialize and release the rblist defined in
      struct runtime_stat.

      The original rblist 'runtime_saved_values' is kept for now so that the
      series stays bisectable.

      It will be removed by a later patch, when the code switches over to
      the new structure.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1512482591-4646-4-git-send-email-yao.jin@linux.intel.com
      [ Rename 'stat' variables to 'st' to build on centos:{5,6} and others where it shadows a global declaration ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      8efb2df1
    • Jin Yao's avatar
      perf stat: Extend rbtree to support per-thread shadow stats · 49cd456a
      Jin Yao authored
      Previously the rbtree was used to link generic metrics.
      
      This patch adds ctx/type/stat to the rbtree keys because we will use
      this rbtree to maintain the shadow metrics, replacing the original
      set of static arrays, in order to support per-thread shadow stats.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1512482591-4646-3-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      49cd456a
    • Jin Yao's avatar
      perf stat: Define a structure for per-thread shadow stats · e5fcc2ab
      Jin Yao authored
      Perf has a set of static variables to record the runtime shadow metrics
      stats.
      
      These are a limitation if we want to record the runtime shadow stats
      per thread. This patch creates a structure, and the next patches will
      use it to update the runtime shadow stats per thread.
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1512482591-4646-2-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      e5fcc2ab
  2. 18 Dec, 2017 7 commits
    • Ingo Molnar's avatar
      faaf9567
    • Ingo Molnar's avatar
      Merge tag 'perf-urgent-for-mingo-4.15-20171218' of... · 2e364635
      Ingo Molnar authored
      Merge tag 'perf-urgent-for-mingo-4.15-20171218' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
      
      Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
      
      - Fix up build in hardened environments, such as fedora 27 (Jiri Olsa)
      
      - Do not include header files from the kernel sources for the s/390 arch,
        fixing the detached tarball building (Arnaldo Carvalho de Melo)
      
      - Allow again using asm.h when building for the 'bpf' clang target,
        guarding x86 specific bits under ifndef __BPF__ (Arnaldo Carvalho de Melo)
      
      - Generate correct debug information for inlined code when generating
        ELF images for JITted java programs (Ben Gainey)
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      2e364635
    • Arnaldo Carvalho de Melo's avatar
      x86/asm: Allow again using asm.h when building for the 'bpf' clang target · ca26cffa
      Arnaldo Carvalho de Melo authored
      Up to f5caf621 ("x86/asm: Fix inline asm call constraints for Clang")
      we were able to use x86 headers to build for the 'bpf' clang target, as
      done by the BPF code in tools/perf/.

      With that commit, 'perf test LLVM' fails as shown below, because
      "clang ... -target bpf ..." fails: clang 4.0 does not have bpf inline
      asm support and 6.0 does not recognize the register 'esp'. Fix it by
      guarding that part with an #ifndef __BPF__, which is defined by clang
      when building for the "bpf" target.
      
        # perf test -v LLVM
        37: LLVM search and compile                               :
        37.1: Basic BPF llvm compile                              :
        --- start ---
        test child forked, pid 25526
        Kernel build dir is set to /lib/modules/4.14.0+/build
        set env: KBUILD_DIR=/lib/modules/4.14.0+/build
        unset env: KBUILD_OPTS
        include option is set to  -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated  -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h
        set env: NR_CPUS=4
        set env: LINUX_VERSION_CODE=0x40e00
        set env: CLANG_EXEC=/usr/local/bin/clang
        set env: CLANG_OPTIONS=-xc
        set env: KERNEL_INC_OPTIONS= -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated  -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h
        set env: WORKING_DIR=/lib/modules/4.14.0+/build
        set env: CLANG_SOURCE=-
        llvm compiling command template: echo '/*
         * bpf-script-example.c
         * Test basic LLVM building
         */
        #ifndef LINUX_VERSION_CODE
        # error Need LINUX_VERSION_CODE
        # error Example: for 4.2 kernel, put 'clang-opt="-DLINUX_VERSION_CODE=0x40200" into llvm section of ~/.perfconfig'
        #endif
        #define BPF_ANY 0
        #define BPF_MAP_TYPE_ARRAY 2
        #define BPF_FUNC_map_lookup_elem 1
        #define BPF_FUNC_map_update_elem 2
      
        static void *(*bpf_map_lookup_elem)(void *map, void *key) =
      	  (void *) BPF_FUNC_map_lookup_elem;
        static void *(*bpf_map_update_elem)(void *map, void *key, void *value, int flags) =
      	  (void *) BPF_FUNC_map_update_elem;
      
        struct bpf_map_def {
      	  unsigned int type;
      	  unsigned int key_size;
      	  unsigned int value_size;
      	  unsigned int max_entries;
        };
      
        #define SEC(NAME) __attribute__((section(NAME), used))
        struct bpf_map_def SEC("maps") flip_table = {
      	  .type = BPF_MAP_TYPE_ARRAY,
      	  .key_size = sizeof(int),
      	  .value_size = sizeof(int),
      	  .max_entries = 1,
        };
      
        SEC("func=SyS_epoll_wait")
        int bpf_func__SyS_epoll_wait(void *ctx)
        {
      	  int ind =0;
      	  int *flag = bpf_map_lookup_elem(&flip_table, &ind);
      	  int new_flag;
      	  if (!flag)
      		  return 0;
      	  /* flip flag and store back */
      	  new_flag = !*flag;
      	  bpf_map_update_elem(&flip_table, &ind, &new_flag, BPF_ANY);
      	  return new_flag;
        }
        char _license[] SEC("license") = "GPL";
        int _version SEC("version") = LINUX_VERSION_CODE;
        ' | $CLANG_EXEC -D__KERNEL__ -D__NR_CPUS__=$NR_CPUS -DLINUX_VERSION_CODE=$LINUX_VERSION_CODE $CLANG_OPTIONS $KERNEL_INC_OPTIONS -Wno-unused-value -Wno-pointer-sign -working-directory $WORKING_DIR -c "$CLANG_SOURCE" -target bpf -O2 -o -
        test child finished with 0
        ---- end ----
        LLVM search and compile subtest 0: Ok
        37.2: kbuild searching                                    :
        --- start ---
        test child forked, pid 25950
        Kernel build dir is set to /lib/modules/4.14.0+/build
        set env: KBUILD_DIR=/lib/modules/4.14.0+/build
        unset env: KBUILD_OPTS
        include option is set to  -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated  -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h
        set env: NR_CPUS=4
        set env: LINUX_VERSION_CODE=0x40e00
        set env: CLANG_EXEC=/usr/local/bin/clang
        set env: CLANG_OPTIONS=-xc
        set env: KERNEL_INC_OPTIONS= -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/7/include -I/home/acme/git/linux/arch/x86/include -I./arch/x86/include/generated  -I/home/acme/git/linux/include -I./include -I/home/acme/git/linux/arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I/home/acme/git/linux/include/uapi -I./include/generated/uapi -include /home/acme/git/linux/include/linux/kconfig.h
        set env: WORKING_DIR=/lib/modules/4.14.0+/build
        set env: CLANG_SOURCE=-
        llvm compiling command template: echo '/*
         * bpf-script-test-kbuild.c
         * Test include from kernel header
         */
        #ifndef LINUX_VERSION_CODE
        # error Need LINUX_VERSION_CODE
        # error Example: for 4.2 kernel, put 'clang-opt="-DLINUX_VERSION_CODE=0x40200" into llvm section of ~/.perfconfig'
        #endif
        #define SEC(NAME) __attribute__((section(NAME), used))
      
        #include <uapi/linux/fs.h>
        #include <uapi/asm/ptrace.h>
      
        SEC("func=vfs_llseek")
        int bpf_func__vfs_llseek(void *ctx)
        {
      	  return 0;
        }
      
        char _license[] SEC("license") = "GPL";
        int _version SEC("version") = LINUX_VERSION_CODE;
        ' | $CLANG_EXEC -D__KERNEL__ -D__NR_CPUS__=$NR_CPUS -DLINUX_VERSION_CODE=$LINUX_VERSION_CODE $CLANG_OPTIONS $KERNEL_INC_OPTIONS -Wno-unused-value -Wno-pointer-sign -working-directory $WORKING_DIR -c "$CLANG_SOURCE" -target bpf -O2 -o -
        In file included from <stdin>:12:
        In file included from /home/acme/git/linux/arch/x86/include/uapi/asm/ptrace.h:5:
        In file included from /home/acme/git/linux/include/linux/compiler.h:242:
        In file included from /home/acme/git/linux/arch/x86/include/asm/barrier.h:5:
        In file included from /home/acme/git/linux/arch/x86/include/asm/alternative.h:10:
        /home/acme/git/linux/arch/x86/include/asm/asm.h:145:50: error: unknown register name 'esp' in asm
        register unsigned long current_stack_pointer asm(_ASM_SP);
                                                         ^
        /home/acme/git/linux/arch/x86/include/asm/asm.h:44:18: note: expanded from macro '_ASM_SP'
        #define _ASM_SP         __ASM_REG(sp)
                                ^
        /home/acme/git/linux/arch/x86/include/asm/asm.h:27:32: note: expanded from macro '__ASM_REG'
        #define __ASM_REG(reg)         __ASM_SEL_RAW(e##reg, r##reg)
                                       ^
        /home/acme/git/linux/arch/x86/include/asm/asm.h:18:29: note: expanded from macro '__ASM_SEL_RAW'
        # define __ASM_SEL_RAW(a,b) __ASM_FORM_RAW(a)
                                    ^
        /home/acme/git/linux/arch/x86/include/asm/asm.h:11:32: note: expanded from macro '__ASM_FORM_RAW'
        # define __ASM_FORM_RAW(x)     #x
                                       ^
        <scratch space>:4:1: note: expanded from here
        "esp"
        ^
        1 error generated.
        ERROR:	unable to compile -
        Hint:	Check error message shown above.
        Hint:	You can also pre-compile it into .o using:
           		  clang -target bpf -O2 -c -
           	  with proper -I and -D options.
        Failed to compile test case: 'kbuild searching'
        test child finished with -1
        ---- end ----
        LLVM search and compile subtest 1: FAILED!
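The LINUX_VERSION_CODE values in the log above (0x40e00 for the 4.14.0 build directory, and 0x40200 in the hint for a 4.2 kernel) follow the kernel's usual major/minor/patch packing. The sketch below reproduces that arithmetic; the `SKETCH_KERNEL_VERSION` macro mirrors the shape of the kernel's `KERNEL_VERSION(a, b, c)` macro, and `sketch_version_code()` is a hypothetical helper added only for the example.

```c
#include <assert.h>

/* Pack major/minor/patch the way KERNEL_VERSION(a, b, c) does. */
#define SKETCH_KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* Hypothetical helper: compute the code for a given kernel version. */
static unsigned int sketch_version_code(unsigned int major,
					unsigned int minor,
					unsigned int patch)
{
	/* minor and patch each occupy one byte in this encoding */
	assert(minor < 256 && patch < 256);
	return SKETCH_KERNEL_VERSION(major, minor, patch);
}
```

For 4.14.0 this gives (4 << 16) + (14 << 8) + 0 = 0x40000 + 0xe00 = 0x40e00, matching the value the test exports.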
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Dmitriy Vyukov <dvyukov@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matthias Kaehlcke <mka@chromium.org>
      Cc: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Wang Nan <wangnan0@huawei.com>
      Cc: Yonghong Song <yhs@fb.com>
      Link: https://lkml.kernel.org/r/20171128175948.GL3298@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      ca26cffa
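The guard used by ca26cffa follows a common pattern: code that only makes sense for the host architecture is wrapped in `#ifndef __BPF__`, so that a compile for the 'bpf' target never sees it. The sketch below shows the pattern in a form that builds anywhere; `sketch_arch_bits_visible()` is a hypothetical function, and the comment only marks where an arch-specific declaration (such as the one the log reports as failing) would sit.

```c
/*
 * Returns 1 when arch-specific declarations are compiled in,
 * 0 when the translation unit is built for the 'bpf' target.
 */
static int sketch_arch_bits_visible(void)
{
#ifndef __BPF__
	/*
	 * Host build: declarations that rely on inline asm or on named
	 * machine registers would live inside this guarded region.
	 */
	return 1;
#else
	/* 'bpf' target: the guarded region is skipped entirely. */
	return 0;
#endif
}
```

Headers shared between kernel code and BPF programs can use the same guard around individual declarations, leaving the rest of the header usable from both sides.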
    • Arnaldo Carvalho de Melo's avatar
      tools arch s390: Do not include header files from the kernel sources · 10b9baa7
      Arnaldo Carvalho de Melo authored
      Long ago we decided that including files from the kernel git sources
      in source code living in tools/ is verboten, to avoid disturbing
      kernel development (and perf's and other tools/) when, say, a kernel
      hacker adds something, tests everything but tools/, and ends up with
      the tools/ build broken.
      
      This got broken recently by s/390, fix it by copying
      arch/s390/include/uapi/asm/perf_regs.h to tools/arch/s390/include/uapi/asm/,
      making this one be used by means of <asm/perf_regs.h> and updating
      tools/perf/check_headers.sh to make sure we are notified when the
      original changes, so that we can check if anything is needed on the
      tooling side.
      
      This would have been caught by the 'tarpkg' test entry in:

      $ make -C tools/perf build-test

      when run on an s/390 build system or container.
      Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Fixes: f704ef44 ("s390/perf: add support for perf_regs and libdw")
      Link: https://lkml.kernel.org/n/tip-n57139ic0v9uffx8wdqi3d8a@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      10b9baa7
    • Ben Gainey's avatar
      perf jvmti: Generate correct debug information for inlined code · ca58d7e6
      Ben Gainey authored
      tools/perf/jvmti is broken in so far as it generates incorrect debug
      information. Specifically it attributes all debug lines to the original
      method being output even in the case that some code is being inlined
      from elsewhere.  This patch fixes the issue.
      
      To test (from within linux/tools/perf):
      
      export JDIR=/usr/lib/jvm/java-8-openjdk-amd64/
      make
      cat << __EOF > Test.java
      public class Test
      {
          private StringBuilder b = new StringBuilder();
      
          private void loop(int i, String... args)
          {
              for (String a : args)
                  b.append(a);
      
              long hc = b.hashCode() * System.nanoTime();
      
              b = new StringBuilder();
              b.append(hc);
      
              System.out.printf("Iteration %d = %d\n", i, hc);
          }
      
          public void run(String... args)
          {
              for (int i = 0; i < 10000; ++i)
              {
                  loop(i, args);
              }
          }
      
          public static void main(String... args)
          {
              Test t = new Test();
              t.run(args);
          }
      }
      __EOF
      $JDIR/bin/javac Test.java
      ./perf record -F 10000 -g -k mono $JDIR/bin/java -agentpath:`pwd`/libperf-jvmti.so Test
      ./perf inject --jit -i perf.data -o perf.data.jitted
      ./perf annotate -i perf.data.jitted --stdio | grep Test\.java: | sort -u
      
      Before this patch, Test.java line numbers get reported that are greater
      than the number of lines in the Test.java file.  They come from the
      source file of the inlined function, e.g. java/lang/String.java:1085.
      For further validation one can examine those lines in the JDK source
      distribution and confirm that they map to inlined functions called by
      Test.java.
      
      After this patch, the filename of the inlined function is output
      rather than the incorrect original source filename.
      Signed-off-by: Ben Gainey <ben.gainey@arm.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: Stephane Eranian <eranian@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ben Gainey <ben.gainey@arm.com>
      Cc: Colin King <colin.king@canonical.com>
      Cc: Darren Hart <dvhart@infradead.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: 598b7c69 ("perf jit: add source line info support")
      Link: http://lkml.kernel.org/r/20171122182541.d25599a3eb1ada3480d142fa@arm.com
      Signed-off-by: Kim Phillips <kim.phillips@arm.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      ca58d7e6
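The idea behind ca58d7e6 can be sketched as follows: each line-table entry is attributed to the source file of the method it actually came from, rather than to the outer method being emitted. The structures and the `sketch_file_for_entry()` helper are hypothetical and much simpler than the JVMTI data the real agent works with; they only illustrate the lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical method record: an id and the file that defines it. */
struct sketch_method {
	int id;
	const char *source_file;
};

/* Hypothetical line-table entry, tagged with the method it came from. */
struct sketch_line_entry {
	unsigned long pc;
	int line;
	int method_id; /* inlined callee, or the outer method itself */
};

/*
 * Return the source file for an entry by looking up its own method,
 * falling back to the outer method's file when the id is unknown.
 */
static const char *sketch_file_for_entry(const struct sketch_method *methods,
					 size_t nr_methods,
					 const struct sketch_line_entry *entry,
					 const char *outer_file)
{
	size_t i;

	assert(methods != NULL && entry != NULL);

	for (i = 0; i < nr_methods; i++) {
		if (methods[i].id == entry->method_id)
			return methods[i].source_file;
	}
	return outer_file;
}
```

With this lookup, an entry for line 1085 that originates from an inlined String method resolves to java/lang/String.java instead of being reported as an impossible line number in Test.java.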
    • Jiri Olsa's avatar
      perf tools: Fix up build in hardened environments · 61fb26a6
      Jiri Olsa authored
      On Fedora systems the perl and python CFLAGS/LDFLAGS include the
      hardened specs from redhat-rpm-config package. We apply them only for
      perl/python objects, which makes them not compatible with the rest of
      the objects and the build fails with:
      
        /usr/bin/ld: perf-in.o: relocation R_X86_64_32 against `.rodata.str1.1' can not be used when making a shared object; recompile with -fPIC
        /usr/bin/ld: libperf.a(libperf-in.o): relocation R_X86_64_32S against `.text' can not be used when making a shared object; recompile with -fPIC
        /usr/bin/ld: final link failed: Nonrepresentable section on output
        collect2: error: ld returned 1 exit status
        make[2]: *** [Makefile.perf:507: perf] Error 1
        make[1]: *** [Makefile.perf:210: sub-make] Error 2
        make: *** [Makefile:69: all] Error 2
      
      Mainly it's caused by perl/python objects being compiled with:
      
        -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1
      
      which makes the final link impossible, because the link step checks
      for 'proper' objects with the following option:
      
        -specs=/usr/lib/rpm/redhat/redhat-hardened-ld
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/20171204082437.GC30564@krava
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      61fb26a6
    • Jiri Olsa's avatar
      perf tools: Use shell function for perl cflags retrieval · 5cfee7a3
      Jiri Olsa authored
      Use the shell function for perl CFLAGS retrieval instead of back
      quotes (``). Both run the command in a shell, but the shell function
      is more explicit and seems to be the preferred way.
      
      Also we don't have any other use of the back quotes in perf Makefiles.
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20171108102739.30338-2-jolsa@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      5cfee7a3