1. 19 Nov, 2015 19 commits
    • perf hists browser: Support flat callchains · 4b3a3212
      Namhyung Kim authored
      The flat callchain mode prints all chains in a single, simple
      hierarchy to make them easy to read.
      
      Currently perf report --tui doesn't show flat callchains properly.  With
      flat callchains, only leaf nodes are added to the final rbtree, so the
      browser must also show the entries held in their parent nodes.  To do
      that, add a parent_val list to struct callchain_node and show it along
      with the (normal) val list.
      
      For example, consider the following callchains with '-g graph'.
      
        $ perf report -g graph
        - 39.93%  swapper  [kernel.vmlinux]  [k] intel_idle
             intel_idle
             cpuidle_enter_state
             cpuidle_enter
             call_cpuidle
           - cpu_startup_entry
                28.63% start_secondary
              - 11.30% rest_init
                   start_kernel
                   x86_64_start_reservations
                   x86_64_start_kernel
      
      Before:
        $ perf report -g flat
        - 39.93%  swapper  [kernel.vmlinux]  [k] intel_idle
             28.63% start_secondary
           - 11.30% rest_init
                start_kernel
                x86_64_start_reservations
                x86_64_start_kernel
      
      After:
        $ perf report -g flat
        - 39.93%  swapper  [kernel.vmlinux]  [k] intel_idle
           - 28.63% intel_idle
                cpuidle_enter_state
                cpuidle_enter
                call_cpuidle
                cpu_startup_entry
                start_secondary
           - 11.30% intel_idle
                cpuidle_enter_state
                cpuidle_enter
                call_cpuidle
                cpu_startup_entry
                start_kernel
                x86_64_start_reservations
                x86_64_start_kernel
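
      The idea of showing parent entries together with the leaf can be
      sketched in C as follows.  This is only an illustration under stated
      assumptions, not the perf implementation: struct chain_node and
      flat_chain() are hypothetical stand-ins for struct callchain_node and
      its parent_val/val lists, and each node holds a single symbol.  Starting
      from a leaf, the sketch walks the parent links and emits the symbols
      root first, which matches the order in the "After" output.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified node: one symbol per node plus a parent link. */
struct chain_node {
	const char *sym;
	const struct chain_node *parent;
};

/*
 * Fill out[] with the symbols from the root of the tree down to leaf.
 * Returns the number of entries written, or 0 if the chain does not fit.
 */
static size_t flat_chain(const struct chain_node *leaf,
			 const char **out, size_t max)
{
	const struct chain_node *n;
	size_t depth = 0, i;

	assert(out != NULL);

	for (n = leaf; n; n = n->parent)
		depth++;
	if (depth > max)
		return 0;

	/* Walk leaf -> root, storing back to front so out[0] is the root. */
	i = depth;
	for (n = leaf; n; n = n->parent)
		out[--i] = n->sym;

	return depth;
}
```

      Each leaf then yields one complete chain, so two leaves that share
      the same parents repeat those parents in their own flat output.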

      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1447047946-1691-8-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf hists browser: Factor out hist_browser__show_callchain_list() · 18bb8381
      Namhyung Kim authored
      This function prints a single callchain list entry.  As it will be
      used by other functions, factor it out into a separate function.

      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1447047946-1691-7-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf report: Add callchain value option · f2af0086
      Namhyung Kim authored
      The -g/--call-graph option now accepts a value type that selects how
      callchain values are displayed.  Possible values are 'percent', 'period'
      and 'count'.  'percent' is the same as before and is the default
      behavior.  'period' displays the raw period value rather than the
      percentage.  'count' displays the number of occurrences.
      
        $ perf report --no-children --stdio -g percent
        ...
          39.93%  swapper  [kernel.vmlinux]  [k] intel_idle
                  |
                  ---intel_idle
                     cpuidle_enter_state
                     cpuidle_enter
                     call_cpuidle
                     cpu_startup_entry
                     |
                     |--28.63%-- start_secondary
                     |
                      --11.30%-- rest_init
      
        $ perf report --no-children --show-total-period --stdio -g period
        ...
          39.93%   13018705  swapper  [kernel.vmlinux]  [k] intel_idle
                  |
                  ---intel_idle
                     cpuidle_enter_state
                     cpuidle_enter
                     call_cpuidle
                     cpu_startup_entry
                     |
                     |--9334403-- start_secondary
                     |
                      --3684302-- rest_init
      
        $ perf report --no-children --show-nr-samples --stdio -g count
        ...
          39.93%     80  swapper  [kernel.vmlinux]  [k] intel_idle
                  |
                  ---intel_idle
                     cpuidle_enter_state
                     cpuidle_enter
                     call_cpuidle
                     cpu_startup_entry
                     |
                     |--57-- start_secondary
                     |
                      --23-- rest_init
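
      The three value types can be pictured with a small C sketch.  It is
      an assumption-laden illustration rather than perf's code: the enum
      chain_value, struct chain_stat and format_chain_value() are hypothetical
      names, and the percentage is computed against a total period passed in
      by the caller.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical names; perf's own enum and helpers may differ. */
enum chain_value { CHAIN_PERCENT, CHAIN_PERIOD, CHAIN_COUNT };

struct chain_stat {
	uint64_t period;	/* summed sample period for this chain */
	uint64_t count;		/* number of samples that hit this chain */
};

/* Format one callchain value into buf according to the selected mode. */
static int format_chain_value(char *buf, size_t size, enum chain_value mode,
			      const struct chain_stat *st, uint64_t total_period)
{
	assert(buf && st);

	switch (mode) {
	case CHAIN_PERIOD:
		return snprintf(buf, size, "%" PRIu64, st->period);
	case CHAIN_COUNT:
		return snprintf(buf, size, "%" PRIu64, st->count);
	case CHAIN_PERCENT:
	default:
		if (total_period == 0)
			return snprintf(buf, size, "0.00%%");
		return snprintf(buf, size, "%.2f%%",
				100.0 * (double)st->period / (double)total_period);
	}
}
```

      The same chain therefore carries both a period and a count, and only
      the formatting step differs between the three modes.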

      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1447047946-1691-6-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf callchain: Add count fields to struct callchain_node · 5e47f8ff
      Namhyung Kim authored
      The new fields track the number of occurrences of each callchain.

      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Brendan Gregg <brendan.d.gregg@gmail.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1447047946-1691-5-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf callchain: Abstract callchain print function · 5ab250ca
      Namhyung Kim authored
      This is preparation for printing other types of callchain values,
      such as count or period.

      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1447047946-1691-4-git-send-email-namhyung@kernel.org
      [ renamed new _sprintf_ operation to _scnprintf_ ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf report: Support folded callchain mode on --stdio · 26e77924
      Namhyung Kim authored
      Add a new call chain option (-g), 'folded', to print each callchain on
      a single line.  The entries of a callchain are separated by semicolons,
      and each line is preceded by the (absolute) percent value and a space.
      
      For example, the following 20 lines can be printed in 3 lines with the
      folded output mode:
      
        $ perf report -g flat --no-children | grep -v ^# | head -20
            60.48%  swapper  [kernel.vmlinux]  [k] intel_idle
                    54.60%
                       intel_idle
                       cpuidle_enter_state
                       cpuidle_enter
                       call_cpuidle
                       cpu_startup_entry
                       start_secondary
      
                    5.88%
                       intel_idle
                       cpuidle_enter_state
                       cpuidle_enter
                       call_cpuidle
                       cpu_startup_entry
                       rest_init
                       start_kernel
                       x86_64_start_reservations
                       x86_64_start_kernel
      
        $ perf report -g folded --no-children | grep -v ^# | head -3
            60.48%  swapper  [kernel.vmlinux]  [k] intel_idle
        54.60% intel_idle;cpuidle_enter_state;cpuidle_enter;call_cpuidle;cpu_startup_entry;start_secondary
        5.88% intel_idle;cpuidle_enter_state;cpuidle_enter;call_cpuidle;cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;x86_64_start_kernel
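
      The folded line format shown above can be sketched in C as follows.
      This is a minimal illustration, not the code added by the patch:
      format_folded() and its parameters are hypothetical, and the symbols are
      assumed to be available as a plain array in print order.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write "<percent>% sym0;sym1;..." into buf.
 * Returns the length written, or -1 if buf is too small.
 */
static int format_folded(char *buf, size_t size, double percent,
			 const char *const *syms, size_t nr)
{
	size_t i;
	int len, ret;

	assert(buf && size > 0);

	/* Percent value followed by a single space. */
	len = snprintf(buf, size, "%.2f%% ", percent);
	if (len < 0 || (size_t)len >= size)
		return -1;

	/* Symbols joined by semicolons, with no trailing separator. */
	for (i = 0; i < nr; i++) {
		ret = snprintf(buf + len, size - (size_t)len, "%s%s",
			       i ? ";" : "", syms[i]);
		if (ret < 0 || (size_t)ret >= size - (size_t)len)
			return -1;
		len += ret;
	}
	return len;
}
```

      Keeping one chain per line is what lets a script split on the first
      space and then on semicolons without tracking indentation.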
      
      This mode is currently supported only for --stdio and is intended for
      use by scripts such as FlameGraphs[1].  Support for other UIs might be
      added later.
      
      [1] http://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html

      Requested-and-Tested-by: Brendan Gregg <brendan.d.gregg@gmail.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1447047946-1691-2-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf machine: Fix machine__findnew_module_map to put dso · 566c69c3
      Masami Hiramatsu authored
      Fix machine__findnew_module_map to drop the reference to the dso because
      it is already referenced by both machine__findnew_module_dso() and
      map__new2().
      
      Refcnt debugger shows:
      
        ==== [1] ====
        Unreclaimed dso: 0x1ffd980
        Refcount +1 => 1 at
          ./perf(dso__new+0x1ff) [0x4a62df]
          ./perf(__dsos__addnew+0x29) [0x4a6e19]
          ./perf() [0x4b8b91]
          ./perf(modules__parse+0xfc) [0x4a9d5c]
          ./perf() [0x4b8460]
          ./perf(machine__create_kernel_maps+0x150) [0x4bb550]
          ./perf(machine__new_host+0xfa) [0x4bb75a]
          ./perf(init_probe_symbol_maps+0x93) [0x506623]
          ./perf() [0x455ffa]
          ./perf(cmd_probe+0x6c) [0x4566bc]
          ./perf() [0x47abc5]
          ./perf(main+0x610) [0x421f90]
          /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f1345a8eaf5]
          ./perf() [0x4220a9]
      
      This map_groups__insert(0x4b8b91) already gets a reference to the new
      dso:
      
        ----
        eu-addr2line -e ./perf -f 0x4b8b91
        map_groups__insert inlined at util/machine.c:586 in
        machine__create_module
        util/map.h:207
        ----
      
      So this dso refcnt will be released when map_groups gets released.
      
        [snip]
        Refcount +1 => 2 at
          ./perf(dso__get+0x34) [0x4a65f4]
          ./perf() [0x4b8b35]
          ./perf(modules__parse+0xfc) [0x4a9d5c]
          ./perf() [0x4b8460]
          ./perf(machine__create_kernel_maps+0x150) [0x4bb550]
          ./perf(machine__new_host+0xfa) [0x4bb75a]
          ./perf(init_probe_symbol_maps+0x93) [0x506623]
          ./perf() [0x455ffa]
          ./perf(cmd_probe+0x6c) [0x4566bc]
          ./perf() [0x47abc5]
          ./perf(main+0x610) [0x421f90]
          /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f1345a8eaf5]
          ./perf() [0x4220a9]
      
      Here, machine__findnew_module_dso(0x4b8b35) gets the dso (and stores it
      in a local variable):
      
        ----
        # eu-addr2line -e ./perf -f 0x4b8b35
        machine__findnew_module_dso inlined at util/machine.c:578 in
        machine__create_module
        util/machine.c:514
        ----
      
        Refcount +1 => 3 at
          ./perf(dso__get+0x34) [0x4a65f4]
          ./perf(map__new2+0x76) [0x4be1c6]
          ./perf() [0x4b8b4f]
          ./perf(modules__parse+0xfc) [0x4a9d5c]
          ./perf() [0x4b8460]
          ./perf(machine__create_kernel_maps+0x150) [0x4bb550]
          ./perf(machine__new_host+0xfa) [0x4bb75a]
          ./perf(init_probe_symbol_maps+0x93) [0x506623]
          ./perf() [0x455ffa]
          ./perf(cmd_probe+0x6c) [0x4566bc]
          ./perf() [0x47abc5]
          ./perf(main+0x610) [0x421f90]
          /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f1345a8eaf5]
          ./perf() [0x4220a9]
      
      But also map__new2() gets the dso which will be put when the map is
      released.
      
      So, we have to drop the constructor reference obtained in
      machine__findnew_module_dso().

      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20151118064035.30709.58824.stgit@localhost.localdomain
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Fix machine__create_kernel_maps to put kernel dso refcount · 1154c957
      Masami Hiramatsu authored
      Fix machine__create_kernel_maps() to put the kernel dso, because a
      reference to it was already taken via __machine__create_kernel_maps().
      
      Refcnt debugger shows:
        ==== [0] ====
        Unreclaimed dso: 0x3036ab0
        Refcount +1 => 1 at
          ./perf(dso__new+0x1ff) [0x4a62df]
          ./perf(__dsos__addnew+0x29) [0x4a6e19]
          ./perf(dsos__findnew+0xd1) [0x4a7181]
          ./perf(machine__findnew_kernel+0x27) [0x4a5e17]
          ./perf() [0x4b8cf2]
          ./perf(machine__create_kernel_maps+0x28) [0x4bb428]
          ./perf(machine__new_host+0xfa) [0x4bb74a]
          ./perf(init_probe_symbol_maps+0x93) [0x506613]
          ./perf() [0x455ffa]
          ./perf(cmd_probe+0x6c) [0x4566bc]
          ./perf() [0x47abc5]
          ./perf(main+0x610) [0x421f90]
          /lib64/libc.so.6(__libc_start_main+0xf5) [0x7ffa6809eaf5]
          ./perf() [0x4220a9]
        [snip]
        Refcount +1 => 2 at
          ./perf(dsos__findnew+0x7e) [0x4a712e]
          ./perf(machine__findnew_kernel+0x27) [0x4a5e17]
          ./perf() [0x4b8cf2]
          ./perf(machine__create_kernel_maps+0x28) [0x4bb428]
          ./perf(machine__new_host+0xfa) [0x4bb74a]
          ./perf(init_probe_symbol_maps+0x93) [0x506613]
          ./perf() [0x455ffa]
          ./perf(cmd_probe+0x6c) [0x4566bc]
          ./perf() [0x47abc5]
          ./perf(main+0x610) [0x421f90]
          /lib64/libc.so.6(__libc_start_main+0xf5) [0x7ffa6809eaf5]
          ./perf() [0x4220a9]
        [snip]
        Refcount -1 => 1 at
          ./perf(dso__put+0x2f) [0x4a664f]
          ./perf(machine__delete+0xfe) [0x4b93ee]
          ./perf(exit_probe_symbol_maps+0x28) [0x5066b8]
          ./perf() [0x45628a]
          ./perf(cmd_probe+0x6c) [0x4566bc]
          ./perf() [0x47abc5]
          ./perf(main+0x610) [0x421f90]
          /lib64/libc.so.6(__libc_start_main+0xf5) [0x7ffa6809eaf5]
          ./perf() [0x4220a9]
      
      Actually, dsos__findnew gets the dso before returning it, so the dso
      user (in this case machine__create_kernel_maps) has to put the dso after
      using it.

      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20151118064033.30709.98954.stgit@localhost.localdomain
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Fix __dsos__addnew to put dso after adding it to the list · 82de26ab
      Masami Hiramatsu authored
      __dsos__addnew should drop the constructor reference to dso after adding
      it to the list, because __dsos__add() will get a reference that will be
      kept while it is in the list.
      
      This fixes DSO leaks where entries are removed from the list but the
      refcount never gets to zero.
      
      Refcnt debugger shows:
        ==== [0] ====
        Unreclaimed dso: 0x2fccab0
        Refcount +1 => 1 at
          ./perf(dso__new+0x1ff) [0x4a62df]
          ./perf(__dsos__addnew+0x29) [0x4a6e19]
          ./perf(dsos__findnew+0xd1) [0x4a7281]
          ./perf(machine__findnew_kernel+0x27) [0x4a5e17]
          ./perf() [0x4b8df2]
          ./perf(machine__create_kernel_maps+0x28) [0x4bb528]
          ./perf(machine__new_host+0xfa) [0x4bb84a]
          ./perf(init_probe_symbol_maps+0x93) [0x506713]
          ./perf() [0x455ffa]
          ./perf(cmd_probe+0x6c) [0x4566bc]
          ./perf() [0x47abc5]
          ./perf(main+0x610) [0x421f90]
          /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f46df132af5]
          ./perf() [0x4220a9]
        Refcount +1 => 2 at
          ./perf(__dsos__addnew+0xfb) [0x4a6eeb]
          ./perf(dsos__findnew+0xd1) [0x4a7281]
          ./perf(machine__findnew_kernel+0x27) [0x4a5e17]
          ./perf() [0x4b8df2]
          ./perf(machine__create_kernel_maps+0x28) [0x4bb528]
          ./perf(machine__new_host+0xfa) [0x4bb84a]
          ./perf(init_probe_symbol_maps+0x93) [0x506713]
          ./perf() [0x455ffa]
          ./perf(cmd_probe+0x6c) [0x4566bc]
          ./perf() [0x47abc5]
          ./perf(main+0x610) [0x421f90]
          /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f46df132af5]
          ./perf() [0x4220a9]
        Refcount +1 => 3 at
          ./perf(dsos__findnew+0x7e) [0x4a722e]
          ./perf(machine__findnew_kernel+0x27) [0x4a5e17]
          ./perf() [0x4b8df2]
          ./perf(machine__create_kernel_maps+0x28) [0x4bb528]
          ./perf(machine__new_host+0xfa) [0x4bb84a]
          ./perf(init_probe_symbol_maps+0x93) [0x506713]
          ./perf() [0x455ffa]
          ./perf(cmd_probe+0x6c) [0x4566bc]
          ./perf() [0x47abc5]
          ./perf(main+0x610) [0x421f90]
          /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f46df132af5]
          ./perf() [0x4220a9]
        [snip]

      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20151118064031.30709.81460.stgit@localhost.localdomain
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Fix to put new map after inserting to map_groups in dso__load_sym · 8d5c340d
      Masami Hiramatsu authored
      Fix dso__load_sym to put the map object once it has been inserted
      into kmaps.
      
      Refcnt debugger shows:
        ==== [0] ====
        Unreclaimed map: 0x39113e0
        Refcount +1 => 1 at
          ./perf(map__new2+0xb5) [0x4be155]
          ./perf(dso__load_sym+0xee1) [0x503461]
          ./perf(dso__load_vmlinux+0xbf) [0x4aa6df]
          ./perf(dso__load_vmlinux_path+0x8c) [0x4aa83c]
          ./perf() [0x50528a]
          ./perf(convert_perf_probe_events+0xd79) [0x50ac29]
          ./perf() [0x45600f]
          ./perf(cmd_probe+0x6c) [0x4566bc]
          ./perf() [0x47abc5]
          ./perf(main+0x610) [0x421f90]
          /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f152368baf5]
          ./perf() [0x4220a9]
        Refcount +1 => 2 at
          ./perf(maps__insert+0x9a) [0x4bfffa]
          ./perf(dso__load_sym+0xf89) [0x503509]
          ./perf(dso__load_vmlinux+0xbf) [0x4aa6df]
          ./perf(dso__load_vmlinux_path+0x8c) [0x4aa83c]
          ./perf() [0x50528a]
          ./perf(convert_perf_probe_events+0xd79) [0x50ac29]
          ./perf() [0x45600f]
          ./perf(cmd_probe+0x6c) [0x4566bc]
          ./perf() [0x47abc5]
          ./perf(main+0x610) [0x421f90]
          /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f152368baf5]
          ./perf() [0x4220a9]
        Refcount -1 => 1 at
          ./perf(map_groups__exit+0x94) [0x4bed04]
          ./perf(machine__delete+0xb0) [0x4b9300]
          ./perf(exit_probe_symbol_maps+0x28) [0x506608]
          ./perf() [0x45628a]
          ./perf(cmd_probe+0x6c) [0x4566bc]
          ./perf() [0x47abc5]
          ./perf(main+0x610) [0x421f90]
          /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f152368baf5]
          ./perf() [0x4220a9]
      
      This means that dso__load_sym calls map__new2 and maps__insert, both
      of which bump the map refcount, but map_groups__exit drops just one
      reference.
      
      Fix it by dropping the refcount after inserting it into kmaps.
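
      The ownership rule behind this fix can be sketched in C.  The sketch
      is a simplified model under stated assumptions, not perf code: struct
      obj, struct holder and all the functions below are hypothetical
      stand-ins for the map object and the container that takes its own
      reference on insert.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical refcounted object standing in for struct map. */
struct obj {
	int refcnt;
};

static struct obj *obj__new(void)
{
	struct obj *o = calloc(1, sizeof(*o));

	if (o)
		o->refcnt = 1;	/* constructor reference, owned by the caller */
	return o;
}

static struct obj *obj__get(struct obj *o)
{
	if (o)
		o->refcnt++;
	return o;
}

/* Drop one reference; returns 1 if the object was freed. */
static int obj__put(struct obj *o)
{
	assert(o && o->refcnt > 0);
	if (--o->refcnt == 0) {
		free(o);
		return 1;
	}
	return 0;
}

/* A one-slot container that takes its own reference on insert. */
struct holder {
	struct obj *slot;
};

static void holder__insert(struct holder *h, struct obj *o)
{
	h->slot = obj__get(o);
}

/* Returns 1 if releasing the container's reference freed the object. */
static int holder__exit(struct holder *h)
{
	int freed = obj__put(h->slot);

	h->slot = NULL;
	return freed;
}

/* The fixed pattern: create, insert, then drop the constructor reference. */
static int holder__add_new(struct holder *h)
{
	struct obj *o = obj__new();

	if (!o)
		return -1;
	holder__insert(h, o);	/* refcnt: 1 -> 2 */
	obj__put(o);		/* refcnt: 2 -> 1, container is the only owner */
	return 0;
}
```

      Without the obj__put() call in holder__add_new(), holder__exit()
      would leave the count at 1 and the object would never be freed, which
      is the shape of the leak in the trace above.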

      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20151118064026.30709.50038.stgit@localhost.localdomain
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Make perf_exec_path() always return malloc'd string · c4068f51
      Masami Hiramatsu authored
      Since system_path() returns a malloc'd string if the given path is not
      an absolute path, perf_exec_path() sometimes returns a static string and
      sometimes returns a malloc'd string, depending on the environment
      variables or command options.
      
      This may cause a memory leak because the caller cannot unconditionally
      free the returned string.
      
      This fixes perf_exec_path() and system_path() to always return a
      malloc'd string, so the caller can always free it.
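
      A minimal C sketch of the "always return heap memory" contract is
      shown here.  It does not reproduce perf_exec_path() or system_path():
      exec_path_dup() and its two parameters are hypothetical, chosen only to
      show that every return path hands back memory the caller may free().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for the fixed behaviour: whether the path comes
 * from an override or from the built-in fallback, the result is a fresh
 * heap copy, so the caller can free() it unconditionally.
 */
static char *exec_path_dup(const char *override, const char *fallback)
{
	const char *src;
	char *copy;
	size_t len;

	assert(fallback);

	src = (override && *override) ? override : fallback;
	len = strlen(src) + 1;
	copy = malloc(len);
	if (copy)
		memcpy(copy, src, len);
	return copy;	/* never a pointer to static storage */
}
```

      The cost is one extra allocation on the path that used to return a
      static string; in exchange, callers no longer need to know which case
      they hit.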

      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20151119060453.14210.65666.stgit@localhost.localdomain
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf machine: Fix to destroy kernel maps when machine exits · ebe9729c
      Masami Hiramatsu authored
      machine__exit did not call machine__destroy_kernel_maps.
      
      This fixes some map memory leaks, as shown below.
      
      Without this fix:
        ----
        ./perf probe vfs_read
        Added new event:
          probe:vfs_read       (on vfs_read)
      
        You can now use it in all perf tools, such as:
      
                perf record -e probe:vfs_read -aR sleep 1
      
        REFCNT: BUG: Unreclaimed objects found.
        REFCNT: Total 4 objects are not reclaimed.
           To see all backtraces, rerun with -v option
        ----
      With this fix:
        ----
        ./perf probe vfs_read
        Added new event:
          probe:vfs_read       (on vfs_read)
      
        You can now use it in all perf tools, such as:
      
                perf record -e probe:vfs_read -aR sleep 1
      
        REFCNT: BUG: Unreclaimed objects found.
        REFCNT: Total 2 objects are not reclaimed.
           To see all backtraces, rerun with -v option
        ----

      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20151118064024.30709.43577.stgit@localhost.localdomain
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf machine: Fix machine__destroy_kernel_maps to drop vmlinux_maps references · e96e4078
      Masami Hiramatsu authored
      Fix machine__destroy_kernel_maps() to drop vmlinux_maps references
      before filling it with NULL.
      
      Refcnt debugger shows
        ==== [1] ====
        Unreclaimed map: 0x36b1070
        Refcount +1 => 1 at
          ./perf(map__new2+0xb5) [0x4bdec5]
          ./perf(machine__create_kernel_maps+0x72) [0x4bb152]
          ./perf(machine__new_host+0xfa) [0x4bb41a]
          ./perf(init_probe_symbol_maps+0x93) [0x5062d3]
          ./perf() [0x455ffa]
          ./perf(cmd_probe+0x6c) [0x4566bc]
          ./perf() [0x47abc5]
          ./perf(main+0x610) [0x421f90]
          /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f1fc9fc4af5]
          ./perf() [0x4220a9]
        Refcount +1 => 2 at
          ./perf(maps__insert+0x9a) [0x4bfd6a]
          ./perf(machine__create_kernel_maps+0xc3) [0x4bb1a3]
          ./perf(machine__new_host+0xfa) [0x4bb41a]
          ./perf(init_probe_symbol_maps+0x93) [0x5062d3]
          ./perf() [0x455ffa]
          ./perf(cmd_probe+0x6c) [0x4566bc]
          ./perf() [0x47abc5]
          ./perf(main+0x610) [0x421f90]
          /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f1fc9fc4af5]
          ./perf() [0x4220a9]
        Refcount -1 => 1 at
          ./perf(map_groups__exit+0x94) [0x4bea74]
          ./perf(machine__delete+0x3d) [0x4b91fd]
          ./perf(exit_probe_symbol_maps+0x28) [0x506378]
          ./perf() [0x45628a]
          ./perf(cmd_probe+0x6c) [0x4566bc]
          ./perf() [0x47abc5]
          ./perf(main+0x610) [0x421f90]
          /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f1fc9fc4af5]
          ./perf() [0x4220a9]
      
      map__new2() returns a map with refcnt = 1, and map_groups__insert
      gets it again in __machine__create_kernel_maps().
      
      machine__destroy_kernel_maps() calls map_groups__remove() to
      decrement the refcnt, but before decrementing it again (corresponding
      to map__new2), it sets vmlinux_maps[type] = NULL, which may cause a
      refcnt leak.
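
      The ordering problem can be shown with a short C sketch.  It is a
      hypothetical model, not the perf function: struct ref_map, NR_TYPES,
      nr_freed and destroy_slots() are invented for illustration, and each
      slot is assumed to hold exactly two references (the constructor's and
      the group's).

```c
#include <assert.h>
#include <stdlib.h>

#define NR_TYPES 2	/* arbitrary size for the sketch */

struct ref_map {	/* hypothetical stand-in for struct map */
	int refcnt;
};

static int nr_freed;	/* counts objects actually released */

static void ref_map__put(struct ref_map *m)
{
	assert(m && m->refcnt > 0);
	if (--m->refcnt == 0) {
		free(m);
		nr_freed++;
	}
}

/*
 * Each slot holds two references: the constructor's and the one taken
 * when the map was inserted into the group.  Both must be dropped before
 * the pointer is overwritten with NULL.
 */
static void destroy_slots(struct ref_map *slots[NR_TYPES])
{
	int i;

	for (i = 0; i < NR_TYPES; i++) {
		if (!slots[i])
			continue;
		ref_map__put(slots[i]);	/* reference held by the group */
		ref_map__put(slots[i]);	/* constructor reference */
		slots[i] = NULL;	/* only now forget the pointer */
	}
}
```

      Setting the slot to NULL before the second put would lose the only
      remaining pointer to the object while its count is still 1.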

      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20151118064022.30709.3897.stgit@localhost.localdomain
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf machine: Fix machine__findnew_module_map to put registered map · 9afcb420
      Masami Hiramatsu authored
      Fix the machine object to drop its reference to the map object after
      inserting it into machine->kmaps.
      
      refcnt debugger shows what happened:
        ----
        ==== [2] ====
        Unreclaimed map: 0x346f750
        Refcount +1 => 1 at
          ./perf(map__new2+0xb5) [0x4bdea5]
          ./perf() [0x4b8aaf]
          ./perf(modules__parse+0xfc) [0x4a9cbc]
          ./perf() [0x4b83c0]
          ./perf(machine__create_kernel_maps+0x148) [0x4bb208]
          ./perf(machine__new_host+0xfa) [0x4bb3fa]
          ./perf(init_probe_symbol_maps+0x93) [0x5062b3]
          ./perf() [0x455ffa]
          ./perf(cmd_probe+0x6c) [0x4566bc]
          ./perf() [0x47abc5]
          ./perf(main+0x610) [0x421f90]
          /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f5373899af5]
          ./perf() [0x4220a9]
        Refcount +1 => 2 at
          ./perf(maps__insert+0x9a) [0x4bfd4a]
          ./perf() [0x4b8acb]
          ./perf(modules__parse+0xfc) [0x4a9cbc]
          ./perf() [0x4b83c0]
          ./perf(machine__create_kernel_maps+0x148) [0x4bb208]
          ./perf(machine__new_host+0xfa) [0x4bb3fa]
          ./perf(init_probe_symbol_maps+0x93) [0x5062b3]
          ./perf() [0x455ffa]
          ./perf(cmd_probe+0x6c) [0x4566bc]
          ./perf() [0x47abc5]
          ./perf(main+0x610) [0x421f90]
          /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f5373899af5]
          ./perf() [0x4220a9]
        Refcount -1 => 1 at
          ./perf(map_groups__exit+0x94) [0x4bea54]
          ./perf(machine__delete+0x3d) [0x4b91ed]
          ./perf(exit_probe_symbol_maps+0x28) [0x506358]
          ./perf() [0x45628a]
          ./perf(cmd_probe+0x6c) [0x4566bc]
          ./perf() [0x47abc5]
          ./perf(main+0x610) [0x421f90]
          /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f5373899af5]
          ./perf() [0x4220a9]
        ----
      
      This pattern clearly shows that the refcnt of the map is acquired twice,
      by map__new2 and maps__insert, but released only once, at
      map_groups__exit, when we purge its maps rbtree.
      
      Since maps__insert already reference counted the map, we have to drop
      the constructor (map__new2) reference count right after inserting it.
      
      This happens in machine__findnew_module_map, as shown below.
      
        ----
        # eu-addr2line -e ./perf -f 0x4b8aaf
        machine__findnew_module_map inlined at util/machine.c:1046
        in machine__create_module
        util/machine.c:582
        # eu-addr2line -e ./perf -f 0x4b8acb
        map_groups__insert inlined at util/machine.c:585
        in machine__create_module
        util/map.h:208
        ----
      
      (note that both are at util/machine.c:58X which is
       machine__findnew_module_map)

      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20151118064020.30709.40499.stgit@localhost.localdomain
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf probe: Fix to free temporal Dwarf_Frame · 05c8d802
      Masami Hiramatsu authored
      Since dwarf_cfi_addrframe returns malloc'd Dwarf_Frame object, it has to
      be freed after it is used.

      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/20151118064011.30709.65674.stgit@localhost.localdomain
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf test: Mute test cases error messages if verbose == 0 · 5bcf2fe0
      Wang Nan authored
      Sometimes error messages break the pretty output of 'perf test'.
      For example:
      
        # mv /lib/modules/4.3.0-rc4+/build/vmlinux{,.bak}
        # perf test LLVM BPF
        35: Test LLVM searching and compiling                        :
        35.1: Basic BPF llvm compiling test                          : Ok
        35.2: Test kbuild searching                                  : Ok
        35.3: Compile source for BPF prologue generation test        : Ok
        37: Test BPF filter                                          :
        37.1: Test basic BPF filtering                               : Ok
        37.2: Test BPF prologue generation                           :Failed to find the path for kernel: No such file or directory FAILED!
      
      This patch mutes test cases thoroughly by redirecting their stdout and
      stderr to /dev/null when verbose == 0. After applying this patch:
      
        # ./perf test LLVM BPF
        35: Test LLVM searching and compiling                        :
        35.1: Basic BPF llvm compiling test                          : Ok
        35.2: Test kbuild searching                                  : Ok
        35.3: Compile source for BPF prologue generation test        : Ok
        37: Test BPF filter                                          :
        37.1: Test basic BPF filtering                               : Ok
        37.2: Test BPF prologue generation                           : FAILED!
      
        # ./perf test -v LLVM BPF
        35: Test LLVM searching and compiling                        :
        35.1: Basic BPF llvm compiling test                          :
        --- start ---
        test child forked, pid 13183
        Kernel build dir is set to /lib/modules/4.3.0-rc4+/build
        set env: KBUILD_DIR=/lib/modules/4.3.0-rc4+/build
        ...
        bpf: config 'func=null_lseek file->f_mode offset orig' is ok
        Looking at the vmlinux_path (7 entries long)
        Failed to find the path for kernel: No such file or directory
        bpf_probe: failed to convert perf probe eventsFailed to add events selected by BPF
        test child finished with -1
        ---- end ----
        Test BPF filter subtest 1: FAILED!
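
      The redirection described above can be sketched as follows. This is a
      minimal illustration under stated assumptions, not the code from the
      patch: run_test_muted() and noisy_failing_test() are hypothetical
      names, and the test is assumed to run in a forked child (as the
      "test child forked" line in the verbose output suggests).

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: run test_fn() in a forked child. When verbose is 0,
 * the child's stdout and stderr are pointed at /dev/null before the test
 * runs, so only the parent's summary line reaches the terminal.
 * Returns 0 if the test passed, 1 if it failed, -1 on an internal error.
 */
static int run_test_muted(int (*test_fn)(void), int verbose)
{
	int status;
	pid_t pid;

	fflush(NULL);	/* do not let the child inherit pending buffered output */
	pid = fork();
	if (pid < 0)
		return -1;

	if (pid == 0) {
		int ret;

		if (!verbose) {
			int nullfd = open("/dev/null", O_WRONLY);

			if (nullfd < 0)
				_exit(127);
			dup2(nullfd, STDOUT_FILENO);
			dup2(nullfd, STDERR_FILENO);
			if (nullfd > STDERR_FILENO)
				close(nullfd);
		}
		ret = test_fn();
		fflush(NULL);	/* _exit() does not flush stdio buffers */
		_exit(ret ? 1 : 0);
	}

	if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}

/* A noisy, failing test standing in for a real test case. */
static int noisy_failing_test(void)
{
	fprintf(stderr, "Failed to find the path for kernel\n");
	return -1;
}
```

      With verbose set to 0 the error message goes to /dev/null and the
      caller only sees the failure status; with a non-zero verbose the
      message is printed as before.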
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1447749170-175898-6-git-send-email-wangnan0@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      5bcf2fe0
    • Wang Nan's avatar
      perf test: Print result for each BPF subtest · 77a0cf68
      Wang Nan authored
      This patch prints the result of each sub-test for the BPF test cases.
      
      Before:
      
        # ./perf test BPF
        37: Test BPF filter                                          : Ok
      
      After:
      
        # ./perf test BPF
        37: Test BPF filter                                          :
        37.1: Test basic BPF filtering                               : Ok
        37.2: Test BPF prologue generation                           : Ok
      
      When a failure happens:
      
        # cat ~/.perfconfig
        [llvm]
            clang-path = "/bin/false"
        # ./perf test BPF
        37: Test BPF filter                                          :
        37.1: Test basic BPF filtering                               : Skip
        37.2: Test BPF prologue generation                           : Skip
      Suggested-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1447749170-175898-5-git-send-email-wangnan0@huawei.com
      [ Fixed up not to use .func in an anonymous union ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      77a0cf68
    • Wang Nan's avatar
      perf test: Print result for each LLVM subtest · e8c6d500
      Wang Nan authored
      Currently 'perf test llvm' and 'perf test BPF' have multiple sub-tests,
      but the result is provided in only one line:
      
        # perf test LLVM
        35: Test LLVM searching and compiling                        : Ok
      
      This patch introduces sub-test support, allowing 'perf test' to report
      the result of each sub-test:
      
        # perf test LLVM
        35: Test LLVM searching and compiling                        :
        35.1: Basic BPF llvm compiling test                          : Ok
        35.2: Test kbuild searching                                  : Ok
        35.3: Compile source for BPF prologue generation test        : Ok
      
      When a failure happens:
      
        # cat ~/.perfconfig
        [llvm]
             clang-path = "/bin/false"
        # perf test LLVM
        35: Test LLVM searching and compiling                        :
        35.1: Basic BPF llvm compiling test                          : FAILED!
        35.2: Test kbuild searching                                  : Skip
        35.3: Compile source for BPF prologue generation test        : Skip
      
      And:
      
        # rm ~/.perfconfig
        # ./perf test LLVM
        35: Test LLVM searching and compiling                        :
        35.1: Basic BPF llvm compiling test                          : Skip
        35.2: Test kbuild searching                                  : Skip
        35.3: Compile source for BPF prologue generation test        : Skip
      
      Skip by user:
      
        # ./perf test -s 1,`seq -s , 3 42`
         1: vmlinux symtab matches kallsyms                          : Skip (user override)
         2: detect openat syscall event                              : Ok
        ...
        35: Test LLVM searching and compiling                        : Skip (user override)
        ...
      Suggested-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1447749170-175898-4-git-send-email-wangnan0@huawei.com
      [ Changed so that func is not on an anonymous union ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      e8c6d500
    • Arnaldo Carvalho de Melo's avatar
      perf tests: Pass the subtest index to each test routine · 721a1f53
      Arnaldo Carvalho de Melo authored
      Some tests have sub-tests we want to run, so allow passing the subtest
      index to the test routine.
      
      Wang tried to avoid having to touch all tests, but then, having the
      test.func in an anonymous union makes the build fail on older compilers,
      like the one in RHEL6, where:
      
        test a = {
      	.func = foo,
        };
      
      fails.
      
      To fix it, leave the func pointer in the main structure and pass the
      subtest index to all tests. The end result is the same, but we have just
      one function pointer rather than two (with and without the subtest index
      as an argument).
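
      The resulting layout can be sketched as follows. This is a simplified
      illustration, not the actual perf source: the struct fields, test
      functions and table entries are hypothetical.

```c
/*
 * Hypothetical, simplified test table entry: a single function pointer in
 * the main structure, always called with a subtest index.
 */
struct test {
	const char *desc;
	int (*func)(int subtest);
};

/* A test without sub-tests simply ignores the index. */
static int test_plain(int subtest)
{
	(void)subtest;
	return 0;
}

/* A test with sub-tests dispatches on the index. */
static int test_with_subtests(int subtest)
{
	switch (subtest) {
	case 0:
		return 0;	/* first sub-test passes */
	case 1:
		return -1;	/* second sub-test fails */
	default:
		return -2;	/* unknown sub-test */
	}
}

/* .func is a plain member here, not a member of an anonymous union. */
static struct test tests[] = {
	{ .desc = "plain test",         .func = test_plain },
	{ .desc = "test with subtests", .func = test_with_subtests },
};
```

      Every entry is initialized the same way, and callers always pass an
      index, whether or not the test uses it.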
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-5genj0ficwdmelpoqlds0u4y@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      721a1f53
  2. 18 Nov, 2015 16 commits
    • Wang Nan's avatar
      perf bpf: Use same BPF program if arguments are identical · d35b3289
      Wang Nan authored
      This patch allows creating only one BPF program for different
      'probe_trace_event'(tev) entries generated by one
      'perf_probe_event'(pev) if their prologues are identical.
      
      This is done by comparing the argument lists of the different tev
      instances and mapping each tev to a prologue type using a mapping array.
      This patch uses qsort to sort the tevs; after sorting, tevs with
      identical argument lists are grouped together.
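
      A minimal sketch of the sort-then-group idea follows. It is an
      illustration only: struct tev_model, compare_args() and
      map_prologue_types() are hypothetical names, and a tev is reduced to
      its list of argument strings.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_ARGS 3

/* Hypothetical stand-in for a probe_trace_event: only its argument list. */
struct tev_model {
	int nargs;
	const char *args[MAX_ARGS];
};

/* Array the qsort callback indexes into (not thread-safe, sketch only). */
static const struct tev_model *sort_base;

/* Order two argument lists: first by length, then argument by argument. */
static int compare_args(const struct tev_model *a, const struct tev_model *b)
{
	int i, ret;

	if (a->nargs != b->nargs)
		return a->nargs < b->nargs ? -1 : 1;
	for (i = 0; i < a->nargs; i++) {
		ret = strcmp(a->args[i], b->args[i]);
		if (ret)
			return ret;
	}
	return 0;
}

/* qsort callback: compares two tev indices through sort_base. */
static int compare_idx(const void *pa, const void *pb)
{
	return compare_args(&sort_base[*(const int *)pa],
			    &sort_base[*(const int *)pb]);
}

/*
 * Fill type_map[i] with the prologue type of tevs[i]; tevs with identical
 * argument lists get the same type. Returns the number of distinct types,
 * or -1 on allocation failure.
 */
static int map_prologue_types(const struct tev_model *tevs, int n,
			      int *type_map)
{
	int *order, i, type = 0;

	if (n <= 0)
		return 0;
	order = malloc(n * sizeof(*order));
	if (!order)
		return -1;
	for (i = 0; i < n; i++)
		order[i] = i;

	sort_base = tevs;
	qsort(order, n, sizeof(*order), compare_idx);

	/* After sorting, equal argument lists are adjacent. */
	for (i = 0; i < n; i++) {
		if (i > 0 && compare_args(&tevs[order[i - 1]], &tevs[order[i]]))
			type++;
		type_map[order[i]] = type;
	}
	free(order);
	return type + 1;
}
```

      Each distinct type then needs only one prologue, and so only one BPF
      program instance.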
      
      Test result:
      
      Sample BPF program:
      
        #define SEC(NAME) __attribute__((section(NAME), used))
        SEC("inlines=no;"
            "func=SyS_dup? oldfd")
        int func(void *ctx)
        {
            return 1;
        }
      
      It would probe at SyS_dup2 and SyS_dup3, obtaining oldfd as its
      argument.
      
      The following cmdline shows a BPF program being loaded into the kernel
      by perf:
      
       # perf record -e ./test_bpf_arg.c sleep 4 & sleep 1 && ls /proc/$!/fd/ -l | grep bpf-prog
      
      Before this patch:
      
        # perf record -e ./test_bpf_arg.c sleep 4 & sleep 1 && ls /proc/$!/fd/ -l | grep bpf-prog
        [1] 24858
        lrwx------ 1 root root 64 Nov 14 04:09 3 -> anon_inode:bpf-prog
        lrwx------ 1 root root 64 Nov 14 04:09 4 -> anon_inode:bpf-prog
        ...
      
      After this patch:
      
        # perf record -e ./test_bpf_arg.c sleep 4 & sleep 1 && ls /proc/$!/fd/ -l | grep bpf-prog
        [1] 25699
        lrwx------ 1 root root 64 Nov 14 04:10 3 -> anon_inode:bpf-prog
        ...
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1447749170-175898-3-git-send-email-wangnan0@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      d35b3289
    • Wang Nan's avatar
      perf test: Fix 'perf test BPF' when it fails to find a suitable vmlinux · ad0dd7ae
      Wang Nan authored
      Two bugs in 'perf test BPF' are found when testing BPF prologue without
      vmlinux:
      
       # mv /lib/modules/4.3.0-rc4+/build/vmlinux{,.bak}
       # ./perf test BPF
       37: Test BPF filter             :Failed to find the path for kernel: No such file or directory
       Ok
      
      Test BPF should fail in this case.
      
      After this patch:
      
       # ./perf test BPF
       37: Test BPF filter             :Failed to find the path for kernel: No such file or directory
        FAILED!
       # mv /lib/modules/4.3.0-rc4+/build/vmlinux{.bak,}
       # ./perf test BPF
       37: Test BPF filter             : Ok
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1447749170-175898-2-git-send-email-wangnan0@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      ad0dd7ae
    • Wang Nan's avatar
      perf test: Test the BPF prologue adding infrastructure · bbb7d492
      Wang Nan authored
      This patch introduces a new BPF script to test the BPF prologue adding
      routines. The new script probes at null_lseek, which is the function pointer
      used when we try to lseek on '/dev/null'.
      
      The null_lseek function is chosen because it is called through a function
      pointer, so we don't need to consider inlining and LTO.
      
      By extracting file->f_mode, bpf-script-test-prologue.c should know whether the
      file is writable or readonly. According to llseek_loop() and
      bpf-script-test-prologue.c, one fourth of total lseeks should be collected.
      
      Committer note:
      
      Testing it:
      
        # perf test -v BPF
        <SNIP>
        Kernel build dir is set to /lib/modules/4.3.0+/build
        set env: KBUILD_DIR=/lib/modules/4.3.0+/build
        unset env: KBUILD_OPTS
        include option is set to  -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/4.9.2/include -I/home/git/linux/arch/x86/include -Iarch/x86/include/generated/uapi -Iarch/x86/include/generated  -I/home/git/linux/include -Iinclude -I/home/git/linux/arch/x86/include/uapi -Iarch/x86/include/generated/uapi -I/home/git/linux/include/uapi -Iinclude/generated/uapi -include /home/git/linux/include/linux/kconfig.h
        set env: NR_CPUS=4
        set env: LINUX_VERSION_CODE=0x40300
        set env: CLANG_EXEC=/usr/libexec/icecc/bin/clang
        set env: CLANG_OPTIONS=-xc
        set env: KERNEL_INC_OPTIONS= -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/4.9.2/include -I/home/git/linux/arch/x86/include -Iarch/x86/include/generated/uapi -Iarch/x86/include/generated  -I/home/git/linux/include -Iinclude -I/home/git/linux/arch/x86/include/uapi -Iarch/x86/include/generated/uapi -I/home/git/linux/include/uapi -Iinclude/generated/uapi -include /home/git/linux/include/linux/kconfig.h
        set env: WORKING_DIR=/lib/modules/4.3.0+/build
        set env: CLANG_SOURCE=-
        llvm compiling command template: echo '/*
         * bpf-script-test-prologue.c
         * Test BPF prologue
         */
        #ifndef LINUX_VERSION_CODE
        # error Need LINUX_VERSION_CODE
        # error Example: for 4.2 kernel, put 'clang-opt="-DLINUX_VERSION_CODE=0x40200" into llvm section of ~/.perfconfig'
        #endif
        #define SEC(NAME) __attribute__((section(NAME), used))
      
        #include <uapi/linux/fs.h>
      
        #define FMODE_READ		0x1
        #define FMODE_WRITE		0x2
      
        static void (*bpf_trace_printk)(const char *fmt, int fmt_size, ...) =
      	  (void *) 6;
      
        SEC("func=null_lseek file->f_mode offset orig")
        int bpf_func__null_lseek(void *ctx, int err, unsigned long f_mode,
      			   unsigned long offset, unsigned long orig)
        {
      	  if (err)
      		  return 0;
      	  if (f_mode & FMODE_WRITE)
      		  return 0;
      	  if (offset & 1)
      		  return 0;
      	  if (orig == SEEK_CUR)
      		  return 0;
      	  return 1;
        }
      
        char _license[] SEC("license") = "GPL";
        int _version SEC("version") = LINUX_VERSION_CODE;
        ' | $CLANG_EXEC -D__KERNEL__ -D__NR_CPUS__=$NR_CPUS -DLINUX_VERSION_CODE=$LINUX_VERSION_CODE $CLANG_OPTIONS $KERNEL_INC_OPTIONS -Wno-unused-value -Wno-pointer-sign -working-directory $WORKING_DIR -c "$CLANG_SOURCE" -target bpf -O2 -o -
        libbpf: loading object '[bpf_prologue_test]' from buffer
        libbpf: section .strtab, size 135, link 0, flags 0, type=3
        libbpf: section .text, size 0, link 0, flags 6, type=1
        libbpf: section .data, size 0, link 0, flags 3, type=1
        libbpf: section .bss, size 0, link 0, flags 3, type=8
        libbpf: section func=null_lseek file->f_mode offset orig, size 112, link 0, flags 6, type=1
        libbpf: found program func=null_lseek file->f_mode offset orig
        libbpf: section license, size 4, link 0, flags 3, type=1
        libbpf: license of [bpf_prologue_test] is GPL
        libbpf: section version, size 4, link 0, flags 3, type=1
        libbpf: kernel version of [bpf_prologue_test] is 40300
        libbpf: section .symtab, size 168, link 1, flags 0, type=2
        bpf: config program 'func=null_lseek file->f_mode offset orig'
        symbol:null_lseek file:(null) line:0 offset:0 return:0 lazy:(null)
        parsing arg: file->f_mode into file, f_mode(1)
        parsing arg: offset into offset
        parsing arg: orig into orig
        bpf: config 'func=null_lseek file->f_mode offset orig' is ok
        Looking at the vmlinux_path (7 entries long)
        Using /lib/modules/4.3.0+/build/vmlinux for symbols
        Open Debuginfo file: /lib/modules/4.3.0+/build/vmlinux
        Try to find probe point from debuginfo.
        Matched function: null_lseek
        Probe point found: null_lseek+0
        Searching 'file' variable in context.
        Converting variable file into trace event.
        converting f_mode in file
        f_mode type is unsigned int.
        Searching 'offset' variable in context.
        Converting variable offset into trace event.
        offset type is long long int.
        Searching 'orig' variable in context.
        Converting variable orig into trace event.
        orig type is int.
        Found 1 probe_trace_events.
        Opening /sys/kernel/debug/tracing//kprobe_events write=1
        Writing event: p:perf_bpf_probe/func _text+4840528 f_mode=+68(%di):u32 offset=%si:s64 orig=%dx:s32
        libbpf: don't need create maps for [bpf_prologue_test]
        prologue: pass validation
        prologue: slow path
        prologue: fetch arg 0, base reg is %di
        prologue: arg 0: offset 68
        prologue: fetch arg 1, base reg is %si
        prologue: fetch arg 2, base reg is %dx
        add bpf event perf_bpf_probe:func and attach bpf program 3
        adding perf_bpf_probe:func
        adding perf_bpf_probe:func to 0x51672c0
        mmap size 1052672B
        Opening /sys/kernel/debug/tracing//kprobe_events write=1
        Opening /sys/kernel/debug/tracing//uprobe_events write=1
        Parsing probe_events: p:perf_bpf_probe/func _text+4840528 f_mode=+68(%di):u32 offset=%si:s64 orig=%dx:s32
        Group:perf_bpf_probe Event:func probe:p
        Writing event: -:perf_bpf_probe/func
        test child finished with 0
        ---- end ----
        Test BPF filter: Ok
        #
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1447675815-166222-13-git-send-email-wangnan0@huawei.com
      [ Added tools/perf/tests/llvm-src-prologue.c to .gitignore ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      bbb7d492
    • Wang Nan's avatar
      perf bpf: Generate prologue for BPF programs · a08357d8
      Wang Nan authored
      This patch generates a prologue for each 'struct probe_trace_event' for
      fetching arguments for BPF programs.
      
      After bpf__probe(), iterate over each program to check whether prologues are
      required. If none of the 'struct perf_probe_event' entries a program will
      attach to has at least one argument, simply skip preprocessor hooking. For
      those where a prologue is required, call bpf__gen_prologue() and paste the
      original instructions after the prologue.
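
      The "paste the original instructions after the prologue" step can be
      sketched as below. This is a simplified illustration: insn_t and
      build_instance() are hypothetical, and a plain 64-bit word stands in
      for a real BPF instruction.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 8-byte instruction word standing in for a BPF instruction. */
typedef uint64_t insn_t;

/*
 * Build one program instance in 'out': the prologue first, then the
 * original instructions unchanged. With no prologue (prologue_cnt == 0)
 * the instance is just a copy of the original program.
 * Returns the new instruction count, or -1 if 'out' is too small.
 */
static int build_instance(const insn_t *prologue, size_t prologue_cnt,
			  const insn_t *orig, size_t orig_cnt,
			  insn_t *out, size_t out_cap)
{
	if (prologue_cnt + orig_cnt > out_cap)
		return -1;
	if (prologue_cnt)
		memcpy(out, prologue, prologue_cnt * sizeof(*out));
	if (orig_cnt)
		memcpy(out + prologue_cnt, orig, orig_cnt * sizeof(*out));
	return (int)(prologue_cnt + orig_cnt);
}
```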
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1447675815-166222-12-git-send-email-wangnan0@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      a08357d8
    • He Kuang's avatar
      perf bpf: Add prologue for BPF programs for fetching arguments · bfc077b4
      He Kuang authored
      This patch generates a prologue for a BPF program which fetches arguments for
      it.  With this patch, the program can have arguments as follows:
      
        SEC("lock_page=__lock_page page->flags")
        int lock_page(struct pt_regs *ctx, int err, unsigned long flags)
        {
       	 return 1;
        }
      
      This patch passes at most 3 arguments from r3, r4 and r5. r1 is still the ctx
      pointer. r2 is used to indicate if dereferencing was done successfully.
      
      This patch uses r6 to hold ctx (struct pt_regs) and r7 to hold the stack
      pointer for the result. The result of each argument is first stored on the
      stack:
      
       low address
       BPF_REG_FP - 24  ARG3
       BPF_REG_FP - 16  ARG2
       BPF_REG_FP - 8   ARG1
       BPF_REG_FP
       high address
      
      Then loaded into r3, r4 and r5.
      
      The output prologue for offn(...off2(off1(reg)))) should be:
      
           r6 <- r1			// save ctx into a callee saved register
           r7 <- fp
           r7 <- r7 - stack_offset	// pointer to result slot
           /* load r3 with the offset in pt_regs of 'reg' */
           (r7) <- r3			// make slot valid
           r3 <- r3 + off1		// prepare to read unsafe pointer
           r2 <- 8
           r1 <- r7			// result put onto stack
           call probe_read		// read unsafe pointer
           jnei r0, 0, err		// error checking
           r3 <- (r7)			// read result
           r3 <- r3 + off2		// prepare to read unsafe pointer
           r2 <- 8
           r1 <- r7
           call probe_read
           jnei r0, 0, err
           ...
           /* load r3, r4, r5 from stack */
           goto success
      err:
           r2 <- 1
           /* load r3, r4, r5 with 0 */
           goto usercode
      success:
           r2 <- 0
      usercode:
           r1 <- r6	// restore ctx
           // original user code
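
      The dereference chain above can be modelled in user space as follows.
      This is only a sketch under stated assumptions: probe_read_model() is a
      hypothetical stand-in for the probe_read helper (here it simply treats
      addresses in the first 4096 bytes as unreadable), and
      fetch_arg_model() handles a single argument.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * User-space stand-in for probe_read (an assumption for illustration):
 * copies 'size' bytes from address 'src', and fails for addresses in the
 * first page so that a NULL pointer in the chain is reported as an error.
 */
static int probe_read_model(void *dst, size_t size, uint64_t src)
{
	if (src < 4096)
		return -1;
	memcpy(dst, (const void *)(uintptr_t)src, size);
	return 0;
}

/*
 * Model of the slow path for one argument: start from the register value
 * and, for each offset, add it and read 8 bytes through the result slot.
 * Returns the value r2 would carry: 0 on success, 1 if any read failed
 * (in which case *arg is zeroed, like r3..r5 on the error path).
 */
static int fetch_arg_model(uint64_t reg, const size_t *offs, int noffs,
			   uint64_t *arg)
{
	uint64_t slot = reg;	/* (r7) <- r3: make slot valid */
	int i;

	for (i = 0; i < noffs; i++) {
		uint64_t addr = slot + offs[i];	/* r3 <- r3 + offN */

		if (probe_read_model(&slot, sizeof(slot), addr)) {
			*arg = 0;
			return 1;
		}
	}
	*arg = slot;
	return 0;
}

/* Example data: outer.inner holds the address of a struct with the value. */
struct inner_model {
	uint64_t pad;
	uint64_t value;
};

struct outer_model {
	uint64_t pad[2];
	uint64_t inner;
};
```

      With offsets { offsetof(struct outer_model, inner),
      offsetof(struct inner_model, value) } the model follows the pointer in
      'inner' and returns 'value'; a zero 'inner' takes the error path.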
      
      If all of the arguments reside in registers (dereferencing is not
      required), gen_prologue_fastpath() will be used to create a
      fast prologue:
      
           r3 <- (r1 + offset of reg1)
           r4 <- (r1 + offset of reg2)
           r5 <- (r1 + offset of reg3)
           r2 <- 0
      
      P.S.
      
      eBPF calling convention is defined as:
      
      * r0		- return value from in-kernel function, and exit value
                        for eBPF program
      * r1 - r5	- arguments from eBPF program to in-kernel function
      * r6 - r9	- callee saved registers that in-kernel function will
                        preserve
      * r10		- read-only frame pointer to access stack
      
      Committer note:
      
      At least testing if it builds and loads:
      
        # cat test_probe_arg.c
        struct pt_regs;
      
        __attribute__((section("lock_page=__lock_page page->flags"), used))
        int func(struct pt_regs *ctx, int err, unsigned long flags)
        {
        	return 1;
        }
      
        char _license[] __attribute__((section("license"), used)) = "GPL";
        int _version __attribute__((section("version"), used)) = 0x40300;
        # perf record -e ./test_probe_arg.c usleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.016 MB perf.data ]
        # perf evlist
        perf_bpf_probe:lock_page
        #
      Signed-off-by: He Kuang <hekuang@huawei.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1447675815-166222-11-git-send-email-wangnan0@huawei.com
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      bfc077b4
    • Wang Nan's avatar
      perf bpf: Allow BPF program config probing options · 03e01f56
      Wang Nan authored
      By extending the syntax of BPF object section names, this patch allows users to
      configure probing options, as they can in 'perf probe'.
      
      The error message in 'perf probe' is also updated.
      
      Test result:
      
      For the following BPF file, test_probe_glob.c:
      
        # cat test_probe_glob.c
        __attribute__((section("inlines=no;func=SyS_dup?"), used))
      
        int func(void *ctx)
        {
      	  return 1;
        }
      
        char _license[] __attribute__((section("license"), used)) = "GPL";
        int _version __attribute__((section("version"), used)) = 0x40300;
        #
        # ./perf record  -e ./test_probe_glob.c ls /
        ...
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.013 MB perf.data ]
        # ./perf evlist
        perf_bpf_probe:func_1
        perf_bpf_probe:func
      
      After changing "inlines=no" to "inlines=yes":
      
        # ./perf record  -e ./test_probe_glob.c ls /
        ...
        [ perf record: Woken up 2 times to write data ]
        [ perf record: Captured and wrote 0.013 MB perf.data ]
        # ./perf evlist
        perf_bpf_probe:func_3
        perf_bpf_probe:func_2
        perf_bpf_probe:func_1
        perf_bpf_probe:func
      
      Then test 'force':
      
      Use the following program:
      
        # cat test_probe_force.c
        __attribute__((section("func=sys_write"), used))
      
        int funca(void *ctx)
        {
      	  return 1;
        }
      
        __attribute__((section("force=yes;func=sys_write"), used))
      
        int funcb(void *ctx)
        {
        	return 1;
        }
      
        char _license[] __attribute__((section("license"), used)) = "GPL";
        int _version __attribute__((section("version"), used)) = 0x40300;
        #
      
        # perf record -e ./test_probe_force.c usleep 1
        Error: event "func" already exists.
         Hint: Remove existing event by 'perf probe -d'
             or force duplicates by 'perf probe -f'
             or set 'force=yes' in BPF source.
        event syntax error: './test_probe_force.c'
                             \___ Probe point exist. Try 'perf probe -d "*"' and set 'force=yes'
      
        (add -v to see detail)
        ...
      
      Then replace 'force=no' with 'force=yes':
      
        # vim test_probe_force.c
        # perf record -e ./test_probe_force.c usleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.017 MB perf.data ]
        # perf evlist
        perf_bpf_probe:func_1
        perf_bpf_probe:func
        #
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1447675815-166222-7-git-send-email-wangnan0@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      03e01f56
    • Wang Nan's avatar
      perf bpf: Allow attaching BPF programs to modules symbols · 5dbd16c0
      Wang Nan authored
      By extending the syntax of BPF object section names, this patch allows
      users to attach BPF programs to symbols in modules. For example:
      
        SEC("module=i915;"
            "parse_cmds=i915_parse_cmds")
        int parse_cmds(void *ctx)
        {
            return 1;
        }
      
      The implementation is very simple: as 'perf probe' does for modules, fill
      the 'uprobe' field in 'struct perf_probe_event'. The other parts are done
      automatically.
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kaixu Xia <xiakaixu@huawei.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1447675815-166222-5-git-send-email-wangnan0@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      5dbd16c0
    • Wang Nan's avatar
      perf bpf: Allow BPF program attach to uprobe events · 361f2b1d
      Wang Nan authored
      This patch adds a new syntax to the BPF object section name to support
      probing at uprobe events. Now we can use a BPF program like this:
      
        SEC(
        "exec=/lib64/libc.so.6;"
        "libcwrite=__write"
        )
        int libcwrite(void *ctx)
        {
            return 1;
        }
      
      In the section name of a program, 'key=value' style options can be placed
      before the main config string. For now the only option key is "exec",
      for uprobes.
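
      The section-name layout can be illustrated with a small parser. This
      is a simplified sketch, not perf's actual parser: struct sec_config
      and parse_section_name() are hypothetical, and the sketch assumes
      that every ';'-terminated piece is an option and the final piece is
      the main config string.

```c
#include <string.h>

#define MAX_OPTS 8

/* Hypothetical parsed form of a BPF section name. */
struct sec_config {
	int nopts;
	char *keys[MAX_OPTS];
	char *values[MAX_OPTS];
	char *main_config;	/* e.g. "libcwrite=__write" */
};

/*
 * Split 'buf' in place at each ';'. Every piece before the last one must
 * be a 'key=value' option; the last piece is the main config string.
 * Returns 0 on success, -1 on a malformed option or too many options.
 */
static int parse_section_name(char *buf, struct sec_config *cfg)
{
	char *sep, *eq;

	cfg->nopts = 0;
	while ((sep = strchr(buf, ';')) != NULL) {
		*sep = '\0';
		eq = strchr(buf, '=');
		if (!eq || eq == buf || cfg->nopts == MAX_OPTS)
			return -1;
		*eq = '\0';
		cfg->keys[cfg->nopts] = buf;
		cfg->values[cfg->nopts] = eq + 1;
		cfg->nopts++;
		buf = sep + 1;
	}
	cfg->main_config = buf;
	return 0;
}
```

      For "exec=/lib64/libc.so.6;libcwrite=__write" this yields one option
      (exec) and the main config string "libcwrite=__write".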
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1447675815-166222-4-git-send-email-wangnan0@huawei.com
      [ Changed the separator from \n to ; ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      361f2b1d
    • Wang Nan's avatar
      perf bpf: Compile dwarf-regs.c if CONFIG_BPF_PROLOGUE is on · 30433a3a
      Wang Nan authored
      regs_query_register_offset() in dwarf-regs.c is required by BPF
      prologue.  This patch compiles it if CONFIG_BPF_PROLOGUE is on to avoid
      build failure when CONFIG_BPF_PROLOGUE is on but CONFIG_DWARF is not
      set.
      Signed-off-by: He Kuang <hekuang@huawei.com>
      Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1447675815-166222-10-git-send-email-wangnan0@huawei.com
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      30433a3a
    • Wang Nan's avatar
      perf bpf: Add BPF_PROLOGUE config options for further patches · 1c0ed632
      Wang Nan authored
      If both LIBBPF and DWARF are detected, it is possible to create a prologue
      for eBPF programs to help them access kernel data. HAVE_BPF_PROLOGUE and
      CONFIG_BPF_PROLOGUE are added as flags for this feature.
      
      PERF_HAVE_ARCH_REGS_QUERY_REGISTER_OFFSET was introduced in commit
      63ab024a ("perf tools: regs_query_register_offset() infrastructure") and
      indicates that an architecture supports converting the name of a register
      to its offset in 'struct pt_regs'. Without this support, BPF_PROLOGUE
      should be turned off.
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1447675815-166222-9-git-send-email-wangnan0@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      1c0ed632
    • Wang Nan's avatar
      bpf tools: Load a program with different instances using preprocessor · b580563e
      Wang Nan authored
      This patch is a preparation for BPF prologue support which allows
      generating a series of BPF bytecode for fetching kernel data before
      calling program code. With the newly introduced multiple instances
      support, perf is able to create different prologues for different kprobe
      points.
      
      Before this patch, a bpf_program could be loaded into the kernel only
      once, yielding a single fd. What this patch does is to allow creating
      and loading different variants of one bpf_program, then fetching their
      fds.
      
      Here we describe the basic idea in this patch. The detailed description
      of the newly introduced APIs can be found in comments in the patch body.
      
      The key of this patch is the new mechanism in bpf_program__load().
      Instead of loading the BPF program into the kernel directly, it calls a
      'pre-processor' to generate program instances, based on the original
      code, which are what is finally loaded into the kernel. To enable the
      generation of multiple instances, libbpf passes an index to the
      pre-processor so it knows which instance is being loaded.
      
      The pre-processor should be registered by libbpf's user (perf) using
      bpf_program__set_prep(). The number of instances and the relationship
      between indices and the target instance should be clear when calling
      bpf_program__set_prep().
      
      To retrieve the fd for a specific instance of a program,
      bpf_program__nth_fd() is introduced. It returns the fd of the instance
      at the given index.
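
One way to picture the mechanism described above is the following toy model in C. It is a sketch only: every `toy_` name, the fake fd counter and the fixed-size buffers are assumptions made for illustration and are not libbpf's types or signatures; the real API is documented in the patch body.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_INSTANCES 8
#define TOY_MAX_INSNS     16

/* Toy stand-in for an instruction; not the kernel's struct bpf_insn. */
struct toy_insn { int opcode; int imm; };

/*
 * Hypothetical pre-processor callback: given the instance index and the
 * original code, write the code for that instance into 'out'.
 * Returns the number of instructions written, or -1 on error.
 */
typedef int (*toy_prep_t)(int index, const struct toy_insn *orig, int orig_cnt,
			  struct toy_insn *out, int out_cap);

struct toy_program {
	const struct toy_insn *insns;
	int insns_cnt;
	toy_prep_t prep;
	int nr_instances;
	int fds[TOY_MAX_INSTANCES];
};

static int toy_next_fd = 100;	/* fake fds, handed out sequentially */

/* Stands in for the real load into the kernel. */
static int toy_kernel_load(const struct toy_insn *insns, int cnt)
{
	(void)insns;
	return cnt > 0 ? toy_next_fd++ : -1;
}

/* Register the pre-processor and the number of instances it will produce. */
int toy_program_set_prep(struct toy_program *prog, int nr, toy_prep_t prep)
{
	int i;

	assert(prog != NULL);
	if (nr <= 0 || nr > TOY_MAX_INSTANCES || !prep)
		return -1;
	prog->prep = prep;
	prog->nr_instances = nr;
	for (i = 0; i < nr; i++)
		prog->fds[i] = -1;
	return 0;
}

/* Load once without a pre-processor, or once per instance with one. */
int toy_program_load(struct toy_program *prog)
{
	struct toy_insn buf[TOY_MAX_INSNS];
	int i, cnt;

	assert(prog != NULL);
	if (!prog->prep) {
		prog->nr_instances = 1;
		prog->fds[0] = toy_kernel_load(prog->insns, prog->insns_cnt);
		return prog->fds[0] < 0 ? -1 : 0;
	}
	for (i = 0; i < prog->nr_instances; i++) {
		cnt = prog->prep(i, prog->insns, prog->insns_cnt,
				 buf, TOY_MAX_INSNS);
		if (cnt < 0)
			return -1;
		prog->fds[i] = toy_kernel_load(buf, cnt);
		if (prog->fds[i] < 0)
			return -1;
	}
	return 0;
}

/* Fetch the fd of the n-th instance, or -1 if n is out of range. */
int toy_program_nth_fd(const struct toy_program *prog, int n)
{
	assert(prog != NULL);
	if (n < 0 || n >= prog->nr_instances)
		return -1;
	return prog->fds[n];
}

/* Example pre-processor: prepend one "prologue" insn tagged with the index. */
int toy_prologue_prep(int index, const struct toy_insn *orig, int orig_cnt,
		      struct toy_insn *out, int out_cap)
{
	if (orig_cnt + 1 > out_cap)
		return -1;
	out[0].opcode = 0;
	out[0].imm = index;
	memcpy(&out[1], orig, (size_t)orig_cnt * sizeof(*orig));
	return orig_cnt + 1;
}
```

In this model a caller would register `toy_prologue_prep` with an instance count, call `toy_program_load()` once, and then fetch one fd per instance with `toy_program_nth_fd()`.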
      Signed-off-by: He Kuang <hekuang@huawei.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1447675815-166222-8-git-send-email-wangnan0@huawei.com
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      [ Enclosed multi-line if/else blocks with {}, (*func_ptr)() -> func_ptr() ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • Wang Nan's avatar
      tools: Clone the kernel's strtobool function · 7d85c434
      Wang Nan authored
      Copying it to tools/lib/string.c, the counterpart to the kernel's
      lib/string.c.
      
      This is preparation for enhancing BPF program configuration, which will
      allow config strings such as 'inlines=yes'.
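
To illustrate how a boolean config string such as 'inlines=yes' could be parsed with a strtobool-style helper, here is a minimal sketch. The names `sketch_strtobool` and `sketch_parse_bool_option` are hypothetical, and the first-character rule ('y', 'Y', '1' for true; 'n', 'N', '0' for false) is an assumption about the helper's behaviour, not a copy of the kernel source.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Sketch: decide true/false from the first character only (assumed rule). */
int sketch_strtobool(const char *s, bool *res)
{
	assert(res != NULL);
	if (!s)
		return -EINVAL;

	switch (s[0]) {
	case 'y':
	case 'Y':
	case '1':
		*res = true;
		return 0;
	case 'n':
	case 'N':
	case '0':
		*res = false;
		return 0;
	default:
		return -EINVAL;
	}
}

/* Hypothetical helper: parse "key=value" where value is a boolean. */
int sketch_parse_bool_option(const char *opt, const char *key, bool *res)
{
	size_t klen;

	assert(opt != NULL && key != NULL);
	klen = strlen(key);
	if (strncmp(opt, key, klen) != 0 || opt[klen] != '=')
		return -EINVAL;
	return sketch_strtobool(opt + klen + 1, res);
}
```

With this sketch, "inlines=yes" sets the result to true, "inlines=no" sets it to false, and any other value or key is rejected with -EINVAL.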
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Jonathan Cameron <jic23@cam.ac.uk>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1447675815-166222-6-git-send-email-wangnan0@huawei.com
      [ Copied it to tools/lib/string.c instead, to make it usable by other tools/ ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • Arnaldo Carvalho de Melo's avatar
      tools: Adopt memdup() from tools/perf, moving it to tools/lib/string.c · 4ddd3274
      Arnaldo Carvalho de Melo authored
      That will contain more string functions with counterparts, sometimes
      verbatim copies, in the kernel.
      Acked-by: Wang Nan <wangnan0@huawei.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/n/tip-rah6g97kn21vfgmlramorz6o@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • Kevin Hilman's avatar
      tools: Fix selftests_install Makefile rule · 9a13c658
      Kevin Hilman authored
      Fix copy/paste error in selftests_install rule which was copy-pasted
      from the clean rule but not properly changed.
      Signed-off-by: Kevin Hilman <khilman@linaro.org>
      Cc: Bamvor Jian Zhang <bamvor.zhangjian@linaro.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Jonathan Cameron <jic23@kernel.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Pali Rohar <pali.rohar@gmail.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Roberta Dobrescu <roberta.dobrescu@gmail.com>
      Cc: Shuah Khan <shuahkh@osg.samsung.com>
      Cc: linaro-kernel@lists.linaro.org
      Link: http://lkml.kernel.org/r/1447797261-1775-1-git-send-email-khilman@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • Arnaldo Carvalho de Melo's avatar
      perf test: Fix build of BPF and LLVM on older glibc libraries · 916d4092
      Arnaldo Carvalho de Melo authored
        $ rpm -q glibc
        glibc-2.12-1.166.el6_7.1.x86_64
      
      <SNIP>
          CC       /tmp/build/perf/tests/llvm.o
        cc1: warnings being treated as errors
        tests/llvm.c: In function ‘test_llvm__fetch_bpf_obj’:
        tests/llvm.c:53: error: declaration of ‘index’ shadows a global declaration
        /usr/include/string.h:489: error: shadowed declaration is here
      <SNIP>
          CC       /tmp/build/perf/tests/bpf.o
        cc1: warnings being treated as errors
        tests/bpf.c: In function ‘__test__bpf’:
        tests/bpf.c:149: error: declaration of ‘index’ shadows a global declaration
        /usr/include/string.h:489: error: shadowed declaration is here
      <SNIP>
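
The errors above arise because this glibc's <string.h> declares index(), so a local variable or parameter named 'index' shadows it, and the build treats that warning as an error. A minimal sketch of the usual remedy, renaming the local, follows; `pick_entry` and `idx` are hypothetical names chosen for illustration and are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>	/* on older glibc this also declares index() */

/*
 * Naming the last parameter 'index' would shadow the index() declared
 * in <string.h> on such systems; a different name avoids the warning.
 */
const char *pick_entry(const char *const *table, size_t nr, size_t idx)
{
	assert(table != NULL);
	return idx < nr ? table[idx] : NULL;
}
```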
      
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: pi3orama@163.com
      Cc: Wang Nan <wangnan0@huawei.com>
      Cc: Zefan Li <lizefan@huawei.com>
      Fixes: b31de018 ("perf test: Enhance the LLVM test: update basic BPF test program")
      Fixes: ba1fae43 ("perf test: Add 'perf test BPF'")
      Link: http://lkml.kernel.org/n/tip-akpo4r750oya2phxoh9e3447@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • Ingo Molnar's avatar
      Merge tag 'perf-urgent-for-mingo' of... · e15bf88a
      Ingo Molnar authored
      Merge tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
      
      Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
      
        - Do not change the key of an object in a rbtree, this time it was
          the one for DSOs lookup by its long_name, and the noticed symptom was
          with 'perf buildid-list --with-hits' (Adrian Hunter)
      
        - 'perf inject' is a pipe, events it doesn't touch should be passed
          on, PERF_RECORD_LOST wasn't, fix it (Adrian Hunter)
      
        - Make 'perf buildid-list' request event ordering, as it needs to
          first get the mmap events to be able to mark which DSOs had hits
          (Adrian Hunter)
      
        - Fix memory leaks on failure in 'perf probe' (Masami Hiramatsu, Wang Nan)
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  3. 13 Nov, 2015 5 commits
    • Wang Nan's avatar
      perf probe: Clear probe_trace_event when add_probe_trace_event() fails · 092b1f0b
      Wang Nan authored
      When probing with a glob, errors in add_probe_trace_event() won't be
      passed to debuginfo__find_trace_events() because the return value is
      modified by probe_point_search_cb(). This causes a segfault if perf
      fails to find an argument for a probe point matched by the glob. For
      example:
      
        # ./perf probe -v -n 'SyS_dup? oldfd'
        probe-definition(0): SyS_dup? oldfd
        symbol:SyS_dup? file:(null) line:0 offset:0 return:0 lazy:(null)
        parsing arg: oldfd into oldfd
        1 arguments
        Looking at the vmlinux_path (7 entries long)
        Using /lib/modules/4.3.0-rc4+/build/vmlinux for symbols
        Open Debuginfo file: /lib/modules/4.3.0-rc4+/build/vmlinux
        Try to find probe point from debuginfo.
        Matched function: SyS_dup3
        found inline addr: 0xffffffff812095c0
        Probe point found: SyS_dup3+0
        Searching 'oldfd' variable in context.
        Converting variable oldfd into trace event.
        oldfd type is long int.
        found inline addr: 0xffffffff812096d4
        Probe point found: SyS_dup2+36
        Searching 'oldfd' variable in context.
        Failed to find 'oldfd' in this function.
        Matched function: SyS_dup3
        Probe point found: SyS_dup3+0
        Searching 'oldfd' variable in context.
        Converting variable oldfd into trace event.
        oldfd type is long int.
        Matched function: SyS_dup2
        Probe point found: SyS_dup2+0
        Searching 'oldfd' variable in context.
        Converting variable oldfd into trace event.
        oldfd type is long int.
        Found 4 probe_trace_events.
        Opening /sys/kernel/debug/tracing//kprobe_events write=1
        Writing event: p:probe/SyS_dup3 _text+2135488 oldfd=%di:s64
        Segmentation fault (core dumped)
        #
      
      This patch ensures that add_probe_trace_event() doesn't touch
      tf->ntevs and tf->tevs if the functions it calls fail.
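
The general pattern behind such a fix can be sketched as follows: do all fallible work on a temporary, and update the caller-visible count and array only once everything has succeeded. This is a toy illustration, not the patch itself; `toy_finder`, `toy_event`, `toy_add_event` and the `arg_found` flag are hypothetical stand-ins.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct toy_event { char *name; };

struct toy_finder {
	struct toy_event *tevs;	/* preallocated array of max_tevs slots */
	int ntevs;		/* number of committed entries */
	int max_tevs;
};

/*
 * Returns 0 on success, -1 on failure. On failure neither tf->ntevs nor
 * any slot of tf->tevs is modified.
 */
int toy_add_event(struct toy_finder *tf, const char *name, int arg_found)
{
	struct toy_event tmp = { NULL };
	size_t len;

	assert(tf != NULL && name != NULL);
	if (tf->ntevs == tf->max_tevs)
		return -1;

	/* Build the new entry in a temporary first. */
	len = strlen(name) + 1;
	tmp.name = malloc(len);
	if (!tmp.name)
		return -1;
	memcpy(tmp.name, name, len);

	/* Stands in for a later step failing, e.g. an argument not found. */
	if (!arg_found) {
		free(tmp.name);
		return -1;
	}

	/* Commit only after every step has succeeded. */
	tf->tevs[tf->ntevs++] = tmp;
	return 0;
}
```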
      
      After the patch:
      
        # perf probe  'SyS_dup? oldfd'
        Failed to find 'oldfd' in this function.
        Added new events:
          probe:SyS_dup3       (on SyS_dup? with oldfd)
          probe:SyS_dup3_1     (on SyS_dup? with oldfd)
          probe:SyS_dup2       (on SyS_dup? with oldfd)
      
        You can now use it in all perf tools, such as:
      
      	perf record -e probe:SyS_dup2 -aR sleep 1
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1447417761-156094-3-git-send-email-wangnan0@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • Masami Hiramatsu's avatar
      perf probe: Fix memory leaking on failure by clearing all probe_trace_events · 0196e787
      Masami Hiramatsu authored
      Fix a memory leak on the debuginfo__find_trace_events() failure path,
      which frees an array of probe_trace_events but doesn't clear the
      allocated sub-structures and strings.
      
      So, before doing zfree(tevs), clear all the array elements which may
      have allocated resources.
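
The idea of clearing every element before freeing the array can be sketched like this. The struct layout and the `toy_` function names are hypothetical; the final free-and-NULL step only mimics what a zfree()-style helper is assumed to do.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy element owning two heap allocations. */
struct toy_tev {
	char *event;
	char *group;
};

/* Release everything one element owns and reset its pointers. */
static void toy_clear_tev(struct toy_tev *tev)
{
	free(tev->event);
	free(tev->group);
	tev->event = NULL;
	tev->group = NULL;
}

/*
 * Free the array together with all per-element allocations, then set the
 * caller's pointer to NULL so it cannot be reused or freed twice.
 */
void toy_free_tevs(struct toy_tev **tevs, int ntevs)
{
	int i;

	assert(tevs != NULL);
	if (!*tevs)
		return;
	for (i = 0; i < ntevs; i++)
		toy_clear_tev(&(*tevs)[i]);
	free(*tevs);
	*tevs = NULL;
}
```

Freeing only the array would leak whatever `event` and `group` point to, which is the kind of leak the commit message describes.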
      Reported-by: Wang Nan <wangnan0@huawei.com>
      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1447417761-156094-2-git-send-email-wangnan0@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • Adrian Hunter's avatar
      perf inject: Also re-pipe lost_samples event · d8145b3e
      Adrian Hunter authored
      perf inject must re-pipe all events, otherwise they get dropped from
      the output file.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/1447408112-1920-4-git-send-email-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • Adrian Hunter's avatar
      perf buildid-list: Requires ordered events · 1216b65c
      Adrian Hunter authored
      'perf buildid-list' processes events to determine hits (i.e. the
      with-hits option).  That may not work if events are not processed in
      order: MMAP events must be processed before the samples that depend on
      them so that sample processing can 'hit' the DSO to which the MMAP
      refers.
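
Why ordering matters can be shown with a toy model: a sample only counts as a hit if the MMAP for its DSO has already been seen. The event layout, `toy_count_hits` and the timestamp sort below are assumptions for illustration and do not reflect perf's ordered-events implementation.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_MAX_DSOS 16

enum toy_type { TOY_MMAP, TOY_SAMPLE };

struct toy_event {
	unsigned long long time;
	enum toy_type type;
	int dso_id;
};

/* Sort by timestamp; on a tie, MMAP events come before samples. */
static int toy_cmp_time(const void *a, const void *b)
{
	const struct toy_event *ea = a, *eb = b;

	if (ea->time < eb->time)
		return -1;
	if (ea->time > eb->time)
		return 1;
	return (int)ea->type - (int)eb->type;
}

/*
 * Count samples whose DSO was already mapped when the sample was
 * processed. If 'ordered' is nonzero the events are sorted by time first.
 */
int toy_count_hits(struct toy_event *ev, size_t nr, int ordered)
{
	unsigned char mapped[TOY_MAX_DSOS] = { 0 };
	int hits = 0;
	size_t i;

	assert(ev != NULL);
	if (ordered)
		qsort(ev, nr, sizeof(*ev), toy_cmp_time);

	for (i = 0; i < nr; i++) {
		assert(ev[i].dso_id >= 0 && ev[i].dso_id < TOY_MAX_DSOS);
		if (ev[i].type == TOY_MMAP)
			mapped[ev[i].dso_id] = 1;
		else if (mapped[ev[i].dso_id])
			hits++;
	}
	return hits;
}
```

If a sample is stored ahead of the MMAP that precedes it in time, the unordered pass misses the hit, while the ordered pass finds it.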
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/1447408112-1920-3-git-send-email-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • Adrian Hunter's avatar
      perf symbols: Fix dso lookup by long name and missing buildids · e266a753
      Adrian Hunter authored
      Commit 4598a0a6 ("perf symbols: Improve DSO long names lookup speed
      with rbtree") added a tree to look up dsos by long name.  That tree
      gets corrupted whenever a dso long name is changed because the tree is
      not updated.
      
      One effect of that is buildid-list does not work with the 'with-hits'
      option because dso lookup fails and results in two structs for the same
      dso.  The first has the buildid but no hits, the second has hits but no
      buildid. e.g.
      
      Before:
      
        $ tools/perf/perf record ls
        arch     certs    CREDITS  Documentation  firmware  include
        ipc      Kconfig  lib      Makefile       net       REPORTING-BUGS
        scripts  sound    usr      block          COPYING   crypto
        drivers  fs       init     Kbuild         kernel    MAINTAINERS
        mm       README   samples  security       tools     virt
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.012 MB perf.data (11 samples) ]
        $ tools/perf/perf buildid-list
        574da826c66538a8d9060d393a8866289bd06005 [kernel.kallsyms]
        30c94dc66a1fe95180c3d68d2b89e576d5ae213c /lib/x86_64-linux-gnu/libc-2.19.so
        $ tools/perf/perf buildid-list -H
        574da826c66538a8d9060d393a8866289bd06005 [kernel.kallsyms]
        0000000000000000000000000000000000000000 /lib/x86_64-linux-gnu/libc-2.19.so
      
      After:
      
        $ tools/perf/perf buildid-list -H
        574da826c66538a8d9060d393a8866289bd06005 [kernel.kallsyms]
        30c94dc66a1fe95180c3d68d2b89e576d5ae213c /lib/x86_64-linux-gnu/libc-2.19.so
      
      The fix is to record the root of the tree on the dso so that
      dso__set_long_name() can update the tree when the long name changes.
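
The underlying rule is that an object's key must not change while it sits in a sorted structure; the object has to be removed, renamed and reinserted. The sketch below shows this with a small sorted array instead of an rbtree, to keep it short. All `toy_` names are hypothetical, and the back-pointer from the object to its container only mirrors the idea of recording the tree root on the dso.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX 16

struct toy_dso;

/* Lookup structure kept sorted by long_name. */
struct toy_index {
	struct toy_dso *slot[TOY_MAX];
	size_t nr;
};

struct toy_dso {
	const char *long_name;
	struct toy_index *index;	/* container, so the setter can update it */
};

/* Position of the first entry whose name is >= 'name'. */
static size_t toy_lower_bound(const struct toy_index *ix, const char *name)
{
	size_t lo = 0, hi = ix->nr;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (strcmp(ix->slot[mid]->long_name, name) < 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}

struct toy_dso *toy_find(const struct toy_index *ix, const char *name)
{
	size_t pos = toy_lower_bound(ix, name);

	if (pos < ix->nr && strcmp(ix->slot[pos]->long_name, name) == 0)
		return ix->slot[pos];
	return NULL;
}

void toy_insert(struct toy_index *ix, struct toy_dso *dso)
{
	size_t pos;

	assert(ix->nr < TOY_MAX);
	pos = toy_lower_bound(ix, dso->long_name);
	memmove(&ix->slot[pos + 1], &ix->slot[pos],
		(ix->nr - pos) * sizeof(ix->slot[0]));
	ix->slot[pos] = dso;
	ix->nr++;
	dso->index = ix;
}

static void toy_erase(struct toy_index *ix, struct toy_dso *dso)
{
	size_t pos = toy_lower_bound(ix, dso->long_name);

	/* Entries with equal names are adjacent; find this exact object. */
	while (pos < ix->nr && ix->slot[pos] != dso)
		pos++;
	assert(pos < ix->nr);
	memmove(&ix->slot[pos], &ix->slot[pos + 1],
		(ix->nr - pos - 1) * sizeof(ix->slot[0]));
	ix->nr--;
}

/* Rename safely: remove under the old key, change it, then reinsert. */
void toy_set_long_name(struct toy_dso *dso, const char *name)
{
	struct toy_index *ix = dso->index;

	if (ix)
		toy_erase(ix, dso);
	dso->long_name = name;
	if (ix)
		toy_insert(ix, dso);
}
```

Assigning `long_name` directly, without the erase and reinsert, would leave the array out of order and make later lookups fail, which is analogous to the corruption described above.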
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: Douglas Hatch <doug.hatch@hp.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Scott J Norton <scott.norton@hp.com>
      Cc: Waiman Long <Waiman.Long@hp.com>
      Fixes: 4598a0a6 ("perf symbols: Improve DSO long names lookup speed with rbtree")
      Link: http://lkml.kernel.org/r/1447408112-1920-2-git-send-email-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>