1. 21 Jun, 2017 23 commits
  2. 20 Jun, 2017 13 commits
    • perf tools: Remove unused _ALL_SOURCE define · fd25bf8b
      Arnaldo Carvalho de Melo authored
      Curious as to what this was for I looked at /usr/include/ and only some
      python headers define this, and it ends up being to enable "extensions"
      on some old OSes:
      
        /* Enable extensions on AIX 3, Interix */
      
      I guess we can remove this one safely.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-omnundlxo2brs552bdl6m0j1@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Do parameter validation earlier on fetch_kernel_version() · 44b58e06
      Arnaldo Carvalho de Melo authored
      While trying to reduce util.[ch] I noticed that fetch_kernel_version()
      and fetch_ubuntu_kernel_version() do lots of operations only to check if
      they are needed, i.e. it checks if the pointer where to return the
      kernel version is NULL only after obtaining the kernel version from
      /proc/version_signature or by parsing the results from uname().
      
      Do it earlier not to confuse people reading this code in the future :-)
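
      A minimal sketch of the pattern, not the actual perf code: the helper
      name kernel_version_sketch() and the version encoding are assumptions
      made for illustration. The point is only that the NULL check on the
      output pointer comes before the uname() call and the parsing.

```c
#include <stdio.h>
#include <sys/utsname.h>

/* Hypothetical helper: returns 0 on success, -1 on failure. */
static int kernel_version_sketch(unsigned int *puint)
{
	struct utsname uts;
	unsigned int major = 0, minor = 0, patch = 0;

	/* Validate first: if no result was requested there is nothing to do. */
	if (!puint)
		return 0;

	if (uname(&uts) != 0)
		return -1;

	/* The patch level may be absent, so only major and minor are required. */
	if (sscanf(uts.release, "%u.%u.%u", &major, &minor, &patch) < 2)
		return -1;

	/* Assumed encoding for this sketch only. */
	*puint = (major << 16) + (minor << 8) + patch;
	return 0;
}
```

      Calling kernel_version_sketch(NULL) returns right away without
      touching uname() at all.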
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-i94qwyekk4tzbu0b9ce1r1mz@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf evsel: Adopt find_process() · 2157f6ee
      Arnaldo Carvalho de Melo authored
      And make it static, since nobody else uses it. If we ever need it in more
      places we can carve out a new source file for process related methods;
      for now let's reduce util.{c,h} a tad more.
      
      Link: http://lkml.kernel.org/n/tip-zgb28rllvypjibw52aaz9p15@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • Merge tag 'perf-core-for-mingo-4.13-20170719' of... · 007b811b
      Ingo Molnar authored
      Merge tag 'perf-core-for-mingo-4.13-20170719' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
      
      User visible changes:
      
      - Allow adding and removing fields to the default 'perf script' columns,
        using + or - as field prefixes to do so (Andi Kleen)
      
      - Display titles in left frame in the annotate browser (Jin Yao)
      
      - Allow resolving the DSO name with 'perf script -F brstack{sym,off},dso'
        (Mark Santaniello)
      
      - Support function filtering in 'perf ftrace' (Namhyung Kim)
      
      - Allow specifying function call depth in 'perf ftrace' (Namhyung Kim)
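
      As a rough illustration of the + and - field prefixes for 'perf script'
      mentioned in the first item above, here is a minimal sketch. The field
      names, bit values and function names are hypothetical and are not
      taken from the perf sources; the sketch only shows a default mask
      being adjusted instead of replaced.

```c
#include <string.h>

/* Hypothetical field bits; perf's real field table is larger. */
enum {
	FIELD_COMM = 1 << 0,
	FIELD_PID  = 1 << 1,
	FIELD_IP   = 1 << 2,
	FIELD_SYM  = 1 << 3,
};

/* Look up a field name; returns 0 for an unknown field. */
static unsigned int field_bit(const char *name)
{
	static const struct { const char *name; unsigned int bit; } table[] = {
		{ "comm", FIELD_COMM }, { "pid", FIELD_PID },
		{ "ip",   FIELD_IP   }, { "sym", FIELD_SYM },
	};
	size_t i;

	for (i = 0; i < sizeof(table) / sizeof(table[0]); i++)
		if (strcmp(name, table[i].name) == 0)
			return table[i].bit;
	return 0;
}

/* Apply one "+name" or "-name" token to *mask; 0 on success, -1 on error. */
static int apply_field_modifier(unsigned int *mask, const char *tok)
{
	unsigned int bit;

	/* Plain names would replace the defaults; not covered by this sketch. */
	if (tok[0] != '+' && tok[0] != '-')
		return -1;

	bit = field_bit(tok + 1);
	if (!bit)
		return -1;

	if (tok[0] == '+')
		*mask |= bit;	/* add the field to the defaults */
	else
		*mask &= ~bit;	/* remove the field from the defaults */
	return 0;
}
```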
      
      Infrastructure changes:
      
      - Adopt __noreturn, __printf, __scanf, noinline, __packed and __aligned
        __alignment__(()) markers, to make the tools/ source code base
        more compact and look more like kernel code (Arnaldo Carvalho de Melo)
      
      - Remove unnecessary check in annotate_browser_write() (Jin Yao)
      
      - Return arch from symbol__disassemble() so that callers, such as
        the annotate TUI browser, can use arch specific formatting, such
        as for the upcoming instruction micro-op fusion on Intel Core (Jin Yao)
      
      - Remove superfluous check before use in the coresight code base (Kim
        Phillips)
      
      - Remove unused SAMPLE_SIZE defines and BTS priv array (Kim Phillips)
      
      - Error handling fix/tidy ups in 'perf config' (Taeung Song)
      
      - Avoid error in the BPF proggie built with clang in 'perf test llvm'
        when PROFILE_ALL_BRANCHES is set (Wang Nan)
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • Ingo Molnar
      2eb0fc9b
    • perf config: Refactor the code using 'ret' variable in cmd_config() · dfe1c6d7
      Taeung Song authored
      To simplify the code related to 'ret' variable in cmd_config(),
      initialize 'ret' with -1 instead of 0 and use goto to perform resource
      release at the end of the function, setting ret to zero just before the
      out_err label, as usual in the kernel sources.
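
      A minimal sketch of that pattern, using a hypothetical function
      rather than cmd_config() itself: 'ret' starts at -1, every failure
      jumps to out_err, and 'ret' becomes zero only on the line just
      before the label.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical example function; only the control flow matters here. */
static int check_name_sketch(const char *input)
{
	int ret = -1;		/* assume failure until the end is reached */
	char *buf = NULL;
	size_t len;

	if (!input)
		goto out_err;

	len = strlen(input);
	buf = malloc(len + 1);
	if (!buf)
		goto out_err;
	memcpy(buf, input, len + 1);

	if (buf[0] == '\0')	/* treat an empty name as an error */
		goto out_err;

	ret = 0;		/* success, set just before the label */
out_err:
	free(buf);		/* free(NULL) is a no-op, so every path is safe */
	return ret;
}
```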
      Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/r/1497671202-20495-1-git-send-email-treeze.taeung@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf config: Check error cases of {show_spec, set}_config() · 4f1fd742
      Taeung Song authored
      show_spec_config() and set_config() can be called multiple times
      in the loop in cmd_config().
      
      However, their error cases weren't checked, so fix that.
      Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/r/1497671197-20450-1-git-send-email-treeze.taeung@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf ftrace: Add -D option for depth filter · 1096c35a
      Namhyung Kim authored
      The -D/--graph-depth option sets the max graph depth.  The following
      example traces the page fault handler to a max depth of 2.
      
        $ sudo perf ftrace -G __do_page_fault -D 2 -- hello
         ...
         0)               |  __do_page_fault() {
         0)   0.063 us    |    down_read_trylock();
         0)   0.251 us    |    find_vma();
         0)   5.374 us    |    handle_mm_fault();
         0)   0.054 us    |    up_read();
         0)   7.463 us    |  }
         ...
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: kernel-team@lge.com
      Link: http://lkml.kernel.org/r/20170618142302.25390-4-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf ftrace: Add option for function filtering · 78b83e8b
      Namhyung Kim authored
      The -T/--trace-funcs and -N/--notrace-funcs options are to specify
      functions to enable/disable tracing dynamically.
      
      The -G/--graph-funcs and -g/--nograph-funcs options are to set filters
      for function graph tracer.
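
      A small sketch of how the four options could map onto ftrace control
      files. The file names are the standard tracefs filter files as I
      understand them; the mapping itself and the names filter_files and
      tracefs_file_for() are assumptions for illustration, not something
      stated in this message.

```c
#include <stddef.h>

/* Assumed option-to-file mapping; names below are hypothetical helpers. */
struct filter_file {
	char opt;			/* short option letter */
	const char *tracefs_file;	/* file under the tracing directory */
};

static const struct filter_file filter_files[] = {
	{ 'T', "set_ftrace_filter"  },	/* functions to trace */
	{ 'N', "set_ftrace_notrace" },	/* functions not to trace */
	{ 'G', "set_graph_function" },	/* graph tracer entry points */
	{ 'g', "set_graph_notrace"  },	/* graph tracer exclusions */
};

/* Return the control file for an option letter, or NULL if unknown. */
static const char *tracefs_file_for(char opt)
{
	size_t i;

	for (i = 0; i < sizeof(filter_files) / sizeof(filter_files[0]); i++)
		if (filter_files[i].opt == opt)
			return filter_files[i].tracefs_file;
	return NULL;
}
```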
      
      For example, to trace fault handling functions only:
      
        $ sudo perf ftrace -T *fault hello
         0)               |  __do_page_fault() {
         0)               |    handle_mm_fault() {
         0)   2.117 us    |      __handle_mm_fault();
         0)   3.627 us    |    }
         0)   7.811 us    |  }
         0)               |  __do_page_fault() {
         0)               |    handle_mm_fault() {
         0)   2.014 us    |      __handle_mm_fault();
         0)   2.424 us    |    }
         0)   2.951 us    |  }
         ...
      
      To trace all functions executed in __do_page_fault:
      
        $ sudo perf ftrace -G __do_page_fault hello
         2)               |  __do_page_fault() {
         3)   0.060 us    |    down_read_trylock();
         3)               |    find_vma() {
         3)   0.075 us    |      vmacache_find();
         3)   0.053 us    |      vmacache_update();
         3)   1.246 us    |    }
         3)               |    handle_mm_fault() {
         3)   0.063 us    |      __rcu_read_lock();
         3)   0.056 us    |      mem_cgroup_from_task();
         3)   0.057 us    |      __rcu_read_unlock();
         3)               |      __handle_mm_fault() {
         3)               |        filemap_map_pages() {
         3)   0.058 us    |          __rcu_read_lock();
         3)               |          alloc_set_pte() {
         ...
      
      To do the same without showing details in handle_mm_fault:
      
        $ sudo perf ftrace -G __do_page_fault -g handle_mm_fault hello
         3)               |  __do_page_fault() {
         3)   0.049 us    |    down_read_trylock();
         3)               |    find_vma() {
         3)   0.048 us    |      vmacache_find();
         3)   0.041 us    |      vmacache_update();
         3)   0.680 us    |    }
         3)   0.036 us    |    up_read();
         3)   4.547 us    |  } /* __do_page_fault */
         ...
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: kernel-team@lge.com
      Link: http://lkml.kernel.org/r/20170618142302.25390-3-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf ftrace: Move setup_pager before opening trace_pipe · 29681bc5
      Namhyung Kim authored
      The 'perf ftrace' command fails to reset the tracer after finishing
      recording, as shown below:
      
        $ sudo perf ftrace -v hello
        write 'nop' to tracing/current_tracer failed: Device or resource busy
        ...
      
      This is because the trace_pipe file is open in the pager process.  Move the
      pager setup to before opening the file.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: kernel-team@lge.com
      Fixes: 58335964 ("perf ftrace: Use pager for displaying result")
      Link: http://lkml.kernel.org/r/20170618142302.25390-2-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf ftrace: Show error message when fails to set ftrace files · e7bd9ba2
      Namhyung Kim authored
      It'd be better for debugging to show an error message when it fails to
      set up ftrace for some reason.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: kernel-team@lge.com
      Link: http://lkml.kernel.org/r/20170618142302.25390-1-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf script: Support -F brstackoff,dso · 106dacd8
      Mark Santaniello authored
      The idea here is to make AutoFDO easier in cloud environments with ASLR.
      It's easiest to show how this is useful by example. I built a small test
      akin to "while(1) { do_nothing(); }" where the do_nothing function is
      loaded from a dso:
      
        $ cat burncpu.cpp
        #include <dlfcn.h>
      
        int main() {
          void* handle = dlopen("./dso.so", RTLD_LAZY);
          if (!handle) return -1;
      
          typedef void (*fp)();
          fp do_nothing = (fp) dlsym(handle, "do_nothing");
      
          while(1) {
            do_nothing();
          }
        }
      
        $ cat dso.cpp
        extern "C" void do_nothing() {}
      
        $ cat build.sh
        #!/bin/bash
        g++ -shared dso.cpp -o dso.so
        g++ burncpu.cpp -o burncpu -ldl
      
      I sampled the execution of this program with perf record -b.
      
      Using the existing "brstack,dso", we get absolute addresses that are
      affected by ASLR, and could be different on different hosts. The address
      does not uniquely identify a branch/target in the binary:
      
        $ perf script -F brstack,dso | sed 's/\/0 /\/0\n/g' | grep burncpu | grep dso.so | head -n 1
        0x7f967139b6aa(/tmp/burncpu/dso.so)/0x4006b1(/tmp/burncpu/exe)/P/-/-/0
      
      Using the existing "brstacksym,dso" is a little better, because the
      symbol plus offset and dso name *does* uniquely identify a branch/target
      in the binary.  Ultimately, however, AutoFDO wants a simple offset into
      the binary, so we'd have to undo all the work perf did to symbolize in
      the first place:
      
        $ perf script -F brstacksym,dso | sed 's/\/0 /\/0\n/g' | grep burncpu | grep dso.so | head -n 1
        do_nothing+0x5(/tmp/burncpu/dso.so)/main+0x44(/tmp/burncpu/exe)/P/-/-/0
      
      With the new "brstackoff,dso" we get what we need: a simple offset into a
      specific dso/binary that uniquely identifies a branch/target:
        $ perf script -F brstackoff,dso | sed 's/\/0 /\/0\n/g' | grep burncpu | grep dso.so | head -n 1
        0x6aa(/tmp/burncpu/dso.so)/0x4006b1(/tmp/burncpu/exe)/P/-/-/0
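
      The arithmetic behind the dso.so offset can be sketched as below. The
      struct mapping type, the function name and the mapping start address
      0x7f967139b000 are assumptions chosen so that the numbers line up
      with the output above; the real mapping address was not recorded in
      this message, and the sketch only covers the shared object case.

```c
#include <stdint.h>

/* Hypothetical description of where a dso is mapped in the process. */
struct mapping {
	uint64_t start;	/* first mapped address */
	uint64_t end;	/* one past the last mapped address */
	uint64_t pgoff;	/* file offset that 'start' corresponds to */
};

/*
 * Translate an absolute (ASLR-dependent) address into an offset into the
 * dso. Returns 0 on success, -1 if the address is outside the mapping.
 */
static int addr_to_dso_offset(const struct mapping *m, uint64_t addr,
			      uint64_t *off)
{
	if (addr < m->start || addr >= m->end)
		return -1;

	*off = addr - m->start + m->pgoff;
	return 0;
}
```

      With the assumed start address and a pgoff of zero, 0x7f967139b6aa
      becomes 0x6aa, the same value on any host regardless of where ASLR
      placed the mapping.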
      Signed-off-by: Mark Santaniello <marksan@fb.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20170619163825.2012979-2-marksan@fb.com
      [ Updated documentation about 'brstackoff' using text from above ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf script: Support -F brstack,dso and brstacksym,dso · 55b9b508
      Mark Santaniello authored
      Perf script can report the dso for "addr" and "ip" fields.
      
      This adds the same support for the "brstack" and "brstacksym" fields.
      This can be helpful for AutoFDO: we can ignore LBR entries unless the
      source and target address are both in the target module we are about to
      build.
      
      I built a small test akin to "while(1) { do_nothing(); }" where the
      do_nothing function is loaded from a dso:
      
        $ cat burncpu.cpp
        #include <dlfcn.h>
      
        int main() {
          void* handle = dlopen("./dso.so", RTLD_LAZY);
          if (!handle) return -1;
      
          typedef void (*fp)();
          fp do_nothing = (fp) dlsym(handle, "do_nothing");
      
          while(1) {
            do_nothing();
          }
        }
      
        $ cat dso.cpp
        extern "C" void do_nothing() {}
      
        $ cat build.sh
        #!/bin/bash
        g++ -shared dso.cpp -o dso.so
        g++ burncpu.cpp -o burncpu -ldl
      
      I sampled the execution with perf record -b.  Using the new perf script
      functionality I can easily find cases where there was a transition from one
      dso to another:
      
        $ perf record -a -b -- sleep 5
        [ perf record: Woken up 55 times to write data ]
        [ perf record: Captured and wrote 18.815 MB perf.data (43593 samples) ]
      
        $ perf script -F brstack,dso | sed 's/\/0 /\/0\n/g' | grep burncpu | grep dso.so | head -n 1
        0x7f967139b6aa(/tmp/burncpu/dso.so)/0x4006b1(/tmp/burncpu/exe)/P/-/-/0
      
        $ perf script -F brstacksym,dso | sed 's/\/0 /\/0\n/g' | grep burncpu | grep dso.so | head -n 1
        do_nothing+0x5(/tmp/burncpu/dso.so)/main+0x44(/tmp/burncpu/exe)/P/-/-/0
      Signed-off-by: Mark Santaniello <marksan@fb.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20170619163825.2012979-1-marksan@fb.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  3. 19 Jun, 2017 4 commits
    • perf test llvm: Avoid error when PROFILE_ALL_BRANCHES is set · 9b57fb7e
      Wang Nan authored
      The 'if' keyword is a define that expands to complex code when
      CONFIG_PROFILE_ALL_BRANCHES is selected, which causes a 'perf test LLVM'
      failure like:
      
        $ ./perf test LLVM
        35: LLVM search and compile                    :
        35.1: Basic BPF llvm compile                    : Ok
        35.2: kbuild searching                          : Ok
        35.3: Compile source for BPF prologue generation: FAILED!
        35.4: Compile source for BPF relocation         : Skip
      
      The only affected test case is bpf-script-test-prologue.c
      because it uses kernel headers and has 'if' inside.
      
      This patch undefines 'if' so that the test passes.
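
      A self-contained sketch of the idea. The 'if' macro below is a much
      simplified stand-in for the kernel's branch profiling wrapper (an
      assumption, the real macro is more complex), and the function names
      are hypothetical. It shows that code compiled after the #undef uses
      the plain keyword again.

```c
/* Counter bumped by the stand-in macro each time a wrapped 'if' runs. */
static int branch_hits;

/* Simplified stand-in: wrap every 'if' condition with bookkeeping. */
#define if(cond) if (branch_hits++, (cond))

/* Compiled while 'if' is a macro, so the counter is touched. */
static int classify_profiled(int x)
{
	if (x > 0)
		return 1;
	return 0;
}

/* The approach described above: drop the macro before the BPF code. */
#ifdef if
#undef if
#endif

/* Compiled with the plain keyword, so no extra code is generated. */
static int classify_plain(int x)
{
	if (x > 0)
		return 1;
	return 0;
}
```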
      
      More detailed analysis from a message in this thread, also by Wang:
      
      The problem is caused by following relocation information:
      
        $ readelf -a ./llvmsubtest3
        ...
           [ 5] _ftrace_branch    PROGBITS         0000000000000000  00000260
                00000000000000a0  0000000000000000  WA       0     0     4
        ...
        Relocation section '.relfunc=null_lseek file->f_mode offset orig' at
        offset 0x490 contains 4 entries:
           Offset          Info           Type           Sym. Value    Sym. Name
        000000000038  000b00000001 unrecognized: 1       0000000000000000 _ftrace_branch
        0000000000b0  000b00000001 unrecognized: 1       0000000000000000 _ftrace_branch
        000000000128  000b00000001 unrecognized: 1       0000000000000000 _ftrace_branch
        0000000001c0  000b00000001 unrecognized: 1       0000000000000000 _ftrace_branch
      
        Relocation section '.rel_ftrace_branch' at offset 0x4d0 contains 8 entries:
           Offset          Info           Type           Sym. Value    Sym. Name
        000000000000  000200000001 unrecognized: 1       0000000000000000 .L__func__.bpf_func__n
        000000000008  000100000001 unrecognized: 1       0000000000000015 .L.str
        000000000028  000200000001 unrecognized: 1       0000000000000000 .L__func__.bpf_func__n
        000000000030  000100000001 unrecognized: 1       0000000000000015 .L.str
        000000000050  000200000001 unrecognized: 1       0000000000000000 .L__func__.bpf_func__n
        000000000058  000100000001 unrecognized: 1       0000000000000015 .L.str
        000000000078  000200000001 unrecognized: 1       0000000000000000 .L__func__.bpf_func__n
        000000000080  000100000001 unrecognized: 1       0000000000000015 .L.str
        ...
      
      So I think the failure is because you enabled CONFIG_PROFILE_ALL_BRANCHES.
      
      I can reproduce your buggy result by selecting
      CONFIG_PROFILE_ALL_BRANCHES in my kbuild:
      
        $ ./perf test LLVM
        35: LLVM search and compile                    :
        35.1: Basic BPF llvm compile                    : Ok
        35.2: kbuild searching                          : Ok
        35.3: Compile source for BPF prologue generation: FAILED!
        35.4: Compile source for BPF relocation         : Skip
      
      Simply undefining CONFIG_PROFILE_ALL_BRANCHES in the clang opts does not
      work because it is introduced by "#include <uapi/linux/fs.h>", which overrides
      the cmdline options. So I think the best way is to undefine 'if' inside the
      BPF script.
      Reported-and-Tested-by: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Zefan Li <lizefan@huawei.com>
      Link: http://lkml.kernel.org/r/20170620183203.2517-1-wangnan0@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf annotate: Return arch from symbol__disassemble() and save it in browser · dcaa3948
      Jin Yao authored
      In the annotate browser, we will add support for checking fused instructions.
      Since this is an x86-specific feature, the annotate browser needs to
      know which arch it runs on.
      
      symbol__disassemble() has already figured out the arch. This patch just makes
      symbol__disassemble() return the arch and saves it in the annotate
      browser.
      Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1497840958-4759-2-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf intel-pt/bts: Remove unused SAMPLE_SIZE defines and bts priv array · d3cef7fe
      Kim Phillips authored
      These defines were probably dragged in from sampling support in earlier
      patches.  They can be put back when needed.
      Signed-off-by: Kim Phillips <kim.phillips@arm.com>
      Acked-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20170616112339.3fb6986e4ff33e353008244b@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf coresight: Remove superfluous check before use · 0c788d47
      Kim Phillips authored
      The cs_etm_evsel variable is guaranteed to be set at this point in
      cs_etm_recording_options().
      Signed-off-by: Kim Phillips <kim.phillips@arm.com>
      Acked-by: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lkml.kernel.org/r/20170615125521.80cc128dc856bc1f2e61b730@arm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>