1. 26 Feb, 2016 14 commits
    • perf trace: Print content of bpf-output event · 1d6c9407
      Wang Nan authored
      With this patch, the content of a BPF output event is printed by
      'perf trace'. For example:
      
       # ./perf trace -a --ev bpf-output/no-inherit,name=evt/ \
                         --ev ./test_bpf_trace.c/map:channel.event=evt/ \
                         usleep 100000
        ...
          1.787 ( 0.004 ms): usleep/3832 nanosleep(rqtp: 0x7ffc78b18980                                        ) ...
          1.787 (         ): evt:Raise a BPF event!..)
          1.788 (         ): perf_bpf_probe:func_begin:(ffffffff810e97d0))
        ...
        101.866 (87.038 ms): gmain/1654 poll(ufds: 0x7f57a80008c0, nfds: 2, timeout_msecs: 1000               ) ...
        101.866 (         ): evt:Raise a BPF event!..)
        101.867 (         ): perf_bpf_probe:func_end:(ffffffff810e97d0 <- ffffffff81796173))
        101.869 (100.087 ms): usleep/3832  ... [continued]: nanosleep()) = 0
        ...
      
       (There is an extra ')' at the end of several lines. However, it is
        another problem, unrelated to this commit.)
      
      Where test_bpf_trace.c is:
      
        /************************ BEGIN **************************/
        #include <uapi/linux/bpf.h>
        struct bpf_map_def {
              unsigned int type;
              unsigned int key_size;
              unsigned int value_size;
              unsigned int max_entries;
        };
        #define SEC(NAME) __attribute__((section(NAME), used))
        static u64 (*ktime_get_ns)(void) =
              (void *)BPF_FUNC_ktime_get_ns;
        static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
              (void *)BPF_FUNC_trace_printk;
        static int (*get_smp_processor_id)(void) =
              (void *)BPF_FUNC_get_smp_processor_id;
        static int (*perf_event_output)(void *, struct bpf_map_def *, int, void *, unsigned long) =
              (void *)BPF_FUNC_perf_event_output;
      
        struct bpf_map_def SEC("maps") channel = {
              .type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
              .key_size = sizeof(int),
              .value_size = sizeof(u32),
              .max_entries = __NR_CPUS__,
        };
      
        static inline int __attribute__((always_inline))
        func(void *ctx, int type)
        {
              char output_str[] = "Raise a BPF event!";
              char err_str[] = "BAD %d\n";
              int err;

              err = perf_event_output(ctx, &channel, get_smp_processor_id(),
                                      &output_str, sizeof(output_str));
              if (err)
                      trace_printk(err_str, sizeof(err_str), err);
              return 1;
        }
        SEC("func_begin=sys_nanosleep")
        int func_begin(void *ctx) {return func(ctx, 1);}
        SEC("func_end=sys_nanosleep%return")
        int func_end(void *ctx) { return func(ctx, 2);}
        char _license[] SEC("license") = "GPL";
        int _version SEC("version") = LINUX_VERSION_CODE;
        /************************* END ***************************/
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1456479154-136027-8-git-send-email-wangnan0@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf trace: Call bpf__apply_obj_config in 'perf trace' · ba504235
      Wang Nan authored
      Without this patch, the BPF map configuration is not applied.
      
      A command like this:
       # ./perf trace --ev bpf-output/no-inherit,name=evt/ \
                      --ev ./test_bpf_trace.c/map:channel.event=evt/ \
                      usleep 100000
      
      loads the BPF file without error, but since map:channel.event=evt is
      not applied, the bpf-output event does not work.
      
      This patch allows 'perf trace' to load and run BPF scripts.
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1456479154-136027-7-git-send-email-wangnan0@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Only set filter for tracepoints events · fdf14720
      Wang Nan authored
      perf_evlist__set_filter() tries to set the filter on every evsel linked
      in the evlist. However, since filters can only be applied to
      tracepoints, it is better to check the type of each evsel before
      calling perf_evsel__set_filter().
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1456479154-136027-6-git-send-email-wangnan0@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
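
The type check described in the "Only set filter for tracepoints events" commit can be illustrated with a minimal sketch. The structs and function names below are simplified, hypothetical stand-ins, not perf's real evsel/evlist definitions; only the idea of skipping non-tracepoint events comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for perf's evsel types. */
enum ev_type { EV_HARDWARE, EV_SOFTWARE, EV_TRACEPOINT };

struct evsel {
	enum ev_type type;
	const char *filter;	/* NULL until a filter is attached */
	struct evsel *next;
};

/* Attach a filter to one event (stands in for perf_evsel__set_filter()). */
static int evsel_set_filter(struct evsel *evsel, const char *filter)
{
	assert(evsel != NULL);
	evsel->filter = filter;
	return 0;
}

/* Walk the list and skip every event that is not a tracepoint. */
static int evlist_set_filter(struct evsel *head, const char *filter)
{
	for (struct evsel *pos = head; pos; pos = pos->next) {
		if (pos->type != EV_TRACEPOINT)
			continue;
		int err = evsel_set_filter(pos, filter);
		if (err)
			return err;
	}
	return 0;
}
```

With a list holding one hardware event and one tracepoint, only the tracepoint ends up with a filter attached.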
    • perf config: Bring perf_default_config to the very beginning at main() · b8cbb349
      Wang Nan authored
      Before this patch, each subcommand calls perf_config() by itself,
      reading the default configuration together with subcommand-specific
      options. If a subcommand doesn't have its own options, it needs to call
      'perf_config(perf_default_config, NULL)' to ensure .perfconfig is
      loaded.
      
      This patch brings perf_config(perf_default_config, NULL) to the very
      start of main(), so subcommands don't need to do it.
      
      After this patch, 'llvm.clang-path' works for 'perf trace'.
      Signed-off-by: Wang Nan <wangnan0@huawei.com>
      Suggested-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1456479154-136027-4-git-send-email-wangnan0@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
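
The restructuring in the perf_default_config commit amounts to loading the defaults once, before dispatching to any subcommand. The sketch below is a toy model of that ordering: load_config(), default_config(), run_command() and the hard-coded path are assumptions made for illustration and are not perf's actual API.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical callback type, loosely modelled on a config handler. */
typedef int (*config_fn_t)(const char *var, const char *value, void *data);

static char clang_path[64] = "clang";
static int config_loads;

/* Default handler: remembers the one setting this sketch cares about. */
static int default_config(const char *var, const char *value, void *data)
{
	(void)data;
	if (strcmp(var, "llvm.clang-path") == 0) {
		strncpy(clang_path, value, sizeof(clang_path) - 1);
		clang_path[sizeof(clang_path) - 1] = '\0';
	}
	return 0;
}

/* Stand-in for reading .perfconfig: feeds one made-up entry to the handler. */
static int load_config(config_fn_t fn, void *data)
{
	config_loads++;
	return fn("llvm.clang-path", "/opt/llvm/bin/clang", data);
}

/* A subcommand with no options of its own; it never loads config itself. */
static int cmd_trace(void)
{
	return 0;
}

/* Stand-in for main(): defaults are loaded once, then the command runs. */
static int run_command(int (*cmd)(void))
{
	int err = load_config(default_config, NULL);

	if (err)
		return err;
	return cmd();
}
```

Because the load happens in the dispatcher, a subcommand such as the hypothetical cmd_trace() sees the configured clang path without calling the loader.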
    • perf report: Update column width of dynamic entries · abab5e7f
      Namhyung Kim authored
      The column width of dynamic entries is updated when comparing hist
      entries.  However, some unique entries can miss the chance to be
      updated.  So move the update to the output resort stage to make sure
      every entry is visited before display.
      
      To do that, abuse the ->sort callback to update the width when the
      third argument is NULL.  When resorting entries in the normal path, it
      is never NULL, so it should be fine IMHO.
      
      Before:
      
        #       Overhead  ptr / bytes_req / gfp_flags
        # ..............  ..........................................
        #
            37.50%        0xffff8803f7669400
               37.50%        448
                  37.50%        GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC
            10.42%        0xffff8803f766be00
                8.33%        96
                   8.33%        GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC
                2.08%        512
                   2.08%        GFP_KERNEL|GFP_NOWARN|GFP_REPEAT|GFP   <-- here
      
      After:
      
        #       Overhead  ptr / bytes_req / gfp_flags
        # ..............  .....................................................
        #
            37.50%        0xffff8803f7669400
               37.50%        448
                  37.50%        GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC
            10.42%        0xffff8803f766be00
                8.33%        96
                   8.33%        GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC
                2.08%        512
                   2.08%        GFP_KERNEL|GFP_NOWARN|GFP_REPEAT|GFP_NOMEMALLOC
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1456512767-1164-5-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
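
The trick from the "Update column width of dynamic entries" commit, reusing a comparison callback as a width updater when its third argument is NULL, can be sketched as follows. The struct layout and function names are hypothetical; the real ->sort callback in perf has a different signature and operates on hist entries.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct entry { const char *text; };
struct column { size_t width; };

/*
 * Compare two entries.  When b is NULL there is nothing to compare, so
 * the call is used only to grow the column width to fit a.
 */
static int column_sort(struct column *col, const struct entry *a,
		       const struct entry *b)
{
	if (b == NULL) {
		size_t len = strlen(a->text);

		if (len > col->width)
			col->width = len;
		return 0;
	}
	return strcmp(a->text, b->text);
}

/* Resort stage: visits every entry exactly once, so no width is missed. */
static void resort_update_widths(struct column *col,
				 const struct entry *entries, size_t n)
{
	for (size_t i = 0; i < n; i++)
		column_sort(col, &entries[i], NULL);
}
```

The normal comparison path always passes two entries, so the NULL case cannot be triggered there by accident.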
    • perf hists: Fix dynamic entry display in hierarchy · e049d4a3
      Namhyung Kim authored
      When a dynamic sort key is used, it might not show pretty-printed
      output.  This is because the trace output was not set only for the
      first dynamic sort key.  During hierarchy_insert_entry(), it failed to
      pass the trace_output to dynamic entries.  Also, even if it did, only
      the first entry would have it.  Subsequent entries might set it during
      the collapsing stage, but that is not guaranteed.
      
      Before:
      
        $ perf report --hierarchy --stdio -s ptr,bytes_req,gfp_flags -g none
        #
        #       Overhead  ptr / bytes_req / gfp_flags
        # ..............  ..........................................
        #
            37.50%        0xffff8803f7669400
               37.50%        448
                  37.50%        66080
            10.42%        0xffff8803f766be00
                8.33%        96
                   8.33%        66080
                2.08%        512
                   2.08%        67280
      
      After:
      
        #
        #       Overhead  ptr / bytes_req / gfp_flags
        # ..............  ..........................................
        #
            37.50%        0xffff8803f7669400
               37.50%        448
                  37.50%        GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC
            10.42%        0xffff8803f766be00
                8.33%        96
                   8.33%        GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC
                2.08%        512
                   2.08%        GFP_KERNEL|GFP_NOWARN|GFP_REPEAT|GFP
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1456512767-1164-4-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf report: Left align dynamic entries in hierarchy · cb1fab91
      Namhyung Kim authored
      Dynamic entries are right-aligned, unlike other entries, since they
      usually hold numeric values.  But for the hierarchy mode, left
      alignment is more appropriate IMHO.  Also trim spaces on the left so
      that we can easily identify the hierarchy.
      
      Before:
      
        $ perf report --hierarchy -i perf.data.kmem -s gfp_flags,ptr,bytes_req --stdio -g none
        ...
        #
        #       Overhead                                        gfp_flags /                ptr /          bytes_req
        # ..............  .................................................................................................
        #
            91.67%                   GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC
               37.50%        0xffff8803f7669400
                  37.50%                       448
                8.33%        0xffff8803f766be00
                   8.33%                        96
                4.17%        0xffff8800d156dc00
                   4.17%                       704
      
      After:
      
        #       Overhead  gfp_flags / ptr / bytes_req
        # ..............  ....................................
        #
            91.67%        GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC
               37.50%        0xffff8803f7669400
                  37.50%        448
                8.33%        0xffff8803f766be00
                   8.33%        96
                4.17%        0xffff8800d156dc00
                   4.17%        704
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1456512767-1164-3-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf report: Fix indentation of dynamic entries in hierarchy · d3a72fd8
      Namhyung Kim authored
      When dynamic entries are used in the hierarchy mode with multiple
      events, the output might not be aligned properly.  In the hierarchy
      mode, each sort column is indented using the total number of sort keys,
      so perf keeps track of the number of sort keys when adding them.
      However, a dynamic sort key can be added more than once when multiple
      events have the same field names.  This results in unnecessarily long
      indentation in the output.
      
      For example, perf kmem records the following events:
      
        $ perf evlist --trace-fields -i perf.data.kmem
        kmem:kmalloc: trace_fields: call_site,ptr,bytes_req,bytes_alloc,gfp_flags
        kmem:kmalloc_node: trace_fields: call_site,ptr,bytes_req,bytes_alloc,gfp_flags,node
        kmem:kfree: trace_fields: call_site,ptr
        kmem:kmem_cache_alloc: trace_fields: call_site,ptr,bytes_req,bytes_alloc,gfp_flags
        kmem:kmem_cache_alloc_node: trace_fields: call_site,ptr,bytes_req,bytes_alloc,gfp_flags,node
        kmem:kmem_cache_free: trace_fields: call_site,ptr
        kmem:mm_page_alloc: trace_fields: page,order,gfp_flags,migratetype
        kmem:mm_page_free: trace_fields: page,order
      
      As you can see, many field names are shared between kmem events.  So
      adding the 'ptr' dynamic sort key alone will set nr_sort_keys to 6,
      which adds many unnecessary spaces between columns.
      
      Before:
      
        $ perf report -i perf.data.kmem --hierarchy -s ptr -g none --stdio
        ...
        #                Overhead                 ptr
        # .......................  ...................................
        #
            99.89%                 0xffff8803ffb79720
             0.06%                 0xffff8803d228a000
             0.03%                 0xffff8803f7678f00
             0.00%                 0xffff880401dc5280
             0.00%                 0xffff880406172380
             0.00%                 0xffff8803ffac3a00
             0.00%                 0xffff8803ffac1600
      
      After:
      
        # Overhead                 ptr
        # ........  ....................
        #
            99.89%  0xffff8803ffb79720
             0.06%  0xffff8803d228a000
             0.03%  0xffff8803f7678f00
             0.00%  0xffff880401dc5280
             0.00%  0xffff880406172380
             0.00%  0xffff8803ffac3a00
             0.00%  0xffff8803ffac1600
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1456512767-1164-2-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
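
The over-counting described in the indentation commit (one 'ptr' key registered once per event that has a 'ptr' field) can be avoided by counting each key name once. The commit message does not say how the fix was implemented, so the following is only one possible approach, with hypothetical names throughout.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_KEYS 16

struct sort_keys {
	const char *names[MAX_KEYS];
	size_t nr;	/* number of distinct keys; drives the indentation */
};

/* Register a key; the same field name coming from another event counts once. */
static void add_sort_key(struct sort_keys *keys, const char *name)
{
	for (size_t i = 0; i < keys->nr; i++) {
		if (strcmp(keys->names[i], name) == 0)
			return;
	}
	if (keys->nr < MAX_KEYS)
		keys->names[keys->nr++] = name;
}
```

Registering 'ptr' once for each of the six kmem events that carry it then leaves the count at 1 instead of 6.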
    • perf hists: Fix comparing of dynamic entries · 84b6ee8e
      Namhyung Kim authored
      When hist_entry__cmp() and hist_entry__collapse() are called, they
      should make sure that a dynamic entry only compares entries from
      matching hists.
      
      Otherwise, it might access different hists, resulting in incorrect
      output.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1456512767-1164-1-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
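
The guard described in the "Fix comparing of dynamic entries" commit can be sketched as a comparison loop that ignores a dynamic key whenever the entry belongs to a different hists than the key was defined for. All types and names here are simplified assumptions, not the real hist_entry__cmp() code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct hists { int id; };

struct hist_entry {
	struct hists *hists;	/* per-event hists this entry belongs to */
	long field;		/* value of the dynamic field */
};

struct sort_key {
	bool dynamic;		/* true for a tracepoint-field key */
	struct hists *hists;	/* event the dynamic key was defined for */
};

/* Compare over all keys, skipping dynamic keys defined for another hists. */
static int entry_cmp(const struct sort_key *keys, size_t nr_keys,
		     const struct hist_entry *a, const struct hist_entry *b)
{
	for (size_t i = 0; i < nr_keys; i++) {
		if (keys[i].dynamic && keys[i].hists != a->hists)
			continue;
		if (a->field != b->field)
			return a->field < b->field ? -1 : 1;
	}
	return 0;
}
```

A key defined for one event therefore never reads field data from entries that were recorded for another event.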
    • perf report: Show message for percent limit on gtk · 2ddda792
      Namhyung Kim authored
      Like stdio, it should show messages about omitted hierarchy
      entries.  Please refer to the previous commit for more details.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1456488800-28124-5-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf hists browser: Show message for percent limit · 79dded87
      Namhyung Kim authored
      Like stdio, it should show messages about omitted hierarchy entries.
      Please refer to the previous commit for more details.
      
      As it needs to check whether an entry is omitted multiple times, add
      the has_no_entry field to the hist entry.
      Suggested-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1456488800-28124-4-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf hists browser: Cleanup hist_browser__update_percent_limit() · 201fde73
      Namhyung Kim authored
      The previous patch introduced the __rb_hierarchy_next() function with
      various move directions like HMD_FORCE_CHILD, but missed converting
      some places to use it.
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1456488800-28124-3-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf report: Show message for percent limit on stdio · bd4abd39
      Namhyung Kim authored
      When the hierarchy mode is used, some entries might be omitted due to a
      percent limit or filter.  In this case the output hierarchy differs
      from that of other entries.  Add an informative message to users about
      this.
      
      For example, when a percent limit of 4% is applied:
      
      Before:
        #       Overhead  Command / Shared Object / Symbol
        # ..............  ..........................................
        #
            49.09%        swapper
               48.67%        [kernel.vmlinux]
                  34.42%        [k] intel_idle
            11.51%        firefox
                8.87%        libpthread-2.22.so
                   6.60%        [.] __GI___libc_recvmsg
            10.49%        gnome-shell
                4.74%        libc-2.22.so
            10.08%        Xorg
                6.11%        libc-2.22.so
                   5.27%        [.] __memcpy_sse2_unaligned
             6.15%        perf
      
      Note that gnome-shell/libc has no symbols and perf has no dso/symbols.
      With this patch, the output will look like below:
      
      After:
      
        #       Overhead  Command / Shared Object / Symbol
        # ..............  ..........................................
        #
            49.09%        swapper
               48.67%        [kernel.vmlinux]
                  34.42%        [k] intel_idle
            11.51%        firefox
                8.87%        libpthread-2.22.so
                   6.60%        [.] __GI___libc_recvmsg
            10.49%        gnome-shell
                4.74%        libc-2.22.so
                                no entry >= 4.00%
            10.08%        Xorg
                6.11%        libc-2.22.so
                   5.27%        [.] __memcpy_sse2_unaligned
             6.15%        perf
                             no entry >= 4.00%
      Suggested-and-Tested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1456488800-28124-2-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
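
The placeholder line from the stdio commit ("no entry >= 4.00%") is plain formatted output, indented to the level where the hidden children would have appeared. A minimal sketch, assuming a hypothetical helper name and an indentation passed in by the caller:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format the placeholder shown under a parent whose children were all
 * dropped by the percent limit.  'indent' is the number of leading spaces.
 */
static int format_no_entry(char *buf, size_t size, int indent,
			   double min_percent)
{
	return snprintf(buf, size, "%*sno entry >= %.2f%%",
			indent, "", min_percent);
}
```

Calling format_no_entry(buf, sizeof(buf), 4, 4.0) yields four spaces followed by the same text as in the example output of the commit message.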
    • perf hists: Add more helper functions for the hierarchy mode · a7b5895b
      Namhyung Kim authored
      hists__overhead_width() calculates the width occupied by the overhead
      (and other) columns before the sort columns.
      
      hist_entry__has_hierarchy_children() checks whether an entry has lower
      entries (children) in the hierarchy to be shown in the output.  This
      means the children must not be filtered out and must be above the
      percent limit.
      
      These two functions will be used to show information when all children
      of an entry are omitted by the percent limit (or filter).
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1456488800-28124-1-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
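
The children check described in the helper-functions commit can be sketched over a flat array of child entries. The struct and function below are hypothetical simplifications (perf keeps children in an rbtree), and treating "above the percent limit" as ">= limit" is an assumption based on the "no entry >= 4.00%" message shown in the stdio commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct h_entry {
	double percent;			/* overhead of this entry */
	bool filtered;			/* true if hidden by a filter */
	const struct h_entry *children;	/* lower level of the hierarchy */
	size_t nr_children;
};

/* True if at least one child is unfiltered and reaches the percent limit. */
static bool has_visible_children(const struct h_entry *he, double limit)
{
	for (size_t i = 0; i < he->nr_children; i++) {
		const struct h_entry *child = &he->children[i];

		if (!child->filtered && child->percent >= limit)
			return true;
	}
	return false;
}
```

When this returns false for an entry that does have children, the output code can print the "no entry" placeholder instead of an empty level.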
  2. 25 Feb, 2016 6 commits
  3. 24 Feb, 2016 20 commits