    perf report: Prefer DWARF callstacks to LBR ones when captured both · 10ccbc1c
    Alexey Budankov authored
    Display DWARF based callchains when the perf.data file contains raw thread
    stack data as well as LBR callstack data.
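
The preference described above can be sketched as a small selection function. This is only an illustration, not the code from the patch: every name below (`sketch_pick_chain_mode`, the `SKETCH_*` enums) is hypothetical, the bit values are arbitrary stand-ins for the real sample type flags, and it assumes that DWARF unwinding needs both user registers and a raw user stack copy to be present in the samples.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for sample type bits; the values are illustrative only. */
enum sketch_sample_bits {
	SKETCH_BRANCH_STACK = 1u << 0, /* LBR-style branch records were captured */
	SKETCH_REGS_USER    = 1u << 1, /* user registers were captured */
	SKETCH_STACK_USER   = 1u << 2, /* raw user stack data was captured */
};

/* Hypothetical callchain modes a report tool could choose between. */
enum sketch_chain_mode {
	SKETCH_CHAIN_FP,    /* frame pointer based callchains */
	SKETCH_CHAIN_DWARF, /* callchains unwound from raw stack data */
	SKETCH_CHAIN_LBR,   /* callchains built from branch records */
};

/* Assumption: DWARF unwinding is possible only with both registers and stack. */
static bool sketch_has_dwarf_data(uint64_t sample_type)
{
	return (sample_type & SKETCH_REGS_USER) &&
	       (sample_type & SKETCH_STACK_USER);
}

/* Prefer DWARF when it is available, fall back to LBR, then to frame pointers. */
static enum sketch_chain_mode sketch_pick_chain_mode(uint64_t sample_type)
{
	if (sketch_has_dwarf_data(sample_type))
		return SKETCH_CHAIN_DWARF;
	if (sample_type & SKETCH_BRANCH_STACK)
		return SKETCH_CHAIN_LBR;
	return SKETCH_CHAIN_FP;
}
```

With this ordering, a file that carries both kinds of data resolves to the DWARF mode, while a file with branch records only still resolves to the LBR mode.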
    
    Committer testing:
    
    This changes the output from the branch stack based one, i.e. without
    this patch, for the same file as in the previous csets:
    
      # perf report --stdio
      # To display the perf.data header info, please use --header/--header-only options.
      #
      # Total Lost Samples: 0
      #
      # Samples: 13  of event 'cycles'
      # Event count (approx.): 13
      #
      # Overhead  Command  Source Shared Object  Source Symbol                Target Symbol                              Basic Block Cycles
      # ........  .......  ....................  ...........................  .........................................  ..................
      #
           7.69%  ls       libpthread-2.29.so    [.] _init                    [.] __pthread_initialize_minimal_internal  6827
           7.69%  ls       ld-2.29.so            [k] _start                   [k] _dl_start                              -
           7.69%  ls       ld-2.29.so            [.] _dl_start_user           [.] _dl_init                               -24790
           7.69%  ls       ld-2.29.so            [k] _dl_start                [k] _dl_sysdep_start                       278
           7.69%  ls       ld-2.29.so            [k] dl_main                  [k] _dl_map_object_deps                    15581
           7.69%  ls       ld-2.29.so            [k] open_verify.constprop.0  [k] lseek64                                4228
           7.69%  ls       ld-2.29.so            [k] _dl_map_object           [k] open_verify.constprop.0                55
           7.69%  ls       ld-2.29.so            [k] openaux                  [k] _dl_map_object                         67
           7.69%  ls       ld-2.29.so            [k] _dl_map_object_deps      [k] 0x00007f441b57c090                     112
           7.69%  ls       ld-2.29.so            [.] call_init.part.0         [.] _init                                  334
           7.69%  ls       ld-2.29.so            [.] _dl_init                 [.] call_init.part.0                       383
           7.69%  ls       ld-2.29.so            [k] _dl_sysdep_start         [k] dl_main                                45
           7.69%  ls       ld-2.29.so            [k] _dl_catch_exception      [k] openaux                                116
    
      #
      # (Tip: For memory address profiling, try: perf mem record / perf mem report)
      #
    
    To the one that shows call chains:
    
      # perf report --stdio
      # To display the perf.data header info, please use --header/--header-only options.
      #
      #
      # Total Lost Samples: 0
      #
      # Samples: 10  of event 'cycles'
      # Event count (approx.): 3204047
      #
      # Children      Self  Command  Shared Object       Symbol
      # ........  ........  .......  ..................  .........................................
      #
          55.01%     0.00%  ls       [kernel.vmlinux]    [k] entry_SYSCALL_64_after_hwframe
                  |
                  ---entry_SYSCALL_64_after_hwframe
                     do_syscall_64
                     |
                      --16.01%--__x64_sys_execve
                                __do_execve_file.isra.0
                                search_binary_handler
                                load_elf_binary
                                elf_map
                                vm_mmap_pgoff
                                do_mmap
                                mmap_region
                                perf_event_mmap
                                perf_iterate_sb
                                perf_iterate_ctx
                                perf_event_mmap_output
                                perf_output_copy
                                memcpy_erms
    
          55.01%    39.00%  ls       [kernel.vmlinux]    [k] do_syscall_64
                  |
                  |--39.00%--0xffffffffffffffff
                  |          _dl_map_object
                  |          open_verify.constprop.0
                  |          __lseek64 (inlined)
                  |          entry_SYSCALL_64_after_hwframe
                  |          do_syscall_64
                  |
                   --16.01%--do_syscall_64
                             __x64_sys_execve
                             __do_execve_file.isra.0
                             search_binary_handler
                             load_elf_binary
                             elf_map
                             vm_mmap_pgoff
                             do_mmap
                             mmap_region
                             perf_event_mmap
                             perf_iterate_sb
                             perf_iterate_ctx
                             perf_event_mmap_output
                             perf_output_copy
                             memcpy_erms
    
          42.95%    42.95%  ls       libpthread-2.29.so  [.] __pthread_initialize_minimal_internal
                  |
                  ---_init
                     __pthread_initialize_minimal_internal
    
          42.95%     0.00%  ls       libpthread-2.29.so  [.] _init
                  |
                  ---_init
                     __pthread_initialize_minimal_internal
    
      <SNIP>
    
      #
      # (Tip: Profiling branch (mis)predictions with: perf record -b / perf report)
      #
      #
    
    The branch stack view can be explicitly selected using:
    
      # perf report -h branch-stack
    
       Usage: perf report [<options>]
    
          -b, --branch-stack    use branch records for per branch histogram filling
    
      #
    
    I.e. after this patch:
    
      # perf report -b --stdio
      # To display the perf.data header info, please use --header/--header-only options.
      #
      #
      # Total Lost Samples: 0
      #
      # Samples: 13  of event 'cycles'
      # Event count (approx.): 13
      #
      # Overhead  Command  Source Shared Object  Source Symbol                Target Symbol                              Basic Block Cycles
      # ........  .......  ....................  ...........................  .........................................  ..................
      #
           7.69%  ls       libpthread-2.29.so    [.] _init                    [.] __pthread_initialize_minimal_internal  6827
           7.69%  ls       ld-2.29.so            [k] _start                   [k] _dl_start                              -
           7.69%  ls       ld-2.29.so            [.] _dl_start_user           [.] _dl_init                               -24790
           7.69%  ls       ld-2.29.so            [k] _dl_start                [k] _dl_sysdep_start                       278
           7.69%  ls       ld-2.29.so            [k] dl_main                  [k] _dl_map_object_deps                    15581
           7.69%  ls       ld-2.29.so            [k] open_verify.constprop.0  [k] lseek64                                4228
           7.69%  ls       ld-2.29.so            [k] _dl_map_object           [k] open_verify.constprop.0                55
           7.69%  ls       ld-2.29.so            [k] openaux                  [k] _dl_map_object                         67
           7.69%  ls       ld-2.29.so            [k] _dl_map_object_deps      [k] 0x00007f441b57c090                     112
           7.69%  ls       ld-2.29.so            [.] call_init.part.0         [.] _init                                  334
           7.69%  ls       ld-2.29.so            [.] _dl_init                 [.] call_init.part.0                       383
           7.69%  ls       ld-2.29.so            [k] _dl_sysdep_start         [k] dl_main                                45
           7.69%  ls       ld-2.29.so            [k] _dl_catch_exception      [k] openaux                                116
    
      #
      # (Tip: Show current config key-value pairs: perf config --list)
      #
      #
    Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
    Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
    Cc: Andi Kleen <ak@linux.intel.com>
    Cc: Jin Yao <yao.jin@linux.intel.com>
    Cc: Jiri Olsa <jolsa@redhat.com>
    Cc: Kan Liang <kan.liang@linux.intel.com>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Link: http://lkml.kernel.org/r/ccbd9583-82f4-dec5-7e84-64bf56e351fb@linux.intel.com
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
builtin-report.c 41 KB