    perf report: Add option to enable the LBR stitching approach · b1d1429b
    Kan Liang authored
    With the LBR stitching approach, the reconstructed LBR call stack can
    exceed the hardware limit on LBR depth. However, it may reconstruct
    invalid call stacks in some cases, e.g. exception handling such as
    setjmp/longjmp. It may also increase the processing time, especially
    when the number of samples with stitched LBRs is huge.
    
    Add an option to enable the approach.
    
      # To display the perf.data header info, please use --header/--header-only options.
      #
      #
      # Total Lost Samples: 0
      #
      # Samples: 6K of event 'cycles'
      # Event count (approx.): 6492797701
      #
      # Children      Self  Command          Shared Object       Symbol
      # ........  ........  ...............  ..................  .................................
      #
        99.99%    99.99%  tchain_edit      tchain_edit        [.] f43
                |
                ---main
                   f1
                   f2
                   f3
                   f4
                   f5
                   f6
                   f7
                   f8
                   f9
                   f10
                   f11
                   f12
                   f13
                   f14
                   f15
                   f16
                   f17
                   f18
                   f19
                   f20
                   f21
                   f22
                   f23
                   f24
                   f25
                   f26
                   f27
                   f28
                   f29
                   f30
                   f31
                   |
                    --99.65%--f32
                              f33
                              f34
                              f35
                              f36
                              f37
                              f38
                              f39
                              f40
                              f41
                              f42
                              f43
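
The stitching idea described at the top of this message can be illustrated with a deliberately simplified sketch. This is not the code added by this patch: `struct chain`, `stitch()`, `HW_DEPTH` and `MAX_CHAIN` are hypothetical names, frames are plain integers, and the overlap test matches on a single frame only. It assumes that a sample whose chain fills the hardware buffer was probably truncated, and that the missing outer frames can be borrowed from the previous sample when the two chains share a frame.

```c
#include <stddef.h>

#define HW_DEPTH  4   /* hypothetical hardware limit; the CPU in this test reports branches=32 */
#define MAX_CHAIN 64  /* hypothetical capacity of a stitched chain */

/* frame[0] is the innermost frame, frame[len - 1] the outermost one visible. */
struct chain {
	size_t len;
	int frame[MAX_CHAIN];
};

/*
 * If cur filled the hardware buffer, look for its outermost frame in prev.
 * When it is found, append the frames of prev that lie outside that point.
 * Returns the number of frames appended to cur.
 */
static size_t stitch(const struct chain *prev, struct chain *cur)
{
	size_t i, added = 0;

	if (cur->len < HW_DEPTH)
		return 0;	/* buffer not full: assume nothing was lost */

	/* Find the first occurrence of cur's outermost frame in prev. */
	for (i = 0; i < prev->len; i++) {
		if (prev->frame[i] == cur->frame[cur->len - 1])
			break;
	}
	if (i == prev->len)
		return 0;	/* no overlap: leave cur untouched */

	/* Copy the frames that are older than the overlap point. */
	for (i++; i < prev->len && cur->len < MAX_CHAIN; i++) {
		cur->frame[cur->len++] = prev->frame[i];
		added++;
	}
	return added;
}
```

Matching on one frame makes the caveat above easy to see: if the stack was unwound between two samples (for example by longjmp) or the same function appears more than once, the borrowed frames may not belong to the current call stack.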
    
    Committer testing:
    
      $ perf record --call-graph lbr /wb/tchain_edit
      [ perf record: Woken up 23 times to write data ]
      [ perf record: Captured and wrote 5.578 MB perf.data (6839 samples) ]
      $ perf report --header-only | egrep 'cpu(desc|.*capabilities)'
      # cpudesc : Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz
      # cpu pmu capabilities: branches=32, max_precise=3, pmu_name=skylake
      $
    
    Before:
    
      $ perf report --no-children --stdio
      # To display the perf.data header info, please use --header/--header-only options.
      #
      #
      # Total Lost Samples: 0
      #
      # Samples: 6K of event 'cycles:u'
      # Event count (approx.): 6459523879
      #
      # Overhead  Command      Shared Object     Symbol
      # ........  ...........  ................  .......................
      #
          99.95%  tchain_edit  tchain_edit       [.] f43
                  |
                   --99.92%--f43
                             f42
                             f41
                             f40
                             f39
                             f38
                             f37
                             f36
                             f35
                             f34
                             f33
                             f32
                             f31
                             f30
                             f29
                             f28
                             f27
                             f26
                             f25
                             f24
                             f23
                             f22
                             f21
                             f20
                             f19
                             f18
                             f17
                             f16
                             f15
                             f14
                             f13
                             f12
                             f11
    
           0.03%  tchain_edit  tchain_edit       [.] f42
           0.01%  tchain_edit  tchain_edit       [.] f41
           0.00%  tchain_edit  tchain_edit       [.] f31
           0.00%  tchain_edit  ld-2.29.so        [.] _dl_relocate_object
           0.00%  tchain_edit  ld-2.29.so        [.] memmove
           0.00%  tchain_edit  [unknown]         [k] 0xffffffff93a00b17
    
    After:
    
      $ perf report --stitch-lbr --no-children --stdio
      # To display the perf.data header info, please use --header/--header-only options.
      #
      #
      # Total Lost Samples: 0
      #
      # Samples: 6K of event 'cycles:u'
      # Event count (approx.): 6459496645
      #
      # Overhead  Command      Shared Object     Symbol
      # ........  ...........  ................  ........................
      #
          99.97%  tchain_edit  tchain_edit       [.] f43
                  |
                   --99.93%--f43
                             f42
                             f41
                             f40
                             f39
                             f38
                             f37
                             f36
                             f35
                             f34
                             f33
                             f32
                             f31
                             f30
                             f29
                             f28
                             f27
                             f26
                             f25
                             f24
                             f23
                             f22
                             f21
                             f20
                             f19
                             f18
                             f17
                             f16
                             f15
                             f14
                             f13
                             f12
                             f11
                             f10
                             f9
                             f8
                             f7
                             f6
                             f5
                             f4
                             f3
                             f2
                             f1
                             main
                             __libc_start_main
    
           0.02%  tchain_edit  [unknown]         [k] 0xffffffff93a00b17
           0.01%  tchain_edit  tchain_edit       [.] f31
           0.00%  tchain_edit  ld-2.29.so        [.] _dl_important_hwcaps
    Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
    Reviewed-by: Andi Kleen <ak@linux.intel.com>
    Acked-by: Jiri Olsa <jolsa@redhat.com>
    Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Cc: Adrian Hunter <adrian.hunter@intel.com>
    Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
    Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
    Cc: Michael Ellerman <mpe@ellerman.id.au>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Pavel Gerasimov <pavel.gerasimov@intel.com>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
    Cc: Stephane Eranian <eranian@google.com>
    Cc: Vitaly Slobodskoy <vitaly.slobodskoy@intel.com>
    Link: http://lore.kernel.org/lkml/20200319202517.23423-14-kan.liang@linux.intel.com
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>