    perf bench: Add breakpoint benchmarks · 68a6772f
    Dmitry Vyukov authored
    Add 2 benchmarks:
    
    1. Performance of thread creation/exiting in presence of breakpoints.
    2. Performance of breakpoint modification in presence of threads.
    
    The benchmarks capture use cases that we are interested in:
    using inheritable breakpoints in large highly-threaded applications.
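
As a rough illustration of what an inheritable breakpoint looks like from user space, the sketch below fills in a `perf_event_attr` for a hardware breakpoint with `inherit` set, so that threads created afterwards carry it too. The helper names `bp_attr_init` and `bp_open` are hypothetical, and the field choices (read/write watch on one byte, kernel and hypervisor excluded) are assumptions made for illustration, not necessarily what the benchmark does.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <linux/hw_breakpoint.h>
#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: describe an inheritable data breakpoint on addr. */
void bp_attr_init(struct perf_event_attr *attr, void *addr)
{
	memset(attr, 0, sizeof(*attr));
	attr->type = PERF_TYPE_BREAKPOINT;
	attr->size = sizeof(*attr);
	attr->inherit = 1;		/* threads created later inherit the event */
	attr->exclude_kernel = 1;
	attr->exclude_hv = 1;
	attr->bp_type = HW_BREAKPOINT_RW;
	attr->bp_len = HW_BREAKPOINT_LEN_1;
	attr->bp_addr = (unsigned long)addr;
}

/* Hypothetical helper: returns an event fd, or -1 with errno set. */
int bp_open(void *addr)
{
	struct perf_event_attr attr;

	bp_attr_init(&attr, addr);
	/* pid = 0: calling thread; cpu = -1: any CPU; no group; no flags. */
	return (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0UL);
}
```

Opening the event may fail with a permission error depending on the system's perf settings, so callers should check the return value of `bp_open`.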
    
    The benchmarks show significant slowdown imposed by breakpoints
    (even when they don't fire).
    
    Testing on Intel 8173M with 112 HW threads shows:
    
      perf bench --repeat=56 breakpoint thread --breakpoints=0 --parallelism=56 --threads=20
            78.675000 usecs/op
      perf bench --repeat=56 breakpoint thread --breakpoints=4 --parallelism=56 --threads=20
         12967.135714 usecs/op
    
    That's a 165x slowdown due to the presence of breakpoints.
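
The factor is simply the ratio of the two usecs/op figures, rounded to a whole number. A minimal sketch of that arithmetic, with hypothetical helper names:

```c
#include <stdio.h>

/* Ratio of two usecs/op figures, rounded to the nearest whole factor. */
long slowdown_factor(double with_usecs, double baseline_usecs)
{
	return (long)(with_usecs / baseline_usecs + 0.5);
}

/* Prints the factor for the thread benchmark figures quoted above. */
void print_thread_slowdown(void)
{
	/* 12967.135714 / 78.675000 is about 164.8 */
	printf("%ldx\n", slowdown_factor(12967.135714, 78.675000));
}
```

Applied to the `breakpoint enable` figures further down (585.318400 and 635.953000 against a baseline of 1.433250), the same helper gives 408 and 444.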
    
      perf bench --repeat=20000 breakpoint enable --passive=0 --active=0
             1.433250 usecs/op
      perf bench --repeat=20000 breakpoint enable --passive=224 --active=0
           585.318400 usecs/op
      perf bench --repeat=20000 breakpoint enable --passive=0 --active=111
           635.953000 usecs/op
    
    That's a 408x and a 444x slowdown due to the presence of threads.
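
The `breakpoint enable` runs above time the modification of an existing breakpoint. One way to picture such an operation is a loop that disables and re-enables a perf event through the `PERF_EVENT_IOC_DISABLE` and `PERF_EVENT_IOC_ENABLE` ioctls, as in the sketch below. `bp_toggle` is a hypothetical helper and `fd` is assumed to come from `perf_event_open`; whether the benchmark toggles in exactly this way is an assumption.

```c
#include <linux/perf_event.h>
#include <sys/ioctl.h>

/*
 * Hypothetical helper: disable and re-enable the event behind fd
 * `iterations` times. Returns 0 on success, or -1 (errno set by ioctl)
 * on the first failure.
 */
int bp_toggle(int fd, unsigned int iterations)
{
	unsigned int i;

	for (i = 0; i < iterations; i++) {
		if (ioctl(fd, PERF_EVENT_IOC_DISABLE, 0) < 0)
			return -1;
		if (ioctl(fd, PERF_EVENT_IOC_ENABLE, 0) < 0)
			return -1;
	}
	return 0;
}
```

Timing a call such as `bp_toggle(fd, 20000)` and dividing by the iteration count would give a per-operation cost comparable in spirit to the usecs/op figures above.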
    
    Profiles show some overhead in toggle_bp_slot,
    but also very high contention:
    
        90.83%  breakpoint-thre  [kernel.kallsyms]  [k] osq_lock
         4.69%  breakpoint-thre  [kernel.kallsyms]  [k] mutex_spin_on_owner
         2.06%  breakpoint-thre  [kernel.kallsyms]  [k] __reserve_bp_slot
         2.04%  breakpoint-thre  [kernel.kallsyms]  [k] toggle_bp_slot
    
        79.01%  breakpoint-enab  [kernel.kallsyms]  [k] smp_call_function_single
         9.94%  breakpoint-enab  [kernel.kallsyms]  [k] llist_add_batch
         5.70%  breakpoint-enab  [kernel.kallsyms]  [k] _raw_spin_lock_irq
         1.84%  breakpoint-enab  [kernel.kallsyms]  [k] event_function_call
         1.12%  breakpoint-enab  [kernel.kallsyms]  [k] send_call_function_single_ipi
         0.37%  breakpoint-enab  [kernel.kallsyms]  [k] generic_exec_single
         0.24%  breakpoint-enab  [kernel.kallsyms]  [k] __perf_event_disable
         0.20%  breakpoint-enab  [kernel.kallsyms]  [k] _perf_event_enable
         0.18%  breakpoint-enab  [kernel.kallsyms]  [k] toggle_bp_slot
    
    Committer notes:
    
    Fixup struct init for older compilers:
    
       3    32.90 alpine:3.5                    : FAIL clang version 3.8.1 (tags/RELEASE_381/final)
        bench/breakpoint.c:49:34: error: missing field 'size' initializer [-Werror,-Wmissing-field-initializers]
                struct perf_event_attr attr = {0};
                                                ^
        1 error generated.
       7    37.31 alpine:3.9                    : FAIL gcc version 8.3.0 (Alpine 8.3.0)
        bench/breakpoint.c:49:34: error: missing field 'size' initializer [-Werror,-Wmissing-field-initializers]
                struct perf_event_attr attr = {0};
                                                ^
        1 error generated.
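
The errors above come from `-Wmissing-field-initializers` rejecting `= {0}` on these toolchains. Two common ways to zero the struct without tripping that warning are sketched below. Which form the committer's fixup actually uses is not shown here, so both are illustrations only, and the helper names are hypothetical. GCC documents that this warning does not apply to designated initializers, which is what the first variant relies on.

```c
#include <linux/perf_event.h>
#include <string.h>

/*
 * Variant 1: a designated initializer names one field explicitly; the
 * remaining fields are still zero-initialized by the language.
 */
struct perf_event_attr attr_init_designated(void)
{
	struct perf_event_attr attr = { .size = sizeof(attr), };

	return attr;
}

/* Variant 2: zero the whole struct at run time, then set fields. */
struct perf_event_attr attr_init_memset(void)
{
	struct perf_event_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	return attr;
}
```
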
    Signed-off-by: Dmitriy Vyukov <dvyukov@google.com>
    Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Acked-by: Ian Rogers <irogers@google.com>
    Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Jiri Olsa <jolsa@kernel.org>
    Cc: Marco Elver <elver@google.com>
    Cc: Mark Rutland <mark.rutland@arm.com>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Link: https://lore.kernel.org/r/20220505155745.1690906-1-dvyukov@google.com
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
builtin-bench.c 8.59 KB