1. 20 Aug, 2019 5 commits
  2. 16 Aug, 2019 5 commits
    • Thomas Gleixner's avatar
      Merge tag 'perf-core-for-mingo-5.4-20190816' of... · cfb104ca
      Thomas Gleixner authored
      Merge tag 'perf-core-for-mingo-5.4-20190816' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      Pull perf/core improvements and fixes from Arnaldo:
      
      report/script/trace/top:
      
        Arnaldo Carvalho de Melo:
      
        - Allow specifying marker events demarcating when to consider the other events,
          i.e. one now can state something like:
      
              # perf probe kernel_function
              # perf record -e cycles,probe:kernel_function
      
          And then, in 'perf script' or 'perf report' say:
      
              # perf report --switch-on=probe:kernel_function
      
          And then the cycles event samples will be considered only after we
          find the first probe:kernel_function event.
      
          There is also --switch-off=event, to make it stop considering events
          out of some window, say to avoid some winding down of a workload.
      
          The same can be done with the "live mode" tools: 'perf top' and 'perf trace'.
      
          There are examples in the cset comments showing how to use it with
          SDT events in things like 'systemtap', that have those tracepoint-like
          events for the start/end of passes, etc.
      
          Another example involves selecting scheduler events + entry/exit of
          a syscall, using the syscalls tracepoints, one can then see the
          scheduler events that take place while that syscall is being processed.
      
          In the future this should be possible in record/top/trace via eBPF
          where the perf tools would hook into the marker events and enable events
          put in place but not enabled when the on/off conditions are the desired
          ones, reducing the amount of events sampled, but this userspace only
          solution should be good enough for many scenarios.
      
      perf vendor events intel:
      
        Haiyan Song:
      
        - Add Tremontx event file v1.02.
      
      unwind:
      
        John Keeping:
      
        - Fix callchain unwinding when tid != pid, that was working only for the
          thread group leader.
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      -----BEGIN PGP SIGNATURE-----
      
      iHUEABYIAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCXVcMPgAKCRCyPKLppCJ+
      J6ipAP9F5+TitM1zln/wUUP7/Ug4ZPDsdvA+Ggc8x0Ns7URJQwD/RHk43MgDC1fG
      VXbpQ7byj339Wo7SpjadzLl9xPlh/Qw=
      =IaYE
      -----END PGP SIGNATURE-----
      
      commit e2736219
      Author: John Keeping <john@metanate.com>
      Date:   Thu Aug 15 11:01:46 2019 +0100
      
          perf unwind: Remove unnecessary test
      
          If dwarf_callchain_users is false, then unwind__prepare_access() will
          not set unwind_libunwind_ops so the remaining test here is sufficient.
      Signed-off-by: default avatarJohn Keeping <john@metanate.com>
      Reviewed-by: default avatarJiri Olsa <jolsa@kernel.org>
          Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
          Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
          Cc: Namhyung Kim <namhyung@kernel.org>
          Cc: Peter Zijlstra <peterz@infradead.org>
          Cc: john keeping <john@metanate.com>
          Link: http://lkml.kernel.org/r/20190815100146.28842-3-john@metanate.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      
      diff --git a/tools/perf/util/unwind-libunwind.c b/tools/perf/util/unwind-libunwind.c
      index b843f9d0a9ea..6499b22b158b 100644
      --- a/tools/perf/util/unwind-libunwind.c
      +++ b/tools/perf/util/unwind-libunwind.c
      @@ -69,18 +69,12 @@ int unwind__prepare_access(struct map_groups *mg, struct map *map,
      
       void unwind__flush_access(struct map_groups *mg)
       {
      -	if (!dwarf_callchain_users)
      -		return;
      -
       	if (mg->unwind_libunwind_ops)
       		mg->unwind_libunwind_ops->flush_access(mg);
       }
      
       void unwind__finish_access(struct map_groups *mg)
       {
      -	if (!dwarf_callchain_users)
      -		return;
      -
       	if (mg->unwind_libunwind_ops)
       		mg->unwind_libunwind_ops->finish_access(mg);
       }
      cfb104ca
    • John Keeping's avatar
      perf unwind: Remove unnecessary test · e2736219
      John Keeping authored
      If dwarf_callchain_users is false, then unwind__prepare_access() will
      not set unwind_libunwind_ops so the remaining test here is sufficient.
      Signed-off-by: default avatarJohn Keeping <john@metanate.com>
      Reviewed-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: john keeping <john@metanate.com>
      Link: http://lkml.kernel.org/r/20190815100146.28842-3-john@metanate.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      e2736219
    • John Keeping's avatar
      perf unwind: Fix libunwind when tid != pid · e8ba2906
      John Keeping authored
      Commit e5adfc3e ("perf map: Synthesize maps only for thread group
      leader") changed the recording side so that we no longer get mmap events
      for threads other than the thread group leader (when synthesising these
      events for threads which exist before perf is started).
      
      When a file recorded after this change is loaded, the lack of mmap
      records mean that unwinding is not set up for any other threads.
      
      This can be seen in a simple record/report scenario:
      
      	perf record --call-graph=dwarf -t $TID
      	perf report
      
      If $TID is a process ID then the report will show call graphs, but if
      $TID is a secondary thread the output is as if --call-graph=none was
      specified.
      
      Following the rationale in that commit, move the libunwind fields into
      struct map_groups and update the libunwind functions to take this
      instead of the struct thread.  This is only required for
      unwind__finish_access which must now be called from map_groups__delete
      and the others are changed for symmetry.
      
      Note that unwind__get_entries keeps the thread argument since it is
      required for symbol lookup and the libdw unwind provider uses the thread
      ID.
      Signed-off-by: default avatarJohn Keeping <john@metanate.com>
      Reviewed-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Fixes: e5adfc3e ("perf map: Synthesize maps only for thread group leader")
      Link: http://lkml.kernel.org/r/20190815100146.28842-2-john@metanate.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      e8ba2906
    • John Keeping's avatar
      perf map: Use zalloc for map_groups · ab6cd0e5
      John Keeping authored
      In the next commit we will add new fields to map_groups and we need
      these to be null if no value is assigned.  The simplest way to achieve
      this is to request zeroed memory from the allocator.
      Signed-off-by: default avatarJohn Keeping <john@metanate.com>
      Reviewed-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: john keeping <john@metanate.com>
      Link: http://lkml.kernel.org/r/20190815100146.28842-1-john@metanate.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      ab6cd0e5
    • Arnaldo Carvalho de Melo's avatar
      perf report: Add --switch-on/--switch-off events · ef4b1a53
      Arnaldo Carvalho de Melo authored
      Since 'perf top' shares the histogram browser with 'perf report', then
      the same explanation in the previous cset applies.
      
      An additional example uses a pair of SDT events available for systemtap:
      
        # perf probe --exec=/usr/bin/stap '%*:*'
        Added new events:
          sdt_stap:benchmark__thread__start (on %* in /usr/bin/stap)
          sdt_stap:benchmark   (on %* in /usr/bin/stap)
          sdt_stap:benchmark__thread__end (on %* in /usr/bin/stap)
          sdt_stap:pass6__start (on %* in /usr/bin/stap)
          sdt_stap:pass6__end  (on %* in /usr/bin/stap)
          sdt_stap:pass5__start (on %* in /usr/bin/stap)
          sdt_stap:pass5__end  (on %* in /usr/bin/stap)
          sdt_stap:pass0__start (on %* in /usr/bin/stap)
          sdt_stap:pass0__end  (on %* in /usr/bin/stap)
          sdt_stap:pass1a__start (on %* in /usr/bin/stap)
          sdt_stap:pass1b__start (on %* in /usr/bin/stap)
          sdt_stap:pass1__end  (on %* in /usr/bin/stap)
          sdt_stap:pass2__start (on %* in /usr/bin/stap)
          sdt_stap:pass2__end  (on %* in /usr/bin/stap)
          sdt_stap:pass3__start (on %* in /usr/bin/stap)
          sdt_stap:pass3__end  (on %* in /usr/bin/stap)
          sdt_stap:pass4__start (on %* in /usr/bin/stap)
          sdt_stap:pass4__end  (on %* in /usr/bin/stap)
          sdt_stap:benchmark__start (on %* in /usr/bin/stap)
          sdt_stap:benchmark__end (on %* in /usr/bin/stap)
          sdt_stap:cache__get  (on %* in /usr/bin/stap)
          sdt_stap:cache__clean (on %* in /usr/bin/stap)
          sdt_stap:cache__add__module (on %* in /usr/bin/stap)
          sdt_stap:cache__add__source (on %* in /usr/bin/stap)
          sdt_stap:stap_system__complete (on %* in /usr/bin/stap)
          sdt_stap:stap_system__start (on %* in /usr/bin/stap)
          sdt_stap:stap_system__spawn (on %* in /usr/bin/stap)
          sdt_stap:stap_system__fork (on %* in /usr/bin/stap)
          sdt_stap:intern_string (on %* in /usr/bin/stap)
          sdt_stap:client__start (on %* in /usr/bin/stap)
          sdt_stap:client__end (on %* in /usr/bin/stap)
      
        You can now use it in all perf tools, such as:
      
        	perf record -e sdt_stap:client__end -aR sleep 1
      
        #
      
      From these we're use the two below to run systemtap's test suite:
      
        # perf record -e sdt_stap:pass2__*,cycles:P make installcheck > /dev/null
        ^C[ perf record: Woken up 8 times to write data ]
        [ perf record: Captured and wrote 2.691 MB perf.data (39638 samples) ]
        Terminated
        # perf script | grep sdt_stap
                    stap 28979 [000] 19424.302660: sdt_stap:pass2__start: (561b9a537de3) arg1=140730364262544
                    stap 28979 [000] 19424.333083:   sdt_stap:pass2__end: (561b9a53a9e1) arg1=140730364262544
                    stap 29045 [006] 19424.933460: sdt_stap:pass2__start: (563edddcede3) arg1=140722674883152
                    stap 29045 [006] 19424.963794:   sdt_stap:pass2__end: (563edddd19e1) arg1=140722674883152
        # perf script | grep cycles |  wc -l
        39634
        #
      
      Looking at the whole perf.data file:
      
        [root@quaco testsuite]# perf report | grep cycles:P -A25
        # Samples: 39K of event 'cycles:P'
        # Event count (approx.): 34044267368
        #
        # Overhead  Command  Shared Object         Symbol
        # ........  .......  ....................  ................................
        #
             3.50%  cc1      cc1                   [.] ht_lookup_with_hash
             3.04%  cc1      cc1                   [.] _cpp_lex_token
             2.11%  cc1      cc1                   [.] ggc_internal_alloc
             1.83%  cc1      cc1                   [.] cpp_get_token_with_location
             1.68%  cc1      libc-2.29.so          [.] _int_malloc
             1.41%  cc1      cc1                   [.] linemap_position_for_column
             1.25%  cc1      cc1                   [.] ggc_internal_cleared_alloc
             1.20%  cc1      cc1                   [.] c_lex_with_flags
             1.18%  cc1      cc1                   [.] get_combined_adhoc_loc
             1.05%  cc1      libc-2.29.so          [.] malloc
             1.01%  cc1      libc-2.29.so          [.] _int_free
             0.96%  stap     stap                  [.] std::_Hashtable<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::__detail::_Identity, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, stringtable_hash, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, true, true> >::_M_insert<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__detail::_AllocNode<std::allocator<std::__detail::_Hash_node<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, true> > > >
             0.78%  stap     stap                  [.] lexer::scan
             0.74%  cc1      cc1                   [.] _cpp_lex_direct
             0.70%  cc1      cc1                   [.] pop_scope
             0.70%  cc1      cc1                   [.] c_parser_declspecs
             0.69%  stap     libc-2.29.so          [.] _int_malloc
             0.68%  cc1      cc1                   [.] htab_find_slot
             0.68%  cc1      [kernel.vmlinux]      [k] prepare_exit_to_usermode
             0.64%  cc1      [kernel.vmlinux]      [k] clear_page_erms
        [root@quaco testsuite]#
      
      And now only what happens in slices demarcated by those start/end SDT
      events:
      
        [root@quaco testsuite]# perf report --switch-on=sdt_stap:pass2__start --switch-off=sdt_stap:pass2__end | grep cycles:P -A100
        # Samples: 240  of event 'cycles:P'
        # Event count (approx.): 206491934
        #
        # Overhead  Command  Shared Object        Symbol
        # ........  .......  ...................  ................................................
        #
            38.99%  stap     stap                 [.] systemtap_session::register_library_aliases
            19.47%  stap     stap                 [.] match_key::operator<
            15.01%  stap     libc-2.29.so         [.] __memcmp_avx2_movbe
             5.19%  stap     libc-2.29.so         [.] _int_malloc
             2.50%  stap     libstdc++.so.6.0.26  [.] std::_Rb_tree_insert_and_rebalance
             2.30%  stap     stap                 [.] match_node::build_no_more
             2.07%  stap     libc-2.29.so         [.] malloc
             1.66%  stap     stap                 [.] std::_Rb_tree<match_key, std::pair<match_key const, match_node*>, std::_Select1st<std::pair<match_key const, match_node*> >, std::less<match_key>, std::allocator<std::pair<match_key const, match_node*> > >::find
             1.66%  stap     stap                 [.] match_node::bind
             1.58%  stap     [kernel.vmlinux]     [k] prepare_exit_to_usermode
             1.17%  stap     [kernel.vmlinux]     [k] native_irq_return_iret
             0.87%  stap     stap                 [.] 0x0000000000032ec4
             0.77%  stap     libstdc++.so.6.0.26  [.] std::_Rb_tree_increment
             0.47%  stap     stap                 [.] std::vector<derived_probe_builder*, std::allocator<derived_probe_builder*> >::_M_realloc_insert<derived_probe_builder* const&>
             0.47%  stap     [kernel.vmlinux]     [k] get_page_from_freelist
             0.47%  stap     [kernel.vmlinux]     [k] swapgs_restore_regs_and_return_to_usermode
             0.47%  stap     [kernel.vmlinux]     [k] do_user_addr_fault
             0.46%  stap     [kernel.vmlinux]     [k] __pagevec_lru_add_fn
             0.46%  stap     stap                 [.] std::_Rb_tree<match_key, std::pair<match_key const, match_node*>, std::_Select1st<std::pair<match_key const, match_node*> >, std::less<match_key>, std::allocator<std::pair<match_key const, match_node*> > >::_M_emplace_unique<std::pair<match_key, match_node*> >
             0.42%  stap     libstdc++.so.6.0.26  [.] 0x00000000000c18fa
             0.40%  stap     [kernel.vmlinux]     [k] interrupt_entry
             0.40%  stap     [kernel.vmlinux]     [k] update_load_avg
             0.40%  stap     [kernel.vmlinux]     [k] __intel_pmu_disable_all
             0.40%  stap     [kernel.vmlinux]     [k] clear_page_erms
             0.39%  stap     [kernel.vmlinux]     [k] __mod_node_page_state
             0.39%  stap     [kernel.vmlinux]     [k] error_entry
             0.39%  stap     [kernel.vmlinux]     [k] sync_regs
             0.38%  stap     [kernel.vmlinux]     [k] __handle_mm_fault
             0.38%  stap     stap                 [.] derive_probes
      
        #
        # (Tip: System-wide collection from all CPUs: perf record -a)
        #
        [root@quaco testsuite]#
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Florian Weimer <fweimer@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: William Cohen <wcohen@redhat.com>
      Link: https://lkml.kernel.org/n/tip-408hvumcnyn93a0auihnawew@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      ef4b1a53
  3. 15 Aug, 2019 14 commits
    • Arnaldo Carvalho de Melo's avatar
      perf top: Add --switch-on/--switch-off events · 2f53ae34
      Arnaldo Carvalho de Melo authored
      Just like 'perf trace' and 'perf script', should be useful for instance
      to only consider samples after the initialization phase of some
      workload.
      
      The man page has some examples and considerations about its current
      interface, that still doesn't handle the on/off events in a special way,
      behaving just like when multiple events are specified, i.e.:
      
      - In non-group mode (when the event list is not enclosed in {}) show a
        a menu to allow choosing which event the user wants to see in the
        histograms browser
      
      - In group mode, be it using {} or asking for --group, show one column
        per event.
      
      Try for instance:
      
        # perf top -e '{cycles,instructions,probe:icmp_rcv}' --switch-on=probe:icmp_rcv
      
      Replace probe:icmp_rcv, that I put in place using:
      
        # perf probe icmp_rcv:59
      
      To hit when broadcast packets arrive, with a probe installed after an
      initialization phase is over or after some other point of interest, some
      garbage collection, etc, and also use --switch-off, for instance, on a
      probe installed after said garbage collection is over.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Florian Weimer <fweimer@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: William Cohen <wcohen@redhat.com>
      Link: https://lkml.kernel.org/n/tip-c7q7qjeqtyvc9mkeipxza6ne@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      2f53ae34
    • Arnaldo Carvalho de Melo's avatar
      perf trace: Add --switch-on/--switch-off events · 22ac4318
      Arnaldo Carvalho de Melo authored
      Just like with 'perf script':
      
        # perf trace -e sched:*,syscalls:*sleep* sleep 1
             0.000 :28345/28345 sched:sched_waking:comm=perf pid=28346 prio=120 target_cpu=005
             0.005 :28345/28345 sched:sched_wakeup:perf:28346 [120] success=1 CPU:005
             0.383 sleep/28346 sched:sched_process_exec:filename=/usr/bin/sleep pid=28346 old_pid=28346
             0.613 sleep/28346 sched:sched_stat_runtime:comm=sleep pid=28346 runtime=607375 [ns] vruntime=23289041218 [ns]
             0.689 sleep/28346 syscalls:sys_enter_nanosleep:rqtp: 0x7ffc491789b0
             0.693 sleep/28346 sched:sched_stat_runtime:comm=sleep pid=28346 runtime=72021 [ns] vruntime=23289113239 [ns]
             0.694 sleep/28346 sched:sched_switch:sleep:28346 [120] S ==> swapper/5:0 [120]
          1000.787 :0/0 sched:sched_waking:comm=sleep pid=28346 prio=120 target_cpu=005
          1000.824 :0/0 sched:sched_wakeup:sleep:28346 [120] success=1 CPU:005
          1000.908 sleep/28346 syscalls:sys_exit_nanosleep:0x0
          1001.218 sleep/28346 sched:sched_process_exit:comm=sleep pid=28346 prio=120
        # perf trace -e sched:*,syscalls:*sleep* --switch-on=syscalls:sys_enter_nanosleep sleep 1
             0.000 sleep/28349 sched:sched_stat_runtime:comm=sleep pid=28349 runtime=603036 [ns] vruntime=23873537697 [ns]
             0.001 sleep/28349 sched:sched_switch:sleep:28349 [120] S ==> swapper/4:0 [120]
          1000.392 :0/0 sched:sched_waking:comm=sleep pid=28349 prio=120 target_cpu=004
          1000.443 :0/0 sched:sched_wakeup:sleep:28349 [120] success=1 CPU:004
          1000.540 sleep/28349 syscalls:sys_exit_nanosleep:0x0
          1000.852 sleep/28349 sched:sched_process_exit:comm=sleep pid=28349 prio=120
        # perf trace -e sched:*,syscalls:*sleep* --switch-on=syscalls:sys_enter_nanosleep --switch-off=syscalls:sys_exit_nanosleep sleep 1
             0.000 sleep/28352 sched:sched_stat_runtime:comm=sleep pid=28352 runtime=610543 [ns] vruntime=24811686681 [ns]
             0.001 sleep/28352 sched:sched_switch:sleep:28352 [120] S ==> swapper/0:0 [120]
          1000.397 :0/0 sched:sched_waking:comm=sleep pid=28352 prio=120 target_cpu=000
          1000.440 :0/0 sched:sched_wakeup:sleep:28352 [120] success=1 CPU:000
        #
        # perf trace -e sched:*,syscalls:*sleep* --switch-on=syscalls:sys_enter_nanosleep --switch-off=syscalls:sys_exit_nanosleep --show-on-off sleep 1
             0.000 sleep/28367 syscalls:sys_enter_nanosleep:rqtp: 0x7fffd1a25fc0
             0.004 sleep/28367 sched:sched_stat_runtime:comm=sleep pid=28367 runtime=628760 [ns] vruntime=22170052672 [ns]
             0.005 sleep/28367 sched:sched_switch:sleep:28367 [120] S ==> swapper/2:0 [120]
          1000.367 :0/0 sched:sched_waking:comm=sleep pid=28367 prio=120 target_cpu=002
          1000.412 :0/0 sched:sched_wakeup:sleep:28367 [120] success=1 CPU:002
          1000.512 sleep/28367 syscalls:sys_exit_nanosleep:0x0
        #
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Florian Weimer <fweimer@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: William Cohen <wcohen@redhat.com>
      Link: https://lkml.kernel.org/n/tip-t3ngpt1brcc1fm9gep9gxm4q@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      22ac4318
    • Arnaldo Carvalho de Melo's avatar
      perf evswitch: Add hint when not finding specified on/off events · 8b3c9ea7
      Arnaldo Carvalho de Melo authored
      If the user specifies a on or off switch event and it isn't in the
      perf.data file, provide a hint about how to see the events in the
      perf.data evlist:
      
        # perf script --switch-on=syscall:sys_enter_nanosleep --switch-off=syscalls:sys_exit_nanosleep
        ERROR: event_on event not found (syscall:sys_enter_nanosleep)
        HINT:  use 'perf evlist' to see the available event names
        #
        # perf evlist
        sched:sched_kthread_stop
        sched:sched_kthread_stop_ret
        sched:sched_waking
        sched:sched_wakeup
        sched:sched_wakeup_new
        sched:sched_switch
        sched:sched_migrate_task
        sched:sched_process_free
        sched:sched_process_exit
        sched:sched_wait_task
        sched:sched_process_wait
        sched:sched_process_fork
        sched:sched_process_exec
        sched:sched_stat_wait
        sched:sched_stat_sleep
        sched:sched_stat_iowait
        sched:sched_stat_blocked
        sched:sched_stat_runtime
        sched:sched_pi_setprio
        sched:sched_move_numa
        sched:sched_stick_numa
        sched:sched_swap_numa
        sched:sched_wake_idle_without_ipi
        syscalls:sys_enter_clock_nanosleep
        syscalls:sys_exit_clock_nanosleep
        syscalls:sys_enter_nanosleep
        syscalls:sys_exit_nanosleep
        # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events
        #
        # perf script --switch-on=syscalls:sys_enter_nanosleep --switch-off=syscalls:sys_exit_nanosleep
             sleep 20919 [001] 109866.144411:  sched:sched_stat_runtime: comm=sleep pid=20919 runtime=521249 [ns] vruntime=202919398131 [ns]
             sleep 20919 [001] 109866.144412:        sched:sched_switch: sleep:20919 [120] S ==> swapper/1:0 [120]
           swapper     0 [001] 109867.144568:        sched:sched_waking: comm=sleep pid=20919 prio=120 target_cpu=001
           swapper     0 [001] 109867.144586:        sched:sched_wakeup: sleep:20919 [120] success=1 CPU:001
        #
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Florian Weimer <fweimer@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: William Cohen <wcohen@redhat.com>
      Link: https://lkml.kernel.org/n/tip-iijjvdlyad973oskdq8gmi5w@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      8b3c9ea7
    • Arnaldo Carvalho de Melo's avatar
      perf evswitch: Move enoent error message printing to separate function · c9a42699
      Arnaldo Carvalho de Melo authored
      Allows adding hints there, will be done in followup patch.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Florian Weimer <fweimer@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: William Cohen <wcohen@redhat.com>
      Link: https://lkml.kernel.org/n/tip-1kvrdi7weuz3hxycwvarcu6v@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      c9a42699
    • Arnaldo Carvalho de Melo's avatar
      perf evswitch: Introduce init() method to set the on/off evsels from the command line · 124e02be
      Arnaldo Carvalho de Melo authored
      Another step in having all the boilerplate in just one place to then use
      in the other tools.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Florian Weimer <fweimer@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: William Cohen <wcohen@redhat.com>
      Link: https://lkml.kernel.org/n/tip-snreb1wmwyjei3eefwotxp1l@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      124e02be
    • Arnaldo Carvalho de Melo's avatar
      perf evswitch: Introduce OPTS_EVSWITCH() for cmd line processing · add3a719
      Arnaldo Carvalho de Melo authored
      All tools will want those, so provide a convenient way to get them.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Florian Weimer <fweimer@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: William Cohen <wcohen@redhat.com>
      Link: https://lkml.kernel.org/n/tip-v16pe3sbf3wjmn152u18f649@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      add3a719
    • Arnaldo Carvalho de Melo's avatar
      perf evswitch: Add the names of on/off events · 0b495b12
      Arnaldo Carvalho de Melo authored
      So that we can have macros for the OPT_ entries and also for finding
      those in an evlist, this way other tools will use this very easily.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Florian Weimer <fweimer@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: William Cohen <wcohen@redhat.com>
      Link: https://lkml.kernel.org/n/tip-q0og1xoqqi0w38ve5u0a43k2@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      0b495b12
    • Arnaldo Carvalho de Melo's avatar
      perf evswitch: Move switch logic to use in other tools · 8829e56f
      Arnaldo Carvalho de Melo authored
      Now other tools that want switching can use an evswitch for that, just
      set it up and add it to the PERF_RECORD_SAMPLE processing function.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Florian Weimer <fweimer@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: William Cohen <wcohen@redhat.com>
      Link: https://lkml.kernel.org/n/tip-b1trj1q97qwfv251l66q3noj@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      8829e56f
    • Arnaldo Carvalho de Melo's avatar
      perf evswitch: Move struct to a separate header to use in other tools · d2360442
      Arnaldo Carvalho de Melo authored
      Now that we see that the simple userspace-based "slicing" of events
      using delimiter events ("markers") works, lets move it to a separate
      header to make it available to other tools, next step will be having
      the switch on/off check done at the PERF_RECORD_SAMPLE processing
      function moved too.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Florian Weimer <fweimer@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: William Cohen <wcohen@redhat.com>
      Link: https://lkml.kernel.org/n/tip-z0cyi9ifzlr37cedr9xztc1k@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      d2360442
    • Arnaldo Carvalho de Melo's avatar
      perf script: Allow specifying event to switch off processing of other events · dd41f660
      Arnaldo Carvalho de Melo authored
      Counterpart of --switch-on:
      
        # perf record -e sched:*,syscalls:sys_*_nanosleep sleep 1
        [ perf record: Woken up 36 times to write data ]
        [ perf record: Captured and wrote 0.032 MB perf.data (10 samples) ]
        #
        # perf script
            :20918 20918 [002] 109866.143696:            sched:sched_waking: comm=perf pid=20919 prio=120 target_cpu=001
            :20918 20918 [002] 109866.143702:            sched:sched_wakeup: perf:20919 [120] success=1 CPU:001
             sleep 20919 [001] 109866.144081:      sched:sched_process_exec: filename=/usr/bin/sleep pid=20919 old_pid=20919
             sleep 20919 [001] 109866.144408:  syscalls:sys_enter_nanosleep: rqtp: 0x7ffc2384fef0, rmtp: 0x00000000
             sleep 20919 [001] 109866.144411:      sched:sched_stat_runtime: comm=sleep pid=20919 runtime=521249 [ns] vruntime=202919398131 [n>
             sleep 20919 [001] 109866.144412:            sched:sched_switch: sleep:20919 [120] S ==> swapper/1:0 [120]
           swapper     0 [001] 109867.144568:            sched:sched_waking: comm=sleep pid=20919 prio=120 target_cpu=001
           swapper     0 [001] 109867.144586:            sched:sched_wakeup: sleep:20919 [120] success=1 CPU:001
             sleep 20919 [001] 109867.144614:   syscalls:sys_exit_nanosleep: 0x0
             sleep 20919 [001] 109867.144753:      sched:sched_process_exit: comm=sleep pid=20919 prio=120
        #
        # perf script --switch-off syscalls:sys_exit_nanosleep
            :20918 20918 [002] 109866.143696:            sched:sched_waking: comm=perf pid=20919 prio=120 target_cpu=001
            :20918 20918 [002] 109866.143702:            sched:sched_wakeup: perf:20919 [120] success=1 CPU:001
             sleep 20919 [001] 109866.144081:      sched:sched_process_exec: filename=/usr/bin/sleep pid=20919 old_pid=20919
             sleep 20919 [001] 109866.144408:  syscalls:sys_enter_nanosleep: rqtp: 0x7ffc2384fef0, rmtp: 0x00000000
             sleep 20919 [001] 109866.144411:      sched:sched_stat_runtime: comm=sleep pid=20919 runtime=521249 [ns] vruntime=202919398131 [n>
             sleep 20919 [001] 109866.144412:            sched:sched_switch: sleep:20919 [120] S ==> swapper/1:0 [120]
           swapper     0 [001] 109867.144568:            sched:sched_waking: comm=sleep pid=20919 prio=120 target_cpu=001
           swapper     0 [001] 109867.144586:            sched:sched_wakeup: sleep:20919 [120] success=1 CPU:001
             sleep 20919 [001] 109867.144753:      sched:sched_process_exit: comm=sleep pid=20919 prio=120
        #
        # perf script --switch-on syscalls:sys_enter_nanosleep --switch-off syscalls:sys_exit_nanosleep
             sleep 20919 [001] 109866.144411:      sched:sched_stat_runtime: comm=sleep pid=20919 runtime=521249 [ns] vruntime=202919398131 [n>
             sleep 20919 [001] 109866.144412:            sched:sched_switch: sleep:20919 [120] S ==> swapper/1:0 [120]
           swapper     0 [001] 109867.144568:            sched:sched_waking: comm=sleep pid=20919 prio=120 target_cpu=001
           swapper     0 [001] 109867.144586:            sched:sched_wakeup: sleep:20919 [120] success=1 CPU:001
        #
        # perf script --switch-on syscalls:sys_enter_nanosleep --switch-off syscalls:sys_exit_nanosleep --show-on-off
             sleep 20919 [001] 109866.144408:  syscalls:sys_enter_nanosleep: rqtp: 0x7ffc2384fef0, rmtp: 0x00000000
             sleep 20919 [001] 109866.144411:      sched:sched_stat_runtime: comm=sleep pid=20919 runtime=521249 [ns] vruntime=202919398131 [n>
             sleep 20919 [001] 109866.144412:            sched:sched_switch: sleep:20919 [120] S ==> swapper/1:0 [120]
           swapper     0 [001] 109867.144568:            sched:sched_waking: comm=sleep pid=20919 prio=120 target_cpu=001
           swapper     0 [001] 109867.144586:            sched:sched_wakeup: sleep:20919 [120] success=1 CPU:001
             sleep 20919 [001] 109867.144614:   syscalls:sys_exit_nanosleep: 0x0
        #
      
      Now think about using this together with 'perf probe' to create custom on/off
      events in your app :-)
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Florian Weimer <fweimer@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: William Cohen <wcohen@redhat.com>
      Link: https://lkml.kernel.org/n/tip-li3j01c4tmj9kw6ydsl8swej@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      dd41f660
    • Arnaldo Carvalho de Melo's avatar
      perf script: Allow showing the --switch-on event · 6469eb6d
      Arnaldo Carvalho de Melo authored
      One may want to see the --switch-on event as well, allow for that, using
      the previous cset example:
      
        # perf script --switch-on syscalls:sys_enter_nanosleep --show-on-off
              sleep 13638 [001] 108237.582286: syscalls:sys_enter_nanosleep: rqtp: 0x7fff1948ac40, rmtp: 0x00000000
              sleep 13638 [001] 108237.582289:     sched:sched_stat_runtime: comm=sleep pid=13638 runtime=578104 [ns] vruntime=202889459556 [ns]
              sleep 13638 [001] 108237.582291:           sched:sched_switch: sleep:13638 [120] S ==> swapper/1:0 [120]
            swapper     0 [001] 108238.582428:           sched:sched_waking: comm=sleep pid=13638 prio=120 target_cpu=001
            swapper     0 [001] 108238.582458:           sched:sched_wakeup: sleep:13638 [120] success=1 CPU:001
              sleep 13638 [001] 108238.582698:     sched:sched_stat_runtime: comm=sleep pid=13638 runtime=173915 [ns] vruntime=202889633471 [ns]
              sleep 13638 [001] 108238.582782:     sched:sched_process_exit: comm=sleep pid=13638 prio=120
        #
        # perf script --switch-on syscalls:sys_enter_nanosleep
              sleep 13638 [001] 108237.582289:     sched:sched_stat_runtime: comm=sleep pid=13638 runtime=578104 [ns] vruntime=202889459556 [ns]
              sleep 13638 [001] 108237.582291:           sched:sched_switch: sleep:13638 [120] S ==> swapper/1:0 [120]
            swapper     0 [001] 108238.582428:           sched:sched_waking: comm=sleep pid=13638 prio=120 target_cpu=001
            swapper     0 [001] 108238.582458:           sched:sched_wakeup: sleep:13638 [120] success=1 CPU:001
              sleep 13638 [001] 108238.582698:     sched:sched_stat_runtime: comm=sleep pid=13638 runtime=173915 [ns] vruntime=202889633471 [ns]
              sleep 13638 [001] 108238.582782:     sched:sched_process_exit: comm=sleep pid=13638 prio=120
        #
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: William Cohen <wcohen@redhat.com>
      Cc: Florian Weimer <fweimer@redhat.com>
      Link: https://lkml.kernel.org/n/tip-0omwwoywj1v63gu8cz0tr0cy@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      6469eb6d
    • Arnaldo Carvalho de Melo's avatar
      perf script: Allow specifying event to switch on processing of other events · f90a2417
      Arnaldo Carvalho de Melo authored
      Sometime we want to only consider events after something happens, so
      allow discarding events till such events is found, e.g.:
      
      Record all scheduler tracepoints and the sys_enter_nanosleep syscall
      event for the 'sleep 1' workload:
      
        # perf record -e sched:*,syscalls:sys_enter_nanosleep sleep 1
        [ perf record: Woken up 31 times to write data ]
        [ perf record: Captured and wrote 0.032 MB perf.data (10 samples) ]
        #
      
      So we have these events in the generated perf data file:
      
        # perf evlist
        sched:sched_kthread_stop
        sched:sched_kthread_stop_ret
        sched:sched_waking
        sched:sched_wakeup
        sched:sched_wakeup_new
        sched:sched_switch
        sched:sched_migrate_task
        sched:sched_process_free
        sched:sched_process_exit
        sched:sched_wait_task
        sched:sched_process_wait
        sched:sched_process_fork
        sched:sched_process_exec
        sched:sched_stat_wait
        sched:sched_stat_sleep
        sched:sched_stat_iowait
        sched:sched_stat_blocked
        sched:sched_stat_runtime
        sched:sched_pi_setprio
        sched:sched_move_numa
        sched:sched_stick_numa
        sched:sched_swap_numa
        sched:sched_wake_idle_without_ipi
        syscalls:sys_enter_nanosleep
        # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events
        #
      
      Then show all of the events that actually took place in this 'perf record' session:
      
        # perf script
                :13637 13637 [002] 108237.581529:            sched:sched_waking: comm=perf pid=13638 prio=120 target_cpu=001
                :13637 13637 [002] 108237.581537:            sched:sched_wakeup: perf:13638 [120] success=1 CPU:001
                 sleep 13638 [001] 108237.581992:      sched:sched_process_exec: filename=/usr/bin/sleep pid=13638 old_pid=13638
                 sleep 13638 [001] 108237.582286:  syscalls:sys_enter_nanosleep: rqtp: 0x7fff1948ac40, rmtp: 0x00000000
                 sleep 13638 [001] 108237.582289:      sched:sched_stat_runtime: comm=sleep pid=13638 runtime=578104 [ns] vruntime=202889459556 [ns]
                 sleep 13638 [001] 108237.582291:            sched:sched_switch: sleep:13638 [120] S ==> swapper/1:0 [120]
               swapper     0 [001] 108238.582428:            sched:sched_waking: comm=sleep pid=13638 prio=120 target_cpu=001
               swapper     0 [001] 108238.582458:            sched:sched_wakeup: sleep:13638 [120] success=1 CPU:001
                 sleep 13638 [001] 108238.582698:      sched:sched_stat_runtime: comm=sleep pid=13638 runtime=173915 [ns] vruntime=202889633471 [ns]
                 sleep 13638 [001] 108238.582782:      sched:sched_process_exit: comm=sleep pid=13638 prio=120
        #
      
      Now lets see only the ones that took place after a certain "marker":
      
        # perf script --switch-on syscalls:sys_enter_nanosleep
                 sleep 13638 [001] 108237.582289:      sched:sched_stat_runtime: comm=sleep pid=13638 runtime=578104 [ns] vruntime=202889459556 [ns]
                 sleep 13638 [001] 108237.582291:            sched:sched_switch: sleep:13638 [120] S ==> swapper/1:0 [120]
               swapper     0 [001] 108238.582428:            sched:sched_waking: comm=sleep pid=13638 prio=120 target_cpu=001
               swapper     0 [001] 108238.582458:            sched:sched_wakeup: sleep:13638 [120] success=1 CPU:001
                 sleep 13638 [001] 108238.582698:      sched:sched_stat_runtime: comm=sleep pid=13638 runtime=173915 [ns] vruntime=202889633471 [ns]
                 sleep 13638 [001] 108238.582782:      sched:sched_process_exit: comm=sleep pid=13638 prio=120
        #
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Florian Weimer <fweimer@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: William Cohen <wcohen@redhat.com>
      Link: https://lkml.kernel.org/n/tip-f1oo0ufdhrkx6nhy2lj1ierm@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      f90a2417
    • Haiyan Song's avatar
      perf vendor events intel: Add Tremontx event file v1.02 · 11e54d35
      Haiyan Song authored
      Add a Intel event file for perf.
      Signed-off-by: default avatarHaiyan Song <haiyanx.song@intel.com>
      Reviewed-by: default avatarKan Liang <kan.liang@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20190815035942.30602-1-haiyanx.song@intel.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      11e54d35
    • Thomas Gleixner's avatar
      Merge tag 'perf-core-for-mingo-5.4-20190814' of... · 4511708b
      Thomas Gleixner authored
      Merge tag 'perf-core-for-mingo-5.4-20190814' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      Pull perf/core improvements and fixes from Arnaldo:
      
      Intel PT:
      
        Adrian Hunter:
      
        - Add PEBS via Intel PT support, the kernel bits went via PeterZ.
      
      perf record:
      
        Alexander Shishkin:
      
        - Add an option to take an AUX snapshot on exit.
      
        Tan Xiaojun:
      
        - Support aarch64 random socket_id assignment, just like was fixed for S/390.
      
      tools:
      
        Andy Shevchenko:
      
        - Keep list of tools in alphabetical order on 'make -C tools help'.
      
      perf session:
      
        Arnaldo Carvalho de Melo:
      
        - Avoid infinite loop when seeing invalid header.size, reported by
          Vince Weaver using a perf.data fuzzer.
      
      Documentation:
      
        Vince Weaver:
      
        - Clarify HEADER_SAMPLE_TOPOLOGY format in the perf.data spec.
      
      perf config:
      
        Arnaldo Carvalho de Melo:
      
        - Honour $PERF_CONFIG env var to specify alternate .perfconfig.
      
      perf test:
      
        Arnaldo Carvalho de Melo:
      
        - Disable ~/.perfconfig to get default output in 'perf trace' tests.
      
      perf top:
      
        Arnaldo Carvalho de Melo:
      
        - Set display thread COMM to help with debugging.
      
        - Collapse and resort evsels in a group, so that we have output
          similar to 'perf report' when using event groups, i.e.
      
            perf top -e '{cycles,instructions}'
      
          Will have two columns, and the instructions one will work.
      
      core:
      
        Igor Lubashev:
      
        - Detect if libcap development files are available so that we
          can use capabilities to match the checks made by the kernel instead
          of using plain (geteuid() == 0).
      
      Intel:
      
        Haiyan Song:
      
        - Add Icelake V1.00 event file.
      
      perf trace:
      
        Leo Yan:
      
        - Fix segmentation fault when access syscall info on arm64.
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      4511708b
  4. 14 Aug, 2019 16 commits