1. 05 Oct, 2015 4 commits
  2. 03 Oct, 2015 1 commit
    • Merge tag 'perf-core-for-mingo' of... · e3b0ac1b
      Ingo Molnar authored
      Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
      
      User visible changes:
      
       - Do event name substring search as last resort in 'perf list'.
         (Arnaldo Carvalho de Melo)
      
         E.g.:
      
          # perf list clock
      
          List of pre-defined events (to be used in -e):
      
           cpu-clock                                          [Software event]
           task-clock                                         [Software event]
      
           uncore_cbox_0/clockticks/                          [Kernel PMU event]
           uncore_cbox_1/clockticks/                          [Kernel PMU event]
      
           kvm:kvm_pvclock_update                             [Tracepoint event]
           kvm:kvm_update_master_clock                        [Tracepoint event]
           power:clock_disable                                [Tracepoint event]
           power:clock_enable                                 [Tracepoint event]
           power:clock_set_rate                               [Tracepoint event]
           syscalls:sys_enter_clock_adjtime                   [Tracepoint event]
           syscalls:sys_enter_clock_getres                    [Tracepoint event]
           syscalls:sys_enter_clock_gettime                   [Tracepoint event]
           syscalls:sys_enter_clock_nanosleep                 [Tracepoint event]
           syscalls:sys_enter_clock_settime                   [Tracepoint event]
           syscalls:sys_exit_clock_adjtime                    [Tracepoint event]
           syscalls:sys_exit_clock_getres                     [Tracepoint event]
           syscalls:sys_exit_clock_gettime                    [Tracepoint event]
           syscalls:sys_exit_clock_nanosleep                  [Tracepoint event]
           syscalls:sys_exit_clock_settime                    [Tracepoint event]
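The "last resort" behaviour shown above can be sketched roughly as follows. This is a Python illustration only, not perf's C implementation: the function name `select_events` and the sample `EVENTS` list are assumptions made for the example. The pattern is first tried as a glob; only if nothing matches is it retried as a substring.

```python
from fnmatch import fnmatchcase

def select_events(names, pattern):
    """Return names matching `pattern` as a glob; if none match, retry as a substring."""
    matched = [n for n in names if fnmatchcase(n, pattern)]
    if matched:
        return matched
    # Last resort: wrap the pattern in wildcards so it matches anywhere in the name.
    return [n for n in names if fnmatchcase(n, "*" + pattern + "*")]

# Hypothetical event list, loosely modelled on the output above.
EVENTS = ["cpu-clock", "task-clock", "cpu-cycles",
          "power:clock_enable", "sched:sched_switch"]

print(select_events(EVENTS, "clock"))
print(select_events(EVENTS, "cpu-*"))
```

Trying the glob first keeps explicit patterns such as `cpu-*` precise, while a bare word such as `clock` still finds something.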
      
       - Reduce min 'perf stat --interval-print/-I' to 10ms. (Kan Liang)
      
         perf stat --interval in action:
      
         # perf stat -e cycles -I 50 -a usleep $((200 * 1000))
         print interval < 100ms. The overhead percentage could be high in some cases. Please proceed with caution.
         #   time                    counts unit events
            0.050233636         48,240,396      cycles
            0.100557098         35,492,594      cycles
            0.150804687         39,295,112      cycles
            0.201032269         33,101,961      cycles
            0.201980732            786,379      cycles
        #
      
       - Allow for max_stack greater than PERF_MAX_STACK_DEPTH, as when
         synthesizing callchains from Intel PT data. (Adrian Hunter)
      
       - Allow probing on kmodules without DWARF. (Masami Hiramatsu)
      
       - Fix a segfault when processing a perf.data file with callchains using
         "perf report --call-graph none". (Namhyung Kim)
      
       - Fix unresolved COMMs in 'perf top' when -s comm is used. (Namhyung Kim)
      
       - Register idle thread in 'perf top'. (Namhyung Kim)
      
       - Change 'record.samples' type to unsigned long long, fixing output of
         number of samples in 32-bit architectures. (Yang Shi)
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  3. 02 Oct, 2015 4 commits
    • perf stat: Reduce min --interval-print to 10ms · 19afd104
      Kan Liang authored
      The --interval-print parameter was limited to 100ms. However, for
      example, 10ms is required to do sophisticated bandwidth analysis using
      uncore events.
      
      The test shows that the overhead of the system-wide uncore monitoring
      with 10ms interval is only ~2%. So this patch reduces the minimal
      interval-print allowed to 10ms.
      
      But 10ms may not work well for all cases. For example, when the
      cpus/threads number is very large, for system-wide core event monitoring
      the overhead could be high.
      
      To handle this issue, a warning is displayed when the interval-print
      is set to a value from 10ms up to, but not including, 100ms, so users
      can make a decision according to their specific cases.
      
       # perf stat -e uncore_imc_1/cas_count_read/ -a --interval-print 10 -- sleep 1
      
       print interval < 100ms. The overhead percentage could be high in some
       cases. Please proceed with caution.
       #           time             counts unit events
            0.010200451               0.10 MiB  uncore_imc_1/cas_count_read/
            0.020475117               0.02 MiB  uncore_imc_1/cas_count_read/
            0.030692800               0.01 MiB  uncore_imc_1/cas_count_read/
            0.040948161               0.02 MiB  uncore_imc_1/cas_count_read/
            0.051159564               0.00 MiB  uncore_imc_1/cas_count_read/
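The interval policy described in this commit can be sketched as below. This is a minimal Python model, not the perf source: the names `MIN_INTERVAL_MS`, `WARN_BELOW_MS` and `check_interval` are hypothetical, and the sketch assumes a positive interval was requested (how perf treats an unset or zero interval is not covered here). The warning text is copied from the output above.

```python
MIN_INTERVAL_MS = 10   # new lower bound from the commit message
WARN_BELOW_MS = 100    # previous lower bound; shorter intervals now only warn

WARNING = ("print interval < 100ms. The overhead percentage could be high in some "
           "cases. Please proceed with caution.")

def check_interval(interval_ms):
    """Return (accepted, warning_or_None) for a requested print interval in ms."""
    if interval_ms < MIN_INTERVAL_MS:
        return False, None
    if interval_ms < WARN_BELOW_MS:
        return True, WARNING
    return True, None

for ms in (5, 10, 50, 100):
    print(ms, check_interval(ms))
```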
      Signed-off-by: Kan Liang <kan.liang@intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: http://lkml.kernel.org/r/1443776674-42511-1-git-send-email-kan.liang@intel.com
      [ Added warning about overhead when using sub 100ms intervals to the man page ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf record: Change 'record.samples' type to unsigned long long · 9f065194
      Yang Shi authored
      When running "perf record -e", the number of samples shown is wrong on some
      32-bit systems, i.e. powerpc and arm.
      
      For example, run the below commands on 32 bit powerpc:
      
        perf probe -x /lib/libc.so.6 malloc
        perf record -e probe_libc:malloc -a ls perf.data
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.036 MB perf.data (13829241621624967218 samples) ]
      
      In fact, "perf script" shows only 21 samples. The reported count is
      absurd because 'samples' has type long but is printed with PRIu64.
      
      Build test ran on x86-64, x86, aarch64, arm, mips, ppc and ppc64.
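A rough model of the mismatch, as a sketch rather than perf's own code (perf is written in C; Python is used here only for illustration). On a 32-bit system a `long` occupies 4 bytes while a PRIu64 conversion consumes 8, so the formatted value typically mixes the real count with an unrelated adjacent word. The word order and the `STRAY_WORD` value are assumptions chosen for the demonstration, not data from the report.

```python
import struct

def read_as_u64(first_word, second_word):
    """Model a 64-bit read over two adjacent 32-bit big-endian words."""
    raw = struct.pack(">II", first_word, second_word)
    return struct.unpack(">Q", raw)[0]

def widen_then_read(samples):
    """Model the fix: the count is stored as 64 bits before it is formatted."""
    raw = struct.pack(">Q", samples)
    return struct.unpack(">Q", raw)[0]

STRAY_WORD = 0xBFEB1234  # hypothetical neighbouring word, not taken from the report

print(read_as_u64(STRAY_WORD, 21))  # a huge number unrelated to the real count
print(widen_then_read(21))          # the real count
```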
      Signed-off-by: Yang Shi <yang.shi@linaro.org>
      Cc: linaro-kernel@lists.linaro.org
      Link: http://lkml.kernel.org/r/1443563383-4064-1-git-send-email-yang.shi@linaro.org
      [ Bumped the 'hits' var used together with record.samples to 'unsigned long long' too ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf probe: Allow probing on kmodules without dwarf · 1a8ac29c
      Masami Hiramatsu authored
      Allow probing on kernel modules when 'perf' is built without debuginfo
      support.
      
      Currently perf-probe --module requires linking with libdw, but this
      doesn't make sense.
      
      E.g.
        ----
        # make NO_DWARF=1
        # ./perf probe -m pcspkr pcspkr_event%return
          Error: unknown switch `m'
        ----
      
      With this patch
        ----
        # ./perf probe -m pcspkr pcspkr_event%return
        Added new event:
          probe:pcspkr_event   (on pcspkr_event%return in pcspkr)
      
        You can now use it in all perf tools, such as:
      
                perf record -e probe:pcspkr_event -aR sleep 1
        ----
      Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/20151002125832.18617.78721.stgit@localhost.localdomain
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf list: Honour 'event_glob' when printing selectable PMUs · fa52ceab
      Arnaldo Carvalho de Melo authored
      Some PMUs, like the 'intel_bts' one can be used as an event name, i.e.:
      
      	$ perf record -e intel_bts// usleep 1
      
      Is a valid event name.
      
      But the code printing such PMUs was not honouring the 'event_glob'
      parameter, so the following line was always appearing:
      
        $ intel_bts//                                        [Kernel PMU event]
      
      Fix it:
      
        $ [acme@felicio linux]$ perf list data
      
        List of pre-defined events (to be used in -e):
      
          uncore_imc/data_reads/                             [Kernel PMU event]
          uncore_imc/data_writes/                            [Kernel PMU event]
      
        $
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-ajb71858n7q7ao77b8pyy74w@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  4. 01 Oct, 2015 8 commits
  5. 30 Sep, 2015 16 commits
  6. 29 Sep, 2015 1 commit
    • Merge tag 'perf-core-for-mingo' of... · 9c17dbc6
      Ingo Molnar authored
      Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
      
        - Accept a zero --itrace period, meaning "as often as possible".  In the case
          of Intel PT that is the same as a period of 1 and a unit of 'instructions'
          (i.e.  --itrace=i1i). (Adrian Hunter)
      
        - Harmonize itrace's synthesized callchains with the existing --max-stack
          tool option. (Adrian Hunter)
      
        - Allow time to be displayed in nanoseconds in 'perf script'. (Adrian Hunter)
      
        - Fix potential infinite loop when handling Intel PT timestamps. (Adrian Hunter)
      
        - Slightly improve Intel PT debug logging. (Adrian Hunter)
      
        - Warn when AUX data has been lost, just like when processing PERF_RECORD_LOST.
          (Adrian Hunter)
      
        - Further document export-to-postgresql.py script. (Adrian Hunter)
      
        - Add option to synthesize branch stack from auxtrace data. (Adrian Hunter)
      
        - Use equivalent logic to avoid using dso->kernel. (Arnaldo Carvalho de Melo)
      
        - Show proper error messages when parsing bad terms for hw/sw events. (He Kuang)
      
        - Tracepoint event parsing improvements. (He Kuang)
      
        - Store tracing mountpoint for better error message. (Jiri Olsa)
      
        - Add fixdep to tools/build, bringing it closer to the kernel counterpart, from
          where it is being lifted. (Jiri Olsa)
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  7. 28 Sep, 2015 6 commits
    • perf tools: Enable event_config terms to tracepoint events · e637d177
      He Kuang authored
      This patch enables config terms for tracepoint perf events. Valid terms
      for tracepoint events are 'call-graph' and 'stack-size', so we can use
      different callgraph settings for each event and eliminate unnecessary
      overhead.
      
      Here is an example for using different call-graph config for each
      tracepoint.
      
        $ perf record -e syscalls:sys_enter_write/call-graph=fp/ \
                      -e syscalls:sys_exit_write/call-graph=no/ \
                      dd if=/dev/zero of=test bs=4k count=10
      
        $ perf report --stdio
      
        #
        # Total Lost Samples: 0
        #
        # Samples: 13  of event 'syscalls:sys_enter_write'
        # Event count (approx.): 13
        #
        # Children      Self  Command  Shared Object       Symbol
        # ........  ........  .......  ..................  ......................
        #
            76.92%    76.92%  dd       libpthread-2.20.so  [.] __write_nocancel
                         |
                         ---__write_nocancel
      
            23.08%    23.08%  dd       libc-2.20.so        [.] write
                         |
                         ---write
                            |
                            |--33.33%-- 0x2031342820736574
                            |
                            |--33.33%-- 0xa6e69207364726f
                            |
                             --33.33%-- 0x34202c7320393039
        ...
      
        # Samples: 13  of event 'syscalls:sys_exit_write'
        # Event count (approx.): 13
        #
        # Children      Self  Command  Shared Object       Symbol
        # ........  ........  .......  ..................  ......................
        #
            76.92%    76.92%  dd       libpthread-2.20.so  [.] __write_nocancel
            23.08%    23.08%  dd       libc-2.20.so        [.] write
             7.69%     0.00%  dd       [unknown]           [.] 0x0a6e69207364726f
             7.69%     0.00%  dd       [unknown]           [.] 0x2031342820736574
             7.69%     0.00%  dd       [unknown]           [.] 0x34202c7320393039
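To make the `name/term=value/` notation used in the command above concrete, here is a small Python sketch that splits an event specification into its name and per-event terms. It is an illustration only: perf parses events with a flex/bison grammar, and the helper name `parse_event` is hypothetical.

```python
def parse_event(spec):
    """Split 'name/term=value,.../' into (name, {term: value})."""
    name, sep, rest = spec.partition("/")
    terms = {}
    if sep:
        for item in rest.rstrip("/").split(","):
            if item:
                key, _, value = item.partition("=")
                terms[key] = value
    return name, terms

print(parse_event("syscalls:sys_enter_write/call-graph=fp/"))
print(parse_event("syscalls:sys_exit_write/call-graph=no/"))
```

Each event carries its own `call-graph` setting, which is what lets one event record callchains while the other skips them.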
      Signed-off-by: He Kuang <hekuang@huawei.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1443412336-120050-4-git-send-email-hekuang@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Adds the tracepoint name parsing support · 865582c3
      He Kuang authored
      Add rules for parsing tracepoint names. Change the tracepoint rules that
      were derived from PE_NAMEs to use tracepoint names directly, so that adding
      more rules based on tracepoint names will be easier.
      
      Changes v2-v3:
         - Change __event_legacy_tracepoint label in bison file to tracepoint_name
         - Fix formats error.
      Signed-off-by: He Kuang <hekuang@huawei.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1443412336-120050-3-git-send-email-hekuang@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Show proper error message for wrong terms of hw/sw events · ffeb883e
      He Kuang authored
      Show a proper error message, including the valid terms, when wrong config
      terms are specified for hw/sw type perf events.
      
      This patch makes the original error format function formats_error_string()
      more generic: it outputs only the static config terms for hw/sw perf
      events, and prepends the pmu formats for pmu events.
      
      Before this patch:
      
        $ perf record -e 'cpu-clock/freqx=200/' -a sleep 1
        invalid or unsupported event: 'cpu-clock/freqx=200/'
        Run 'perf list' for a list of valid events
      
         usage: perf record [<options>] [<command>]
            or: perf record [<options>] -- <command> [<options>]
      
            -e, --event <event>   event selector. use 'perf list' to list available events
      
      After this patch:
      
        $ perf record -e 'cpu-clock/freqx=200/' -a sleep 1
        event syntax error: 'cpu-clock/freqx=200/'
                                       \___ unknown term
      
        valid terms: config,config1,config2,name,period,freq,branch_type,time,call-graph,stack-size
      
        Run 'perf list' for a list of valid events
      
         usage: perf record [<options>] [<command>]
            or: perf record [<options>] -- <command> [<options>]
      
            -e, --event <event>   event selector. use 'perf list' to list available events
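The layout of the new message, with the marker placed under the offending term, can be reproduced with a short Python sketch. This is a model of the output format only, not perf's error-reporting code; the function name `format_term_error` is hypothetical, and the message strings are copied from the "After this patch" output above.

```python
def format_term_error(event, term, valid_terms):
    """Build the syntax-error text with a marker under the unknown term."""
    prefix = "event syntax error: '"
    idx = event.index(term)  # column of the bad term inside the event string
    lines = [
        prefix + event + "'",
        " " * (len(prefix) + idx) + "\\___ unknown term",
        "",
        "valid terms: " + ",".join(valid_terms),
    ]
    return "\n".join(lines)

print(format_term_error("cpu-clock/freqx=200/", "freqx",
                        ["config", "config1", "config2", "name", "period", "freq",
                         "branch_type", "time", "call-graph", "stack-size"]))
```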
      Signed-off-by: He Kuang <hekuang@huawei.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1443412336-120050-2-git-send-email-hekuang@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf tools: Adds the config_term callback for different type events · 0b8891a8
      He Kuang authored
      Currently, the function config_term() is used for checking config terms of
      all types of events, and unknown terms are not reported as an error
      because pmu events have valid terms in sysfs.
      
      But this is wrong when unknown terms are specified for hw/sw events.
      This patch adds the config_term callback so we can use separate check
      routines for each type of event.
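The callback idea can be sketched as follows, in Python rather than perf's C and with hypothetical names (`check_term_common`, `check_term_pmu`, `check_terms`). The static term list is the one printed in the "valid terms" line of the related error-message commit; everything else is an assumption made for illustration.

```python
STATIC_TERMS = ("config", "config1", "config2", "name", "period", "freq",
                "branch_type", "time", "call-graph", "stack-size")

def check_term_common(term):
    """hw/sw events: only the static terms are valid."""
    return term in STATIC_TERMS

def check_term_pmu(term):
    """pmu events: further terms may come from sysfs, so unknown names pass here."""
    return True

def check_terms(terms, check):
    """Return the first term rejected by the `check` callback, or None."""
    for term in terms:
        if not check(term):
            return term
    return None

print(check_terms(["freqx"], check_term_common))
print(check_terms(["freqx"], check_term_pmu))
```

Passing the checker as a parameter keeps one walking routine while letting each event type decide what counts as an unknown term.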
      Signed-off-by: He Kuang <hekuang@huawei.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Wang Nan <wangnan0@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1443412336-120050-1-git-send-email-hekuang@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf intel-pt: Add mispred-all config option to aid use with autofdo · ba11ba65
      Adrian Hunter authored
      autofdo incorrectly expects branch flags to include either mispred or
      predicted.  In fact mispred = predicted = 0 is valid and means the flags
      are not supported, which they aren't by Intel PT.
      
      To make autofdo work, add a config option which will cause Intel PT
      decoder to set the mispred flag on all branches.
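The effect of the option on synthesized branch flags can be modelled with a short Python sketch. The `BranchFlags` class and `synth_branch_flags` function are hypothetical stand-ins for illustration, not the decoder's actual data structures.

```python
from dataclasses import dataclass

@dataclass
class BranchFlags:
    mispred: bool = False
    predicted: bool = False

def synth_branch_flags(mispred_all):
    """Flags for a branch synthesized from a trace that carries no prediction info."""
    flags = BranchFlags()  # mispred = predicted = 0 means "not supported"
    if mispred_all:
        # Workaround for consumers that expect one of the two flags to be set.
        flags.mispred = True
    return flags

print(synth_branch_flags(False))
print(synth_branch_flags(True))
```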
      
      Below is an example of using Intel PT with autofdo.  The example is
      also added to the Intel PT documentation.  It requires autofdo
      (https://github.com/google/autofdo) and gcc version 5.  The bubble
      sort example is from the AutoFDO tutorial (https://gcc.gnu.org/wiki/AutoFDO/Tutorial)
      amended to take the number of elements as a parameter.
      
      	$ gcc-5 -O3 sort.c -o sort_optimized
      	$ ./sort_optimized 30000
      	Bubble sorting array of 30000 elements
      	2254 ms
      
      	$ cat ~/.perfconfig
      	[intel-pt]
      		mispred-all
      
      	$ perf record -e intel_pt//u ./sort 3000
      	Bubble sorting array of 3000 elements
      	58 ms
      	[ perf record: Woken up 2 times to write data ]
      	[ perf record: Captured and wrote 3.939 MB perf.data ]
      	$ perf inject -i perf.data -o inj --itrace=i100usle --strip
      	$ ./create_gcov --binary=./sort --profile=inj --gcov=sort.gcov -gcov_version=1
      	$ gcc-5 -O3 -fauto-profile=sort.gcov sort.c -o sort_autofdo
      	$ ./sort_autofdo 30000
      	Bubble sorting array of 30000 elements
      	2155 ms
      
      Note there is currently no advantage to using Intel PT instead of LBR,
      but that may change in the future if greater use is made of the data.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/1443186956-18718-26-git-send-email-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf inject: Add --strip option to strip out non-synthesized events · f56fb986
      Adrian Hunter authored
      Add a new option --strip which is used with --itrace to strip out
      non-synthesized events.  This results in a perf.data file that is
      simpler for external tools to parse.  In particular, this can be used to
      prepare a perf.data file for consumption by autofdo.
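Conceptually, stripping amounts to a filter over the recorded events. The Python sketch below only illustrates that idea: the dictionary layout, the `synthesized` flag and the event names are assumptions, and perf's real implementation operates on its own event list structures.

```python
def strip_events(events):
    """Keep only events flagged as synthesized; drop everything else."""
    return [ev for ev in events if ev.get("synthesized")]

# Hypothetical event list: one recorded trace event plus two synthesized ones.
recorded = [
    {"name": "intel_pt//u", "synthesized": False},
    {"name": "instructions", "synthesized": True},
    {"name": "branches", "synthesized": True},
]

print([ev["name"] for ev in strip_events(recorded)])
```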
      
      A subsequent patch makes a change to Intel PT also to enable use with
      autofdo and gives an example of that use.
      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/1443186956-18718-25-git-send-email-adrian.hunter@intel.com
      [ Made it use perf_evlist__remove() + perf_evsel__delete() ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>