1. 04 Sep, 2015 22 commits
  2. 02 Sep, 2015 9 commits
  3. 01 Sep, 2015 9 commits
    • Stephane Eranian's avatar
      perf tools: Fix link time error with sample_reg_masks on non x86 · af4aeadd
      Stephane Eranian authored
      This patch makes perf compile on non x86 platforms by defining a weak
      symbol for sample_reg_masks[] in util/perf_regs.c.
      
      The patch also moves the REG() and REG_END() macros into the
      util/per_regs.h header file. The macros are renamed to
      SMPL_REG/SMPL_REG_END to avoid clashes with other header files.
      Signed-off-by: default avatarStephane Eranian <eranian@google.com>
      Acked-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1441099814-26783-1-git-send-email-eranian@google.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      af4aeadd
    • Wang Nan's avatar
      perf build: Fix Intel PT instruction decoder dependency problem · 04aa90b5
      Wang Nan authored
      I hit following building error randomly:
      
          ...
        /bin/sh: /path/to/kernel/buildperf/util/intel-pt-decoder/inat-tables.c: No such file or directory
          ...
          LINK     /path/to/kernel/buildperf/plugin_mac80211.so
          LINK     /path/to/kernel/buildperf/plugin_kmem.so
          LINK     /path/to/kernel/buildperf/plugin_xen.so
          LINK     /path/to/kernel/buildperf/plugin_hrtimer.so
        In file included from util/intel-pt-decoder/intel-pt-insn-decoder.c:25:0:
        util/intel-pt-decoder/inat.c:24:25: fatal error: inat-tables.c: No such file or directory
         #include "inat-tables.c"
                                 ^
        compilation terminated.
        make[4]: *** [/path/to/kernel/buildperf/util/intel-pt-decoder/intel-pt-insn-decoder.o] Error 1
        make[4]: *** Waiting for unfinished jobs....
          LINK     /path/to/kernel/buildperf/plugin_function.so
      
      This is caused by tools/perf/util/intel-pt-decoder/Build that, it tries
      to generate $(OUTPUT)util/intel-pt-decoder/inat-tables.c atomatically
      but forget to ensure the existance of $(OUTPUT)util/intel-pt-decoder
      directory.
      
      This patch fixes it by adding $(call rule_mkdir) like other similar rules.
      Signed-off-by: default avatarWang Nan <wangnan0@huawei.com>
      Acked-by: default avatarAdrian Hunter <adrian.hunter@intel.com>
      Acked-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1441087005-107540-1-git-send-email-wangnan0@huawei.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      04aa90b5
    • Wang Nan's avatar
      perf dwarf: Fix potential array out of bounds access · 3b27d139
      Wang Nan authored
      There is a problem in the dwarf-regs.c files for sh, sparc and x86 where
      it is possible to make an out-of-bounds array access when searching for
      register names.
      
      This patch fixes it by replacing '<=' to '<', so when register (number
      == XXX_MAX_REGS), get_arch_regstr() will return NULL.
      Signed-off-by: default avatarWang Nan <wangnan0@huawei.com>
      Reviewed-by: default avatarMatt Fleming <matt@console-pimps.org>
      Acked-by: default avatarJiri Olsa <jolsa@kernel.org>
      Acked-by: default avatarMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@huawei.com
      Link: http://lkml.kernel.org/r/1441078184-105038-1-git-send-email-wangnan0@huawei.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      3b27d139
    • Ingo Molnar's avatar
      Merge tag 'perf-core-for-mingo' of... · 53202661
      Ingo Molnar authored
      Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
      
      Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
      
      User visible changes:
      
        - Add ability to specify to select which registers to record,
          to reduce the size of perf.data files, and also allow printing
          the registers in 'perf script': (Stephane Eranian)
      
            # perf record --intr-regs=AX,SP usleep 1
            [ perf record: Woken up 1 times to write data ]
            [ perf record: Captured and wrote 0.016 MB perf.data (8 samples) ]
            # perf script -F ip,sym,iregs | tail -5
             ffffffff8105f42a native_write_msr_safe   AX:0xf    SP:0xffff8802629c3c00
             ffffffff8105f42a native_write_msr_safe   AX:0xf    SP:0xffff8802629c3c00
             ffffffff81761ac0 _raw_spin_lock   AX:0xffff8801bfcf8020    SP:0xffff8802629c3ce8
             ffffffff81202bf8 __vma_adjust_trans_huge   AX:0x7ffc75200000    SP:0xffff8802629c3b30
             ffffffff8122b089 dput   AX:0x101    SP:0xffff8802629c3c78
            #
      
      Infrastructure changes:
      
        - Open event on evsel cpus and threads. (Kan Liang)
      
        - Add new bpf API to get name from a BPF object. (Wang Nan)
      
      Build fixes:
      
        - Fix build on powerpc broken by pt/bts. (Adrian Hunter)
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      53202661
    • Linus Torvalds's avatar
      Merge branch 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 65a99597
      Linus Torvalds authored
      Pull NOHZ updates from Ingo Molnar:
       "The main changes, mostly written by Frederic Weisbecker, include:
      
         - Fix some jiffies based cputime assumptions.  (No real harm because
           the concerned code isn't used by full dynticks.)
      
         - Simplify jiffies <-> usecs conversions.  Remove dead code.
      
         - Remove early hacks on nohz full code that avoided messing up idle
           nohz internals.  Now nohz integrates well full and idle and such
           hack have become needless.
      
         - Restart nohz full tick from irq exit.  (A simplification and a
           preparation for future optimization on scheduler kick to nohz
           full)
      
         - Code cleanups.
      
         - Tile driver isolation enhancement on top of nohz.  (Chris Metcalf)"
      
      * 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        nohz: Remove useless argument on tick_nohz_task_switch()
        nohz: Move tick_nohz_restart_sched_tick() above its users
        nohz: Restart nohz full tick from irq exit
        nohz: Remove idle task special case
        nohz: Prevent tilegx network driver interrupts
        alpha: Fix jiffies based cputime assumption
        apm32: Fix cputime == jiffies assumption
        jiffies: Remove HZ > USEC_PER_SEC special case
      65a99597
    • Linus Torvalds's avatar
      Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 418c2e1f
      Linus Torvalds authored
      Pull scheduler fix from Ingo Molnar:
       "This is a leftover scheduler fix from the v4.2 cycle"
      
      * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched: Fix cpu_active_mask/cpu_online_mask race
      418c2e1f
    • Linus Torvalds's avatar
      Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · a1d85611
      Linus Torvalds authored
      Pull scheduler updates from Ingo Molnar:
       "The biggest change in this cycle is the rewrite of the main SMP load
        balancing metric: the CPU load/utilization.  The main goal was to make
        the metric more precise and more representative - see the changelog of
        this commit for the gory details:
      
          9d89c257 ("sched/fair: Rewrite runnable load and utilization average tracking")
      
        It is done in a way that significantly reduces complexity of the code:
      
          5 files changed, 249 insertions(+), 494 deletions(-)
      
        and the performance testing results are encouraging.  Nevertheless we
        need to keep an eye on potential regressions, since this potentially
        affects every SMP workload in existence.
      
        This work comes from Yuyang Du.
      
        Other changes:
      
         - SCHED_DL updates.  (Andrea Parri)
      
         - Simplify architecture callbacks by removing finish_arch_switch().
           (Peter Zijlstra et al)
      
         - cputime accounting: guarantee stime + utime == rtime.  (Peter
           Zijlstra)
      
         - optimize idle CPU wakeups some more - inspired by Facebook server
           loads.  (Mike Galbraith)
      
         - stop_machine fixes and updates.  (Oleg Nesterov)
      
         - Introduce the 'trace_sched_waking' tracepoint.  (Peter Zijlstra)
      
         - sched/numa tweaks.  (Srikar Dronamraju)
      
         - misc fixes and small cleanups"
      
      * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (44 commits)
        sched/deadline: Fix comment in enqueue_task_dl()
        sched/deadline: Fix comment in push_dl_tasks()
        sched: Change the sched_class::set_cpus_allowed() calling context
        sched: Make sched_class::set_cpus_allowed() unconditional
        sched: Fix a race between __kthread_bind() and sched_setaffinity()
        sched: Ensure a task has a non-normalized vruntime when returning back to CFS
        sched/numa: Fix NUMA_DIRECT topology identification
        tile: Reorganize _switch_to()
        sched, sparc32: Update scheduler comments in copy_thread()
        sched: Remove finish_arch_switch()
        sched, tile: Remove finish_arch_switch
        sched, sh: Fold finish_arch_switch() into switch_to()
        sched, score: Remove finish_arch_switch()
        sched, avr32: Remove finish_arch_switch()
        sched, MIPS: Get rid of finish_arch_switch()
        sched, arm: Remove finish_arch_switch()
        sched/fair: Clean up load average references
        sched/fair: Provide runnable_load_avg back to cfs_rq
        sched/fair: Remove task and group entity load when they are dead
        sched/fair: Init cfs_rq's sched_entity load average
        ...
      a1d85611
    • Linus Torvalds's avatar
      Merge branch 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 3959df1d
      Linus Torvalds authored
      Pull RAS updates from Ingo Molnar:
       "MCE handling updates, but also some generic drivers/edac/ changes to
        better organize the Kconfig space"
      
      * 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/ras: Move AMD MCE injector to arch/x86/ras/
        x86/mce: Add a wrapper around mce_log() for injection
        x86/mce: Rename rcu_dereference_check_mce() to mce_log_get_idx_check()
        RAS: Add a menuconfig option with descriptive text
        x86/mce: Reenable CMCI banks when swiching back to interrupt mode
        x86/mce: Clear Local MCE opt-in before kexec
        x86/mce: Remove unused function declarations
        x86/mce: Kill drain_mcelog_buffer()
        x86/mce: Avoid potential deadlock due to printk() in MCE context
        x86/mce: Remove the MCE ring for Action Optional errors
        x86/mce: Don't use percpu workqueues
        x86/mce: Provide a lockless memory pool to save error records
        x86/mce: Reuse one of the u16 padding fields in 'struct mce'
      3959df1d
    • Linus Torvalds's avatar
      Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 41d859a8
      Linus Torvalds authored
      Pull perf updates from Ingo Molnar:
       "Main perf kernel side changes:
      
         - uprobes updates/fixes.  (Oleg Nesterov)
      
         - Add PERF_RECORD_SWITCH to indicate context switches and use it in
           tooling.  (Adrian Hunter)
      
         - Support BPF programs attached to uprobes and first steps for BPF
           tooling support.  (Wang Nan)
      
         - x86 generic x86 MSR-to-perf PMU driver.  (Andy Lutomirski)
      
         - x86 Intel PT, LBR and BTS updates.  (Alexander Shishkin)
      
         - x86 Intel Skylake support.  (Andi Kleen)
      
         - x86 Intel Knights Landing (KNL) RAPL support.  (Dasaratharaman
           Chandramouli)
      
         - x86 Intel Broadwell-DE uncore support.  (Kan Liang)
      
         - x86 hw breakpoints robustization (Andy Lutomirski)
      
        Main perf tooling side changes:
      
         - Support Intel PT in several tools, enabling the use of the
           processor trace feature introduced in Intel Broadwell processors:
           (Adrian Hunter)
      
             # dmesg | grep Performance
             # [0.188477] Performance Events: PEBS fmt2+, 16-deep LBR, Broadwell events, full-width counters, Intel PMU driver.
             # perf record -e intel_pt//u -a sleep 1
             [ perf record: Woken up 1 times to write data ]
             [ perf record: Captured and wrote 0.216 MB perf.data ]
             # perf script # then navigate in the tool output to some area, like this one:
             184 1030 dl_main (/usr/lib64/ld-2.17.so) => 7f21ba661440 dl_main (/usr/lib64/ld-2.17.so)
             185 1457 dl_main (/usr/lib64/ld-2.17.so) => 7f21ba669f10 _dl_new_object (/usr/lib64/ld-2.17.so)
             186 9f37 _dl_new_object (/usr/lib64/ld-2.17.so) => 7f21ba677b90 strlen (/usr/lib64/ld-2.17.so)
             187 7ba3 strlen (/usr/lib64/ld-2.17.so) => 7f21ba677c75 strlen (/usr/lib64/ld-2.17.so)
             188 7c78 strlen (/usr/lib64/ld-2.17.so) => 7f21ba669f3c _dl_new_object (/usr/lib64/ld-2.17.so)
             189 9f8a _dl_new_object (/usr/lib64/ld-2.17.so) => 7f21ba65fab0 calloc@plt (/usr/lib64/ld-2.17.so)
             190 fab0 calloc@plt (/usr/lib64/ld-2.17.so) => 7f21ba675e70 calloc (/usr/lib64/ld-2.17.so)
             191 5e87 calloc (/usr/lib64/ld-2.17.so) => 7f21ba65fa90 malloc@plt (/usr/lib64/ld-2.17.so)
             192 fa90 malloc@plt (/usr/lib64/ld-2.17.so) => 7f21ba675e60 malloc (/usr/lib64/ld-2.17.so)
             193 5e68 malloc (/usr/lib64/ld-2.17.so) => 7f21ba65fa80 __libc_memalign@plt (/usr/lib64/ld-2.17.so)
             194 fa80 __libc_memalign@plt (/usr/lib64/ld-2.17.so) => 7f21ba675d50 __libc_memalign (/usr/lib64/ld-2.17.so)
             195 5d63 __libc_memalign (/usr/lib64/ld-2.17.so) => 7f21ba675e20 __libc_memalign (/usr/lib64/ld-2.17.so)
             196 5e40 __libc_memalign (/usr/lib64/ld-2.17.so) => 7f21ba675d73 __libc_memalign (/usr/lib64/ld-2.17.so)
             197 5d97 __libc_memalign (/usr/lib64/ld-2.17.so) => 7f21ba675e18 __libc_memalign (/usr/lib64/ld-2.17.so)
             198 5e1e __libc_memalign (/usr/lib64/ld-2.17.so) => 7f21ba675df9 __libc_memalign (/usr/lib64/ld-2.17.so)
             199 5e10 __libc_memalign (/usr/lib64/ld-2.17.so) => 7f21ba669f8f _dl_new_object (/usr/lib64/ld-2.17.so)
             200 9fc2 _dl_new_object (/usr/lib64/ld-2.17.so) =>  7f21ba678e70 memcpy (/usr/lib64/ld-2.17.so)
             201 8e8c memcpy (/usr/lib64/ld-2.17.so) => 7f21ba678ea0 memcpy (/usr/lib64/ld-2.17.so)
      
         - Add support for using several Intel PT features (CYC, MTC packets),
           the relevant documentation was updated in:
               tools/perf/Documentation/intel-pt.txt
           briefly describing those packets, its purposes, how to configure
           them in the event config terms and relevant external documentation
           for further reading.  (Adrian Hunter)
      
         - Introduce support for probing at an absolute address, for user and
           kernel 'perf probe's, useful when one have the symbol maps on a
           developer machine but not on an embedded system.  (Wang Nan)
      
         - Add Intel BTS support, with a call-graph script to show it and PT
           in use in a GUI using 'perf script' python scripting with
           postgresql and Qt.  (Adrian Hunter)
      
         - Allow selecting the type of callchains per event, including
           disabling callchains in all but one entry in an event list, to save
           space, and also to ask for the callchains collected in one event to
           be used in other events.  (Kan Liang)
      
         - Beautify more syscall arguments in 'perf trace': (Arnaldo Carvalho
           de Melo)
             * A bunch more translate file/pathnames from pointers to strings.
             * Convert numbers to strings for the 'keyctl' syscall 'option'
               arg.
             * Add missing 'clockid' entries.
      
         - Introduce 'srcfile' sort key: (Andi Kleen)
      
             # perf record -F 10000 usleep 1
             # perf report --stdio --dsos '[kernel.vmlinux]' -s srcfile
             <SNIP>
             # Overhead  Source File
                26.49%  copy_page_64.S
                 5.49%  signal.c
                 0.51%  msr.h
             #
      
           It can be combined with other fields, for instance, experiment with
           '-s srcfile,symbol'.
      
           There are some oddities in some distros and with some specific
           DSOs, being investigated, so your mileage may vary.
      
         - Support per-event 'freq' term: (Namhyung Kim)
      
             $ perf record -e 'cpu/instructions,freq=1234/',cycles -c 1000 sleep 1
             $ perf evlist -F
             cpu/instructions,freq=1234/: sample_freq=1234
             cycles: sample_period=1000
             $
      
         - Deref sys_enter pointer args with contents from probe:vfs_getname,
           showing pathnames instead of pointers in many syscalls in 'perf
           trace'.  (Arnaldo Carvalho de Melo)
      
         - Stop collecting /proc/kallsyms in perf.data files, saving about
           4.5MB on a typical x86-64 system, use the the symbol resolution
           routines used in all the other tools (report, top, etc) now that we
           can ask libtraceevent to use perf's symbol resolution code.
           (Arnaldo Carvalho de Melo)
      
         - Allow filtering out of perf's PID via 'perf record --exclude-perf'.
           (Wang Nan)
      
         - 'perf trace' now supports syscall groups, like strace, i.e:
      
             $ trace -e file touch file
      
           Will expand 'file' into multiple, file related, syscalls.  More
           work needed to add extra groups for other syscall groups, and also
           to complement what was added for the 'file' group, included as a
           proof of concept.  (Arnaldo Carvalho de Melo)
      
         - Add lock_pi stresser to 'perf bench futex', to test the kernel code
           related to FUTEX_(UN)LOCK_PI.  (Davidlohr Bueso)
      
         - Let user have timestamps with per-thread recording in 'perf record'
           (Adrian Hunter)
      
         - ... and tons of other changes, see the shortlog and the Git log for
           details"
      
      * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (240 commits)
        perf evlist: Add backpointer for perf_env to evlist
        perf tools: Rename perf_session_env to perf_env
        perf tools: Do not change lib/api/fs/debugfs directly
        perf tools: Add tracing_path and remove unneeded functions
        perf buildid: Introduce sysfs/filename__sprintf_build_id
        perf evsel: Add a backpointer to the evlist a evsel is in
        perf trace: Add header with copyright and background info
        perf scripts python: Add new compaction-times script
        perf stat: Get correct cpu id for print_aggr
        tools lib traceeveent: Allow for negative numbers in print format
        perf script: Add --[no-]-demangle/--[no-]-demangle-kernel
        tracing/uprobes: Do not print '0x (null)' when offset is 0
        perf probe: Support probing at absolute address
        perf probe: Fix error reported when offset without function
        perf probe: Fix list result when address is zero
        perf probe: Fix list result when symbol can't be found
        tools build: Allow duplicate objects in the object list
        perf tools: Remove export.h from MANIFEST
        perf probe: Prevent segfault when reading probe point with absolute address
        perf tools: Update Intel PT documentation
        ...
      41d859a8