1. 17 Sep, 2015 2 commits
    • Arnaldo Carvalho de Melo's avatar
      tools build: Add test for presence of numa_num_possible_cpus() in libnuma · f8ac8606
      Arnaldo Carvalho de Melo authored
      The existing numa test checks only if numa.h and numa_available() are
      present, but that can be satisfied with an old libnuma that is not
      enough for the 'perf bench numa' entry, so add a test to check for that:
      
        [acme@rhel5 linux]$  make NO_AUXTRACE=1 NO_LIBPERL=1 -C tools/perf O=/tmp/build/perf install-bin
        make: Entering directory `/home/acme/git/linux/tools/perf'
          BUILD:   Doing 'make -j2' parallel build
      
        Auto-detecting system features:
        ...                        libelf: [ on  ]
        ...                       libnuma: [ on  ]
        ...        numa_num_possible_cpus: [ OFF ]
        ...                       libperl: [ on  ]
      
        <SNIP>
        config/Makefile:577: Old numa library found, disables 'perf bench numa mem' benchmark, please install numactl-devel/libnuma-devel/libnuma-dev >= 2.0.8
          INSTALL  binaries
        <SNIP>
      
      This fixes the build on old systems such as RHEL/CentOS 5.11.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Victor Kamensky <victor.kamensky@linaro.org>
      Cc: Vinson Lee <vlee@twopensource.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-zqriqkezppi2de2iyjin1tnc@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      f8ac8606
    • Arnaldo Carvalho de Melo's avatar
      Revert "perf symbols: Fix mismatched declarations for elf_getphdrnum" · 179f36dd
      Arnaldo Carvalho de Melo authored
      This reverts commit f785f235.
      
      We have a test to check if elf_getphdrnum() is present, so, if it fails,
      we'll get:
      
        [acme@rhel5 linux]$ cat /tmp/build/perf/feature/test-libelf-getphdrnum.make.output
        cc1: warnings being treated as errors
        test-libelf-getphdrnum.c: In function ‘main’:
        test-libelf-getphdrnum.c:7: warning: implicit declaration of function ‘elf_getphdrnum’
        [acme@rhel5 linux]$
      
      And this block will not be compiled:
      
        #ifndef HAVE_ELF_GETPHDRNUM_SUPPORT
        static int elf_getphdrnum(Elf *elf, size_t *dst)
        ...
        #endif
      
      So, if elf_getphdrnum() is being defined somewhere, there is a problem
      with the test that is not detecting that function, go fix it.
      Reported-by: default avatarVinson Lee <vlee@twopensource.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Victor Kamensky <victor.kamensky@linaro.org>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/n/tip-qn459fal6acvcvm50i8zxx9k@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      179f36dd
  2. 16 Sep, 2015 2 commits
    • Stephane Eranian's avatar
      perf stat: Fix per-pkg event reporting bug · 02d8dabc
      Stephane Eranian authored
      Per-pkg events need to be captured once per processor socket. The code
      in check_per_pkg() ensures only one value per processor package is used.
      However there is a problem with this function in case the first CPU of
      the package does not measure anything for the per-pkg event, but other
      CPUs do.
      
      Consider the following:
      
        $ create cgroup FOO; echo $$ >FOO/tasks; taskset -c 1 noploop &
        $ perf stat -a -I 1000 -e intel_cqm/llc_occupancy/ -G FOO sleep 100
          1.00000 <not counted> Bytes intel_cqm/llc_occupancy/  FOO
      
      The reason for this is that CPU0 in the cgroup has nothing running on it.
      Yet check_per_plg() will mark socket0 as processed and no other event
      value will be considered for the socket.
      
      This patch fixes the problem by having check_per_pkg() only consider
      events which actually ran.
      Signed-off-by: default avatarStephane Eranian <eranian@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1441286620-10117-1-git-send-email-eranian@google.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      02d8dabc
    • Ingo Molnar's avatar
      Merge tag 'perf-urgent-for-mingo' of... · f6cf87f7
      Ingo Molnar authored
      Merge tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
      
      Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
      
      - Fix segfault pressing -> in 'perf top' with no hist entries. (Wang Nan)
      
         E.g:
      	perf top -e page-faults --pid 11400 # 11400 generates no page-fault
      
      - Fix propagation of thread and cpu maps, that got broken when doing incomplete
        changes to better support events with a PMU cpu mask, leading to Intel PT to
        fail with an error like:
      
          $ perf record -e intel_pt//u uname
          Error: The sys_perf_event_open() syscall returned with
                    22 (Invalid argument) for event (sched:sched_switch).
      
        Because intel_pt adds that sched:sched_switch evsel to the evlist after the
        thread/cpu maps were propagated to the evsels, fix it. (Adrian Hunter)
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      f6cf87f7
  3. 15 Sep, 2015 14 commits
  4. 14 Sep, 2015 2 commits
  5. 13 Sep, 2015 2 commits
    • Arnaldo Carvalho de Melo's avatar
      perf header: Fixup reading of HEADER_NRCPUS feature · caa47047
      Arnaldo Carvalho de Melo authored
      The original patch introducing this header wrote the number of CPUs available
      and online in one order and then swapped those values when reading, fix it.
      
      Before:
      
        # perf record usleep 1
        # perf report --header-only | grep 'nrcpus \(online\|avail\)'
        # nrcpus online : 4
        # nrcpus avail : 4
        # echo 0 > /sys/devices/system/cpu/cpu2/online
        # perf record usleep 1
        # perf report --header-only | grep 'nrcpus \(online\|avail\)'
        # nrcpus online : 4
        # nrcpus avail : 3
        # echo 0 > /sys/devices/system/cpu/cpu1/online
        # perf record usleep 1
        # perf report --header-only | grep 'nrcpus \(online\|avail\)'
        # nrcpus online : 4
        # nrcpus avail : 2
      
      After the fix, bringing back the CPUs online:
      
        # perf report --header-only | grep 'nrcpus \(online\|avail\)'
        # nrcpus online : 2
        # nrcpus avail : 4
        # echo 1 > /sys/devices/system/cpu/cpu2/online
        # perf record usleep 1
        # perf report --header-only | grep 'nrcpus \(online\|avail\)'
        # nrcpus online : 3
        # nrcpus avail : 4
        # echo 1 > /sys/devices/system/cpu/cpu1/online
        # perf record usleep 1
        # perf report --header-only | grep 'nrcpus \(online\|avail\)'
        # nrcpus online : 4
        # nrcpus avail : 4
      Acked-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Fixes: fbe96f29 ("perf tools: Make perf.data more self-descriptive (v8)")
      Link: http://lkml.kernel.org/r/20150911153323.GP23511@kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      caa47047
    • Peter Zijlstra's avatar
      perf/x86/intel: Fix constraint access · ebfb4988
      Peter Zijlstra authored
      Sasha reported that we can get here with .idx==-1, and
      cpuc->event_constraints unallocated.
      Suggested-by: default avatarStephane Eranian <eranian@google.com>
      Reported-by: default avatarSasha Levin <sasha.levin@oracle.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: <stable@vger.kernel.org>
      Fixes: b371b594 ("perf/x86: Fix event/group validation")
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      ebfb4988
  6. 11 Sep, 2015 1 commit
  7. 04 Sep, 2015 1 commit
    • Ingo Molnar's avatar
      Merge tag 'perf-urgent-for-mingo' of... · 21adf76e
      Ingo Molnar authored
      Merge tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
      
      Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
      
        - In some cases where perf_event.fork.{pid,tid} should be used we were instead
          using perf_event.comm.{pid,tid}, which is not a problem for for the 'pid'
          case, that sits in the same place in these union_perf_event members, but
          comm.tid sits where fork.ppid is, oops.
      
          These cases were considered as (potentially) problematic:
      
           - 'perf script' with !sample_id_all, i.e. only non old kernels without
              perf_event_attr.sample_id_all.
      
           - intel_pt could be affected when decoding without timestamps, as the exit
             event is only used to flush out data which anyway gets flushed at the
             end of the session.
      
           - intel_bts also uses the exit event to flush data which would probably not
             cause errors as it would get flushed at the end of the session instead.
      
          Fix it. (Adrian Hunter)
      
        - Due to relaxing the compiler checks for bison generated files, we missed
          updating one parse_events_add_pmu() caller when this function had its
          prototype changed, fix it. (Jiri Olsa)
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      21adf76e
  8. 02 Sep, 2015 3 commits
  9. 01 Sep, 2015 13 commits
    • Stephane Eranian's avatar
      perf tools: Fix link time error with sample_reg_masks on non x86 · af4aeadd
      Stephane Eranian authored
      This patch makes perf compile on non x86 platforms by defining a weak
      symbol for sample_reg_masks[] in util/perf_regs.c.
      
      The patch also moves the REG() and REG_END() macros into the
      util/per_regs.h header file. The macros are renamed to
      SMPL_REG/SMPL_REG_END to avoid clashes with other header files.
      Signed-off-by: default avatarStephane Eranian <eranian@google.com>
      Acked-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1441099814-26783-1-git-send-email-eranian@google.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      af4aeadd
    • Wang Nan's avatar
      perf build: Fix Intel PT instruction decoder dependency problem · 04aa90b5
      Wang Nan authored
      I hit following building error randomly:
      
          ...
        /bin/sh: /path/to/kernel/buildperf/util/intel-pt-decoder/inat-tables.c: No such file or directory
          ...
          LINK     /path/to/kernel/buildperf/plugin_mac80211.so
          LINK     /path/to/kernel/buildperf/plugin_kmem.so
          LINK     /path/to/kernel/buildperf/plugin_xen.so
          LINK     /path/to/kernel/buildperf/plugin_hrtimer.so
        In file included from util/intel-pt-decoder/intel-pt-insn-decoder.c:25:0:
        util/intel-pt-decoder/inat.c:24:25: fatal error: inat-tables.c: No such file or directory
         #include "inat-tables.c"
                                 ^
        compilation terminated.
        make[4]: *** [/path/to/kernel/buildperf/util/intel-pt-decoder/intel-pt-insn-decoder.o] Error 1
        make[4]: *** Waiting for unfinished jobs....
          LINK     /path/to/kernel/buildperf/plugin_function.so
      
      This is caused by tools/perf/util/intel-pt-decoder/Build that, it tries
      to generate $(OUTPUT)util/intel-pt-decoder/inat-tables.c atomatically
      but forget to ensure the existance of $(OUTPUT)util/intel-pt-decoder
      directory.
      
      This patch fixes it by adding $(call rule_mkdir) like other similar rules.
      Signed-off-by: default avatarWang Nan <wangnan0@huawei.com>
      Acked-by: default avatarAdrian Hunter <adrian.hunter@intel.com>
      Acked-by: default avatarJiri Olsa <jolsa@kernel.org>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@163.com
      Link: http://lkml.kernel.org/r/1441087005-107540-1-git-send-email-wangnan0@huawei.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      04aa90b5
    • Wang Nan's avatar
      perf dwarf: Fix potential array out of bounds access · 3b27d139
      Wang Nan authored
      There is a problem in the dwarf-regs.c files for sh, sparc and x86 where
      it is possible to make an out-of-bounds array access when searching for
      register names.
      
      This patch fixes it by replacing '<=' to '<', so when register (number
      == XXX_MAX_REGS), get_arch_regstr() will return NULL.
      Signed-off-by: default avatarWang Nan <wangnan0@huawei.com>
      Reviewed-by: default avatarMatt Fleming <matt@console-pimps.org>
      Acked-by: default avatarJiri Olsa <jolsa@kernel.org>
      Acked-by: default avatarMasami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Zefan Li <lizefan@huawei.com>
      Cc: pi3orama@huawei.com
      Link: http://lkml.kernel.org/r/1441078184-105038-1-git-send-email-wangnan0@huawei.comSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      3b27d139
    • Ingo Molnar's avatar
      Merge tag 'perf-core-for-mingo' of... · 53202661
      Ingo Molnar authored
      Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
      
      Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
      
      User visible changes:
      
        - Add ability to specify to select which registers to record,
          to reduce the size of perf.data files, and also allow printing
          the registers in 'perf script': (Stephane Eranian)
      
            # perf record --intr-regs=AX,SP usleep 1
            [ perf record: Woken up 1 times to write data ]
            [ perf record: Captured and wrote 0.016 MB perf.data (8 samples) ]
            # perf script -F ip,sym,iregs | tail -5
             ffffffff8105f42a native_write_msr_safe   AX:0xf    SP:0xffff8802629c3c00
             ffffffff8105f42a native_write_msr_safe   AX:0xf    SP:0xffff8802629c3c00
             ffffffff81761ac0 _raw_spin_lock   AX:0xffff8801bfcf8020    SP:0xffff8802629c3ce8
             ffffffff81202bf8 __vma_adjust_trans_huge   AX:0x7ffc75200000    SP:0xffff8802629c3b30
             ffffffff8122b089 dput   AX:0x101    SP:0xffff8802629c3c78
            #
      
      Infrastructure changes:
      
        - Open event on evsel cpus and threads. (Kan Liang)
      
        - Add new bpf API to get name from a BPF object. (Wang Nan)
      
      Build fixes:
      
        - Fix build on powerpc broken by pt/bts. (Adrian Hunter)
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      53202661
    • Linus Torvalds's avatar
      Merge branch 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 65a99597
      Linus Torvalds authored
      Pull NOHZ updates from Ingo Molnar:
       "The main changes, mostly written by Frederic Weisbecker, include:
      
         - Fix some jiffies based cputime assumptions.  (No real harm because
           the concerned code isn't used by full dynticks.)
      
         - Simplify jiffies <-> usecs conversions.  Remove dead code.
      
         - Remove early hacks on nohz full code that avoided messing up idle
           nohz internals.  Now nohz integrates well full and idle and such
           hack have become needless.
      
         - Restart nohz full tick from irq exit.  (A simplification and a
           preparation for future optimization on scheduler kick to nohz
           full)
      
         - Code cleanups.
      
         - Tile driver isolation enhancement on top of nohz.  (Chris Metcalf)"
      
      * 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        nohz: Remove useless argument on tick_nohz_task_switch()
        nohz: Move tick_nohz_restart_sched_tick() above its users
        nohz: Restart nohz full tick from irq exit
        nohz: Remove idle task special case
        nohz: Prevent tilegx network driver interrupts
        alpha: Fix jiffies based cputime assumption
        apm32: Fix cputime == jiffies assumption
        jiffies: Remove HZ > USEC_PER_SEC special case
      65a99597
    • Linus Torvalds's avatar
      Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 418c2e1f
      Linus Torvalds authored
      Pull scheduler fix from Ingo Molnar:
       "This is a leftover scheduler fix from the v4.2 cycle"
      
      * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched: Fix cpu_active_mask/cpu_online_mask race
      418c2e1f
    • Linus Torvalds's avatar
      Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · a1d85611
      Linus Torvalds authored
      Pull scheduler updates from Ingo Molnar:
       "The biggest change in this cycle is the rewrite of the main SMP load
        balancing metric: the CPU load/utilization.  The main goal was to make
        the metric more precise and more representative - see the changelog of
        this commit for the gory details:
      
          9d89c257 ("sched/fair: Rewrite runnable load and utilization average tracking")
      
        It is done in a way that significantly reduces complexity of the code:
      
          5 files changed, 249 insertions(+), 494 deletions(-)
      
        and the performance testing results are encouraging.  Nevertheless we
        need to keep an eye on potential regressions, since this potentially
        affects every SMP workload in existence.
      
        This work comes from Yuyang Du.
      
        Other changes:
      
         - SCHED_DL updates.  (Andrea Parri)
      
         - Simplify architecture callbacks by removing finish_arch_switch().
           (Peter Zijlstra et al)
      
         - cputime accounting: guarantee stime + utime == rtime.  (Peter
           Zijlstra)
      
         - optimize idle CPU wakeups some more - inspired by Facebook server
           loads.  (Mike Galbraith)
      
         - stop_machine fixes and updates.  (Oleg Nesterov)
      
         - Introduce the 'trace_sched_waking' tracepoint.  (Peter Zijlstra)
      
         - sched/numa tweaks.  (Srikar Dronamraju)
      
         - misc fixes and small cleanups"
      
      * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (44 commits)
        sched/deadline: Fix comment in enqueue_task_dl()
        sched/deadline: Fix comment in push_dl_tasks()
        sched: Change the sched_class::set_cpus_allowed() calling context
        sched: Make sched_class::set_cpus_allowed() unconditional
        sched: Fix a race between __kthread_bind() and sched_setaffinity()
        sched: Ensure a task has a non-normalized vruntime when returning back to CFS
        sched/numa: Fix NUMA_DIRECT topology identification
        tile: Reorganize _switch_to()
        sched, sparc32: Update scheduler comments in copy_thread()
        sched: Remove finish_arch_switch()
        sched, tile: Remove finish_arch_switch
        sched, sh: Fold finish_arch_switch() into switch_to()
        sched, score: Remove finish_arch_switch()
        sched, avr32: Remove finish_arch_switch()
        sched, MIPS: Get rid of finish_arch_switch()
        sched, arm: Remove finish_arch_switch()
        sched/fair: Clean up load average references
        sched/fair: Provide runnable_load_avg back to cfs_rq
        sched/fair: Remove task and group entity load when they are dead
        sched/fair: Init cfs_rq's sched_entity load average
        ...
      a1d85611
    • Linus Torvalds's avatar
      Merge branch 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 3959df1d
      Linus Torvalds authored
      Pull RAS updates from Ingo Molnar:
       "MCE handling updates, but also some generic drivers/edac/ changes to
        better organize the Kconfig space"
      
      * 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/ras: Move AMD MCE injector to arch/x86/ras/
        x86/mce: Add a wrapper around mce_log() for injection
        x86/mce: Rename rcu_dereference_check_mce() to mce_log_get_idx_check()
        RAS: Add a menuconfig option with descriptive text
        x86/mce: Reenable CMCI banks when swiching back to interrupt mode
        x86/mce: Clear Local MCE opt-in before kexec
        x86/mce: Remove unused function declarations
        x86/mce: Kill drain_mcelog_buffer()
        x86/mce: Avoid potential deadlock due to printk() in MCE context
        x86/mce: Remove the MCE ring for Action Optional errors
        x86/mce: Don't use percpu workqueues
        x86/mce: Provide a lockless memory pool to save error records
        x86/mce: Reuse one of the u16 padding fields in 'struct mce'
      3959df1d
    • Linus Torvalds's avatar
      Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 41d859a8
      Linus Torvalds authored
      Pull perf updates from Ingo Molnar:
       "Main perf kernel side changes:
      
         - uprobes updates/fixes.  (Oleg Nesterov)
      
         - Add PERF_RECORD_SWITCH to indicate context switches and use it in
           tooling.  (Adrian Hunter)
      
         - Support BPF programs attached to uprobes and first steps for BPF
           tooling support.  (Wang Nan)
      
         - x86 generic x86 MSR-to-perf PMU driver.  (Andy Lutomirski)
      
         - x86 Intel PT, LBR and BTS updates.  (Alexander Shishkin)
      
         - x86 Intel Skylake support.  (Andi Kleen)
      
         - x86 Intel Knights Landing (KNL) RAPL support.  (Dasaratharaman
           Chandramouli)
      
         - x86 Intel Broadwell-DE uncore support.  (Kan Liang)
      
         - x86 hw breakpoints robustization (Andy Lutomirski)
      
        Main perf tooling side changes:
      
         - Support Intel PT in several tools, enabling the use of the
           processor trace feature introduced in Intel Broadwell processors:
           (Adrian Hunter)
      
             # dmesg | grep Performance
             # [0.188477] Performance Events: PEBS fmt2+, 16-deep LBR, Broadwell events, full-width counters, Intel PMU driver.
             # perf record -e intel_pt//u -a sleep 1
             [ perf record: Woken up 1 times to write data ]
             [ perf record: Captured and wrote 0.216 MB perf.data ]
             # perf script # then navigate in the tool output to some area, like this one:
             184 1030 dl_main (/usr/lib64/ld-2.17.so) => 7f21ba661440 dl_main (/usr/lib64/ld-2.17.so)
             185 1457 dl_main (/usr/lib64/ld-2.17.so) => 7f21ba669f10 _dl_new_object (/usr/lib64/ld-2.17.so)
             186 9f37 _dl_new_object (/usr/lib64/ld-2.17.so) => 7f21ba677b90 strlen (/usr/lib64/ld-2.17.so)
             187 7ba3 strlen (/usr/lib64/ld-2.17.so) => 7f21ba677c75 strlen (/usr/lib64/ld-2.17.so)
             188 7c78 strlen (/usr/lib64/ld-2.17.so) => 7f21ba669f3c _dl_new_object (/usr/lib64/ld-2.17.so)
             189 9f8a _dl_new_object (/usr/lib64/ld-2.17.so) => 7f21ba65fab0 calloc@plt (/usr/lib64/ld-2.17.so)
             190 fab0 calloc@plt (/usr/lib64/ld-2.17.so) => 7f21ba675e70 calloc (/usr/lib64/ld-2.17.so)
             191 5e87 calloc (/usr/lib64/ld-2.17.so) => 7f21ba65fa90 malloc@plt (/usr/lib64/ld-2.17.so)
             192 fa90 malloc@plt (/usr/lib64/ld-2.17.so) => 7f21ba675e60 malloc (/usr/lib64/ld-2.17.so)
             193 5e68 malloc (/usr/lib64/ld-2.17.so) => 7f21ba65fa80 __libc_memalign@plt (/usr/lib64/ld-2.17.so)
             194 fa80 __libc_memalign@plt (/usr/lib64/ld-2.17.so) => 7f21ba675d50 __libc_memalign (/usr/lib64/ld-2.17.so)
             195 5d63 __libc_memalign (/usr/lib64/ld-2.17.so) => 7f21ba675e20 __libc_memalign (/usr/lib64/ld-2.17.so)
             196 5e40 __libc_memalign (/usr/lib64/ld-2.17.so) => 7f21ba675d73 __libc_memalign (/usr/lib64/ld-2.17.so)
             197 5d97 __libc_memalign (/usr/lib64/ld-2.17.so) => 7f21ba675e18 __libc_memalign (/usr/lib64/ld-2.17.so)
             198 5e1e __libc_memalign (/usr/lib64/ld-2.17.so) => 7f21ba675df9 __libc_memalign (/usr/lib64/ld-2.17.so)
             199 5e10 __libc_memalign (/usr/lib64/ld-2.17.so) => 7f21ba669f8f _dl_new_object (/usr/lib64/ld-2.17.so)
             200 9fc2 _dl_new_object (/usr/lib64/ld-2.17.so) =>  7f21ba678e70 memcpy (/usr/lib64/ld-2.17.so)
             201 8e8c memcpy (/usr/lib64/ld-2.17.so) => 7f21ba678ea0 memcpy (/usr/lib64/ld-2.17.so)
      
         - Add support for using several Intel PT features (CYC, MTC packets),
           the relevant documentation was updated in:
               tools/perf/Documentation/intel-pt.txt
           briefly describing those packets, its purposes, how to configure
           them in the event config terms and relevant external documentation
           for further reading.  (Adrian Hunter)
      
         - Introduce support for probing at an absolute address, for user and
           kernel 'perf probe's, useful when one have the symbol maps on a
           developer machine but not on an embedded system.  (Wang Nan)
      
         - Add Intel BTS support, with a call-graph script to show it and PT
           in use in a GUI using 'perf script' python scripting with
           postgresql and Qt.  (Adrian Hunter)
      
         - Allow selecting the type of callchains per event, including
           disabling callchains in all but one entry in an event list, to save
           space, and also to ask for the callchains collected in one event to
           be used in other events.  (Kan Liang)
      
         - Beautify more syscall arguments in 'perf trace': (Arnaldo Carvalho
           de Melo)
             * A bunch more translate file/pathnames from pointers to strings.
             * Convert numbers to strings for the 'keyctl' syscall 'option'
               arg.
             * Add missing 'clockid' entries.
      
         - Introduce 'srcfile' sort key: (Andi Kleen)
      
             # perf record -F 10000 usleep 1
             # perf report --stdio --dsos '[kernel.vmlinux]' -s srcfile
             <SNIP>
             # Overhead  Source File
                26.49%  copy_page_64.S
                 5.49%  signal.c
                 0.51%  msr.h
             #
      
           It can be combined with other fields, for instance, experiment with
           '-s srcfile,symbol'.
      
           There are some oddities in some distros and with some specific
           DSOs, being investigated, so your mileage may vary.
      
         - Support per-event 'freq' term: (Namhyung Kim)
      
             $ perf record -e 'cpu/instructions,freq=1234/',cycles -c 1000 sleep 1
             $ perf evlist -F
             cpu/instructions,freq=1234/: sample_freq=1234
             cycles: sample_period=1000
             $
      
         - Deref sys_enter pointer args with contents from probe:vfs_getname,
           showing pathnames instead of pointers in many syscalls in 'perf
           trace'.  (Arnaldo Carvalho de Melo)
      
         - Stop collecting /proc/kallsyms in perf.data files, saving about
           4.5MB on a typical x86-64 system, use the the symbol resolution
           routines used in all the other tools (report, top, etc) now that we
           can ask libtraceevent to use perf's symbol resolution code.
           (Arnaldo Carvalho de Melo)
      
         - Allow filtering out of perf's PID via 'perf record --exclude-perf'.
           (Wang Nan)
      
         - 'perf trace' now supports syscall groups, like strace, i.e:
      
             $ trace -e file touch file
      
           Will expand 'file' into multiple, file related, syscalls.  More
           work needed to add extra groups for other syscall groups, and also
           to complement what was added for the 'file' group, included as a
           proof of concept.  (Arnaldo Carvalho de Melo)
      
         - Add lock_pi stresser to 'perf bench futex', to test the kernel code
           related to FUTEX_(UN)LOCK_PI.  (Davidlohr Bueso)
      
         - Let user have timestamps with per-thread recording in 'perf record'
           (Adrian Hunter)
      
         - ... and tons of other changes, see the shortlog and the Git log for
           details"
      
      * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (240 commits)
        perf evlist: Add backpointer for perf_env to evlist
        perf tools: Rename perf_session_env to perf_env
        perf tools: Do not change lib/api/fs/debugfs directly
        perf tools: Add tracing_path and remove unneeded functions
        perf buildid: Introduce sysfs/filename__sprintf_build_id
        perf evsel: Add a backpointer to the evlist a evsel is in
        perf trace: Add header with copyright and background info
        perf scripts python: Add new compaction-times script
        perf stat: Get correct cpu id for print_aggr
        tools lib traceeveent: Allow for negative numbers in print format
        perf script: Add --[no-]-demangle/--[no-]-demangle-kernel
        tracing/uprobes: Do not print '0x (null)' when offset is 0
        perf probe: Support probing at absolute address
        perf probe: Fix error reported when offset without function
        perf probe: Fix list result when address is zero
        perf probe: Fix list result when symbol can't be found
        tools build: Allow duplicate objects in the object list
        perf tools: Remove export.h from MANIFEST
        perf probe: Prevent segfault when reading probe point with absolute address
        perf tools: Update Intel PT documentation
        ...
      41d859a8
    • Linus Torvalds's avatar
      Merge branch 'mm-kasan-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 46580009
      Linus Torvalds authored
      Pull x86/kasan changes from Ingo Molnar:
       "These are two KASAN changes that factor out (and generalize) x86
        specific KASAN code from x86 to mm"
      
      * 'mm-kasan-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/kasan, mm: Introduce generic kasan_populate_zero_shadow()
        x86/kasan: Define KASAN_SHADOW_OFFSET per architecture
      46580009
    • Linus Torvalds's avatar
      Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e10994ff
      Linus Torvalds authored
      Pull liblockdep fixes from Ingo Molnar:
       "Three liblockdep fixes left over from the v4.2 cycle"
      
      * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        tools/liblockdep: Use the rbtree header provided by common tools headers
        tools/liblockdep: Correct macro for WARN
        tools: Restore export.h
      e10994ff
    • Linus Torvalds's avatar
      Merge branch 'core-types-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 5757bd61
      Linus Torvalds authored
      Pull inlining tuning from Ingo Molnar:
       "A handful of inlining optimizations inspired by x86 work but
        applicable in general"
      
      * 'core-types-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        jiffies: Force inlining of {m,u}msecs_to_jiffies()
        x86/hweight: Force inlining of __arch_hweight{32,64}()
        linux/bitmap: Force inlining of bitmap weight functions
      5757bd61
    • Linus Torvalds's avatar
      Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 7073bc66
      Linus Torvalds authored
      Pull RCU updates from Ingo Molnar:
       "The main RCU changes in this cycle are:
      
         - the combination of tree geometry-initialization simplifications and
           OS-jitter-reduction changes to expedited grace periods.  These two
           are stacked due to the large number of conflicts that would
           otherwise result.
      
         - privatize smp_mb__after_unlock_lock().
      
           This commit moves the definition of smp_mb__after_unlock_lock() to
           kernel/rcu/tree.h, in recognition of the fact that RCU is the only
           thing using this, that nothing else is likely to use it, and that
           it is likely to go away completely.
      
         - documentation updates.
      
         - torture-test updates.
      
         - misc fixes"
      
      * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (60 commits)
        rcu,locking: Privatize smp_mb__after_unlock_lock()
        rcu: Silence lockdep false positive for expedited grace periods
        rcu: Don't disable CPU hotplug during OOM notifiers
        scripts: Make checkpatch.pl warn on expedited RCU grace periods
        rcu: Update MAINTAINERS entry
        rcu: Clarify CONFIG_RCU_EQS_DEBUG help text
        rcu: Fix backwards RCU_LOCKDEP_WARN() in synchronize_rcu_tasks()
        rcu: Rename rcu_lockdep_assert() to RCU_LOCKDEP_WARN()
        rcu: Make rcu_is_watching() really notrace
        cpu: Wait for RCU grace periods concurrently
        rcu: Create a synchronize_rcu_mult()
        rcu: Fix obsolete priority-boosting comment
        rcu: Use WRITE_ONCE in RCU_INIT_POINTER
        rcu: Hide RCU_NOCB_CPU behind RCU_EXPERT
        rcu: Add RCU-sched flavors of get-state and cond-sync
        rcu: Add fastpath bypassing funnel locking
        rcu: Rename RCU_GP_DONE_FQS to RCU_GP_DOING_FQS
        rcu: Pull out wait_event*() condition into helper function
        documentation: Describe new expedited stall warnings
        rcu: Add stall warnings to synchronize_sched_expedited()
        ...
      7073bc66