1. 23 May, 2016 1 commit
    • perf report: Add srcline_from/to branch sort keys · 508be0df
      Andi Kleen authored
      Add "srcline_from" and "srcline_to" branch sort keys that show the
      source lines of a branch.
      
      That makes it much easier to track down where particular branches happen
      in the program, for example to examine branch mispredictions, or to
      associate branches with cycle counts:
      
        % perf record -b -e cycles:p ./tcall
        % perf report --sort srcline_from,srcline_to,mispredict
        ...
          15.10%  tcall.c:18       tcall.c:10       N
          14.83%  tcall.c:11       tcall.c:5        N
          14.12%  tcall.c:7        tcall.c:12       N
          14.04%  tcall.c:12       tcall.c:5        N
          12.42%  tcall.c:17       tcall.c:18       N
          12.39%  tcall.c:7        tcall.c:13       N
          12.27%  tcall.c:13       tcall.c:17       N
        ...
      
        % perf report --sort srcline_from,srcline_to,cycles
        ...
          17.12%  tcall.c:18       tcall.c:11       1
          17.01%  tcall.c:12       tcall.c:6        1
          16.98%  tcall.c:11       tcall.c:6        1
          15.91%  tcall.c:17       tcall.c:18       1
           6.38%  tcall.c:7        tcall.c:17       7
           4.80%  tcall.c:7        tcall.c:12       8
           4.21%  tcall.c:7        tcall.c:17       8
           2.67%  tcall.c:7        tcall.c:12       7
           2.62%  tcall.c:7        tcall.c:12       10
           2.10%  tcall.c:7        tcall.c:17       9
           1.58%  tcall.c:7        tcall.c:12       6
           1.44%  tcall.c:7        tcall.c:12       5
           1.38%  tcall.c:7        tcall.c:12       9
           1.06%  tcall.c:7        tcall.c:17       13
           1.05%  tcall.c:7        tcall.c:12       4
           1.01%  tcall.c:7        tcall.c:17       6
      
      Open issues:
      
      - Some kernel symbols get misresolved.
      
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Link: http://lkml.kernel.org/r/1463775308-32748-1-git-send-email-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  2. 20 May, 2016 15 commits
  3. 19 May, 2016 1 commit
  4. 18 May, 2016 1 commit
  5. 17 May, 2016 12 commits
    • perf tools: Separate accounting of contexts and real addresses in a stack trace · a29d5c9b
      Arnaldo Carvalho de Melo authored
      The perf_sample->ip_callchain->nr value includes all the entries in the
      ip_callchain->ip[] array, real addresses and PERF_CONTEXT_{KERNEL,USER,etc},
      while what the user expects is that what is in the kernel.perf_event_max_stack
      sysctl or in the upcoming per event perf_event_attr.sample_max_stack knob be
      honoured in terms of IP addresses in the stack trace.
      
      So match the kernel support and validate chain->nr taking into account
      both kernel.perf_event_max_stack and kernel.perf_event_max_contexts_per_stack.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: Wang Nan <wangnan0@huawei.com>
      Cc: Zefan Li <lizefan@huawei.com>
      Link: http://lkml.kernel.org/n/tip-mgx0jpzfdq4uq4abfa40byu0@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
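      The validation described in this commit can be pictured with a small
      user-space sketch. It is not the code from the patch: the struct and
      function names are hypothetical, and the per-kind counting loop is an
      assumption that goes beyond the "nr must fit both limits" rule the
      message states. The only detail taken from the perf_event UAPI is that
      context markers such as PERF_CONTEXT_KERNEL sit at the top of the u64
      range, at or above PERF_CONTEXT_MAX.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Context markers occupy the top of the u64 range (PERF_CONTEXT_MAX is
 * (__u64)-4095 in the perf_event UAPI header); lower values are addresses. */
#define SKETCH_CONTEXT_MAX ((uint64_t)-4095)

/* Hypothetical stand-ins for the two sysctl values read by the tool. */
struct stack_limits {
	uint64_t max_stack;              /* kernel.perf_event_max_stack */
	uint64_t max_contexts_per_stack; /* kernel.perf_event_max_contexts_per_stack */
};

static bool is_context_marker(uint64_t ip)
{
	return ip >= SKETCH_CONTEXT_MAX;
}

/* Returns true when the chain fits within both limits. */
static bool callchain_is_valid(const uint64_t *ips, uint64_t nr,
			       const struct stack_limits *lim)
{
	uint64_t addrs = 0, contexts = 0;

	/* Cheap upper bound first: nr counts both kinds of entries. */
	if (nr > lim->max_stack + lim->max_contexts_per_stack)
		return false;

	/* Assumption of this sketch: also check each kind separately. */
	for (uint64_t i = 0; i < nr; i++) {
		if (is_context_marker(ips[i]))
			contexts++;
		else
			addrs++;
	}
	return addrs <= lim->max_stack &&
	       contexts <= lim->max_contexts_per_stack;
}
```

      The early bound check means an oversized nr is rejected before any
      entry is read, which matters when nr comes from an untrusted file.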
    • perf core: Separate accounting of contexts and real addresses in a stack trace · c85b0334
      Arnaldo Carvalho de Melo authored
      The perf_sample->ip_callchain->nr value includes all the entries in the
      ip_callchain->ip[] array, real addresses and PERF_CONTEXT_{KERNEL,USER,etc},
      while what the user expects is that what is in the kernel.perf_event_max_stack
      sysctl or in the upcoming per event perf_event_attr.sample_max_stack knob be
      honoured in terms of IP addresses in the stack trace.
      
      So allocate a bunch of extra entries for contexts, and do the accounting
      via perf_callchain_entry_ctx struct members.
      
      A new sysctl, kernel.perf_event_max_contexts_per_stack is also
      introduced for investigating possible bugs in the callchain
      implementation by some arch.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: Wang Nan <wangnan0@huawei.com>
      Cc: Zefan Li <lizefan@huawei.com>
      Link: http://lkml.kernel.org/n/tip-3b4wnqk340c4sg4gwkfdi9yk@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
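      A simplified user-space model of the accounting described here may help.
      It is a sketch under stated assumptions, not the kernel implementation:
      the type and function names (chain_entry, chain_ctx, store_context,
      store_address) and the fixed array capacity are made up for
      illustration. What it keeps from the message is the idea that the
      entry's nr counts everything, while the context struct counts real
      addresses and context markers separately against their own limits.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MAX_ENTRIES 16 /* arbitrary capacity for this sketch */

struct chain_entry {
	uint64_t nr;                     /* all entries: addresses + markers */
	uint64_t ip[SKETCH_MAX_ENTRIES];
};

/* Caller must keep max_stack + max_contexts <= SKETCH_MAX_ENTRIES. */
struct chain_ctx {
	struct chain_entry *entry;
	uint32_t max_stack;     /* limit on real addresses */
	uint32_t max_contexts;  /* limit on context markers */
	uint32_t nr;            /* real addresses stored so far */
	uint32_t contexts;      /* context markers stored so far */
	bool contexts_maxed;    /* set once a marker had to be refused */
};

/* Store a PERF_CONTEXT_* style marker; returns -1 when out of room. */
static int store_context(struct chain_ctx *ctx, uint64_t marker)
{
	if (ctx->contexts >= ctx->max_contexts) {
		ctx->contexts_maxed = true;
		return -1;
	}
	ctx->entry->ip[ctx->entry->nr++] = marker;
	ctx->contexts++;
	return 0;
}

/* Store a real address; returns -1 when the stack walk should stop. */
static int store_address(struct chain_ctx *ctx, uint64_t ip)
{
	if (ctx->nr >= ctx->max_stack || ctx->contexts_maxed)
		return -1;
	ctx->entry->ip[ctx->entry->nr++] = ip;
	ctx->nr++;
	return 0;
}
```

      With max_stack = 2 and max_contexts = 1, one marker plus two addresses
      are accepted and the entry ends up with nr == 3, so the user still gets
      the two addresses that were asked for.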
    • perf core: Add perf_callchain_store_context() helper · 3e4de4ec
      Arnaldo Carvalho de Melo authored
      We need to have different helpers to account for how many contexts we
      have in the sample and for real addresses, so do it now as a prep patch,
      to ease review.
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/n/tip-q964tnyuqrxw5gld18vizs3c@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf core: Add a 'nr' field to perf_event_callchain_context · 3b1fff08
      Arnaldo Carvalho de Melo authored
      We will use it to count how many addresses are in the entry->ip[] array,
      excluding PERF_CONTEXT_{KERNEL,USER,etc} entries, so that we can really
      return the number of entries specified by the user via the relevant
      sysctl, kernel.perf_event_max_stack, or via the per event
      perf_event_attr.sample_max_stack knob.
      
      This way we keep the perf_sample->ip_callchain->nr meaning, that is the
      number of entries, be it real addresses or PERF_CONTEXT_ entries, while
      honouring the max_stack knobs, i.e. the end result will be max_stack
      entries if we have at least that many entries in a given stack trace.
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/n/tip-s8teto51tdqvlfhefndtat9r@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf core: Pass max stack as a perf_callchain_entry context · cfbcf468
      Arnaldo Carvalho de Melo authored
      This makes perf_callchain_{user,kernel}() receive the max stack
      as context for the perf_callchain_entry, instead of accessing
      the global sysctl_perf_event_max_stack.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: Wang Nan <wangnan0@huawei.com>
      Cc: Zefan Li <lizefan@huawei.com>
      Link: http://lkml.kernel.org/n/tip-kolmn1yo40p7jhswxwrc7rrd@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf core: Generalize max_stack sysctl handler · a831100a
      Arnaldo Carvalho de Melo authored
      So that it can be used for other stack related knobs, such as the
      upcoming one to tweak the max number of contexts per stack sample.
      
      In all those cases we can only change the value if there are no perf
      sessions collecting stacks, so they need to grab that mutex, etc.
      
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/n/tip-8t3fk94wuzp8m2z1n4gc0s17@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
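      The rule stated above (a stack-related knob may only change while no
      session is collecting stacks, and the check happens under a mutex) can
      be sketched in user space roughly as follows. Everything here is
      illustrative: set_stack_knob, nr_stack_users and the starting values are
      assumptions of this sketch, and a pthread mutex stands in for the kernel
      mutex the message refers to.

```c
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t callchain_lock = PTHREAD_MUTEX_INITIALIZER;
static int nr_stack_users;             /* sessions currently collecting stacks */
static int max_stack = 127;            /* example starting value */
static int max_contexts_per_stack = 8; /* example starting value */

/* One handler shared by every stack-related knob: the knob to change is
 * passed in, so adding another knob needs no new locking code. */
static int set_stack_knob(int *knob, int new_value)
{
	int ret = 0;

	pthread_mutex_lock(&callchain_lock);
	if (nr_stack_users > 0)
		ret = -EBUSY;      /* buffers are sized from the current value */
	else
		*knob = new_value;
	pthread_mutex_unlock(&callchain_lock);

	return ret;
}
```

      Passing the knob as a pointer is what makes the handler general: both
      max_stack and max_contexts_per_stack go through the same guarded path.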
    • perf symbols: Introduce DSO__NAME_KALLSYMS and DSO__NAME_KCORE · 0a77582f
      Masami Hiramatsu authored
      Instead of using a raw string, use DSO__NAME_KALLSYMS and
      DSO__NAME_KCORE macros for kallsyms and kcore.
      
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20160515031935.4017.50971.stgit@devbox
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf stat: Use cpu-clock event for cpu targets · a1f3d567
      Namhyung Kim authored
      Currently 'perf stat' always counts task-clock event by default.  But
      it's somewhat confusing for system-wide targets (especially with 'sleep
      N' as the 'sleep' task just sleeps and doesn't use cputime).  Changing
      to cpu-clock event instead for that case makes more sense IMHO.
      
      Before:
        # perf stat -a sleep 0.1
      
         Performance counter stats for 'system wide':
      
              403.038603      task-clock (msec)     #    4.001 CPUs utilized
                     150      context-switches      #    0.372 K/sec
                       7      cpu-migrations        #    0.017 K/sec
                      71      page-faults           #    0.176 K/sec
              23,705,169      cycles                #    0.059 GHz
              15,888,166      instructions          #    0.67  insn per cycle
               3,326,078      branches              #    8.253 M/sec
                  87,643      branch-misses         #    2.64% of all branches
      
             0.100737009 seconds time elapsed
      
        #
      
      After:
      
        # perf stat -a sleep 0.1
      
         Performance counter stats for 'system wide':
      
              404.271182      cpu-clock (msec)      #    4.000 CPUs utilized
                     143      context-switches      #    0.354 K/sec
                      13      cpu-migrations        #    0.032 K/sec
                      73      page-faults           #    0.181 K/sec
              22,119,220      cycles                #    0.055 GHz
              13,622,065      instructions          #    0.62  insn per cycle
               2,918,769      branches              #    7.220 M/sec
                  85,033      branch-misses         #    2.91% of all branches
      
             0.101073089 seconds time elapsed
      
        #
      
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1463119263-5569-3-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
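      The selection this commit describes could look roughly like the sketch
      below. The names (stat_target, target_has_cpu, default_clock_event) are
      hypothetical, and treating an explicit CPU list ('perf stat -C') the
      same way as system-wide mode ('perf stat -a') is an assumption drawn
      from the "cpu targets" wording in the subject line.

```c
#include <stdbool.h>
#include <stddef.h>

enum clock_event { CLOCK_EVENT_TASK, CLOCK_EVENT_CPU };

/* Hypothetical description of what 'perf stat' was asked to measure. */
struct stat_target {
	bool system_wide;     /* 'perf stat -a' */
	const char *cpu_list; /* 'perf stat -C <list>', or NULL */
};

static bool target_has_cpu(const struct stat_target *t)
{
	return t->system_wide || t->cpu_list != NULL;
}

/* task-clock measures the workload task, which mostly sleeps in the
 * 'sleep N' case; cpu-clock fits measurements taken per CPU. */
static enum clock_event default_clock_event(const struct stat_target *t)
{
	return target_has_cpu(t) ? CLOCK_EVENT_CPU : CLOCK_EVENT_TASK;
}
```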
    • perf stat: Update runtime using cpu-clock event · daf4f478
      Namhyung Kim authored
      Currently only the task-clock event updates runtime_nsec, so the
      metrics cannot be shown when using cpu-clock events.  However, cpu-clock
      works basically the same as task-clock, so there is no reason not to
      update the runtime for it too, IMHO.
      
      Before:
      
        # perf stat -a -e cpu-clock,context-switches,page-faults,cycles sleep 0.1
      
          Performance counter stats for 'system wide':
      
               1217.759506      cpu-clock (msec)
                        93      context-switches
                        61      page-faults
                18,958,022      cycles
      
               0.101393794 seconds time elapsed
      
      After:
      
         Performance counter stats for 'system wide':
      
               1220.471884      cpu-clock (msec)          #   12.013 CPUs utilized
                       118      context-switches          #    0.097 K/sec
                        59      page-faults               #    0.048 K/sec
                17,941,247      cycles                    #    0.015 GHz
      
               0.101594777 seconds time elapsed
      
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1463119263-5569-2-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
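      The metrics in the "After" output follow from the clock runtime by
      simple division, which is why they cannot be shown until the runtime is
      recorded. The helpers below are a sketch with made-up names; the
      formulas are inferred from the numbers in the output above rather than
      copied from the perf source.

```c
/* CPUs utilized: clock runtime divided by wall-clock time. */
static double cpus_utilized(double clock_msec, double elapsed_sec)
{
	return clock_msec / (elapsed_sec * 1e3);
}

/* GHz: cycles per nanosecond of clock runtime. */
static double ghz(double cycles, double clock_msec)
{
	return cycles / (clock_msec * 1e6);
}

/* K/sec: events per millisecond equals thousands of events per second. */
static double k_per_sec(double count, double clock_msec)
{
	return count / clock_msec;
}
```

      Plugging in the "After" values: 1220.471884 msec over 0.101594777 s
      gives about 12.013 CPUs utilized, 17,941,247 cycles gives about
      0.015 GHz, and 118 context switches gives about 0.097 K/sec.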
    • perf stat: Fix indentation of stalled backend cycle · b0404be8
      Namhyung Kim authored
      Commit 140aeadc ("perf stat: Abstract stat metrics printing")
      changed how shadow metrics are printed, but it did not update the
      width of the stalled backend cycles event to 7.2% like the others.  This
      resulted in misaligned output like the one below:
      
        Performance counter stats for 'pwd':
      
                0.638313      task-clock (msec)         #    0.567 CPUs utilized
                       0      context-switches          #    0.000 K/sec
                       0      cpu-migrations            #    0.000 K/sec
                      54      page-faults               #    0.085 M/sec
                 885,600      cycles                    #    1.387 GHz
                 558,438      stalled-cycles-frontend   #   63.06% frontend cycles idle
                 431,355      stalled-cycles-backend    #  48.71% backend cycles idle
                 674,956      instructions              #    0.76  insn per cycle
                                                        #    0.83  stalled cycles per insn
                 130,380      branches                  #  204.257 M/sec
           <not counted>      branch-misses
      
             0.001125426 seconds time elapsed
      
      Signed-off-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Fixes: 140aeadc ("perf stat: Abstract stat metrics printing")
      Link: http://lkml.kernel.org/r/1463119263-5569-1-git-send-email-namhyung@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
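      The one-column shift in the output above is what a printf field width
      of 6 instead of 7 produces. The sketch below assumes those two widths
      (the message only names the intended 7.2 format); format_pct is a
      hypothetical helper, not a perf function.

```c
#include <stdio.h>

/* Format a percentage with two decimals in a field of the given width. */
static int format_pct(char *buf, size_t len, int width, double pct)
{
	return snprintf(buf, len, "%*.2f%%", width, pct);
}
```

      With width 7 the frontend value 63.06 is padded to eight characters
      including the percent sign; with width 6 the backend value 48.71 takes
      only seven, so its column starts one character early.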
    • perf symbols: Store vdso buildid unconditionally · 6ae98ba6
      He Kuang authored
      When unwinding callchains on a different machine, vdso info should be
      available so the unwind process won't be interrupted if an address falls
      into the vdso region. But in most cases the addresses of sample events are
      not in the vdso range, so the buildid of a zero-hit vdso won't be stored
      into perf.data.
      
      This patch stores vdso buildid regardless of whether the vdso is hit or
      not.
      
      Signed-off-by: He Kuang <hekuang@huawei.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/1463042596-61703-3-git-send-email-hekuang@huawei.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
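      Expressed as a predicate, the change amounts to adding one exception to
      the "only store build-ids of DSOs that were hit" rule. The struct and
      function below are hypothetical and only model that decision.

```c
#include <stdbool.h>

/* Minimal stand-in for the per-DSO state relevant to this decision. */
struct dso_sketch {
	bool hit;     /* at least one sample address resolved into this DSO */
	bool is_vdso; /* this DSO is the vdso mapping */
};

/* Before the change only 'hit' mattered; now the vdso always qualifies. */
static bool should_store_buildid(const struct dso_sketch *dso)
{
	return dso->hit || dso->is_vdso;
}
```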
    • perf stat: Avoid fractional digits for integer scales · e3b03b6c
      Andi Kleen authored
      When the scaling factor is a whole number, don't display fractional
      digits. This avoids unnecessary .00 output for topdown metrics with
      scale factors.
      
      v2: Remove redundant check.
      
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1462489447-31832-7-git-send-email-andi@firstfloor.org
      [ Rename 'round' to 'stat_round' as 'round' is defined in math.h,
        included by this patch, and this breaks the build on ubuntu 12.04 ]
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
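      A minimal sketch of the idea: choose the number of fractional digits
      from the scale factor rather than from the value. The helper names are
      made up, and the whole-number test here uses a cast instead of the
      math.h routines the bracketed note mentions, so it is not the patch's
      code.

```c
#include <stdio.h>

/* Pick "%.0f" when the scale is a whole number, "%.2f" otherwise.
 * Assumes the scale fits in a long long. */
static const char *scale_format(double scale)
{
	return (double)(long long)scale == scale ? "%.0f" : "%.2f";
}

/* Format value * scale using the format chosen from the scale alone. */
static int format_scaled(char *buf, size_t len, double value, double scale)
{
	return snprintf(buf, len, scale_format(scale), value * scale);
}
```

      A scale of 100 prints 25 as "2500" instead of "2500.00", while a scale
      of 0.5 still prints 3 as "1.50".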
  6. 16 May, 2016 10 commits
    • Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · bc231d9e
      Linus Torvalds authored
      Pull x86 platform updates from Ingo Molnar:
       "The main change is the addition of SGI/UV4 support"
      
      * 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (22 commits)
        x86/platform/UV: Fix incorrect nodes and pnodes for cpuless and memoryless nodes
        x86/platform/UV: Remove Obsolete GRU MMR address translation
        x86/platform/UV: Update physical address conversions for UV4
        x86/platform/UV: Build GAM reference tables
        x86/platform/UV: Support UV4 socket address changes
        x86/platform/UV: Add obtaining GAM Range Table from UV BIOS
        x86/platform/UV: Add UV4 addressing discovery function
        x86/platform/UV: Fold blade info into per node hub info structs
        x86/platform/UV: Allocate common per node hub info structs on local node
        x86/platform/UV: Move blade local processor ID to the per cpu info struct
        x86/platform/UV: Move scir info to the per cpu info struct
        x86/platform/UV: Create per cpu info structs to replace per hub info structs
        x86/platform/UV: Update MMIOH setup function to work for both UV3 and UV4
        x86/platform/UV: Clean up redunduncies after merge of UV4 MMR definitions
        x86/platform/UV: Add UV4 Specific MMR definitions
        x86/platform/UV: Prep for UV4 MMR updates
        x86/platform/UV: Add UV MMR Illegal Access Function
        x86/platform/UV: Add UV4 Specific Defines
        x86/platform/UV: Add UV Architecture Defines
        x86/platform/UV: Add Initial UV4 definitions
        ...
    • Merge branch 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 62a00278
      Linus Torvalds authored
      Pull x86 debug cleanup from Ingo Molnar:
       "A printk() output simplification"
      
      * 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/dumpstack: Combine some printk()s
    • Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · bcea36df
      Linus Torvalds authored
      Pull x86 cleanup from Ingo Molnar:
       "Inline optimizations"
      
      * 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86: Fix non-static inlines
    • Merge branch 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 05e30f01
      Linus Torvalds authored
      Pull x86-64 defconfig update from Ingo Molnar:
       "Small defconfig addition"
      
      [ I'm not actually convinced our defconfig is sensible, but whatever ]
      
      * 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/build/defconfig/64: Enable CONFIG_E1000E=y
    • Merge branch 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 9a45f036
      Linus Torvalds authored
      Pull x86 boot updates from Ingo Molnar:
       "The biggest changes in this cycle were:
      
         - prepare for more KASLR related changes, by restructuring, cleaning
           up and fixing the existing boot code.  (Kees Cook, Baoquan He,
           Yinghai Lu)
      
         - simplify/concentrate subarch handling code, eliminate
           paravirt_enabled() usage.  (Luis R Rodriguez)"
      
      * 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (50 commits)
        x86/KASLR: Clarify purpose of each get_random_long()
        x86/KASLR: Add virtual address choosing function
        x86/KASLR: Return earliest overlap when avoiding regions
        x86/KASLR: Add 'struct slot_area' to manage random_addr slots
        x86/boot: Add missing file header comments
        x86/KASLR: Initialize mapping_info every time
        x86/boot: Comment what finalize_identity_maps() does
        x86/KASLR: Build identity mappings on demand
        x86/boot: Split out kernel_ident_mapping_init()
        x86/boot: Clean up indenting for asm/boot.h
        x86/KASLR: Improve comments around the mem_avoid[] logic
        x86/boot: Simplify pointer casting in choose_random_location()
        x86/KASLR: Consolidate mem_avoid[] entries
        x86/boot: Clean up pointer casting
        x86/boot: Warn on future overlapping memcpy() use
        x86/boot: Extract error reporting functions
        x86/boot: Correctly bounds-check relocations
        x86/KASLR: Clean up unused code from old 'run_size' and rename it to 'kernel_total_size'
        x86/boot: Fix "run_size" calculation
        x86/boot: Calculate decompression size during boot not build
        ...
    • Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 168f1a71
      Linus Torvalds authored
      Pull x86 asm updates from Ingo Molnar:
       "The main changes in this cycle were:
      
         - MSR access API fixes and enhancements (Andy Lutomirski)
      
         - early exception handling improvements (Andy Lutomirski)
      
         - user-space FS/GS prctl usage fixes and improvements (Andy
           Lutomirski)
      
         - Remove the cpu_has_*() APIs and replace them with equivalents
           (Borislav Petkov)
      
         - task switch micro-optimization (Brian Gerst)
      
         - 32-bit entry code simplification (Denys Vlasenko)
      
         - enhance PAT handling in emulated CPUs (Toshi Kani)
      
        ... and lots of other cleanups/fixlets"
      
      * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (70 commits)
        x86/arch_prctl/64: Restore accidentally removed put_cpu() in ARCH_SET_GS
        x86/entry/32: Remove asmlinkage_protect()
        x86/entry/32: Remove GET_THREAD_INFO() from entry code
        x86/entry, sched/x86: Don't save/restore EFLAGS on task switch
        x86/asm/entry/32: Simplify pushes of zeroed pt_regs->REGs
        selftests/x86/ldt_gdt: Test set_thread_area() deletion of an active segment
        x86/tls: Synchronize segment registers in set_thread_area()
        x86/asm/64: Rename thread_struct's fs and gs to fsbase and gsbase
        x86/arch_prctl/64: Remove FSBASE/GSBASE < 4G optimization
        x86/segments/64: When load_gs_index fails, clear the base
        x86/segments/64: When loadsegment(fs, ...) fails, clear the base
        x86/asm: Make asm/alternative.h safe from assembly
        x86/asm: Stop depending on ptrace.h in alternative.h
        x86/entry: Rename is_{ia32,x32}_task() to in_{ia32,x32}_syscall()
        x86/asm: Make sure verify_cpu() has a good stack
        x86/extable: Add a comment about early exception handlers
        x86/msr: Set the return value to zero when native_rdmsr_safe() fails
        x86/paravirt: Make "unsafe" MSR accesses unsafe even if PARAVIRT=y
        x86/paravirt: Add paravirt_{read,write}_msr()
        x86/msr: Carry on after a non-"safe" MSR access fails
        ...
    • Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 825a3b26
      Linus Torvalds authored
      Pull scheduler updates from Ingo Molnar:
      
       - massive CPU hotplug rework (Thomas Gleixner)
      
       - improve migration fairness (Peter Zijlstra)
      
       - CPU load calculation updates/cleanups (Yuyang Du)
      
       - cpufreq updates (Steve Muckle)
      
       - nohz optimizations (Frederic Weisbecker)
      
       - switch_mm() micro-optimization on x86 (Andy Lutomirski)
      
       - ... lots of other enhancements, fixes and cleanups.
      
      * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (66 commits)
        ARM: Hide finish_arch_post_lock_switch() from modules
        sched/core: Provide a tsk_nr_cpus_allowed() helper
        sched/core: Use tsk_cpus_allowed() instead of accessing ->cpus_allowed
        sched/loadavg: Fix loadavg artifacts on fully idle and on fully loaded systems
        sched/fair: Correct unit of load_above_capacity
        sched/fair: Clean up scale confusion
        sched/nohz: Fix affine unpinned timers mess
        sched/fair: Fix fairness issue on migration
        sched/core: Kill sched_class::task_waking to clean up the migration logic
        sched/fair: Prepare to fix fairness problems on migration
        sched/fair: Move record_wakee()
        sched/core: Fix comment typo in wake_q_add()
        sched/core: Remove unused variable
        sched: Make hrtick_notifier an explicit call
        sched/fair: Make ilb_notifier an explicit call
        sched/hotplug: Make activate() the last hotplug step
        sched/hotplug: Move migration CPU_DYING to sched_cpu_dying()
        sched/migration: Move CPU_ONLINE into scheduler state
        sched/migration: Move calc_load_migrate() into CPU_DYING
        sched/migration: Move prepare transition to SCHED_STARTING state
        ...
    • Merge branch 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · cf6ed9a6
      Linus Torvalds authored
      Pull RAS updates from Ingo Molnar:
       "Main changes in this cycle were:
      
         - AMD MCE/RAS handling updates (Yazen Ghannam, Aravind
           Gopalakrishnan)
      
         - Cleanups (Borislav Petkov)
      
         - logging fix (Tony Luck)"
      
      * 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/RAS: Add SMCA support to AMD Error Injector
        EDAC, mce_amd: Detect SMCA using X86_FEATURE_SMCA
        x86/mce: Update AMD mcheck init to use cpu_has() facilities
        x86/cpu: Add detection of AMD RAS Capabilities
        x86/mce/AMD: Save an indentation level in prepare_threshold_block()
        x86/mce/AMD: Disable LogDeferredInMcaStat for SMCA systems
        x86/mce/AMD: Log Deferred Errors using SMCA MCA_DE{STAT,ADDR} registers
        x86/mce: Detect local MCEs properly
        x86/mce: Look in genpool instead of mcelog for pending error records
        x86/mce: Detect and use SMCA-specific msr_ops
        x86/mce: Define vendor-specific MSR accessors
        x86/mce: Carve out writes to MCx_STATUS and MCx_CTL
        x86/mce: Grade uncorrected errors for SMCA-enabled systems
        x86/mce: Log MCEs after a warm rest on AMD, Fam17h and later
        x86/mce: Remove explicit smp_rmb() when starting CPUs sync
        x86/RAS: Rename AMD MCE injector config item
    • Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 36db171c
      Linus Torvalds authored
      Pull perf updates from Ingo Molnar:
       "Bigger kernel side changes:
      
         - Add backwards writing capability to the perf ring-buffer code,
           which is preparation for future advanced features like robust
           'overwrite support' and snapshot mode.  (Wang Nan)
      
         - Add pause and resume ioctls for the perf ringbuffer (Wang Nan)
      
         - x86 Intel cstate code cleanups and reorganization (Thomas Gleixner)
      
         - x86 Intel uncore and CPU PMU driver updates (Kan Liang, Peter
           Zijlstra)
      
         - x86 AUX (Intel PT) related enhancements and updates (Alexander
           Shishkin)
      
         - x86 MSR PMU driver enhancements and updates (Huang Rui)
      
         - ... and lots of other changes spread out over 40+ commits.
      
        Biggest tooling side changes:
      
         - 'perf trace' features and enhancements.  (Arnaldo Carvalho de Melo)
      
         - BPF tooling updates (Wang Nan)
      
         - 'perf sched' updates (Jiri Olsa)
      
         - 'perf probe' updates (Masami Hiramatsu)
      
         - ... plus 200+ other enhancements, fixes and cleanups to tools/
      
        The merge commits, the shortlog and the changelogs contain a lot more
        details"
      
      * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (249 commits)
        perf/core: Disable the event on a truncated AUX record
        perf/x86/intel/pt: Generate PMI in the STOP region as well
        perf buildid-cache: Use lsdir() for looking up buildid caches
        perf symbols: Use lsdir() for the search in kcore cache directory
        perf tools: Use SBUILD_ID_SIZE where applicable
        perf tools: Fix lsdir to set errno correctly
        perf trace: Move seccomp args beautifiers to tools/perf/trace/beauty/
        perf trace: Move flock op beautifier to tools/perf/trace/beauty/
        perf build: Add build-test for debug-frame on arm/arm64
        perf build: Add build-test for libunwind cross-platforms support
        perf script: Fix export of callchains with recursion in db-export
        perf script: Fix callchain addresses in db-export
        perf script: Fix symbol insertion behavior in db-export
        perf symbols: Add dso__insert_symbol function
        perf scripting python: Use Py_FatalError instead of die()
        perf tools: Remove xrealloc and ALLOC_GROW
        perf help: Do not use ALLOC_GROW in add_cmd_list
        perf pmu: Make pmu_formats_string to check return value of strbuf
        perf header: Make topology checkers to check return value of strbuf
        perf tools: Make alias handler to check return value of strbuf
        ...
    • Merge branch 'locking-rwsem-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 3469d261
      Linus Torvalds authored
      Pull support for killable rwsems from Ingo Molnar:
       "This, by Michal Hocko, implements down_write_killable().
      
        The main usecase will be to update mm_sem usage sites to use this new
        API, to allow the mm-reaper introduced in commit aac45363 ("mm,
        oom: introduce oom reaper") to tear down oom victim address spaces
        asynchronously with minimum latencies and without deadlock worries"
      
      [ The vfs will want it too as the inode lock is changed from a mutex to
        a rwsem due to the parallel lookup and readdir updates ]
      
      * 'locking-rwsem-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        locking/rwsem: Fix comment on register clobbering
        locking/rwsem: Fix down_write_killable()
        locking/rwsem, x86: Add frame annotation for call_rwsem_down_write_failed_killable()
        locking/rwsem: Provide down_write_killable()
        locking/rwsem, x86: Provide __down_write_killable()
        locking/rwsem, s390: Provide __down_write_killable()
        locking/rwsem, ia64: Provide __down_write_killable()
        locking/rwsem, alpha: Provide __down_write_killable()
        locking/rwsem: Introduce basis for down_write_killable()
        locking/rwsem, sparc: Drop superfluous arch specific implementation
        locking/rwsem, sh: Drop superfluous arch specific implementation
        locking/rwsem, xtensa: Drop superfluous arch specific implementation
        locking/rwsem: Drop explicit memory barriers
        locking/rwsem: Get rid of __down_write_nested()