1. 14 Oct, 2017 2 commits
    • Borislav Petkov's avatar
      x86/microcode: Do the family check first · 1f161f67
      Borislav Petkov authored
      On CPUs like AMD's Geode, for example, we shouldn't even try to load
      microcode because they do not support the modern microcode loading
      interface.
      
      However, we do the family check *after* the other checks whether the
      loader has been disabled on the command line or whether we're running in
      a guest.
      
      So move the family checks first in order to exit early if we're being
      loaded on an unsupported family.
      Reported-and-tested-by: default avatarSven Glodowski <glodi1@arcor.de>
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Cc: <stable@vger.kernel.org> # 4.11..
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://bugzilla.suse.com/show_bug.cgi?id=1061396
      Link: http://lkml.kernel.org/r/20171012112316.977-1-bp@alien8.deSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      1f161f67
    • Andy Lutomirski's avatar
      x86/mm: Flush more aggressively in lazy TLB mode · b956575b
      Andy Lutomirski authored
      Since commit:
      
        94b1b03b ("x86/mm: Rework lazy TLB mode and TLB freshness tracking")
      
      x86's lazy TLB mode has been all the way lazy: when running a kernel thread
      (including the idle thread), the kernel keeps using the last user mm's
      page tables without attempting to maintain user TLB coherence at all.
      
      From a pure semantic perspective, this is fine -- kernel threads won't
      attempt to access user pages, so having stale TLB entries doesn't matter.
      
      Unfortunately, I forgot about a subtlety.  By skipping TLB flushes,
      we also allow any paging-structure caches that may exist on the CPU
      to become incoherent.  This means that we can have a
      paging-structure cache entry that references a freed page table, and
      the CPU is within its rights to do a speculative page walk starting
      at the freed page table.
      
      I can imagine this causing two different problems:
      
       - A speculative page walk starting from a bogus page table could read
         IO addresses.  I haven't seen any reports of this causing problems.
      
       - A speculative page walk that involves a bogus page table can install
         garbage in the TLB.  Such garbage would always be at a user VA, but
         some AMD CPUs have logic that triggers a machine check when it notices
         these bogus entries.  I've seen a couple reports of this.
      
      Boris further explains the failure mode:
      
      > It is actually more of an optimization which assumes that paging-structure
      > entries are in WB DRAM:
      >
      > "TlbCacheDis: cacheable memory disable. Read-write. 0=Enables
      > performance optimization that assumes PML4, PDP, PDE, and PTE entries
      > are in cacheable WB-DRAM; memory type checks may be bypassed, and
      > addresses outside of WB-DRAM may result in undefined behavior or NB
      > protocol errors. 1=Disables performance optimization and allows PML4,
      > PDP, PDE and PTE entries to be in any memory type. Operating systems
      > that maintain page tables in memory types other than WB- DRAM must set
      > TlbCacheDis to insure proper operation."
      >
      > The MCE generated is an NB protocol error to signal that
      >
      > "Link: A specific coherent-only packet from a CPU was issued to an
      > IO link. This may be caused by software which addresses page table
      > structures in a memory type other than cacheable WB-DRAM without
      > properly configuring MSRC001_0015[TlbCacheDis]. This may occur, for
      > example, when page table structure addresses are above top of memory. In
      > such cases, the NB will generate an MCE if it sees a mismatch between
      > the memory operation generated by the core and the link type."
      >
      > I'm assuming coherent-only packets don't go out on IO links, thus the
      > error.
      
      To fix this, reinstate TLB coherence in lazy mode.  With this patch
      applied, we do it in one of two ways:
      
       - If we have PCID, we simply switch back to init_mm's page tables
         when we enter a kernel thread -- this seems to be quite cheap
         except for the cost of serializing the CPU.
      
       - If we don't have PCID, then we set a flag and switch to init_mm
         the first time we would otherwise need to flush the TLB.
      
      The /sys/kernel/debug/x86/tlb_use_lazy_mode debug switch can be changed
      to override the default mode for benchmarking.
      
      In theory, we could optimize this better by only flushing the TLB in
      lazy CPUs when a page table is freed.  Doing that would require
      auditing the mm code to make sure that all page table freeing goes
      through tlb_remove_page() as well as reworking some data structures
      to implement the improved flush logic.
      Reported-by: default avatarMarkus Trippelsdorf <markus@trippelsdorf.de>
      Reported-by: default avatarAdam Borowski <kilobyte@angband.pl>
      Signed-off-by: default avatarAndy Lutomirski <luto@kernel.org>
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Eric Biggers <ebiggers@google.com>
      Cc: Johannes Hirte <johannes.hirte@datenkhaos.de>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nadav Amit <nadav.amit@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Roman Kagan <rkagan@virtuozzo.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: 94b1b03b ("x86/mm: Rework lazy TLB mode and TLB freshness tracking")
      Link: http://lkml.kernel.org/r/20171009170231.fkpraqokz6e4zeco@pd.tnicSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      b956575b
  2. 12 Oct, 2017 2 commits
  3. 11 Oct, 2017 1 commit
  4. 10 Oct, 2017 7 commits
    • Marcelo Henrique Cerri's avatar
      x86/hyperv: Fix hypercalls with extended CPU ranges for TLB flushing · ab7ff471
      Marcelo Henrique Cerri authored
      Do not consider the fixed size of hv_vp_set when passing the variable
      header size to hv_do_rep_hypercall().
      
      The Hyper-V hypervisor specification states that for a hypercall with a
      variable header only the size of the variable portion should be supplied
      via the input control.
      
      For HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE_EX/LIST_EX calls that means the
      fixed portion of hv_vp_set should not be considered.
      
      That fixes random failures of some applications that are unexpectedly
      killed with SIGBUS or SIGSEGV.
      Signed-off-by: default avatarMarcelo Henrique Cerri <marcelo.cerri@canonical.com>
      Cc: Dexuan Cui <decui@microsoft.com>
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Cc: Jork Loeser <Jork.Loeser@microsoft.com>
      Cc: Josh Poulson <jopoulso@microsoft.com>
      Cc: K. Y. Srinivasan <kys@microsoft.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Simon Xiao <sixiao@microsoft.com>
      Cc: Stephen Hemminger <sthemmin@microsoft.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: devel@linuxdriverproject.org
      Fixes: 628f54cc ("x86/hyper-v: Support extended CPU ranges for TLB flush hypercalls")
      Link: http://lkml.kernel.org/r/1507210469-29065-1-git-send-email-marcelo.cerri@canonical.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      ab7ff471
    • Vitaly Kuznetsov's avatar
      x86/hyperv: Don't use percpu areas for pcpu_flush/pcpu_flush_ex structures · 60d73a7c
      Vitaly Kuznetsov authored
      hv_do_hypercall() does virt_to_phys() translation and with some configs
      (CONFIG_SLAB) this doesn't work for percpu areas, we pass wrong memory to
      hypervisor and get #GP. We could use working slow_virt_to_phys() instead
      but doing so kills the performance.
      
      Move pcpu_flush/pcpu_flush_ex structures out of percpu areas and
      allocate memory on first call. The additional level of indirection gives
      us a small performance penalty, in future we may consider introducing
      hypercall functions which avoid virt_to_phys() conversion and cache
      physical addresses of pcpu_flush/pcpu_flush_ex structures somewhere.
      Reported-by: default avatarSimon Xiao <sixiao@microsoft.com>
      Signed-off-by: default avatarVitaly Kuznetsov <vkuznets@redhat.com>
      Cc: Dexuan Cui <decui@microsoft.com>
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Cc: Jork Loeser <Jork.Loeser@microsoft.com>
      Cc: K. Y. Srinivasan <kys@microsoft.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephen Hemminger <sthemmin@microsoft.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: devel@linuxdriverproject.org
      Link: http://lkml.kernel.org/r/20171005113924.28021-1-vkuznets@redhat.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      60d73a7c
    • Vitaly Kuznetsov's avatar
      x86/hyperv: Clear vCPU banks between calls to avoid flushing unneeded vCPUs · a3b74243
      Vitaly Kuznetsov authored
      hv_flush_pcpu_ex structures are not cleared between calls for performance
      reasons (they're variable size up to PAGE_SIZE each) but we must clear
      hv_vp_set.bank_contents part of it to avoid flushing unneeded vCPUs. The
      rest of the structure is formed correctly.
      
      To do the clearing in an efficient way stash the maximum possible vCPU
      number (this may differ from Linux CPU id).
      Reported-by: default avatarJork Loeser <Jork.Loeser@microsoft.com>
      Signed-off-by: default avatarVitaly Kuznetsov <vkuznets@redhat.com>
      Cc: Dexuan Cui <decui@microsoft.com>
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Cc: K. Y. Srinivasan <kys@microsoft.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephen Hemminger <sthemmin@microsoft.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: devel@linuxdriverproject.org
      Link: http://lkml.kernel.org/r/20171006154854.18092-1-vkuznets@redhat.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      a3b74243
    • Josh Poimboeuf's avatar
      x86/unwind: Disable unwinder warnings on 32-bit · d4a2d031
      Josh Poimboeuf authored
      x86-32 doesn't have stack validation, so in most cases it doesn't make
      sense to warn about bad frame pointers.
      Reported-by: default avatarTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Signed-off-by: default avatarJosh Poimboeuf <jpoimboe@redhat.com>
      Cc: Byungchul Park <byungchul.park@lge.com>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: LKP <lkp@01.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/a69658760800bf281e6353248c23e0fa0acf5230.1507597785.git.jpoimboe@redhat.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      d4a2d031
    • Josh Poimboeuf's avatar
      x86/unwind: Align stack pointer in unwinder dump · 99bd28a4
      Josh Poimboeuf authored
      When printing the unwinder dump, the stack pointer could be unaligned,
      for one of two reasons:
      
      - stack corruption; or
      
      - GCC created an unaligned stack.
      
      There's no way for the unwinder to tell the difference between the two,
      so we have to assume one or the other.  GCC unaligned stacks are very
      rare, and have only been spotted before GCC 5.  Presumably, if we're
      doing an unwinder stack dump, stack corruption is more likely than a
      GCC unaligned stack.  So always align the stack before starting the
      dump.
      Reported-and-tested-by: default avatarTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Reported-and-tested-by: default avatarFengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: default avatarJosh Poimboeuf <jpoimboe@redhat.com>
      Cc: Byungchul Park <byungchul.park@lge.com>
      Cc: LKP <lkp@01.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/2f540c515946ab09ed267e1a1d6421202a0cce08.1507597785.git.jpoimboe@redhat.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      99bd28a4
    • Josh Poimboeuf's avatar
      x86/unwind: Use MSB for frame pointer encoding on 32-bit · 5c99b692
      Josh Poimboeuf authored
      On x86-32, Tetsuo Handa and Fengguang Wu reported unwinder warnings
      like:
      
        WARNING: kernel stack regs at f60bb9c8 in swapper:1 has bad 'bp' value 0ba00000
      
      And also there were some stack dumps with a bunch of unreliable '?'
      symbols after an apic_timer_interrupt symbol, meaning the unwinder got
      confused when it tried to read the regs.
      
      The cause of those issues is that, with GCC 4.8 (and possibly older),
      there are cases where GCC misaligns the stack pointer in a leaf function
      for no apparent reason:
      
        c124a388 <acpi_rs_move_data>:
        c124a388:       55                      push   %ebp
        c124a389:       89 e5                   mov    %esp,%ebp
        c124a38b:       57                      push   %edi
        c124a38c:       56                      push   %esi
        c124a38d:       89 d6                   mov    %edx,%esi
        c124a38f:       53                      push   %ebx
        c124a390:       31 db                   xor    %ebx,%ebx
        c124a392:       83 ec 03                sub    $0x3,%esp
        ...
        c124a3e3:       83 c4 03                add    $0x3,%esp
        c124a3e6:       5b                      pop    %ebx
        c124a3e7:       5e                      pop    %esi
        c124a3e8:       5f                      pop    %edi
        c124a3e9:       5d                      pop    %ebp
        c124a3ea:       c3                      ret
      
      If an interrupt occurs in such a function, the regs on the stack will be
      unaligned, which breaks the frame pointer encoding assumption.  So on
      32-bit, use the MSB instead of the LSB to encode the regs.
      
      This isn't an issue on 64-bit, because interrupts align the stack before
      writing to it.
      Reported-and-tested-by: default avatarTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Reported-and-tested-by: default avatarFengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: default avatarJosh Poimboeuf <jpoimboe@redhat.com>
      Cc: Byungchul Park <byungchul.park@lge.com>
      Cc: LKP <lkp@01.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/279a26996a482ca716605c7dbc7f2db9d8d91e81.1507597785.git.jpoimboe@redhat.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      5c99b692
    • Josh Poimboeuf's avatar
      x86/unwind: Fix dereference of untrusted pointer · 62dd86ac
      Josh Poimboeuf authored
      Tetsuo Handa and Fengguang Wu reported a panic in the unwinder:
      
        BUG: unable to handle kernel NULL pointer dereference at 000001f2
        IP: update_stack_state+0xd4/0x340
        *pde = 00000000
      
        Oops: 0000 [#1] PREEMPT SMP
        CPU: 0 PID: 18728 Comm: 01-cpu-hotplug Not tainted 4.13.0-rc4-00170-gb09be676 #592
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-20161025_171302-gandalf 04/01/2014
        task: bb0b53c0 task.stack: bb3ac000
        EIP: update_stack_state+0xd4/0x340
        EFLAGS: 00010002 CPU: 0
        EAX: 0000a570 EBX: bb3adccb ECX: 0000f401 EDX: 0000a570
        ESI: 00000001 EDI: 000001ba EBP: bb3adc6b ESP: bb3adc3f
         DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
        CR0: 80050033 CR2: 000001f2 CR3: 0b3a7000 CR4: 00140690
        DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
        DR6: fffe0ff0 DR7: 00000400
        Call Trace:
         ? unwind_next_frame+0xea/0x400
         ? __unwind_start+0xf5/0x180
         ? __save_stack_trace+0x81/0x160
         ? save_stack_trace+0x20/0x30
         ? __lock_acquire+0xfa5/0x12f0
         ? lock_acquire+0x1c2/0x230
         ? tick_periodic+0x3a/0xf0
         ? _raw_spin_lock+0x42/0x50
         ? tick_periodic+0x3a/0xf0
         ? tick_periodic+0x3a/0xf0
         ? debug_smp_processor_id+0x12/0x20
         ? tick_handle_periodic+0x23/0xc0
         ? local_apic_timer_interrupt+0x63/0x70
         ? smp_trace_apic_timer_interrupt+0x235/0x6a0
         ? trace_apic_timer_interrupt+0x37/0x3c
         ? strrchr+0x23/0x50
        Code: 0f 95 c1 89 c7 89 45 e4 0f b6 c1 89 c6 89 45 dc 8b 04 85 98 cb 74 bc 88 4d e3 89 45 f0 83 c0 01 84 c9 89 04 b5 98 cb 74 bc 74 3b <8b> 47 38 8b 57 34 c6 43 1d 01 25 00 00 02 00 83 e2 03 09 d0 83
        EIP: update_stack_state+0xd4/0x340 SS:ESP: 0068:bb3adc3f
        CR2: 00000000000001f2
        ---[ end trace 0d147fd4aba8ff50 ]---
        Kernel panic - not syncing: Fatal exception in interrupt
      
      On x86-32, after decoding a frame pointer to get a regs address,
      regs_size() dereferences the regs pointer when it checks regs->cs to see
      if the regs are user mode.  This is dangerous because it's possible that
      what looks like a decoded frame pointer is actually a corrupt value, and
      we don't want the unwinder to make things worse.
      
      Instead of calling regs_size() on an unsafe pointer, just assume they're
      kernel regs to start with.  Later, once it's safe to access the regs, we
      can do the user mode check and corresponding safety check for the
      remaining two regs.
      Reported-and-tested-by: default avatarTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Reported-and-tested-by: default avatarFengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: default avatarJosh Poimboeuf <jpoimboe@redhat.com>
      Cc: Byungchul Park <byungchul.park@lge.com>
      Cc: LKP <lkp@01.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: 5ed8d8bb ("x86/unwind: Move common code into update_stack_state()")
      Link: http://lkml.kernel.org/r/7f95b9a6993dec7674b3f3ab3dcd3294f7b9644d.1507597785.git.jpoimboe@redhat.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      62dd86ac
  5. 09 Oct, 2017 2 commits
  6. 03 Oct, 2017 2 commits
  7. 01 Oct, 2017 9 commits
    • Linus Torvalds's avatar
      Linux 4.14-rc3 · 9e66317d
      Linus Torvalds authored
      9e66317d
    • Linus Torvalds's avatar
      Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 368f8998
      Linus Torvalds authored
      Pull x86 fixes from Thomas Gleixner:
       "This contains the following fixes and improvements:
      
         - Avoid dereferencing an unprotected VMA pointer in the fault signal
           generation code
      
         - Fix inline asm call constraints for GCC 4.4
      
         - Use existing register variable to retrieve the stack pointer
           instead of forcing the compiler to create another indirect access
           which results in excessive extra 'mov %rsp, %<dst>' instructions
      
         - Disable branch profiling for the memory encryption code to prevent
           an early boot crash
      
         - Fix a sparse warning caused by casting the __user annotation in
           __get_user_asm_u64() away
      
         - Fix an off by one error in the loop termination of the error patch
           in the x86 sysfs init code
      
         - Add missing CPU IDs to various Intel specific drivers to enable the
           functionality on recent hardware
      
         - More (init) constification in the numachip code"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/asm: Use register variable to get stack pointer value
        x86/mm: Disable branch profiling in mem_encrypt.c
        x86/asm: Fix inline asm call constraints for GCC 4.4
        perf/x86/intel/uncore: Correct num_boxes for IIO and IRP
        perf/x86/intel/rapl: Add missing CPU IDs
        perf/x86/msr: Add missing CPU IDs
        perf/x86/intel/cstate: Add missing CPU IDs
        x86: Don't cast away the __user in __get_user_asm_u64()
        x86/sysfs: Fix off-by-one error in loop termination
        x86/mm: Fix fault error path using unsafe vma pointer
        x86/numachip: Add const and __initconst to numachip2_clockevent
      368f8998
    • Linus Torvalds's avatar
      Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · c42ed9f9
      Linus Torvalds authored
      Pull timer fixes from Thomas Gleixner:
       "This adds a new timer wheel function which is required for the
        conversion of the timer callback function from the 'unsigned long
        data' argument to 'struct timer_list *timer'. This conversion has two
        benefits:
      
         1) It makes struct timer_list smaller
      
         2) Many callers hand in a pointer to the timer or to the structure
            containing the timer, which happens via type casting both at setup
            and in the callback. This change gets rid of the typecasts.
      
        Once the conversion is complete, which is planned for 4.15, the old
        setup function and the intermediate typecast in the new setup function
        go away along with the data field in struct timer_list.
      
        Merging this now into mainline allows a smooth queueing of the actual
        conversion in the affected maintainer trees without creating
        dependencies"
      
      * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        um/time: Fixup namespace collision
        timer: Prepare to change timer callback argument type
      c42ed9f9
    • Linus Torvalds's avatar
      Merge branch 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 82513545
      Linus Torvalds authored
      Pull smp/hotplug fixes from Thomas Gleixner:
       "This addresses the fallout of the new lockdep mechanism which covers
        completions in the CPU hotplug code.
      
        The lockdep splats are false positives, but there is no way to
        annotate that reliably. The solution is to split the completions for
        CPU up and down, which requires some reshuffling of the failure
        rollback handling as well"
      
      * 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        smp/hotplug: Hotplug state fail injection
        smp/hotplug: Differentiate the AP completion between up and down
        smp/hotplug: Differentiate the AP-work lockdep class between up and down
        smp/hotplug: Callback vs state-machine consistency
        smp/hotplug: Rewrite AP state machine core
        smp/hotplug: Allow external multi-instance rollback
        smp/hotplug: Add state diagram
      82513545
    • Linus Torvalds's avatar
      Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 7e103ace
      Linus Torvalds authored
      Pull scheduler fixes from Thomas Gleixner:
       "The scheduler pull request comes with the following updates:
      
         - Prevent a divide by zero issue by validating the input value of
           sysctl_sched_time_avg
      
         - Make task state printing consistent all over the place and have
           explicit state characters for IDLE and PARKED so they wont be
           displayed as 'D' state which confuses tools"
      
      * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched/sysctl: Check user input value of sysctl_sched_time_avg
        sched/debug: Add explicit TASK_PARKED printing
        sched/debug: Ignore TASK_IDLE for SysRq-W
        sched/debug: Add explicit TASK_IDLE printing
        sched/tracing: Use common task-state helpers
        sched/tracing: Fix trace_sched_switch task-state printing
        sched/debug: Remove unused variable
        sched/debug: Convert TASK_state to hex
        sched/debug: Implement consistent task-state printing
      7e103ace
    • Linus Torvalds's avatar
      Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 1c6f705b
      Linus Torvalds authored
      Pull perf fixes from Thomas Gleixner:
      
       - Prevent a division by zero in the perf aux buffer handling
      
       - Sync kernel headers with perf tool headers
      
       - Fix a build failure in the syscalltbl code
      
       - Make the debug messages of perf report --call-graph work correctly
      
       - Make sure that all required perf files are in the MANIFEST for
         container builds
      
       - Fix the atrr.exclude kernel handling so it respects the
         perf_event_paranoid and the user permissions
      
       - Make perf test on s390x work correctly
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf/aux: Only update ->aux_wakeup in non-overwrite mode
        perf test: Fix vmlinux failure on s390x part 2
        perf test: Fix vmlinux failure on s390x
        perf tools: Fix syscalltbl build failure
        perf report: Fix debug messages with --call-graph option
        perf evsel: Fix attr.exclude_kernel setting for default cycles:p
        tools include: Sync kernel ABI headers with tooling headers
        perf tools: Get all of tools/{arch,include}/ in the MANIFEST
      1c6f705b
    • Linus Torvalds's avatar
      Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 1de47f3c
      Linus Torvalds authored
      Pull  locking fixes from Thomas Gleixner:
       "Two fixes for locking:
      
         - Plug a hole the pi_stat->owner serialization which was changed
           recently and failed to fixup two usage sites.
      
         - Prevent reordering of the rwsem_has_spinner() check vs the
           decrement of rwsem count in up_write() which causes a missed
           wakeup"
      
      * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        locking/rwsem-xadd: Fix missed wakeup due to reordering of load
        futex: Fix pi_state->owner serialization
      1de47f3c
    • Linus Torvalds's avatar
      Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 3d9d62b9
      Linus Torvalds authored
      Pull irq fixes from Thomas Gleixner:
      
       - Add a missing NULL pointer check in free_irq()
      
       - Fix a memory leak/memory corruption in the generic irq chip
      
       - Add missing rcu annotations for radix tree access
      
       - Use ffs instead of fls when extracting data from a chip register in
         the MIPS GIC irq driver
      
       - Fix the unmasking of IPI interrupts in the MIPS GIC driver so they
         end up at the target CPU and not at CPU0
      
      * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        irq/generic-chip: Don't replace domain's name
        irqdomain: Add __rcu annotations to radix tree accessors
        irqchip/mips-gic: Use effective affinity to unmask
        irqchip/mips-gic: Fix shifts to extract register fields
        genirq: Check __free_irq() return value for NULL
      3d9d62b9
    • Linus Torvalds's avatar
      Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 156069f8
      Linus Torvalds authored
      Pull objtool fixes from Thomas Gleixner:
       "Two small fixes for objtool:
      
         - Support frame pointer setup via 'lea (%rsp), %rbp' which was not
           yet supported and caused build warnings
      
         - Disable unreacahble warnings for GCC4.4 and older to avoid false
           positives caused by the compiler itself"
      
      * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        objtool: Support unoptimized frame pointer setup
        objtool: Skip unreachable warnings for GCC 4.4 and older
      156069f8
  8. 30 Sep, 2017 4 commits
  9. 29 Sep, 2017 11 commits
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 99637e42
      Linus Torvalds authored
      Pull waitid fix from Al Viro:
       "Fix infoleak in waitid()"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        fix infoleak in waitid(2)
      99637e42
    • Linus Torvalds's avatar
      Merge branch 'for-4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · 5ba88cd6
      Linus Torvalds authored
      Pull btrfs fixes from David Sterba:
       "We've collected a bunch of isolated fixes, for crashes, user-visible
        behaviour or missing bits from other subsystem cleanups from the past.
      
        The overall number is not small but I was not able to make it
        significantly smaller. Most of the patches are supposed to go to
        stable"
      
      * 'for-4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
        btrfs: log csums for all modified extents
        Btrfs: fix unexpected result when dio reading corrupted blocks
        btrfs: Report error on removing qgroup if del_qgroup_item fails
        Btrfs: skip checksum when reading compressed data if some IO have failed
        Btrfs: fix kernel oops while reading compressed data
        Btrfs: use btrfs_op instead of bio_op in __btrfs_map_block
        Btrfs: do not backup tree roots when fsync
        btrfs: remove BTRFS_FS_QUOTA_DISABLING flag
        btrfs: propagate error to btrfs_cmp_data_prepare caller
        btrfs: prevent to set invalid default subvolid
        Btrfs: send: fix error number for unknown inode types
        btrfs: fix NULL pointer dereference from free_reloc_roots()
        btrfs: finish ordered extent cleaning if no progress is found
        btrfs: clear ordered flag on cleaning up ordered extents
        Btrfs: fix incorrect {node,sector}size endianness from BTRFS_IOC_FS_INFO
        Btrfs: do not reset bio->bi_ops while writing bio
        Btrfs: use the new helper wbc_to_write_flags
      5ba88cd6
    • Linus Torvalds's avatar
      Merge tag 'md/4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md · 7b5ef823
      Linus Torvalds authored
      Pull MD fixes from Shaohua Li:
       "A few fixes for MD. Mainly fix a problem introduced in 4.13, which we
        retry bio for some code paths but not all in some situations"
      
      * tag 'md/4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md:
        md/raid5: cap worker count
        dm-raid: fix a race condition in request handling
        md: fix a race condition for flush request handling
        md: separate request handling
      7b5ef823
    • Linus Torvalds's avatar
      Merge tag 'pci-v4.14-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci · 93b5533a
      Linus Torvalds authored
      Pull PCI fixes from Bjorn Helgaas:
      
       - fix CONFIG_PCI=n build error (introduced in v4.14-rc1) (Geert
         Uytterhoeven)
      
       - fix a race in sysfs driver_override store/show (Nicolai Stange)
      
      * tag 'pci-v4.14-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
        PCI: Fix race condition with driver_override
        PCI: Add dummy pci_acs_enabled() for CONFIG_PCI=n build
      93b5533a
    • Linus Torvalds's avatar
      Merge tag 'drm-fixes-for-v4.14-rc3' of git://people.freedesktop.org/~airlied/linux · a3583202
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Regular fixes pull, some amdkfd, amdgpu, etnaviv, sun4i, qxl, tegra
        fixes.
      
        I've got an outstanding pull for i915 but it wasn't on an rc2 base so
        I wanted to ship these out first, I might get to it before rc3 or I
        might not"
      
      * tag 'drm-fixes-for-v4.14-rc3' of git://people.freedesktop.org/~airlied/linux:
        drm/tegra: trace: Fix path to include
        qxl: fix framebuffer unpinning
        drm/sun4i: cec: Enable back CEC-pin framework
        drm/amdkfd: Print event limit messages only once per process
        drm/amdkfd: Fix kernel-queue wrapping bugs
        drm/amdkfd: Fix incorrect destroy_mqd parameter
        drm/radeon: disable hard reset in hibernate for APUs
        drm/amdgpu: revert tile table update for oland
        etnaviv: fix gem object list corruption
        etnaviv: fix submit error path
        qxl: fix primary surface handling
        drm/amdkfd: check for null dev to avoid a null pointer dereference
      a3583202
    • Linus Torvalds's avatar
      Merge tag 'iommu-fixes-v4.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu · 35dbba31
      Linus Torvalds authored
      Pull IOMMU fixes from Joerg Roedel:
      
       - A comment fix for 'struct iommu_ops'
      
       - Format string fixes for AMD IOMMU, unfortunatly I missed that during
         review.
      
       - Limit mediatek physical addresses to 32 bit for v7s to fix a warning
         triggered in io-page-table code.
      
       - Fix dma-sync in io-pgtable-arm-v7s code
      
      * tag 'iommu-fixes-v4.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
        iommu: Fix comment for iommu_ops.map_sg
        iommu/amd: pr_err() strings should end with newlines
        iommu/mediatek: Limit the physical address in 32bit for v7s
        iommu/io-pgtable-arm-v7s: Need dma-sync while there is no QUIRK_NO_DMA
      35dbba31
    • Linus Torvalds's avatar
      Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · 06482600
      Linus Torvalds authored
      Pull arm64 fixes from Catalin Marinas:
      
       - SPsel register initialisation on reset as the architecture defines
         its state as unknown
      
       - Use READ_ONCE when dereferencing pmd_t pointers to avoid race
         conditions in page_vma_mapped_walk() (or fast GUP) with concurrent
         modifications of the page table
      
       - Avoid invoking the mm fault handling code for kernel addresses (check
         against TASK_SIZE) which would otherwise result in calling
         might_sleep() in atomic context
      
      * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
        arm64: fault: Route pte translation faults via do_translation_fault
        arm64: mm: Use READ_ONCE when dereferencing pointer to pte table
        arm64: Make sure SPsel is always set
      06482600
    • Linus Torvalds's avatar
      Merge tag 'for-linus-4.14c-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · 9f2a5128
      Linus Torvalds authored
      Pull xen fixes from Juergen Gross:
      
       - avoid a warning when compiling with clang
      
       - consider read-only bits in xen-pciback when writing to a BAR
      
       - fix a boot crash of pv-domains
      
      * tag 'for-linus-4.14c-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
        xen/mmu: Call xen_cleanhighmap() with 4MB aligned for page tables mapping
        xen-pciback: relax BAR sizing write value check
        x86/xen: clean up clang build warning
      9f2a5128
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 42057e18
      Linus Torvalds authored
      Pull kvm fixes from Paolo Bonzini:
       "Mixed bugfixes. Perhaps the most interesting one is a latent bug that
        was finally triggered by PCID support"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        kvm/x86: Handle async PF in RCU read-side critical sections
        KVM: nVMX: Fix nested #PF intends to break L1's vmlauch/vmresume
        KVM: VMX: use cmpxchg64
        KVM: VMX: simplify and fix vmx_vcpu_pi_load
        KVM: VMX: avoid double list add with VT-d posted interrupts
        KVM: VMX: extract __pi_post_block
        KVM: PPC: Book3S HV: Check for updated HDSISR on P9 HDSI exception
        KVM: nVMX: fix HOST_CR3/HOST_CR4 cache
      42057e18
    • Al Viro's avatar
      fix infoleak in waitid(2) · 6c85501f
      Al Viro authored
      kernel_waitid() can return a PID, an error or 0.  rusage is filled in the first
      case and waitid(2) rusage should've been copied out exactly in that case, *not*
      whenever kernel_waitid() has not returned an error.  Compat variant shares that
      braino; none of kernel_wait4() callers do, so the below ought to fix it.
      Reported-and-tested-by: default avatarAlexander Potapenko <glider@google.com>
      Fixes: ce72a16f ("wait4(2)/waitid(2): separate copying rusage to userland")
      Cc: stable@vger.kernel.org # v4.13
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      6c85501f
    • Andrey Ryabinin's avatar
      x86/asm: Use register variable to get stack pointer value · 196bd485
      Andrey Ryabinin authored
      Currently we use current_stack_pointer() function to get the value
      of the stack pointer register. Since commit:
      
        f5caf621 ("x86/asm: Fix inline asm call constraints for Clang")
      
      ... we have a stack register variable declared. It can be used instead of
      current_stack_pointer() function which allows to optimize away some
      excessive "mov %rsp, %<dst>" instructions:
      
       -mov    %rsp,%rdx
       -sub    %rdx,%rax
       -cmp    $0x3fff,%rax
       -ja     ffffffff810722fd <ist_begin_non_atomic+0x2d>
      
       +sub    %rsp,%rax
       +cmp    $0x3fff,%rax
       +ja     ffffffff810722fa <ist_begin_non_atomic+0x2a>
      
      Remove current_stack_pointer(), rename __asm_call_sp to current_stack_pointer
      and use it instead of the removed function.
      Signed-off-by: default avatarAndrey Ryabinin <aryabinin@virtuozzo.com>
      Reviewed-by: default avatarJosh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20170929141537.29167-1-aryabinin@virtuozzo.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      196bd485