1. 11 Aug, 2016 3 commits
    • x86/irq: Do not substract irq_tlb_count from irq_call_count · 82ba4fac
      Aaron Lu authored
      Since commit:
      
        52aec330 ("x86/tlb: replace INVALIDATE_TLB_VECTOR by CALL_FUNCTION_VECTOR")
      
      the TLB remote shootdown is done through call function vector. That
      commit didn't take care of irq_tlb_count, which a later commit:
      
        fd0f5869 ("x86: Distinguish TLB shootdown interrupts from other functions call interrupts")
      
      ... tried to fix.
      
      The fix assumes every increase of irq_tlb_count has a corresponding
      increase of irq_call_count. So the irq_call_count is always bigger than
      irq_tlb_count and we could subtract irq_tlb_count from irq_call_count.
      
      Unfortunately this is not true for the smp_call_function_single() case.
      The IPI is only sent if the target CPU's call_single_queue is empty when
      adding a csd into it in generic_exec_single. That means if two threads
      are both adding flush tlb csds to the same CPU's call_single_queue, only
      one IPI is sent. In other words, the irq_call_count is incremented by 1
      but irq_tlb_count is incremented by 2. Over time, irq_tlb_count will be
      bigger than irq_call_count and the subtraction will produce a very large
      irq_call_count value due to unsigned wraparound.
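
The wraparound can be reproduced in user space with a minimal sketch. The struct and function names below are hypothetical stand-ins, not the kernel's own code; the only assumption carried over is that both counters are unsigned.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical per-CPU counters; both unsigned, so subtraction can wrap. */
struct irq_counts {
    unsigned int call_count; /* one increment per call function IPI received */
    unsigned int tlb_count;  /* one increment per TLB flush request handled */
};

/* Old reporting scheme: assumes call_count >= tlb_count and subtracts. */
static unsigned int reported_calls_old(const struct irq_counts *c)
{
    assert(c != NULL);
    return c->call_count - c->tlb_count; /* wraps when tlb_count is larger */
}

/* Scheme described in this commit: report call_count unchanged. */
static unsigned int reported_calls_new(const struct irq_counts *c)
{
    assert(c != NULL);
    return c->call_count;
}

/* Two flush requests are queued to one CPU, but only one IPI is sent. */
static struct irq_counts two_flushes_one_ipi(void)
{
    struct irq_counts c = { 0, 0 };

    c.call_count += 1;
    c.tlb_count += 2;
    return c;
}
```

For the state returned by `two_flushes_one_ipi()`, the old scheme reports `UINT_MAX` while the new one reports 1.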
      
      Considering that:
      
        1) it's not worth sending more IPIs for the sake of accurate counting of
           irq_call_count in generic_exec_single();
      
        2) it's not easy to tell if the call function interrupt is for TLB
           shootdown in __smp_call_function_single_interrupt().
      
      Not excluding TLB shootdowns from the call function count seems to be the
      simplest fix, and this patch does just that.
      
      This bug was found by LKP's cyclic performance regression tracking recently
      with the vm-scalability test suite. I have bisected to commit:
      
        3dec0ba0 ("mm/rmap: share the i_mmap_rwsem")
      
      This commit didn't do anything wrong but revealed the irq_call_count
      problem. IIUC, the commit makes rwc->remap_one in rmap_walk_file
      concurrent with multiple threads.  When remap_one is try_to_unmap_one(),
      then multiple threads could queue flush TLB to the same CPU but only
      one IPI will be sent.
      
      Since the commit was added in Linux v3.19, the counting problem only
      shows up from v3.19 onwards.
      Signed-off-by: Aaron Lu <aaron.lu@intel.com>
      Cc: Alex Shi <alex.shi@linaro.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Huang Ying <ying.huang@intel.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tomoki Sekiyama <tomoki.sekiyama.qu@hitachi.com>
      Link: http://lkml.kernel.org/r/20160811074430.GA18163@aaronlu.sh.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/mm: Fix swap entry comment and macro · ace7fab7
      Dave Hansen authored
      A recent patch changed the format of a swap PTE.
      
      The comment explaining the format of the swap PTE is wrong about
      the bits used for the swap type field.  Amusingly, the ASCII art
      and the patch description are correct, but the comment itself
      is wrong.
      
      As I was looking at this, I also noticed that the
      SWP_OFFSET_FIRST_BIT has an off-by-one error.  This does not
      really hurt anything.  It just wasted a bit of space in the PTE,
      giving us 2^59 bytes of addressable space in our swapfiles
      instead of 2^60.  But, it doesn't match with the comments, and it
      wastes a bit of space, so fix it.
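
The cost of one lost offset bit can be checked with a little arithmetic. This is a sketch under stated assumptions: pages are taken to be 4 KiB, and the bit counts 47 and 48 are inferred from the 2^59 and 2^60 figures above, not read from the kernel headers. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Bytes addressable in a swapfile when the PTE stores offset_bits of page offset. */
static uint64_t swap_addressable_bytes(unsigned int offset_bits)
{
    assert(offset_bits + SKETCH_PAGE_SHIFT < 64);
    return (uint64_t)1 << (offset_bits + SKETCH_PAGE_SHIFT);
}
```

Under these assumptions, 48 offset bits give 2^60 bytes and 47 bits give 2^59, so each wasted bit halves the addressable swap space.
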
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Fixes: 00839ee3 ("x86/mm: Move swap offset/type up in PTE to work around erratum")
      Link: http://lkml.kernel.org/r/20160810172325.E56AD7DA@viggo.jf.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/mm/kaslr: Fix -Wformat-security warning · 62d16b5a
      Nicolas Iooss authored
      debug_putstr() is used to output strings without using printf-like
      formatting but debug_putstr(v) is defined as early_printk(v) in
      arch/x86/lib/kaslr.c.
      
      This makes clang report the following warning when building
      with -Wformat-security:
      
          arch/x86/lib/kaslr.c:57:15: warning: format string is not a string
          literal (potentially insecure) [-Wformat-security]
                  debug_putstr(purpose);
                               ^~~~~~~
      
      Fix this by using "%s" in early_printk().
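
The pattern behind the fix can be sketched in user space. `sketch_printk()` and `sketch_putstr()` below are hypothetical stand-ins for `early_printk()` and `debug_putstr()`; they write into a buffer so the result can be inspected.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

static char sketch_log[128]; /* stand-in for the early console */

/* printf-like sink, standing in for early_printk(). */
static void sketch_printk(const char *fmt, ...)
{
    va_list ap;

    assert(fmt != NULL);
    va_start(ap, fmt);
    vsnprintf(sketch_log, sizeof(sketch_log), fmt, ap);
    va_end(ap);
}

/*
 * Passing the string as the argument of a literal "%s" format means any
 * '%' inside it is printed verbatim and never parsed as a conversion.
 */
#define sketch_putstr(v) sketch_printk("%s", v)
```

With this definition, `sketch_putstr("100%s done")` stores the text unchanged, whereas passing the same string directly as the format would make the sink look for a missing argument.
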
      Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
      Acked-by: Kees Cook <keescook@chromium.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20160806102039.27221-1-nicolas.iooss_linux@m4x.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  2. 10 Aug, 2016 11 commits
    • x86/mm/pkeys: Fix compact mode by removing protection keys' XSAVE buffer manipulation · b79daf85
      Dave Hansen authored
      The Memory Protection Keys "rights register" (PKRU) is
      XSAVE-managed, and is saved/restored along with the FPU state.
      
      When kernel code accesses FPU registers, it does a delicate
      dance with preempt.  Otherwise, the context switching code can
      get confused as to whether the most up-to-date state is in the
      registers themselves or in the XSAVE buffer.
      
      But, PKRU is not a normal FPU register.  Using it does not
      generate the normal device-not-available (#NM) exceptions which
      means we can not manage it lazily, and the kernel completely
      disallows using lazy mode when it is enabled.
      
      The dance with preempt *only* occurs when managing the FPU
      lazily.  Since we never manage PKRU lazily, we do not have to do
      the dance with preempt; we can access it directly.  Doing it
      this way saves a ton of complicated code (and is faster too).
      
      Further, the XSAVES reenabling failed to patch a bit of code
      in fpu__xfeature_set_state() that checked for compacted buffers.
      That check caused fpu__xfeature_set_state() to silently refuse to
      work when the kernel is using compacted XSAVE buffers.  This
      broke execute-only and future pkey_mprotect() support when using
      compact XSAVE buffers.
      
      But, removing fpu__xfeature_set_state() gets rid of this issue,
      in addition to the nice cleanup and speedup.
      
      This fixes the same thing as a fix that Sai posted:
      
        https://lkml.org/lkml/2016/7/25/637
      
      The fix that he posted is much more obviously correct, but I
      think we should just do this instead.
      Reported-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Dave Hansen <dave@sr71.net>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
      Cc: Ravi Shankar <ravi.v.shankar@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yu-Cheng Yu <yu-cheng.yu@intel.com>
      Link: http://lkml.kernel.org/r/20160727232040.7D060DAD@viggo.jf.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/build: Reduce the W=1 warnings noise when compiling x86 syscall tables · 5e44258d
      Valdis Kletnieks authored
      Building an X86_64 kernel with W=1 throws a total of 9,948 lines of warnings of
      this form for both 32-bit and 64-bit syscall tables. Given that the entire rest
      of the build for my config only generates 8,375 lines of output, silencing them
      is a big reduction in the warnings generated.
      
      The warnings follow this pattern:
      
        ./arch/x86/include/generated/asm/syscalls_32.h:885:21: warning: initialized field overwritten [-Woverride-init]
         __SYSCALL_I386(379, compat_sys_pwritev2, )
                           ^
        arch/x86/entry/syscall_32.c:13:46: note: in definition of macro '__SYSCALL_I386'
         #define __SYSCALL_I386(nr, sym, qual) [nr] = sym,
                                                    ^~~
        ./arch/x86/include/generated/asm/syscalls_32.h:885:21: note: (near initialization for 'ia32_sys_call_table[379]')
         __SYSCALL_I386(379, compat_sys_pwritev2, )
                           ^
        arch/x86/entry/syscall_32.c:13:46: note: in definition of macro '__SYSCALL_I386'
         #define __SYSCALL_I386(nr, sym, qual) [nr] = sym,
      
      Since we intentionally build the syscall tables this way, ignore that one
      warning in the two files.
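
A table that triggers this kind of warning can be sketched as follows. It assumes the table is first filled with a default entry and then overwritten slot by slot; the names are hypothetical, and the range designator `[0 ... N]` is a GNU C extension.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NR_MAX 3

typedef long (*sketch_call_t)(void);

static long sketch_ni(void)    { return -1; } /* "not implemented" default */
static long sketch_read(void)  { return 10; }
static long sketch_write(void) { return 20; }

/* Same shape as the __SYSCALL_I386() macro quoted above, minus the qualifier. */
#define SKETCH_SYSCALL(nr, sym) [nr] = sym,

static const sketch_call_t sketch_table[SKETCH_NR_MAX + 1] = {
    /* Default every slot first (GNU range designator). */
    [0 ... SKETCH_NR_MAX] = sketch_ni,
    /* Each entry below overwrites a default; -Woverride-init flags this. */
    SKETCH_SYSCALL(0, sketch_read)
    SKETCH_SYSCALL(1, sketch_write)
};

/* Call the handler registered for slot nr. */
static long sketch_dispatch(size_t nr)
{
    assert(nr <= SKETCH_NR_MAX);
    return sketch_table[nr]();
}
```

Slots 0 and 1 dispatch to the overriding handlers and the remaining slots keep the default, which is the intended result even though every override draws a warning.
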
      Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/7464.1470021890@turing-police.cc.vt.edu
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/platform/UV: Fix kernel panic running RHEL kdump kernel on UV systems · 5a52e8f8
      Mike Travis authored
      The latest UV kernel support panics when RHEL7 kexec's the kdump kernel
      to make a dumpfile.  This patch fixes the problem by turning off all UV
      support if NUMA is off.
      Tested-by: Frank Ramsay <framsay@sgi.com>
      Tested-by: John Estabrook <estabrook@sgi.com>
      Signed-off-by: Mike Travis <travis@sgi.com>
      Reviewed-by: Dimitri Sivanich <sivanich@sgi.com>
      Reviewed-by: Nathan Zimmer <nzimmer@sgi.com>
      Cc: Alex Thorlton <athorlton@sgi.com>
      Cc: Andrew Banman <abanman@sgi.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Russ Anderson <rja@sgi.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20160801184050.577755634@asylum.americas.sgi.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/platform/UV: Fix problem with UV4 BIOS providing incorrect PXM values · 22ac2bca
      Mike Travis authored
      There are some circumstances where the UV4 BIOS cannot provide the
      correct Proximity Node values to associate with specific Sockets and
      Physical Nodes.  The decision was made to remove these values from BIOS
      and for the kernel to get these values from the standard ACPI tables.
      Tested-by: Frank Ramsay <framsay@sgi.com>
      Tested-by: John Estabrook <estabrook@sgi.com>
      Signed-off-by: Mike Travis <travis@sgi.com>
      Reviewed-by: Dimitri Sivanich <sivanich@sgi.com>
      Reviewed-by: Nathan Zimmer <nzimmer@sgi.com>
      Cc: Alex Thorlton <athorlton@sgi.com>
      Cc: Andrew Banman <abanman@sgi.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Russ Anderson <rja@sgi.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20160801184050.414210079@asylum.americas.sgi.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/platform/UV: Fix bug with iounmap() of the UV4 EFI System Table causing a crash · e363d24c
      Mike Travis authored
      Save the uv_systab::size field before doing the iounmap()
      of the struct pointer, to avoid a NULL dereference crash.
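
The ordering described here can be sketched in user space, with `malloc()`/`free()` standing in for `ioremap()`/`iounmap()`. The struct and helpers are hypothetical and show only the "copy the field, then unmap" order, not the kernel's actual code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical table with the one field the caller needs after unmapping. */
struct sketch_systab {
    unsigned int size;
};

/* Stand-in for ioremap(): obtain a mapping of the table. */
static struct sketch_systab *sketch_map(unsigned int size)
{
    struct sketch_systab *st = malloc(sizeof(*st));

    if (st)
        st->size = size;
    return st;
}

/* Stand-in for iounmap(): the pointer must not be dereferenced afterwards. */
static void sketch_unmap(struct sketch_systab *st)
{
    free(st);
}

/* Copy the field while the mapping is still valid, then unmap; use only the copy. */
static unsigned int sketch_read_size_then_unmap(struct sketch_systab *st)
{
    unsigned int size;

    assert(st != NULL);
    size = st->size;
    sketch_unmap(st);
    return size;
}
```

Reading `st->size` after `sketch_unmap(st)` would be the user-space analogue of the crash this commit avoids.
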
      Tested-by: Frank Ramsay <framsay@sgi.com>
      Tested-by: John Estabrook <estabrook@sgi.com>
      Signed-off-by: Mike Travis <travis@sgi.com>
      Reviewed-by: Dimitri Sivanich <sivanich@sgi.com>
      Reviewed-by: Nathan Zimmer <nzimmer@sgi.com>
      Cc: Alex Thorlton <athorlton@sgi.com>
      Cc: Andrew Banman <abanman@sgi.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Russ Anderson <rja@sgi.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20160801184050.250424783@asylum.americas.sgi.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/platform/UV: Fix problem with UV4 Socket IDs not being contiguous · 054f621f
      Mike Travis authored
      The UV4 Socket IDs are not guaranteed to equate to Node values which
      can cause the GAM (Global Addressable Memory) table lookups to fail.
      Fix this by using an independent index into the GAM table instead of
      the Socket ID to reference the base address.
      Tested-by: Frank Ramsay <framsay@sgi.com>
      Tested-by: John Estabrook <estabrook@sgi.com>
      Signed-off-by: Mike Travis <travis@sgi.com>
      Reviewed-by: Dimitri Sivanich <sivanich@sgi.com>
      Reviewed-by: Nathan Zimmer <nzimmer@sgi.com>
      Cc: Alex Thorlton <athorlton@sgi.com>
      Cc: Andrew Banman <abanman@sgi.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Russ Anderson <rja@sgi.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20160801184050.048755337@asylum.americas.sgi.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/entry: Clarify the RF saving/restoring situation with SYSCALL/SYSRET · 3e035305
      Borislav Petkov authored
      Clarify why exactly RF cannot be restored properly by SYSRET to avoid
      confusion.
      
      No functionality change.
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Acked-by: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20160803171429.GA2590@nazgul.tnic
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/mm: Disable preemption during CR3 read+write · 5cf0791d
      Sebastian Andrzej Siewior authored
      There's a subtle preemption race on UP kernels:
      
      Usually current->mm (and therefore mm->pgd) stays the same during the
      lifetime of a task so it does not matter if a task gets preempted during
      the read and write of the CR3.
      
      But then, there is this scenario on x86-UP:
      
      TaskA is in do_exit() and exit_mm() sets current->mm = NULL followed by:
      
       -> mmput()
       -> exit_mmap()
       -> tlb_finish_mmu()
       -> tlb_flush_mmu()
       -> tlb_flush_mmu_tlbonly()
       -> tlb_flush()
       -> flush_tlb_mm_range()
       -> __flush_tlb_up()
       -> __flush_tlb()
       ->  __native_flush_tlb()
      
      At this point current->mm is NULL but current->active_mm still points to
      the "old" mm.
      
      Let's preempt taskA _after_ native_read_cr3() by taskB. TaskB has its
      own mm so CR3 has changed.
      
      Now preempt back to taskA. TaskA has no ->mm set so it borrows taskB's
      mm and so CR3 remains unchanged. Once taskA gets active it continues
      where it was interrupted and that means it writes its old CR3 value
      back. Everything is fine because userland won't need its memory
      anymore.
      
      Now the fun part:
      
      Let's preempt taskA one more time and get back to taskB. This
      time switch_mm() won't do a thing because oldmm (->active_mm)
      is the same as mm (as per context_switch()). So we remain
      with a bad CR3 / PGD and return to userland.
      
      The next thing that happens is handle_mm_fault() with an address for
      the execution of its code in userland. handle_mm_fault() realizes that
      it has a PTE with proper rights so it returns doing nothing. But the
      CPU looks at the wrong PGD and insists that something is wrong and
      faults again. And again. And one more time…
      
      This pagefault circle continues until the scheduler gets tired of it and
      puts another task on the CPU. It gets a little difficult if the task is a
      RT task with a high priority. The system will either freeze or it gets
      fixed by the software watchdog thread which usually runs at RT-max prio.
      But waiting for the watchdog will increase the latency of the RT task
      which is no good.
      
      Fix this by disabling preemption across the critical code section.
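
The race can be illustrated with a single-threaded toy model. It is a deliberate simplification of the scenario above: there is no real scheduler or mm borrowing, just one explicit preemption point between the CR3 read and the write. Every name is hypothetical.

```c
#include <assert.h>

static unsigned long sketch_cr3;       /* simulated CR3 register */
static unsigned long sketch_active_mm; /* root the scheduler believes is loaded */
static int sketch_preempt_count;       /* > 0 means preemption is disabled */

static void sketch_reset(unsigned long root)
{
    sketch_cr3 = root;
    sketch_active_mm = root;
    sketch_preempt_count = 0;
}

/* Simulated preemption: another task runs and loads its own page table root. */
static void sketch_preempt_point(unsigned long other_root)
{
    if (sketch_preempt_count > 0)
        return; /* preemption disabled: nothing else can run here */
    sketch_active_mm = other_root;
    sketch_cr3 = other_root;
}

/* Flush the TLB by rewriting CR3, optionally guarding the read+write pair. */
static void sketch_flush_tlb(int guard, unsigned long other_root)
{
    unsigned long val;

    assert(sketch_preempt_count >= 0);
    if (guard)
        sketch_preempt_count++;       /* models preempt_disable() */
    val = sketch_cr3;                 /* models native_read_cr3() */
    sketch_preempt_point(other_root); /* window between read and write */
    sketch_cr3 = val;                 /* models native_write_cr3() */
    if (guard)
        sketch_preempt_count--;       /* models preempt_enable() */
}

/* CR3 is consistent when it holds the root the scheduler expects. */
static int sketch_cr3_consistent(void)
{
    return sketch_cr3 == sketch_active_mm;
}
```

Without the guard, the write puts the stale root back while the scheduler believes the other task's root is loaded; with the guard, the read and write cannot be separated.
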
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: Rik van Riel <riel@redhat.com>
      Acked-by: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-mm@kvack.org
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/1470404259-26290-1-git-send-email-bigeasy@linutronix.de
      [ Prettified the changelog. ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/mm/KASLR: Increase BRK pages for KASLR memory randomization · fb754f95
      Thomas Garnier authored
      The default implementation expects at most 6 pages to be needed for low page
      allocations. If KASLR memory randomization is enabled, the worst case
      of e820 layout would require 12 pages (no large pages). It is due to the
      PUD level randomization and the variable e820 memory layout.
      
      This bug was found while doing extensive testing of KASLR memory
      randomization on different types of hardware.
      Signed-off-by: Thomas Garnier <thgarnie@google.com>
      Cc: Aleksey Makarov <aleksey.makarov@linaro.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Fabian Frederick <fabf@skynet.be>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Lv Zheng <lv.zheng@intel.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: kernel-hardening@lists.openwall.com
      Fixes: 021182e5 ("Enable KASLR for physical mapping memory regions")
      Link: http://lkml.kernel.org/r/1470762665-88032-2-git-send-email-thgarnie@google.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/mm/KASLR: Fix physical memory calculation on KASLR memory randomization · c7d2361f
      Thomas Garnier authored
      Initialize KASLR memory randomization after max_pfn is initialized. Also
      ensure the size is rounded up. Otherwise, problems could occur on machines
      with more than 1 TB of memory at certain random addresses.
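
The difference between truncating and rounding up a size to whole terabytes can be sketched as below. The helper names and the 2^40-byte unit are assumptions for illustration; the kernel's actual expression is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_TB_SHIFT 40 /* assumption: 1 TB counted as 2^40 bytes */

/* Truncating division: any partial terabyte is silently dropped. */
static uint64_t sketch_tb_truncated(uint64_t bytes)
{
    return bytes >> SKETCH_TB_SHIFT;
}

/* Rounding up: a partial terabyte counts as a whole one. */
static uint64_t sketch_tb_rounded_up(uint64_t bytes)
{
    const uint64_t unit = (uint64_t)1 << SKETCH_TB_SHIFT;

    assert(bytes <= UINT64_MAX - (unit - 1)); /* keep the addition from wrapping */
    return (bytes + unit - 1) >> SKETCH_TB_SHIFT;
}
```

For 1.5 TB of memory the truncated count is 1 and the rounded count is 2, so a region sized from the truncated value would be too small to cover all of memory.
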
      Signed-off-by: Thomas Garnier <thgarnie@google.com>
      Cc: Aleksey Makarov <aleksey.makarov@linaro.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Fabian Frederick <fabf@skynet.be>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Lv Zheng <lv.zheng@intel.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: kernel-hardening@lists.openwall.com
      Fixes: 021182e5 ("Enable KASLR for physical mapping memory regions")
      Link: http://lkml.kernel.org/r/1470762665-88032-1-git-send-email-thgarnie@google.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86, kasan, ftrace: Put APIC interrupt handlers into .irqentry.text · 469f0023
      Alexander Potapenko authored
      Dmitry Vyukov has reported unexpected KASAN stackdepot growth:
      
        https://github.com/google/kasan/issues/36
      
      ... which is caused by the APIC handlers not being present in .irqentry.text:
      
      When building with CONFIG_FUNCTION_GRAPH_TRACER=y or CONFIG_KASAN=y, put the
      APIC interrupt handlers into the .irqentry.text section. This is needed
      because both KASAN and function graph tracer use __irqentry_text_start and
      __irqentry_text_end to determine whether a function is an IRQ entry point.
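
The section check can be sketched as an address range test. The sketch assumes a consumer that trims a stack trace at the first IRQ entry frame, which is one plausible reason untrimmed traces would inflate a stack depot; the struct, helpers, and addresses are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical section bounds; in the kernel they would come from the linker. */
struct sketch_section {
    uintptr_t start;
    uintptr_t end; /* exclusive */
};

/* True when addr lies inside the section, i.e. in an IRQ entry function. */
static int sketch_in_irqentry_text(const struct sketch_section *s, uintptr_t addr)
{
    assert(s->start <= s->end);
    return addr >= s->start && addr < s->end;
}

/*
 * Return how many frames to keep: everything up to and including the first
 * IRQ entry frame, or all n frames when none is found.
 */
static unsigned int sketch_trim_at_irqentry(const struct sketch_section *s,
                                            const uintptr_t *frames,
                                            unsigned int n)
{
    unsigned int i;

    for (i = 0; i < n; i++)
        if (sketch_in_irqentry_text(s, frames[i]))
            return i + 1;
    return n;
}
```

If a handler is not placed in the section, the trace is never trimmed, and the frames of whatever code was interrupted stay attached to it.
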
      Reported-by: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: Alexander Potapenko <glider@google.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: aryabinin@virtuozzo.com
      Cc: kasan-dev@googlegroups.com
      Cc: kcc@google.com
      Cc: rostedt@goodmis.org
      Link: http://lkml.kernel.org/r/1468575763-144889-1-git-send-email-glider@google.com
      [ Minor edits. ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  3. 09 Aug, 2016 14 commits
    • Revert "printk: create pr_<level> functions" · a0cba217
      Linus Torvalds authored
      This reverts commit 874f9c7d.
      
      Geert Uytterhoeven reports:
       "This change seems to have an (unintendent?) side-effect.
      
        Before, pr_*() calls without a trailing newline characters would be
        printed with a newline character appended, both on the console and in
        the output of the dmesg command.
      
        After this commit, no new line character is appended, and the output
        of the next pr_*() call of the same type may be appended, like in:
      
          - Truncating RAM at 0x0000000040000000-0x00000000c0000000 to -0x0000000070000000
          - Ignoring RAM at 0x0000000200000000-0x0000000240000000 (!CONFIG_HIGHMEM)
          + Truncating RAM at 0x0000000040000000-0x00000000c0000000 to -0x0000000070000000Ignoring RAM at 0x0000000200000000-0x0000000240000000 (!CONFIG_HIGHMEM)"
      
      Joe Perches says:
       "No, that is not intentional.
      
        The newline handling code inside vprintk_emit is a bit involved and
        for now I suggest a revert until this has all the same behavior as
        earlier"
      Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Requested-by: Joe Perches <joe@perches.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge tag 'trace-v4.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace · 84bd8d33
      Linus Torvalds authored
      Pull tracing fix from Steven Rostedt:
       "Fix tick_stop tracepoint symbols for user export.
      
        Luiz Capitulino noticed that the tick_stop tracepoint wasn't being
        parsed properly by the tracing user space tools.
      
        This was due to the TRACE_DEFINE_ENUM() being set to a define, when it
        should have been set to the enum itself.  The define was of the MASK
        that used the BIT to shift.  The BIT was the enum and by adding that,
        everything gets converted nicely.  The MASK is still kept just in case
        it gets converted to an enum in the future"
      
      * tag 'trace-v4.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        tracing: Fix tick_stop tracepoint symbols for user export
    • Merge tag 'gcc-plugin-infrastructure-v4.8-rc2' of... · b79f34d6
      Linus Torvalds authored
      Merge tag 'gcc-plugin-infrastructure-v4.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
      
      Pull gcc plugin improvements from Kees Cook:
       "Several fixes/improvements for the gcc plugin infrastructure:
      
         - fix a problem with gcc plugins interfering with cc-option tests.
      
         - abort more gracefully when gcc plugin headers or compiler support
           is missing.
      
         - improve the gcc plugin rule generation to be more dynamic, pass
           arguments, and build from subdirectories"
      
      * tag 'gcc-plugin-infrastructure-v4.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
        gcc-plugins: Add support for plugin subdirectories
        gcc-plugins: Automate make rule generation
        gcc-plugins: Add support for passing plugin arguments
        gcc-plugins: abort builds cleanly when not supported
        kbuild: no gcc-plugins during cc-option tests
    • Merge tag 'platform-drivers-x86-v4.8-3' of... · e1d009ea
      Linus Torvalds authored
      Merge tag 'platform-drivers-x86-v4.8-3' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86
      
      Pull x86 platform driver update from Darren Hart:
       "dell-wmi: ignore battery remove/insert event"
      
      * tag 'platform-drivers-x86-v4.8-3' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86:
        dell-wmi: Ignore WMI event 0xe00e
    • Merge tag 'drm-fixes-for-4.8-rc2' of git://people.freedesktop.org/~airlied/linux · cb0d93aa
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "This contains a bunch of amdgpu fixes, and some i915 regression fixes.
      
        It also contains some fixes for an older regression with some EDID
        changes and some 6bpc panels.
      
        Then there are the lockdep, cirrus and rcar-du regression fixes from
        this window"
      
      * tag 'drm-fixes-for-4.8-rc2' of git://people.freedesktop.org/~airlied/linux:
        drm/cirrus: Fix NULL pointer dereference when registering the fbdev
        drm/edid: Set 8 bpc color depth for displays with "DFP 1.x compliant TMDS".
        drm/i915/dp: Revert "drm/i915/dp: fall back to 18 bpp when sink capability is unknown"
        drm/edid: Add 6 bpc quirk for display AEO model 0.
        drm: Paper over locking inversion after registration rework
        drm: rcar-du: Link HDMI encoder with bridge
        drm/ttm: Wait for a BO to become idle before unbinding it from GTT
        drm/i915/fbdev: Check for the framebuffer before use
        drm/amdgpu: update golden setting of polaris10
        drm/amdgpu: update golden setting of stoney
        drm/amdgpu: update golden setting of polaris11
        drm/amdgpu: update golden setting of carrizo
        drm/amdgpu: update golden setting of iceland
        drm/amd/amdgpu: change pptable output format from ASCII to binary
        drm/amdgpu/ci: add mullins to default case for smc ucode
        drm/amdgpu/gmc7: add missing mullins case
        drm/i915: Never fully mask the the EI up rps interrupt on SNB/IVB
        drm/i915: Wait up to 3ms for the pcu to ack the cdclk change request on SKL
    • ipr: Fix sync scsi scan · a3d1ddd9
      Brian King authored
      Commit b195d5e2 ("ipr: Wait to do async scan until scsi host is
      initialized") fixed async scan for ipr, but broke sync scan for ipr.
      
      This fixes sync scan back up.
      Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
      Reported-and-tested-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      a3d1ddd9
    • Vladimir Davydov's avatar
      mm: memcontrol: only mark charged pages with PageKmemcg · c4159a75
      Vladimir Davydov authored
      To distinguish non-slab pages charged to kmemcg we mark them PageKmemcg,
      which sets page->_mapcount to -512.  Currently, we set/clear PageKmemcg
      in __alloc_pages_nodemask()/free_pages_prepare() for any page allocated
      with __GFP_ACCOUNT, including those that aren't actually charged to any
      cgroup, i.e. allocated from the root cgroup context.  To avoid overhead
      in case cgroups are not used, we only do that if memcg_kmem_enabled() is
      true.  The latter is set iff there are kmem-enabled memory cgroups
      (online or offline).  The root cgroup is not considered kmem-enabled.
      
      As a result, if a page is allocated with __GFP_ACCOUNT for the root
      cgroup when there are kmem-enabled memory cgroups and is freed after all
      kmem-enabled memory cgroups were removed, e.g.
      
        # no memory cgroups have been created yet, create one
        mkdir /sys/fs/cgroup/memory/test
        # run something allocating pages with __GFP_ACCOUNT, e.g.
        # a program using pipe
        dmesg | tail
        # remove the memory cgroup
        rmdir /sys/fs/cgroup/memory/test
      
      we'll get a bad page state bug complaining about page->_mapcount != -1:
      
        BUG: Bad page state in process swapper/0  pfn:1fd945c
        page:ffffea007f651700 count:0 mapcount:-511 mapping:          (null) index:0x0
        flags: 0x1000000000000000()
      
      To avoid that, let's mark with PageKmemcg only those pages that are
      actually charged to and hence pin a non-root memory cgroup.
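
      The numbers in the report above can be reproduced with a small user-space
      sketch. It assumes that the dump prints _mapcount + 1 and that the free
      path expects a raw _mapcount of -1; the toy_* names are hypothetical and
      only PAGE_KMEMCG_MAPCOUNT_VALUE (-512) mirrors the value given in the text.

```c
/* Value stored in _mapcount for a PageKmemcg page, as stated above. */
#define PAGE_KMEMCG_MAPCOUNT_VALUE (-512)

/* Hypothetical stand-in for struct page; only the field we need. */
struct toy_page {
	int _mapcount;
};

/* Assumption: the "mapcount:" field in the dump is _mapcount + 1. */
static int toy_reported_mapcount(const struct toy_page *page)
{
	return page->_mapcount + 1;
}

/* Assumption: a page being freed must have _mapcount reset to -1. */
static int toy_is_bad_on_free(const struct toy_page *page)
{
	return page->_mapcount != -1;
}
```

      A page that is still marked when freed has _mapcount == -512, so the
      sketch reports -511 and flags the page as bad, matching the BUG output.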
      
      Fixes: 4949148a ("mm: charge/uncharge kmemcg from generic page allocator paths")
      Reported-and-tested-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c4159a75
    • Steven Rostedt (Red Hat)'s avatar
      tracing: Fix tick_stop tracepoint symbols for user export · c87edb36
      Steven Rostedt (Red Hat) authored
      The symbols used in the tick_stop tracepoint were not being converted
      properly into integers in the tick_stop format file. Instead we had this:
      
      print fmt: "success=%d dependency=%s", REC->success,
          __print_symbolic(REC->dependency, { 0, "NONE" },
           { (1 << TICK_DEP_BIT_POSIX_TIMER), "POSIX_TIMER" },
           { (1 << TICK_DEP_BIT_PERF_EVENTS), "PERF_EVENTS" },
           { (1 << TICK_DEP_BIT_SCHED), "SCHED" },
           { (1 << TICK_DEP_BIT_CLOCK_UNSTABLE), "CLOCK_UNSTABLE" })
      
      User space tools have no idea how to parse "TICK_DEP_BIT_SCHED" or the other
      symbols used to do the bit shifting. The reason is that the conversion was
      done using the TICK_DEP_MASK_* symbols, which are just macros that
      expand to the bit shift itself (with the exception of NONE, which was
      converted properly, because it doesn't use bits, and is defined as zero).
      
      The TICK_DEP_BIT_* symbols need to be denoted by TRACE_DEFINE_ENUM() in order
      to have them properly converted, so that user space tools can parse this event.
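
      As a rough illustration of what a user space parser needs, the sketch
      below builds the symbol table with plain integers and does an exact-match
      lookup, loosely in the spirit of __print_symbolic(). The toy_* names are
      hypothetical, and the enum values 0..3 are an assumption, since the
      commit message does not list them.

```c
#include <stddef.h>

/* Assumed bit numbers; not stated in the commit message. */
enum toy_tick_dep_bits {
	TOY_TICK_DEP_BIT_POSIX_TIMER    = 0,
	TOY_TICK_DEP_BIT_PERF_EVENTS    = 1,
	TOY_TICK_DEP_BIT_SCHED          = 2,
	TOY_TICK_DEP_BIT_CLOCK_UNSTABLE = 3,
};

struct toy_symbol {
	unsigned long value;
	const char *name;
};

/*
 * The compiler folds each shift into an integer here. The format file
 * needs the same folding, because a text parser cannot evaluate
 * "(1 << TICK_DEP_BIT_SCHED)" without knowing the enum.
 */
static const struct toy_symbol toy_tick_dep_symbols[] = {
	{ 0,                                       "NONE" },
	{ 1UL << TOY_TICK_DEP_BIT_POSIX_TIMER,     "POSIX_TIMER" },
	{ 1UL << TOY_TICK_DEP_BIT_PERF_EVENTS,     "PERF_EVENTS" },
	{ 1UL << TOY_TICK_DEP_BIT_SCHED,           "SCHED" },
	{ 1UL << TOY_TICK_DEP_BIT_CLOCK_UNSTABLE,  "CLOCK_UNSTABLE" },
};

/* Exact-match lookup; returns NULL when no entry has this value. */
static const char *toy_print_symbolic(unsigned long value)
{
	size_t i;
	size_t n = sizeof(toy_tick_dep_symbols) / sizeof(toy_tick_dep_symbols[0]);

	for (i = 0; i < n; i++)
		if (toy_tick_dep_symbols[i].value == value)
			return toy_tick_dep_symbols[i].name;
	return NULL;
}
```

      With these assumed values, a dependency of 4 resolves to the "SCHED"
      entry, while a value with no matching entry (for example 3) is not found.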
      
      Cc: stable@vger.kernel.org
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Fixes: e6e6cc22 ("nohz: Use enum code for tick stop failure tracing message")
      Reported-by: Luiz Capitulino <lcapitulino@redhat.com>
      Tested-by: Luiz Capitulino <lcapitulino@redhat.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      c87edb36
    • Boris Brezillon's avatar
      drm/cirrus: Fix NULL pointer dereference when registering the fbdev · 36e9d08b
      Boris Brezillon authored
      cirrus_modeset_init() is initializing/registering the emulated fbdev
      and, since commit c61b93fe ("drm/atomic: Fix remaining places where
      !funcs->best_encoder is valid"), DRM internals can access/test some of
      the fields in mode_config->funcs as part of the fbdev registration
      process.
      Make sure dev->mode_config.funcs is properly set to avoid dereferencing
      a NULL pointer.
      Reported-by: Mike Marshall <hubcap@omnibond.com>
      Reported-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
      Fixes: c61b93fe ("drm/atomic: Fix remaining places where !funcs->best_encoder is valid")
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      36e9d08b
    • Emese Revfy's avatar
      gcc-plugins: Add support for plugin subdirectories · caefd8c9
      Emese Revfy authored
      This adds support for building more complex gcc plugins that live in a
      subdirectory instead of just in a single source file.
      Reported-by: PaX Team <pageexec@freemail.hu>
      Signed-off-by: Emese Revfy <re.emese@gmail.com>
      [kees: clarified commit message]
      Signed-off-by: Kees Cook <keescook@chromium.org>
      caefd8c9
    • Emese Revfy's avatar
      gcc-plugins: Automate make rule generation · 7040c83b
      Emese Revfy authored
      There's no reason to repeat the same names in the Makefile when the .so
      files have already been listed. The .o list can be generated from them.
      Reported-by: PaX Team <pageexec@freemail.hu>
      Signed-off-by: Emese Revfy <re.emese@gmail.com>
      [kees: clarified commit message]
      Signed-off-by: Kees Cook <keescook@chromium.org>
      7040c83b
    • Emese Revfy's avatar
      gcc-plugins: Add support for passing plugin arguments · 65d59ec8
      Emese Revfy authored
      The latent_entropy plugin needs to pass arguments, so this adds the
      support.
      Signed-off-by: Emese Revfy <re.emese@gmail.com>
      Signed-off-by: Kees Cook <keescook@chromium.org>
      65d59ec8
    • Kees Cook's avatar
      gcc-plugins: abort builds cleanly when not supported · ed58c0e9
      Kees Cook authored
      When the compiler doesn't support gcc plugins (either due to missing
      headers or too old a version), report the problem and abort the build
      instead of emitting a warning and letting the build founder with arcane
      compiler errors.
      Signed-off-by: Kees Cook <keescook@chromium.org>
      ed58c0e9
    • Emese Revfy's avatar
      kbuild: no gcc-plugins during cc-option tests · d26e9414
      Emese Revfy authored
      The gcc-plugins arguments should not be included when performing
      cc-option tests.
      
      Steps to reproduce:
      1) make mrproper
      2) make defconfig
      3) enable GCC_PLUGINS, GCC_PLUGIN_CYC_COMPLEXITY
      4) enable FUNCTION_TRACER (it will select other options as well)
      5) make && make modules
      
      Build errors:
      MODPOST 18 modules
      ERROR: "__fentry__" [net/netfilter/xt_nat.ko] undefined!
      ERROR: "__fentry__" [net/netfilter/xt_mark.ko] undefined!
      ERROR: "__fentry__" [net/netfilter/xt_addrtype.ko] undefined!
      ERROR: "__fentry__" [net/netfilter/xt_LOG.ko] undefined!
      ERROR: "__fentry__" [net/netfilter/nf_nat_sip.ko] undefined!
      ERROR: "__fentry__" [net/netfilter/nf_nat_irc.ko] undefined!
      ERROR: "__fentry__" [net/netfilter/nf_nat_ftp.ko] undefined!
      ERROR: "__fentry__" [net/netfilter/nf_nat.ko] undefined!
      Reported-by: Laura Abbott <labbott@redhat.com>
      Signed-off-by: Emese Revfy <re.emese@gmail.com>
      [kees: renamed variable, clarified commit message]
      Signed-off-by: Kees Cook <keescook@chromium.org>
      d26e9414
  4. 08 Aug, 2016 12 commits
    • Mario Kleiner's avatar
      drm/edid: Set 8 bpc color depth for displays with "DFP 1.x compliant TMDS". · 210a021d
      Mario Kleiner authored
      According to E-EDID spec 1.3, table 3.9, a digital video sink with the
      "DFP 1.x compliant TMDS" bit set is "signal compatible with VESA DFP 1.x
      TMDS CRGB, 1 pixel / clock, up to 8 bits / color MSB aligned".
      
      For such displays, the DFP spec 1.0, section 3.10 "EDID support" says:
      
      "If the DFP monitor only supports EDID 1.X (1.1, 1.2, etc.)
       without extensions, the host will make the following assumptions:
      
       1. 24-bit MSB-aligned RGB TFT
       2. DE polarity is active high
       3. H and V syncs are active high
       4. Established CRT timings will be used
       5. Dithering will not be enabled on the host"
      
      So if we don't know the bit depth of the display from additional
      colorimetry info we should assume 8 bpc / 24 bpp by default.
      
      This patch adds the info->bpc = 8 assignment for that case.
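
      The default described above can be sketched as a small decision function.
      This is only an illustration: the TOY_* macros and toy_* functions are
      hypothetical names, and the bit positions (bit 7 for a digital input,
      bit 0 for "DFP 1.x compliant TMDS") are an assumption about the EDID 1.3
      video input definition byte, not taken from the patch itself.

```c
#include <stdint.h>

/* Assumed bit layout of the EDID 1.3 video input definition byte. */
#define TOY_EDID_INPUT_DIGITAL	(1u << 7)
#define TOY_EDID_DFP_1_X	(1u << 0)

/*
 * Returns the bits per color to assume when no other colorimetry
 * information is available, or 0 when nothing can be assumed.
 */
static int toy_default_bpc(uint8_t input)
{
	if ((input & TOY_EDID_INPUT_DIGITAL) && (input & TOY_EDID_DFP_1_X))
		return 8;
	return 0;
}

/* Three color components per pixel: 8 bpc is 24 bpp, 6 bpc is 18 bpp. */
static int toy_bpp(int bpc)
{
	return 3 * bpc;
}
```

      Under these assumptions a digital sink with the DFP 1.x bit set defaults
      to 8 bpc (24 bpp), and any other input byte leaves the depth unknown.
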
      Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
      Cc: Jani Nikula <jani.nikula@intel.com>
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      210a021d
    • Mario Kleiner's avatar
      drm/i915/dp: Revert "drm/i915/dp: fall back to 18 bpp when sink capability is unknown" · 196f954e
      Mario Kleiner authored
      This reverts commit 013dd9e0
      ("drm/i915/dp: fall back to 18 bpp when sink capability is unknown")
      
      This commit introduced a regression into stable kernels,
      as it reduces output color depth to 6 bpc for any video
      sink connected to a DisplayPort connector if that sink
      doesn't report a specific color depth via EDID, or if
      our EDID parser doesn't actually recognize the proper
      bpc from EDID.
      
      Affected are active DisplayPort->VGA converters and
      active DisplayPort->DVI converters. Both should be
      able to handle 8 bpc, but are degraded to 6 bpc with
      this patch.
      
      The reverted commit was meant to fix
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=105331
      
      A followup patch implements a fix for that specific bug,
      which is caused by a faulty EDID of the affected DP panel
      by adding a new EDID quirk for that panel.
      
      DP 18 bpp fallback handling and other improvements to
      DP sink bpc detection will be handled for future
      kernels in a separate series of patches.
      
      Please backport to stable.
      Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
      Acked-by: Jani Nikula <jani.nikula@intel.com>
      Cc: stable@vger.kernel.org
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      196f954e
    • Mario Kleiner's avatar
      drm/edid: Add 6 bpc quirk for display AEO model 0. · e10aec65
      Mario Kleiner authored
      Bugzilla https://bugzilla.kernel.org/show_bug.cgi?id=105331
      reports that the "AEO model 0" display is driven with 8 bpc
      without dithering by default, which looks bad because that
      panel is apparently a 6 bpc DP panel with faulty EDID.
      
      A fix for this was made by commit 013dd9e0
      ("drm/i915/dp: fall back to 18 bpp when sink capability is unknown").
      
      That commit triggers new regressions in precision for DP->DVI and
      DP->VGA displays. A patch is out to revert that commit, but it will
      revert video output for the AEO model 0 panel to 8 bpc without
      dithering.
      
      The EDID 1.3 of that panel, as decoded from the xrandr output
      attached to that bugzilla bug report, is somewhat faulty and, among
      other problems, also sets the "DFP 1.x compliant TMDS" bit, which
      according to the DFP spec means to drive the panel with 8 bpc and
      no dithering in the absence of other colorimetry information.
      
      Try to make the original bug reporter happy despite the
      faulty EDID by adding a quirk to mark that panel as 6 bpc,
      so 6 bpc output with dithering creates a nice picture.
      
      Tested by injecting the edid from the fdo bug into a DP connector
      via drm_kms_helper.edid_firmware and verifying the 6 bpc + dithering
      is selected.
      
      This patch should be backported to stable.
      Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
      Cc: stable@vger.kernel.org
      Cc: Jani Nikula <jani.nikula@intel.com>
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      e10aec65
    • Linus Torvalds's avatar
      Merge tag 'lkdtm-v4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · 81abf252
      Linus Torvalds authored
      Pull lkdtm update from Kees Cook:
       "Fix rebuild problem with LKDTM's rodata test"
      
      [ This, and the usercopy branch, both came in before the merge window
        closed, but ended up in my 'need to look more' queue and thus got
        merged only after rc1 was out ]
      
      * tag 'lkdtm-v4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
        lkdtm: Fix targets for objcopy usage
        lkdtm: fix false positive warning from -Wmaybe-uninitialized
      81abf252
    • Linus Torvalds's avatar
      Merge tag 'usercopy-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · 1eccfa09
      Linus Torvalds authored
      Pull usercopy protection from Kees Cook:
       "This implements HARDENED_USERCOPY verification of copy_to_user and
        copy_from_user bounds checking for most architectures on SLAB and
        SLUB"
      
      * tag 'usercopy-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
        mm: SLUB hardened usercopy support
        mm: SLAB hardened usercopy support
        s390/uaccess: Enable hardened usercopy
        sparc/uaccess: Enable hardened usercopy
        powerpc/uaccess: Enable hardened usercopy
        ia64/uaccess: Enable hardened usercopy
        arm64/uaccess: Enable hardened usercopy
        ARM: uaccess: Enable hardened usercopy
        x86/uaccess: Enable hardened usercopy
        mm: Hardened usercopy
        mm: Implement stack frame object validation
        mm: Add is_migrate_cma_page
      1eccfa09
    • Linus Torvalds's avatar
      unsafe_[get|put]_user: change interface to use a error target label · 1bd4403d
      Linus Torvalds authored
      When I initially added the unsafe_[get|put]_user() helpers in commit
      5b24a7a2 ("Add 'unsafe' user access functions for batched
      accesses"), I made the mistake of modeling the interface on our
      traditional __[get|put]_user() functions, which return zero on success,
      or -EFAULT on failure.
      
      That interface is fairly easy to use, but it's actually fairly nasty for
      good code generation, since it essentially forces the caller to check
      the error value for each access.
      
      In particular, since the error handling is already internally
      implemented with an exception handler, and we already use "asm goto" for
      various other things, we could fairly easily make the error cases just
      jump directly to an error label instead, and avoid the need for explicit
      checking after each operation.
      
      So switch the interface to pass in an error label, rather than checking
      the error value in the caller.  Best do it now before we start growing
      more users (the signal handling code in particular would be a good place
      to use the new interface).
      
      So rather than
      
      	if (unsafe_get_user(x, ptr))
      		... handle error ..
      
      the interface is now
      
      	unsafe_get_user(x, ptr, label);
      
      where an error during the user mode fetch will now just cause a jump to
      'label' in the caller.
      
      Right now the actual _implementation_ of this all still ends up being a
      "if (err) goto label", and does not take advantage of any exception
      label tricks, but for "unsafe_put_user()" in particular it should be
      fairly straightforward to convert to using the exception table model.
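
      The two calling conventions can be mimicked in user space to show the
      difference for the caller. This is a minimal sketch, not the kernel
      macros: toy_fetch(), toy_get_user_old(), toy_get_user_new() and the
      sum3_* helpers are hypothetical, a NULL pointer stands in for a faulting
      user address, and the new-style macro uses the plain
      "if (err) goto label" form mentioned above.

```c
#include <errno.h>
#include <stddef.h>

/* Toy stand-in for a user-space access: a NULL pointer "faults". */
static int toy_fetch(int *dst, const int *src)
{
	if (src == NULL)
		return -EFAULT;
	*dst = *src;
	return 0;
}

/* Old style: evaluates to 0 or -EFAULT, so every call must be tested. */
#define toy_get_user_old(x, ptr)	toy_fetch(&(x), (ptr))

/* New style: jumps to 'label' on failure, nothing to test afterwards. */
#define toy_get_user_new(x, ptr, label)		\
	do {					\
		if (toy_fetch(&(x), (ptr)))	\
			goto label;		\
	} while (0)

static int sum3_old(const int *a, const int *b, const int *c, int *out)
{
	int x, y, z;

	if (toy_get_user_old(x, a))
		return -EFAULT;
	if (toy_get_user_old(y, b))
		return -EFAULT;
	if (toy_get_user_old(z, c))
		return -EFAULT;
	*out = x + y + z;
	return 0;
}

static int sum3_new(const int *a, const int *b, const int *c, int *out)
{
	int x, y, z;

	/* A batch of accesses sharing one error path. */
	toy_get_user_new(x, a, efault);
	toy_get_user_new(y, b, efault);
	toy_get_user_new(z, c, efault);
	*out = x + y + z;
	return 0;

efault:
	return -EFAULT;
}
```

      Both helpers behave the same; the label version simply keeps the error
      handling out of the straight-line path, which is what would let a later
      exception-table implementation drop the per-access check.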
      
      Note that "unsafe_get_user()" is much harder to convert to a clever
      exception table model, because current versions of gcc do not allow the
      use of "asm goto" (for the exception) with output values (for the actual
      value to be fetched).  But that is hopefully not a limitation in the
      long term.
      
      [ Also note that it might be a good idea to switch unsafe_get_user() to
        actually _return_ the value it fetches from user space, but this
        commit only changes the error handling semantics ]
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      1bd4403d
    • Andreas Ziegler's avatar
      printk: Remove unnecessary #ifdef CONFIG_PRINTK · 574673c2
      Andreas Ziegler authored
      In commit 874f9c7d ("printk: create pr_<level> functions"), new
      pr_level defines were added to printk.c.
      
      These new defines are guarded by an #ifdef CONFIG_PRINTK. However,
      there is already a surrounding #ifdef CONFIG_PRINTK that starts much
      earlier, at line 249, which means the newly introduced #ifdef is
      unnecessary.
      
      Let's remove it to avoid confusion.
      Signed-off-by: Andreas Ziegler <andreas.ziegler@fau.de>
      Cc: Joe Perches <joe@perches.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      574673c2
    • Pali Rohár's avatar
      dell-wmi: Ignore WMI event 0xe00e · 65a97a67
      Pali Rohár authored
      WMI event 0xe00e is received when the battery is removed or inserted.
      Signed-off-by: Pali Rohár <pali.rohar@gmail.com>
      Signed-off-by: Darren Hart <dvhart@linux.intel.com>
      65a97a67
    • Ville Syrjälä's avatar
      x86/hweight: Don't clobber %rdi · 65ea11ec
      Ville Syrjälä authored
      The caller expects %rdi to remain intact; push+pop it to make that happen.
      
      Fixes the following kind of explosions on my core2duo machine when
      trying to reboot or shut down:
      
        general protection fault: 0000 [#1] PREEMPT SMP
        Modules linked in: i915 i2c_algo_bit drm_kms_helper cfbfillrect syscopyarea cfbimgblt sysfillrect sysimgblt fb_sys_fops cfbcopyarea drm netconsole configfs binfmt_misc iTCO_wdt psmouse pcspkr snd_hda_codec_idt e100 coretemp hwmon snd_hda_codec_generic i2c_i801 mii i2c_smbus lpc_ich mfd_core snd_hda_intel uhci_hcd snd_hda_codec snd_hwdep snd_hda_core ehci_pci 8250 ehci_hcd snd_pcm 8250_base usbcore evdev serial_core usb_common parport_pc parport snd_timer snd soundcore
        CPU: 0 PID: 3070 Comm: reboot Not tainted 4.8.0-rc1-perf-dirty #69
        Hardware name:                  /D946GZIS, BIOS TS94610J.86A.0087.2007.1107.1049 11/07/2007
        task: ffff88012a0b4080 task.stack: ffff880123850000
        RIP: 0010:[<ffffffff81003c92>]  [<ffffffff81003c92>] x86_perf_event_update+0x52/0xc0
        RSP: 0018:ffff880123853b60  EFLAGS: 00010087
        RAX: 0000000000000001 RBX: ffff88012fc0a3c0 RCX: 000000000000001e
        RDX: 0000000000000000 RSI: 0000000040000000 RDI: ffff88012b014800
        RBP: ffff880123853b88 R08: ffffffffffffffff R09: 0000000000000000
        R10: ffffea0004a012c0 R11: ffffea0004acedc0 R12: ffffffff80000001
        R13: ffff88012b0149c0 R14: ffff88012b014800 R15: 0000000000000018
        FS:  00007f8b155cd700(0000) GS:ffff88012fc00000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 00007f8b155f5000 CR3: 000000012a2d7000 CR4: 00000000000006f0
        Stack:
         ffff88012fc0a3c0 ffff88012b014800 0000000000000004 0000000000000001
         ffff88012fc1b750 ffff880123853bb0 ffffffff81003d59 ffff88012b014800
         ffff88012fc0a3c0 ffff88012b014800 ffff880123853bd8 ffffffff81003e13
        Call Trace:
         [<ffffffff81003d59>] x86_pmu_stop+0x59/0xd0
         [<ffffffff81003e13>] x86_pmu_del+0x43/0x140
         [<ffffffff8111705d>] event_sched_out.isra.105+0xbd/0x260
         [<ffffffff8111738d>] __perf_remove_from_context+0x2d/0xb0
         [<ffffffff8111745d>] __perf_event_exit_context+0x4d/0x70
         [<ffffffff810c8826>] generic_exec_single+0xb6/0x140
         [<ffffffff81117410>] ? __perf_remove_from_context+0xb0/0xb0
         [<ffffffff81117410>] ? __perf_remove_from_context+0xb0/0xb0
         [<ffffffff810c898f>] smp_call_function_single+0xdf/0x140
         [<ffffffff81113d27>] perf_event_exit_cpu_context+0x87/0xc0
         [<ffffffff81113d73>] perf_reboot+0x13/0x40
         [<ffffffff8107578a>] notifier_call_chain+0x4a/0x70
         [<ffffffff81075ad7>] __blocking_notifier_call_chain+0x47/0x60
         [<ffffffff81075b06>] blocking_notifier_call_chain+0x16/0x20
         [<ffffffff81076a1d>] kernel_restart_prepare+0x1d/0x40
         [<ffffffff81076ae2>] kernel_restart+0x12/0x60
         [<ffffffff81076d56>] SYSC_reboot+0xf6/0x1b0
         [<ffffffff811a823c>] ? mntput_no_expire+0x2c/0x1b0
         [<ffffffff811a83e4>] ? mntput+0x24/0x40
         [<ffffffff811894fc>] ? __fput+0x16c/0x1e0
         [<ffffffff811895ae>] ? ____fput+0xe/0x10
         [<ffffffff81072fc3>] ? task_work_run+0x83/0xa0
         [<ffffffff81001623>] ? exit_to_usermode_loop+0x53/0xc0
         [<ffffffff8100105a>] ? trace_hardirqs_on_thunk+0x1a/0x1c
         [<ffffffff81076e6e>] SyS_reboot+0xe/0x10
         [<ffffffff814c4ba5>] entry_SYSCALL_64_fastpath+0x18/0xa3
        Code: 7c 4c 8d af c0 01 00 00 49 89 fe eb 10 48 09 c2 4c 89 e0 49 0f b1 55 00 4c 39 e0 74 35 4d 8b a6 c0 01 00 00 41 8b 8e 60 01 00 00 <0f> 33 8b 35 6e 02 8c 00 48 c1 e2 20 85 f6 7e d2 48 89 d3 89 cf
        RIP  [<ffffffff81003c92>] x86_perf_event_update+0x52/0xc0
         RSP <ffff880123853b60>
        ---[ end trace 7ec95181faf211be ]---
        note: reboot[3070] exited with preempt_count 2
      
      Cc: Borislav Petkov <bp@suse.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Fixes: f5967101 ("x86/hweight: Get rid of the special calling convention")
      Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      65ea11ec
    • Dave Airlie's avatar
      Merge branch 'drm-next-4.8' of git://people.freedesktop.org/~agd5f/linux into drm-next · 4872850a
      Dave Airlie authored
      A few fixes for amdgpu and ttm for 4.8
      - fix a ttm regression caused by the new pipelining code
      - fixes for mullins on amdgpu
      - updated golden settings for amdgpu
      
      * 'drm-next-4.8' of git://people.freedesktop.org/~agd5f/linux:
        drm/ttm: Wait for a BO to become idle before unbinding it from GTT
        drm/amdgpu: update golden setting of polaris10
        drm/amdgpu: update golden setting of stoney
        drm/amdgpu: update golden setting of polaris11
        drm/amdgpu: update golden setting of carrizo
        drm/amdgpu: update golden setting of iceland
        drm/amd/amdgpu: change pptable output format from ASCII to binary
        drm/amdgpu/ci: add mullins to default case for smc ucode
        drm/amdgpu/gmc7: add missing mullins case
      4872850a
    • Dave Airlie's avatar
      Merge tag 'drm-intel-next-fixes-2016-08-05' of... · e8285cec
      Dave Airlie authored
      Merge tag 'drm-intel-next-fixes-2016-08-05' of git://anongit.freedesktop.org/drm-intel into drm-next
      
      3 intel fixes.
      
      * tag 'drm-intel-next-fixes-2016-08-05' of git://anongit.freedesktop.org/drm-intel:
        drm/i915/fbdev: Check for the framebuffer before use
        drm/i915: Never fully mask the the EI up rps interrupt on SNB/IVB
        drm/i915: Wait up to 3ms for the pcu to ack the cdclk change request on SKL
      e8285cec
    • Daniel Vetter's avatar
      drm: Paper over locking inversion after registration rework · 5c6c201c
      Daniel Vetter authored
      drm_connector_register_all requires a few too many locks because our
      connector_list locking is busted. Add another FIXME+hack to work
      around this. This should address the below lockdep splat:
      
      ======================================================
      [ INFO: possible circular locking dependency detected ]
      4.7.0-rc5+ #524 Tainted: G           O
      -------------------------------------------------------
      kworker/u8:0/6 is trying to acquire lock:
       (&dev->mode_config.mutex){+.+.+.}, at: [<ffffffff815afde0>] drm_modeset_lock_all+0x40/0x120
      
      but task is already holding lock:
       ((fb_notifier_list).rwsem){++++.+}, at: [<ffffffff810ac195>] __blocking_notifier_call_chain+0x35/0x70
      
      which lock already depends on the new lock.
      
      the existing dependency chain (in reverse order) is:
      
      -> #1 ((fb_notifier_list).rwsem){++++.+}:
             [<ffffffff810df611>] lock_acquire+0xb1/0x200
             [<ffffffff819a55b4>] down_write+0x44/0x80
             [<ffffffff810abf91>] blocking_notifier_chain_register+0x21/0xb0
             [<ffffffff814c7448>] fb_register_client+0x18/0x20
             [<ffffffff814c6c86>] backlight_device_register+0x136/0x260
             [<ffffffffa0127eb2>] intel_backlight_device_register+0xa2/0x160 [i915]
             [<ffffffffa00f46be>] intel_connector_register+0xe/0x10 [i915]
             [<ffffffffa0112bfb>] intel_dp_connector_register+0x1b/0x80 [i915]
             [<ffffffff8159dfea>] drm_connector_register+0x4a/0x80
             [<ffffffff8159fe44>] drm_connector_register_all+0x64/0xf0
             [<ffffffff815a2a64>] drm_modeset_register_all+0x174/0x1c0
             [<ffffffff81599b72>] drm_dev_register+0xc2/0xd0
             [<ffffffffa00621d7>] i915_driver_load+0x1547/0x2200 [i915]
             [<ffffffffa006d80f>] i915_pci_probe+0x4f/0x70 [i915]
             [<ffffffff814a2135>] local_pci_probe+0x45/0xa0
             [<ffffffff814a349b>] pci_device_probe+0xdb/0x130
             [<ffffffff815c07e3>] driver_probe_device+0x223/0x440
             [<ffffffff815c0ad5>] __driver_attach+0xd5/0x100
             [<ffffffff815be386>] bus_for_each_dev+0x66/0xa0
             [<ffffffff815c002e>] driver_attach+0x1e/0x20
             [<ffffffff815bf9be>] bus_add_driver+0x1ee/0x280
             [<ffffffff815c1810>] driver_register+0x60/0xe0
             [<ffffffff814a1a10>] __pci_register_driver+0x60/0x70
             [<ffffffffa01a905b>] i915_init+0x5b/0x62 [i915]
             [<ffffffff8100042d>] do_one_initcall+0x3d/0x150
             [<ffffffff811a935b>] do_init_module+0x5f/0x1d9
             [<ffffffff81124416>] load_module+0x20e6/0x27e0
             [<ffffffff81124d63>] SYSC_finit_module+0xc3/0xf0
             [<ffffffff81124dae>] SyS_finit_module+0xe/0x10
             [<ffffffff819a83a9>] entry_SYSCALL_64_fastpath+0x1c/0xac
      
      -> #0 (&dev->mode_config.mutex){+.+.+.}:
             [<ffffffff810df0ac>] __lock_acquire+0x10fc/0x1260
             [<ffffffff810df611>] lock_acquire+0xb1/0x200
             [<ffffffff819a3097>] mutex_lock_nested+0x67/0x3c0
             [<ffffffff815afde0>] drm_modeset_lock_all+0x40/0x120
             [<ffffffff8158f79b>] drm_fb_helper_restore_fbdev_mode_unlocked+0x2b/0x80
             [<ffffffff8158f81d>] drm_fb_helper_set_par+0x2d/0x50
             [<ffffffffa0105f7a>] intel_fbdev_set_par+0x1a/0x60 [i915]
             [<ffffffff814c13c6>] fbcon_init+0x586/0x610
             [<ffffffff8154d16a>] visual_init+0xca/0x130
             [<ffffffff8154e611>] do_bind_con_driver+0x1c1/0x3a0
             [<ffffffff8154eaf6>] do_take_over_console+0x116/0x180
             [<ffffffff814bd3a7>] do_fbcon_takeover+0x57/0xb0
             [<ffffffff814c1e48>] fbcon_event_notify+0x658/0x750
             [<ffffffff810abcae>] notifier_call_chain+0x3e/0xb0
             [<ffffffff810ac1ad>] __blocking_notifier_call_chain+0x4d/0x70
             [<ffffffff810ac1e6>] blocking_notifier_call_chain+0x16/0x20
             [<ffffffff814c748b>] fb_notifier_call_chain+0x1b/0x20
             [<ffffffff814c86b1>] register_framebuffer+0x251/0x330
             [<ffffffff8158fa9f>] drm_fb_helper_initial_config+0x25f/0x3f0
             [<ffffffffa0106b48>] intel_fbdev_initial_config+0x18/0x30 [i915]
             [<ffffffff810adfd8>] async_run_entry_fn+0x48/0x150
             [<ffffffff810a3947>] process_one_work+0x1e7/0x750
             [<ffffffff810a3efb>] worker_thread+0x4b/0x4f0
             [<ffffffff810aad4f>] kthread+0xef/0x110
             [<ffffffff819a85ef>] ret_from_fork+0x1f/0x40
      
      other info that might help us debug this:
      
       Possible unsafe locking scenario:
      
             CPU0                    CPU1
             ----                    ----
        lock((fb_notifier_list).rwsem);
                                     lock(&dev->mode_config.mutex);
                                     lock((fb_notifier_list).rwsem);
        lock(&dev->mode_config.mutex);
      
       *** DEADLOCK ***
      
      6 locks held by kworker/u8:0/6:
       #0:  ("events_unbound"){.+.+.+}, at: [<ffffffff810a38c9>] process_one_work+0x169/0x750
       #1:  ((&entry->work)){+.+.+.}, at: [<ffffffff810a38c9>] process_one_work+0x169/0x750
       #2:  (registration_lock){+.+.+.}, at: [<ffffffff814c8487>] register_framebuffer+0x27/0x330
       #3:  (console_lock){+.+.+.}, at: [<ffffffff814c86ce>] register_framebuffer+0x26e/0x330
       #4:  (&fb_info->lock){+.+.+.}, at: [<ffffffff814c78dd>] lock_fb_info+0x1d/0x40
       #5:  ((fb_notifier_list).rwsem){++++.+}, at: [<ffffffff810ac195>] __blocking_notifier_call_chain+0x35/0x70
      
      stack backtrace:
      CPU: 2 PID: 6 Comm: kworker/u8:0 Tainted: G           O    4.7.0-rc5+ #524
      Hardware name: Intel Corp. Broxton P/NOTEBOOK, BIOS APLKRVPA.X64.0138.B33.1606250842 06/25/2016
      Workqueue: events_unbound async_run_entry_fn
       0000000000000000 ffff8800758577f0 ffffffff814507a5 ffffffff828b9900
       ffffffff828b9900 ffff880075857830 ffffffff810dc6fa ffff880075857880
       ffff88007584d688 0000000000000005 0000000000000006 ffff88007584d6b0
      Call Trace:
       [<ffffffff814507a5>] dump_stack+0x67/0x92
       [<ffffffff810dc6fa>] print_circular_bug+0x1aa/0x200
       [<ffffffff810df0ac>] __lock_acquire+0x10fc/0x1260
       [<ffffffff810df611>] lock_acquire+0xb1/0x200
       [<ffffffff815afde0>] ? drm_modeset_lock_all+0x40/0x120
       [<ffffffff815afde0>] ? drm_modeset_lock_all+0x40/0x120
       [<ffffffff819a3097>] mutex_lock_nested+0x67/0x3c0
       [<ffffffff815afde0>] ? drm_modeset_lock_all+0x40/0x120
       [<ffffffff810fa85f>] ? rcu_read_lock_sched_held+0x7f/0x90
       [<ffffffff81208218>] ? kmem_cache_alloc_trace+0x248/0x2b0
       [<ffffffff815afdc5>] ? drm_modeset_lock_all+0x25/0x120
       [<ffffffff815afde0>] drm_modeset_lock_all+0x40/0x120
       [<ffffffff8158f79b>] drm_fb_helper_restore_fbdev_mode_unlocked+0x2b/0x80
       [<ffffffff8158f81d>] drm_fb_helper_set_par+0x2d/0x50
       [<ffffffffa0105f7a>] intel_fbdev_set_par+0x1a/0x60 [i915]
       [<ffffffff814c13c6>] fbcon_init+0x586/0x610
       [<ffffffff8154d16a>] visual_init+0xca/0x130
       [<ffffffff8154e611>] do_bind_con_driver+0x1c1/0x3a0
       [<ffffffff8154eaf6>] do_take_over_console+0x116/0x180
       [<ffffffff814bd3a7>] do_fbcon_takeover+0x57/0xb0
       [<ffffffff814c1e48>] fbcon_event_notify+0x658/0x750
       [<ffffffff810abcae>] notifier_call_chain+0x3e/0xb0
       [<ffffffff810ac1ad>] __blocking_notifier_call_chain+0x4d/0x70
       [<ffffffff810ac1e6>] blocking_notifier_call_chain+0x16/0x20
       [<ffffffff814c748b>] fb_notifier_call_chain+0x1b/0x20
       [<ffffffff814c86b1>] register_framebuffer+0x251/0x330
       [<ffffffff815b7e8d>] ? vga_switcheroo_client_fb_set+0x5d/0x70
       [<ffffffff8158fa9f>] drm_fb_helper_initial_config+0x25f/0x3f0
       [<ffffffffa0106b48>] intel_fbdev_initial_config+0x18/0x30 [i915]
       [<ffffffff810adfd8>] async_run_entry_fn+0x48/0x150
       [<ffffffff810a3947>] process_one_work+0x1e7/0x750
       [<ffffffff810a38c9>] ? process_one_work+0x169/0x750
       [<ffffffff810a3efb>] worker_thread+0x4b/0x4f0
       [<ffffffff810a3eb0>] ? process_one_work+0x750/0x750
       [<ffffffff810aad4f>] kthread+0xef/0x110
       [<ffffffff819a85ef>] ret_from_fork+0x1f/0x40
       [<ffffffff810aac60>] ? kthread_stop+0x2e0/0x2e0
      
      v2: Rebase onto the right branch (hand-editing patches ftw) and add more
      reporters.
      Reported-by: Imre Deak <imre.deak@intel.com>
      Cc: Imre Deak <imre.deak@intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reported-by: Jiri Kosina <jikos@kernel.org>
      Cc: Jiri Kosina <jikos@kernel.org>
      Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      5c6c201c