1. 14 Jun, 2018 31 commits
    • hexagon: fix printk format warning in setup.c · 2738f359
      Randy Dunlap authored
      Fix printk format warning in hexagon/kernel/setup.c:
      
      ../arch/hexagon/kernel/setup.c: In function 'setup_arch':
      ../arch/hexagon/kernel/setup.c:69:2: warning: format '%x' expects argument of type 'unsigned int', but argument 2 has type 'long unsigned int' [-Wformat]
      
      where:
      extern unsigned long	__phys_offset;
      #define PHYS_OFFSET	__phys_offset
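
      The warning above can be reproduced and avoided outside the kernel with plain snprintf(). The sketch below assumes the fix is to add the `l` length modifier so the conversion matches `unsigned long`; `phys_offset` and `format_phys_offset()` are hypothetical stand-ins, not kernel names.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's __phys_offset (an unsigned long). */
static unsigned long phys_offset = 0x80000000UL;

/*
 * Formats the offset with %lx, the conversion that matches unsigned long.
 * Using %x here would trigger the -Wformat warning quoted in the commit.
 */
static int format_phys_offset(char *buf, size_t len)
{
	return snprintf(buf, len, "PHYS_OFFSET=0x%08lx", phys_offset);
}
```

      Calling format_phys_offset() with a 64-byte buffer writes the string "PHYS_OFFSET=0x80000000".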
      
      Link: http://lkml.kernel.org/r/adce8db5-4b01-dc10-7fbb-6a64e0787eb5@infradead.org
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: fix oom_kill event handling · fe6bdfc8
      Roman Gushchin authored
      Commit e27be240 ("mm: memcg: make sure memory.events is uptodate
      when waking pollers") converted most of memcg event counters to
      per-memcg atomics, which made them less confusing for a user.  The
      "oom_kill" counter remained untouched, so now it behaves differently
      than other counters (including "oom").  This adds nothing but confusion.
      
      Let's fix this by adding the MEMCG_OOM_KILL event, and follow the
      MEMCG_OOM approach.
      
      This also removes a hack from count_memcg_event_mm(), introduced earlier
      specially for the OOM_KILL counter.
      
      [akpm@linux-foundation.org: fix for droppage of memcg-replace-mm-owner-with-mm-memcg.patch]
      Link: http://lkml.kernel.org/r/20180508124637.29984-1-guro@fb.com
      Signed-off-by: Roman Gushchin <guro@fb.com>
      Acked-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • treewide: use PHYS_ADDR_MAX to avoid type casting ULLONG_MAX · d7dc899a
      Stefan Agner authored
      With PHYS_ADDR_MAX there is now a type safe variant for all bits set.
      Make use of it.
      
      Patch created using a semantic patch as follows:
      
      // <smpl>
      @@
      typedef phys_addr_t;
      @@
      -(phys_addr_t)ULLONG_MAX
      +PHYS_ADDR_MAX
      // </smpl>
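
      A minimal sketch of what the semantic patch rewrites, assuming a 32-bit phys_addr_t and assuming PHYS_ADDR_MAX is defined as "all bits set" of that type. The typedef, the macro definition, and the helper names below are illustrative assumptions, not copied from the tree.

```c
#include <limits.h>
#include <stdint.h>

/* Assumption: a 32-bit physical address type, as on some 32-bit configs. */
typedef uint32_t phys_addr_t;

/* Assumed definition: every bit of phys_addr_t set, with no cast from a wider type. */
#define PHYS_ADDR_MAX (~(phys_addr_t)0)

/* Old style: truncating cast of a 64-bit constant. */
static int is_unbounded_old(phys_addr_t limit)
{
	return limit == (phys_addr_t)ULLONG_MAX;
}

/* New style: the type-safe constant. */
static int is_unbounded_new(phys_addr_t limit)
{
	return limit == PHYS_ADDR_MAX;
}
```

      Both helpers agree for every input; the second simply avoids relying on the truncating cast.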
      
      Link: http://lkml.kernel.org/r/20180419214204.19322-1-stefan@agner.ch
      Signed-off-by: Stefan Agner <stefan@agner.ch>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>	[arm64]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: use octal not symbolic permissions · 0825a6f9
      Joe Perches authored
      mm/*.c files use symbolic and octal styles for permissions.
      
      Using octal and not symbolic permissions is preferred by many as more
      readable.
      
      https://lkml.org/lkml/2016/8/2/1945
      
      Prefer the direct use of octal for permissions.
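
      To illustrate the two styles, here is a userspace sketch using the POSIX <sys/stat.h> bits. DEMO_S_IRUGO is a hypothetical local macro that mirrors the kernel's combined read-for-all symbol; it is defined here only for the comparison.

```c
#include <sys/stat.h>

/* Hypothetical userspace analogue of a combined "read for user, group, other" symbol. */
#define DEMO_S_IRUGO (S_IRUSR | S_IRGRP | S_IROTH)

/* Symbolic style: the reader has to expand three macros to see the mode. */
static unsigned int symbolic_mode(void)
{
	return DEMO_S_IRUGO | S_IWUSR;
}

/* Octal style: rw-r--r-- is visible at a glance. */
static unsigned int octal_mode(void)
{
	return 0644;
}
```

      Both functions return the same value; only the readability differs.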
      
      Done using
      $ scripts/checkpatch.pl -f --types=SYMBOLIC_PERMS --fix-inplace mm/*.c
      and some typing.
      
      Before:	 $ git grep -P -w "0[0-7]{3,3}" mm | wc -l
      44
      After:	 $ git grep -P -w "0[0-7]{3,3}" mm | wc -l
      86
      
      Miscellanea:
      
      o Whitespace neatening around these conversions.
      
      Link: http://lkml.kernel.org/r/2e032ef111eebcd4c5952bae86763b541d373469.1522102887.git.joe@perches.com
      Signed-off-by: Joe Perches <joe@perches.com>
      Acked-by: David Rientjes <rientjes@google.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ipc: use new return type vm_fault_t · 14f28f57
      Souptick Joarder authored
      Use new return type vm_fault_t for fault handler.  For now, this is just
      documenting that the function returns a VM_FAULT value rather than an
      errno.  Once all instances are converted, vm_fault_t will become a
      distinct type.
      
      Commit 1c8f4220 ("mm: change return type to vm_fault_t")
      
      Link: http://lkml.kernel.org/r/20180425043413.GA21467@jordon-HP-15-Notebook-PC
      Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
      Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Davidlohr Bueso <dbueso@suse.de>
      Cc: Manfred Spraul <manfred@colorfullife.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • sysvipc/sem: mitigate semnum index against spectre v1 · ec67aaa4
      Davidlohr Bueso authored
      Both smatch and coverity are reporting potential issues with spectre
      variant 1 with the 'semnum' index within the sma->sems array, ie:
      
        ipc/sem.c:388 sem_lock() warn: potential spectre issue 'sma->sems'
        ipc/sem.c:641 perform_atomic_semop_slow() warn: potential spectre issue 'sma->sems'
        ipc/sem.c:721 perform_atomic_semop() warn: potential spectre issue 'sma->sems'
      
      Avoid any possible speculation by using array_index_nospec() thus
      ensuring the semnum value is bounded to [0, sma->sem_nsems).  With the
      exception of sem_lock() all of these are slowpaths.
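
      The bounding behaviour described above can be sketched in userspace with a branch-free mask. This only illustrates the arithmetic of clamping an index to [0, size); it is modeled on a generic mask-based approach, assumes an arithmetic right shift of negative values and size <= LONG_MAX, and gives no speculation guarantees by itself. The function names are hypothetical.

```c
#include <limits.h>
#include <stddef.h>

/*
 * Returns ~0UL when index < size and 0 otherwise, without a branch.
 * Assumes arithmetic right shift of negative longs (true for GCC and Clang).
 */
static unsigned long index_mask_nospec(unsigned long index, unsigned long size)
{
	return ~(long)(index | (size - 1UL - index)) >> (sizeof(long) * CHAR_BIT - 1);
}

/* In-range indexes pass through unchanged; out-of-range indexes collapse to 0. */
static unsigned long clamp_index_nospec(unsigned long index, unsigned long size)
{
	return index & index_mask_nospec(index, size);
}
```

      With size 4, indexes 0 to 3 are returned unchanged, while 4 and above become 0, so a dependent array load can never use an out-of-bounds value.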
      
      Link: http://lkml.kernel.org/r/20180423171131.njs4rfm2yzyeg6do@linux-n805
      Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: "Gustavo A. R. Silva" <gustavo@embeddedor.com>
      Cc: Manfred Spraul <manfred@colorfullife.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Mikulas Patocka's avatar
    • arm: port KCOV to arm · 75851720
      Dmitry Vyukov authored
      KCOV is a code coverage collection facility used, in particular, by
      the syzkaller system call fuzzer.  There is some interest in using
      syzkaller on arm devices.  So port KCOV to arm.
      
      At the implementation level this merely declares that KCOV is supported
      and disables instrumentation of 3 special cases.  The reasons for
      disabling are commented in the code.
      
      Tested with qemu-system-arm/vexpress-a15.
      
      Link: http://lkml.kernel.org/r/20180511143248.112484-1-dvyukov@google.com
      Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
      Acked-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Abbott Liu <liuwenliang@huawei.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Koguchi Takuo <takuo.koguchi.sw@hitachi.com>
      Cc: <syzkaller@googlegroups.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • sched/core / kcov: avoid kcov_area during task switch · 0ed557aa
      Mark Rutland authored
      During a context switch, we first switch_mm() to the next task's mm,
      then switch_to() that new task.  This means that vmalloc'd regions which
      had previously been faulted in can transiently disappear in the context
      of the prev task.
      
      Functions instrumented by KCOV may try to access a vmalloc'd kcov_area
      during this window, and as the fault handling code is instrumented, this
      results in a recursive fault.
      
      We must avoid accessing any kcov_area during this window.  We can do so
      with a new flag in kcov_mode, set prior to switching the mm, and cleared
      once the new task is live.  Since task_struct::kcov_mode isn't always a
      specific enum kcov_mode value, this is made an unsigned int.
      
      The manipulation is hidden behind kcov_{prepare,finish}_switch() helpers,
      which are empty for !CONFIG_KCOV kernels.
      
      The code uses macros because I can't use static inline functions without a
      circular include dependency between <linux/sched.h> and <linux/kcov.h>,
      since the definition of task_struct uses things defined in <linux/kcov.h>
      
      Link: http://lkml.kernel.org/r/20180504135535.53744-4-mark.rutland@arm.com
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • kcov: prefault the kcov_area · dc55daff
      Mark Rutland authored
      On many architectures the vmalloc area is lazily faulted in upon first
      access.  This is problematic for KCOV, as __sanitizer_cov_trace_pc
      accesses the (vmalloc'd) kcov_area, and fault handling code may be
      instrumented.  If an access to kcov_area faults, this will result in
      mutual recursion through the fault handling code and
      __sanitizer_cov_trace_pc(), eventually leading to stack corruption
      and/or overflow.
      
      We can avoid this by faulting in the kcov_area before
      __sanitizer_cov_trace_pc() is permitted to access it.  Once it has been
      faulted in, it will remain present in the process page tables, and will
      not fault again.
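
      The prefaulting idea can be sketched in userspace as one read per page of the buffer. This is a rough illustration under assumptions, not the kernel helper: demo_fault_in_area() and its parameters are hypothetical, and the volatile read merely stands in for a read that the compiler may not elide.

```c
#include <stddef.h>

/*
 * Touches one word in every page of the area so that each page is faulted in
 * before any instrumented code writes to it. Returns the number of pages touched.
 * Assumes page_size is a multiple of sizeof(unsigned long).
 */
static size_t demo_fault_in_area(const unsigned long *area, size_t nwords,
				 size_t page_size)
{
	size_t stride = page_size / sizeof(unsigned long);
	size_t touched = 0;

	for (size_t off = 0; off < nwords; off += stride) {
		/* Volatile read: forces the access without changing the data. */
		(void)*(const volatile unsigned long *)&area[off];
		touched++;
	}
	return touched;
}
```

      The loop only reads, so it is safe to run on an area that already holds data.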
      
      [akpm@linux-foundation.org: code cleanup]
      [akpm@linux-foundation.org: add comment explaining kcov_fault_in_area()]
      [akpm@linux-foundation.org: fancier code comment from Mark]
      Link: http://lkml.kernel.org/r/20180504135535.53744-3-mark.rutland@arm.com
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • kcov: ensure irq code sees a valid area · c9484b98
      Mark Rutland authored
      Patch series "kcov: fix unexpected faults".
      
      These patches fix a few issues where KCOV code could trigger recursive
      faults, discovered while debugging a patch enabling KCOV for arch/arm:
      
      * On CONFIG_PREEMPT kernels, there's a small race window where
        __sanitizer_cov_trace_pc() can see a bogus kcov_area.
      
      * Lazy faulting of the vmalloc area can cause mutual recursion between
        fault handling code and __sanitizer_cov_trace_pc().
      
      * During the context switch, switching the mm can cause the kcov_area to
        be transiently unmapped.
      
      These are prerequisites for enabling KCOV on arm, but the issues
      themselves are generic -- we just happen to avoid them by chance rather
      than design on x86-64 and arm64.
      
      This patch (of 3):
      
      For kernels built with CONFIG_PREEMPT, some C code may execute before or
      after the interrupt handler, while the hardirq count is zero.  In these
      cases, in_task() can return true.
      
      A task can be interrupted in the middle of a KCOV_DISABLE ioctl while it
      resets the task's kcov data via kcov_task_init().  Instrumented code
      executed during this period will call __sanitizer_cov_trace_pc(), and as
      in_task() returns true, will inspect t->kcov_mode before trying to write
      to t->kcov_area.
      
      In kcov_task_init() we update t->kcov_{mode,area,size} with plain stores,
      which may be re-ordered, torn, etc.  Thus __sanitizer_cov_trace_pc() may
      see bogus values for any of these fields, and may attempt to write to
      memory which is not mapped.
      
      Let's avoid this by using WRITE_ONCE() to set t->kcov_mode, with a
      barrier() to ensure this is ordered before we clear t->kcov_{area,size}.
      This ensures that any code executed while kcov_task_init() is preempted
      will either see valid values for t->kcov_{area,size}, or will see that
      t->kcov_mode is KCOV_MODE_DISABLED, and bail out without touching
      t->kcov_area.
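
      The ordering described above can be sketched in userspace as follows. DEMO_WRITE_ONCE() and demo_barrier() are hypothetical stand-ins (GCC/Clang syntax) for the kernel's WRITE_ONCE() and barrier(), and struct demo_task is a simplified stand-in for the relevant task_struct fields; this illustrates the store ordering only, under those assumptions.

```c
#include <stddef.h>

/* Hypothetical userspace stand-ins for WRITE_ONCE() and barrier(). */
#define DEMO_WRITE_ONCE(var, val) (*(volatile __typeof__(var) *)&(var) = (val))
#define demo_barrier() __asm__ __volatile__("" ::: "memory")

enum { DEMO_MODE_DISABLED = 0, DEMO_MODE_TRACE_PC = 1 };

struct demo_task {
	unsigned int kcov_mode;
	unsigned int kcov_size;
	void *kcov_area;
};

/* Disable the mode first, then clear the area and size. */
static void demo_task_init(struct demo_task *t)
{
	DEMO_WRITE_ONCE(t->kcov_mode, DEMO_MODE_DISABLED);
	demo_barrier();	/* keep the compiler from moving the stores below above this point */
	t->kcov_size = 0;
	t->kcov_area = NULL;
}

/* Reader side: check the mode before touching the area. */
static int demo_may_trace(const struct demo_task *t)
{
	unsigned int mode = *(const volatile unsigned int *)&t->kcov_mode;

	demo_barrier();
	return mode != DEMO_MODE_DISABLED;
}
```

      A reader that runs between the two halves of demo_task_init() either sees the old, still valid area or sees the disabled mode and bails out.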
      
      Link: http://lkml.kernel.org/r/20180504135535.53744-2-mark.rutland@arm.com
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • kernel/relay.c: change return type to vm_fault_t · 3fb3894b
      Souptick Joarder authored
      Use new return type vm_fault_t for fault handler.  For now, this is just
      documenting that the function returns a VM_FAULT value rather than an
      errno.  Once all instances are converted, vm_fault_t will become a
      distinct type.
      
      commit 1c8f4220 ("mm: change return type to vm_fault_t")
      
      Link: http://lkml.kernel.org/r/20180510140335.GA25363@jordon-HP-15-Notebook-PC
      Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
      Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • exofs: avoid VLA in structures · 20fe9353
      Kees Cook authored
      On the quest to remove all VLAs from the kernel[1] this adjusts several
      cases where allocation is made after an array of structures that points
      back into the allocation.  The allocations are changed to perform
      explicit calculations instead of using a Variable Length Array in a
      structure.
      
      Additionally, this lets Clang compile this code now, since Clang does
      not support VLAIS[2].
      
      [1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com
      [2] https://lkml.kernel.org/r/CA+55aFy6h1c3_rP_bXFedsTXzwW+9Q9MfJaW7GUmMBrAp-fJ9A@mail.gmail.com
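
      As a rough illustration of "explicit calculations instead of a VLA in a structure", the sketch below allocates a header and a trailing pointer array in one block and points the header back into that block. The struct and function names are hypothetical and are not the exofs ones.

```c
#include <stdlib.h>

struct demo_dev {
	int id;
};

/* Header whose 'devs' member points at storage placed right after it. */
struct demo_comps {
	unsigned int numdevs;
	struct demo_dev **devs;
};

/*
 * One allocation holds the header followed by numdevs pointers.
 * The size is computed explicitly rather than declaring a struct
 * that contains a variable length array.
 */
static struct demo_comps *demo_alloc(unsigned int numdevs)
{
	size_t size = sizeof(struct demo_comps) +
		      (size_t)numdevs * sizeof(struct demo_dev *);
	struct demo_comps *c = calloc(1, size);

	if (!c)
		return NULL;
	c->numdevs = numdevs;
	c->devs = (struct demo_dev **)(c + 1);	/* points back into the same block */
	return c;
}
```

      A single free() on the returned pointer releases both the header and the array.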
      
      [keescook@chromium.org: v2]
        Link: http://lkml.kernel.org/r/20180418163546.GA45794@beast
      Link: http://lkml.kernel.org/r/20180327203904.GA1151@beast
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Boaz Harrosh <ooo@electrozaur.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • coredump: fix spam with zero VMA process · 86a2bb5a
      Alexey Dobriyan authored
      Nobody ever tried to self-destruct by unmapping the whole address space
      at once:
      
      	munmap((void *)0, (1ULL << 47) - 4096);
      
      Doing this produces 2 warnings for zero-length vmalloc allocations:
      
        a.out[1353]: segfault at 7f80bcc4b757 ip 00007f80bcc4b757 sp 00007fff683939b8 error 14
        a.out: vmalloc: allocation failure: 0 bytes, mode:0xcc0(GFP_KERNEL), nodemask=(null)
      	...
        a.out: vmalloc: allocation failure: 0 bytes, mode:0xcc0(GFP_KERNEL), nodemask=(null)
      	...
      
      Fix is to switch to kvmalloc().
      
      Steps to reproduce:
      
      	// vsyscall=none
      	#include <sys/mman.h>
      	#include <sys/resource.h>
      	int main(void)
      	{
      		setrlimit(RLIMIT_CORE, &(struct rlimit){RLIM_INFINITY, RLIM_INFINITY});
      		munmap((void *)0, (1ULL << 47) - 4096);
      		return 0;
      	}
      
      Link: http://lkml.kernel.org/r/20180410180353.GA2515@avx2
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • fat: use fat_fs_error() instead of BUG_ON() in __fat_get_block() · c2574aaa
      OGAWA Hirofumi authored
      If the file size and the FAT cluster chain do not match (corrupted
      image), we can hit BUG_ON(!phys) in __fat_get_block().
      
      So, use fat_fs_error() instead.
      
      [hirofumi@mail.parknet.co.jp: fix printk warning]
        Link: http://lkml.kernel.org/r/87po12aq5p.fsf@mail.parknet.co.jp
      Link: http://lkml.kernel.org/r/874lilcu67.fsf@mail.parknet.co.jp
      Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
      Reported-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com>
      Tested-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • proc: skip branch in /proc/*/* lookup · 26b95137
      Alexey Dobriyan authored
      Code is structured like this:
      
      	for ( ... p < last; p++) {
      		if (memcmp == 0)
      			break;
      	}
      	if (p >= last)
      		ERROR
      	OK
      
      gcc doesn't see that if the lookup succeeds then the post-loop branch
      will never be taken, so restructure the code to skip it.
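
      The two shapes can be compared with a small userspace sketch. The table type, the helper names, and the -1 error value are hypothetical; the point is only that the second form handles success inside the loop, so no post-loop range check is needed on the success path.

```c
#include <stddef.h>
#include <string.h>

struct demo_entry {
	const char *name;
	int value;
};

/* Original shape: break out of the loop, then re-test the loop condition. */
static int lookup_with_post_check(const struct demo_entry *tbl, size_t n,
				  const char *name)
{
	const struct demo_entry *p, *last = tbl + n;

	for (p = tbl; p < last; p++) {
		if (strcmp(p->name, name) == 0)
			break;
	}
	if (p >= last)
		return -1;	/* ERROR */
	return p->value;	/* OK */
}

/* Restructured shape: the success path leaves directly from inside the loop. */
static int lookup_direct(const struct demo_entry *tbl, size_t n,
			 const char *name)
{
	const struct demo_entry *p, *last = tbl + n;

	for (p = tbl; p < last; p++) {
		if (strcmp(p->name, name) == 0)
			return p->value;	/* OK */
	}
	return -1;	/* ERROR */
}
```

      Both functions return the same result for every input; the second has one conditional branch fewer on a successful lookup.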
      
      [akpm@linux-foundation.org: proc_pident_instantiate() no longer takes an inode*]
      Link: http://lkml.kernel.org/r/20180423213954.GD9043@avx2
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mremap: remove LATENCY_LIMIT from mremap to reduce the number of TLB shootdowns · 37a4094e
      Mel Gorman authored
      Commit 5d190420 ("mremap: fix race between mremap() and page
      cleanning") fixed races between mremap and other operations for both
      file-backed and anonymous mappings.  The file-backed was the most
      critical as it allowed the possibility that data could be changed on a
      physical page after page_mkclean returned which could trigger data loss
      or data integrity issues.
      
      A customer reported that the cost of the TLB flushes for anonymous
      mappings was excessive, resulting in a 30-50% drop in overall
      performance on a microbenchmark since this commit.  Unfortunately I
      neither have access to the test-case nor can I describe what it does
      other than saying that mremap operations dominate heavily.
      
      This patch removes the LATENCY_LIMIT to handle TLB flushes on a PMD
      boundary instead of every 64 pages to reduce the number of TLB
      shootdowns by a factor of 8 in the ideal case.  LATENCY_LIMIT was almost
      certainly used originally to limit the PTL hold times but the latency
      savings are likely offset by the cost of IPIs in many cases.  This patch
      is not reported to completely restore performance but gets it within an
      acceptable percentage.  The given metric here is simply described as
      "higher is better".
      
      Baseline that was known good
      002:  Metric:       91.05
      004:  Metric:      109.45
      008:  Metric:       73.08
      016:  Metric:       58.14
      032:  Metric:       61.09
      064:  Metric:       57.76
      128:  Metric:       55.43
      
      Current
      001:  Metric:       54.98
      002:  Metric:       56.56
      004:  Metric:       41.22
      008:  Metric:       35.96
      016:  Metric:       36.45
      032:  Metric:       35.71
      064:  Metric:       35.73
      128:  Metric:       34.96
      
      With patch
      001:  Metric:       61.43
      002:  Metric:       81.64
      004:  Metric:       67.92
      008:  Metric:       51.67
      016:  Metric:       50.47
      032:  Metric:       52.29
      064:  Metric:       50.01
      128:  Metric:       49.04
      
      So for low thread counts performance is not restored, but for larger
      numbers of threads it's closer to the "known good" baseline.
      
      Using a different mremap-intensive workload that is not representative
      of the real workload, there is little difference observed outside of
      noise in the headline metrics.  However, the TLB shootdowns are reduced
      by 11% on average and at the peak, TLB shootdowns were reduced by 21%.
      Interrupts were sampled every second while the workload ran to get those
      figures.  It's known that the figures will vary as the
      non-representative load is non-deterministic.
      
      An alternative patch was posted that should have significantly reduced
      the TLB flushes but unfortunately it does not perform as well as this
      version on the customer test case.  If revisited, the two patches can
      stack on top of each other.
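
      The "factor of 8 in the ideal case" quoted earlier follows from simple arithmetic, sketched below. The 4 KiB page size and 2 MiB PMD size are assumptions (typical for x86-64); the 64-page limit is the figure given in the text, and the macro and function names are hypothetical.

```c
/* Assumed sizes: 4 KiB pages and 2 MiB PMD entries. */
#define DEMO_PAGE_SIZE		4096UL
#define DEMO_PMD_SIZE		(2UL << 20)
/* The old limit described in the commit: a flush every 64 pages. */
#define DEMO_LATENCY_LIMIT	(64 * DEMO_PAGE_SIZE)

/* Number of flushes needed to move 'len' bytes when flushing once per 'step' bytes. */
static unsigned long flushes_for(unsigned long len, unsigned long step)
{
	return (len + step - 1) / step;
}
```

      Moving 1 GiB takes 4096 flushes with a 64-page step and 512 with a PMD-sized step, a ratio of 8.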
      
      Link: http://lkml.kernel.org/r/20180606183803.k7qaw2xnbvzshv34@techsingularity.net
      Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Nadav Amit <nadav.amit@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Aaron Lu <aaron.lu@intel.com>
      Cc: Hugh Dickins <hughd@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/memblock: add missing include <linux/bootmem.h> · 69b5086b
      Mathieu Malaterre authored
      Commit 26f09e9b ("mm/memblock: add memblock memory allocation apis")
      introduced two new function definitions:
      
        memblock_virt_alloc_try_nid_nopanic()
        memblock_virt_alloc_try_nid()
      
      Commit ea1f5f37 ("mm: define memblock_virt_alloc_try_nid_raw")
      introduced the following function definition:
      
        memblock_virt_alloc_try_nid_raw()
      
      This commit adds an include of the header file <linux/bootmem.h> to
      provide the missing function prototypes.  This silences the following
      gcc warnings (W=1):
      
        mm/memblock.c:1334:15: warning: no previous prototype for `memblock_virt_alloc_try_nid_raw' [-Wmissing-prototypes]
        mm/memblock.c:1371:15: warning: no previous prototype for `memblock_virt_alloc_try_nid_nopanic' [-Wmissing-prototypes]
        mm/memblock.c:1407:15: warning: no previous prototype for `memblock_virt_alloc_try_nid' [-Wmissing-prototypes]
      
      Link: http://lkml.kernel.org/r/20180606194144.16990-1-malat@debian.org
      Signed-off-by: Mathieu Malaterre <malat@debian.org>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: check for SIGKILL inside dup_mmap() loop · 655c79bb
      Tetsuo Handa authored
      As a theoretical problem, dup_mmap() of an mm_struct with 60000+ vmas
      can loop while potentially allocating memory, with mm->mmap_sem held for
      write by current thread.  This is bad if current thread was selected as
      an OOM victim, for current thread will continue allocations using memory
      reserves while OOM reaper is unable to reclaim memory.
      
      As an actually observable problem, it is not difficult to make OOM
      reaper unable to reclaim memory if the OOM victim is blocked at
      i_mmap_lock_write() in this loop.  Unfortunately, since nobody can
      explain whether it is safe to use killable wait there, let's check for
      SIGKILL before trying to allocate memory.  Even without an OOM event,
      there is no point in continuing the loop from the beginning if the
      current thread has been killed.
      
      I tested with debug printk().  This patch should be safe because we
      already fail if security_vm_enough_memory_mm() or
      kmem_cache_alloc(GFP_KERNEL) fails and exit_mmap() handles it.
      
         ***** Aborting dup_mmap() due to SIGKILL *****
         ***** Aborting dup_mmap() due to SIGKILL *****
         ***** Aborting dup_mmap() due to SIGKILL *****
         ***** Aborting dup_mmap() due to SIGKILL *****
         ***** Aborting exit_mmap() due to NULL mmap *****
      
      [akpm@linux-foundation.org: add comment]
      Link: http://lkml.kernel.org/r/201804071938.CDE04681.SOFVQJFtMHOOLF@I-love.SAKURA.ne.jp
      Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • kexec: yield to scheduler when loading kimage segments · a8311f64
      Jarrett Farnitano authored
      Without yielding while loading kimage segments, a large initrd will
      block all other work on the CPU performing the load until it is
      completed.  For example loading an initrd of 200MB on a low power single
      core system will lock up the system for a few seconds.
      
      To increase system responsiveness to other tasks at that time, call
      cond_resched() in both the crash kernel and normal kernel segment
      loading loops.
      
      I did run into a practical problem.  Hardware watchdogs on embedded
      systems can have short timers on the order of seconds.  If the system is
      locked up for a few seconds with only a single core available, the
      watchdog may not be pet in a timely fashion.  If this happens, the
      hardware watchdog will fire and reset the system.
      
      This really only becomes a problem when you are working with a single
      core, a decently sized initrd, and have a constrained hardware watchdog.
      
      Link: http://lkml.kernel.org/r/1528738546-3328-1-git-send-email-jmf@amazon.com
      Signed-off-by: Jarrett Farnitano <jmf@amazon.com>
      Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: fix race between kmem_cache destroy, create and deactivate · 92ee383f
      Shakeel Butt authored
      The memcg kmem cache creation and deactivation (SLUB only) is
      asynchronous.  If a root kmem cache is destroyed whose memcg cache is in
      the process of creation or deactivation, the kernel may crash.
      
      Example of one such crash:
      	general protection fault: 0000 [#1] SMP PTI
      	CPU: 1 PID: 1721 Comm: kworker/14:1 Not tainted 4.17.0-smp
      	...
      	Workqueue: memcg_kmem_cache kmemcg_deactivate_workfn
      	RIP: 0010:has_cpu_slab
      	...
      	Call Trace:
      	? on_each_cpu_cond
      	__kmem_cache_shrink
      	kmemcg_cache_deact_after_rcu
      	kmemcg_deactivate_workfn
      	process_one_work
      	worker_thread
      	kthread
      	ret_from_fork+0x35/0x40
      
      To fix this race, on root kmem cache destruction, mark the cache as
      dying and flush the workqueue used for memcg kmem cache creation and
      deactivation.  SLUB's memcg kmem cache deactivation also includes an RCU
      callback, so make sure all previously registered RCU callbacks have
      completed as well.
      
      [shakeelb@google.com: handle the RCU callbacks for SLUB deactivation]
        Link: http://lkml.kernel.org/r/20180611192951.195727-1-shakeelb@google.com
      [shakeelb@google.com: add more documentation, rename fields for readability]
        Link: http://lkml.kernel.org/r/20180522201336.196994-1-shakeelb@google.com
      [akpm@linux-foundation.org: fix build, per Shakeel]
      [shakeelb@google.com: v3.  Instead of refcount, flush the workqueue]
        Link: http://lkml.kernel.org/r/20180530001204.183758-1-shakeelb@google.com
      Link: http://lkml.kernel.org/r/20180521174116.171846-1-shakeelb@google.com
      Signed-off-by: Shakeel Butt <shakeelb@google.com>
      Acked-by: Vladimir Davydov <vdavydov.dev@gmail.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: fix devmem_is_allowed() for sub-page System RAM intersections · 2bdce744
      Dan Williams authored
      Hussam reports:
      
          I was poking around and for no real reason, I did cat /dev/mem and
          strings /dev/mem.  Then I saw the following warning in dmesg. I saved it
          and rebooted immediately.
      
           memremap attempted on mixed range 0x000000000009c000 size: 0x1000
           ------------[ cut here ]------------
           WARNING: CPU: 0 PID: 11810 at kernel/memremap.c:98 memremap+0x104/0x170
           [..]
           Call Trace:
            xlate_dev_mem_ptr+0x25/0x40
            read_mem+0x89/0x1a0
            __vfs_read+0x36/0x170
      
      The memremap() implementation checks for attempts to remap System RAM
      with MEMREMAP_WB and instead redirects those mapping attempts to the
      linear map.  However, that only works if the physical address range
      being remapped is page aligned.  In low memory we have situations like
      the following:
      
          00000000-00000fff : Reserved
          00001000-0009fbff : System RAM
          0009fc00-0009ffff : Reserved
      
      ...where System RAM intersects Reserved ranges at sub-page
      granularity.
      
      Given that devmem_is_allowed() special cases any attempt to map System
      RAM in the first 1MB of memory, replace page_is_ram() with the more
      precise region_intersects() to trap attempts to map disallowed ranges.
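
      A minimal userspace sketch of why a whole-page answer is too coarse
      for the low-memory map quoted above.  Everything here is a hypothetical
      stand-in (struct ram_range, enum overlap, classify_range()); it is not
      the kernel's page_is_ram() or region_intersects(), only an illustration
      of a byte-accurate range check that can report a "mixed" result:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one "System RAM" entry of the resource map. */
struct ram_range {
	uint64_t start;	/* first byte, inclusive */
	uint64_t end;	/* last byte, inclusive */
};

/* Illustrative result codes: no RAM, all RAM, or a mix of both. */
enum overlap { OVERLAP_NONE, OVERLAP_FULL, OVERLAP_MIXED };

/* The layout from the commit message: only this range is System RAM. */
static const struct ram_range low_ram[] = {
	{ 0x00001000, 0x0009fbff },
};

/* Classify [start, start + size) against the RAM ranges, byte by byte. */
enum overlap classify_range(uint64_t start, uint64_t size)
{
	uint64_t last, covered = 0;
	size_t i;

	assert(size > 0);
	last = start + size - 1;

	for (i = 0; i < sizeof(low_ram) / sizeof(low_ram[0]); i++) {
		uint64_t lo = start > low_ram[i].start ? start : low_ram[i].start;
		uint64_t hi = last < low_ram[i].end ? last : low_ram[i].end;

		/* Count only the bytes of the request that fall inside RAM. */
		if (lo <= hi)
			covered += hi - lo + 1;
	}

	if (covered == 0)
		return OVERLAP_NONE;
	return covered == size ? OVERLAP_FULL : OVERLAP_MIXED;
}
```

      With this map, the page starting at 0x9f000 is classified as mixed,
      because its RAM portion ends at 0x9fbff and the rest is Reserved; a
      yes/no answer per page cannot express that case.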
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=199999
      Link: http://lkml.kernel.org/r/152856436164.18127.2847888121707136898.stgit@dwillia2-desk3.amr.corp.intel.com
      Fixes: 92281dee ("arch: introduce memremap()")
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Reported-by: Hussam Al-Tayeb <me@hussam.eu.org>
      Tested-by: Hussam Al-Tayeb <me@hussam.eu.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      2bdce744
    • Daniel Jordan's avatar
      mm/swapfile.c: fix swap_count comment about nonexistent SWAP_HAS_CONT · 955c97f0
      Daniel Jordan authored
      Commit 570a335b ("swap_info: swap count continuations") introduces
      COUNT_CONTINUED but refers to it incorrectly as SWAP_HAS_CONT in a
      comment in swap_count.  Fix it.
      
      Link: http://lkml.kernel.org/r/20180612175919.30413-1-daniel.m.jordan@oracle.com
      Fixes: 570a335b ("swap_info: swap count continuations")
      Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: "Huang, Ying" <ying.huang@intel.com>
      Cc: Hugh Dickins <hughd@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      955c97f0
    • Roman Gushchin's avatar
      mm: fix null pointer dereference in mem_cgroup_protected · df2a4196
      Roman Gushchin authored
      Shakeel reported a crash in mem_cgroup_protected(), which can be triggered
      by memcg reclaim if the legacy cgroup v1 use_hierarchy=0 mode is used:
      
        BUG: unable to handle kernel NULL pointer dereference at 0000000000000120
        PGD 8000001ff55da067 P4D 8000001ff55da067 PUD 1fdc7df067 PMD 0
        Oops: 0000 [#4] SMP PTI
        CPU: 0 PID: 15581 Comm: bash Tainted: G      D 4.17.0-smp-clean #5
        Hardware name: ...
        RIP: 0010:mem_cgroup_protected+0x54/0x130
        Code: 4c 8b 8e 00 01 00 00 4c 8b 86 08 01 00 00 48 8d 8a 08 ff ff ff 48 85 d2 ba 00 00 00 00 48 0f 44 ca 48 39 c8 0f 84 cf 00 00 00 <48> 8b 81 20 01 00 00 4d 89 ca 4c 39 c8 4c 0f 46 d0 4d 85 d2 74 05
        RSP: 0000:ffffabe64dfafa58 EFLAGS: 00010286
        RAX: ffff9fb6ff03d000 RBX: ffff9fb6f5b1b000 RCX: 0000000000000000
        RDX: 0000000000000000 RSI: ffff9fb6f5b1b000 RDI: ffff9fb6f5b1b000
        RBP: ffffabe64dfafb08 R08: 0000000000000000 R09: 0000000000000000
        R10: 0000000000000000 R11: 000000000000c800 R12: ffffabe64dfafb88
        R13: ffff9fb6f5b1b000 R14: ffffabe64dfafb88 R15: ffff9fb77fffe000
        FS:  00007fed1f8ac700(0000) GS:ffff9fb6ff400000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 0000000000000120 CR3: 0000001fdcf86003 CR4: 00000000001606f0
        Call Trace:
         ? shrink_node+0x194/0x510
         do_try_to_free_pages+0xfd/0x390
         try_to_free_mem_cgroup_pages+0x123/0x210
         try_charge+0x19e/0x700
         mem_cgroup_try_charge+0x10b/0x1a0
         wp_page_copy+0x134/0x5b0
         do_wp_page+0x90/0x460
         __handle_mm_fault+0x8e3/0xf30
         handle_mm_fault+0xfe/0x220
         __do_page_fault+0x262/0x500
         do_page_fault+0x28/0xd0
         ? page_fault+0x8/0x30
         page_fault+0x1e/0x30
        RIP: 0033:0x485b72
      
      The problem happens because parent_mem_cgroup() returns a NULL pointer,
      which is dereferenced later without a check.
      
      As cgroup v1 has no memory guarantee support, let's make
      mem_cgroup_protected() immediately return MEMCG_PROT_NONE, if the given
      cgroup has no parent (non-hierarchical mode is used).
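
      The shape of that guard can be sketched as follows.  This is a
      simplified userspace illustration, not the kernel function: struct
      memcg_sketch, the SKETCH_PROT_* values and the "clamp by the parent's
      minimum" rule are all assumptions made up for the example.  The only
      point it shows is that the NULL check has to come before the first
      read through the parent pointer:

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative result codes, loosely modelled on MEMCG_PROT_NONE etc. */
enum prot_sketch { SKETCH_PROT_NONE, SKETCH_PROT_MIN };

/* Hypothetical, heavily reduced view of a memory cgroup. */
struct memcg_sketch {
	struct memcg_sketch *parent;	/* NULL when the group has no parent */
	unsigned long min;		/* configured guarantee, in pages */
	unsigned long usage;		/* current usage, in pages */
};

enum prot_sketch protection_for(const struct memcg_sketch *memcg)
{
	const struct memcg_sketch *parent;
	unsigned long effective_min;

	assert(memcg != NULL);
	parent = memcg->parent;

	/* The guard described above: no parent means no protection. */
	if (!parent)
		return SKETCH_PROT_NONE;

	/* Only past this point is it safe to read parent->... fields. */
	effective_min = memcg->min < parent->min ? memcg->min : parent->min;

	if (memcg->usage != 0 && memcg->usage <= effective_min)
		return SKETCH_PROT_MIN;
	return SKETCH_PROT_NONE;
}
```

      Without the early return, the read of parent->min would be the
      analogue of the faulting access in the oops above.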
      
      Link: http://lkml.kernel.org/r/20180611175418.7007-2-guro@fb.com
      Fixes: bf8d5d52 ("memcg: introduce memory.min")
      Signed-off-by: Roman Gushchin <guro@fb.com>
      Reported-by: Shakeel Butt <shakeelb@google.com>
      Tested-by: Shakeel Butt <shakeelb@google.com>
      Tested-by: John Stultz <john.stultz@linaro.org>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: Michal Hocko <mhocko@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      df2a4196
    • Jia He's avatar
      mm/ksm.c: ignore STABLE_FLAG of rmap_item->address in rmap_walk_ksm() · 1105a2fc
      Jia He authored
      On our armv8a server (QDF2400), I noticed many WARN_ONs caused by
      rmap_item->address not being PAGE_SIZE aligned during memory pressure
      tests (start 20 guests and run memhog on the host).
      
        WARNING: CPU: 4 PID: 4641 at virt/kvm/arm/mmu.c:1826 kvm_age_hva_handler+0xc0/0xc8
        CPU: 4 PID: 4641 Comm: memhog Tainted: G        W 4.17.0-rc3+ #8
        Call trace:
         kvm_age_hva_handler+0xc0/0xc8
         handle_hva_to_gpa+0xa8/0xe0
         kvm_age_hva+0x4c/0xe8
         kvm_mmu_notifier_clear_flush_young+0x54/0x98
         __mmu_notifier_clear_flush_young+0x6c/0xa0
         page_referenced_one+0x154/0x1d8
         rmap_walk_ksm+0x12c/0x1d0
         rmap_walk+0x94/0xa0
         page_referenced+0x194/0x1b0
         shrink_page_list+0x674/0xc28
         shrink_inactive_list+0x26c/0x5b8
         shrink_node_memcg+0x35c/0x620
         shrink_node+0x100/0x430
         do_try_to_free_pages+0xe0/0x3a8
         try_to_free_pages+0xe4/0x230
         __alloc_pages_nodemask+0x564/0xdc0
         alloc_pages_vma+0x90/0x228
         do_anonymous_page+0xc8/0x4d0
         __handle_mm_fault+0x4a0/0x508
         handle_mm_fault+0xf8/0x1b0
         do_page_fault+0x218/0x4b8
         do_translation_fault+0x90/0xa0
         do_mem_abort+0x68/0xf0
         el0_da+0x24/0x28
      
      In rmap_walk_ksm(), rmap_item->address might still carry the
      STABLE_FLAG, so the start and end seen in handle_hva_to_gpa() might
      not be PAGE_SIZE aligned.  This causes exceptions in
      handle_hva_to_gpa() on arm64.
      
      This patch fixes it by ignoring (not removing) the low bits of address
      when doing rmap_walk_ksm.
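
      "Ignoring, not removing" can be sketched as computing a masked local
      copy while leaving the stored value alone.  The names below
      (struct rmap_item_sketch, walk_address(), the SKETCH_* constants) are
      hypothetical, and 0x200 for the flag is an assumed value used only for
      illustration; this is not the code from mm/ksm.c:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE   UINT64_C(0x1000)
#define SKETCH_PAGE_MASK   (~(SKETCH_PAGE_SIZE - 1))
#define SKETCH_STABLE_FLAG UINT64_C(0x200)	/* assumed flag bit below PAGE_SIZE */

/* Hypothetical item: a page-aligned address with flags kept in the low bits. */
struct rmap_item_sketch {
	uint64_t address;
};

/* Return the address to walk with; the stored field keeps its flag bits. */
uint64_t walk_address(const struct rmap_item_sketch *item)
{
	assert(item != NULL);
	return item->address & SKETCH_PAGE_MASK;
}
```

      Callers that need the flags can still read item->address directly;
      only the value handed to the walk is page aligned.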
      
      IMO, it should be backported to the stable tree.  The storm of WARN_ONs
      is very easy for me to reproduce.  Beyond that, I saw a panic (not
      reproducible) as follows:
      
        page:ffff7fe003742d80 count:-4871 mapcount:-2126053375 mapping: (null) index:0x0
        flags: 0x1fffc00000000000()
        raw: 1fffc00000000000 0000000000000000 0000000000000000 ffffecf981470000
        raw: dead000000000100 dead000000000200 ffff8017c001c000 0000000000000000
        page dumped because: nonzero _refcount
        CPU: 29 PID: 18323 Comm: qemu-kvm Tainted: G W 4.14.15-5.hxt.aarch64 #1
        Hardware name: <snip for confidential issues>
        Call trace:
          dump_backtrace+0x0/0x22c
          show_stack+0x24/0x2c
          dump_stack+0x8c/0xb0
          bad_page+0xf4/0x154
          free_pages_check_bad+0x90/0x9c
          free_pcppages_bulk+0x464/0x518
          free_hot_cold_page+0x22c/0x300
          __put_page+0x54/0x60
          unmap_stage2_range+0x170/0x2b4
          kvm_unmap_hva_handler+0x30/0x40
          handle_hva_to_gpa+0xb0/0xec
          kvm_unmap_hva_range+0x5c/0xd0
      
      I even injected a fault on purpose in kvm_unmap_hva_range by setting
      size=size-0x200; the resulting call trace is similar to the one above.
      So I think the panic shares the root cause of the WARN_ON.
      
      Andrea said:
      
      : It looks a straightforward safe fix, on x86 hva_to_gfn_memslot would
      : zap those bits and hide the misalignment caused by the low metadata
      : bits being erroneously left set in the address, but the arm code
      : notices when that's the last page in the memslot and the hva_end is
      : getting aligned and the size is below one page.
      :
      : I think the problem triggers in the addr += PAGE_SIZE of
      : unmap_stage2_ptes that never matches end because end is aligned but
      : addr is not.
      :
      : 	} while (pte++, addr += PAGE_SIZE, addr != end);
      :
      : x86 again only works on hva_start/hva_end after converting it to
      : gfn_start/end and that being in pfn units the bits are zapped before
      : they risk to cause trouble.
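
      The loop condition quoted above can be modelled in a few lines.  This
      is a bounded userspace sketch, not the arm64 stage-2 code: count_steps()
      and the step cap are hypothetical, added so the misaligned case can be
      observed without looping forever:

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE UINT64_C(0x1000)

/*
 * Walk from addr towards end in page-sized steps, stopping only when
 * addr == end, as in the quoted do/while.  Returns the number of steps,
 * or -1 if max_steps is reached without addr ever equalling end.
 */
long count_steps(uint64_t addr, uint64_t end, long max_steps)
{
	long steps = 0;

	assert(max_steps > 0);

	do {
		if (steps == max_steps)
			return -1;	/* stepped past end without matching it */
		steps++;
		addr += SKETCH_PAGE_SIZE;
	} while (addr != end);

	return steps;
}
```

      With an aligned start, count_steps(0x1000, 0x3000, 16) finishes after
      two steps.  With a start of 0x1200 and the same end, addr takes the
      values 0x2200, 0x3200, ... and never equals the aligned end, so the
      sketch gives up at the cap.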
      
      Jia He said:
      
      : I've tested this myself on an arm64 server (QDF2400, 46 cpus, 96G mem).
      : Without this patch, the WARN_ON is very easy to reproduce.  After this
      : patch, I have run the same benchmark for a whole day without any WARN_ONs.
      
      Link: http://lkml.kernel.org/r/1525403506-6750-1-git-send-email-hejianet@gmail.com
      Signed-off-by: Jia He <jia.he@hxt-semitech.com>
      Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
      Tested-by: Jia He <hejianet@gmail.com>
      Cc: Suzuki K Poulose <Suzuki.Poulose@arm.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
      Cc: Arvind Yadav <arvind.yadav.cs@gmail.com>
      Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      1105a2fc
    • Linus Torvalds's avatar
      Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 2837461d
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "This is a set of minor (and safe) changes that didn't make the initial
        pull request plus some bug fixes"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: qla2xxx: Mask off Scope bits in retry delay
        scsi: qla2xxx: Fix crash on qla2x00_mailbox_command
        scsi: aic7xxx: aic79xx: fix potential null pointer dereference on ahd
        scsi: mpt3sas: Add an I/O barrier
        scsi: qla2xxx: Fix setting lower transfer speed if GPSC fails
        scsi: hpsa: disable device during shutdown
        scsi: sd_zbc: Fix sd_zbc_check_zone_size() error path
        scsi: aacraid: remove bogus GFP_DMA32 specifies
      2837461d
    • Linus Torvalds's avatar
      Merge tag 'platform-drivers-x86-v4.18-1' of git://git.infradead.org/linux-platform-drivers-x86 · f3b5020e
      Linus Torvalds authored
      Pull x86 platform driver updates from Darren Hart:
       "Several incremental improvements including new keycodes, new models,
        new quirks, and related documentation. Adds LED platform driver
        activation for Mellanox systems. Some minor optimizations and
        cleanups. Includes several bug fixes, message silencing, mostly minor
      
        Automated summary:
      
        acer-wmi:
         -  add another KEY_POWER keycode
      
        apple-gmux:
         -  fix gmux_get_client_id()'s return type
      
        asus-laptop:
         -  Simplify getting .drvdata
      
        asus-wireless:
         -  Fix format specifier
      
        dell-laptop:
         -  Fix keyboard backlight timeout on XPS 13 9370
      
        dell-smbios:
         -  Match on www.dell.com in OEM strings too
      
        dell-wmi:
         -  Ignore new rfkill and fn-lock events
         -  Set correct keycode for Fn + left arrow
      
        fujitsu-laptop:
         -  Simplify soft key handling
      
        ideapad-laptop:
         -  Add E42-80 to no_hw_rfkill
         -  Add fn-lock setting
         -  Add MIIX 720-12IKB to no_hw_rfkill
      
        lib/string_helpers:
         -  Add missed declaration of struct task_struct
      
        intel_scu_ipc:
         -  Replace mdelay with usleep_range in intel_scu_ipc_i2c_cntrl
      
        mlx-platform:
         -  Add LED platform driver activation
      
        platform/mellanox:
         -  Add new ODM system types to mlx-platform
         -  mlxreg-hotplug: add extra cycle for hotplug work queue
         -  mlxreg-hotplug: Document fixes for hotplug private data
      
        platform_data/mlxreg:
         -  Document fixes for hotplug device
      
        silead_dmi:
         -  Add entry for Chuwi Hi8 tablet touchscreen
         -  Add touchscreen info for the Onda V891w tablet
         -  Add info for the PoV mobii TAB-P800W (v2.0)
         -  Add touchscreen info for the Jumper EZpad 6 Pro
      
        thinkpad_acpi:
         -  silence false-positive-prone pr_warn
         -  do not report thermal sensor state for tablet mode switch
         -  silence HKEY 0x6032, 0x60f0, 0x6030"
      
      * tag 'platform-drivers-x86-v4.18-1' of git://git.infradead.org/linux-platform-drivers-x86: (30 commits)
        platform/x86: silead_dmi: Add entry for Chuwi Hi8 tablet touchscreen
        platform/x86: dell-laptop: Fix keyboard backlight timeout on XPS 13 9370
        platform/x86: dell-wmi: Ignore new rfkill and fn-lock events
        platform/x86: mlx-platform: Add LED platform driver activation
        platform/mellanox: Add new ODM system types to mlx-platform
        platform/mellanox: mlxreg-hotplug: add extra cycle for hotplug work queue
        platform/x86: ideapad-laptop: Add E42-80 to no_hw_rfkill
        platform/x86: silead_dmi: Add touchscreen info for the Onda V891w tablet
        platform/x86: silead_dmi: Add info for the PoV mobii TAB-P800W (v2.0)
        platform/x86: silead_dmi: Add touchscreen info for the Jumper EZpad 6 Pro
        platform/x86: asus-wireless: Fix format specifier
        platform/x86: asus-wmi: Fix NULL pointer dereference
        platform/x86: dell-wmi: Set correct keycode for Fn + left arrow
        platform/x86: acer-wmi: add another KEY_POWER keycode
        platform/x86: ideapad-laptop: Add fn-lock setting
        platform/x86: ideapad-laptop: Add MIIX 720-12IKB to no_hw_rfkill
        lib/string_helpers: Add missed declaration of struct task_struct
        platform/x86: DELL_WMI use depends on instead of select for DELL_SMBIOS
        platform/mellanox: mlxreg-hotplug: Document fixes for hotplug private data
        platform_data/mlxreg: Document fixes for hotplug device
        ...
      f3b5020e
    • Linus Torvalds's avatar
      Merge tag 'pwm/for-4.18-rc1' of... · 4b4bb99b
      Linus Torvalds authored
      Merge tag 'pwm/for-4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm
      
      Pull pwm updates from Thierry Reding:
       "This contains a couple of fixes and cleanups for the Meson and
        ACPI/LPSS drivers as well as capture support for STM32.
      
        Note that given the cross-subsystem changes, the STM32 patches were
        merged through the MFD and PWM trees, both sharing an immutable
        branch"
      
      * tag 'pwm/for-4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm:
        pwm: stm32: Fix build warning with CONFIG_DMA_ENGINE disabled
        pwm: stm32: Enforce dependency on CONFIG_MFD_STM32_TIMERS
        ACPI / LPSS: Add missing prv_offset setting for byt/cht PWM devices
        pwm: lpss: platform: Save/restore the ctrl register over a suspend/resume
        dt-bindings: mfd: stm32-timers: Add support for dmas
        pwm: simplify getting .drvdata
        pwm: meson: Fix allocation of PWM channel array
      4b4bb99b
    • Linus Torvalds's avatar
      Merge branch 'i2c/for-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · 9bca19a0
      Linus Torvalds authored
      Pull i2c updates from Wolfram Sang:
      
       - mainly feature additions to drivers (stm32f7, qup, xlp9xx, mlxcpld, ...)
      
       - conversion to use the i2c_8bit_addr_from_msg macro consistently
      
       - move includes to platform_data
      
       - core updates to allow the (still in review) I3C subsystem to connect
      
       - and the regular share of smaller driver updates
      
      * 'i2c/for-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (68 commits)
        i2c: qup: fix building without CONFIG_ACPI
        i2c: tegra: Remove suspend-resume
        i2c: imx-lpi2c: Switch to SPDX identifier
        i2c: mxs: Switch to SPDX identifier
        i2c: busses: make use of i2c_8bit_addr_from_msg
        i2c: algos: make use of i2c_8bit_addr_from_msg
        i2c: rcar: document R8A77980 bindings
        i2c: qup: Add command-line parameter to override SCL frequency
        i2c: qup: Correct duty cycle for FM and FM+
        i2c: qup: Add support for Fast Mode Plus
        i2c: qup: add probe path for Centriq ACPI devices
        i2c: robotfuzz-osif: drop pointless test
        i2c: robotfuzz-osif: remove pointless local variable
        i2c: rk3x: Don't print visible virtual mapping MMIO address
        i2c: opal: don't check number of messages in the driver
        i2c: ibm_iic: don't check number of messages in the driver
        i2c: imx: Switch to SPDX identifier
        i2c: mux: pca954x: merge calls to of_match_device and of_device_get_match_data
        i2c: mux: demux-pinctrl: use proper parent device for demux adapter
        i2c: mux: improve error message for failed symlink
        ...
      9bca19a0
    • Linus Torvalds's avatar
      Merge tag 'apparmor-pr-2018-06-13' of... · 463f2021
      Linus Torvalds authored
      Merge tag 'apparmor-pr-2018-06-13' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor
      
      Pull AppArmor updates from John Johansen:
       "Features
         - add support for mapping secids and using secctxes
         - add the ability to get a task's secid
         - add support for audit rule filtering
      
        Cleanups:
         - multiple typo fixes
         - Convert to use match_string() helper
         - update git and wiki locations in AppArmor docs
         - improve get_buffers macro by using get_cpu_ptr
         - Use an IDR to allocate apparmor secids
      
        Bug fixes:
         - fix '*seclen' is never less than zero
         - fix mediation of prlimit
         - fix memory leak when deduping profile load
         - fix ptrace read check
         - fix memory leak of rule on error exit path"
      
      * tag 'apparmor-pr-2018-06-13' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor: (21 commits)
        apparmor: fix ptrace read check
        apparmor: fix memory leak when deduping profile load
        apparmor: fix mediation of prlimit
        apparmor: fixup secid map conversion to using IDR
        apparmor: Use an IDR to allocate apparmor secids
        apparmor: Fix memory leak of rule on error exit path
        apparmor: modify audit rule support to support profile stacks
        apparmor: Add support for audit rule filtering
        apparmor: update git and wiki locations in AppArmor docs
        apparmor: Convert to use match_string() helper
        apparmor: improve get_buffers macro by using get_cpu_ptr
        apparmor: fix '*seclen' is never less than zero
        apparmor: fix typo "preconfinement"
        apparmor: fix typo "independent"
        apparmor: fix typo "traverse"
        apparmor: fix typo "type"
        apparmor: fix typo "replace"
        apparmor: fix typo "comparison"
        apparmor: fix typo "loosen"
        apparmor: add the ability to get a task's secid
        ...
      463f2021
    • Linus Torvalds's avatar
      Kbuild: rename CC_STACKPROTECTOR[_STRONG] config variables · 050e9baa
      Linus Torvalds authored
      The changes to automatically test for working stack protector compiler
      support in the Kconfig files removed the special STACKPROTECTOR_AUTO
      option that picked the strongest stack protector that the compiler
      supported.
      
      That was all a nice cleanup - it makes no sense to have the AUTO case
      now that the Kconfig phase can just determine the compiler support
      directly.
      
      HOWEVER.
      
      It also meant that doing "make oldconfig" would now _disable_ the strong
      stackprotector if you had AUTO enabled, because in a legacy config file,
      the sane stack protector configuration would look like
      
        CONFIG_HAVE_CC_STACKPROTECTOR=y
        # CONFIG_CC_STACKPROTECTOR_NONE is not set
        # CONFIG_CC_STACKPROTECTOR_REGULAR is not set
        # CONFIG_CC_STACKPROTECTOR_STRONG is not set
        CONFIG_CC_STACKPROTECTOR_AUTO=y
      
      and when you ran this through "make oldconfig" with the Kbuild changes,
      it would ask you about the regular CONFIG_CC_STACKPROTECTOR (that had
      been renamed from CONFIG_CC_STACKPROTECTOR_REGULAR to just
      CONFIG_CC_STACKPROTECTOR), but it would think that the STRONG version
      used to be disabled (because it was really enabled by AUTO), and would
      disable it in the new config, resulting in:
      
        CONFIG_HAVE_CC_STACKPROTECTOR=y
        CONFIG_CC_HAS_STACKPROTECTOR_NONE=y
        CONFIG_CC_STACKPROTECTOR=y
        # CONFIG_CC_STACKPROTECTOR_STRONG is not set
        CONFIG_CC_HAS_SANE_STACKPROTECTOR=y
      
      That's dangerously subtle - people could suddenly find themselves with
      the weaker stack protector setup without even realizing.
      
      The solution here is to just rename not just the old REGULAR stack
      protector option, but also the strong one.  This does that by just
      removing the CC_ prefix entirely for the user choices, because it really
      is not about the compiler support (the compiler support now instead
      automatically impacts _visibility_ of the options to users).
      
      This results in "make oldconfig" actually asking the user for their
      choice, so that we don't have any silent subtle security model changes.
      The end result would generally look like this:
      
        CONFIG_HAVE_CC_STACKPROTECTOR=y
        CONFIG_CC_HAS_STACKPROTECTOR_NONE=y
        CONFIG_STACKPROTECTOR=y
        CONFIG_STACKPROTECTOR_STRONG=y
        CONFIG_CC_HAS_SANE_STACKPROTECTOR=y
      
      where the "CC_" versions really are about internal compiler
      infrastructure, not the user selections.

      Acked-by: Masahiro Yamada <yamada.masahiro@socionext.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      050e9baa
  2. 13 Jun, 2018 9 commits
    • Linus Torvalds's avatar
      Merge tag 'kbuild-v4.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild · be779f03
      Linus Torvalds authored
      Pull more Kbuild updates from Masahiro Yamada:
      
       - fix some bugs introduced by the recent Kconfig syntax extension
      
       - add some symbols about compiler information in Kconfig, such as
         CC_IS_GCC, CC_IS_CLANG, GCC_VERSION, etc.
      
       - test compiler capability for the stack protector in Kconfig, and
         clean-up Makefile
      
       - test compiler capability for GCC-plugins in Kconfig, and clean-up
         Makefile
      
       - allow to enable GCC-plugins for COMPILE_TEST
      
       - test compiler capability for KCOV in Kconfig and correct dependency
      
       - remove auto-detect mode of the GCOV format, which is now more nicely
         handled in Kconfig
      
       - test compiler capability for mprofile-kernel on PowerPC, and clean-up
         Makefile
      
       - misc cleanups
      
      * tag 'kbuild-v4.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
        linux/linkage.h: replace VMLINUX_SYMBOL_STR() with __stringify()
        kconfig: fix localmodconfig
        sh: remove no-op macro VMLINUX_SYMBOL()
        powerpc/kbuild: move -mprofile-kernel check to Kconfig
        Documentation: kconfig: add recommended way to describe compiler support
        gcc-plugins: disable GCC_PLUGIN_STRUCTLEAK_BYREF_ALL for COMPILE_TEST
        gcc-plugins: allow to enable GCC_PLUGINS for COMPILE_TEST
        gcc-plugins: test plugin support in Kconfig and clean up Makefile
        gcc-plugins: move GCC version check for PowerPC to Kconfig
        kcov: test compiler capability in Kconfig and correct dependency
        gcov: remove CONFIG_GCOV_FORMAT_AUTODETECT
        arm64: move GCC version check for ARCH_SUPPORTS_INT128 to Kconfig
        kconfig: add CC_IS_CLANG and CLANG_VERSION
        kconfig: add CC_IS_GCC and GCC_VERSION
        stack-protector: test compiler capability in Kconfig and drop AUTO mode
        kbuild: fix endless syncconfig in case arch Makefile sets CROSS_COMPILE
      be779f03
    • Linus Torvalds's avatar
      Merge tag 'acpi-4.18-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · d290ef93
      Linus Torvalds authored
      Pull additional ACPI updates from Rafael Wysocki:
       "These update the ACPICA code in the kernel to upstream revision
        20180531 including one important AML parser fix and updates related to
        the IORT table, make the kernel recognize the "Windows 2017.2" _OSI
        string and update the customized methods documentation.
      
        Specifics:
      
         - Update the ACPICA code in the kernel to upstream revision 20180531
           including:
            * AML parser fix to continue loading tables after detecting an AML
              error (Erik Schmauss).
            * AML parser debug option to dump parse trees (Bob Moore).
            * Debugger updates (Bob Moore).
            * Initial bits of Unload () operator deprecation (Bob Moore).
            * Updates related to the IORT table (Robin Murphy).
      
         - Make Linux respond to the "Windows 2017.2" _OSI string which
           allows native Thunderbolt enumeration to be used on Dell systems
           and was unsafe before recent changes in the PCI subsystem (Mario
           Limonciello)
      
         - Update the ACPI method customization feature documentation (Erik
           Schmauss)"
      
      * tag 'acpi-4.18-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        ACPICA: Recognize the _OSI string "Windows 2017.2"
        ACPICA: Update version to 20180531
        ACPICA: Interpreter: Begin deprecation of Unload operator
        ACPICA: AML parser: attempt to continue loading table after error
        ACPICA: Debugger: Reduce verbosity for module-level code errors.
        ACPICA: AML Parser: Add debug option to dump parse trees
        ACPICA: Debugger: Add count of namespace nodes after namespace dump
        ACPICA: IORT: Add PMCG node supprt
        ACPICA: IORT: Update for revision D
        ACPI / Documentation: update ACPI customize method feature docs
      d290ef93
    • Linus Torvalds's avatar
      Merge tag 'pm-4.18-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · d09fcecb
      Linus Torvalds authored
      Pull more power management updates from Rafael Wysocki:
       "These revert a recent PM core change that introduced a regression, fix
        the build when the recently added Kryo cpufreq driver is selected, add
        support for devices attached to multiple power domains to the generic
        power domains (genpd) framework, add support for iowait boosting on
        systems with hardware-managed P-states (HWP) enabled to the
        intel_pstate driver, modify the behavior of the wakeup_count device
        attribute in sysfs, fix a few issues and clean up some ugliness,
        mostly in cpufreq (core and drivers) and in the cpupower utility.
      
        Specifics:
      
         - Revert a recent PM core change that attempted to fix an issue
           related to device links, but introduced a regression (Rafael
           Wysocki)
      
         - Fix build when the recently added cpufreq driver for Kryo
           processors is selected by making it possible to build that driver
           as a module (Arnd Bergmann)
      
         - Fix the long idle detection mechanism in the out-of-band (ondemand
           and conservative) cpufreq governors (Chen Yu)
      
         - Add support for devices in multiple power domains to the generic
           power domains (genpd) framework (Ulf Hansson)
      
         - Add support for iowait boosting on systems with hardware-managed
           P-states (HWP) enabled to the intel_pstate driver and make it use
           that feature on systems with Skylake Xeon processors as it is
           reported to improve performance significantly on those systems
           (Srinivas Pandruvada)
      
         - Fix and update the acpi_cpufreq, ti-cpufreq and imx6q cpufreq
           drivers (Colin Ian King, Suman Anna, Sébastien Szymanski)
      
         - Change the behavior of the wakeup_count device attribute in sysfs
           to expose the number of events when the device might have aborted
           system suspend in progress (Ravi Chandra Sadineni)
      
         - Fix two minor issues in the cpupower utility (Abhishek Goel, Colin
           Ian King)"
      
      * tag 'pm-4.18-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        Revert "PM / runtime: Fixup reference counting of device link suppliers at probe"
        cpufreq: imx6q: check speed grades for i.MX6ULL
        cpufreq: governors: Fix long idle detection logic in load calculation
        cpufreq: intel_pstate: enable boost for Skylake Xeon
        PM / wakeup: Export wakeup_count instead of event_count via sysfs
        PM / Domains: Add dev_pm_domain_attach_by_id() to manage multi PM domains
        PM / Domains: Add support for multi PM domains per device to genpd
        PM / Domains: Split genpd_dev_pm_attach()
        PM / Domains: Don't attach devices in genpd with multi PM domains
        PM / Domains: dt: Allow power-domain property to be a list of specifiers
        cpufreq: intel_pstate: New sysfs entry to control HWP boost
        cpufreq: intel_pstate: HWP boost performance on IO wakeup
        cpufreq: intel_pstate: Add HWP boost utility and sched util hooks
        cpufreq: ti-cpufreq: Use devres managed API in probe()
        cpufreq: ti-cpufreq: Fix an incorrect error return value
        cpufreq: ACPI: make function acpi_cpufreq_fast_switch() static
        cpufreq: kryo: allow building as a loadable module
        cpupower : Fix header name to read idle state name
        cpupower: fix spelling mistake: "logilename" -> "logfilename"
      d09fcecb
    • Rafael J. Wysocki's avatar
      Merge branch 'acpica' · 67445532
      Rafael J. Wysocki authored
      ACPICA update to upstream revision 20180531 (including an important
      AML parser fix and updates related to IORT) and a change to start
      responding to the "Windows 2017.2" _OSI string.
      
      * acpica:
        ACPICA: Recognize the _OSI string "Windows 2017.2"
        ACPICA: Update version to 20180531
        ACPICA: Interpreter: Begin deprecation of Unload operator
        ACPICA: AML parser: attempt to continue loading table after error
        ACPICA: Debugger: Reduce verbosity for module-level code errors.
        ACPICA: AML Parser: Add debug option to dump parse trees
        ACPICA: Debugger: Add count of namespace nodes after namespace dump
        ACPICA: IORT: Add PMCG node supprt
        ACPICA: IORT: Update for revision D
      67445532
    • Rafael J. Wysocki's avatar
      Merge branches 'pm-domains' and 'pm-tools' · 6a900f88
      Rafael J. Wysocki authored
      Additional updates of the generic power domains (genpd) framework
      (support for devices attached to multiple domains) and the cpupower
      utility (minor fixes) for 4.18-rc1.
      
      * pm-domains:
        PM / Domains: Add dev_pm_domain_attach_by_id() to manage multi PM domains
        PM / Domains: Add support for multi PM domains per device to genpd
        PM / Domains: Split genpd_dev_pm_attach()
        PM / Domains: Don't attach devices in genpd with multi PM domains
        PM / Domains: dt: Allow power-domain property to be a list of specifiers
      
      * pm-tools:
        cpupower : Fix header name to read idle state name
        cpupower: fix spelling mistake: "logilename" -> "logfilename"
      6a900f88
    • Rafael J. Wysocki's avatar
      Merge branch 'pm-cpufreq' · 2652df3a
      Rafael J. Wysocki authored
      Additional cpufreq updates for 4.18-rc1: fixes and cleanups in the
      core and drivers, plus an intel_pstate extension that does iowait
      boosting on systems with HWP, which improves performance quite a bit.
      
      * pm-cpufreq:
        cpufreq: imx6q: check speed grades for i.MX6ULL
        cpufreq: governors: Fix long idle detection logic in load calculation
        cpufreq: intel_pstate: enable boost for Skylake Xeon
        cpufreq: intel_pstate: New sysfs entry to control HWP boost
        cpufreq: intel_pstate: HWP boost performance on IO wakeup
        cpufreq: intel_pstate: Add HWP boost utility and sched util hooks
        cpufreq: ti-cpufreq: Use devres managed API in probe()
        cpufreq: ti-cpufreq: Fix an incorrect error return value
        cpufreq: ACPI: make function acpi_cpufreq_fast_switch() static
        cpufreq: kryo: allow building as a loadable module
      2652df3a
    • Linus Torvalds's avatar
      Revert "debugfs: inode: debugfs_create_dir uses mode permission from parent" · f5b7769e
      Linus Torvalds authored
      This reverts commit 95cde3c5.
      
      The commit had good intentions, but it breaks kvm-tool and qemu-kvm.
      
      With it in place, "lkvm run" just fails with
      
        Error: KVM_CREATE_VM ioctl
        Warning: Failed init: kvm__init
      
      which isn't a wonderful error message, but bisection pinpointed the
      problematic commit.
      
      The problem is almost certainly due to the special kvm debugfs entries
      created dynamically by kvm under /sys/kernel/debug/kvm/.  See
      kvm_create_vm_debugfs().
      
      Bisected-and-reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Wanpeng Li <kernellwp@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Kees Cook <keescook@chromium.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f5b7769e
    • Linus Torvalds's avatar
      KVM: x86: VMX: fix build without hyper-v · dbee3d02
      Linus Torvalds authored
      Commit ceef7d10 ("KVM: x86: VMX: hyper-v: Enlightened MSR-Bitmap
      support") broke the build with Hyper-V disabled, because it accesses
      ms_hyperv.nested_features without checking if that exists.
      
      This is the quick-and-hacky build fix.
      
      I suspect the proper fix is to replace the
      
          static_branch_unlikely(&enable_evmcs)
      
      tests with an inline helper function that also checks that CONFIG_HYPERV
      is enabled, since without that, enable_evmcs makes no sense.
      
      But I want a working build environment first and foremost, and I'm upset
      this slipped through in the first place.  My primary build tests missed
      it because I tend to build with everything enabled, but it should have
      been caught in the kvm tree.
      
      Fixes: ceef7d10 ("KVM: x86: VMX: hyper-v: Enlightened MSR-Bitmap support")
      Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      dbee3d02
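      The "inline helper" idea described in the message above can be
      sketched in plain userspace C. This is only an illustration of the
      pattern, not the kernel code: DEMO_CONFIG_HYPERV, demo_enable_evmcs,
      demo_evmcs_enabled() and demo_vmcs_kind() are hypothetical names
      standing in for CONFIG_HYPERV, the enable_evmcs static key and its
      call sites.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for CONFIG_HYPERV: 0 = configured out, 1 = built in. */
#ifndef DEMO_CONFIG_HYPERV
#define DEMO_CONFIG_HYPERV 0
#endif

static_assert(DEMO_CONFIG_HYPERV == 0 || DEMO_CONFIG_HYPERV == 1,
	      "DEMO_CONFIG_HYPERV must be 0 or 1");

/* Hypothetical stand-in for the run-time enable_evmcs flag. */
bool demo_enable_evmcs = true;

/*
 * One helper folds the compile-time and run-time conditions together.
 * When the feature is configured out it returns a constant false, so the
 * run-time flag alone can never select the Hyper-V-only path.
 */
static inline bool demo_evmcs_enabled(void)
{
#if DEMO_CONFIG_HYPERV
	return demo_enable_evmcs;
#else
	return false;
#endif
}

/* Call sites test the helper instead of the raw flag. */
const char *demo_vmcs_kind(void)
{
	return demo_evmcs_enabled() ? "enlightened" : "regular";
}
```

      With DEMO_CONFIG_HYPERV left at 0, demo_vmcs_kind() reports the
      regular path even though demo_enable_evmcs is true. Building with
      -DDEMO_CONFIG_HYPERV=1 makes the helper follow the run-time flag.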
    • Linus Torvalds's avatar
      Merge tag 'overflow-v4.18-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · b08fc527
      Linus Torvalds authored
      Pull more overflow updates from Kees Cook:
       "The rest of the overflow changes for v4.18-rc1.
      
        This includes the explicit overflow fixes from Silvio, further
        struct_size() conversions from Matthew, and a bug fix from Dan.
      
        But the bulk of it is the treewide conversions to use either the
        2-factor argument allocators (e.g. kmalloc(a * b, ...) into
        kmalloc_array(a, b, ...)) or the array_size() macros (e.g. vmalloc(a *
        b) into vmalloc(array_size(a, b))).
      
        Coccinelle was fighting me on several fronts, so I've done a bunch of
        manual whitespace updates in the patches as well.
      
        Summary:
      
         - Error path bug fix for overflow tests (Dan)
      
         - Additional struct_size() conversions (Matthew, Kees)
      
         - Explicitly reported overflow fixes (Silvio, Kees)
      
         - Add missing kvcalloc() function (Kees)
      
         - Treewide conversions of allocators to use either 2-factor argument
           variant when available, or array_size() and array3_size() as needed
           (Kees)"
      
      * tag 'overflow-v4.18-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (26 commits)
        treewide: Use array_size in f2fs_kvzalloc()
        treewide: Use array_size() in f2fs_kzalloc()
        treewide: Use array_size() in f2fs_kmalloc()
        treewide: Use array_size() in sock_kmalloc()
        treewide: Use array_size() in kvzalloc_node()
        treewide: Use array_size() in vzalloc_node()
        treewide: Use array_size() in vzalloc()
        treewide: Use array_size() in vmalloc()
        treewide: devm_kzalloc() -> devm_kcalloc()
        treewide: devm_kmalloc() -> devm_kmalloc_array()
        treewide: kvzalloc() -> kvcalloc()
        treewide: kvmalloc() -> kvmalloc_array()
        treewide: kzalloc_node() -> kcalloc_node()
        treewide: kzalloc() -> kcalloc()
        treewide: kmalloc() -> kmalloc_array()
        mm: Introduce kvcalloc()
        video: uvesafb: Fix integer overflow in allocation
        UBIFS: Fix potential integer overflow in allocation
        leds: Use struct_size() in allocation
        Convert intel uncore to struct_size
        ...
      b08fc527
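      The reason for these conversions can be shown with a small userspace
      sketch. It assumes a GCC or Clang compiler that provides
      __builtin_mul_overflow(). The demo_* functions are hypothetical
      stand-ins that mimic the behaviour the pull message describes for
      array_size() and the 2-factor allocators; they are not the kernel
      implementations.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Stand-in for array_size(): returns a * b, or SIZE_MAX if the product
 * does not fit in size_t, so a later allocation fails instead of being
 * silently undersized.
 */
size_t demo_array_size(size_t a, size_t b)
{
	size_t bytes;

	if (__builtin_mul_overflow(a, b, &bytes))
		return SIZE_MAX;
	return bytes;
}

/*
 * Stand-in for a 2-factor allocator such as kmalloc_array(): the
 * multiplication is checked inside the allocator, and an overflowing
 * request is refused.
 */
void *demo_malloc_array(size_t n, size_t size)
{
	size_t bytes;

	if (__builtin_mul_overflow(n, size, &bytes))
		return NULL;
	return malloc(bytes);
}

/* Contrast the open-coded product with the checked helpers. */
void demo_overflow_check(void)
{
	size_t n = SIZE_MAX / 2 + 2;	/* n * 2 wraps around to 2 */

	assert(n * 2 == 2);				/* open-coded: wraps */
	assert(demo_array_size(n, 2) == SIZE_MAX);	/* saturates */
	assert(demo_malloc_array(n, 2) == NULL);	/* refused */
}
```

      An open-coded malloc(n * 2) with the value of n above would request
      only 2 bytes, while both helpers turn the same request into a
      failure that the caller can detect.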