1. 20 May, 2018 8 commits
    • Linus Torvalds's avatar
      Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 441cab96
      Linus Torvalds authored
      Pull scheduler fixlets from Thomas Gleixner:
       "Three trivial fixlets for the scheduler:
      
         - move print_rt_rq() and print_dl_rq() declarations to the right
           place
      
         - make grub_reclaim() static
      
         - fix the bogus documentation reference in Kconfig"
      
      * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched/fair: Fix documentation file path
        sched/deadline: Make the grub_reclaim() function static
        sched/debug: Move the print_rt_rq() and print_dl_rq() declarations to kernel/sched/sched.h
      441cab96
    • Linus Torvalds's avatar
      Merge branch 'ras-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 74cce52f
      Linus Torvalds authored
      Pull RAS fix from Thomas Gleixner:
       "Fix a regression in the new AMD SMCA code which issues an SMP function
        call from the early interrupt disabled region of CPU hotplug. To avoid
        that, use cached block addresses which can be used directly"
      
      * 'ras-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/MCE/AMD: Cache SMCA MISC block addresses
      74cce52f
    • Linus Torvalds's avatar
      Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 95bcce4d
      Linus Torvalds authored
      Pull perf tooling fixes from Thomas Gleixner:
      
       - fix segfault when processing unknown threads in cs-etm
      
       - fix "perf test inet_pton" on s390 failing due to missing inline
      
       - display all available events on 'perf annotate --stdio'
      
       - add missing newline when parsing an empty BPF program
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf tools: Add missing newline when parsing empty BPF proggie
        perf cs-etm: Remove redundant space
        perf cs-etm: Support unknown_thread in cs_etm_auxtrace
        perf annotate: Display all available events on --stdio
        perf test: "probe libc's inet_pton" fails on s390 due to missing inline
      95bcce4d
    • Linus Torvalds's avatar
      Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 4b65f455
      Linus Torvalds authored
      Pull locking fixes from Thomas Gleixner:
       "Two fixes to address shortcomings of the rwsem/percpu-rwsem lock
        debugging code which emits false positive warnings when the rwsem is
        anonymously locked and unlocked"
      
      * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        locking/percpu-rwsem: Annotate rwsem ownership transfer by setting RWSEM_OWNER_UNKNOWN
        locking/rwsem: Add a new RWSEM_ANONYMOUSLY_OWNED flag
      4b65f455
    • Linus Torvalds's avatar
      Merge branch 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 056ad121
      Linus Torvalds authored
      Pull EFI fixes from Thomas Gleixner:
      
       - Use explicitely sized type for the romimage pointer in the 32bit EFI
         protocol struct so a 64bit kernel does not expand it to 64bit. Ditto
         for the 64bit struct to avoid the reverse issue on 32bit kernels.
      
       - Handle randomized tex offset correctly in the ARM64 EFI stub to avoid
         unaligned data resulting in stack corruption and other hard to
         diagnose wreckage.
      
      * 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        efi/libstub/arm64: Handle randomized TEXT_OFFSET
        efi: Avoid potential crashes, fix the 'struct efi_pci_io_protocol_32' definition for mixed mode
      056ad121
    • Linus Torvalds's avatar
      Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 583dbad3
      Linus Torvalds authored
      Pull core fixes from Thomas Gleixner:
      
       - Unbreak the BPF compilation which got broken by the unconditional
         requirement of asm-goto, which is not supported by clang.
      
       - Prevent probing on exception masking instructions in uprobes and
         kprobes to avoid the issues of the delayed exceptions instead of
         having an ugly workaround.
      
       - Prevent a double free_page() in the error path of do_kexec_load()
      
       - A set of objtool updates addressing various issues mostly related to
         switch tables and the noreturn detection for recursive sibling calls
      
       - Header sync for tools.
      
      * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        objtool: Detect RIP-relative switch table references, part 2
        objtool: Detect RIP-relative switch table references
        objtool: Support GCC 8 switch tables
        objtool: Support GCC 8's cold subfunctions
        objtool: Fix "noreturn" detection for recursive sibling calls
        objtool, kprobes/x86: Sync the latest <asm/insn.h> header with tools/objtool/arch/x86/include/asm/insn.h
        x86/cpufeature: Guard asm_volatile_goto usage for BPF compilation
        uprobes/x86: Prohibit probing on MOV SS instruction
        kprobes/x86: Prohibit probing on exception masking instructions
        x86/kexec: Avoid double free_page() upon do_kexec_load() failure
      583dbad3
    • Linus Torvalds's avatar
      Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · 203ec2fe
      Linus Torvalds authored
      Pull ARM SoC fixes from Olof Johansson:
       "A handful of fixes. I've been queuing them up a bit too long so the
        list is longer than it otherwise would have been spread out across a
        few -rcs.
      
        In general, it's a scattering of fixes across several platforms,
        nothing truly serious enough to point out.
      
        There's a slightly larger batch of them for the Davinci platforms due
        to work to bring them back to life after some time, so there's a
        handful of regressions, some of them going back very far, others more
        recent.
      
        There's also a few patches fixing DT on Renesas platforms since they
        changed some bindings without remaining backwards compatible,
        splitting up describing LVDS as a proper bridge instead of having it
        as part of the display unit.
      
        We could push for them to be backwards compatible with old device
        trees, but it's likely to regress eventually if nobody's actually
        using said compatibility"
      
      * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (36 commits)
        ARM: davinci: board-dm646x-evm: set VPIF capture card name
        ARM: davinci: board-dm646x-evm: pass correct I2C adapter id for VPIF
        ARM: davinci: dm646x: fix timer interrupt generation
        ARM: keystone: fix platform_domain_notifier array overrun
        arm64: dts: exynos: Fix interrupt type for I2S1 device on Exynos5433
        ARM: dts: imx51-zii-rdu1: fix touchscreen bindings
        firmware: arm_scmi: Use after free in scmi_create_protocol_device()
        ARM: dts: cygnus: fix irq type for arm global timer
        Revert "ARM: dts: logicpd-som-lv: Fix pinmux controller references"
        tee: check shm references are consistent in offset/size
        tee: shm: fix use-after-free via temporarily dropped reference
        ARM: dts: imx7s: Pass the 'fsl,sec-era' property
        ARM: dts: tegra20: Revert "Fix ULPI regression on Tegra20"
        ARM: dts: correct missing "compatible" entry for ti81xx SoCs
        ARM: OMAP1: ams-delta: fix deferred_fiq handler
        arm64: tegra: Make BCM89610 PHY interrupt as active low
        ARM: davinci: fix GPIO lookup for I2C
        ARM: dts: logicpd-som-lv: Fix pinmux controller references
        ARM: dts: logicpd-som-lv: Fix Audio Mute
        ARM: dts: logicpd-som-lv: Fix WL127x Startup Issues
        ...
      203ec2fe
    • Olof Johansson's avatar
      Merge tag 'tegra-for-4.17-fixes-2' of... · 709f490d
      Olof Johansson authored
      Merge tag 'tegra-for-4.17-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux into fixes
      
      arm64: tegra: Device tree fixes for v4.17
      
      This contains a one-line update to the device tree of the Tegra186 P3310
      processor module, fixing the polarity of the PHY interrupt. Originally,
      this was queued to go into v4.18, but the PHY ID matching patch has now
      found its way into v4.17-rc5, which means that the PHY driver will know
      how to identify the PHY on this board and try to use the interrupt. This
      will unfortunately cause networking to break on P3310, hence why I think
      this should go into v4.17.
      
      * tag 'tegra-for-4.17-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux:
        arm64: tegra: Make BCM89610 PHY interrupt as active low
      Signed-off-by: default avatarOlof Johansson <olof@lixom.net>
      709f490d
  2. 19 May, 2018 19 commits
    • Linus Torvalds's avatar
      Merge tag 'dmaengine-fix-4.17-rc6' of git://git.infradead.org/users/vkoul/slave-dma · 0b449a44
      Linus Torvalds authored
      Pull dmaengine fix from Vinod Koul:
      
       - qcom bam runtime_pm fix
      
       - email update for Vinod
      
      * tag 'dmaengine-fix-4.17-rc6' of git://git.infradead.org/users/vkoul/slave-dma:
        dmaengine: qcom: bam_dma: check if the runtime pm enabled
        dmaengine: Update email address for Vinod
      0b449a44
    • Linus Torvalds's avatar
      mmap: relax file size limit for regular files · 423913ad
      Linus Torvalds authored
      Commit be83bbf8 ("mmap: introduce sane default mmap limits") was
      introduced to catch problems in various ad-hoc character device drivers
      doing mmap and getting the size limits wrong.  In the process, it used
      "known good" limits for the normal cases of mapping regular files and
      block device drivers.
      
      It turns out that the "s_maxbytes" limit was less "known good" than I
      thought.  In particular, /proc doesn't set it, but exposes one regular
      file to mmap: /proc/vmcore.  As a result, that file got limited to the
      default MAX_INT s_maxbytes value.
      
      This went unnoticed for a while, because apparently the only thing that
      needs it is the s390 kernel zfcpdump, but there might be other tools
      that use this too.
      
      Vasily suggested just changing s_maxbytes for all of /proc, which isn't
      wrong, but makes me nervous at this stage.  So instead, just make the
      new mmap limit always be MAX_LFS_FILESIZE for regular files, which won't
      affect anything else.  It wasn't the regular file case I was worried
      about.
      
      I'd really prefer for maxsize to have been per-inode, but that is not
      how things are today.
      
      Fixes: be83bbf8 ("mmap: introduce sane default mmap limits")
      Reported-by: default avatarVasily Gorbik <gor@linux.ibm.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      423913ad
    • Borislav Petkov's avatar
      x86/MCE/AMD: Cache SMCA MISC block addresses · 78ce2410
      Borislav Petkov authored
      ... into a global, two-dimensional array and service subsequent reads from
      that cache to avoid rdmsr_on_cpu() calls during CPU hotplug (IPIs with IRQs
      disabled).
      
      In addition, this fixes a KASAN slab-out-of-bounds read due to wrong usage
      of the bank->blocks pointer.
      
      Fixes: 27bd5950 ("x86/mce/AMD: Get address from already initialized block")
      Reported-by: default avatarJohannes Hirte <johannes.hirte@datenkhaos.de>
      Tested-by: default avatarJohannes Hirte <johannes.hirte@datenkhaos.de>
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: Yazen Ghannam <yazen.ghannam@amd.com>
      Link: http://lkml.kernel.org/r/20180414004230.GA2033@probook
      78ce2410
    • Josh Poimboeuf's avatar
      objtool: Detect RIP-relative switch table references, part 2 · 7dec80cc
      Josh Poimboeuf authored
      With the following commit:
      
        fd35c88b ("objtool: Support GCC 8 switch tables")
      
      I added a "can't find switch jump table" warning, to stop covering up
      silent failures if add_switch_table() can't find anything.
      
      That warning found yet another bug in the objtool switch table detection
      logic.  For cases 1 and 2 (as described in the comments of
      find_switch_table()), the find_symbol_containing() check doesn't adjust
      the offset for RIP-relative switch jumps.
      
      Incidentally, this bug was already fixed for case 3 with:
      
        6f5ec299 ("objtool: Detect RIP-relative switch table references")
      
      However, that commit missed the fix for cases 1 and 2.
      
      The different cases are now starting to look more and more alike.  So
      fix the bug by consolidating them into a single case, by checking the
      original dynamic jump instruction in the case 3 loop.
      
      This also simplifies the code and makes it more robust against future
      switch table detection issues -- of which I'm sure there will be many...
      
      Switch table detection has been the most fragile area of objtool, by
      far.  I long for the day when we'll have a GCC plugin for annotating
      switch tables.  Linus asked me to delay such a plugin due to the
      flakiness of the plugin infrastructure in older versions of GCC, so this
      rickety code is what we're stuck with for now.  At least the code is now
      a little simpler than it was.
      Reported-by: default avatarkbuild test robot <lkp@intel.com>
      Signed-off-by: default avatarJosh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/f400541613d45689086329432f3095119ffbc328.1526674218.git.jpoimboe@redhat.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      7dec80cc
    • Mark Rutland's avatar
      efi/libstub/arm64: Handle randomized TEXT_OFFSET · 4f74d72a
      Mark Rutland authored
      When CONFIG_RANDOMIZE_TEXT_OFFSET=y, TEXT_OFFSET is an arbitrary
      multiple of PAGE_SIZE in the interval [0, 2MB).
      
      The EFI stub does not account for the potential misalignment of
      TEXT_OFFSET relative to EFI_KIMG_ALIGN, and produces a randomized
      physical offset which is always a round multiple of EFI_KIMG_ALIGN.
      This may result in statically allocated objects whose alignment exceeds
      PAGE_SIZE to appear misaligned in memory. This has been observed to
      result in spurious stack overflow reports and failure to make use of
      the IRQ stacks, and theoretically could result in a number of other
      issues.
      
      We can OR in the low bits of TEXT_OFFSET to ensure that we have the
      necessary offset (and hence preserve the misalignment of TEXT_OFFSET
      relative to EFI_KIMG_ALIGN), so let's do that.
      Reported-by: default avatarKim Phillips <kim.phillips@arm.com>
      Tested-by: default avatarKim Phillips <kim.phillips@arm.com>
      [ardb: clarify comment and commit log, drop unneeded parens]
      Signed-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Signed-off-by: default avatarArd Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-efi@vger.kernel.org
      Fixes: 6f26b367 ("arm64: kaslr: increase randomization granularity")
      Link: http://lkml.kernel.org/r/20180518140841.9731-2-ard.biesheuvel@linaro.orgSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      4f74d72a
    • Linus Torvalds's avatar
      Merge branch 'akpm' (patches from Andrew) · 73fcb1a3
      Linus Torvalds authored
      Merge misc fixes from Andrew Morton:
       "10 fixes"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        hfsplus: stop workqueue when fill_super() failed
        mm: don't allow deferred pages with NEED_PER_CPU_KM
        MAINTAINERS: add Q: entry to kselftest for patchwork project
        radix tree: fix multi-order iteration race
        radix tree test suite: multi-order iteration race
        radix tree test suite: add item_delete_rcu()
        radix tree test suite: fix compilation issue
        radix tree test suite: fix mapshift build target
        include/linux/mm.h: add new inline function vmf_error()
        lib/test_bitmap.c: fix bitmap optimisation tests to report errors correctly
      73fcb1a3
    • Linus Torvalds's avatar
      Merge tag 'platform-drivers-x86-v4.17-3' of git://git.infradead.org/linux-platform-drivers-x86 · 10a2f874
      Linus Torvalds authored
      Pull x86 platform driver fix from Darren Hart:
       "Remove the last of the "select DELL_SMBIOS" references in the Kconfig"
      
      * tag 'platform-drivers-x86-v4.17-3' of git://git.infradead.org/linux-platform-drivers-x86:
        platform/x86: DELL_WMI use depends on instead of select for DELL_SMBIOS
      10a2f874
    • Linus Torvalds's avatar
      Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux · f65cfecf
      Linus Torvalds authored
      Pull clk fixes from Stephen Boyd:
      
       - a modified revert of a patch that made new choices come out for a
         couple stm32 clk drivers that really always need to be there when
         that particular machine is compiled in
      
       - boot fix on i.MX for Stefan who noticed odd behavior from the
         critical flag patch that came in during the merge window
      
      * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
        clk: stm32: fix: stm32 clock drivers are not compiled by default
        clk: imx6ull: use OSC clock during AXI rate change
      f65cfecf
    • Linus Torvalds's avatar
      Merge branch 'i2c/for-current-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · 6d16db00
      Linus Torvalds authored
      Pull i2c fixes from Wolfram Sang:
       "A bunch of driver bugfixes and a MAINTAINERS addition"
      
      * 'i2c/for-current-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        MAINTAINERS: add entry for STM32 I2C driver
        i2c: viperboard: return message count on master_xfer success
        i2c: pmcmsp: fix error return from master_xfer
        i2c: pmcmsp: return message count on master_xfer success
        i2c: designware: fix poll-after-enable regression
        eeprom: at24: fix retrieving the at24_chip_data structure
        i2c: core: ACPI: Log device not acking errors at dbg loglevel
        i2c: core: ACPI: Improve OpRegion read errors
      6d16db00
    • Tetsuo Handa's avatar
      hfsplus: stop workqueue when fill_super() failed · 66072c29
      Tetsuo Handa authored
      syzbot is reporting ODEBUG messages at hfsplus_fill_super() [1].  This
      is because hfsplus_fill_super() forgot to call cancel_delayed_work_sync().
      
      As far as I can see, it is hfsplus_mark_mdb_dirty() from
      hfsplus_new_inode() in hfsplus_fill_super() that calls
      queue_delayed_work().  Therefore, I assume that hfsplus_new_inode() does
      not fail if queue_delayed_work() was called, and the out_put_hidden_dir
      label is the appropriate location to call cancel_delayed_work_sync().
      
      [1] https://syzkaller.appspot.com/bug?id=a66f45e96fdbeb76b796bf46eb25ea878c42a6c9
      
      Link: http://lkml.kernel.org/r/964a8b27-cd69-357c-fe78-76b066056201@I-love.SAKURA.ne.jpSigned-off-by: default avatarTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Reported-by: default avatarsyzbot <syzbot+4f2e5f086147d543ab03@syzkaller.appspotmail.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Ernesto A. Fernandez <ernesto.mnd.fernandez@gmail.com>
      Cc: Vyacheslav Dubeyko <slava@dubeyko.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      66072c29
    • Pavel Tatashin's avatar
      mm: don't allow deferred pages with NEED_PER_CPU_KM · ab1e8d89
      Pavel Tatashin authored
      It is unsafe to do virtual to physical translations before mm_init() is
      called if struct page is needed in order to determine the memory section
      number (see SECTION_IN_PAGE_FLAGS).  This is because only in mm_init()
      we initialize struct pages for all the allocated memory when deferred
      struct pages are used.
      
      My recent fix in commit c9e97a19 ("mm: initialize pages on demand
      during boot") exposed this problem, because it greatly reduced number of
      pages that are initialized before mm_init(), but the problem existed
      even before my fix, as Fengguang Wu found.
      
      Below is a more detailed explanation of the problem.
      
      We initialize struct pages in four places:
      
      1. Early in boot a small set of struct pages is initialized to fill the
         first section, and lower zones.
      
      2. During mm_init() we initialize "struct pages" for all the memory that
         is allocated, i.e reserved in memblock.
      
      3. Using on-demand logic when pages are allocated after mm_init call
         (when memblock is finished)
      
      4. After smp_init() when the rest free deferred pages are initialized.
      
      The problem occurs if we try to do va to phys translation of a memory
      between steps 1 and 2.  Because we have not yet initialized struct pages
      for all the reserved pages, it is inherently unsafe to do va to phys if
      the translation itself requires access of "struct page" as in case of
      this combination: CONFIG_SPARSE && !CONFIG_SPARSE_VMEMMAP
      
      The following path exposes the problem:
      
        start_kernel()
         trap_init()
          setup_cpu_entry_areas()
           setup_cpu_entry_area(cpu)
            get_cpu_gdt_paddr(cpu)
             per_cpu_ptr_to_phys(addr)
              pcpu_addr_to_page(addr)
               virt_to_page(addr)
                pfn_to_page(__pa(addr) >> PAGE_SHIFT)
      
      We disable this path by not allowing NEED_PER_CPU_KM with deferred
      struct pages feature.
      
      The problems are discussed in these threads:
        http://lkml.kernel.org/r/20180418135300.inazvpxjxowogyge@wfg-t540p.sh.intel.com
        http://lkml.kernel.org/r/20180419013128.iurzouiqxvcnpbvz@wfg-t540p.sh.intel.com
        http://lkml.kernel.org/r/20180426202619.2768-1-pasha.tatashin@oracle.com
      
      Link: http://lkml.kernel.org/r/20180515175124.1770-1-pasha.tatashin@oracle.com
      Fixes: 3a80a7fa ("mm: meminit: initialise a subset of struct pages if CONFIG_DEFERRED_STRUCT_PAGE_INIT is set")
      Signed-off-by: default avatarPavel Tatashin <pasha.tatashin@oracle.com>
      Acked-by: default avatarMichal Hocko <mhocko@suse.com>
      Reviewed-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Cc: Steven Sistare <steven.sistare@oracle.com>
      Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Fengguang Wu <fengguang.wu@intel.com>
      Cc: Dennis Zhou <dennisszhou@gmail.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      ab1e8d89
    • Shuah Khan (Samsung OSG)'s avatar
      MAINTAINERS: add Q: entry to kselftest for patchwork project · f3d8d3cf
      Shuah Khan (Samsung OSG) authored
      A new patchwork project is created to track kselftest patches.  Update
      the kselftest entry in the MAINTAINERS file adding 'Q:' entry:
      
        https://patchwork.kernel.org/project/linux-kselftest/list/
      
      Link: http://lkml.kernel.org/r/20180515164427.12201-1-shuah@kernel.orgSigned-off-by: default avatarShuah Khan (Samsung OSG) <shuah@kernel.org>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Linus Walleij <linus.walleij@linaro.org>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Joe Perches <joe@perches.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      f3d8d3cf
    • Ross Zwisler's avatar
      radix tree: fix multi-order iteration race · 9f418224
      Ross Zwisler authored
      Fix a race in the multi-order iteration code which causes the kernel to
      hit a GP fault.  This was first seen with a production v4.15 based
      kernel (4.15.6-300.fc27.x86_64) utilizing a DAX workload which used
      order 9 PMD DAX entries.
      
      The race has to do with how we tear down multi-order sibling entries
      when we are removing an item from the tree.  Remember for example that
      an order 2 entry looks like this:
      
        struct radix_tree_node.slots[] = [entry][sibling][sibling][sibling]
      
      where 'entry' is in some slot in the struct radix_tree_node, and the
      three slots following 'entry' contain sibling pointers which point back
      to 'entry.'
      
      When we delete 'entry' from the tree, we call :
      
        radix_tree_delete()
          radix_tree_delete_item()
            __radix_tree_delete()
              replace_slot()
      
      replace_slot() first removes the siblings in order from the first to the
      last, then at then replaces 'entry' with NULL.  This means that for a
      brief period of time we end up with one or more of the siblings removed,
      so:
      
        struct radix_tree_node.slots[] = [entry][NULL][sibling][sibling]
      
      This causes an issue if you have a reader iterating over the slots in
      the tree via radix_tree_for_each_slot() while only under
      rcu_read_lock()/rcu_read_unlock() protection.  This is a common case in
      mm/filemap.c.
      
      The issue is that when __radix_tree_next_slot() => skip_siblings() tries
      to skip over the sibling entries in the slots, it currently does so with
      an exact match on the slot directly preceding our current slot.
      Normally this works:
      
                                            V preceding slot
        struct radix_tree_node.slots[] = [entry][sibling][sibling][sibling]
                                                    ^ current slot
      
      This lets you find the first sibling, and you skip them all in order.
      
      But in the case where one of the siblings is NULL, that slot is skipped
      and then our sibling detection is interrupted:
      
                                                   V preceding slot
        struct radix_tree_node.slots[] = [entry][NULL][sibling][sibling]
                                                          ^ current slot
      
      This means that the sibling pointers aren't recognized since they point
      all the way back to 'entry', so we think that they are normal internal
      radix tree pointers.  This causes us to think we need to walk down to a
      struct radix_tree_node starting at the address of 'entry'.
      
      In a real running kernel this will crash the thread with a GP fault when
      you try and dereference the slots in your broken node starting at
      'entry'.
      
      We fix this race by fixing the way that skip_siblings() detects sibling
      nodes.  Instead of testing against the preceding slot we instead look
      for siblings via is_sibling_entry() which compares against the position
      of the struct radix_tree_node.slots[] array.  This ensures that sibling
      entries are properly identified, even if they are no longer contiguous
      with the 'entry' they point to.
      
      Link: http://lkml.kernel.org/r/20180503192430.7582-6-ross.zwisler@linux.intel.com
      Fixes: 148deab2 ("radix-tree: improve multiorder iterators")
      Signed-off-by: default avatarRoss Zwisler <ross.zwisler@linux.intel.com>
      Reported-by: default avatarCR, Sapthagirish <sapthagirish.cr@intel.com>
      Reviewed-by: default avatarJan Kara <jack@suse.cz>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      9f418224
    • Ross Zwisler's avatar
      radix tree test suite: multi-order iteration race · fd8f58c4
      Ross Zwisler authored
      Add a test which shows a race in the multi-order iteration code.  This
      test reliably hits the race in under a second on my machine, and is the
      result of a real bug report against kernel a production v4.15 based
      kernel (4.15.6-300.fc27.x86_64).  With a real kernel this issue is hit
      when using order 9 PMD DAX radix tree entries.
      
      The race has to do with how we tear down multi-order sibling entries
      when we are removing an item from the tree.  Remember that an order 2
      entry looks like this:
      
        struct radix_tree_node.slots[] = [entry][sibling][sibling][sibling]
      
      where 'entry' is in some slot in the struct radix_tree_node, and the
      three slots following 'entry' contain sibling pointers which point back
      to 'entry.'
      
      When we delete 'entry' from the tree, we call :
      
        radix_tree_delete()
          radix_tree_delete_item()
            __radix_tree_delete()
              replace_slot()
      
      replace_slot() first removes the siblings in order from the first to the
      last, then at then replaces 'entry' with NULL.  This means that for a
      brief period of time we end up with one or more of the siblings removed,
      so:
      
        struct radix_tree_node.slots[] = [entry][NULL][sibling][sibling]
      
      This causes an issue if you have a reader iterating over the slots in
      the tree via radix_tree_for_each_slot() while only under
      rcu_read_lock()/rcu_read_unlock() protection.  This is a common case in
      mm/filemap.c.
      
      The issue is that when __radix_tree_next_slot() => skip_siblings() tries
      to skip over the sibling entries in the slots, it currently does so with
      an exact match on the slot directly preceding our current slot.
      Normally this works:
      
                                            V preceding slot
        struct radix_tree_node.slots[] = [entry][sibling][sibling][sibling]
                                                    ^ current slot
      
      This lets you find the first sibling, and you skip them all in order.
      
      But in the case where one of the siblings is NULL, that slot is skipped
      and then our sibling detection is interrupted:
      
                                                   V preceding slot
        struct radix_tree_node.slots[] = [entry][NULL][sibling][sibling]
                                                          ^ current slot
      
      This means that the sibling pointers aren't recognized since they point
      all the way back to 'entry', so we think that they are normal internal
      radix tree pointers.  This causes us to think we need to walk down to a
      struct radix_tree_node starting at the address of 'entry'.
      
      In a real running kernel this will crash the thread with a GP fault when
      you try and dereference the slots in your broken node starting at
      'entry'.
      
      In the radix tree test suite this will be caught by the address
      sanitizer:
      
        ==27063==ERROR: AddressSanitizer: heap-buffer-overflow on address
        0x60c0008ae400 at pc 0x00000040ce4f bp 0x7fa89b8fcad0 sp 0x7fa89b8fcac0
        READ of size 8 at 0x60c0008ae400 thread T3
            #0 0x40ce4e in __radix_tree_next_slot /home/rzwisler/project/linux/tools/testing/radix-tree/radix-tree.c:1660
            #1 0x4022cc in radix_tree_next_slot linux/../../../../include/linux/radix-tree.h:567
            #2 0x4022cc in iterator_func /home/rzwisler/project/linux/tools/testing/radix-tree/multiorder.c:655
            #3 0x7fa8a088d50a in start_thread (/lib64/libpthread.so.0+0x750a)
            #4 0x7fa8a03bd16e in clone (/lib64/libc.so.6+0xf516e)
      
      Link: http://lkml.kernel.org/r/20180503192430.7582-5-ross.zwisler@linux.intel.comSigned-off-by: default avatarRoss Zwisler <ross.zwisler@linux.intel.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: CR, Sapthagirish <sapthagirish.cr@intel.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Matthew Wilcox <willy@infradead.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      fd8f58c4
    • Ross Zwisler's avatar
      radix tree test suite: add item_delete_rcu() · 3e252fa7
      Ross Zwisler authored
      Currently the lifetime of "struct item" entries in the radix tree are
      not controlled by RCU, but are instead deleted inline as they are
      removed from the tree.
      
      In the following patches we add a test which has threads iterating over
      items pulled from the tree and verifying them in an
      rcu_read_lock()/rcu_read_unlock() section.  This means that though an
      item has been removed from the tree it could still be being worked on by
      other threads until the RCU grace period expires.  So, we need to
      actually free the "struct item" structures at the end of the grace
      period, just as we do with "struct radix_tree_node" items.
      
      Link: http://lkml.kernel.org/r/20180503192430.7582-4-ross.zwisler@linux.intel.comSigned-off-by: default avatarRoss Zwisler <ross.zwisler@linux.intel.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: CR, Sapthagirish <sapthagirish.cr@intel.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Matthew Wilcox <willy@infradead.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      3e252fa7
    • Ross Zwisler's avatar
      radix tree test suite: fix compilation issue · dcbbf25a
      Ross Zwisler authored
      Pulled from a patch from Matthew Wilcox entitled "xarray: Add definition
      of struct xarray":
      
      > From: Matthew Wilcox <mawilcox@microsoft.com>
      > Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
      
        https://patchwork.kernel.org/patch/10341249/
      
      These defines fix this compilation error:
      
        In file included from ./linux/radix-tree.h:6:0,
                         from ./linux/../../../../include/linux/idr.h:15,
                         from ./linux/idr.h:1,
                         from idr.c:4:
        ./linux/../../../../include/linux/idr.h: In function `idr_init_base':
        ./linux/../../../../include/linux/radix-tree.h:129:2: warning: implicit declaration of function `spin_lock_init'; did you mean `spinlock_t'? [-Wimplicit-function-declaration]
          spin_lock_init(&(root)->xa_lock);    \
          ^
        ./linux/../../../../include/linux/idr.h:126:2: note: in expansion of macro `INIT_RADIX_TREE'
          INIT_RADIX_TREE(&idr->idr_rt, IDR_RT_MARKER);
          ^~~~~~~~~~~~~~~
      
      by providing a spin_lock_init() wrapper for the v4.17-rc* version of the
      radix tree test suite.
      
      Link: http://lkml.kernel.org/r/20180503192430.7582-3-ross.zwisler@linux.intel.comSigned-off-by: default avatarRoss Zwisler <ross.zwisler@linux.intel.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: CR, Sapthagirish <sapthagirish.cr@intel.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Matthew Wilcox <willy@infradead.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      dcbbf25a
    • Ross Zwisler's avatar
      radix tree test suite: fix mapshift build target · 8d9fa88e
      Ross Zwisler authored
      Commit c6ce3e2f ("radix tree test suite: Add config option for map
      shift") introduced a phony makefile target called 'mapshift' that ends
      up generating the file generated/map-shift.h.  This phony target was
      then added as a dependency of the top level 'targets' build target,
      which is what is run when you go to tools/testing/radix-tree and just
      type 'make'.
      
      Unfortunately, this phony target doesn't actually work as a dependency,
      so you end up getting:
      
        $ make
        make: *** No rule to make target 'generated/map-shift.h', needed by 'main.o'.  Stop.
        make: *** Waiting for unfinished jobs....
      
      Fix this by making the file generated/map-shift.h our real makefile
      target, and add this a dependency of the top level build target.
      
      Link: http://lkml.kernel.org/r/20180503192430.7582-2-ross.zwisler@linux.intel.comSigned-off-by: default avatarRoss Zwisler <ross.zwisler@linux.intel.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: CR, Sapthagirish <sapthagirish.cr@intel.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Matthew Wilcox <willy@infradead.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      8d9fa88e
    • Souptick Joarder's avatar
      include/linux/mm.h: add new inline function vmf_error() · d97baf94
      Souptick Joarder authored
      Many places in drivers/ file systems, error was handled in a common way
      like below:
      
      	ret = (ret == -ENOMEM) ? VM_FAULT_OOM : VM_FAULT_SIGBUS;
      
      vmf_error() will replace this and return vm_fault_t type err.
      
      A lot of drivers and filesystems currently have a rather complex mapping
      of errno-to-VM_FAULT code.  We have been able to eliminate a lot of it
      by just returning VM_FAULT codes directly from functions which are
      called exclusively from the fault handling path.
      
      Some functions can be called both from the fault handler and other
      context which are expecting an errno, so they have to continue to return
      an errno.  Some users still need to choose different behaviour for
      different errnos, but vmf_error() captures the essential error
      translation that's common to all users, and those that need to handle
      additional errors can handle them first.
      
      Link: http://lkml.kernel.org/r/20180510174826.GA14268@jordon-HP-15-Notebook-PCSigned-off-by: default avatarSouptick Joarder <jrdr.linux@gmail.com>
      Reviewed-by: default avatarMatthew Wilcox <mawilcox@microsoft.com>
      Reviewed-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      d97baf94
    • Matthew Wilcox's avatar
      lib/test_bitmap.c: fix bitmap optimisation tests to report errors correctly · 1e3054b9
      Matthew Wilcox authored
      I had neglected to increment the error counter when the tests failed,
      which made the tests noisy when they fail, but not actually return an
      error code.
      
      Link: http://lkml.kernel.org/r/20180509114328.9887-1-mpe@ellerman.id.au
      Fixes: 3cc78125 ("lib/test_bitmap.c: add optimisation tests")
      Signed-off-by: default avatarMatthew Wilcox <mawilcox@microsoft.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Reported-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Tested-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Reviewed-by: default avatarKees Cook <keescook@chromium.org>
      Cc: Yury Norov <ynorov@caviumnetworks.com>
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: <stable@vger.kernel.org>	[4.13+]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      1e3054b9
  3. 18 May, 2018 13 commits