1. 05 Oct, 2015 3 commits
    • Yang Shi's avatar
      arm64: convert patch_lock to raw lock · abffa6f3
      Yang Shi authored
      When running kprobe test on arm64 rt kernel, it reports the below warning:
      
      root@qemu7:~# modprobe kprobe_example
      BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:917
      in_atomic(): 0, irqs_disabled(): 128, pid: 484, name: modprobe
      CPU: 0 PID: 484 Comm: modprobe Not tainted 4.1.6-rt5 #2
      Hardware name: linux,dummy-virt (DT)
      Call trace:
      [<ffffffc0000891b8>] dump_backtrace+0x0/0x128
      [<ffffffc000089300>] show_stack+0x20/0x30
      [<ffffffc00061dae8>] dump_stack+0x1c/0x28
      [<ffffffc0000bbad0>] ___might_sleep+0x120/0x198
      [<ffffffc0006223e8>] rt_spin_lock+0x28/0x40
      [<ffffffc000622b30>] __aarch64_insn_write+0x28/0x78
      [<ffffffc000622e48>] aarch64_insn_patch_text_nosync+0x18/0x48
      [<ffffffc000622ee8>] aarch64_insn_patch_text_cb+0x70/0xa0
      [<ffffffc000622f40>] aarch64_insn_patch_text_sync+0x28/0x48
      [<ffffffc0006236e0>] arch_arm_kprobe+0x38/0x48
      [<ffffffc00010e6f4>] arm_kprobe+0x34/0x50
      [<ffffffc000110374>] register_kprobe+0x4cc/0x5b8
      [<ffffffbffc002038>] kprobe_init+0x38/0x7c [kprobe_example]
      [<ffffffc000084240>] do_one_initcall+0x90/0x1b0
      [<ffffffc00061c498>] do_init_module+0x6c/0x1cc
      [<ffffffc0000fd0c0>] load_module+0x17f8/0x1db0
      [<ffffffc0000fd8cc>] SyS_finit_module+0xb4/0xc8
      
      Convert patch_lock to raw loc kto avoid this issue.
      
      Although the problem is found on rt kernel, the fix should be applicable to
      mainline kernel too.
      Acked-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: default avatarYang Shi <yang.shi@linaro.org>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      abffa6f3
    • Mark Salyzyn's avatar
      arm64: readahead: fault retry breaks mmap file read random detection · 569ba74a
      Mark Salyzyn authored
      This is the arm64 portion of commit 45cac65b ("readahead: fault
      retry breaks mmap file read random detection"), which was absent from
      the initial port and has since gone unnoticed. The original commit says:
      
      > .fault now can retry.  The retry can break state machine of .fault.  In
      > filemap_fault, if page is miss, ra->mmap_miss is increased.  In the second
      > try, since the page is in page cache now, ra->mmap_miss is decreased.  And
      > these are done in one fault, so we can't detect random mmap file access.
      >
      > Add a new flag to indicate .fault is tried once.  In the second try, skip
      > ra->mmap_miss decreasing.  The filemap_fault state machine is ok with it.
      
      With this change, Mark reports that:
      
      > Random read improves by 250%, sequential read improves by 40%, and
      > random write by 400% to an eMMC device with dm crypto wrapped around it.
      
      Cc: Shaohua Li <shli@kernel.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarMark Salyzyn <salyzyn@android.com>
      Signed-off-by: default avatarRiley Andrews <riandrews@android.com>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      569ba74a
    • Yang Shi's avatar
      arm64: debug: Fix typo in debug-monitors.c · 95485fdc
      Yang Shi authored
      Fix comment typo: s/handers/handlers/
      Signed-off-by: default avatarYang Shi <yang.shi@linaro.org>
      Signed-off-by: default avatarWill Deacon <will.deacon@arm.com>
      95485fdc
  2. 04 Oct, 2015 6 commits
    • Linus Torvalds's avatar
      Linux 4.3-rc4 · 049e6dde
      Linus Torvalds authored
      049e6dde
    • Linus Torvalds's avatar
      Merge branch 'strscpy' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile · 30c44659
      Linus Torvalds authored
      Pull strscpy string copy function implementation from Chris Metcalf.
      
      Chris sent this during the merge window, but I waffled back and forth on
      the pull request, which is why it's going in only now.
      
      The new "strscpy()" function is definitely easier to use and more secure
      than either strncpy() or strlcpy(), both of which are horrible nasty
      interfaces that have serious and irredeemable problems.
      
      strncpy() has a useless return value, and doesn't NUL-terminate an
      overlong result.  To make matters worse, it pads a short result with
      zeroes, which is a performance disaster if you have big buffers.
      
      strlcpy(), by contrast, is a mis-designed "fix" for strlcpy(), lacking
      the insane NUL padding, but having a differently broken return value
      which returns the original length of the source string.  Which means
      that it will read characters past the count from the source buffer, and
      you have to trust the source to be properly terminated.  It also makes
      error handling fragile, since the test for overflow is unnecessarily
      subtle.
      
      strscpy() avoids both these problems, guaranteeing the NUL termination
      (but not excessive padding) if the destination size wasn't zero, and
      making the overflow condition very obvious by returning -E2BIG.  It also
      doesn't read past the size of the source, and can thus be used for
      untrusted source data too.
      
      So why did I waffle about this for so long?
      
      Every time we introduce a new-and-improved interface, people start doing
      these interminable series of trivial conversion patches.
      
      And every time that happens, somebody does some silly mistake, and the
      conversion patch to the improved interface actually makes things worse.
      Because the patch is mindnumbing and trivial, nobody has the attention
      span to look at it carefully, and it's usually done over large swatches
      of source code which means that not every conversion gets tested.
      
      So I'm pulling the strscpy() support because it *is* a better interface.
      But I will refuse to pull mindless conversion patches.  Use this in
      places where it makes sense, but don't do trivial patches to fix things
      that aren't actually known to be broken.
      
      * 'strscpy' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
        tile: use global strscpy() rather than private copy
        string: provide strscpy()
        Make asm/word-at-a-time.h available on all architectures
      30c44659
    • Linus Torvalds's avatar
      Merge tag 'md/4.3-fixes' of git://neil.brown.name/md · 15ecf9a9
      Linus Torvalds authored
      Pull md fixes from Neil Brown:
       "Assorted fixes for md in 4.3-rc.
      
        Two tagged for -stable, and one is really a cleanup to match and
        improve kmemcache interface.
      
      * tag 'md/4.3-fixes' of git://neil.brown.name/md:
        md/bitmap: don't pass -1 to bitmap_storage_alloc.
        md/raid1: Avoid raid1 resync getting stuck
        md: drop null test before destroy functions
        md: clear CHANGE_PENDING in readonly array
        md/raid0: apply base queue limits *before* disk_stack_limits
        md/raid5: don't index beyond end of array in need_this_block().
        raid5: update analysis state for failed stripe
        md: wait for pending superblock updates before switching to read-only
      15ecf9a9
    • Linus Torvalds's avatar
      Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus · 0d877081
      Linus Torvalds authored
      Pull MIPS updates from Ralf Baechle:
       "This week's round of MIPS fixes:
         - Fix JZ4740 build
         - Fix fallback to GFP_DMA
         - FP seccomp in case of ENOSYS
         - Fix bootmem panic
         - A number of FP and CPS fixes
         - Wire up new syscalls
         - Make sure BPF assembler objects can properly be disassembled
         - Fix BPF assembler code for MIPS I"
      
      * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
        MIPS: scall: Always run the seccomp syscall filters
        MIPS: Octeon: Fix kernel panic on startup from memory corruption
        MIPS: Fix R2300 FP context switch handling
        MIPS: Fix octeon FP context switch handling
        MIPS: BPF: Fix load delay slots.
        MIPS: BPF: Do all exports of symbols with FEXPORT().
        MIPS: Fix the build on jz4740 after removing the custom gpio.h
        MIPS: CPS: #ifdef on CONFIG_MIPS_MT_SMP rather than CONFIG_MIPS_MT
        MIPS: CPS: Don't include MT code in non-MT kernels.
        MIPS: CPS: Stop dangling delay slot from has_mt.
        MIPS: dma-default: Fix 32-bit fall back to GFP_DMA
        MIPS: Wire up userfaultfd and membarrier syscalls.
      0d877081
    • Linus Torvalds's avatar
      Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 3e519dde
      Linus Torvalds authored
      Pull irq fixes from Thomas Gleixner:
       "This update contains:
      
         - Fix for a long standing race affecting /proc/irq/NNN
      
         - One line fix for ARM GICV3-ITS counting the wrong data
      
         - Warning silencing in ARM GICV3-ITS.  Another GCC trying to be
           overly clever issue"
      
      * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        irqchip/gic-v3-its: Count additional LPIs for the aliased devices
        irqchip/gic-v3-its: Silence warning when its_lpi_alloc_chunks gets inlined
        genirq: Fix race in register_irq_proc()
      3e519dde
    • Markos Chandras's avatar
      MIPS: scall: Always run the seccomp syscall filters · d218af78
      Markos Chandras authored
      The MIPS syscall handler code used to return -ENOSYS on invalid
      syscalls. Whilst this is expected, it caused problems for seccomp
      filters because the said filters never had the change to run since
      the code returned -ENOSYS before triggering them. This caused
      problems on the chromium testsuite for filters looking for invalid
      syscalls. This has now changed and the seccomp filters are always
      run even if the syscall is invalid. We return -ENOSYS once we
      return from the seccomp filters. Moreover, similar codepaths have
      been merged in the process which simplifies somewhat the overall
      syscall code.
      Signed-off-by: default avatarMarkos Chandras <markos.chandras@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/11236/Signed-off-by: default avatarRalf Baechle <ralf@linux-mips.org>
      d218af78
  3. 03 Oct, 2015 4 commits
    • Linus Torvalds's avatar
      Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 2cf30826
      Linus Torvalds authored
      Pull x86 fixes from Ingo Molnar:
       "Fixes all around the map: W+X kernel mapping fix, WCHAN fixes, two
        build failure fixes for corner case configs, x32 header fix and a
        speling fix"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/headers/uapi: Fix __BITS_PER_LONG value for x32 builds
        x86/mm: Set NX on gap between __ex_table and rodata
        x86/kexec: Fix kexec crash in syscall kexec_file_load()
        x86/process: Unify 32bit and 64bit implementations of get_wchan()
        x86/process: Add proper bound checks in 64bit get_wchan()
        x86, efi, kasan: Fix build failure on !KASAN && KMEMCHECK=y kernels
        x86/hyperv: Fix the build in the !CONFIG_KEXEC_CORE case
        x86/cpufeatures: Correct spelling of the HWP_NOTIFY flag
      2cf30826
    • Linus Torvalds's avatar
      Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 37cc7ab1
      Linus Torvalds authored
      Pull timer fixes from Ingo Molnar:
       "An abs64() fix in the watchdog driver, and two clocksource driver
        NO_IRQ assumption fixes"
      
      * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        clocksource: Fix abs() usage w/ 64bit values
        clocksource/drivers/keystone: Fix bad NO_IRQ usage
        clocksource/drivers/rockchip: Fix bad NO_IRQ usage
      37cc7ab1
    • Linus Torvalds's avatar
      Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · a758379b
      Linus Torvalds authored
      Pull EFI fixes from Ingo Molnar:
       "Two EFI fixes: one for x86, one for ARM, fixing a boot crash bug that
        can trigger under newer EFI firmware"
      
      * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        arm64/efi: Fix boot crash by not padding between EFI_MEMORY_RUNTIME regions
        x86/efi: Fix boot crash by mapping EFI memmap entries bottom-up at runtime, instead of top-down
      a758379b
    • Linus Torvalds's avatar
      Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux · 14f97d97
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Bunch of fixes all over the place, all pretty small: amdgpu, i915,
        exynos, one qxl and one vmwgfx.
      
        There is also a bunch of mst fixes, I left some cleanups in the series
        as I didn't think it was worth splitting up the tested series"
      
      * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (37 commits)
        drm/dp/mst: add some defines for logical/physical ports
        drm/dp/mst: drop cancel work sync in the mstb destroy path (v2)
        drm/dp/mst: split connector registration into two parts (v2)
        drm/dp/mst: update the link_address_sent before sending the link address (v3)
        drm/dp/mst: fixup handling hotplug on port removal.
        drm/dp/mst: don't pass port into the path builder function
        drm/radeon: drop radeon_fb_helper_set_par
        drm: handle cursor_set2 in restore_fbdev_mode
        drm/exynos: Staticize local function in exynos_drm_gem.c
        drm/exynos: fimd: actually disable dp clock
        drm/exynos: dp: remove suspend/resume functions
        drm/qxl: recreate the primary surface when the bo is not primary
        drm/amdgpu: only print meaningful VM faults
        drm/amdgpu/cgs: remove import_gpu_mem
        drm/i915: Call non-locking version of drm_kms_helper_poll_enable(), v2
        drm: Add a non-locking version of drm_kms_helper_poll_enable(), v2
        drm/vmwgfx: Fix a command submission hang regression
        drm/exynos: remove unused mode_fixup() code
        drm/exynos: remove decon_mode_fixup()
        drm/exynos: remove fimd_mode_fixup()
        ...
      14f97d97
  4. 02 Oct, 2015 27 commits
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input · 978ab6a0
      Linus Torvalds authored
      Pull input layer fixes from Dmitry Torokhov:
       "Fixes for two recent regressions (in Synaptics PS/2 and uinput
        drivers) and some more driver fixups"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
        Revert "Input: synaptics - fix handling of disabling gesture mode"
        Input: psmouse - fix data race in __ps2_command
        Input: elan_i2c - add all valid ic type for i2c/smbus
        Input: zhenhua - ensure we have BITREVERSE
        Input: omap4-keypad - fix memory leak
        Input: serio - fix blocking of parport
        Input: uinput - fix crash when using ABS events
        Input: elan_i2c - expand maximum product_id form 0xFF to 0xFFFF
        Input: elan_i2c - add ic type 0x03
        Input: elan_i2c - don't require known iap version
        Input: imx6ul_tsc - fix controller name
        Input: imx6ul_tsc - use the preferred method for kzalloc()
        Input: imx6ul_tsc - check for negative return value
        Input: imx6ul_tsc - propagate the errors
        Input: walkera0701 - fix abs() calculations on 64 bit values
        Input: mms114 - remove unneded semicolons
        Input: pm8941-pwrkey - remove unneded semicolon
        Input: fix typo in MT documentation
        Input: cyapa - fix address of Gen3 devices in device tree documentation
      978ab6a0
    • John Stultz's avatar
      clocksource: Fix abs() usage w/ 64bit values · 67dfae0c
      John Stultz authored
      This patch fixes one cases where abs() was being used with 64-bit
      nanosecond values, where the result may be capped at 32-bits.
      
      This potentially could cause watchdog false negatives on 32-bit
      systems, so this patch addresses the issue by using abs64().
      Signed-off-by: default avatarJohn Stultz <john.stultz@linaro.org>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/1442279124-7309-2-git-send-email-john.stultz@linaro.orgSigned-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      67dfae0c
    • Linus Torvalds's avatar
      Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · 5634347d
      Linus Torvalds authored
      Pull arm64 fixes from Catalin Marinas:
      
       - Fix for transparent huge page change_protection() logic which was
         inadvertently changing a huge pmd page into a pmd table entry.
      
       - Function graph tracer panic fix caused by the return_to_handler code
         corrupting the multi-regs function return value (composite types).
      
      * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
        arm64: ftrace: fix function_graph tracer panic
        arm64: Fix THP protection change logic
      5634347d
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k · b55a97e7
      Linus Torvalds authored
      Pull m68k updates from Geert Uytterhoeven:
       "Summary:
         - Fix for accidental modification of arguments of syscall functions
         - Wire up new syscalls
         - Update defconfigs"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
        m68k/defconfig: Update defconfigs for v4.3-rc1
        m68k: Define asmlinkage_protect
        m68k: Wire up membarrier
        m68k: Wire up userfaultfd
        m68k: Wire up direct socket calls
      b55a97e7
    • Marc Zyngier's avatar
      irqchip/gic-v3-its: Count additional LPIs for the aliased devices · 791c76d5
      Marc Zyngier authored
      When configuring the interrupt mapping for a new device, we
      iterate over all the possible aliases to account for their
      maximum MSI allocation. This was introduced by e8137f4f
      ("irqchip: gicv3-its: Iterate over PCI aliases to generate ITS configuration").
      
      Turns out that the code doing that is a bit braindead, and repeatedly
      accounts for the same device over and over.
      
      Fix this by counting the actual alias that is passed to us by the
      core code.
      Signed-off-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: Alex Shi <alex.shi@linaro.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: David Daney <ddaney.cavm@gmail.com>
      Cc: Jason Cooper <jason@lakedaemon.net>
      Link: http://lkml.kernel.org/r/1443800646-8074-3-git-send-email-marc.zyngier@arm.comSigned-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      791c76d5
    • Marc Zyngier's avatar
      irqchip/gic-v3-its: Silence warning when its_lpi_alloc_chunks gets inlined · c8415b94
      Marc Zyngier authored
      More agressive inlining in recent versions of GCC have uncovered
      a new set of warnings:
      
       drivers/irqchip/irq-gic-v3-its.c: In function its_msi_prepare:
        drivers/irqchip/irq-gic-v3-its.c:1148:26: warning: lpi_base may be used
          uninitialized in this function [-Wmaybe-uninitialized]
           dev->event_map.lpi_base = lpi_base;
                                ^
       drivers/irqchip/irq-gic-v3-its.c:1116:6: note: lpi_base was declared here
        int lpi_base;
      	      ^
       drivers/irqchip/irq-gic-v3-its.c:1149:25: warning: nr_lpis may be used
        uninitialized in this function [-Wmaybe-uninitialized]
         dev->event_map.nr_lpis = nr_lpis;
      	                         ^
       drivers/irqchip/irq-gic-v3-its.c:1117:6: note: nr_lpis was declared here
        int nr_lpis;
      	      ^
      The warning is fairly benign (there is no code path that could
      actually use uninitialized variables), but let's silence it anyway
      by zeroing the variables on the error path.
      Reported-by: default avatarAlex Shi <alex.shi@linaro.org>
      Tested-by: default avatarArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: David Daney <ddaney.cavm@gmail.com>
      Cc: Jason Cooper <jason@lakedaemon.net>
      Link: http://lkml.kernel.org/r/1443800646-8074-2-git-send-email-marc.zyngier@arm.comSigned-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      c8415b94
    • Linus Torvalds's avatar
      Merge tag 'dmaengine-fix-4.3-rc4' of git://git.infradead.org/users/vkoul/slave-dma · 83dc311c
      Linus Torvalds authored
      Pull dmaengine fixes from Vinod Koul:
       "This contains fixes spread throughout the drivers, and also fixes one
        more instance of privatecnt in dmaengine.
      
        Driver fixes summary:
         - bunch of pxa_dma fixes for reuse of descriptor issue, residue and
           no-requestor
         - odd fixes in xgene, idma, sun4i and zxdma
         - at_xdmac fixes for cleaning descriptor and block addr mode"
      
      * tag 'dmaengine-fix-4.3-rc4' of git://git.infradead.org/users/vkoul/slave-dma:
        dmaengine: pxa_dma: fix residue corner case
        dmaengine: pxa_dma: fix the no-requestor case
        dmaengine: zxdma: Fix off-by-one for testing valid pchan request
        dmaengine: at_xdmac: clean used descriptor
        dmaengine: at_xdmac: change block increment addressing mode
        dmaengine: dw: properly read DWC_PARAMS register
        dmaengine: xgene-dma: Fix overwritting DMA tx ring
        dmaengine: fix balance of privatecnt
        dmaengine: sun4i: fix unsafe list iteration
        dmaengine: idma64: improve residue estimation
        dmaengine: xgene-dma: fix handling xgene_dma_get_ring_size result
        dmaengine: pxa_dma: fix initial list move
      83dc311c
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.dk/linux-block · 27728bf0
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
       "Another week, another round of fixes.
      
        These have been brewing for a bit and in various iterations, but I
        feel pretty comfortable about the quality of them.  They fix real
        issues.  The pull request is mostly blk-mq related, and the only one
        not fixing a real bug, is the tag iterator abstraction from Christoph.
        But it's pretty trivial, and we'll need it for another fix soon.
      
        Apart from the blk-mq fixes, there's an NVMe affinity fix from Keith,
        and a single fix for xen-blkback from Roger fixing failure to free
        requests on disconnect"
      
      * 'for-linus' of git://git.kernel.dk/linux-block:
        blk-mq: factor out a helper to iterate all tags for a request_queue
        blk-mq: fix racy updates of rq->errors
        blk-mq: fix deadlock when reading cpu_list
        blk-mq: avoid inserting requests before establishing new mapping
        blk-mq: fix q->mq_usage_counter access race
        blk-mq: Fix use after of free q->mq_map
        blk-mq: fix sysfs registration/unregistration race
        blk-mq: avoid setting hctx->tags->cpumask before allocation
        NVMe: Set affinity after allocating request queues
        xen/blkback: free requests on disconnection
      27728bf0
    • Dmitry Torokhov's avatar
      Revert "Input: synaptics - fix handling of disabling gesture mode" · 62d78461
      Dmitry Torokhov authored
      This reverts commit e51e3849: we
      actually do want the device to work in extended W mode, as this is the
      mode that allows us receiving multiple contact information.
      
      Cc: stable@vger.kernel.org
      62d78461
    • Matt Bennett's avatar
      MIPS: Octeon: Fix kernel panic on startup from memory corruption · 66803dd9
      Matt Bennett authored
      During development it was found that a number of builds would panic
      during the kernel init process, more specifically in 'delayed_fput()'.
      The panic showed the kernel trying to access a memory address of
      '0xb7fdc00' while traversing the 'delayed_fput_list' structure.
      Comparing this memory address to the value of the pointer used on
      builds that did not panic confirmed that the pointer on crashing
      builds must have been corrupted at some stage earlier in the init
      process.
      
      By traversing the list earlier and earlier in the code it was found
      that 'plat_mem_setup()' was responsible for corrupting the list.
      Specifically the line:
      
          memory = cvmx_bootmem_phy_alloc(mem_alloc_size,
      			__pa_symbol(&__init_end), -1,
      			0x100000,
      			CVMX_BOOTMEM_FLAG_NO_LOCKING);
      
      Which would eventually call:
      
          cvmx_bootmem_phy_set_size(new_ent_addr,
      		cvmx_bootmem_phy_get_size
      		(ent_addr) -
      		(desired_min_addr -
      			ent_addr));
      
      Where 'new_ent_addr'=0x4800000 (the address of 'delayed_fput_list')
      and the second argument (size)=0xb7fdc00 (the address causing the
      kernel panic). The job of this part of 'plat_mem_setup()' is to
      allocate chunks of memory for the kernel to use. At the start of
      each chunk of memory the size of the chunk is written, hence the
      value 0xb7fdc00 is written onto memory at 0x4800000, therefore the
      kernel panics when it goes back to access 'delayed_fput_list' later
      on in the initialisation process.
      
      On builds that were not crashing it was found that the compiler had
      placed 'delayed_fput_list' at 0x4800008, meaning it wasn't corrupted
      (but something else in memory was overwritten).
      
      As can be seen in the first function call above the code begins to
      allocate chunks of memory beginning from the symbol '__init_end'.
      The MIPS linker script (vmlinux.lds.S) however defines the .bss
      section to begin after '__init_end'. Therefore memory within the
      .bss section is allocated to the kernel to use (System.map shows
      'delayed_fput_list' and other kernel structures to be in .bss).
      
      To stop the kernel panic (and the .bss section being corrupted)
      memory should begin being allocated from the symbol '_end'.
      Signed-off-by: default avatarMatt Bennett <matt.bennett@alliedtelesis.co.nz>
      Acked-by: default avatarDavid Daney <david.daney@cavium.com>
      Cc: linux-mips@linux-mips.org
      Cc: aleksey.makarov@auriga.com
      Patchwork: https://patchwork.linux-mips.org/patch/11251/Signed-off-by: default avatarRalf Baechle <ralf@linux-mips.org>
      66803dd9
    • Paul Burton's avatar
      MIPS: Fix R2300 FP context switch handling · 085c2f25
      Paul Burton authored
      Commit 1a3d5957 ("MIPS: Tidy up FPU context switching") removed FP
      context saving from the asm-written resume function in favour of reusing
      existing code to perform the same task. However it only removed the FP
      context saving code from the r4k_switch.S implementation of resume.
      Remove it from the r2300_switch.S implementation too in order to prevent
      attempting to save the FP context twice, which would likely lead to an
      exception from the second save because the FPU had already been disabled
      by the first save.
      
      This patch has only been build tested, using rbtx49xx_defconfig.
      
      Fixes: 1a3d5957 ("MIPS: Tidy up FPU context switching")
      Signed-off-by: default avatarPaul Burton <paul.burton@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Cc: Maciej W. Rozycki <macro@linux-mips.org>
      Cc: linux-kernel@vger.kernel.org
      Cc: Manuel Lauss <manuel.lauss@gmail.com>
      Patchwork: https://patchwork.linux-mips.org/patch/11167/Signed-off-by: default avatarRalf Baechle <ralf@linux-mips.org>
      085c2f25
    • Paul Burton's avatar
      MIPS: Fix octeon FP context switch handling · 0fa24340
      Paul Burton authored
      Commit 1a3d5957 ("MIPS: Tidy up FPU context switching") removed FP
      context saving from the asm-written resume function in favour of reusing
      existing code to perform the same task. However it only removed the FP
      context saving code from the r4k_switch.S implementation of resume.
      Octeon uses its own implementation in octeon_switch.S, so remove FP
      context saving there too in order to prevent attempting to save context
      twice. That formerly led to an exception from the second save as follows
      because the FPU had already been disabled by the first save:
      
          do_cpu invoked from kernel context![#1]:
          CPU: 0 PID: 2 Comm: kthreadd Not tainted 4.3.0-rc2-dirty #2
          task: 800000041f84a008 ti: 800000041f864000 task.ti: 800000041f864000
          $ 0   : 0000000000000000 0000000010008ce1 0000000000100000 ffffffffbfffffff
          $ 4   : 800000041f84a008 800000041f84ac08 800000041f84c000 0000000000000004
          $ 8   : 0000000000000001 0000000000000000 0000000000000000 0000000000000001
          $12   : 0000000010008ce3 0000000000119c60 0000000000000036 800000041f864000
          $16   : 800000041f84ac08 800000000792ce80 800000041f84a008 ffffffff81758b00
          $20   : 0000000000000000 ffffffff8175ae50 0000000000000000 ffffffff8176c740
          $24   : 0000000000000006 ffffffff81170300
          $28   : 800000041f864000 800000041f867d90 0000000000000000 ffffffff815f3fa0
          Hi    : 0000000000fa8257
          Lo    : ffffffffe15cfc00
          epc   : ffffffff8112821c resume+0x9c/0x200
          ra    : ffffffff815f3fa0 __schedule+0x3f0/0x7d8
          Status: 10008ce2        KX SX UX KERNEL EXL
          Cause : 1080002c (ExcCode 0b)
          PrId  : 000d0601 (Cavium Octeon+)
          Modules linked in:
          Process kthreadd (pid: 2, threadinfo=800000041f864000, task=800000041f84a008, tls=0000000000000000)
          Stack : ffffffff81604218 ffffffff815f7e08 800000041f84a008 ffffffff811681b0
                    800000041f84a008 ffffffff817e9878 0000000000000000 ffffffff81770000
                    ffffffff81768340 ffffffff81161398 0000000000000001 0000000000000000
                    0000000000000000 ffffffff815f4424 0000000000000000 ffffffff81161d68
                    ffffffff81161be8 0000000000000000 0000000000000000 0000000000000000
                    0000000000000000 0000000000000000 0000000000000000 ffffffff8111e16c
                    0000000000000000 0000000000000000 0000000000000000 0000000000000000
                    0000000000000000 0000000000000000 0000000000000000 0000000000000000
                    0000000000000000 0000000000000000 0000000000000000 0000000000000000
                    0000000000000000 0000000000000000 0000000000000000 0000000000000000
                    ...
          Call Trace:
          [<ffffffff8112821c>] resume+0x9c/0x200
          [<ffffffff815f3fa0>] __schedule+0x3f0/0x7d8
          [<ffffffff815f4424>] schedule+0x34/0x98
          [<ffffffff81161d68>] kthreadd+0x180/0x198
          [<ffffffff8111e16c>] ret_from_kernel_thread+0x14/0x1c
      
      Tested using cavium_octeon_defconfig on an EdgeRouter Lite.
      
      Fixes: 1a3d5957 ("MIPS: Tidy up FPU context switching")
      Reported-by: default avatarAaro Koskinen <aaro.koskinen@nokia.com>
      Signed-off-by: default avatarPaul Burton <paul.burton@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Cc: Aleksey Makarov <aleksey.makarov@auriga.com>
      Cc: linux-kernel@vger.kernel.org
      Cc: Chandrakala Chavva <cchavva@caviumnetworks.com>
      Cc: David Daney <david.daney@cavium.com>
      Cc: Leonid Rosenboim <lrosenboim@caviumnetworks.com>
      Patchwork: https://patchwork.linux-mips.org/patch/11166/Signed-off-by: default avatarRalf Baechle <ralf@linux-mips.org>
      0fa24340
    • Linus Torvalds's avatar
      Merge tag 'mmc-v4.3-rc3' of git://git.linaro.org/people/ulf.hansson/mmc · 36f8dafe
      Linus Torvalds authored
      Pull MMC fixes from Ulf Hansson:
       "Here are some mmc fixes intended for v4.3 rc4:
      
        MMC core:
         - Allow users of mmc_of_parse() to succeed when CONFIG_GPIOLIB is
           unset
         - Prevent infinite loop of re-tuning for CRC-errors for CMD19 and
           CMD21
      
         MMC host:
         - pxamci: Fix issues with card detect
         - sunxi: Fix clk-delay settings"
      
      * tag 'mmc-v4.3-rc3' of git://git.linaro.org/people/ulf.hansson/mmc:
        mmc: core: fix dead loop of mmc_retune
        mmc: pxamci: fix card detect with slot-gpio API
        mmc: sunxi: Fix clk-delay settings
        mmc: core: Don't return an error for CD/WP GPIOs when GPIOLIB is unset
      36f8dafe
    • Linus Torvalds's avatar
      Merge git://git.infradead.org/intel-iommu · 8c25ab8b
      Linus Torvalds authored
      Pull IOVA fixes from David Woodhouse:
       "The main fix here is the first one, fixing the over-allocation of
         size-aligned requests.  The other patches simply make the existing
        IOVA code available to users other than the Intel VT-d driver, with no
        functional change.
      
        I concede the latter really *should* have been submitted during the
        merge window, but since it's basically risk-free and people are
        waiting to build on top of it and it's my fault I didn't get it in, I
        (and they) would be grateful if you'd take it"
      
      * git://git.infradead.org/intel-iommu:
        iommu: Make the iova library a module
        iommu: iova: Export symbols
        iommu: iova: Move iova cache management to the iova library
        iommu/iova: Avoid over-allocating when size-aligned
      8c25ab8b
    • Li Bin's avatar
      arm64: ftrace: fix function_graph tracer panic · ee556d00
      Li Bin authored
      When function graph tracer is enabled, the following operation
      will trigger panic:
      
      mount -t debugfs nodev /sys/kernel
      echo next_tgid > /sys/kernel/tracing/set_ftrace_filter
      echo function_graph > /sys/kernel/tracing/current_tracer
      ls /proc/
      
      ------------[ cut here ]------------
      [  198.501417] Unable to handle kernel paging request at virtual address cb88537fdc8ba316
      [  198.506126] pgd = ffffffc008f79000
      [  198.509363] [cb88537fdc8ba316] *pgd=00000000488c6003, *pud=00000000488c6003, *pmd=0000000000000000
      [  198.517726] Internal error: Oops: 94000005 [#1] SMP
      [  198.518798] Modules linked in:
      [  198.520582] CPU: 1 PID: 1388 Comm: ls Tainted: G
      [  198.521800] Hardware name: linux,dummy-virt (DT)
      [  198.522852] task: ffffffc0fa9e8000 ti: ffffffc0f9ab0000 task.ti: ffffffc0f9ab0000
      [  198.524306] PC is at next_tgid+0x30/0x100
      [  198.525205] LR is at return_to_handler+0x0/0x20
      [  198.526090] pc : [<ffffffc0002a1070>] lr : [<ffffffc0000907c0>] pstate: 60000145
      [  198.527392] sp : ffffffc0f9ab3d40
      [  198.528084] x29: ffffffc0f9ab3d40 x28: ffffffc0f9ab0000
      [  198.529406] x27: ffffffc000d6a000 x26: ffffffc000b786e8
      [  198.530659] x25: ffffffc0002a1900 x24: ffffffc0faf16c00
      [  198.531942] x23: ffffffc0f9ab3ea0 x22: 0000000000000002
      [  198.533202] x21: ffffffc000d85050 x20: 0000000000000002
      [  198.534446] x19: 0000000000000002 x18: 0000000000000000
      [  198.535719] x17: 000000000049fa08 x16: ffffffc000242efc
      [  198.537030] x15: 0000007fa472b54c x14: ffffffffff000000
      [  198.538347] x13: ffffffc0fada84a0 x12: 0000000000000001
      [  198.539634] x11: ffffffc0f9ab3d70 x10: ffffffc0f9ab3d70
      [  198.540915] x9 : ffffffc0000907c0 x8 : ffffffc0f9ab3d40
      [  198.542215] x7 : 0000002e330f08f0 x6 : 0000000000000015
      [  198.543508] x5 : 0000000000000f08 x4 : ffffffc0f9835ec0
      [  198.544792] x3 : cb88537fdc8ba316 x2 : cb88537fdc8ba306
      [  198.546108] x1 : 0000000000000002 x0 : ffffffc000d85050
      [  198.547432]
      [  198.547920] Process ls (pid: 1388, stack limit = 0xffffffc0f9ab0020)
      [  198.549170] Stack: (0xffffffc0f9ab3d40 to 0xffffffc0f9ab4000)
      [  198.582568] Call trace:
      [  198.583313] [<ffffffc0002a1070>] next_tgid+0x30/0x100
      [  198.584359] [<ffffffc0000907bc>] ftrace_graph_caller+0x6c/0x70
      [  198.585503] [<ffffffc0000907bc>] ftrace_graph_caller+0x6c/0x70
      [  198.586574] [<ffffffc0000907bc>] ftrace_graph_caller+0x6c/0x70
      [  198.587660] [<ffffffc0000907bc>] ftrace_graph_caller+0x6c/0x70
      [  198.588896] Code: aa0003f5 2a0103f4 b4000102 91004043 (885f7c60)
      [  198.591092] ---[ end trace 6a346f8f20949ac8 ]---
      
      This is because when using function graph tracer, if the traced
      function return value is in multi regs ([x0-x7]), return_to_handler
      may corrupt them. So in return_to_handler, the parameter regs should
      be protected properly.
      
      Cc: <stable@vger.kernel.org> # 3.18+
      Signed-off-by: default avatarLi Bin <huawei.libin@huawei.com>
      Acked-by: default avatarAKASHI Takahiro <takahiro.akashi@linaro.org>
      Signed-off-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      ee556d00
    • Ralf Baechle's avatar
      MIPS: BPF: Fix load delay slots. · 0c5d1878
      Ralf Baechle authored
      The entire bpf_jit_asm.S is written in noreorder mode because "we know
      better" according to a comment.  This also prevented the assembler from
      throwing in the required NOPs for MIPS I processors which have no
      load-use interlock, thus the load's consumer might end up using the
      old value of the register from prior to the load.
      
      Fixed by putting the assembler in reorder mode for just the affected
      load instructions.  This is not enough for gas to actually try to be
      clever by looking at the next instruction and inserting a nop only
      when needed but as the comment said "we know better", so getting gas
      to unconditionally emit a NOP is just right in this case and prevents
      adding further ifdefery.
      Signed-off-by: default avatarRalf Baechle <ralf@linux-mips.org>
      0c5d1878
    • Ben Hutchings's avatar
      x86/headers/uapi: Fix __BITS_PER_LONG value for x32 builds · f4b4aae1
      Ben Hutchings authored
      On x32, gcc predefines __x86_64__ but long is only 32-bit.  Use
      __ILP32__ to distinguish x32.
      
      Fixes this compiler error in perf:
      
      	tools/include/asm-generic/bitops/__ffs.h: In function '__ffs':
      	tools/include/asm-generic/bitops/__ffs.h:19:8: error: right shift count >= width of type [-Werror=shift-count-overflow]
      	  word >>= 32;
      	       ^
      
      This isn't sufficient to build perf for x32, though.
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Link: http://lkml.kernel.org/r/1443660043.2730.15.camel@decadent.org.ukSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      f4b4aae1
    • NeilBrown's avatar
      md/bitmap: don't pass -1 to bitmap_storage_alloc. · da6fb7a9
      NeilBrown authored
      Passing -1 to bitmap_storage_alloc() causes page->index to be set to
      -1, which is quite problematic.
      
      So only pass ->cluster_slot if mddev_is_clustered().
      
      Fixes: b97e9257 ("Use separate bitmaps for each nodes in the cluster")
      Cc: stable@vger.kernel.org (v4.1+)
      Signed-off-by: default avatarNeilBrown <neilb@suse.com>
      da6fb7a9
    • Jes Sorensen's avatar
      md/raid1: Avoid raid1 resync getting stuck · e8ff8bf0
      Jes Sorensen authored
      close_sync() needs to set conf->next_resync to a large, but safe value
      below MaxSector and use it to determine whether or not to set
      start_next_window in wait_barrier()
      
      Solution suggested by Neil Brown.
      Reported-by: default avatarNate Dailey <nate.dailey@stratus.com>
      Tested-by: default avatarXiao Ni <xni@redhat.com>
      Signed-off-by: default avatarJes Sorensen <Jes.Sorensen@redhat.com>
      Signed-off-by: default avatarNeilBrown <neilb@suse.com>
      e8ff8bf0
    • Julia Lawall's avatar
      md: drop null test before destroy functions · 644df1a8
      Julia Lawall authored
      Remove unneeded NULL test.
      
      The semantic patch that makes this change is as follows:
      (http://coccinelle.lip6.fr/)
      
      // <smpl>
      @@ expression x; @@
      -if (x != NULL)
        \(kmem_cache_destroy\|mempool_destroy\|dma_pool_destroy\)(x);
      // </smpl>
      Signed-off-by: default avatarJulia Lawall <Julia.Lawall@lip6.fr>
      Signed-off-by: default avatarNeilBrown <neilb@suse.com>
      644df1a8
    • Shaohua Li's avatar
      md: clear CHANGE_PENDING in readonly array · d4929add
      Shaohua Li authored
      If faulty disks of an array are more than allowed degraded number, the
      array enters error handling. It will be marked as read-only with
      MD_CHANGE_PENDING/RECOVERY_NEEDED set. But currently recovery doesn't
      clear CHANGE_PENDING bit for read-only array.  If MD_CHANGE_PENDING is
      set for a raid5 array, all returned IO will be hold on a list till the
      bit is clear. But recovery nevery clears this bit, the IO is always in
      pending state and nevery finish. This has bad effects like upper layer
      can't get an IO error and the array can't be stopped.
      
      Fixes: c3cce6cd ("md/raid5: ensure device failure recorded before write request returns.")
      Signed-off-by: default avatarShaohua Li <shli@fb.com>
      Signed-off-by: default avatarNeilBrown <neilb@suse.com>
      d4929add
    • NeilBrown's avatar
      md/raid0: apply base queue limits *before* disk_stack_limits · 66eefe5d
      NeilBrown authored
      Calling e.g. blk_queue_max_hw_sectors() after calls to
      disk_stack_limits() discards the settings determined by
      disk_stack_limits().
      So we need to make those calls first.
      
      Fixes: 199dc6ed ("md/raid0: update queue parameter in a safer location.")
      Cc: stable@vger.kernel.org (v2.6.35+ - please apply with 199dc6ed).
      Reported-by: default avatarJes Sorensen <Jes.Sorensen@redhat.com>
      Signed-off-by: default avatarNeilBrown <neilb@suse.com>
      66eefe5d
    • NeilBrown's avatar
      md/raid5: don't index beyond end of array in need_this_block(). · 36707bb2
      NeilBrown authored
      When need_this_block probably shouldn't be called when there
      are more than 2 failed devices, we really don't want it to try
      indexing beyond the end of the failed_num[] of fdev[] arrays.
      
      So limit the loops to at most 2 iterations.
      Reported-by: default avatarShaohua Li <shli@fb.com>
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      36707bb2
    • Shaohua Li's avatar
      raid5: update analysis state for failed stripe · ebda780b
      Shaohua Li authored
      handle_failed_stripe() makes the stripe fail, eg, all IO will return
      with a failure, but it doesn't update stripe_head_state. Later
      handle_stripe() has special handling for raid6 for handle_stripe_fill().
      That check before handle_stripe_fill() doesn't skip the failed stripe
      and we get a kernel crash in need_this_block.  This patch clear the
      analysis state to make sure no functions wrongly called after
      handle_failed_stripe()
      Signed-off-by: default avatarShaohua Li <shli@fb.com>
      Signed-off-by: default avatarNeilBrown <neilb@suse.com>
      ebda780b
    • NeilBrown's avatar
      md: wait for pending superblock updates before switching to read-only · 88724bfa
      NeilBrown authored
      If a superblock update is pending, wait for it to complete before
      letting md_set_readonly() switch to readonly.
      Otherwise we might lose important information about a device having
      failed.
      
      For external arrays, waiting for superblock updates can wait on
      user-space, so in that case, just return an error.
      Reported-and-tested-by: default avatarShaohua Li <shli@fb.com>
      Signed-off-by: default avatarNeilBrown <neilb@suse.com>
      88724bfa
    • Stephen Smalley's avatar
      x86/mm: Set NX on gap between __ex_table and rodata · ab76f7b4
      Stephen Smalley authored
      Unused space between the end of __ex_table and the start of
      rodata can be left W+x in the kernel page tables.  Extend the
      setting of the NX bit to cover this gap by starting from
      text_end rather than rodata_start.
      
        Before:
        ---[ High Kernel Mapping ]---
        0xffffffff80000000-0xffffffff81000000          16M                               pmd
        0xffffffff81000000-0xffffffff81600000           6M     ro         PSE     GLB x  pmd
        0xffffffff81600000-0xffffffff81754000        1360K     ro                 GLB x  pte
        0xffffffff81754000-0xffffffff81800000         688K     RW                 GLB x  pte
        0xffffffff81800000-0xffffffff81a00000           2M     ro         PSE     GLB NX pmd
        0xffffffff81a00000-0xffffffff81b3b000        1260K     ro                 GLB NX pte
        0xffffffff81b3b000-0xffffffff82000000        4884K     RW                 GLB NX pte
        0xffffffff82000000-0xffffffff82200000           2M     RW         PSE     GLB NX pmd
        0xffffffff82200000-0xffffffffa0000000         478M                               pmd
      
        After:
        ---[ High Kernel Mapping ]---
        0xffffffff80000000-0xffffffff81000000          16M                               pmd
        0xffffffff81000000-0xffffffff81600000           6M     ro         PSE     GLB x  pmd
        0xffffffff81600000-0xffffffff81754000        1360K     ro                 GLB x  pte
        0xffffffff81754000-0xffffffff81800000         688K     RW                 GLB NX pte
        0xffffffff81800000-0xffffffff81a00000           2M     ro         PSE     GLB NX pmd
        0xffffffff81a00000-0xffffffff81b3b000        1260K     ro                 GLB NX pte
        0xffffffff81b3b000-0xffffffff82000000        4884K     RW                 GLB NX pte
        0xffffffff82000000-0xffffffff82200000           2M     RW         PSE     GLB NX pmd
        0xffffffff82200000-0xffffffffa0000000         478M                               pmd
      Signed-off-by: default avatarStephen Smalley <sds@tycho.nsa.gov>
      Acked-by: default avatarKees Cook <keescook@chromium.org>
      Cc: <stable@vger.kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Link: http://lkml.kernel.org/r/1443704662-3138-1-git-send-email-sds@tycho.nsa.govSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      ab76f7b4
    • Lee, Chun-Yi's avatar
      x86/kexec: Fix kexec crash in syscall kexec_file_load() · e3c41e37
      Lee, Chun-Yi authored
      The original bug is a page fault crash that sometimes happens
      on big machines when preparing ELF headers:
      
          BUG: unable to handle kernel paging request at ffffc90613fc9000
          IP: [<ffffffff8103d645>] prepare_elf64_ram_headers_callback+0x165/0x260
      
      The bug is caused by us under-counting the number of memory ranges
      and subsequently not allocating enough ELF header space for them.
      The bug is typically masked on smaller systems, because the ELF header
      allocation is rounded up to the next page.
      
      This patch modifies the code in fill_up_crash_elf_data() by using
      walk_system_ram_res() instead of walk_system_ram_range() to correctly
      count the max number of crash memory ranges. That's because the
      walk_system_ram_range() filters out small memory regions that
      reside in the same page, but walk_system_ram_res() does not.
      
      Here's how I found the bug:
      
      After tracing prepare_elf64_headers() and prepare_elf64_ram_headers_callback(),
      the code uses walk_system_ram_res() to fill-in crash memory regions information
      to the program header, so it counts those small memory regions that
      reside in a page area.
      
      But, when the kernel was using walk_system_ram_range() in
      fill_up_crash_elf_data() to count the number of crash memory regions,
      it filters out small regions.
      
      I printed those small memory regions, for example:
      
        kexec: Get nr_ram ranges. vaddr=0xffff880077592258 paddr=0x77592258, sz=0xdc0
      
      Based on the code in walk_system_ram_range(), this memory region
      will be filtered out:
      
        pfn = (0x77592258 + 0x1000 - 1) >> 12 = 0x77593
        end_pfn = (0x77592258 + 0xfc0 -1 + 1) >> 12 = 0x77593
        end_pfn - pfn = 0x77593 - 0x77593 = 0  <=== if (end_pfn > pfn) is FALSE
      
      So, the max_nr_ranges that's counted by the kernel doesn't include
      small memory regions - causing us to under-allocate the required space.
      That causes the page fault crash that happens in a later code path
      when preparing ELF headers.
      
      This bug is not easy to reproduce on small machines that have few
      CPUs, because the allocated page aligned ELF buffer has more free
      space to cover those small memory regions' PT_LOAD headers.
      Signed-off-by: default avatarLee, Chun-Yi <jlee@suse.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Jiang Liu <jiang.liu@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Takashi Iwai <tiwai@suse.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Viresh Kumar <viresh.kumar@linaro.org>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: kexec@lists.infradead.org
      Cc: linux-kernel@vger.kernel.org
      Cc: <stable@vger.kernel.org>
      Link: http://lkml.kernel.org/r/1443531537-29436-1-git-send-email-jlee@suse.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      e3c41e37