1. 18 Apr, 2017 2 commits
    • rcu: Make arch select smp_mb__after_unlock_lock() strength · 77e58496
      Paul E. McKenney authored
      The definition of smp_mb__after_unlock_lock() is currently smp_mb()
      for CONFIG_PPC and a no-op otherwise.  It would be better to instead
      provide an architecture-selectable Kconfig option, and select the
      strength of smp_mb__after_unlock_lock() based on that option.  This
      commit therefore creates ARCH_WEAK_RELEASE_ACQUIRE, has PPC select it,
      and bases the definition of smp_mb__after_unlock_lock() on this new
      ARCH_WEAK_RELEASE_ACQUIRE Kconfig option.
      Reported-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Boqun Feng <boqun.feng@linux.vnet.ibm.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au>
      Cc: <linuxppc-dev@lists.ozlabs.org>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
      77e58496
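
The selection described in the commit above can be sketched in userspace C. This is only an illustration under assumptions: the Kconfig symbol is modelled as a plain preprocessor define, a C11 fence stands in for the kernel's smp_mb(), and demo_barrier_is_full() is a hypothetical helper that does not exist in the kernel.

```c
#include <stdatomic.h>

/*
 * Assumption: CONFIG_ARCH_WEAK_RELEASE_ACQUIRE is modelled as an ordinary
 * preprocessor define. In the kernel it would come from Kconfig, selected
 * by architectures such as PPC.
 */
#ifdef CONFIG_ARCH_WEAK_RELEASE_ACQUIRE
/* Weak release/acquire: a full barrier is needed (C11 fence as stand-in). */
#define smp_mb__after_unlock_lock() atomic_thread_fence(memory_order_seq_cst)
#define DEMO_BARRIER_IS_FULL 1
#else
/* Everywhere else the macro expands to an empty single-pass loop. */
#define smp_mb__after_unlock_lock() do { } while (0)
#define DEMO_BARRIER_IS_FULL 0
#endif

/* Hypothetical helper: reports which definition was compiled in. */
static int demo_barrier_is_full(void)
{
    smp_mb__after_unlock_lock();
    return DEMO_BARRIER_IS_FULL;
}
```

Building with -DCONFIG_ARCH_WEAK_RELEASE_ACQUIRE switches the sketch to the full-barrier variant, which mirrors how an architecture would opt in by selecting the option.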
    • rcu: Maintain special bits at bottom of ->dynticks counter · b8c17e66
      Paul E. McKenney authored
      Currently, IPIs are used to force other CPUs to invalidate their TLBs
      in response to a kernel virtual-memory mapping change.  This works, but
      degrades both battery lifetime (for idle CPUs) and real-time response
      (for nohz_full CPUs), and in addition results in unnecessary IPIs due to
      the fact that CPUs executing in usermode are unaffected by stale kernel
      mappings.  It would be better to cause a CPU executing in usermode to
      wait until it is entering kernel mode to do the flush, first to avoid
      interrupting usermode tasks and second to handle multiple flush requests
      with a single flush in the case of a long-running user task.
      
      This commit therefore reserves a bit at the bottom of the ->dynticks
      counter, which is checked upon exit from extended quiescent states.
      If it is set, it is cleared and then a new rcu_eqs_special_exit() macro is
      invoked, which, if not supplied, is an empty single-pass do-while loop.
      If this bottom bit is set on -entry- to an extended quiescent state,
      then a WARN_ON_ONCE() triggers.
      
      This bottom bit may be set using a new rcu_eqs_special_set() function,
      which returns true if the bit was set, or false if the CPU turned
      out to not be in an extended quiescent state.  Please note that this
      function refuses to set the bit for a non-nohz_full CPU when that CPU
      is executing in usermode because usermode execution is tracked by RCU
      as a dyntick-idle extended quiescent state only for nohz_full CPUs.
      Reported-by: Andy Lutomirski <luto@amacapital.net>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
      b8c17e66
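
The reserved-bit scheme in the commit above can be sketched with C11 atomics. This is a simplified model under assumptions, not the kernel code: all demo_* names and DEMO_* constants are hypothetical, the counter is a single global rather than per-CPU, and the nohz_full restriction is not modelled.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define DEMO_CTRL_MASK 0x1                  /* reserved bottom bit */
#define DEMO_CTRL_CTR  (DEMO_CTRL_MASK + 1) /* counter increment */

/* Counter bit set means "not in an extended quiescent state (EQS)". */
static atomic_int demo_dynticks = DEMO_CTRL_CTR;
static int demo_special_exit_calls;

/* Enter an EQS: the counter bit flips to zero. */
static void demo_eqs_enter(void)
{
    atomic_fetch_add(&demo_dynticks, DEMO_CTRL_CTR);
}

/* Exit an EQS: if the special bit was set, clear it and run the hook. */
static void demo_eqs_exit(void)
{
    int seq = atomic_fetch_add(&demo_dynticks, DEMO_CTRL_CTR) + DEMO_CTRL_CTR;

    if (seq & DEMO_CTRL_MASK) {
        atomic_fetch_and(&demo_dynticks, ~DEMO_CTRL_MASK);
        demo_special_exit_calls++; /* stands in for rcu_eqs_special_exit() */
    }
}

/* Try to set the special bit; fails if the CPU is not in an EQS. */
static bool demo_eqs_special_set(void)
{
    int old = atomic_load(&demo_dynticks);

    do {
        if (old & DEMO_CTRL_CTR)
            return false;
    } while (!atomic_compare_exchange_weak(&demo_dynticks, &old,
                                           old | DEMO_CTRL_MASK));
    return true;
}
```

The compare-and-exchange loop is what lets the setter lose the race cleanly: if the target leaves the EQS before the bit lands, the function returns false and the caller can fall back to some other mechanism, such as an IPI.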
  2. 12 Mar, 2017 5 commits
    • Linux 4.11-rc2 · 4495c08e
      Linus Torvalds authored
      4495c08e
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 56b24d1b
      Linus Torvalds authored
      Pull s390 fixes from Martin Schwidefsky:
      
       - four patches to get the new cputime code in shape for s390
      
       - add the new statx system call
      
       - a few bug fixes
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        s390: wire up statx system call
        KVM: s390: Fix guest migration for huge guests resulting in panic
        s390/ipl: always use load normal for CCW-type re-IPL
        s390/timex: micro optimization for tod_to_ns
        s390/cputime: provide archicture specific cputime_to_nsecs
        s390/cputime: reset all accounting fields on fork
        s390/cputime: remove last traces of cputime_t
        s390: fix in-kernel program checks
        s390/crypt: fix missing unlock in ctr_paes_crypt on error path
      56b24d1b
    • Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 5a45a5a8
      Linus Torvalds authored
      Pull x86 fixes from Thomas Gleixner:
      
       - a fix for the kexec/purgatory regression which was introduced in the
         merge window via an innocent sparse fix. We could have reverted that
         commit, but on deeper inspection it turned out that the whole
         machinery is neither documented nor robust. So a proper cleanup was
         done instead
      
       - the fix for the TLB flush issue which was discovered recently
      
       - a simple typo fix for a reboot quirk
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/tlb: Fix tlb flushing when lguest clears PGE
        kexec, x86/purgatory: Unbreak it and clean it up
        x86/reboot/quirks: Fix typo in ASUS EeeBook X205TA reboot quirk
      5a45a5a8
    • Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · ecade114
      Linus Torvalds authored
      Pull irq fixes from Thomas Gleixner:
      
       - a workaround for a GIC erratum
      
       - a missing stub function for CONFIG_IRQDOMAIN=n
      
       - fixes for a couple of type inconsistencies
      
      * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        irqchip/crossbar: Fix incorrect type of register size
        irqchip/gicv3-its: Add workaround for QDF2400 ITS erratum 0065
        irqdomain: Add empty irq_domain_check_msi_remap
        irqchip/crossbar: Fix incorrect type of local variables
      ecade114
    • x86/tlb: Fix tlb flushing when lguest clears PGE · 2c4ea6e2
      Daniel Borkmann authored
      Fengguang reported random corruptions from various locations on x86-32
      after commits d2852a22 ("arch: add ARCH_HAS_SET_MEMORY config") and
      9d876e79 ("bpf: fix unlocking of jited image when module ronx not set")
      that uses the former. While x86-32 doesn't have a JIT like x86_64, the
      bpf_prog_lock_ro() and bpf_prog_unlock_ro() got enabled due to
      ARCH_HAS_SET_MEMORY, whereas Fengguang's test kernel doesn't have module
      support built in and therefore never had the DEBUG_SET_MODULE_RONX setting
      enabled.
      
      After investigating the crashes further, it turned out that using
      set_memory_ro() and set_memory_rw() didn't have the desired effect, for
      example, setting the pages as read-only on x86-32 would still let
      probe_kernel_write() succeed without error. This behavior would manifest
      itself in situations where the vmalloc'ed buffer was accessed prior to
      set_memory_*() such as in case of bpf_prog_alloc(). In cases where it
      wasn't, the page attribute changes seemed to have taken effect, leading to
      the conclusion that a TLB invalidate didn't happen. Moreover, it turned out
      that this issue reproduced with qemu in "-cpu kvm64" mode, but not for
      "-cpu host". When the issue occurs, change_page_attr_set_clr() did trigger
      a TLB flush as expected via __flush_tlb_all() through cpa_flush_range(),
      though.
      
      There are 3 variants for issuing a TLB flush: invpcid_flush_all() (depends
      on CPU feature bits X86_FEATURE_INVPCID, X86_FEATURE_PGE), cr4 based flush
      (depends on X86_FEATURE_PGE), and cr3 based flush.  For "-cpu host" case in
      my setup, the flush used invpcid_flush_all() variant, whereas for "-cpu
      kvm64", the flush was cr4 based. Switching the kvm64 case to cr3 manually
      worked fine, and further investigating the cr4 one turned out that
      X86_CR4_PGE bit was not set in cr4 register, meaning the
      __native_flush_tlb_global_irq_disabled() wrote cr4 twice with the same
      value instead of clearing X86_CR4_PGE in the first write to trigger the
      flush.
      
      It turned out that X86_CR4_PGE was cleared from cr4 during init from
      lguest_arch_host_init() via adjust_pge(). The X86_FEATURE_PGE bit is also
      cleared from there due to concerns of using PGE in guest kernel that can
      lead to hard to trace bugs (see bff672e6 ("lguest: documentation V:
      Host") in init()). The CPU feature bits are cleared in dynamic
      boot_cpu_data, but they never propagated to __flush_tlb_all() as it uses
      static_cpu_has() instead of boot_cpu_has() for testing which variant of TLB
      flushing to use, meaning they still used the old setting of the host
      kernel.
      
      Clearing via setup_clear_cpu_cap(X86_FEATURE_PGE) so this would propagate
      to static_cpu_has() checks is too late at this point as sections have been
      patched already, so for now, it seems reasonable to switch back to
      boot_cpu_has(X86_FEATURE_PGE) as it was prior to commit c109bf95
      ("x86/cpufeature: Remove cpu_has_pge"). This lets the TLB flush trigger via
      cr3 as originally intended, properly makes the new page attributes visible
      and thus fixes the crashes seen by Fengguang.
      
      Fixes: c109bf95 ("x86/cpufeature: Remove cpu_has_pge")
      Reported-by: Fengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Cc: bp@suse.de
      Cc: Kees Cook <keescook@chromium.org>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: netdev@vger.kernel.org
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: lkp@01.org
      Cc: Laura Abbott <labbott@redhat.com>
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernrl.org/r/20170301125426.l4nf65rx4wahohyl@wfg-t540p.sh.intel.com
      Link: http://lkml.kernel.org/r/25c41ad9eca164be4db9ad84f768965b7eb19d9e.1489191673.git.daniel@iogearbox.net
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      2c4ea6e2
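
The failure mode in the commit above can be reproduced in a small simulation. This is a sketch under assumptions: struct sim_cpu and the sim_* functions are hypothetical, and the model only assumes that toggling CR4.PGE flushes the TLB while rewriting an unchanged cr4 value does not. The feature_pge argument plays the role of the feature test: a stale static_cpu_has() result versus the current boot_cpu_has() result.

```c
#include <stdbool.h>

#define X86_CR4_PGE 0x80UL /* CR4 bit 7: page global enable */

/* Simulated CPU state; every name here is hypothetical. */
struct sim_cpu {
    unsigned long cr4;
    int global_flushes; /* flushes caused by toggling CR4.PGE */
    int cr3_flushes;    /* flushes caused by reloading cr3 */
};

/* Model: a write that changes CR4.PGE flushes; an identical write does not. */
static void sim_write_cr4(struct sim_cpu *cpu, unsigned long val)
{
    if ((cpu->cr4 ^ val) & X86_CR4_PGE)
        cpu->global_flushes++;
    cpu->cr4 = val;
}

/* Pick the flush variant from the feature bit, as the commit describes. */
static void sim_flush_tlb_all(struct sim_cpu *cpu, bool feature_pge)
{
    if (feature_pge) {
        unsigned long cr4 = cpu->cr4;

        sim_write_cr4(cpu, cr4 & ~X86_CR4_PGE); /* meant to clear PGE */
        sim_write_cr4(cpu, cr4);                /* restore original value */
    } else {
        cpu->cr3_flushes++;                     /* cr3-based flush */
    }
}
```

With cr4 already lacking X86_CR4_PGE and a stale feature_pge of true, both writes carry the same value and no flush is recorded, which matches the symptom reported. Passing the up-to-date feature bit (false) takes the cr3 path instead.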
  3. 11 Mar, 2017 8 commits
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 106e4da6
      Linus Torvalds authored
      Pull KVM fixes from Radim Krčmář:
       "ARM updates from Marc Zyngier:
         - vgic updates:
           - Honour disabling the ITS
           - Don't deadlock when deactivating own interrupts via MMIO
           - Correctly expose the lack of IRQ/FIQ bypass on GICv3
      
         - I/O virtualization:
           - Make KVM_CAP_NR_MEMSLOTS big enough for large guests with many
             PCIe devices
      
         - General bug fixes:
           - Gracefully handle exceptions generated with syndromes that the host
             doesn't understand
           - Properly invalidate TLBs on VHE systems
      
        x86:
         - improvements in emulation of VMCLEAR, VMX MSR bitmaps, and VCPU
           reset
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: nVMX: do not warn when MSR bitmap address is not backed
        KVM: arm64: Increase number of user memslots to 512
        KVM: arm/arm64: Remove KVM_PRIVATE_MEM_SLOTS definition that are unused
        KVM: arm/arm64: Enable KVM_CAP_NR_MEMSLOTS on arm/arm64
        KVM: Add documentation for KVM_CAP_NR_MEMSLOTS
        KVM: arm/arm64: VGIC: Fix command handling while ITS being disabled
        arm64: KVM: Survive unknown traps from guests
        arm: KVM: Survive unknown traps from guests
        KVM: arm/arm64: Let vcpu thread modify its own active state
        KVM: nVMX: reset nested_run_pending if the vCPU is going to be reset
        kvm: nVMX: VMCLEAR should not cause the vCPU to shut down
        KVM: arm/arm64: vgic-v3: Don't pretend to support IRQ/FIQ bypass
        arm64: KVM: VHE: Clear HCR_TGE when invalidating guest TLBs
      106e4da6
    • Merge tag 'extable-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux · 4b050f22
      Linus Torvalds authored
      Pull extable.h fix from Paul Gortmaker:
       "Fixup for arch/score after extable.h introduction.
      
        It seems that Guenter is the only one on the planet doing builds for
        arch/score -- we don't have compile coverage for it in linux-next or
        in the kbuild-bot either. Guenter couldn't even recall where he got
        his toolchain, but was kind enough to share it with me so I could
        validate this change and also add arch/score to my build coverage.
      
        I sat on this a bit in case there was any other fallout in other arch
        dirs, but since this still seems to be the only one, I might as well
        send it on its way"
      
      * tag 'extable-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux:
        score: Fix implicit includes now failing build after extable change
      4b050f22
    • Merge tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random · 84c37c16
      Linus Torvalds authored
      Pull random updates from Ted Ts'o:
       "Change get_random_{int,long} to use the CRNG used by /dev/urandom and
        getrandom(2). It's faster and arguably more secure than the cut-down
        MD5 that we had been using.
      
        Also do some code cleanup"
      
      * tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random:
        random: move random_min_urandom_seed into CONFIG_SYSCTL ifdef block
        random: convert get_random_int/long into get_random_u32/u64
        random: use chacha20 for get_random_int/long
        random: fix comment for unused random_min_urandom_seed
        random: remove variable limit
        random: remove stale urandom_init_wait
        random: remove stale maybe_reseed_primary_crng
      84c37c16
    • score: Fix implicit includes now failing build after extable change · 0acf6119
      Guenter Roeck authored
      After changing from module.h to extable.h, score builds fail with:
      
        arch/score/kernel/traps.c: In function 'do_ri':
        arch/score/kernel/traps.c:248:4: error: implicit declaration of function 'user_disable_single_step'
        arch/score/mm/extable.c: In function 'fixup_exception':
        arch/score/mm/extable.c:32:38: error: dereferencing pointer to incomplete type
        arch/score/mm/extable.c:34:24: error: dereferencing pointer to incomplete type
      
      because extable.h doesn't drag in the same amount of headers as the
      module.h did.  Add in the headers which were implicitly expected.
      
      Fixes: 90858794 ("module.h: remove extable.h include now users have migrated")
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      [PG: tweak commit log; refresh for sched header refactoring.]
      Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
      0acf6119
    • Merge tag 'tty-4.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty · 434fd635
      Linus Torvalds authored
      Pull tty/serial fixes from Greg KH:
       "Here are two bugfixes for tty stuff for 4.11-rc2.
      
        One of them resolves the pretty bad bug in the n_hdlc code that
        Alexander Popov found and fixed and has been reported everywhere. The
        other just fixes a samsung serial driver issue when DMA fails on some
        systems.
      
        Both have been in linux-next with no reported issues"
      
      * tag 'tty-4.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
        serial: samsung: Continue to work if DMA request fails
        tty: n_hdlc: get rid of racy n_hdlc.tbuf
      434fd635
    • Merge tag 'staging-4.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging · 85298808
      Linus Torvalds authored
      Pull staging driver fixes from Greg KH:
       "Here are two small build warning fixes for some staging drivers that
        Arnd has found on his valiant quest to get the kernel to build
        properly with no warnings.
      
        Both of these have been in linux-next this week and resolve the
        reported issues"
      
      * tag 'staging-4.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
        staging: octeon: remove unused variable
        staging/vc04_services: add CONFIG_OF dependency
      85298808
    • Merge tag 'usb-4.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · 46552bf4
      Linus Torvalds authored
      Pull USB fixes from Greg KH:
       "Here is a number of different USB fixes for 4.11-rc2.
      
        Seems like there were a lot of unresolved issues that people have been
        finding for this subsystem, and a bunch of good security auditing
        happening as well from Johan Hovold. There's the usual batch of gadget
        driver fixes and xhci issues resolved as well.
      
        All of these have been in linux-next with no reported issues"
      
      * tag 'usb-4.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (35 commits)
        usb: host: xhci-plat: Fix timeout on removal of hot pluggable xhci controllers
        usb: host: xhci-dbg: HCIVERSION should be a binary number
        usb: xhci: remove dummy extra_priv_size for size of xhci_hcd struct
        usb: xhci-mtk: check hcc_params after adding primary hcd
        USB: serial: digi_acceleport: fix OOB-event processing
        MAINTAINERS: usb251xb: remove reference inexistent file
        doc: dt-bindings: usb251xb: mark reg as required
        usb: usb251xb: dt: add unit suffix to oc-delay and power-on-time
        usb: usb251xb: remove max_{power,current}_{sp,bp} properties
        usb-storage: Add ignore-residue quirk for Initio INIC-3619
        USB: iowarrior: fix NULL-deref in write
        USB: iowarrior: fix NULL-deref at probe
        usb: phy: isp1301: Add OF device ID table
        usb: ohci-at91: Do not drop unhandled USB suspend control requests
        USB: serial: safe_serial: fix information leak in completion handler
        USB: serial: io_ti: fix information leak in completion handler
        USB: serial: omninet: drop open callback
        USB: serial: omninet: fix reference leaks at open
        USB: serial: io_ti: fix NULL-deref in interrupt callback
        usb: dwc3: gadget: make to increment req->remaining in all cases
        ...
      46552bf4
    • Merge tag 'pinctrl-v4.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl · cb853a82
      Linus Torvalds authored
      Pull pinctrl fixes from Linus Walleij:
       "Two smaller pin control fixes for the v4.11 series:
      
         - Add a get_direction() function to the qcom driver
      
         - Fix two pin names in the uniphier driver"
      
      * tag 'pinctrl-v4.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
        pinctrl: uniphier: change pin names of aio/xirq for LD11
        pinctrl: qcom: add get_direction function
      cb853a82
  4. 10 Mar, 2017 25 commits
    • kexec, x86/purgatory: Unbreak it and clean it up · 40c50c1f
      Thomas Gleixner authored
      The purgatory code defines global variables which are referenced via a
      symbol lookup in the kexec code (core and arch).
      
      A recent commit addressing sparse warnings made these static and thereby
      broke kexec_file.
      
      Why did this happen? Simply because the whole machinery is undocumented and
      lacks any form of forward declarations. The variable names are unspecific
      and lack a prefix, so adding forward declarations creates shadow variables
      in the core code. Aside from that, the code relies on magic constants and
      duplicate struct definitions with no way to ensure that these things stay
      in sync. The section placement of the purgatory variables happened by
      chance and not by design.
      
      Unbreak kexec and cleanup the mess:
      
       - Add proper forward declarations and document the usage
       - Use common struct definition
       - Use the proper common defines instead of magic constants
       - Add a purgatory_ prefix to have a proper name space
       - Use ARRAY_SIZE() instead of a homebrewed reimplementation
       - Add proper sections to the purgatory variables [ From Mike ]
      
      Fixes: 72042a8c ("x86/purgatory: Make functions and variables static")
      Reported-by: Mike Galbraith <efault@gmx.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Nicholas Mc Guire <der.herr@hofr.at>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: "Tobin C. Harding" <me@tobin.cc>
      Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1703101315140.3681@nanos
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      40c50c1f
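
One cleanup item from the commit above, using ARRAY_SIZE() rather than a hand-rolled element count, can be illustrated as follows. This is a simplified sketch: the macro here omits the kernel's extra compile-time array check, and struct demo_region, purgatory_demo_regions and demo_region_count() are hypothetical names that only echo the purgatory_ prefix convention.

```c
#include <stddef.h>

/* Simplified ARRAY_SIZE(); the kernel version adds a must-be-array check. */
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))

/* Hypothetical region descriptor shared by both sides of the interface. */
struct demo_region {
    unsigned long start;
    unsigned long len;
};

/* Hypothetical table; the prefix gives the symbol a clear name space. */
static struct demo_region purgatory_demo_regions[16];

/* The element count follows the declaration, so it cannot drift. */
static size_t demo_region_count(void)
{
    return ARRAY_SIZE(purgatory_demo_regions);
}
```

Deriving the count from the array itself removes one of the magic constants that would otherwise have to be kept in sync by hand.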
    • Merge tag 'ceph-for-4.11-rc2' of git://github.com/ceph/ceph-client · 24c534bb
      Linus Torvalds authored
      Pull ceph fixes from Ilya Dryomov:
      
       - a fix for the recently discovered misdirected requests bug present in
         jewel and later on the server side and all stable kernels
      
       - a fixup for -rc1 CRUSH changes
      
       - two usability enhancements: osd_request_timeout option and
         supported_features bus attribute.
      
      * tag 'ceph-for-4.11-rc2' of git://github.com/ceph/ceph-client:
        libceph: osd_request_timeout option
        rbd: supported_features bus attribute
        libceph: don't set weight to IN when OSD is destroyed
        libceph: fix crush_decode() for older maps
      24c534bb
    • Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · 2baf3809
      Linus Torvalds authored
      Pull i2c fixes from Wolfram Sang:
       "Here are some driver bugfixes from I2C.
      
        Unusual this time are the two reverts. One because I accidentally picked
        a patch from the list which I should have pulled from my co-maintainer
        instead ("missing of_node_put"). And one which I wrongly assumed to be
        an easy fix, but it has already turned out to need more iterations
        ("copy device properties")"
      
      * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        Revert "i2c: copy device properties when using i2c_register_board_info()"
        Revert "i2c: add missing of_node_put in i2c_mux_del_adapters"
        i2c: exynos5: Avoid transaction timeouts due TRANSFER_DONE_AUTO not set
        i2c: designware: add reset interface
        i2c: meson: fix wrong variable usage in meson_i2c_put_data
        i2c: copy device properties when using i2c_register_board_info()
        i2c: m65xx: drop superfluous quirk structure
        i2c: brcmstb: Fix START and STOP conditions
        i2c: add missing of_node_put in i2c_mux_del_adapters
        i2c: riic: fix restart condition
        i2c: add missing of_node_put in i2c_mux_del_adapters
      2baf3809
    • Merge tag 'drm-fixes-for-4.11-rc2' of git://people.freedesktop.org/~airlied/linux · 7c7fba98
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Intel, amd and mxsfb fixes.
      
        These are the drm fixes I've collected for rc2. Mostly i915 GVT only
        fixes, along with a single EDID fix, some mxsfb fixes and a few minor
        amd fixes"
      
      * tag 'drm-fixes-for-4.11-rc2' of git://people.freedesktop.org/~airlied/linux: (38 commits)
        drm: mxsfb: Implement drm_panel handling
        drm: mxsfb_crtc: Fix the framebuffer misplacement
        drm: mxsfb: Fix crash when provided invalid DT bindings
        drm: mxsfb: fix pixel clock polarity
        drm: mxsfb: use bus_format to determine LCD bus width
        drm/amdgpu: bump driver version for some new features
        drm/amdgpu: validate paramaters in the gem ioctl
        drm/amd/amdgpu: fix console deadlock if late init failed
        drm/i915/gvt: change some gvt_err to gvt_dbg_cmd
        drm/i915/gvt: protect RO and Rsvd bits of virtual vgpu configuration space
        drm/i915/gvt: handle workload lifecycle properly
        drm/edid: Add EDID_QUIRK_FORCE_8BPC quirk for Rotel RSX-1058
        drm/i915/gvt: fix an error for F_RO flag
        drm/i915/gvt: use pfn_valid for better checking
        drm/i915/gvt: set SFUSE_STRAP properly for vitual monitor detection
        drm/i915/gvt: fix an error for one register
        drm/i915/gvt: add more registers into handlers list
        drm/i915/gvt: have more registers with F_CMD_ACCESS flags set
        drm/i915/gvt: add some new MMIOs to cmd_access white list
        drm/i915/gvt: fix pcode mailbox write emulation of BDW
        ...
      7c7fba98
    • Merge branch 'prep-for-5level' · baeedc71
      Linus Torvalds authored
      Merge 5-level page table prep from Kirill Shutemov:
       "Here's a relatively low-risk part of the 5-level paging patchset.
        Merging it now will make x86 5-level paging enabling in v4.12 easier.
      
        The first patch is actually x86-specific: detect 5-level paging
        support. It boils down to a single define.
      
        The rest of the patchset converts the Linux MMU abstraction from 4- to
        5-level paging.
      
        Enabling the new abstraction in most cases requires adding a single
        line of code in arch-specific code. The rest is taken care of by
        asm-generic/.
      
        Changes to mm/ code are mostly mechanical: add support for new page
        table level -- p4d_t -- where we deal with pud_t now.
      
        v2:
         - fix build on microblaze (Michal);
         - comment for __ARCH_HAS_5LEVEL_HACK in kasan_populate_zero_shadow();
         - acks from Michal"
      
      * emailed patches from Kirill A Shutemov <kirill.shutemov@linux.intel.com>:
        mm: introduce __p4d_alloc()
        mm: convert generic code to 5-level paging
        asm-generic: introduce <asm-generic/pgtable-nop4d.h>
        arch, mm: convert all architectures to use 5level-fixup.h
        asm-generic: introduce __ARCH_USE_5LEVEL_HACK
        asm-generic: introduce 5level-fixup.h
        x86/cpufeature: Add 5-level paging detection
      baeedc71
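
The extra p4d level described in the merge above can be pictured as one more index field carved out of the virtual address. The sketch below assumes an x86-64-style layout (4 KiB pages, 9 index bits per level); the DEMO_* constants, enum demo_level and demo_pt_index() are hypothetical and not part of the kernel's page-table API.

```c
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12 /* assumed 4 KiB pages */
#define DEMO_LEVEL_BITS 9  /* assumed 512 entries per table */

/* Levels from the leaf upward; P4D sits between PUD and PGD. */
enum demo_level {
    DEMO_PTE = 0,
    DEMO_PMD,
    DEMO_PUD,
    DEMO_P4D,
    DEMO_PGD
};

/* Extract the table index that a virtual address uses at a given level. */
static unsigned int demo_pt_index(uint64_t va, enum demo_level level)
{
    unsigned int shift = DEMO_PAGE_SHIFT + DEMO_LEVEL_BITS * (unsigned int)level;

    return (unsigned int)((va >> shift) & ((1u << DEMO_LEVEL_BITS) - 1));
}
```

Under these assumptions the p4d index occupies bits 39 to 47, the range the pgd index covers with 4-level paging, and the pgd index moves up to bits 48 to 56.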
    • Merge branch 'akpm' (patches from Andrew) · 8fe3ccae
      Linus Torvalds authored
      Merge fixes from Andrew Morton:
       "26 fixes"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (26 commits)
        userfaultfd: remove wrong comment from userfaultfd_ctx_get()
        fat: fix using uninitialized fields of fat_inode/fsinfo_inode
        sh: cayman: IDE support fix
        kasan: fix races in quarantine_remove_cache()
        kasan: resched in quarantine_remove_cache()
        mm: do not call mem_cgroup_free() from within mem_cgroup_alloc()
        thp: fix another corner case of munlock() vs. THPs
        rmap: fix NULL-pointer dereference on THP munlocking
        mm/memblock.c: fix memblock_next_valid_pfn()
        userfaultfd: selftest: vm: allow to build in vm/ directory
        userfaultfd: non-cooperative: userfaultfd_remove revalidate vma in MADV_DONTNEED
        userfaultfd: non-cooperative: fix fork fctx->new memleak
        mm/cgroup: avoid panic when init with low memory
        drivers/md/bcache/util.h: remove duplicate inclusion of blkdev.h
        mm/vmstats: add thp_split_pud event for clarity
        include/linux/fs.h: fix unsigned enum warning with gcc-4.2
        userfaultfd: non-cooperative: release all ctx in dup_userfaultfd_complete
        userfaultfd: non-cooperative: robustness check
        userfaultfd: non-cooperative: rollback userfaultfd_exit
        x86, mm: unify exit paths in gup_pte_range()
        ...
      8fe3ccae
    • x86/reboot/quirks: Fix typo in ASUS EeeBook X205TA reboot quirk · bba8376a
      Matjaz Hegedic authored
      The reboot quirk for ASUS EeeBook X205TA contains a typo in
      DMI_PRODUCT_NAME, improperly referring to X205TAW instead of
      X205TA, which prevents the quirk from being triggered. The
      model X205TAW already has a reboot quirk of its own.
      
      This fix simply removes the inappropriate final letter W.
      
      Fixes: 90b28ded ("x86/reboot/quirks: Add ASUS EeeBook X205TA reboot quirk")
      Signed-off-by: Matjaz Hegedic <matjaz.hegedic@gmail.com>
      Link: http://lkml.kernel.org/r/1489064417-7445-1-git-send-email-matjaz.hegedic@gmail.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      bba8376a
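
Why a single extra letter disables the quirk can be shown with a tiny matcher. The sketch assumes that the DMI field comparison is a substring test on the firmware-reported string; demo_dmi_match() is a hypothetical stand-in, not the kernel's DMI API.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical stand-in for a DMI field match, assumed to succeed when the
 * quirk's pattern occurs anywhere in the reported product name.
 */
static bool demo_dmi_match(const char *product_name, const char *pattern)
{
    return strstr(product_name, pattern) != NULL;
}
```

Under that assumption, the pattern "X205TAW" can never be found inside the product name "X205TA", so the quirk never fires on that model, while the corrected pattern "X205TA" matches.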
    • Merge tag 'xfs-4.11-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · 9db61d6f
      Linus Torvalds authored
      Pull xfs fixes from Darrick Wong:
       "Here are some bug fixes for -rc2 to clean up the copy on write
        handling and to remove a cause of hangs.
      
         - Fix various iomap bugs
      
         - Fix overly aggressive CoW preallocation garbage collection
      
         - Fixes to CoW endio error handling
      
         - Fix some incorrect geometry calculations
      
         - Remove a potential system hang in bulkstat
      
         - Try to allocate blocks more aggressively to reduce ENOSPC errors"
      
      * tag 'xfs-4.11-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
        xfs: try any AG when allocating the first btree block when reflinking
        xfs: use iomap new flag for newly allocated delalloc blocks
        xfs: remove kmem_zalloc_greedy
        xfs: Use xfs_icluster_size_fsb() to calculate inode alignment mask
        xfs: fix and streamline error handling in xfs_end_io
        xfs: only reclaim unwritten COW extents periodically
        iomap: invalidate page caches should be after iomap_dio_complete() in direct write
      9db61d6f
    • Merge tag 'gcc-plugins-v4.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · 794fe789
      Linus Torvalds authored
      Pull gcc-plugins fix from Kees Cook:
       "Fixes a typo in sancov plugin, exposed in earlier compiler versions"
      
      * tag 'gcc-plugins-v4.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
        gcc-plugins: fix sancov_plugin for gcc-5
      794fe789
    • drm: mxsfb: Implement drm_panel handling · 3f81e134
      Fabio Estevam authored
      Currently when the 'power-supply' regulator is passed via device tree
      it does not actually work since drm_panel_prepare()/drm_panel_enable()
      are never called.
      
      Quoting Thierry Reding: "It should really call drm_panel_prepare() and
      drm_panel_enable() while switching on the display pipeline and
      drm_panel_disable(), followed by drm_panel_unprepare() while switching
      off the display pipeline."
      
      So do as suggested, so that the 'power-supply' regulator can be functional.
      Reported-by: Breno Lima <breno.lima@nxp.com>
      Suggested-by: Thierry Reding <thierry.reding@gmail.com>
      Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
      Tested-by: Marek Vasut <marex@denx.de>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      3f81e134
    • drm: mxsfb_crtc: Fix the framebuffer misplacement · d42986b6
      Fabio Estevam authored
      Currently the framebuffer content is displayed with incorrect offsets
      in both the vertical and horizontal directions.
      
      The fbdev version of the driver does not show this problem. Breno Lima
      dumped the eLCDIF controller registers on both the drm and fbdev drivers
      and noticed that the VDCTRL3 register is configured incorrectly in the
      drm driver.
      
      The fbdev driver calculates the vertical and horizontal wait counts
      of the VDCTRL3 register by doing: back porch + sync length.
      
      Looking at the horizontal and vertical timing diagram from
      include/drm/drm_modes.h this value corresponds to:
      
      crtc_[hv]total - crtc_[hv]sync_start
      
      So fix the VDCTRL3 register setting accordingly so that the eLCDIF
      controller can properly show the framebuffer content in the correct
      position.
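
      The equivalence claimed above can be checked with a small standalone
      sketch.  The struct below is a hypothetical stand-in for the crtc_*
      timing fields of a DRM display mode and is not the driver's code; it
      only shows that "back porch + sync length" and "total - sync_start"
      are the same quantity.

```c
#include <assert.h>

/* Hypothetical stand-in for the crtc_[hv]* timing fields of a display mode. */
struct mode_timings {
	unsigned int display;    /* active pixels or lines */
	unsigned int sync_start; /* display + front porch */
	unsigned int sync_end;   /* sync_start + sync length */
	unsigned int total;      /* sync_end + back porch */
};

/* Wait count as the fbdev driver is described to compute it. */
static unsigned int wait_cnt_fbdev(const struct mode_timings *t)
{
	unsigned int sync_len = t->sync_end - t->sync_start;
	unsigned int back_porch = t->total - t->sync_end;

	return back_porch + sync_len;
}

/* The same quantity written as crtc_[hv]total - crtc_[hv]sync_start. */
static unsigned int wait_cnt_drm(const struct mode_timings *t)
{
	assert(t->total >= t->sync_start);
	return t->total - t->sync_start;
}
```

      For an illustrative horizontal timing of 800/840/968/1056, both
      helpers return the same wait count.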
      Reported-by: Breno Lima <breno.lima@nxp.com>
      Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
      Tested-by: Breno Lima <breno.lima@nxp.com>
      Tested-by: Marek Vasut <marex@denx.de>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      d42986b6
    • Marek Vasut's avatar
      drm: mxsfb: Fix crash when provided invalid DT bindings · 7ad7a5ac
      Marek Vasut authored
      The mxsfb driver will crash if the mxsfb DT node has a subnode,
      but the content of the subnode is not of-graph binding with an
      endpoint linking to panel. The crash was triggered by providing
      old-style panel bindings to the mxsfb driver instead of the new
      of-graph ones.
      
      The problem happens in mxsfb_create_output(), which is invoked
      from mxsfb_load(). The mxsfb_create_output() iterates over all
      mxsfb DT subnode endpoints and tries to bind a panel on each
      endpoint. If there is any problem binding the panel, that is,
      mxsfb->panel == NULL, this function will return an error code,
      otherwise success 0 is returned.
      
      If the subnodes do not specify of-graph binding with an endpoint,
      the iteration over endpoints in mxsfb_create_output() will have
      zero cycles and the function will immediately return 0, but the
      mxsfb->panel will remain NULL. This is propagated back into the
      mxsfb_load(), which does not detect any problem and expects that
      the mxsfb->panel is valid, thus calls mxsfb_panel_attach(). But
      since mxsfb->panel == NULL, mxsfb_panel_attach() is called with
      first argument NULL and this crashes the kernel.
      
      This patch fixes the problem by explicitly checking for valid
      mxsfb->panel at the end of the iteration in mxsfb_create_output().
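
      The failure mode and the fix can be modelled in userspace.  This is a
      simplified sketch with hypothetical types, not the driver code, and
      the -ENODEV error code is an assumption chosen for illustration: a
      loop with zero iterations leaves the panel pointer NULL, so the
      function has to check the pointer after the loop instead of
      returning the untouched success value.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the panel and the driver state. */
struct panel { int id; };
struct dev_state { struct panel *panel; };

/*
 * Model of the endpoint iteration: bind the panel found on each endpoint
 * and stop at the first failure.  With n == 0 the loop body never runs.
 */
static int create_output(struct dev_state *dev, struct panel **endpoints,
			 size_t n)
{
	int ret = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		dev->panel = endpoints[i];
		ret = dev->panel ? 0 : -ENODEV;
		if (ret)
			return ret;
	}

	/* The fix: zero iterations must not be reported as success. */
	if (!dev->panel)
		return -ENODEV;

	return ret;
}
```

      With this check in place the caller sees an error for an empty
      endpoint list and never reaches the attach step with a NULL panel.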
      Signed-off-by: Marek Vasut <marex@denx.de>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Dave Airlie <airlied@redhat.com>
      Cc: Stefan Agner <stefan@agner.ch>
      Cc: Breno Matheus Lima <brenomatheus@gmail.com>
      Tested-by: Breno Lima <breno.lima@nxp.com>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      7ad7a5ac
    • Stefan Agner's avatar
      drm: mxsfb: fix pixel clock polarity · 53990e41
      Stefan Agner authored
      The DRM subsystem specifies the pixel clock polarity from a
      controller's perspective: DRM_BUS_FLAG_PIXDATA_NEGEDGE means
      the controller drives the data on the pixel clock's falling edge.
      That is the controller's DOTCLK_POL=0 (Default is data launched
      at negative edge).
      
      Also change the data enable logic to be high active by default
      and only change if explicitly requested via bus_flags. With
      that, the defaults are:
      - Data enable: high active
      - Pixel clock polarity: controller drives data on negative edge
      Signed-off-by: Stefan Agner <stefan@agner.ch>
      Acked-by: Marek Vasut <marex@denx.de>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      53990e41
    • Stefan Agner's avatar
      drm: mxsfb: use bus_format to determine LCD bus width · 10f2889b
      Stefan Agner authored
      The LCD bus width does not need to align with the pixel format. The
      LCDIF controller automatically converts between pixel formats and
      bus width by padding or dropping LSBs.
      
      The DRM subsystem has the notion of bus_format, which makes it
      possible to determine which bus formats the display supports. Choose
      the first available one, or fall back to 24 bit if none are available.
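
      The selection rule can be sketched as follows.  The enum values and
      helper names are hypothetical and exist only for this illustration;
      the real driver works with the kernel's own bus format codes.

```c
#include <stddef.h>

/* Hypothetical bus format codes used only in this sketch. */
enum bus_fmt {
	FMT_RGB565 = 1,
	FMT_RGB666,
	FMT_RGB888,
};

/* Take the first format the display advertises, else assume 24 bit. */
static enum bus_fmt pick_bus_format(const enum bus_fmt *formats, size_t n)
{
	if (formats && n > 0)
		return formats[0];

	return FMT_RGB888;
}

/* Map a format to the LCD bus width in bits. */
static unsigned int bus_width_bits(enum bus_fmt fmt)
{
	switch (fmt) {
	case FMT_RGB565:
		return 16;
	case FMT_RGB666:
		return 18;
	case FMT_RGB888:
	default:
		return 24;
	}
}
```

      A display that lists an 18-bit format first gets an 18-bit bus, and a
      display that lists nothing gets the 24-bit fallback, independent of
      the framebuffer pixel format.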
      Signed-off-by: Stefan Agner <stefan@agner.ch>
      Acked-by: Marek Vasut <marex@denx.de>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      10f2889b
    • Dave Airlie's avatar
      Merge branch 'drm-fixes-4.11' of git://people.freedesktop.org/~agd5f/linux into drm-fixes · 9813527a
      Dave Airlie authored
      * 'drm-fixes-4.11' of git://people.freedesktop.org/~agd5f/linux:
        drm/amdgpu: bump driver version for some new features
        drm/amdgpu: validate paramaters in the gem ioctl
        drm/amd/amdgpu: fix console deadlock if late init failed
      9813527a
    • Dave Airlie's avatar
      Merge tag 'drm-intel-fixes-2017-03-09' of... · 31aec642
      Dave Airlie authored
      Merge tag 'drm-intel-fixes-2017-03-09' of git://anongit.freedesktop.org/git/drm-intel into drm-fixes
      
      flushing out gvt-g fixes
      
      * tag 'drm-intel-fixes-2017-03-09' of git://anongit.freedesktop.org/git/drm-intel: (29 commits)
        drm/i915/gvt: change some gvt_err to gvt_dbg_cmd
        drm/i915/gvt: protect RO and Rsvd bits of virtual vgpu configuration space
        drm/i915/gvt: handle workload lifecycle properly
        drm/i915/gvt: fix an error for F_RO flag
        drm/i915/gvt: use pfn_valid for better checking
        drm/i915/gvt: set SFUSE_STRAP properly for vitual monitor detection
        drm/i915/gvt: fix an error for one register
        drm/i915/gvt: add more registers into handlers list
        drm/i915/gvt: have more registers with F_CMD_ACCESS flags set
        drm/i915/gvt: add some new MMIOs to cmd_access white list
        drm/i915/gvt: fix pcode mailbox write emulation of BDW
        drm/i915/gvt: add resolution definition for vGPU type
        drm/i915/gvt: Add more edid definition support
        drm/i915/gvt: adjust to fixed vGPU types
        drm/i915/gvt: remove unnecessary error msg from gtt write
        drm/i915/gvt: refine pcode write emulation
        drm/i915/gvt: clear the vGPU reset logic
        drm/i915/gvt: decrease priority of output msg for untracked mmio
        drm/i915/gvt: set default value to 0 for unhandled mmio regs
        drm/i915/gvt: add cmd_access to GEN7_HALF_SLICE_CHICKEN1
        ...
      31aec642
    • Dave Airlie's avatar
      Merge tag 'drm-misc-fixes-2017-03-06' of git://anongit.freedesktop.org/git/drm-misc into drm-fixes · aa717ae1
      Dave Airlie authored
      Just 1 8bpc quirk from Ville, cc: stable
      
      * tag 'drm-misc-fixes-2017-03-06' of git://anongit.freedesktop.org/git/drm-misc:
        drm/edid: Add EDID_QUIRK_FORCE_8BPC quirk for Rotel RSX-1058
      aa717ae1
    • David Hildenbrand's avatar
      userfaultfd: remove wrong comment from userfaultfd_ctx_get() · 2378cd61
      David Hildenbrand authored
      It's a void function, so there is no return value.
      
      Link: http://lkml.kernel.org/r/20170309150817.7510-1-david@redhat.com
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      2378cd61
    • OGAWA Hirofumi's avatar
      fat: fix using uninitialized fields of fat_inode/fsinfo_inode · c0d0e351
      OGAWA Hirofumi authored
      The recently merged fallocate patch uses
      MSDOS_I(inode)->mmu_private in fat_evict_inode().  However, the
      fat_inode/fsinfo_inode inodes that were introduced earlier did not
      initialize MSDOS_I(inode) properly.
      
      In combination, this caused accesses to random entries in the FAT
      area.
      
      Link: http://lkml.kernel.org/r/87pohrj4i8.fsf@mail.parknet.co.jp
      Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
      Reported-by: Moreno Bartalucci <moreno.bartalucci@tecnorama.it>
      Tested-by: Moreno Bartalucci <moreno.bartalucci@tecnorama.it>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Bartlomiej Zolnierkiewicz's avatar
      sh: cayman: IDE support fix · ca5b58ea
      Bartlomiej Zolnierkiewicz authored
      Remove incorrect CONFIG_IDE ifdef (CONFIG_IDE config option is for
      internal drivers/ide/ use) and make IDE hardware interface always
      initialized (not only when IDE subsystem is built-in).
      
      This patch allows Cayman board to work with modular IDE subsystem
      support and removes the requirement of having the whole core IDE
      subsystem built-in when using libata PATA support.
      
      Link: http://lkml.kernel.org/r/1990884.yFoE6lSB9G@amdc3058
      Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Rich Felker <dalias@libc.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      ca5b58ea
    • Dmitry Vyukov's avatar
      kasan: fix races in quarantine_remove_cache() · ce5bec54
      Dmitry Vyukov authored
      quarantine_remove_cache() frees all pending objects that belong to the
      cache before we destroy the cache itself.  However, there are currently
      two ways in which it can fail to do so.
      
      First, another thread can hold some of the objects from the cache in a
      temporary list in quarantine_put().  quarantine_put() has a window of
      enabled interrupts, and on_each_cpu() in quarantine_remove_cache() can
      finish right in that window.  These objects will be later freed into the
      destroyed cache.
      
      Then, quarantine_reduce() has the same problem.  It grabs a batch of
      objects from the global quarantine, then unlocks quarantine_lock and
      then frees the batch.  quarantine_remove_cache() can finish while some
      objects from the cache are still in the local to_free list in
      quarantine_reduce().
      
      Fix the race with quarantine_put() by disabling interrupts for the whole
      duration of quarantine_put().  In combination with on_each_cpu() in
      quarantine_remove_cache() it ensures that quarantine_remove_cache()
      either sees the objects in the per-cpu list or in the global list.
      
      Fix the race with quarantine_reduce() by protecting quarantine_reduce()
      with srcu critical section and then doing synchronize_srcu() at the end
      of quarantine_remove_cache().
      
      I have assessed how well synchronize_srcu() works in this case.  On a
      4-CPU VM it blocks waiting for pending read-side critical sections in
      about 2-3% of cases, which looks good to me.
      
      I suspect that these races are the root cause of some GPFs that I
      episodically hit.  Previously I did not have any explanation for them.
      
        BUG: unable to handle kernel NULL pointer dereference at 00000000000000c8
        IP: qlist_free_all+0x2e/0xc0 mm/kasan/quarantine.c:155
        PGD 6aeea067
        PUD 60ed7067
        PMD 0
        Oops: 0000 [#1] SMP KASAN
        Dumping ftrace buffer:
           (ftrace buffer empty)
        Modules linked in:
        CPU: 0 PID: 13667 Comm: syz-executor2 Not tainted 4.10.0+ #60
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
        task: ffff88005f948040 task.stack: ffff880069818000
        RIP: 0010:qlist_free_all+0x2e/0xc0 mm/kasan/quarantine.c:155
        RSP: 0018:ffff88006981f298 EFLAGS: 00010246
        RAX: ffffea0000ffff00 RBX: 0000000000000000 RCX: ffffea0000ffff1f
        RDX: 0000000000000000 RSI: ffff88003fffc3e0 RDI: 0000000000000000
        RBP: ffff88006981f2c0 R08: ffff88002fed7bd8 R09: 00000001001f000d
        R10: 00000000001f000d R11: ffff88006981f000 R12: ffff88003fffc3e0
        R13: ffff88006981f2d0 R14: ffffffff81877fae R15: 0000000080000000
        FS:  00007fb911a2d700(0000) GS:ffff88003ec00000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 00000000000000c8 CR3: 0000000060ed6000 CR4: 00000000000006f0
        Call Trace:
         quarantine_reduce+0x10e/0x120 mm/kasan/quarantine.c:239
         kasan_kmalloc+0xca/0xe0 mm/kasan/kasan.c:590
         kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:544
         slab_post_alloc_hook mm/slab.h:456 [inline]
         slab_alloc_node mm/slub.c:2718 [inline]
         kmem_cache_alloc_node+0x1d3/0x280 mm/slub.c:2754
         __alloc_skb+0x10f/0x770 net/core/skbuff.c:219
         alloc_skb include/linux/skbuff.h:932 [inline]
         _sctp_make_chunk+0x3b/0x260 net/sctp/sm_make_chunk.c:1388
         sctp_make_data net/sctp/sm_make_chunk.c:1420 [inline]
         sctp_make_datafrag_empty+0x208/0x360 net/sctp/sm_make_chunk.c:746
         sctp_datamsg_from_user+0x7e8/0x11d0 net/sctp/chunk.c:266
         sctp_sendmsg+0x2611/0x3970 net/sctp/socket.c:1962
         inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:761
         sock_sendmsg_nosec net/socket.c:633 [inline]
         sock_sendmsg+0xca/0x110 net/socket.c:643
         SYSC_sendto+0x660/0x810 net/socket.c:1685
         SyS_sendto+0x40/0x50 net/socket.c:1653
      
      I am not sure about backporting.  The bug is quite hard to trigger; I've
      seen it a few times during our massive continuous testing (however, it
      could be the cause of some other episodic stray crashes, as it leads to
      memory corruption...).  If it is triggered, the consequences are very
      bad -- almost certainly severe memory corruption.  The fix is non-trivial
      and has chances of introducing new bugs.  I am also not sure how
      actively people use KASAN on older releases.
      
      [dvyukov@google.com: - sorted includes]
        Link: http://lkml.kernel.org/r/20170309094028.51088-1-dvyukov@google.com
      Link: http://lkml.kernel.org/r/20170308151532.5070-1-dvyukov@google.com
      Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
      Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Greg Thelen <gthelen@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      ce5bec54
    • Dmitry Vyukov's avatar
      kasan: resched in quarantine_remove_cache() · 68fd814a
      Dmitry Vyukov authored
      We see reported stalls/lockups in quarantine_remove_cache() on machines
      with large amounts of RAM.  quarantine_remove_cache() needs to scan
      the whole quarantine in order to take out all objects belonging to the
      cache.  The quarantine is currently 1/32 of RAM, so on a machine with
      256GB of memory it will be 8GB.  Moreover, quarantine scanning is a
      walk over an uncached linked list, which is slow.
      
      Add cond_resched() after scanning of each non-empty batch of objects.
      Batches are deliberately kept to a reasonable size for quarantine_put().
      On a machine with 256GB of RAM we should have ~512 non-empty batches,
      each with 16MB of objects.
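
      The sizing figures above follow from simple arithmetic, sketched
      here with hypothetical helper names.  The 1/32 fraction and the 16MB
      batch size are taken from this message, not read from the kernel
      source.

```c
#include <stdint.h>

/* The quarantine is described above as 1/32 of RAM. */
#define QUARANTINE_FRACTION 32ULL

/* Total quarantine size for a machine with ram_bytes of memory. */
static uint64_t quarantine_bytes(uint64_t ram_bytes)
{
	return ram_bytes / QUARANTINE_FRACTION;
}

/* Number of full batches, i.e. an upper bound on cond_resched() calls. */
static uint64_t batch_count(uint64_t quarantine, uint64_t batch_bytes)
{
	return quarantine / batch_bytes;
}
```

      For 256GB of RAM this gives an 8GB quarantine and 512 batches of
      16MB, so the scan yields roughly once per 16MB walked.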
      
      Link: http://lkml.kernel.org/r/20170308154239.25440-1-dvyukov@google.com
      Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
      Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      68fd814a
    • Tahsin Erdogan's avatar
      mm: do not call mem_cgroup_free() from within mem_cgroup_alloc() · 40e952f9
      Tahsin Erdogan authored
      mem_cgroup_free() indirectly calls wb_domain_exit() which is not
      prepared to deal with a struct wb_domain object that hasn't executed
      wb_domain_init().  For instance, the following warning message is
      printed by lockdep if alloc_percpu() fails in mem_cgroup_alloc():
      
        INFO: trying to register non-static key.
        the code is fine but needs lockdep annotation.
        turning off the locking correctness validator.
        CPU: 1 PID: 1950 Comm: mkdir Not tainted 4.10.0+ #151
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
        Call Trace:
         dump_stack+0x67/0x99
         register_lock_class+0x36d/0x540
         __lock_acquire+0x7f/0x1a30
         lock_acquire+0xcc/0x200
         del_timer_sync+0x3c/0xc0
         wb_domain_exit+0x14/0x20
         mem_cgroup_free+0x14/0x40
         mem_cgroup_css_alloc+0x3f9/0x620
         cgroup_apply_control_enable+0x190/0x390
         cgroup_mkdir+0x290/0x3d0
         kernfs_iop_mkdir+0x58/0x80
         vfs_mkdir+0x10e/0x1a0
         SyS_mkdirat+0xa8/0xd0
         SyS_mkdir+0x14/0x20
         entry_SYSCALL_64_fastpath+0x18/0xad
      
      Add __mem_cgroup_free(), which skips wb_domain_exit().  It is used by
      both mem_cgroup_free() and the mem_cgroup_alloc() cleanup path.
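
      The shape of the fix can be illustrated with a userspace model.  All
      names below are hypothetical stand-ins, not the memcontrol code: a
      raw free helper releases memory only, the full free runs the domain
      teardown first, and the allocation error path uses the raw helper
      because the domain was never initialized.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the object being allocated. */
struct memcg {
	int *stat;        /* models the per-CPU allocation */
	int domain_ready; /* set once the domain init analogue has run */
};

static int domain_exit_calls;

/* Models the domain teardown: only valid on an initialized domain. */
static void domain_exit(struct memcg *m)
{
	assert(m->domain_ready);
	m->domain_ready = 0;
	domain_exit_calls++;
}

/* Releases memory only; safe on a partially constructed object. */
static void memcg_free_raw(struct memcg *m)
{
	free(m->stat);
	free(m);
}

/* Full teardown for objects that completed allocation. */
static void memcg_free(struct memcg *m)
{
	domain_exit(m);
	memcg_free_raw(m);
}

/* fail_stat simulates the allocation failure described above. */
static struct memcg *memcg_alloc(int fail_stat)
{
	struct memcg *m = calloc(1, sizeof(*m));

	if (!m)
		return NULL;
	m->stat = fail_stat ? NULL : calloc(4, sizeof(*m->stat));
	if (!m->stat)
		goto fail;
	m->domain_ready = 1; /* models the domain init step */
	return m;
fail:
	memcg_free_raw(m); /* not memcg_free(): the domain was never set up */
	return NULL;
}
```

      Calling memcg_alloc(1) returns NULL without ever reaching the domain
      teardown, which is the behaviour the patch aims for.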
      
      Fixes: 0b8f73e1 ("mm: memcontrol: clean up alloc, online, offline, free functions")
      Link: http://lkml.kernel.org/r/20170306192122.24262-1-tahsin@google.com
      Signed-off-by: Tahsin Erdogan <tahsin@google.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      40e952f9
    • Kirill A. Shutemov's avatar
      thp: fix another corner case of munlock() vs. THPs · 6ebb4a1b
      Kirill A. Shutemov authored
      The following test case triggers BUG() in munlock_vma_pages_range():
      
      	#include <fcntl.h>
      	#include <stdlib.h>
      	#include <unistd.h>
      	#include <sys/mman.h>
      
      	int main(int argc, char *argv[])
      	{
      		int fd;
      
      		system("mount -t tmpfs -o huge=always none /mnt");
      		/* O_CREAT requires an explicit mode argument */
      		fd = open("/mnt/test", O_CREAT | O_RDWR, 0600);
      		ftruncate(fd, 4UL << 20);
      		mmap(NULL, 4UL << 20, PROT_READ | PROT_WRITE,
      				MAP_SHARED | MAP_FIXED | MAP_LOCKED, fd, 0);
      		mmap(NULL, 4096, PROT_READ | PROT_WRITE,
      				MAP_SHARED | MAP_LOCKED, fd, 0);
      		munlockall();
      		return 0;
      	}
      
      The second mmap() creates a PTE mapping of the first huge page in the
      file.  This makes the kernel munlock the page, as we never keep a
      PTE-mapped page mlocked.
      
      On munlockall(), when we handle the vma created by the first mmap(),
      munlock_vma_page() returns page_mask == 0, as the page is not mlocked
      anymore.  On the next iteration follow_page_mask() returns a tail page,
      but page_mask is HPAGE_NR_PAGES - 1.  This makes us skip to the first
      tail page of the next huge page and step on
      VM_BUG_ON_PAGE(PageMlocked(page)).
      
      The fix is to not use the page_mask from follow_page_mask() at all.  It
      has no use for us.
      
      Link: http://lkml.kernel.org/r/20170302150252.34120-1-kirill.shutemov@linux.intel.com
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: <stable@vger.kernel.org>    [4.5+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      6ebb4a1b
    • Kirill A. Shutemov's avatar
      rmap: fix NULL-pointer dereference on THP munlocking · 8346242a
      Kirill A. Shutemov authored
      The following test case triggers a NULL-pointer dereference in
      try_to_unmap_one():
      
      	#include <fcntl.h>
      	#include <stdlib.h>
      	#include <unistd.h>
      	#include <sys/mman.h>
      
      	int main(int argc, char *argv[])
      	{
      		int fd;
      
      		system("mount -t tmpfs -o huge=always none /mnt");
      		fd = open("/mnt/test", O_CREAT | O_RDWR, 0600);
      		ftruncate(fd, 2UL << 20);
      		mmap(NULL, 2UL << 20, PROT_READ | PROT_WRITE,
      				MAP_SHARED | MAP_FIXED | MAP_LOCKED, fd, 0);
      		mmap(NULL, 2UL << 20, PROT_READ | PROT_WRITE,
      				MAP_SHARED | MAP_LOCKED, fd, 0);
      		munlockall();
      		return 0;
      	}
      
      Apparently, there's a case when we call try_to_unmap() on huge PMDs:
      it's TTU_MUNLOCK.
      
      Let's handle this case correctly.
      
      Fixes: c7ab0d2f ("mm: convert try_to_unmap_one() to use page_vma_mapped_walk()")
      Link: http://lkml.kernel.org/r/20170302151159.30592-1-kirill.shutemov@linux.intel.com
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      8346242a