1. 04 Aug, 2017 3 commits
    • Linus Torvalds's avatar
      Merge tag 'powerpc-4.13-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 841fe953
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
       "Fixes for recently merged code:
         - a fix for the _PAGE_DEVMAP support, which was breaking KVM on
           Power9 radix
         - avoid a (harmless) lockdep warning in the early SMP code
         - return failure for some uses of dma_set_mask() rather than falling
           back to 32-bits
         - fix stack setup in watchdog soft_nmi_common() to use emergency
           stack
         - fix of_irq_to_resource() error check in of_fsl_spi_probe()
      
        Two fixes going to stable:
         - fix saving of Transactional Memory SPRs in core dump
         - fix __check_irq_replay missing decrementer interrupt
      
        And two misc:
         - fix 64-bit boot wrapper build with non-biarch compiler
         - work around a POWER9 PMU hang after state-loss idle
      
        Thanks to: Alistair Popple, Aneesh Kumar K.V, Cyril Bur, Gustavo
        Romero, Jose Ricardo Ziviani, Laurent Vivier, Nicholas Piggin, Oliver
        O'Halloran, Sergei Shtylyov, Suraj Jitindar Singh, Thomas Gleixner"
      
      * tag 'powerpc-4.13-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/64: Fix __check_irq_replay missing decrementer interrupt
        powerpc/perf: POWER9 PMU stops after idle workaround
        powerpc/83xx/mpc832x_rdb: fix of_irq_to_resource() error check
        powerpc/64s: Fix stack setup in watchdog soft_nmi_common()
        powerpc/powernv/pci: Return failure for some uses of dma_set_mask()
        powerpc/boot: Fix 64-bit boot wrapper build with non-biarch compiler
        powerpc/smp: Call smp_ops->setup_cpu() directly on the boot CPU
        powerpc/tm: Fix saving of TM SPRs in core dump
        powerpc/mm: Fix pmd/pte_devmap() on non-leaf entries
      841fe953
    • Nicholas Piggin's avatar
      powerpc/64: Fix __check_irq_replay missing decrementer interrupt · 3db40c31
      Nicholas Piggin authored
      If the decrementer wraps again and de-asserts the decrementer
      exception while hard-disabled, __check_irq_replay() has a test to
      notice the wrap when interrupts are re-enabled.
      
      The decrementer check must be done when clearing the PACA_IRQ_HARD_DIS
      flag, not when the PACA_IRQ_DEC flag is tested. Previously this worked
      because the decrementer interrupt was always the first one checked
      after clearing the hard disable flag, but HMI check was moved ahead of
      that, which introduced this bug.
      
      This can cause a missed decrementer interrupt if we soft-disable
      interrupts then take an HMI which is recorded in irq_happened, then
      hard-disable interrupts for > 4s to wrap the decrementer.
      
      Fixes: e0e0d6b7 ("powerpc/64: Replay hypervisor maintenance interrupt first")
      Cc: stable@vger.kernel.org # v4.9+
      Signed-off-by: default avatarNicholas Piggin <npiggin@gmail.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      3db40c31
    • Nicholas Piggin's avatar
      powerpc/perf: POWER9 PMU stops after idle workaround · 09539f9b
      Nicholas Piggin authored
      POWER9 DD2 PMU can stop after a state-loss idle in some conditions.
      
      A solution is to set then clear MMCRA[60] after wake from state-loss
      idle. MMCRA[60] is a non-architected bit, see the user manual for
      details.
      Signed-off-by: default avatarNicholas Piggin <npiggin@gmail.com>
      Acked-by: default avatarMadhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Reviewed-by: default avatarVaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
      Acked-by: default avatarAnton Blanchard <anton@samba.org>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      09539f9b
  2. 03 Aug, 2017 17 commits
    • Linus Torvalds's avatar
      Merge tag 'vfio-v4.13-rc4' of git://github.com/awilliam/linux-vfio · 869c058f
      Linus Torvalds authored
      Pull VFIO fixes from Alex Williamson:
      
       - SPAPR/EEH config build fix (Murilo Opsfelder Araujo)
      
       - Fix possible device lock deadlock (Alex Williamson)
      
       - Correctly size integrated endpoint PCIe capabilities (Alex
         Williamson)
      
      * tag 'vfio-v4.13-rc4' of git://github.com/awilliam/linux-vfio:
        vfio/pci: Fix handling of RC integrated endpoint PCIe capability size
        vfio/pci: Use pci_try_reset_function() on initial open
        include/linux/vfio.h: Guard powerpc-specific functions with CONFIG_VFIO_SPAPR_EEH
      869c058f
    • Linus Torvalds's avatar
      Merge branch 'akpm' (patches from Andrew) · 995d03ae
      Linus Torvalds authored
      Merge misc fixes from Andrew Morton:
       "15 fixes"
      
      [ This does not merge the "fortify: use WARN instead of BUG for now"
        patch, which needs a bit of extra work to build cleanly with all
        configurations. Arnd is on it.   - Linus ]
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        ocfs2: don't clear SGID when inheriting ACLs
        mm: allow page_cache_get_speculative in interrupt context
        userfaultfd: non-cooperative: flush event_wqh at release time
        ipc: add missing container_of()s for randstruct
        cpuset: fix a deadlock due to incomplete patching of cpusets_enabled()
        userfaultfd_zeropage: return -ENOSPC in case mm has gone
        mm: take memory hotplug lock within numa_zonelist_order_handler()
        mm/page_io.c: fix oops during block io poll in swapin path
        zram: do not free pool->size_class
        kthread: fix documentation build warning
        kasan: avoid -Wmaybe-uninitialized warning
        userfaultfd: non-cooperative: notify about unmap of destination during mremap
        mm, mprotect: flush TLB if potentially racing with a parallel reclaim leaving stale TLB entries
        pid: kill pidhash_size in pidhash_init()
        mm/hugetlb.c: __get_user_pages ignores certain follow_hugetlb_page errors
      995d03ae
    • Linus Torvalds's avatar
      Merge tag 'acpi-4.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 8d3fe85f
      Linus Torvalds authored
      Pull ACPI fixes from Rafael Wysocki:
       "These fix two issues in the ACPI SoC drivers (Intel LPSS and AMD APD),
        a crash in the PCC mailbox initialization code and a WDAT watchdog
        initialization failure.
      
        Specifics:
      
         - Fix a device ID of Hisilicon Hip07/08 in the ACPI APD (AMD SoC)
           driver (Hanjun Guo).
      
         - Fix list corruption (introduced during the 4.11 cycle) in the ACPI
           LPSS (Intel SoC) driver (Hans de Goede).
      
         - Fix PCC mailbox handling code crash during initialization when PCCT
           is not present and PCC channel 0 is requested (Hoan Tran).
      
         - Fix a WDAT watchdog initialization issue causing platform device
           creation to fail due to partially overlapping address ranges in
           resources (Ryan Kennedy)"
      
      * tag 'acpi-4.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        ACPI: APD: Fix HID for Hisilicon Hip07/08
        mailbox: pcc: Fix crash when request PCC channel 0
        ACPI / watchdog: Fix init failure with overlapping register regions
        ACPI / LPSS: Only call pwm_add_table() for the first PWM controller
      8d3fe85f
    • Linus Torvalds's avatar
      Merge tag 'pm-4.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 73784fb7
      Linus Torvalds authored
      Pull power management fixes from Rafael Wysocki:
       "These fix two cpufreq issues, one introduced recently and one related
        to recent changes, fix cpufreq documentation, fix up recently added
        code in the Thunderbolt driver and update runtime PM framework
        documentation.
      
        Specifics:
      
         - Fix the handling of the scaling_cur_freq cpufreq policy attribute
           on x86 systems with the MPERF/APERF registers present to make it
           behave more as expected after recent changes (Rafael Wysocki).
      
         - Drop a leftover callback from the intel_pstate driver which also
           prevents the cpuinfo_cur_freq cpufreq policy attribute from being
           incorrectly exposed when intel_pstate works in the active mode
           (Rafael Wysocki).
      
         - Add a missing piece describing the cpuinfo_cur_freq policy
           attribute to cpufreq documentation (Rafael Wysocki).
      
         - Fix up a recently added part of the Thunderbolt driver to avoid
           aborting system suspends if its mailbox commands time out (Rafael
           Wysocki).
      
         - Update device runtime PM framework documentation to reflect the
           current behavior of the code (Johan Hovold)"
      
      * tag 'pm-4.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        thunderbolt: icm: Ignore mailbox errors in icm_suspend()
        cpufreq: x86: Make scaling_cur_freq behave more as expected
        PM / runtime: Document new pm_runtime_set_suspended() constraint
        cpufreq: docs: Add missing cpuinfo_cur_freq description
        cpufreq: intel_pstate: Drop ->get from intel_pstate structure
      73784fb7
    • Rafael J. Wysocki's avatar
      Merge branches 'acpi-soc', 'acpi-wdat' and 'acpi-cppc' · 3de559d4
      Rafael J. Wysocki authored
      * acpi-soc:
        ACPI: APD: Fix HID for Hisilicon Hip07/08
        ACPI / LPSS: Only call pwm_add_table() for the first PWM controller
      
      * acpi-wdat:
        ACPI / watchdog: Fix init failure with overlapping register regions
      
      * acpi-cppc:
        mailbox: pcc: Fix crash when request PCC channel 0
      3de559d4
    • Rafael J. Wysocki's avatar
      Merge branches 'pm-core' and 'pm-misc' · 78aa904a
      Rafael J. Wysocki authored
      * pm-core:
        PM / runtime: Document new pm_runtime_set_suspended() constraint
      
      * pm-misc:
        thunderbolt: icm: Ignore mailbox errors in icm_suspend()
      78aa904a
    • Rafael J. Wysocki's avatar
      Merge branches 'pm-cpufreq-x86', 'pm-cpufreq-docs' and 'intel_pstate' · 8a05c311
      Rafael J. Wysocki authored
      * pm-cpufreq-x86:
        cpufreq: x86: Make scaling_cur_freq behave more as expected
      
      * pm-cpufreq-docs:
        cpufreq: docs: Add missing cpuinfo_cur_freq description
      
      * intel_pstate:
        cpufreq: intel_pstate: Drop ->get from intel_pstate structure
      8a05c311
    • Linus Torvalds's avatar
      Merge tag 'nfs-for-4.13-4' of git://git.linux-nfs.org/projects/anna/linux-nfs · 19ec50a4
      Linus Torvalds authored
      Pull NFS client fixes from Anna Schumaker:
       "Two fixes from Trond this time, now that he's back from his vacation.
        The first is a stable fix for the EXCHANGE_ID issue on the mailing
        list, and the other fixes a double-free situation that he found at the
        same time.
      
        Stable fix:
         - Fix EXCHANGE_ID corrupt verifier issue
      
        Other fix:
         - Fix double frees in nfs4_test_session_trunk()"
      
      * tag 'nfs-for-4.13-4' of git://git.linux-nfs.org/projects/anna/linux-nfs:
        NFSv4: Fix double frees in nfs4_test_session_trunk()
        NFSv4: Fix EXCHANGE_ID corrupt verifier issue
      19ec50a4
    • Annie Cherkaev's avatar
      isdn/i4l: fix buffer overflow · 9f5af546
      Annie Cherkaev authored
      This fixes a potential buffer overflow in isdn_net.c caused by an
      unbounded strcpy.
      
      [ ISDN seems to be effectively unmaintained, and the I4L driver in
        particular is long deprecated, but in case somebody uses this..
          - Linus ]
      Signed-off-by: default avatarJiten Thakkar <jitenmt@gmail.com>
      Signed-off-by: default avatarAnnie Cherkaev <annie.cherk@gmail.com>
      Cc: Karsten Keil <isdn@linux-pingi.de>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: stable@kernel.org
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      9f5af546
    • Jan Kara's avatar
      ocfs2: don't clear SGID when inheriting ACLs · 19ec8e48
      Jan Kara authored
      When new directory 'DIR1' is created in a directory 'DIR0' with SGID bit
      set, DIR1 is expected to have SGID bit set (and owning group equal to
      the owning group of 'DIR0').  However when 'DIR0' also has some default
      ACLs that 'DIR1' inherits, setting these ACLs will result in SGID bit on
      'DIR1' to get cleared if user is not member of the owning group.
      
      Fix the problem by moving posix_acl_update_mode() out of ocfs2_set_acl()
      into ocfs2_iop_set_acl().  That way the function will not be called when
      inheriting ACLs which is what we want as it prevents SGID bit clearing
      and the mode has been properly set by posix_acl_create() anyway.  Also
      posix_acl_chmod() that is calling ocfs2_set_acl() takes care of updating
      mode itself.
      
      Fixes: 07393101 ("posix_acl: Clear SGID bit when setting file permissions")
      Link: http://lkml.kernel.org/r/20170801141252.19675-3-jack@suse.czSigned-off-by: default avatarJan Kara <jack@suse.cz>
      Cc: Mark Fasheh <mfasheh@versity.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Joseph Qi <jiangqi903@gmail.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      19ec8e48
    • Kan Liang's avatar
      mm: allow page_cache_get_speculative in interrupt context · 1ee1c3f5
      Kan Liang authored
      Kernel panic when calling the IRQ-safe __get_user_pages_fast in NMI
      handler.
      
      The bug was introduced by commit 2947ba05 ("x86/mm/gup: Switch GUP
      to the generic get_user_page_fast() implementation").
      
      The original x86 __get_user_page_fast used plain get_page() or
      page_ref_add().  However, the generic __get_user_page_fast uses
      page_cache_get_speculative(), which has VM_BUG_ON(in_interrupt()).
      
      There is no reason to prevent page_cache_get_speculative from using in
      interrupt context.  According to the author, putting a BUG_ON there is
      just because the code is not verifying correctness of interrupt races.
      I did some tests in interrupt context.  There is no issue found.
      
      Removing VM_BUG_ON(in_interrupt()) for page_cache_get_speculative().
      
      Link: http://lkml.kernel.org/r/1501609146-59730-1-git-send-email-kan.liang@intel.com
      Fixes: 2947ba05 ("x86/mm/gup: Switch GUP to the generic get_user_page_fast() implementation")
      Signed-off-by: default avatarKan Liang <kan.liang@intel.com>
      Cc: Jens Axboe <axboe@fb.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Ying Huang <ying.huang@intel.com>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      1ee1c3f5
    • Mike Rapoport's avatar
      userfaultfd: non-cooperative: flush event_wqh at release time · 5a18b64e
      Mike Rapoport authored
      There may still be threads waiting on event_wqh at the time the
      userfault file descriptor is closed.  Flush the events wait-queue to
      prevent waiting threads from hanging.
      
      Link: http://lkml.kernel.org/r/1501398127-30419-1-git-send-email-rppt@linux.vnet.ibm.com
      Fixes: 9cd75c3c ("userfaultfd: non-cooperative: add ability to report
      non-PF events from uffd descriptor")
      Signed-off-by: default avatarMike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
      Cc: Pavel Emelyanov <xemul@virtuozzo.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      5a18b64e
    • Kees Cook's avatar
      ipc: add missing container_of()s for randstruct · ade9f91b
      Kees Cook authored
      When building with the randstruct gcc plugin, the layout of the IPC
      structs will be randomized, which requires any sub-structure accesses to
      use container_of().  The proc display handlers were missing the needed
      container_of()s since the iterator is passing in the top-level struct
      kern_ipc_perm.
      
      This would lead to crashes when running the "lsipc" program after the
      system had IPC registered (e.g. after starting up Gnome):
      
        general protection fault: 0000 [#1] PREEMPT SMP
        ...
        RIP: 0010:shm_add_rss_swap.isra.1+0x13/0xa0
        ...
        Call Trace:
          sysvipc_shm_proc_show+0x5e/0x150
          sysvipc_proc_show+0x1a/0x30
          seq_read+0x2e9/0x3f0
        ...
      
      Link: http://lkml.kernel.org/r/20170730205950.GA55841@beast
      Fixes: 3859a271 ("randstruct: Mark various structs for randomization")
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Reported-by: default avatarDominik Brodowski <linux@dominikbrodowski.net>
      Acked-by: default avatarDavidlohr Bueso <dave@stgolabs.net>
      Acked-by: default avatarManfred Spraul <manfred@colorfullife.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      ade9f91b
    • Dima Zavin's avatar
      cpuset: fix a deadlock due to incomplete patching of cpusets_enabled() · 89affbf5
      Dima Zavin authored
      In codepaths that use the begin/retry interface for reading
      mems_allowed_seq with irqs disabled, there exists a race condition that
      stalls the patch process after only modifying a subset of the
      static_branch call sites.
      
      This problem manifested itself as a deadlock in the slub allocator,
      inside get_any_partial.  The loop reads mems_allowed_seq value (via
      read_mems_allowed_begin), performs the defrag operation, and then
      verifies the consistency of mem_allowed via the read_mems_allowed_retry
      and the cookie returned by xxx_begin.
      
      The issue here is that both begin and retry first check if cpusets are
      enabled via cpusets_enabled() static branch.  This branch can be
      rewritted dynamically (via cpuset_inc) if a new cpuset is created.  The
      x86 jump label code fully synchronizes across all CPUs for every entry
      it rewrites.  If it rewrites only one of the callsites (specifically the
      one in read_mems_allowed_retry) and then waits for the
      smp_call_function(do_sync_core) to complete while a CPU is inside the
      begin/retry section with IRQs off and the mems_allowed value is changed,
      we can hang.
      
      This is because begin() will always return 0 (since it wasn't patched
      yet) while retry() will test the 0 against the actual value of the seq
      counter.
      
      The fix is to use two different static keys: one for begin
      (pre_enable_key) and one for retry (enable_key).  In cpuset_inc(), we
      first bump the pre_enable key to ensure that cpuset_mems_allowed_begin()
      always return a valid seqcount if are enabling cpusets.  Similarly, when
      disabling cpusets via cpuset_dec(), we first ensure that callers of
      cpuset_mems_allowed_retry() will start ignoring the seqcount value
      before we let cpuset_mems_allowed_begin() return 0.
      
      The relevant stack traces of the two stuck threads:
      
        CPU: 1 PID: 1415 Comm: mkdir Tainted: G L  4.9.36-00104-g540c51286237 #4
        Hardware name: Default string Default string/Hardware, BIOS 4.29.1-20170526215256 05/26/2017
        task: ffff8817f9c28000 task.stack: ffffc9000ffa4000
        RIP: smp_call_function_many+0x1f9/0x260
        Call Trace:
          smp_call_function+0x3b/0x70
          on_each_cpu+0x2f/0x90
          text_poke_bp+0x87/0xd0
          arch_jump_label_transform+0x93/0x100
          __jump_label_update+0x77/0x90
          jump_label_update+0xaa/0xc0
          static_key_slow_inc+0x9e/0xb0
          cpuset_css_online+0x70/0x2e0
          online_css+0x2c/0xa0
          cgroup_apply_control_enable+0x27f/0x3d0
          cgroup_mkdir+0x2b7/0x420
          kernfs_iop_mkdir+0x5a/0x80
          vfs_mkdir+0xf6/0x1a0
          SyS_mkdir+0xb7/0xe0
          entry_SYSCALL_64_fastpath+0x18/0xad
      
        ...
      
        CPU: 2 PID: 1 Comm: init Tainted: G L  4.9.36-00104-g540c51286237 #4
        Hardware name: Default string Default string/Hardware, BIOS 4.29.1-20170526215256 05/26/2017
        task: ffff8818087c0000 task.stack: ffffc90000030000
        RIP: int3+0x39/0x70
        Call Trace:
          <#DB> ? ___slab_alloc+0x28b/0x5a0
          <EOE> ? copy_process.part.40+0xf7/0x1de0
          __slab_alloc.isra.80+0x54/0x90
          copy_process.part.40+0xf7/0x1de0
          copy_process.part.40+0xf7/0x1de0
          kmem_cache_alloc_node+0x8a/0x280
          copy_process.part.40+0xf7/0x1de0
          _do_fork+0xe7/0x6c0
          _raw_spin_unlock_irq+0x2d/0x60
          trace_hardirqs_on_caller+0x136/0x1d0
          entry_SYSCALL_64_fastpath+0x5/0xad
          do_syscall_64+0x27/0x350
          SyS_clone+0x19/0x20
          do_syscall_64+0x60/0x350
          entry_SYSCALL64_slow_path+0x25/0x25
      
      Link: http://lkml.kernel.org/r/20170731040113.14197-1-dmitriyz@waymo.com
      Fixes: 46e700ab ("mm, page_alloc: remove unnecessary taking of a seqlock when cpusets are disabled")
      Signed-off-by: default avatarDima Zavin <dmitriyz@waymo.com>
      Reported-by: default avatarCliff Spradlin <cspradlin@waymo.com>
      Acked-by: default avatarVlastimil Babka <vbabka@suse.cz>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Christopher Lameter <cl@linux.com>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      89affbf5
    • Mike Rapoport's avatar
      userfaultfd_zeropage: return -ENOSPC in case mm has gone · 9d95aa4b
      Mike Rapoport authored
      In the non-cooperative userfaultfd case, the process exit may race with
      outstanding mcopy_atomic called by the uffd monitor.  Returning -ENOSPC
      instead of -EINVAL when mm is already gone will allow uffd monitor to
      distinguish this case from other error conditions.
      
      Unfortunately I overlooked userfaultfd_zeropage when updating
      userfaultd_copy().
      
      Link: http://lkml.kernel.org/r/1501136819-21857-1-git-send-email-rppt@linux.vnet.ibm.com
      Fixes: 96333187 ("userfaultfd_copy: return -ENOSPC in case mm has gone")
      Signed-off-by: default avatarMike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
      Cc: Pavel Emelyanov <xemul@virtuozzo.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      9d95aa4b
    • Heiko Carstens's avatar
      mm: take memory hotplug lock within numa_zonelist_order_handler() · 167d0f25
      Heiko Carstens authored
      Andre Wild reported the following warning:
      
        WARNING: CPU: 2 PID: 1205 at kernel/cpu.c:240 lockdep_assert_cpus_held+0x4c/0x60
        Modules linked in:
        CPU: 2 PID: 1205 Comm: bash Not tainted 4.13.0-rc2-00022-gfd2b2c57 #10
        Hardware name: IBM 2964 N96 702 (z/VM 6.4.0)
        task: 00000000701d8100 task.stack: 0000000073594000
        Krnl PSW : 0704f00180000000 0000000000145e24 (lockdep_assert_cpus_held+0x4c/0x60)
        ...
        Call Trace:
         lockdep_assert_cpus_held+0x42/0x60)
         stop_machine_cpuslocked+0x62/0xf0
         build_all_zonelists+0x92/0x150
         numa_zonelist_order_handler+0x102/0x150
         proc_sys_call_handler.isra.12+0xda/0x118
         proc_sys_write+0x34/0x48
         __vfs_write+0x3c/0x178
         vfs_write+0xbc/0x1a0
         SyS_write+0x66/0xc0
         system_call+0xc4/0x2b0
         locks held by bash/1205:
         #0:  (sb_writers#4){.+.+.+}, at: vfs_write+0xa6/0x1a0
         #1:  (zl_order_mutex){+.+...}, at: numa_zonelist_order_handler+0x44/0x150
         #2:  (zonelists_mutex){+.+...}, at: numa_zonelist_order_handler+0xf4/0x150
        Last Breaking-Event-Address:
          lockdep_assert_cpus_held+0x48/0x60
      
      This can be easily triggered with e.g.
      
          echo n > /proc/sys/vm/numa_zonelist_order
      
      In commit 3f906ba2 ("mm/memory-hotplug: switch locking to a percpu
      rwsem") memory hotplug locking was changed to fix a potential deadlock.
      
      This also switched the stop_machine() invocation within
      build_all_zonelists() to stop_machine_cpuslocked() which now expects
      that online cpus are locked when being called.
      
      This assumption is not true if build_all_zonelists() is being called
      from numa_zonelist_order_handler().
      
      In order to fix this simply add a mem_hotplug_begin()/mem_hotplug_done()
      pair to numa_zonelist_order_handler().
      
      Link: http://lkml.kernel.org/r/20170726111738.38768-1-heiko.carstens@de.ibm.com
      Fixes: 3f906ba2 ("mm/memory-hotplug: switch locking to a percpu rwsem")
      Signed-off-by: default avatarHeiko Carstens <heiko.carstens@de.ibm.com>
      Reported-by: default avatarAndre Wild <wild@linux.vnet.ibm.com>
      Acked-by: default avatarMichal Hocko <mhocko@suse.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      167d0f25
    • Tetsuo Handa's avatar
      mm/page_io.c: fix oops during block io poll in swapin path · b0ba2d0f
      Tetsuo Handa authored
      When a thread is OOM-killed during swap_readpage() operation, an oops
      occurs because end_swap_bio_read() is calling wake_up_process() based on
      an assumption that the thread which called swap_readpage() is still
      alive.
      
        Out of memory: Kill process 525 (polkitd) score 0 or sacrifice child
        Killed process 525 (polkitd) total-vm:528128kB, anon-rss:0kB, file-rss:4kB, shmem-rss:0kB
        oom_reaper: reaped process 525 (polkitd), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
        general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
        Modules linked in: nf_conntrack_netbios_ns nf_conntrack_broadcast ip6t_rpfilter ipt_REJECT nf_reject_ipv4 ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_raw ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter coretemp ppdev pcspkr vmw_balloon sg shpchp vmw_vmci parport_pc parport i2c_piix4 ip_tables xfs libcrc32c sd_mod sr_mod cdrom ata_generic pata_acpi vmwgfx ahci libahci drm_kms_helper ata_piix syscopyarea sysfillrect sysimgblt fb_sys_fops mptspi scsi_transport_spi ttm e1000 mptscsih drm mptbase i2c_core libata serio_raw
        CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.13.0-rc2-next-20170725 #129
        Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/31/2013
        task: ffffffffb7c16500 task.stack: ffffffffb7c00000
        RIP: 0010:__lock_acquire+0x151/0x12f0
        Call Trace:
         <IRQ>
         lock_acquire+0x59/0x80
         _raw_spin_lock_irqsave+0x3b/0x4f
         try_to_wake_up+0x3b/0x410
         wake_up_process+0x10/0x20
         end_swap_bio_read+0x6f/0xf0
         bio_endio+0x92/0xb0
         blk_update_request+0x88/0x270
         scsi_end_request+0x32/0x1c0
         scsi_io_completion+0x209/0x680
         scsi_finish_command+0xd4/0x120
         scsi_softirq_done+0x120/0x140
         __blk_mq_complete_request_remote+0xe/0x10
         flush_smp_call_function_queue+0x51/0x120
         generic_smp_call_function_single_interrupt+0xe/0x20
         smp_trace_call_function_single_interrupt+0x22/0x30
         smp_call_function_single_interrupt+0x9/0x10
         call_function_single_interrupt+0xa7/0xb0
         </IRQ>
        RIP: 0010:native_safe_halt+0x6/0x10
         default_idle+0xe/0x20
         arch_cpu_idle+0xa/0x10
         default_idle_call+0x1e/0x30
         do_idle+0x187/0x200
         cpu_startup_entry+0x6e/0x70
         rest_init+0xd0/0xe0
         start_kernel+0x456/0x477
         x86_64_start_reservations+0x24/0x26
         x86_64_start_kernel+0xf7/0x11a
         secondary_startup_64+0xa5/0xa5
        Code: c3 49 81 3f 20 9e 0b b8 41 bc 00 00 00 00 44 0f 45 e2 83 fe 01 0f 87 62 ff ff ff 89 f0 49 8b 44 c7 08 48 85 c0 0f 84 52 ff ff ff <f0> ff 80 98 01 00 00 8b 3d 5a 49 c4 01 45 8b b3 18 0c 00 00 85
        RIP: __lock_acquire+0x151/0x12f0 RSP: ffffa01f39e03c50
        ---[ end trace 6c441db499169b1e ]---
        Kernel panic - not syncing: Fatal exception in interrupt
        Kernel Offset: 0x36000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
        ---[ end Kernel panic - not syncing: Fatal exception in interrupt
      
      Fix it by holding a reference to the thread.
      
      [akpm@linux-foundation.org: add comment]
      Fixes: 23955622 ("swap: add block io poll in swapin path")
      Signed-off-by: default avatarTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Reviewed-by: default avatarShaohua Li <shli@fb.com>
      Cc: Tim Chen <tim.c.chen@intel.com>
      Cc: Huang Ying <ying.huang@intel.com>
      Cc: Jens Axboe <axboe@fb.com>
      Cc: Hugh Dickins <hughd@google.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      b0ba2d0f
  3. 02 Aug, 2017 11 commits
    • Minchan Kim's avatar
      zram: do not free pool->size_class · 3189c820
      Minchan Kim authored
      Mike reported kernel goes oops with ltp:zram03 testcase.
      
        zram: Added device: zram0
        zram0: detected capacity change from 0 to 107374182400
        BUG: unable to handle kernel paging request at 0000306d61727a77
        IP: zs_map_object+0xb9/0x260
        PGD 0
        P4D 0
        Oops: 0000 [#1] SMP
        Dumping ftrace buffer:
           (ftrace buffer empty)
        Modules linked in: zram(E) xfs(E) libcrc32c(E) btrfs(E) xor(E) raid6_pq(E) loop(E) ebtable_filter(E) ebtables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) ip_tables(E) x_tables(E) af_packet(E) br_netfilter(E) bridge(E) stp(E) llc(E) iscsi_ibft(E) iscsi_boot_sysfs(E) nls_iso8859_1(E) nls_cp437(E) vfat(E) fat(E) intel_powerclamp(E) coretemp(E) cdc_ether(E) kvm_intel(E) usbnet(E) mii(E) kvm(E) irqbypass(E) crct10dif_pclmul(E) crc32_pclmul(E) crc32c_intel(E) iTCO_wdt(E) ghash_clmulni_intel(E) bnx2(E) iTCO_vendor_support(E) pcbc(E) ioatdma(E) ipmi_ssif(E) aesni_intel(E) i5500_temp(E) i2c_i801(E) aes_x86_64(E) lpc_ich(E) shpchp(E) mfd_core(E) crypto_simd(E) i7core_edac(E) dca(E) glue_helper(E) cryptd(E) ipmi_si(E) button(E) acpi_cpufreq(E) ipmi_devintf(E) pcspkr(E) ipmi_msghandler(E)
         nfsd(E) auth_rpcgss(E) nfs_acl(E) lockd(E) grace(E) sunrpc(E) ext4(E) crc16(E) mbcache(E) jbd2(E) sd_mod(E) ata_generic(E) i2c_algo_bit(E) ata_piix(E) drm_kms_helper(E) ahci(E) syscopyarea(E) sysfillrect(E) libahci(E) sysimgblt(E) fb_sys_fops(E) uhci_hcd(E) ehci_pci(E) ttm(E) ehci_hcd(E) libata(E) drm(E) megaraid_sas(E) usbcore(E) sg(E) dm_multipath(E) dm_mod(E) scsi_dh_rdac(E) scsi_dh_emc(E) scsi_dh_alua(E) scsi_mod(E) efivarfs(E) autofs4(E) [last unloaded: zram]
        CPU: 6 PID: 12356 Comm: swapon Tainted: G            E   4.13.0.g87b2c3fc-default #194
        Hardware name: IBM System x3550 M3 -[7944K3G]-/69Y5698     , BIOS -[D6E150AUS-1.10]- 12/15/2010
        task: ffff880158d2c4c0 task.stack: ffffc90001680000
        RIP: 0010:zs_map_object+0xb9/0x260
        Call Trace:
         zram_bvec_rw.isra.26+0xe8/0x780 [zram]
         zram_rw_page+0x6e/0xa0 [zram]
         bdev_read_page+0x81/0xb0
         do_mpage_readpage+0x51a/0x710
         mpage_readpages+0x122/0x1a0
         blkdev_readpages+0x1d/0x20
         __do_page_cache_readahead+0x1b2/0x270
         ondemand_readahead+0x180/0x2c0
         page_cache_sync_readahead+0x31/0x50
         generic_file_read_iter+0x7e7/0xaf0
         blkdev_read_iter+0x37/0x40
         __vfs_read+0xce/0x140
         vfs_read+0x9e/0x150
         SyS_read+0x46/0xa0
         entry_SYSCALL_64_fastpath+0x1a/0xa5
        Code: 81 e6 00 c0 3f 00 81 fe 00 00 16 00 0f 85 9f 01 00 00 0f b7 13 65 ff 05 5e 07 dc 7e 66 c1 ea 02 81 e2 ff 01 00 00 49 8b 54 d4 08 <8b> 4a 48 41 0f af ce 81 e1 ff 0f 00 00 41 89 c9 48 c7 c3 a0 70
        RIP: zs_map_object+0xb9/0x260 RSP: ffffc90001683988
        CR2: 0000306d61727a77
      
      He bisected the problem is [1].
      
      After commit cf8e0fed ("mm/zsmalloc: simplify zs_max_alloc_size
      handling"), zram doesn't use double pointer for pool->size_class any
      more in zs_create_pool so counter function zs_destroy_pool don't need to
      free it, either.
      
      Otherwise, it does kfree wrong address and then, kernel goes Oops.
      
      Link: http://lkml.kernel.org/r/20170725062650.GA12134@bbox
      Fixes: cf8e0fed ("mm/zsmalloc: simplify zs_max_alloc_size handling")
      Signed-off-by: default avatarMinchan Kim <minchan@kernel.org>
      Reported-by: default avatarMike Galbraith <efault@gmx.de>
      Tested-by: default avatarMike Galbraith <efault@gmx.de>
      Reviewed-by: default avatarSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Jerome Marchand <jmarchan@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      3189c820
    • Jonathan Corbet's avatar
      kthread: fix documentation build warning · d16977f3
      Jonathan Corbet authored
      The kerneldoc comment for kthread_create() had an incorrect argument
      name, leading to a warning in the docs build.
      
      Correct it, and make one more small step toward a warning-free build.
      
      Link: http://lkml.kernel.org/r/20170724135916.7f486c6f@lwn.netSigned-off-by: default avatarJonathan Corbet <corbet@lwn.net>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      d16977f3
    • Arnd Bergmann's avatar
      kasan: avoid -Wmaybe-uninitialized warning · e7701557
      Arnd Bergmann authored
      gcc-7 produces this warning:
      
        mm/kasan/report.c: In function 'kasan_report':
        mm/kasan/report.c:351:3: error: 'info.first_bad_addr' may be used uninitialized in this function [-Werror=maybe-uninitialized]
           print_shadow_for_address(info->first_bad_addr);
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        mm/kasan/report.c:360:27: note: 'info.first_bad_addr' was declared here
      
      The code seems fine as we only print info.first_bad_addr when there is a
      shadow, and we always initialize it in that case, but this is relatively
      hard for gcc to figure out after the latest rework.
      
      Adding an intialization to the most likely value together with the other
      struct members shuts up that warning.
      
      Fixes: b235b9808664 ("kasan: unify report headers")
      Link: https://patchwork.kernel.org/patch/9641417/
      Link: http://lkml.kernel.org/r/20170725152739.4176967-1-arnd@arndb.deSigned-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Suggested-by: default avatarAlexander Potapenko <glider@google.com>
      Suggested-by: default avatarAndrey Ryabinin <aryabinin@virtuozzo.com>
      Acked-by: default avatarAndrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      e7701557
    • Mike Rapoport's avatar
      userfaultfd: non-cooperative: notify about unmap of destination during mremap · b2282371
      Mike Rapoport authored
      When mremap is called with MREMAP_FIXED it unmaps memory at the
      destination address without notifying userfaultfd monitor.
      
      If the destination were registered with userfaultfd, the monitor has no
      way to distinguish between the old and new ranges and to properly relate
      the page faults that would occur in the destination region.
      
      Fixes: 897ab3e0 ("userfaultfd: non-cooperative: add event for memory unmaps")
      Link: http://lkml.kernel.org/r/1500276876-3350-1-git-send-email-rppt@linux.vnet.ibm.comSigned-off-by: default avatarMike Rapoport <rppt@linux.vnet.ibm.com>
      Acked-by: default avatarPavel Emelyanov <xemul@virtuozzo.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      b2282371
    • Mel Gorman's avatar
      mm, mprotect: flush TLB if potentially racing with a parallel reclaim leaving stale TLB entries · 3ea27719
      Mel Gorman authored
      Nadav Amit identified a theoritical race between page reclaim and
      mprotect due to TLB flushes being batched outside of the PTL being held.
      
      He described the race as follows:
      
              CPU0                            CPU1
              ----                            ----
                                              user accesses memory using RW PTE
                                              [PTE now cached in TLB]
              try_to_unmap_one()
              ==> ptep_get_and_clear()
              ==> set_tlb_ubc_flush_pending()
                                              mprotect(addr, PROT_READ)
                                              ==> change_pte_range()
                                              ==> [ PTE non-present - no flush ]
      
                                              user writes using cached RW PTE
              ...
      
              try_to_unmap_flush()
      
      The same type of race exists for reads when protecting for PROT_NONE and
      also exists for operations that can leave an old TLB entry behind such
      as munmap, mremap and madvise.
      
      For some operations like mprotect, it's not necessarily a data integrity
      issue but it is a correctness issue as there is a window where an
      mprotect that limits access still allows access.  For munmap, it's
      potentially a data integrity issue although the race is massive as an
      munmap, mmap and return to userspace must all complete between the
      window when reclaim drops the PTL and flushes the TLB.  However, it's
      theoritically possible so handle this issue by flushing the mm if
      reclaim is potentially currently batching TLB flushes.
      
      Other instances where a flush is required for a present pte should be ok
      as either the page lock is held preventing parallel reclaim or a page
      reference count is elevated preventing a parallel free leading to
      corruption.  In the case of page_mkclean there isn't an obvious path
      that userspace could take advantage of without using the operations that
      are guarded by this patch.  Other users such as gup as a race with
      reclaim looks just at PTEs.  huge page variants should be ok as they
      don't race with reclaim.  mincore only looks at PTEs.  userfault also
      should be ok as if a parallel reclaim takes place, it will either fault
      the page back in or read some of the data before the flush occurs
      triggering a fault.
      
      Note that a variant of this patch was acked by Andy Lutomirski but this
      was for the x86 parts on top of his PCID work which didn't make the 4.13
      merge window as expected.  His ack is dropped from this version and
      there will be a follow-on patch on top of PCID that will include his
      ack.
      
      [akpm@linux-foundation.org: tweak comments]
      [akpm@linux-foundation.org: fix spello]
      Link: http://lkml.kernel.org/r/20170717155523.emckq2esjro6hf3z@suse.deReported-by: default avatarNadav Amit <nadav.amit@gmail.com>
      Signed-off-by: default avatarMel Gorman <mgorman@suse.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: <stable@vger.kernel.org>	[v4.4+]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      3ea27719
    • Kefeng Wang's avatar
      pid: kill pidhash_size in pidhash_init() · 27e37d84
      Kefeng Wang authored
      After commit 3d375d78 ("mm: update callers to use HASH_ZERO flag"),
      drop unused pidhash_size in pidhash_init().
      
      Link: http://lkml.kernel.org/r/1500389267-49222-1-git-send-email-wangkefeng.wang@huawei.comSigned-off-by: default avatarKefeng Wang <wangkefeng.wang@huawei.com>
      Reviewed-by: default avatarPavel Tatashin <Pasha.Tatashin@Oracle.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      27e37d84
    • Daniel Jordan's avatar
      mm/hugetlb.c: __get_user_pages ignores certain follow_hugetlb_page errors · 2be7cfed
      Daniel Jordan authored
      Commit 9a291a7c ("mm/hugetlb: report -EHWPOISON not -EFAULT when
      FOLL_HWPOISON is specified") causes __get_user_pages to ignore certain
      errors from follow_hugetlb_page.  After such error, __get_user_pages
      subsequently calls faultin_page on the same VMA and start address that
      follow_hugetlb_page failed on instead of returning the error immediately
      as it should.
      
      In follow_hugetlb_page, when hugetlb_fault returns a value covered under
      VM_FAULT_ERROR, follow_hugetlb_page returns it without setting nr_pages
      to 0 as __get_user_pages expects in this case, which causes the
      following to happen in __get_user_pages: the "while (nr_pages)" check
      succeeds, we skip the "if (!vma..." check because we got a VMA the last
      time around, we find no page with follow_page_mask, and we call
      faultin_page, which calls hugetlb_fault for the second time.
      
      This issue also slightly changes how __get_user_pages works.  Before, it
      only returned error if it had made no progress (i = 0).  But now,
      follow_hugetlb_page can clobber "i" with an error code since its new
      return path doesn't check for progress.  So if "i" is nonzero before a
      failing call to follow_hugetlb_page, that indication of progress is lost
      and __get_user_pages can return error even if some pages were
      successfully pinned.
      
      To fix this, change follow_hugetlb_page so that it updates nr_pages,
      allowing __get_user_pages to fail immediately and restoring the "error
      only if no progress" behavior to __get_user_pages.
      
      Tested that __get_user_pages returns when expected on error from
      hugetlb_fault in follow_hugetlb_page.
      
      Fixes: 9a291a7c ("mm/hugetlb: report -EHWPOISON not -EFAULT when FOLL_HWPOISON is specified")
      Link: http://lkml.kernel.org/r/1500406795-58462-1-git-send-email-daniel.m.jordan@oracle.comSigned-off-by: default avatarDaniel Jordan <daniel.m.jordan@oracle.com>
      Acked-by: default avatarPunit Agrawal <punit.agrawal@arm.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: zhong jiang <zhongjiang@huawei.com>
      Cc: <stable@vger.kernel.org>	[4.12.x]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      2be7cfed
    • Linus Torvalds's avatar
      Merge tag 'platform-drivers-x86-v4.13-3' of git://git.infradead.org/linux-platform-drivers-x86 · 4d3f5d04
      Linus Torvalds authored
      Pull x86 platform driver fixes from Darren Hart:
       "Fix two bugs under error or abnormal usage conditions. Correct a
        config dependency:
      
        dell-wmi:
         - Fix driver interface version query
      
        wmi:
         - Fix error handling in acpi_wmi_init()
      
        peaq-wmi:
         - select INPUT_POLLDEV"
      
      * tag 'platform-drivers-x86-v4.13-3' of git://git.infradead.org/linux-platform-drivers-x86:
        platform/x86: dell-wmi: Fix driver interface version query
        platform/x86: wmi: Fix error handling in acpi_wmi_init()
        platform/x86: peaq-wmi: select INPUT_POLLDEV
      4d3f5d04
    • Linus Torvalds's avatar
      Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 33611ba0
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "These seven patches are mostly minor build, Kconfig and error leg
        fixes"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: qedi: Fix return code in qedi_ep_connect()
        scsi: lpfc: fix linking against modular NVMe support
        scsi: scsi_transport_fc: return -EBUSY for deleted vport
        scsi: libcxgbi: add check for valid cxgbi_task_data
        scsi: aic7xxx: fix firmware build with O=path
        scsi: megaraid_sas: fix memleak in megasas_alloc_cmdlist_fusion
        scsi: qedi: Add ISCSI_BOOT_SYSFS to Kconfig
      33611ba0
    • Trond Myklebust's avatar
      NFSv4: Fix double frees in nfs4_test_session_trunk() · d9cb7330
      Trond Myklebust authored
      rpc_clnt_add_xprt() expects the callback function to be synchronous, and
      expects to release the transport and switch references itself.
      
      Fixes: 04fa2c6b ("NFS pnfs data server multipath session trunking")
      Signed-off-by: default avatarTrond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: default avatarAnna Schumaker <Anna.Schumaker@Netapp.com>
      d9cb7330
    • Sergei Shtylyov's avatar
      powerpc/83xx/mpc832x_rdb: fix of_irq_to_resource() error check · f29bb786
      Sergei Shtylyov authored
      of_irq_to_resource() has recently been fixed to return negative error #'s
      along with 0 in case of failure, however the Freescale MPC832x RDB board
      code still only regards 0 as a failure indication -- fix it up.
      
      Fixes: 7a4228bb ("of: irq: use of_irq_get() in of_irq_to_resource()")
      Signed-off-by: default avatarSergei Shtylyov <sergei.shtylyov@cogentembedded.com>
      Acked-by: default avatarScott Wood <oss@buserror.net>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      f29bb786
  4. 01 Aug, 2017 9 commits
    • Andy Lutomirski's avatar
      platform/x86: dell-wmi: Fix driver interface version query · 51391caf
      Andy Lutomirski authored
      When I converted dell-wmi to the new bus infrastructure, I left the
      call to dell_wmi_check_descriptor_buffer() in dell_wmi_init().  This
      could cause two problems:
      
       - An error message when loading the driver on a system without
         dell-wmi.  We'd try to read the event descriptor even if the WMI
         GUID wasn't there.
      
       - A possible race if dell-wmi was loaded manually before wmi was
         fully initialized.
      
      Fix it by moving the call to the probe function where it belongs.
      
      Fixes: bff589be ("platform/x86: dell-wmi: Convert to the WMI bus infrastructure")
      Signed-off-by: default avatarAndy Lutomirski <luto@kernel.org>
      Reviewed-by: default avatarPali Rohár <pali.rohar@gmail.com>
      Signed-off-by: default avatarDarren Hart (VMware) <dvhart@infradead.org>
      51391caf
    • Trond Myklebust's avatar
      NFSv4: Fix EXCHANGE_ID corrupt verifier issue · fd40559c
      Trond Myklebust authored
      The verifier is allocated on the stack, but the EXCHANGE_ID RPC call was
      changed to be asynchronous by commit 8d89bd70. If we interrrupt
      the call to rpc_wait_for_completion_task(), we can therefore end up
      transmitting random stack contents in lieu of the verifier.
      
      Fixes: 8d89bd70 ("NFS setup async exchange_id")
      Cc: stable@vger.kernel.org # v4.9+
      Signed-off-by: default avatarTrond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: default avatarAnna Schumaker <Anna.Schumaker@Netapp.com>
      fd40559c
    • Linus Torvalds's avatar
      Merge branch 'parisc-4.13-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux · 26c5cebf
      Linus Torvalds authored
      Pull parsic fixes from Helge Deller:
      
       - Our cache flushing code ran into a BUG in case context is not
         current. Fix it by flushing the whole cache in such rare situations
         (by Dave Anglin).
      
       - Fix a "sleeping function called from invalid context BUG" in our
         pdc_stable driver by rearranging our locks (by James Bottomley)
      
       - The thread and irq stacks require more than 16 KB since kernel 4.11.
         Increase both to 32 KB.
      
       - Define CONFIG_CPU_BIG_ENDIAN unconditionally on parisc to avoid wrong
         behaviour in qrwlock functions (by Babu Moger).
      
      * 'parisc-4.13-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
        parisc: Define CONFIG_CPU_BIG_ENDIAN
        parisc: pdc_stable: Fix locking when creating sysfs links
        parisc: Increase thread and stack size to 32kb
        parisc: Handle vma's whose context is not current in flush_cache_range
      26c5cebf
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · bc78d646
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Handle notifier registry failures properly in tun/tap driver, from
          Tonghao Zhang.
      
       2) Fix bpf verifier handling of subtraction bounds and add a testcase
          for this, from Edward Cree.
      
       3) Increase reset timeout in ftgmac100 driver, from Ben Herrenschmidt.
      
       4) Fix use after free in prd_retire_rx_blk_timer_exired() in AF_PACKET,
          from Cong Wang.
      
       5) Fix SElinux regression due to recent UDP optimizations, from Paolo
          Abeni.
      
       6) We accidently increment IPSTATS_MIB_FRAGFAILS in the ipv6 code
          paths, fix from Stefano Brivio.
      
       7) Fix some mem leaks in dccp, from Xin Long.
      
       8) Adjust MDIO_BUS kconfig deps to avoid build errors, from Arnd
          Bergmann.
      
       9) Mac address length check and buffer size fixes from Cong Wang.
      
      10) Don't leak sockets in ipv6 udp early demux, from Paolo Abeni.
      
      11) Fix return value when copy_from_user() fails in
          bpf_prog_get_info_by_fd(), from Daniel Borkmann.
      
      12) Handle PHY_HALTED properly in phy library state machine, from
          Florian Fainelli.
      
      13) Fix OOPS in fib_sync_down_dev(), from Ido Schimmel.
      
      14) Fix truesize calculation in virtio_net which led to performance
          regressions, from Michael S Tsirkin.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (76 commits)
        samples/bpf: fix bpf tunnel cleanup
        udp6: fix jumbogram reception
        ppp: Fix a scheduling-while-atomic bug in del_chan
        Revert "net: bcmgenet: Remove init parameter from bcmgenet_mii_config"
        virtio_net: fix truesize for mergeable buffers
        mv643xx_eth: fix of_irq_to_resource() error check
        MAINTAINERS: Add more files to the PHY LIBRARY section
        ipv4: fib: Fix NULL pointer deref during fib_sync_down_dev()
        net: phy: Correctly process PHY_HALTED in phy_stop_machine()
        sunhme: fix up GREG_STAT and GREG_IMASK register offsets
        bpf: fix bpf_prog_get_info_by_fd to dump correct xlated_prog_len
        tcp: avoid bogus gcc-7 array-bounds warning
        net: tc35815: fix spelling mistake: "Intterrupt" -> "Interrupt"
        bpf: don't indicate success when copy_from_user fails
        udp6: fix socket leak on early demux
        net: thunderx: Fix BGX transmit stall due to underflow
        Revert "vhost: cache used event for better performance"
        team: use a larger struct for mac address
        net: check dev->addr_len for dev_set_mac_address()
        phy: bcm-ns-usb3: fix MDIO_BUS dependency
        ...
      bc78d646
    • William Tu's avatar
      samples/bpf: fix bpf tunnel cleanup · cc75f851
      William Tu authored
      test_tunnel_bpf.sh fails to remove the vxlan11 tunnel device, causing the
      next geneve tunnelling test case fails.  In addition, the geneve reserved bit
      in tcbpf2_kern.c should be zero, according to the RFC.
      Signed-off-by: default avatarWilliam Tu <u9012063@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      cc75f851
    • Paolo Abeni's avatar
      udp6: fix jumbogram reception · cb891fa6
      Paolo Abeni authored
      Since commit 67a51780 ("ipv6: udp: leverage scratch area
      helpers") udp6_recvmsg() read the skb len from the scratch area,
      to avoid a cache miss.
      But the UDP6 rx path support RFC 2675 UDPv6 jumbograms, and their
      length exceeds the 16 bits available in the scratch area. As a side
      effect the length returned by recvmsg() is:
      <ingress datagram len> % (1<<16)
      
      This commit addresses the issue allocating one more bit in the
      IP6CB flags field and setting it for incoming jumbograms.
      Such field is still in the first cacheline, so at recvmsg()
      time we can check it and fallback to access skb->len if
      required, without a measurable overhead.
      
      Fixes: 67a51780 ("ipv6: udp: leverage scratch area helpers")
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      cb891fa6
    • Gao Feng's avatar
      ppp: Fix a scheduling-while-atomic bug in del_chan · ddab8282
      Gao Feng authored
      The PPTP set the pptp_sock_destruct as the sock's sk_destruct, it would
      trigger this bug when __sk_free is invoked in atomic context, because of
      the call path pptp_sock_destruct->del_chan->synchronize_rcu.
      
      Now move the synchronize_rcu to pptp_release from del_chan. This is the
      only one case which would free the sock and need the synchronize_rcu.
      
      The following is the panic I met with kernel 3.3.8, but this issue should
      exist in current kernel too according to the codes.
      
      BUG: scheduling while atomic
      __schedule_bug+0x5e/0x64
      __schedule+0x55/0x580
      ? ppp_unregister_channel+0x1cd5/0x1de0 [ppp_generic]
      ? dev_hard_start_xmit+0x423/0x530
      ? sch_direct_xmit+0x73/0x170
      __cond_resched+0x16/0x30
      _cond_resched+0x22/0x30
      wait_for_common+0x18/0x110
      ? call_rcu_bh+0x10/0x10
      wait_for_completion+0x12/0x20
      wait_rcu_gp+0x34/0x40
      ? wait_rcu_gp+0x40/0x40
      synchronize_sched+0x1e/0x20
      0xf8417298
      0xf8417484
      ? sock_queue_rcv_skb+0x109/0x130
      __sk_free+0x16/0x110
      ? udp_queue_rcv_skb+0x1f2/0x290
      sk_free+0x16/0x20
      __udp4_lib_rcv+0x3b8/0x650
      Signed-off-by: default avatarGao Feng <gfree.wind@vip.163.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ddab8282
    • Florian Fainelli's avatar
      Revert "net: bcmgenet: Remove init parameter from bcmgenet_mii_config" · 00d51094
      Florian Fainelli authored
      This reverts commit 28b45910 ("net: bcmgenet: Remove init parameter
      from bcmgenet_mii_config") because in the process of moving from
      dev_info() to dev_info_once() we essentially lost the helpful printed
      messages once the second instance of the driver is loaded.
      dev_info_once() does not actually print the message once per device
      instance, but once period.
      
      Fixes: 28b45910 ("net: bcmgenet: Remove init parameter from bcmgenet_mii_config")
      Signed-off-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: default avatarDoug Berger <opendmb@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      00d51094
    • Michael S. Tsirkin's avatar
      virtio_net: fix truesize for mergeable buffers · 1daa8790
      Michael S. Tsirkin authored
      Seth Forshee noticed a performance degradation with some workloads.
      This turns out to be due to packet drops.  Euan Kemp noticed that this
      is because we drop all packets where length exceeds the truesize, but
      for some packets we add in extra memory without updating the truesize.
      This in turn was kept around unchanged from ab7db917 ("virtio-net:
      auto-tune mergeable rx buffer size for improved performance").  That
      commit had an internal reason not to account for the extra space: not
      enough bits to do it.  No longer true so let's account for the allocated
      length exactly.
      
      Many thanks to Seth Forshee for the report and bisecting and Euan Kemp
      for debugging the issue.
      
      Fixes: 680557cf ("virtio_net: rework mergeable buffer handling")
      Reported-by: default avatarEuan Kemp <euan.kemp@coreos.com>
      Tested-by: default avatarEuan Kemp <euan.kemp@coreos.com>
      Reported-by: default avatarSeth Forshee <seth.forshee@canonical.com>
      Tested-by: default avatarSeth Forshee <seth.forshee@canonical.com>
      Signed-off-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      1daa8790