1. 28 Jun, 2017 1 commit
    • Kees Cook's avatar
      locking/refcount: Create unchecked atomic_t implementation · fd25d19f
      Kees Cook authored
      Many subsystems will not use refcount_t unless there is a way to build the
      kernel so that there is no regression in speed compared to atomic_t. This
      adds CONFIG_REFCOUNT_FULL to enable the full refcount_t implementation
      which has the validation but is slightly slower. When not enabled,
      refcount_t uses the basic unchecked atomic_t routines, which results in
      no code changes compared to just using atomic_t directly.
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Acked-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: David Windsor <dwindsor@gmail.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Elena Reshetova <elena.reshetova@intel.com>
      Cc: Eric Biggers <ebiggers3@gmail.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Hans Liljestrand <ishkamiel@gmail.com>
      Cc: James Bottomley <James.Bottomley@hansenpartnership.com>
      Cc: Jann Horn <jannh@google.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Manfred Spraul <manfred@colorfullife.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Serge E. Hallyn <serge@hallyn.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: arozansk@redhat.com
      Cc: axboe@kernel.dk
      Cc: linux-arch <linux-arch@vger.kernel.org>
      Link: http://lkml.kernel.org/r/20170621200026.GA115679@beastSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      fd25d19f
  2. 22 Jun, 2017 2 commits
    • Ingo Molnar's avatar
      f9e16988
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.dk/linux-block · 8d829b9b
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
       "This contains a set of fixes for xen-blkback by way of Konrad, and a
        performance regression fix for blk-mq for shared tags.
      
        The latter could account for as much as a 50x reduction in
        performance, with the test case from the user with 500 name spaces. A
        more realistic setup on my end with 32 drives showed a 3.5x drop. The
        fix has been thoroughly tested before being committed"
      
      * 'for-linus' of git://git.kernel.dk/linux-block:
        blk-mq: fix performance regression with shared tags
        xen-blkback: don't leak stack data via response ring
        xen/blkback: don't use xen_blkif_get() in xen-blkback kthread
        xen/blkback: don't free be structure too early
        xen/blkback: fix disconnect while I/Os in flight
      8d829b9b
  3. 21 Jun, 2017 8 commits
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 48b6bbef
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Fix refcounting wrt timers which hold onto inet6 address objects,
          from Xin Long.
      
       2) Fix an ancient bug in wireless wext ioctls, from Johannes Berg.
      
       3) Firmware handling fixes in brcm80211 driver, from Arend Van Spriel.
      
       4) Several mlx5 driver fixes (firmware readiness, timestamp cap
          reporting, devlink command validity checking, tc offloading, etc.)
          From Eli Cohen, Maor Dickman, Chris Mi, and Or Gerlitz.
      
       5) Fix dst leak in IP/IP6 tunnels, from Haishuang Yan.
      
       6) Fix dst refcount bug in decnet, from Wei Wang.
      
       7) Netdev can be double freed in register_vlan_device(). Fix from Gao
          Feng.
      
       8) Don't allow object to be destroyed while it is being dumped in SCTP,
          from Xin Long.
      
       9) Fix dpaa_eth build when modular, from Madalin Bucur.
      
      10) Fix throw route leaks, from Serhey Popovych.
      
      11) IFLA_GROUP missing from if_nlmsg_size() and ifla_policy[] table,
          also from Serhey Popovych.
      
      12) Fix premature TX SKB free in stmmac, from Niklas Cassel.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (36 commits)
        igmp: add a missing spin_lock_init()
        net: stmmac: free an skb first when there are no longer any descriptors using it
        sfc: remove duplicate up_write on VF filter_sem
        rtnetlink: add IFLA_GROUP to ifla_policy
        ipv6: Do not leak throw route references
        dt-bindings: net: sms911x: Add missing optional VDD regulators
        dpaa_eth: reuse the dma_ops provided by the FMan MAC device
        fsl/fman: propagate dma_ops
        net/core: remove explicit do_softirq() from busy_poll_stop()
        fib_rules: Resolve goto rules target on delete
        sctp: ensure ep is not destroyed before doing the dump
        net/hns:bugfix of ethtool -t phy self_test
        net: 8021q: Fix one possible panic caused by BUG_ON in free_netdev
        cxgb4: notify uP to route ctrlq compl to rdma rspq
        ip6_tunnel: Correct tos value in collect_md mode
        decnet: always not take dst->__refcnt when inserting dst into hash table
        ip6_tunnel: fix potential issue in __ip6_tnl_rcv
        ip_tunnel: fix potential issue in ip_tunnel_rcv
        brcmfmac: fix uninitialized warning in brcmf_usb_probe_phase2()
        net/mlx5e: Avoid doing a cleanup call if the profile doesn't have it
        ...
      48b6bbef
    • Linus Torvalds's avatar
      Merge tag 'pinctrl-v4.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl · ce879b64
      Linus Torvalds authored
      Pull more pin control fixes from Linus Walleij:
       "Some late arriving fixes. I should have sent earlier, just swamped
        with work as usual. Thomas patch makes AMD systems usable despite
        firmware bugs so it is fairly important.
      
         - Make the AMD driver use a regular interrupt rather than a chained
           one, so the system does not lock up.
      
         - Fix a function call error deep inside the STM32 driver"
      
      * tag 'pinctrl-v4.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
        pinctrl: stm32: Fix bad function call
        pinctrl/amd: Use regular interrupt instead of chained
      ce879b64
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid · db1b5ccd
      Linus Torvalds authored
      Pull HID fixes from Jiri Kosina:
      
       - revert of a commit to magicmouse driver that regressess certain
         devices, from Daniel Stone
      
       - quirk for a specific Dell mouse, from Sebastian Parschauer
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
        Revert "HID: magicmouse: Set multi-touch keybits for Magic Mouse"
        HID: Add quirk for Dell PIXART OEM mouse
      db1b5ccd
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching · dcba7108
      Linus Torvalds authored
      Pull livepatching fix from Jiri Kosina:
       "Fix the way how livepatches are being stacked with respect to RCU,
        from Petr Mladek"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching:
        livepatch: Fix stacking of patches with respect to RCU
      dcba7108
    • Linus Torvalds's avatar
      Merge branch 'ufs-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 021f6019
      Linus Torvalds authored
      Pull more ufs fixes from Al Viro:
       "More UFS fixes, unfortunately including build regression fix for the
        64-bit s_dsize commit. Fixed in this pile:
      
         - trivial bug in signedness of 32bit timestamps on ufs1
      
         - ESTALE instead of ufs_error() when doing open-by-fhandle on
           something deleted
      
         - build regression on 32bit in ufs_new_fragments() - calculating that
           many percents of u64 pulls libgcc stuff on some of those. Mea
           culpa.
      
         - fix hysteresis loop broken by typo in 2.4.14.7 (right next to the
           location of previous bug).
      
         - fix the insane limits of said hysteresis loop on filesystems with
           very low percentage of reserved blocks. If it's 5% or less, just
           use the OPTSPACE policy.
      
         - calculate those limits once and mount time.
      
        This tree does pass xfstests clean (both ufs1 and ufs2) and it _does_
        survive cross-builds.
      
        Again, my apologies for missing that, especially since I have noticed
        a related percentage-of-64bit issue in earlier patches (when dealing
        with amount of reserved blocks). Self-LART applied..."
      
      * 'ufs-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        ufs: fix the logics for tail relocation
        ufs_iget(): fail with -ESTALE on deleted inode
        fix signedness of timestamps on ufs1
      021f6019
    • Helge Deller's avatar
      Allow stack to grow up to address space limit · bd726c90
      Helge Deller authored
      Fix expand_upwards() on architectures with an upward-growing stack (parisc,
      metag and partly IA-64) to allow the stack to reliably grow exactly up to
      the address space limit given by TASK_SIZE.
      Signed-off-by: default avatarHelge Deller <deller@gmx.de>
      Acked-by: default avatarHugh Dickins <hughd@google.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      bd726c90
    • Hugh Dickins's avatar
      mm: fix new crash in unmapped_area_topdown() · f4cb767d
      Hugh Dickins authored
      Trinity gets kernel BUG at mm/mmap.c:1963! in about 3 minutes of
      mmap testing.  That's the VM_BUG_ON(gap_end < gap_start) at the
      end of unmapped_area_topdown().  Linus points out how MAP_FIXED
      (which does not have to respect our stack guard gap intentions)
      could result in gap_end below gap_start there.  Fix that, and
      the similar case in its alternative, unmapped_area().
      
      Cc: stable@vger.kernel.org
      Fixes: 1be7107f ("mm: larger stack guard gap, between vmas")
      Reported-by: default avatarDave Jones <davej@codemonkey.org.uk>
      Debugged-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarHugh Dickins <hughd@google.com>
      Acked-by: default avatarMichal Hocko <mhocko@suse.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      f4cb767d
    • Jens Axboe's avatar
      blk-mq: fix performance regression with shared tags · 8e8320c9
      Jens Axboe authored
      If we have shared tags enabled, then every IO completion will trigger
      a full loop of every queue belonging to a tag set, and every hardware
      queue for each of those queues, even if nothing needs to be done.
      This causes a massive performance regression if you have a lot of
      shared devices.
      
      Instead of doing this huge full scan on every IO, add an atomic
      counter to the main queue that tracks how many hardware queues have
      been marked as needing a restart. With that, we can avoid looking for
      restartable queues, if we don't have to.
      
      Max reports that this restores performance. Before this patch, 4K
      IOPS was limited to 22-23K IOPS. With the patch, we are running at
      950-970K IOPS.
      
      Fixes: 6d8c6c0f ("blk-mq: Restart a single queue if tag sets are shared")
      Reported-by: default avatarMax Gurtovoy <maxg@mellanox.com>
      Tested-by: default avatarMax Gurtovoy <maxg@mellanox.com>
      Reviewed-by: default avatarBart Van Assche <bart.vanassche@sandisk.com>
      Tested-by: default avatarBart Van Assche <bart.vanassche@wdc.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      8e8320c9
  4. 20 Jun, 2017 19 commits
    • WANG Cong's avatar
      igmp: add a missing spin_lock_init() · b4846fc3
      WANG Cong authored
      Andrey reported a lockdep warning on non-initialized
      spinlock:
      
       INFO: trying to register non-static key.
       the code is fine but needs lockdep annotation.
       turning off the locking correctness validator.
       CPU: 1 PID: 4099 Comm: a.out Not tainted 4.12.0-rc6+ #9
       Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
       Call Trace:
        __dump_stack lib/dump_stack.c:16
        dump_stack+0x292/0x395 lib/dump_stack.c:52
        register_lock_class+0x717/0x1aa0 kernel/locking/lockdep.c:755
        ? 0xffffffffa0000000
        __lock_acquire+0x269/0x3690 kernel/locking/lockdep.c:3255
        lock_acquire+0x22d/0x560 kernel/locking/lockdep.c:3855
        __raw_spin_lock_bh ./include/linux/spinlock_api_smp.h:135
        _raw_spin_lock_bh+0x36/0x50 kernel/locking/spinlock.c:175
        spin_lock_bh ./include/linux/spinlock.h:304
        ip_mc_clear_src+0x27/0x1e0 net/ipv4/igmp.c:2076
        igmpv3_clear_delrec+0xee/0x4f0 net/ipv4/igmp.c:1194
        ip_mc_destroy_dev+0x4e/0x190 net/ipv4/igmp.c:1736
      
      We miss a spin_lock_init() in igmpv3_add_delrec(), probably
      because previously we never use it on this code path. Since
      we already unlink it from the global mc_tomb list, it is
      probably safe not to acquire this spinlock here. It does not
      harm to have it although, to avoid conditional locking.
      
      Fixes: c38b7d32 ("igmp: acquire pmc lock for ip_mc_clear_src()")
      Reported-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b4846fc3
    • David S. Miller's avatar
      Merge tag 'wireless-drivers-for-davem-2017-06-20' of... · afd64631
      David S. Miller authored
      Merge tag 'wireless-drivers-for-davem-2017-06-20' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers
      
      Kalle Valo says:
      
      ====================
      wireless-drivers fixes for 4.12
      
      Two important fixes for brcmfmac. The rest of the brcmfmac patches are
      either code preparation and fixing a new build warning.
      
      brcmfmac
      
      * fix a NULL pointer dereference during resume
      
      * fix a NULL pointer dereference with USB devices, a regression from
        v4.12-rc1
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      afd64631
    • Niklas Cassel's avatar
      net: stmmac: free an skb first when there are no longer any descriptors using it · 05cf0d1b
      Niklas Cassel authored
      When having the skb pointer in the first descriptor, stmmac_tx_clean
      can get called at a moment where the IP has only cleared the own bit
      of the first descriptor, thus freeing the skb, even though there can
      be several descriptors whose buffers point into the same skb.
      
      By simply moving the skb pointer from the first descriptor to the last
      descriptor, a skb will get freed only when the IP has cleared the
      own bit of all the descriptors that are using that skb.
      Signed-off-by: default avatarNiklas Cassel <niklas.cassel@axis.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      05cf0d1b
    • Edward Cree's avatar
      sfc: remove duplicate up_write on VF filter_sem · 57f0c9cf
      Edward Cree authored
      Somehow two copies of the line 'up_write(&vf->efx->filter_sem);' got into
       efx_ef10_sriov_set_vf_vlan().  This would put the mutex in a bad state and
       cause all subsequent down attempts to hang.
      
      Fixes: 671b53ee ("sfc: Ensure down_write(&filter_sem) and up_write() are matched before calling efx_net_open()")
      Signed-off-by: default avatarEdward Cree <ecree@solarflare.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      57f0c9cf
    • Serhey Popovych's avatar
      rtnetlink: add IFLA_GROUP to ifla_policy · db833d40
      Serhey Popovych authored
      Network interface groups support added while ago, however
      there is no IFLA_GROUP attribute description in policy
      and netlink message size calculations until now.
      
      Add IFLA_GROUP attribute to the policy.
      
      Fixes: cbda10fa ("net_device: add support for network device groups")
      Signed-off-by: default avatarSerhey Popovych <serhe.popovych@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      db833d40
    • Serhey Popovych's avatar
      ipv6: Do not leak throw route references · 07f61557
      Serhey Popovych authored
      While commit 73ba57bf ("ipv6: fix backtracking for throw routes")
      does good job on error propagation to the fib_rules_lookup()
      in fib rules core framework that also corrects throw routes
      handling, it does not solve route reference leakage problem
      happened when we return -EAGAIN to the fib_rules_lookup()
      and leave routing table entry referenced in arg->result.
      
      If rule with matched throw route isn't last matched in the
      list we overwrite arg->result losing reference on throw
      route stored previously forever.
      
      We also partially revert commit ab997ad4 ("ipv6: fix the
      incorrect return value of throw route") since we never return
      routing table entry with dst.error == -EAGAIN when
      CONFIG_IPV6_MULTIPLE_TABLES is on. Also there is no point
      to check for RTF_REJECT flag since it is always set throw
      route.
      
      Fixes: 73ba57bf ("ipv6: fix backtracking for throw routes")
      Signed-off-by: default avatarSerhey Popovych <serhe.popovych@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      07f61557
    • Krzysztof Kozlowski's avatar
      dt-bindings: net: sms911x: Add missing optional VDD regulators · 7e113321
      Krzysztof Kozlowski authored
      The lan911x family of devices require supplying from 3.3 V power
      supplies (connected to VDD_IO, VDD_A and VREG_3.3 pins).  The existing
      driver however obtains only VDD_IO and VDD_A regulators in an optional
      way so document this in bindings.
      Signed-off-by: default avatarKrzysztof Kozlowski <krzk@kernel.org>
      Reviewed-by: default avatarLinus Walleij <linus.walleij@linaro.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      7e113321
    • David S. Miller's avatar
      Merge branch 'net-fix-loadable-module-for-DPAA-Ethernet' · 73b098d6
      David S. Miller authored
      Madalin Bucur says:
      
      ====================
      net: fix loadable module for DPAA Ethernet
      
      The DPAA Ethernet makes use of a symbol that is not exported.
      Address the issue by propagating the dma_ops rather than calling
      arch_setup_dma_ops().
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      73b098d6
    • Madalin Bucur's avatar
      dpaa_eth: reuse the dma_ops provided by the FMan MAC device · fb52728a
      Madalin Bucur authored
      Remove the use of arch_setup_dma_ops() that was not exported
      and was breaking loadable module compilation.
      Signed-off-by: default avatarMadalin Bucur <madalin.bucur@nxp.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      fb52728a
    • Madalin Bucur's avatar
      fsl/fman: propagate dma_ops · 5567e989
      Madalin Bucur authored
      Make sure dma_ops are set, to be later used by the Ethernet driver.
      Signed-off-by: default avatarMadalin Bucur <madalin.bucur@nxp.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      5567e989
    • Sebastian Siewior's avatar
      net/core: remove explicit do_softirq() from busy_poll_stop() · fe420d87
      Sebastian Siewior authored
      Since commit 217f6974 ("net: busy-poll: allow preemption in
      sk_busy_loop()") there is an explicit do_softirq() invocation after
      local_bh_enable() has been invoked.
      I don't understand why we need this because local_bh_enable() will
      invoke do_softirq() once the softirq counter reached zero and we have
      softirq-related work pending.
      Signed-off-by: default avatarSebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      fe420d87
    • Serhey Popovych's avatar
      fib_rules: Resolve goto rules target on delete · bdaf32c3
      Serhey Popovych authored
      We should avoid marking goto rules unresolved when their
      target is actually reachable after rule deletion.
      
      Consolder following sample scenario:
      
        # ip -4 ru sh
        0:      from all lookup local
        32000:  from all goto 32100
        32100:  from all lookup main
        32100:  from all lookup default
        32766:  from all lookup main
        32767:  from all lookup default
      
        # ip -4 ru del pref 32100 table main
        # ip -4 ru sh
        0:      from all lookup local
        32000:  from all goto 32100 [unresolved]
        32100:  from all lookup default
        32766:  from all lookup main
        32767:  from all lookup default
      
      After removal of first rule with preference 32100 we
      mark all goto rules as unreachable, even when rule with
      same preference as removed one still present.
      
      Check if next rule with same preference is available
      and make all rules with goto action pointing to it.
      Signed-off-by: default avatarSerhey Popovych <serhe.popovych@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      bdaf32c3
    • Jens Axboe's avatar
      Merge branch 'stable/for-jens-4.12' of... · ec2f0fad
      Jens Axboe authored
      Merge branch 'stable/for-jens-4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen into for-linus
      
      Pull xen-blkback fixes from Konrad:
      
      "Security and memory leak fixes in xen block driver."
      ec2f0fad
    • Levin, Alexander (Sasha Levin)'s avatar
      locking/rtmutex: Don't initialize lockdep when not required · cde50a67
      Levin, Alexander (Sasha Levin) authored
      pi_mutex isn't supposed to be tracked by lockdep, but just
      passing NULLs for name and key will cause lockdep to spew a
      warning and die, which is not what we want it to do.
      
      Skip lockdep initialization if the caller passed NULLs for
      name and key, suggesting such initialization isn't desired.
      Signed-off-by: default avatarSasha Levin <alexander.levin@verizon.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: f5694788 ("rt_mutex: Add lockdep annotations")
      Link: http://lkml.kernel.org/r/20170618140548.4763-1-alexander.levin@verizon.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      cde50a67
    • Jiri Kosina's avatar
      900a88ef
    • Petr Mladek's avatar
      livepatch: Fix stacking of patches with respect to RCU · 842c0884
      Petr Mladek authored
      rcu_read_(un)lock(), list_*_rcu(), and synchronize_rcu() are used for a secure
      access and manipulation of the list of patches that modify the same function.
      In particular, it is the variable func_stack that is accessible from the ftrace
      handler via struct ftrace_ops and klp_ops.
      
      Of course, it synchronizes also some states of the patch on the top of the
      stack, e.g. func->transition in klp_ftrace_handler.
      
      At the same time, this mechanism guards also the manipulation of
      task->patch_state. It is modified according to the state of the transition and
      the state of the process.
      
      Now, all this works well as long as RCU works well. Sadly livepatching might
      get into some corner cases when this is not true. For example, RCU is not
      watching when rcu_read_lock() is taken in idle threads.  It is because they
      might sleep and prevent reaching the grace period for too long.
      
      There are ways how to make RCU watching even in idle threads, see
      rcu_irq_enter(). But there is a small location inside RCU infrastructure when
      even this does not work.
      
      This small problematic location can be detected either before calling
      rcu_irq_enter() by rcu_irq_enter_disabled() or later by rcu_is_watching().
      Sadly, there is no safe way how to handle it.  Once we detect that RCU was not
      watching, we might see inconsistent state of the function stack and the related
      variables in klp_ftrace_handler(). Then we could do a wrong decision, use an
      incompatible implementation of the function and break the consistency of the
      system. We could warn but we could not avoid the damage.
      
      Fortunately, ftrace has similar problems and they seem to be solved well there.
      It uses a heavy weight implementation of some RCU operations. In particular, it
      replaces:
      
        + rcu_read_lock() with preempt_disable_notrace()
        + rcu_read_unlock() with preempt_enable_notrace()
        + synchronize_rcu() with schedule_on_each_cpu(sync_work)
      
      My understanding is that this is RCU implementation from a stone age. It meets
      the core RCU requirements but it is rather ineffective. Especially, it does not
      allow to batch or speed up the synchronize calls.
      
      On the other hand, it is very trivial. It allows to safely trace and/or
      livepatch even the RCU core infrastructure.  And the effectiveness is a not a
      big issue because using ftrace or livepatches on productive systems is a rare
      operation.  The safety is much more important than a negligible extra load.
      
      Note that the alternative implementation follows the RCU principles. Therefore,
           we could and actually must use list_*_rcu() variants when manipulating the
           func_stack.  These functions allow to access the pointers in the right
           order and with the right barriers. But they do not use any other
           information that would be set only by rcu_read_lock().
      
      Also note that there are actually two problems solved in ftrace:
      
      First, it cares about the consistency of RCU read sections.  It is being solved
      the way as described and used in this patch.
      
      Second, ftrace needs to make sure that nobody is inside the dynamic trampoline
      when it is being freed. For this, it also calls synchronize_rcu_tasks() in
      preemptive kernel in ftrace_shutdown().
      
      Livepatch has similar problem but it is solved by ftrace for free.
      klp_ftrace_handler() is a good guy and never sleeps. In addition, it is
      registered with FTRACE_OPS_FL_DYNAMIC. It causes that
      unregister_ftrace_function() calls:
      
      	* schedule_on_each_cpu(ftrace_sync) - always
      	* synchronize_rcu_tasks() - in preemptive kernel
      
      The effect is that nobody is neither inside the dynamic trampoline nor inside
      the ftrace handler after unregister_ftrace_function() returns.
      
      [jkosina@suse.cz: reformat changelog, fix comment]
      Signed-off-by: default avatarPetr Mladek <pmladek@suse.com>
      Acked-by: default avatarJosh Poimboeuf <jpoimboe@redhat.com>
      Acked-by: default avatarMiroslav Benes <mbenes@suse.cz>
      Signed-off-by: default avatarJiri Kosina <jkosina@suse.cz>
      842c0884
    • Daniel Stone's avatar
      Revert "HID: magicmouse: Set multi-touch keybits for Magic Mouse" · 53145c2e
      Daniel Stone authored
      Setting these bits causes libinput to fail to initialize the device;
      setting BTN_TOUCH and BTN_TOOL_FINGER causes it to treat the mouse as a
      touchpad, and it then refuses to continue when it discovers ABS_X is not
      set.
      
      This breaks all known Wayland compositors, as well as Xorg when the
      libinput driver is being used.
      
      This reverts commit f4b65b95.
      Signed-off-by: default avatarDaniel Stone <daniels@collabora.com>
      Cc: Che-Liang Chiou <clchiou@chromium.org>
      Cc: Thierry Escande <thierry.escande@collabora.com>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Benjamin Tissoires <benjamin.tissoires@redhat.com>
      Acked-by: default avatarBenjamin Tissoires <benjamin.tissoires@redhat.com>
      Signed-off-by: default avatarJiri Kosina <jkosina@suse.cz>
      53145c2e
    • Linus Torvalds's avatar
      Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux · 9705596d
      Linus Torvalds authored
      Pull clk fixes from Stephen Boyd:
       "One build fix for an Amlogic clk driver and a handful of Allwinner clk
        driver fixes for some DT bindings and a randconfig build error that
        all came in this merge window"
      
      * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
        clk: sunxi-ng: a64: Export PLL_PERIPH0 clock for the PRCM
        clk: sunxi-ng: h3: Export PLL_PERIPH0 clock for the PRCM
        dt-bindings: clock: sunxi-ccu: Add pll-periph to PRCM's needed clocks
        clk: sunxi-ng: sun5i: Fix ahb_bist_clk definition
        clk: sunxi-ng: enable SUNXI_CCU_MP for PRCM
        clk: meson: gxbb: fix build error without RESET_CONTROLLER
        clk: sunxi-ng: v3s: Fix usb otg device reset bit
        clk: sunxi-ng: a31: Correct lcd1-ch1 clock register offset
      9705596d
    • Linus Torvalds's avatar
      Merge tag 'ntb-4.12-bugfixes' of git://github.com/jonmason/ntb · 865be780
      Linus Torvalds authored
      Pull NTB fixes from Jon Mason:
       "NTB bug fixes to address the modinfo in ntb_perf, a couple of bugs in
        the NTB transport QP calculations, skx doorbells, and sleeping in
        ntb_async_tx_submit"
      
      * tag 'ntb-4.12-bugfixes' of git://github.com/jonmason/ntb:
        ntb: no sleep in ntb_async_tx_submit
        ntb: ntb_hw_intel: Skylake doorbells should be 32bits, not 64bits
        ntb_transport: fix bug calculating num_qps_mw
        ntb_transport: fix qp count bug
        NTB: ntb_test: fix bug printing ntb_perf results
        ntb: Correct modinfo usage statement for ntb_perf
      865be780
  5. 19 Jun, 2017 10 commits
    • Xin Long's avatar
      sctp: ensure ep is not destroyed before doing the dump · 86fdb344
      Xin Long authored
      Now before dumping a sock in sctp_diag, it only holds the sock while
      the ep may be already destroyed. It can cause a use-after-free panic
      when accessing ep->asocs.
      
      This patch is to set sctp_sk(sk)->ep NULL in sctp_endpoint_destroy,
      and check if this ep is already destroyed before dumping this ep.
      Suggested-by: default avatarMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: default avatarXin Long <lucien.xin@gmail.com>
      Acked-by: default avatarNeil Horman <nhorman@tuxdrver.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      86fdb344
    • Allen Hubbe's avatar
      ntb: no sleep in ntb_async_tx_submit · 88931ec3
      Allen Hubbe authored
      Do not sleep in ntb_async_tx_submit, which could deadlock.
      This reverts commit "8c874cc1"
      
      Fixes: 8c874cc1 ("NTB: Address out of DMA descriptor issue with NTB")
      Reported-by: default avatarJia-Ju Bai <baijiaju1990@163.com>
      Signed-off-by: default avatarAllen Hubbe <Allen.Hubbe@dell.com>
      Acked-by: default avatarDave Jiang <dave.jiang@intel.com>
      Signed-off-by: default avatarJon Mason <jdmason@kudzu.us>
      88931ec3
    • Dave Jiang's avatar
      ntb: ntb_hw_intel: Skylake doorbells should be 32bits, not 64bits · 5eb449e1
      Dave Jiang authored
      Fixing doorbell register length to 32bits per spec. On Skylake NTB, the
      doorbell registers are 32bit write only registers. The source for the
      doorbell is a 64bit register that shows the interrupt bits.
      Signed-off-by: default avatarDave Jiang <dave.jiang@intel.com>
      Fixes: 783dfa6c ("ntb: Adding Skylake Xeon NTB support")
      Acked-by: default avatarAllen Hubbe <Allen.Hubbe@dell.com>
      Signed-off-by: default avatarJon Mason <jdmason@kudzu.us>
      5eb449e1
    • Logan Gunthorpe's avatar
      ntb_transport: fix bug calculating num_qps_mw · 8e8496e0
      Logan Gunthorpe authored
      A divide by zero error occurs if qp_count is less than mw_count because
      num_qps_mw is calculated to be zero. The calculation appears to be
      incorrect.
      
      The requirement is for num_qps_mw to be set to qp_count / mw_count
      with any remainder divided among the earlier mws.
      
      For example, if mw_count is 5 and qp_count is 12 then mws 0 and 1
      will have 3 qps per window and mws 2 through 4 will have 2 qps per window.
      Thus, when mw_num < qp_count % mw_count, num_qps_mw is 1 higher
      than when mw_num >= qp_count.
      Signed-off-by: default avatarLogan Gunthorpe <logang@deltatee.com>
      Fixes: e26a5843 ("NTB: Split ntb_hw_intel and ntb_transport drivers")
      Acked-by: default avatarAllen Hubbe <Allen.Hubbe@dell.com>
      Signed-off-by: default avatarJon Mason <jdmason@kudzu.us>
      8e8496e0
    • Logan Gunthorpe's avatar
      ntb_transport: fix qp count bug · cb827ee6
      Logan Gunthorpe authored
      In cases where there are more mw's than spads/2-2, the mw count gets
      reduced to match the limitation. ntb_transport also tries to ensure that
      there are fewer qps than mws but uses the full mw count instead of
      the reduced one. When this happens, the math in
      'ntb_transport_setup_qp_mw' will get confused and result in a kernel
      paging request bug.
      
      This patch fixes the bug by reducing qp_count to the reduced mw count
      instead of the full mw count.
      Signed-off-by: default avatarLogan Gunthorpe <logang@deltatee.com>
      Fixes: e26a5843 ("NTB: Split ntb_hw_intel and ntb_transport drivers")
      Acked-by: default avatarAllen Hubbe <Allen.Hubbe@dell.com>
      Signed-off-by: default avatarJon Mason <jdmason@kudzu.us>
      cb827ee6
    • Logan Gunthorpe's avatar
      NTB: ntb_test: fix bug printing ntb_perf results · 07b0b22b
      Logan Gunthorpe authored
      The code mistakenly prints the local perf results for the remote test
      so the script reports identical results for both directions. Fix this
      by ensuring we print the remote result.
      Signed-off-by: default avatarLogan Gunthorpe <logang@deltatee.com>
      Fixes: a9c59ef7 ("ntb_test: Add a selftest script for the NTB subsystem")
      Acked-by: default avatarAllen Hubbe <Allen.Hubbe@dell.com>
      Signed-off-by: default avatarJon Mason <jdmason@kudzu.us>
      07b0b22b
    • Gary R Hook's avatar
      ntb: Correct modinfo usage statement for ntb_perf · 94fc7954
      Gary R Hook authored
      The order parameters are powers of 2; adjust the usage information
      to use correct mathematical representations.
      Signed-off-by: default avatarGary R Hook <gary.hook@amd.com>
      Fixes: 8a7b6a77 ("ntb: ntb perf tool")
      Acked-by: default avatarDave Jiang <dave.jiang@intel.com>
      Signed-off-by: default avatarJon Mason <jdmason@kudzu.us>
      94fc7954
    • Lin Yun Sheng's avatar
      net/hns:bugfix of ethtool -t phy self_test · 7fe5b914
      Lin Yun Sheng authored
      This patch fixes the phy loopback self_test failed issue. when
      Marvell Phy Module is loaded, it will powerdown fiber when doing
      phy loopback self test, which cause phy loopback self_test fail.
      Signed-off-by: default avatarLin Yun Sheng <linyunsheng@huawei.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      7fe5b914
    • Gao Feng's avatar
      net: 8021q: Fix one possible panic caused by BUG_ON in free_netdev · 9745e362
      Gao Feng authored
      The register_vlan_device would invoke free_netdev directly, when
      register_vlan_dev failed. It would trigger the BUG_ON in free_netdev
      if the dev was already registered. In this case, the netdev would be
      freed in netdev_run_todo later.
      
      So add one condition check now. Only when dev is not registered, then
      free it directly.
      
      The following is the part coredump when netdev_upper_dev_link failed
      in register_vlan_dev. I removed the lines which are too long.
      
      [  411.237457] ------------[ cut here ]------------
      [  411.237458] kernel BUG at net/core/dev.c:7998!
      [  411.237484] invalid opcode: 0000 [#1] SMP
      [  411.237705]  [last unloaded: 8021q]
      [  411.237718] CPU: 1 PID: 12845 Comm: vconfig Tainted: G            E   4.12.0-rc5+ #6
      [  411.237737] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
      [  411.237764] task: ffff9cbeb6685580 task.stack: ffffa7d2807d8000
      [  411.237782] RIP: 0010:free_netdev+0x116/0x120
      [  411.237794] RSP: 0018:ffffa7d2807dbdb0 EFLAGS: 00010297
      [  411.237808] RAX: 0000000000000002 RBX: ffff9cbeb6ba8fd8 RCX: 0000000000001878
      [  411.237826] RDX: 0000000000000001 RSI: 0000000000000282 RDI: 0000000000000000
      [  411.237844] RBP: ffffa7d2807dbdc8 R08: 0002986100029841 R09: 0002982100029801
      [  411.237861] R10: 0004000100029980 R11: 0004000100029980 R12: ffff9cbeb6ba9000
      [  411.238761] R13: ffff9cbeb6ba9060 R14: ffff9cbe60f1a000 R15: ffff9cbeb6ba9000
      [  411.239518] FS:  00007fb690d81700(0000) GS:ffff9cbebb640000(0000) knlGS:0000000000000000
      [  411.239949] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  411.240454] CR2: 00007f7115624000 CR3: 0000000077cdf000 CR4: 00000000003406e0
      [  411.240936] Call Trace:
      [  411.241462]  vlan_ioctl_handler+0x3f1/0x400 [8021q]
      [  411.241910]  sock_ioctl+0x18b/0x2c0
      [  411.242394]  do_vfs_ioctl+0xa1/0x5d0
      [  411.242853]  ? sock_alloc_file+0xa6/0x130
      [  411.243465]  SyS_ioctl+0x79/0x90
      [  411.243900]  entry_SYSCALL_64_fastpath+0x1e/0xa9
      [  411.244425] RIP: 0033:0x7fb69089a357
      [  411.244863] RSP: 002b:00007ffcd04e0fc8 EFLAGS: 00000202 ORIG_RAX: 0000000000000010
      [  411.245445] RAX: ffffffffffffffda RBX: 00007ffcd04e2884 RCX: 00007fb69089a357
      [  411.245903] RDX: 00007ffcd04e0fd0 RSI: 0000000000008983 RDI: 0000000000000003
      [  411.246527] RBP: 00007ffcd04e0fd0 R08: 0000000000000000 R09: 1999999999999999
      [  411.246976] R10: 000000000000053f R11: 0000000000000202 R12: 0000000000000004
      [  411.247414] R13: 00007ffcd04e1128 R14: 00007ffcd04e2888 R15: 0000000000000001
      [  411.249129] RIP: free_netdev+0x116/0x120 RSP: ffffa7d2807dbdb0
      Signed-off-by: default avatarGao Feng <gfree.wind@vip.163.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9745e362
    • Raju Rangoju's avatar
      cxgb4: notify uP to route ctrlq compl to rdma rspq · dec6b331
      Raju Rangoju authored
      During the module initialisation there is a possible race
      (basically race between uld and lld) where neither the uld
      nor lld notifies the uP about where to route the ctrl queue
      completions. LLD skips notifying uP as the rdma queues were
      not created by then (will leave it to ULD to notify the uP).
      As the ULD comes up, it also skips notifying the uP as the
      flag FULL_INIT_DONE is not set yet (ULD assumes that the
      interface is not up yet).
      
      Consequently, this race between uld and lld leaves uP
      unnotified about where to send the ctrl queue completions
      to, leading to iwarp RI_RES WR failure.
      
      Here is the race:
      
      CPU 0                                   CPU1
      
      - allocates nic rx queus
      - t4_sge_alloc_ctrl_txq()
      (if rdma rsp queues exists,
      tell uP to route ctrl queue
      compl to rdma rspq)
                                      - acquires the mutex_lock
                                      - allocates rdma response queues
                                      - if FULL_INIT_DONE set,
                                        tell uP to route ctrl queue compl
                                        to rdma rspq
                                      - relinquishes mutex_lock
      - acquires the mutex_lock
      - enable_rx()
      - set FULL_INIT_DONE
      - relinquishes mutex_lock
      
      This patch fixes the above issue.
      
      Fixes: e7519f99('cxgb4: avoid enabling napi twice to the same queue')
      Signed-off-by: default avatarRaju Rangoju <rajur@chelsio.com>
      Acked-by: default avatarSteve Wise <swise@opengridcomputing.com>
      CC: Stable <stable@vger.kernel.org> # 4.9+
      Signed-off-by: default avatarGanesh Goudar <ganeshgr@chelsio.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      dec6b331