1. 24 Jan, 2017 23 commits
  2. 10 Jan, 2017 3 commits
    • Steve Wise's avatar
      iw_cxgb4: do not send RX_DATA_ACK CPLs after close/abort · 3bcf96e0
      Steve Wise authored
      Function rx_data(), which handles ingress CPL_RX_DATA messages, was
      always sending an RX_DATA_ACK with the goal of updating the credits.
      However, if the RDMA connection is moved out of FPDU mode abruptly,
      then it is possible for iw_cxgb4 to process queued RX_DATA CPLs after HW
      has aborted the connection.  These CPLs should not trigger RX_DATA_ACKS.
      If they do, HW can see a READ after DELETE of the DB_LE hash entry for
      the tid and post a LE_DB HashTblMemCrcError.
      Signed-off-by: default avatarSteve Wise <swise@opengridcomputing.com>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      3bcf96e0
    • Steve Wise's avatar
      iw_cxgb4: free EQ queue memory on last deref · c12a67fe
      Steve Wise authored
      Commit ad61a4c7 ("iw_cxgb4: don't block in destroy_qp awaiting
      the last deref") introduced a bug where the RDMA QP EQ queue memory
      (and QIDs) are possibly freed before the underlying connection has been
      fully shutdown.  The result being a possible DMA read issued by HW after
      the queue memory has been unmapped and freed.  This results in possible
      WR corruption in the worst case, system bus errors if an IOMMU is in use,
      and SGE "bad WR" errors reported in the very least.  The fix is to defer
      unmap/free of queue memory and QID resources until the QP struct has
      been fully dereferenced.  To do this, the c4iw_ucontext must also be kept
      around until the last QP that references it is fully freed.  In addition,
      since the last QP deref can happen in an IRQ disabled context, we need
      a new workqueue thread to do the final unmap/free of the EQ queue memory.
      
      Fixes: ad61a4c7 ("iw_cxgb4: don't block in destroy_qp awaiting the last deref")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarSteve Wise <swise@opengridcomputing.com>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      c12a67fe
    • Steve Wise's avatar
      iw_cxgb4: refactor sq/rq drain logic · 4fe7c296
      Steve Wise authored
      With the addition of the IB/Core drain API, iw_cxgb4 supported drain
      by watching the CQs when the QP was out of RTS and signalling "drain
      complete" when the last CQE is polled.  This, however, doesn't fully
      support the drain semantics. Namely, the drain logic is supposed to signal
      "drain complete" only when the application has _processed_ the last CQE,
      not just removed them from the CQ.  Thus a small timing hole exists that
      can cause touch after free type bugs in applications using the drain API
      (nvmf, iSER, for example).  So iw_cxgb4 needs a better solution.
      
      The iWARP Verbs spec mandates that "_at some point_ after the QP is
      moved to ERROR", the iWARP driver MUST synchronously fail post_send and
      post_recv calls.  iw_cxgb4 was currently not allowing any posts once the
      QP is in ERROR.  This was in part due to the fact that the HW queues for
      the QP in ERROR state are disabled at this point, so there wasn't much
      else to do but fail the post operation synchronously.  This restriction
      is what drove the first drain implementation in iw_cxgb4 that has the
      above mentioned flaw.
      
      This patch changes iw_cxgb4 to allow post_send and post_recv WRs after
      the QP is moved to ERROR state for kernel mode users, thus still adhering
      to the Verbs spec for user mode users, but allowing flush WRs for kernel
      users.  Since the HW queues are disabled, we just synthesize a CQE for
      this post, queue it to the SW CQ, and then call the CQ event handler.
      This enables proper drain operations for the various storage applications.
      Signed-off-by: default avatarSteve Wise <swise@opengridcomputing.com>
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      4fe7c296
  3. 08 Jan, 2017 6 commits
    • Linus Torvalds's avatar
      Linux 4.10-rc3 · a121103c
      Linus Torvalds authored
      a121103c
    • Linus Torvalds's avatar
      Merge tag 'usb-4.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · 83280e90
      Linus Torvalds authored
      Pull USB fixes from Greg KH:
       "Here are a bunch of USB fixes for 4.10-rc3. Yeah, it's a lot, an
        artifact of the holiday break I think.
      
        Lots of gadget and the usual XHCI fixups for reported issues (one day
        that driver will calm down...) Also included are a bunch of usb-serial
        driver fixes, and for good measure, a number of much-reported MUSB
        driver issues have finally been resolved.
      
        All of these have been in linux-next with no reported issues"
      
      * tag 'usb-4.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (72 commits)
        USB: fix problems with duplicate endpoint addresses
        usb: ohci-at91: use descriptor-based gpio APIs correctly
        usb: storage: unusual_uas: Add JMicron JMS56x to unusual device
        usb: hub: Move hub_port_disable() to fix warning if PM is disabled
        usb: musb: blackfin: add bfin_fifo_offset in bfin_ops
        usb: musb: fix compilation warning on unused function
        usb: musb: Fix trying to free already-free IRQ 4
        usb: musb: dsps: implement clear_ep_rxintr() callback
        usb: musb: core: add clear_ep_rxintr() to musb_platform_ops
        USB: serial: ti_usb_3410_5052: fix NULL-deref at open
        USB: serial: spcp8x5: fix NULL-deref at open
        USB: serial: quatech2: fix sleep-while-atomic in close
        USB: serial: pl2303: fix NULL-deref at open
        USB: serial: oti6858: fix NULL-deref at open
        USB: serial: omninet: fix NULL-derefs at open and disconnect
        USB: serial: mos7840: fix misleading interrupt-URB comment
        USB: serial: mos7840: remove unused write URB
        USB: serial: mos7840: fix NULL-deref at open
        USB: serial: mos7720: remove obsolete port initialisation
        USB: serial: mos7720: fix parallel probe
        ...
      83280e90
    • Linus Torvalds's avatar
      Merge tag 'char-misc-4.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc · cc250e26
      Linus Torvalds authored
      Pull char/misc fixes from Greg KH:
       "Here are a few small char/misc driver fixes for 4.10-rc3.
      
        Two MEI driver fixes, and three NVMEM patches for reported issues, and
        a new Hyper-V driver MAINTAINER update. Nothing major at all, all have
        been in linux-next with no reported issues"
      
      * tag 'char-misc-4.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
        hyper-v: Add myself as additional MAINTAINER
        nvmem: fix nvmem_cell_read() return type doc
        nvmem: imx-ocotp: Fix wrong register size
        nvmem: qfprom: Allow single byte accesses for read/write
        mei: move write cb to completion on credentials failures
        mei: bus: fix mei_cldev_enable KDoc
      cc250e26
    • Linus Torvalds's avatar
      Merge tag 'staging-4.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging · 6ea17ed1
      Linus Torvalds authored
      Pull staging/IIO fixes from Greg KH:
       "Here are some staging and IIO driver fixes for 4.10-rc3.
      
        Most of these are minor IIO fixes of reported issues, along with one
        network driver fix to resolve an issue. And a MAINTAINERS update with
        a new mailing list. All of these, except the MAINTAINERS file update,
        have been in linux-next with no reported issues (the MAINTAINERS patch
        happened on Friday...)"
      
      * tag 'staging-4.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
        MAINTAINERS: add greybus subsystem mailing list
        staging: octeon: Call SET_NETDEV_DEV()
        iio: accel: st_accel: fix LIS3LV02 reading and scaling
        iio: common: st_sensors: fix channel data parsing
        iio: max44000: correct value in illuminance_integration_time_available
        iio: adc: TI_AM335X_ADC should depend on HAS_DMA
        iio: bmi160: Fix time needed to sleep after command execution
        iio: 104-quad-8: Fix active level mismatch for the preset enable option
        iio: 104-quad-8: Fix off-by-one errors when addressing IOR
        iio: 104-quad-8: Fix index control configuration
      6ea17ed1
    • Johannes Weiner's avatar
      mm: workingset: fix use-after-free in shadow node shrinker · ea07b862
      Johannes Weiner authored
      Several people report seeing warnings about inconsistent radix tree
      nodes followed by crashes in the workingset code, which all looked like
      use-after-free access from the shadow node shrinker.
      
      Dave Jones managed to reproduce the issue with a debug patch applied,
      which confirmed that the radix tree shrinking indeed frees shadow nodes
      while they are still linked to the shadow LRU:
      
        WARNING: CPU: 2 PID: 53 at lib/radix-tree.c:643 delete_node+0x1e4/0x200
        CPU: 2 PID: 53 Comm: kswapd0 Not tainted 4.10.0-rc2-think+ #3
        Call Trace:
           delete_node+0x1e4/0x200
           __radix_tree_delete_node+0xd/0x10
           shadow_lru_isolate+0xe6/0x220
           __list_lru_walk_one.isra.4+0x9b/0x190
           list_lru_walk_one+0x23/0x30
           scan_shadow_nodes+0x2e/0x40
           shrink_slab.part.44+0x23d/0x5d0
           shrink_node+0x22c/0x330
           kswapd+0x392/0x8f0
      
      This is the WARN_ON_ONCE(!list_empty(&node->private_list)) placed in the
      inlined radix_tree_shrink().
      
      The problem is with 14b46879 ("mm: workingset: move shadow entry
      tracking to radix tree exceptional tracking"), which passes an update
      callback into the radix tree to link and unlink shadow leaf nodes when
      tree entries change, but forgot to pass the callback when reclaiming a
      shadow node.
      
      While the reclaimed shadow node itself is unlinked by the shrinker, its
      deletion from the tree can cause the left-most leaf node in the tree to
      be shrunk.  If that happens to be a shadow node as well, we don't unlink
      it from the LRU as we should.
      
      Consider this tree, where the s are shadow entries:
      
             root->rnode
                  |
             [0       n]
              |       |
           [s    ] [sssss]
      
      Now the shadow node shrinker reclaims the rightmost leaf node through
      the shadow node LRU:
      
             root->rnode
                  |
             [0        ]
              |
          [s     ]
      
      Because the parent of the deleted node is the first level below the
      root and has only one child in the left-most slot, the intermediate
      level is shrunk and the node containing the single shadow is put in
      its place:
      
             root->rnode
                  |
             [s        ]
      
      The shrinker again sees a single left-most slot in a first level node
      and thus decides to store the shadow in root->rnode directly and free
      the node - which is a leaf node on the shadow node LRU.
      
        root->rnode
             |
             s
      
      Without the update callback, the freed node remains on the shadow LRU,
      where it causes later shrinker runs to crash.
      
      Pass the node updater callback into __radix_tree_delete_node() in case
      the deletion causes the left-most branch in the tree to collapse too.
      
      Also add warnings when linked nodes are freed right away, rather than
      wait for the use-after-free when the list is scanned much later.
      
      Fixes: 14b46879 ("mm: workingset: move shadow entry tracking to radix tree exceptional tracking")
      Reported-by: default avatarDave Chinner <david@fromorbit.com>
      Reported-by: default avatarHugh Dickins <hughd@google.com>
      Reported-by: default avatarAndrea Arcangeli <aarcange@redhat.com>
      Reported-and-tested-by: default avatarDave Jones <davej@codemonkey.org.uk>
      Signed-off-by: default avatarJohannes Weiner <hannes@cmpxchg.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Chris Leech <cleech@redhat.com>
      Cc: Lee Duncan <lduncan@suse.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Matthew Wilcox <mawilcox@linuxonhyperv.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      ea07b862
    • Hugh Dickins's avatar
      mm: stop leaking PageTables · b0b9b3df
      Hugh Dickins authored
      4.10-rc loadtest (even on x86, and even without THPCache) fails with
      "fork: Cannot allocate memory" or some such; and /proc/meminfo shows
      PageTables growing.
      
      Commit 953c66c2 ("mm: THP page cache support for ppc64") that got
      merged in rc1 removed the freeing of an unused preallocated pagetable
      after do_fault_around() has called map_pages().
      
      This is usually a good optimization, so that the followup doesn't have
      to reallocate one; but it's not sufficient to shift the freeing into
      alloc_set_pte(), since there are failure cases (most commonly
      VM_FAULT_RETRY) which never reach finish_fault().
      
      Check and free it at the outer level in do_fault(), then we don't need
      to worry in alloc_set_pte(), and can restore that to how it was (I
      cannot find any reason to pte_free() under lock as it was doing).
      
      And fix a separate pagetable leak, or crash, introduced by the same
      change, that could only show up on some ppc64: why does do_set_pmd()'s
      failure case attempt to withdraw a pagetable when it never deposited
      one, at the same time overwriting (so leaking) the vmf->prealloc_pte?
      Residue of an earlier implementation, perhaps? Delete it.
      
      Fixes: 953c66c2 ("mm: THP page cache support for ppc64")
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Michael Neuling <mikey@neuling.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Balbir Singh <bsingharora@gmail.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarHugh Dickins <hughd@google.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      b0b9b3df
  4. 07 Jan, 2017 2 commits
  5. 06 Jan, 2017 6 commits