1. 05 Jan, 2018 14 commits
  2. 22 Dec, 2017 1 commit
    • Jens Axboe's avatar
      blk-mq: improve heavily contended tag case · 4e5dff41
      Jens Axboe authored
      Even with a number of waitqueues, we can get into a situation where we
      are heavily contended on the waitqueue lock. I got a report on spc1
      where we're spending seconds doing this. Arguably the use case is nasty,
      I reproduce it with one device and 1000 threads banging on the device.
      But that doesn't mean we shouldn't be handling it better.
      
      What ends up happening is that a thread will fail to get a tag, add
      itself to the waitqueue, and subsequently get woken up when a tag is
      freed - only to find itself going back to sleep on the waitqueue.
      
      Instead of waking all threads, use an exclusive wait and wake up our
      sbitmap batch count instead. This seems to work well for me (massive
      improvement for this use case), and it survives basic testing. But I
      haven't fully verified it yet.
      
      An additional improvement is running the queue and checking for a new
      tag BEFORE needing to add ourselves to the waitqueue.
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      4e5dff41
  3. 18 Dec, 2017 1 commit
  4. 17 Dec, 2017 20 commits
  5. 16 Dec, 2017 4 commits
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma · f3b5ad89
      Linus Torvalds authored
      Pull rdma fixes from Jason Gunthorpe:
       "More fixes from testing done on the rc kernel, including more SELinux
        testing. Looking forward, lockdep found regression today in ipoib
        which is still being fixed.
      
        Summary:
      
         - Fix for SELinux on the umad SMI path. Some old hardware does not
           fill the PKey properly exposing another bug in the newer SELinux
           code.
      
         - Check the input port as we can exceed array bounds from this user
           supplied value
      
         - Users are unable to use the hash field support as they want due to
           incorrect checks on the field restrictions, correct that so the
           feature works as intended
      
         - User triggerable oops in the NETLINK_RDMA handler
      
         - cxgb4 driver fix for a bad interaction with CQ flushing in iser
           caused by patches in this merge window, and bad CQ flushing during
           normal close.
      
         - Unbalanced memalloc_noio in ipoib in an error path"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
        IB/ipoib: Restore MM behavior in case of tx_ring allocation failure
        iw_cxgb4: only insert drain cqes if wq is flushed
        iw_cxgb4: only clear the ARMED bit if a notification is needed
        RDMA/netlink: Fix general protection fault
        IB/mlx4: Fix RSS hash fields restrictions
        IB/core: Don't enforce PKey security on SMI MADs
        IB/core: Bound check alternate path port number
      f3b5ad89
    • Linus Torvalds's avatar
      Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · f25e2295
      Linus Torvalds authored
      Pull i2c fixes from Wolfram Sang:
       "Two bugfixes for the AT24 I2C eeprom driver and some minor corrections
        for I2C bus drivers"
      
      * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        i2c: piix4: Fix port number check on release
        i2c: stm32: Fix copyrights
        i2c-cht-wc: constify platform_device_id
        eeprom: at24: change nvmem stride to 1
        eeprom: at24: fix I2C device selection for runtime PM
      f25e2295
    • Linus Torvalds's avatar
      Merge tag 'nfs-for-4.15-3' of git://git.linux-nfs.org/projects/anna/linux-nfs · d025fbf1
      Linus Torvalds authored
      Pull NFS client fixes from Anna Schumaker:
       "This has two stable bugfixes, one to fix a BUG_ON() when
        nfs_commit_inode() is called with no outstanding commit requests and
        another to fix a race in the SUNRPC receive codepath.
      
        Additionally, there are also fixes for an NFS client deadlock and an
        xprtrdma performance regression.
      
        Summary:
      
        Stable bugfixes:
         - NFS: Avoid a BUG_ON() in nfs_commit_inode() by not waiting for a
           commit in the case that there were no commit requests.
         - SUNRPC: Fix a race in the receive code path
      
        Other fixes:
         - NFS: Fix a deadlock in nfs client initialization
         - xprtrdma: Fix a performance regression for small IOs"
      
      * tag 'nfs-for-4.15-3' of git://git.linux-nfs.org/projects/anna/linux-nfs:
        SUNRPC: Fix a race in the receive code path
        nfs: don't wait on commit in nfs_commit_inode() if there were no commit requests
        xprtrdma: Spread reply processing over more CPUs
        nfs: fix a deadlock in nfs client initialization
      d025fbf1
    • Linus Torvalds's avatar
      Revert "mm: replace p??_write with pte_access_permitted in fault + gup paths" · f6f37321
      Linus Torvalds authored
      This reverts commits 5c9d2d5c, c7da82b8, and e7fe7b5c.
      
      We'll probably need to revisit this, but basically we should not
      complicate the get_user_pages_fast() case, and checking the actual page
      table protection key bits will require more care anyway, since the
      protection keys depend on the exact state of the VM in question.
      
      Particularly when doing a "remote" page lookup (ie in somebody elses VM,
      not your own), you need to be much more careful than this was.  Dave
      Hansen says:
      
       "So, the underlying bug here is that we now a get_user_pages_remote()
        and then go ahead and do the p*_access_permitted() checks against the
        current PKRU. This was introduced recently with the addition of the
        new p??_access_permitted() calls.
      
        We have checks in the VMA path for the "remote" gups and we avoid
        consulting PKRU for them. This got missed in the pkeys selftests
        because I did a ptrace read, but not a *write*. I also didn't
        explicitly test it against something where a COW needed to be done"
      
      It's also not entirely clear that it makes sense to check the protection
      key bits at this level at all.  But one possible eventual solution is to
      make the get_user_pages_fast() case just abort if it sees protection key
      bits set, which makes us fall back to the regular get_user_pages() case,
      which then has a vma and can do the check there if we want to.
      
      We'll see.
      
      Somewhat related to this all: what we _do_ want to do some day is to
      check the PAGE_USER bit - it should obviously always be set for user
      pages, but it would be a good check to have back.  Because we have no
      generic way to test for it, we lost it as part of moving over from the
      architecture-specific x86 GUP implementation to the generic one in
      commit e585513b ("x86/mm/gup: Switch GUP to the generic
      get_user_page_fast() implementation").
      
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: "Jérôme Glisse" <jglisse@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      f6f37321