1. 26 Jun, 2020 1 commit
    • Jens Axboe's avatar
      io_uring: use task_work for links if possible · c40f6379
      Jens Axboe authored
      Currently links are always done in an async fashion, unless we catch them
      inline after we successfully complete a request without having to resort
      to blocking. This isn't necessarily the most efficient approach, it'd be
      more ideal if we could just use the task_work handling for this.
      
      Outside of saving an async jump, we can also do less prep work for these
      kinds of requests.
      
      Running dependent links from the task_work handler yields some nice
      performance benefits. As an example, examples/link-cp from the liburing
      repository uses read+write links to implement a copy operation. Without
      this patch, the a cache fold 4G file read from a VM runs in about 3
      seconds:
      
      $ time examples/link-cp /data/file /dev/null
      
      real	0m2.986s
      user	0m0.051s
      sys	0m2.843s
      
      and a subsequent cache hot run looks like this:
      
      $ time examples/link-cp /data/file /dev/null
      
      real	0m0.898s
      user	0m0.069s
      sys	0m0.797s
      
      With this patch in place, the cold case takes about 2.4 seconds:
      
      $ time examples/link-cp /data/file /dev/null
      
      real	0m2.400s
      user	0m0.020s
      sys	0m2.366s
      
      and the cache hot case looks like this:
      
      $ time examples/link-cp /data/file /dev/null
      
      real	0m0.676s
      user	0m0.010s
      sys	0m0.665s
      
      As expected, the (mostly) cache hot case yields the biggest improvement,
      running about 25% faster with this change, while the cache cold case
      yields about a 20% increase in performance. Outside of the performance
      increase, we're using less CPU as well, as we're not using the async
      offload threads at all for this anymore.
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      c40f6379
  2. 25 Jun, 2020 6 commits
    • Jens Axboe's avatar
      io_uring: enable READ/WRITE to use deferred completions · a1d7c393
      Jens Axboe authored
      A bit more surgery required here, as completions are generally done
      through the kiocb->ki_complete() callback, even if they complete inline.
      This enables the regular read/write path to use the io_comp_state
      logic to batch inline completions.
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      a1d7c393
    • Jens Axboe's avatar
      io_uring: pass in completion state to appropriate issue side handlers · 229a7b63
      Jens Axboe authored
      Provide the completion state to the handlers that we know can complete
      inline, so they can utilize this for batching completions.
      
      Cap the max batch count at 32. This should be enough to provide a good
      amortization of the cost of the lock+commit dance for completions, while
      still being low enough not to cause any real latency issues for SQPOLL
      applications.
      
      Xuan Zhuo <xuanzhuo@linux.alibaba.com> reports that this changes his
      profile from:
      
      17.97% [kernel] [k] copy_user_generic_unrolled
      13.92% [kernel] [k] io_commit_cqring
      11.04% [kernel] [k] __io_cqring_fill_event
      10.33% [kernel] [k] udp_recvmsg
       5.94% [kernel] [k] skb_release_data
       4.31% [kernel] [k] udp_rmem_release
       2.68% [kernel] [k] __check_object_size
       2.24% [kernel] [k] __slab_free
       2.22% [kernel] [k] _raw_spin_lock_bh
       2.21% [kernel] [k] kmem_cache_free
       2.13% [kernel] [k] free_pcppages_bulk
       1.83% [kernel] [k] io_submit_sqes
       1.38% [kernel] [k] page_frag_free
       1.31% [kernel] [k] inet_recvmsg
      
      to
      
      19.99% [kernel] [k] copy_user_generic_unrolled
      11.63% [kernel] [k] skb_release_data
       9.36% [kernel] [k] udp_rmem_release
       8.64% [kernel] [k] udp_recvmsg
       6.21% [kernel] [k] __slab_free
       4.39% [kernel] [k] __check_object_size
       3.64% [kernel] [k] free_pcppages_bulk
       2.41% [kernel] [k] kmem_cache_free
       2.00% [kernel] [k] io_submit_sqes
       1.95% [kernel] [k] page_frag_free
       1.54% [kernel] [k] io_put_req
      [...]
       0.07% [kernel] [k] io_commit_cqring
       0.44% [kernel] [k] __io_cqring_fill_event
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      229a7b63
    • Jens Axboe's avatar
      io_uring: pass down completion state on the issue side · f13fad7b
      Jens Axboe authored
      No functional changes in this patch, just in preparation for having the
      completion state be available on the issue side. Later on, this will
      allow requests that complete inline to be completed in batches.
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      f13fad7b
    • Jens Axboe's avatar
      io_uring: add 'io_comp_state' to struct io_submit_state · 013538bd
      Jens Axboe authored
      No functional changes in this patch, just in preparation for passing back
      pending completions to the caller and completing them in a batched
      fashion.
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      013538bd
    • Jens Axboe's avatar
      io_uring: provide generic io_req_complete() helper · e1e16097
      Jens Axboe authored
      We have lots of callers of:
      
      io_cqring_add_event(req, result);
      io_put_req(req);
      
      Provide a helper that does this for us. It helps clean up the code, and
      also provides a more convenient location for us to change the completion
      handling.
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      e1e16097
    • Pavel Begunkov's avatar
      io_uring: fix NULL-mm for linked reqs · d3cac64c
      Pavel Begunkov authored
      __io_queue_sqe() tries to handle all request of a link,
      so it's not enough to grab mm in io_sq_thread_acquire_mm()
      based just on the head.
      
      Don't check req->needs_mm and do it always.
      Signed-off-by: default avatarPavel Begunkov <asml.silence@gmail.com>
      d3cac64c
  3. 22 Jun, 2020 25 commits
  4. 21 Jun, 2020 8 commits
    • Linus Torvalds's avatar
      Linux 5.8-rc2 · 48778464
      Linus Torvalds authored
      48778464
    • Linus Torvalds's avatar
      Merge tag 'selinux-pr-20200621' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux · 817d914d
      Linus Torvalds authored
      Pull SELinux fixes from Paul Moore:
       "Three small patches to fix problems in the SELinux code, all found via
        clang.
      
        Two patches fix potential double-free conditions and one fixes an
        undefined return value"
      
      * tag 'selinux-pr-20200621' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
        selinux: fix undefined return of cond_evaluate_expr
        selinux: fix a double free in cond_read_node()/cond_read_list()
        selinux: fix double free
      817d914d
    • Linus Torvalds's avatar
      Merge tag 'pinctrl-v5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl · 16f4aa9b
      Linus Torvalds authored
      Pull pin control fixes from Linus Walleij:
       "Some early fixes collected during the first week after the merge
        window, all pretty self-evident, with the details below. The revert is
        the crucial thing.
      
         - Fix a warning on the Qualcomm SPMI GPIO chip being instatiated
           twice without a unique irqchip struct
      
         - Use the noirq variants of the suspend and resume callbacks in the
           Tegra driver
      
         - Clean up the errorpath on the MCP23s08 driver
      
         - Revert the use of devm_of_iomap() in the Freescale driver as it was
           regressing the platform
      
         - Add some missing pins in the Qualcomm IPQ6018 driver
      
         - Fix a simple documentation bug in the pinctrl-single driver"
      
      * tag 'pinctrl-v5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
        pinctrl: single: fix function name in documentation
        pinctrl: qcom: ipq6018 Add missing pins in qpic pin group
        Revert "pinctrl: freescale: imx: Use 'devm_of_iomap()' to avoid a resource leak in case of error in 'imx_pinctrl_probe()'"
        pinctrl: mcp23s08: Split to three parts: fix ptr_ret.cocci warnings
        pinctrl: tegra: Use noirq suspend/resume callbacks
        pinctrl: qcom: spmi-gpio: fix warning about irq chip reusage
      16f4aa9b
    • Linus Torvalds's avatar
      Merge tag 'kbuild-fixes-v5.8' of... · be9160a9
      Linus Torvalds authored
      Merge tag 'kbuild-fixes-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
      
      Pull Kbuild fixes from Masahiro Yamada:
      
       - fix -gz=zlib compiler option test for CONFIG_DEBUG_INFO_COMPRESSED
      
       - improve cc-option in scripts/Kbuild.include to clean up temp files
      
       - improve cc-option in scripts/Kconfig.include for more reliable
         compile option test
      
       - do not copy modules.builtin by 'make install' because it would break
         existing systems
      
       - use 'userprogs' syntax for watch_queue sample
      
      * tag 'kbuild-fixes-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
        samples: watch_queue: build sample program for target architecture
        Revert "Makefile: install modules.builtin even if CONFIG_MODULES=n"
        scripts: Fix typo in headers_install.sh
        kconfig: unify cc-option and as-option
        kbuild: improve cc-option to clean up all temporary files
        Makefile: Improve compressed debug info support detection
      be9160a9
    • Linus Torvalds's avatar
      Merge tag 'powerpc-5.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 75613939
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
      
       - One fix for the interrupt rework we did last release which broke
         KVM-PR
      
       - Three commits fixing some fallout from the READ_ONCE() changes
         interacting badly with our 8xx 16K pages support, which uses a pte_t
         that is a structure of 4 actual PTEs
      
       - A cleanup of the 8xx pte_update() to use the newly added pmd_off()
      
       - A fix for a crash when handling an oops if CONFIG_DEBUG_VIRTUAL is
         enabled
      
       - A minor fix for the SPU syscall generation
      
      Thanks to Aneesh Kumar K.V, Christian Zigotzky, Christophe Leroy, Mike
      Rapoport, Nicholas Piggin.
      
      * tag 'powerpc-5.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/8xx: Provide ptep_get() with 16k pages
        mm: Allow arches to provide ptep_get()
        mm/gup: Use huge_ptep_get() in gup_hugepte()
        powerpc/syscalls: Use the number when building SPU syscall table
        powerpc/8xx: use pmd_off() to access a PMD entry in pte_update()
        powerpc/64s: Fix KVM interrupt using wrong save area
        powerpc: Fix kernel crash in show_instructions() w/DEBUG_VIRTUAL
      75613939
    • Linus Torvalds's avatar
      Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · 93bbca27
      Linus Torvalds authored
      Pull crypto fixes from Herbert Xu:
      
       - NULL dereference in octeontx
      
       - PM reference imbalance in ks-sa
      
       - deadlock in crypto manager
      
       - memory leak in drbg
      
       - missing socket limit check on receive SG list size in algif_skcipher
      
       - typos in caam
      
       - warnings in ccp and hisilicon
      
      * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
        crypto: drbg - always try to free Jitter RNG instance
        crypto: marvell/octeontx - Fix a potential NULL dereference
        crypto: algboss - don't wait during notifier callback
        crypto: caam - fix typos
        crypto: ccp - Fix sparse warnings in sev-dev
        crypto: hisilicon - Cap block size at 2^31
        crypto: algif_skcipher - Cap recv SG list at ctx->used
        hwrng: ks-sa - Fix runtime PM imbalance on error
      93bbca27
    • Masahiro Yamada's avatar
      samples: watch_queue: build sample program for target architecture · 214377e9
      Masahiro Yamada authored
      This userspace program includes UAPI headers exported to usr/include/.
      'make headers' always works for the target architecture (i.e. the same
      architecture as the kernel), so the sample program should be built for
      the target as well. Kbuild now supports 'userprogs' for that.
      
      I also guarded the CONFIG option by 'depends on CC_CAN_LINK' because
      $(CC) may not provide libc.
      Signed-off-by: default avatarMasahiro Yamada <masahiroy@kernel.org>
      214377e9
    • Masahiro Yamada's avatar
      Revert "Makefile: install modules.builtin even if CONFIG_MODULES=n" · 2c6d9636
      Masahiro Yamada authored
      This reverts commit e0b250b5,
      which broke build systems that need to install files to a certain
      path, but do not set INSTALL_MOD_PATH when invoking 'make install'.
      
        $ make INSTALL_PATH=/tmp/destdir install
        mkdir: cannot create directory ‘/lib/modules/5.8.0-rc1+/’: Permission denied
        Makefile:1342: recipe for target '_builtin_inst_' failed
        make: *** [_builtin_inst_] Error 1
      
      While modules.builtin is useful also for CONFIG_MODULES=n, this change
      in the behavior is quite unexpected. Maybe "make modules_install"
      can install modules.builtin irrespective of CONFIG_MODULES as Jonas
      originally suggested.
      
      Anyway, that commit should be reverted ASAP.
      Reported-by: default avatarDouglas Anderson <dianders@chromium.org>
      Reported-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Cc: Jonas Karlman <jonas@kwiboo.se>
      Signed-off-by: default avatarMasahiro Yamada <masahiroy@kernel.org>
      Reviewed-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Tested-by: default avatarGuenter Roeck <linux@roeck-us.net>
      2c6d9636