1. 23 Jul, 2020 1 commit
  2. 18 Jul, 2020 2 commits
    • io_uring: always allow drain/link/hardlink/async sqe flags · 61710e43
      Daniele Albano authored
      We currently filter these for timeout_remove/async_cancel/files_update,
      but we should only be filtering for fixed file and buffer select. This
      also causes a second read of sqe->flags, which isn't needed.
      
      Just check req->flags for the relevant bits. This then allows these
      commands to be used in links, for example, like everything else.

      Signed-off-by: Daniele Albano <d.albano@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: ensure double poll additions work with both request types · 807abcb0
      Jens Axboe authored
      The double poll additions were centered around doing POLL_ADD on file
      descriptors that use more than one waitqueue (typically one for read,
      one for write) when being polled. However, it can also end up being
      triggered for when we use poll triggered retry. For that case, we cannot
      safely use req->io, as that could be used by the request type itself.
      
      Add a second io_poll_iocb pointer in the structure we allocate for poll
      based retry, and ensure we use the right one from the two paths.
      
      Fixes: 18bceab1 ("io_uring: allow POLL_ADD with double poll_wait() users")
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  3. 15 Jul, 2020 1 commit
  4. 12 Jul, 2020 2 commits
  5. 10 Jul, 2020 2 commits
    • io_uring: account user memory freed when exit has been queued · 309fc03a
      Jens Axboe authored
      We currently account the memory after the exit work has been run, but
      that leaves a gap where a process has closed its ring and until the
      memory has been accounted as freed. If the memlocked ulimit is
      borderline, then that can introduce spurious setup errors returning
      -ENOMEM because the free work hasn't been run yet.
      
      Account this as freed when we close the ring, so as not to expose a tiny
      gap where setting up a new ring can fail.
      
      Fixes: 85faa7b8 ("io_uring: punt final io_ring_ctx wait-and-free to workqueue")
      Cc: stable@vger.kernel.org # v5.7
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: fix memleak in io_sqe_files_register() · 667e57da
      Yang Yingliang authored
      I got a memleak report when doing some fuzz test:
      
      BUG: memory leak
      unreferenced object 0x607eeac06e78 (size 8):
        comm "test", pid 295, jiffies 4294735835 (age 31.745s)
        hex dump (first 8 bytes):
          00 00 00 00 00 00 00 00                          ........
        backtrace:
          [<00000000932632e6>] percpu_ref_init+0x2a/0x1b0
          [<0000000092ddb796>] __io_uring_register+0x111d/0x22a0
          [<00000000eadd6c77>] __x64_sys_io_uring_register+0x17b/0x480
          [<00000000591b89a6>] do_syscall_64+0x56/0xa0
          [<00000000864a281d>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Call percpu_ref_exit() on error path to avoid
      refcount memleak.
      
      Fixes: 05f3fb3c ("io_uring: avoid ring quiesce for fixed file set unregister and update")
      Cc: stable@vger.kernel.org
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
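
      The pattern behind this fix can be illustrated in user space: once an
      init step has succeeded, every later failure path has to call the
      matching exit step. The following is a minimal sketch, not the kernel
      code. toy_ref_init(), toy_ref_exit() and toy_files_register() are
      hypothetical stand-ins and do not use the percpu_ref API.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcount object that owns an allocation. */
struct toy_ref {
        long *count;
};

/* Number of live allocations; lets a caller check for leaks. */
static int toy_live_allocs;

int toy_ref_init(struct toy_ref *ref)
{
        ref->count = calloc(1, sizeof(*ref->count));
        if (!ref->count)
                return -1;
        toy_live_allocs++;
        return 0;
}

void toy_ref_exit(struct toy_ref *ref)
{
        assert(ref->count != NULL);
        free(ref->count);
        ref->count = NULL;
        toy_live_allocs--;
}

/*
 * Hypothetical register routine: 'fail_later' simulates an error that
 * happens after the ref was initialized. Without the toy_ref_exit() call
 * on that path, the allocation made by toy_ref_init() would leak.
 */
int toy_files_register(struct toy_ref *ref, int fail_later)
{
        if (toy_ref_init(ref))
                return -1;
        if (fail_later) {
                toy_ref_exit(ref);      /* undo init on the error path */
                return -1;
        }
        return 0;
}
```

      After a failed toy_files_register() call, toy_live_allocs is back at
      its previous value, which is the property the leak report was missing.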
  6. 09 Jul, 2020 2 commits
    • io_uring: fix memleak in __io_sqe_files_update() · f3bd9dae
      Yang Yingliang authored
      I got a memleak report when doing some fuzz test:
      
      BUG: memory leak
      unreferenced object 0xffff888113e02300 (size 488):
      comm "syz-executor401", pid 356, jiffies 4294809529 (age 11.954s)
      hex dump (first 32 bytes):
      00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
      a0 a4 ce 19 81 88 ff ff 60 ce 09 0d 81 88 ff ff ........`.......
      backtrace:
      [<00000000129a84ec>] kmem_cache_zalloc include/linux/slab.h:659 [inline]
      [<00000000129a84ec>] __alloc_file+0x25/0x310 fs/file_table.c:101
      [<000000003050ad84>] alloc_empty_file+0x4f/0x120 fs/file_table.c:151
      [<000000004d0a41a3>] alloc_file+0x5e/0x550 fs/file_table.c:193
      [<000000002cb242f0>] alloc_file_pseudo+0x16a/0x240 fs/file_table.c:233
      [<00000000046a4baa>] anon_inode_getfile fs/anon_inodes.c:91 [inline]
      [<00000000046a4baa>] anon_inode_getfile+0xac/0x1c0 fs/anon_inodes.c:74
      [<0000000035beb745>] __do_sys_perf_event_open+0xd4a/0x2680 kernel/events/core.c:11720
      [<0000000049009dc7>] do_syscall_64+0x56/0xa0 arch/x86/entry/common.c:359
      [<00000000353731ca>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      BUG: memory leak
      unreferenced object 0xffff8881152dd5e0 (size 16):
      comm "syz-executor401", pid 356, jiffies 4294809529 (age 11.954s)
      hex dump (first 16 bytes):
      01 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 ................
      backtrace:
      [<0000000074caa794>] kmem_cache_zalloc include/linux/slab.h:659 [inline]
      [<0000000074caa794>] lsm_file_alloc security/security.c:567 [inline]
      [<0000000074caa794>] security_file_alloc+0x32/0x160 security/security.c:1440
      [<00000000c6745ea3>] __alloc_file+0xba/0x310 fs/file_table.c:106
      [<000000003050ad84>] alloc_empty_file+0x4f/0x120 fs/file_table.c:151
      [<000000004d0a41a3>] alloc_file+0x5e/0x550 fs/file_table.c:193
      [<000000002cb242f0>] alloc_file_pseudo+0x16a/0x240 fs/file_table.c:233
      [<00000000046a4baa>] anon_inode_getfile fs/anon_inodes.c:91 [inline]
      [<00000000046a4baa>] anon_inode_getfile+0xac/0x1c0 fs/anon_inodes.c:74
      [<0000000035beb745>] __do_sys_perf_event_open+0xd4a/0x2680 kernel/events/core.c:11720
      [<0000000049009dc7>] do_syscall_64+0x56/0xa0 arch/x86/entry/common.c:359
      [<00000000353731ca>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      If io_sqe_file_register() fails, we need to put the file obtained by
      fget() to avoid the memleak.
      
      Fixes: c3a31e60 ("io_uring: add support for IORING_REGISTER_FILES_UPDATE")
      Cc: stable@vger.kernel.org
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
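
      The get/put imbalance described above can be sketched with a toy
      reference count. This is an illustration under assumed names only:
      toy_fget(), toy_fput() and toy_files_update() are hypothetical and
      merely mimic the roles of fget(), fput() and the update path.

```c
#include <assert.h>

/* Hypothetical stand-in for a reference-counted file object. */
struct toy_file {
        int refs;
};

struct toy_file *toy_fget(struct toy_file *f)
{
        f->refs++;
        return f;
}

void toy_fput(struct toy_file *f)
{
        assert(f->refs > 0);
        f->refs--;
}

/*
 * Hypothetical update step. 'register_ok' simulates whether the
 * registration step succeeds. On failure the reference taken by
 * toy_fget() is dropped again; on success the table slot keeps it.
 */
int toy_files_update(struct toy_file **slot, struct toy_file *f,
                     int register_ok)
{
        struct toy_file *file = toy_fget(f);

        if (!register_ok) {
                toy_fput(file);         /* the put the leaky path lacked */
                return -1;
        }
        *slot = file;
        return 0;
}
```

      A failed update leaves the reference count unchanged, while a
      successful one raises it by exactly one for the table slot.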
    • io_uring: export cq overflow status to userspace · 6d5f9049
      Xiaoguang Wang authored
      Applications that do not use io_uring_enter() to reap and handle cqes
      may rely entirely on liburing's io_uring_peek_cqe(). If the cq ring has
      overflowed, io_uring_peek_cqe() is currently not aware of the overflow
      and won't enter the kernel to flush cqes. The test program below
      reveals this bug:
      
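      #include <assert.h>
      #include <errno.h>
      #include <stdio.h>
      #include <string.h>
      #include <liburing.h>

      static void test_cq_overflow(struct io_uring *ring)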
      {
              struct io_uring_cqe *cqe;
              struct io_uring_sqe *sqe;
              int issued = 0;
              int ret = 0;
      
              do {
                      sqe = io_uring_get_sqe(ring);
                      if (!sqe) {
                              fprintf(stderr, "get sqe failed\n");
                              break;
                      }
                      ret = io_uring_submit(ring);
                      if (ret <= 0) {
                              if (ret != -EBUSY)
                                      fprintf(stderr, "sqe submit failed: %d\n", ret);
                              break;
                      }
                      issued++;
              } while (ret > 0);
              assert(ret == -EBUSY);
      
              printf("issued requests: %d\n", issued);
      
              while (issued) {
                      ret = io_uring_peek_cqe(ring, &cqe);
                      if (ret) {
                              if (ret != -EAGAIN) {
                                      fprintf(stderr, "peek completion failed: %s\n",
                                              strerror(-ret));
                                      break;
                              }
                              printf("left requests: %d\n", issued);
                              continue;
                      }
                      io_uring_cqe_seen(ring, cqe);
                      issued--;
                      printf("left requests: %d\n", issued);
              }
      }
      
      int main(int argc, char *argv[])
      {
              int ret;
              struct io_uring ring;
      
              ret = io_uring_queue_init(16, &ring, 0);
              if (ret) {
                      fprintf(stderr, "ring setup failed: %d\n", ret);
                      return 1;
              }
      
              test_cq_overflow(&ring);
              return 0;
      }
      
      To fix this issue, export the cq overflow status to userspace by adding
      a new IORING_SQ_CQ_OVERFLOW flag, so that helper functions in liburing,
      such as io_uring_peek_cqe(), can be aware of the cq overflow and flush
      accordingly.

      Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
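
      How a peek helper might use such a flag can be sketched with a toy
      model of the shared ring state. This is an assumption-laden
      illustration, not liburing's implementation: struct toy_ring,
      toy_flush_overflow(), toy_peek_cqe() and TOY_SQ_CQ_OVERFLOW are all
      hypothetical names, and the flush function only simulates entering
      the kernel.

```c
#include <assert.h>

/* Toy stand-in for the IORING_SQ_CQ_OVERFLOW flag bit. */
#define TOY_SQ_CQ_OVERFLOW (1U << 1)

/* Toy model of the state shared between kernel and userspace. */
struct toy_ring {
        unsigned int sq_flags;  /* flag word visible to userspace */
        int cq_ready;           /* completions visible in the CQ ring */
        int cq_backlog;         /* completions held back after overflow */
        int cq_capacity;        /* CQ ring size */
};

/* Stand-in for entering the kernel to flush overflowed completions. */
void toy_flush_overflow(struct toy_ring *r)
{
        assert(r->cq_capacity > 0);
        while (r->cq_backlog > 0 && r->cq_ready < r->cq_capacity) {
                r->cq_backlog--;
                r->cq_ready++;
        }
        if (r->cq_backlog == 0)
                r->sq_flags &= ~TOY_SQ_CQ_OVERFLOW;
}

/*
 * Peek one completion. Returns 0 and consumes an entry on success,
 * -1 if nothing is available. When the ring looks empty but the
 * overflow flag is set, flush first instead of reporting "empty".
 */
int toy_peek_cqe(struct toy_ring *r)
{
        if (r->cq_ready == 0 && (r->sq_flags & TOY_SQ_CQ_OVERFLOW))
                toy_flush_overflow(r);
        if (r->cq_ready == 0)
                return -1;
        r->cq_ready--;
        return 0;
}
```

      Without the flag check, toy_peek_cqe() would keep returning -1 while
      completions sit in the backlog, which mirrors the endless -EAGAIN loop
      in the test program.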
  7. 04 Jul, 2020 1 commit
    • io_uring: fix regression with always ignoring signals in io_cqring_wait() · b7db41c9
      Jens Axboe authored
      When switching to TWA_SIGNAL for task_work notifications, we also made
      any signal based condition in io_cqring_wait() return -ERESTARTSYS.
      This breaks applications that rely on using signals to abort someone
      waiting for events.
      
      Check if we have a signal pending because of queued task_work, and
      repeat the signal check once we've run the task_work. This provides a
      reliable way of telling the two apart.
      
      Additionally, only use TWA_SIGNAL if we are using an eventfd. If not,
      we don't have the dependency situation described in the original commit,
      and we can get by with just using TWA_RESUME like we previously did.
      
      Fixes: ce593a6c ("io_uring: use signal based task_work running")
      Cc: stable@vger.kernel.org # v5.7
      Reported-by: Andres Freund <andres@anarazel.de>
      Tested-by: Andres Freund <andres@anarazel.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  8. 30 Jun, 2020 2 commits
    • io_uring: use signal based task_work running · ce593a6c
      Jens Axboe authored
      Since 5.7, we've been using task_work to trigger async running of
      requests in the context of the original task. This generally works
      great, but there's a case where if the task is currently blocked
      in the kernel waiting on a condition to become true, it won't process
      task_work. Even though the task is woken, it just checks whatever
      condition it's waiting on, and goes back to sleep if it's still false.
      
      This is a problem if that very condition only becomes true when that
      task_work is run. An example of that is the task registering an eventfd
      with io_uring, and it's now blocked waiting on an eventfd read. That
      read could depend on a completion event, and that completion event
      won't get triggered until task_work has been run.
      
      Use the TWA_SIGNAL notification for task_work, so that we ensure that
      the task always runs the work when queued.
      
      Cc: stable@vger.kernel.org # v5.7
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • task_work: teach task_work_add() to do signal_wake_up() · e91b4816
      Oleg Nesterov authored
      So that the target task will exit the wait_event_interruptible-like
      loop and call task_work_run() asap.
      
      The patch turns "bool notify" into 0,TWA_RESUME,TWA_SIGNAL enum, the
      new TWA_SIGNAL flag implies signal_wake_up().  However, it needs to
      avoid the race with recalc_sigpending(), so the patch also adds the
      new JOBCTL_TASK_WORK bit included in JOBCTL_PENDING_MASK.
      
      TODO: once this patch is merged we need to change all current users
      of task_work_add(notify = true) to use TWA_RESUME.
      
      Cc: stable@vger.kernel.org # v5.7
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
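
      The change from a boolean to a three-way notification mode can be
      sketched as follows. This is a simplified user-space model, not the
      kernel implementation: the TOY_TWA_* names, struct toy_task and both
      functions are hypothetical, and the model ignores the
      recalc_sigpending() race that the real patch has to handle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy notification modes mirroring the names in the commit message. */
enum toy_notify {
        TOY_TWA_NONE = 0,
        TOY_TWA_RESUME,
        TOY_TWA_SIGNAL,
};

struct toy_task {
        int pending_work;       /* queued work items */
        bool notify_resume;     /* run work on return to user mode */
        bool signal_wake;       /* also break out of interruptible waits */
};

void toy_task_work_add(struct toy_task *t, enum toy_notify mode)
{
        assert(t != NULL);
        t->pending_work++;
        switch (mode) {
        case TOY_TWA_NONE:
                break;
        case TOY_TWA_RESUME:
                t->notify_resume = true;
                break;
        case TOY_TWA_SIGNAL:
                t->signal_wake = true;
                break;
        }
}

/* An interruptible wait loop only leaves early when signal_wake is set. */
bool toy_wait_is_interrupted(const struct toy_task *t)
{
        return t->signal_wake;
}
```

      In this model, work queued with TOY_TWA_RESUME leaves a sleeping
      waiter asleep, while TOY_TWA_SIGNAL makes the wait loop exit so the
      queued work can run.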
  9. 25 Jun, 2020 2 commits
    • io_uring: fix current->mm NULL dereference on exit · d60b5fbc
      Pavel Begunkov authored
      Don't reissue requests from io_iopoll_reap_events(): the task may not
      have an mm, which ends up as a NULL dereference. It's better to kill
      everything off on exit anyway.
      
      [  677.734670] RIP: 0010:io_iopoll_complete+0x27e/0x630
      ...
      [  677.734679] Call Trace:
      [  677.734695]  ? __send_signal+0x1f2/0x420
      [  677.734698]  ? _raw_spin_unlock_irqrestore+0x24/0x40
      [  677.734699]  ? send_signal+0xf5/0x140
      [  677.734700]  io_iopoll_getevents+0x12f/0x1a0
      [  677.734702]  io_iopoll_reap_events.part.0+0x5e/0xa0
      [  677.734703]  io_ring_ctx_wait_and_kill+0x132/0x1c0
      [  677.734704]  io_uring_release+0x20/0x30
      [  677.734706]  __fput+0xcd/0x230
      [  677.734707]  ____fput+0xe/0x10
      [  677.734709]  task_work_run+0x67/0xa0
      [  677.734710]  do_exit+0x35d/0xb70
      [  677.734712]  do_group_exit+0x43/0xa0
      [  677.734713]  get_signal+0x140/0x900
      [  677.734715]  do_signal+0x37/0x780
      [  677.734717]  ? enqueue_hrtimer+0x41/0xb0
      [  677.734718]  ? recalibrate_cpu_khz+0x10/0x10
      [  677.734720]  ? ktime_get+0x3e/0xa0
      [  677.734721]  ? lapic_next_deadline+0x26/0x30
      [  677.734723]  ? tick_program_event+0x4d/0x90
      [  677.734724]  ? __hrtimer_get_next_event+0x4d/0x80
      [  677.734726]  __prepare_exit_to_usermode+0x126/0x1c0
      [  677.734741]  prepare_exit_to_usermode+0x9/0x40
      [  677.734742]  idtentry_exit_cond_rcu+0x4c/0x60
      [  677.734743]  sysvec_reschedule_ipi+0x92/0x160
      [  677.734744]  ? asm_sysvec_reschedule_ipi+0xa/0x20
      [  677.734745]  asm_sysvec_reschedule_ipi+0x12/0x20

      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: fix hanging iopoll in case of -EAGAIN · cd664b0e
      Pavel Begunkov authored
      io_do_iopoll() won't do anything with a request unless
      req->iopoll_completed is set. So io_complete_rw_iopoll() has to set
      it, otherwise io_do_iopoll() will poll a file again and again even
      though the request of interest was completed a long time ago.
      
      Also, remove -EAGAIN check from io_issue_sqe() as it races with
      the changed lines. The request will take the long way and be
      resubmitted from io_iopoll*().
      
      Fixes: bbde017a ("io_uring: add memory barrier to synchronize io_kiocb's result and iopoll_completed")
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  10. 23 Jun, 2020 1 commit
    • io_uring: fix io_sq_thread no schedule when busy · b772f07a
      Xuan Zhuo authored
      When the user consumes and generates sqe at a fast rate,
      io_sqring_entries can always get sqe, and ret will not be equal to -EBUSY,
      so that io_sq_thread will never call cond_resched or schedule, and then
      we will get the following system error prompt:
      
      rcu: INFO: rcu_sched self-detected stall on CPU
      or
      watchdog: BUG: soft lockup-CPU#23 stuck for 112s! [io_uring-sq:1863]
      
      This patch checks need_resched() on every cycle to decide whether
      cond_resched() needs to be called.

      Suggested-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
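
      The idea of checking for a pending reschedule on every loop cycle,
      rather than only when the queue runs dry, can be sketched in user
      space. This is a toy model under assumed names: struct toy_sched,
      toy_cond_resched() and toy_sq_thread() are hypothetical, and the
      'resched_every' parameter only simulates a periodic tick.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy scheduler state; both fields are hypothetical stand-ins. */
struct toy_sched {
        bool need_resched;      /* set when another task should run */
        int yields;             /* how often the loop gave up the CPU */
};

void toy_cond_resched(struct toy_sched *s)
{
        s->need_resched = false;
        s->yields++;
}

/*
 * Process 'nr_sqes' submissions that are always available, as in the
 * busy case from the commit message. 'resched_every' simulates a tick
 * requesting a reschedule every N iterations. The check runs on every
 * cycle, regardless of whether the queue ever runs dry.
 */
int toy_sq_thread(struct toy_sched *s, int nr_sqes, int resched_every)
{
        int handled = 0;

        assert(resched_every > 0);
        while (handled < nr_sqes) {
                if (s->need_resched)
                        toy_cond_resched(s);
                handled++;      /* "submit" one sqe */
                if (handled % resched_every == 0)
                        s->need_resched = true;
        }
        return handled;
}
```

      Because the check does not depend on the queue being empty, the loop
      yields even when new entries keep arriving.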
  11. 21 Jun, 2020 10 commits
    • Linux 5.8-rc2 · 48778464
      Linus Torvalds authored
    • Merge tag 'selinux-pr-20200621' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux · 817d914d
      Linus Torvalds authored
      Pull SELinux fixes from Paul Moore:
       "Three small patches to fix problems in the SELinux code, all found via
        clang.
      
        Two patches fix potential double-free conditions and one fixes an
        undefined return value"
      
      * tag 'selinux-pr-20200621' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
        selinux: fix undefined return of cond_evaluate_expr
        selinux: fix a double free in cond_read_node()/cond_read_list()
        selinux: fix double free
    • Merge tag 'pinctrl-v5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl · 16f4aa9b
      Linus Torvalds authored
      Pull pin control fixes from Linus Walleij:
       "Some early fixes collected during the first week after the merge
        window, all pretty self-evident, with the details below. The revert is
        the crucial thing.
      
         - Fix a warning on the Qualcomm SPMI GPIO chip being instantiated
           twice without a unique irqchip struct
      
         - Use the noirq variants of the suspend and resume callbacks in the
           Tegra driver
      
         - Clean up the errorpath on the MCP23s08 driver
      
         - Revert the use of devm_of_iomap() in the Freescale driver as it was
           regressing the platform
      
         - Add some missing pins in the Qualcomm IPQ6018 driver
      
         - Fix a simple documentation bug in the pinctrl-single driver"
      
      * tag 'pinctrl-v5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
        pinctrl: single: fix function name in documentation
        pinctrl: qcom: ipq6018 Add missing pins in qpic pin group
        Revert "pinctrl: freescale: imx: Use 'devm_of_iomap()' to avoid a resource leak in case of error in 'imx_pinctrl_probe()'"
        pinctrl: mcp23s08: Split to three parts: fix ptr_ret.cocci warnings
        pinctrl: tegra: Use noirq suspend/resume callbacks
        pinctrl: qcom: spmi-gpio: fix warning about irq chip reusage
    • Merge tag 'kbuild-fixes-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild · be9160a9
      Linus Torvalds authored
      Pull Kbuild fixes from Masahiro Yamada:
      
       - fix -gz=zlib compiler option test for CONFIG_DEBUG_INFO_COMPRESSED
      
       - improve cc-option in scripts/Kbuild.include to clean up temp files
      
       - improve cc-option in scripts/Kconfig.include for more reliable
         compile option test
      
       - do not copy modules.builtin by 'make install' because it would break
         existing systems
      
       - use 'userprogs' syntax for watch_queue sample
      
      * tag 'kbuild-fixes-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
        samples: watch_queue: build sample program for target architecture
        Revert "Makefile: install modules.builtin even if CONFIG_MODULES=n"
        scripts: Fix typo in headers_install.sh
        kconfig: unify cc-option and as-option
        kbuild: improve cc-option to clean up all temporary files
        Makefile: Improve compressed debug info support detection
    • Merge tag 'powerpc-5.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 75613939
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
      
       - One fix for the interrupt rework we did last release which broke
         KVM-PR
      
       - Three commits fixing some fallout from the READ_ONCE() changes
         interacting badly with our 8xx 16K pages support, which uses a pte_t
         that is a structure of 4 actual PTEs
      
       - A cleanup of the 8xx pte_update() to use the newly added pmd_off()
      
       - A fix for a crash when handling an oops if CONFIG_DEBUG_VIRTUAL is
         enabled
      
       - A minor fix for the SPU syscall generation
      
      Thanks to Aneesh Kumar K.V, Christian Zigotzky, Christophe Leroy, Mike
      Rapoport, Nicholas Piggin.
      
      * tag 'powerpc-5.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/8xx: Provide ptep_get() with 16k pages
        mm: Allow arches to provide ptep_get()
        mm/gup: Use huge_ptep_get() in gup_hugepte()
        powerpc/syscalls: Use the number when building SPU syscall table
        powerpc/8xx: use pmd_off() to access a PMD entry in pte_update()
        powerpc/64s: Fix KVM interrupt using wrong save area
        powerpc: Fix kernel crash in show_instructions() w/DEBUG_VIRTUAL
    • Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · 93bbca27
      Linus Torvalds authored
      Pull crypto fixes from Herbert Xu:
      
       - NULL dereference in octeontx
      
       - PM reference imbalance in ks-sa
      
       - deadlock in crypto manager
      
       - memory leak in drbg
      
       - missing socket limit check on receive SG list size in algif_skcipher
      
       - typos in caam
      
       - warnings in ccp and hisilicon
      
      * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
        crypto: drbg - always try to free Jitter RNG instance
        crypto: marvell/octeontx - Fix a potential NULL dereference
        crypto: algboss - don't wait during notifier callback
        crypto: caam - fix typos
        crypto: ccp - Fix sparse warnings in sev-dev
        crypto: hisilicon - Cap block size at 2^31
        crypto: algif_skcipher - Cap recv SG list at ctx->used
        hwrng: ks-sa - Fix runtime PM imbalance on error
    • samples: watch_queue: build sample program for target architecture · 214377e9
      Masahiro Yamada authored
      This userspace program includes UAPI headers exported to usr/include/.
      'make headers' always works for the target architecture (i.e. the same
      architecture as the kernel), so the sample program should be built for
      the target as well. Kbuild now supports 'userprogs' for that.
      
      I also guarded the CONFIG option by 'depends on CC_CAN_LINK' because
      $(CC) may not provide libc.

      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
    • Revert "Makefile: install modules.builtin even if CONFIG_MODULES=n" · 2c6d9636
      Masahiro Yamada authored
      This reverts commit e0b250b5,
      which broke build systems that need to install files to a certain
      path, but do not set INSTALL_MOD_PATH when invoking 'make install'.
      
        $ make INSTALL_PATH=/tmp/destdir install
        mkdir: cannot create directory ‘/lib/modules/5.8.0-rc1+/’: Permission denied
        Makefile:1342: recipe for target '_builtin_inst_' failed
        make: *** [_builtin_inst_] Error 1
      
      While modules.builtin is useful also for CONFIG_MODULES=n, this change
      in the behavior is quite unexpected. Maybe "make modules_install"
      can install modules.builtin irrespective of CONFIG_MODULES as Jonas
      originally suggested.
      
      Anyway, that commit should be reverted ASAP.

      Reported-by: Douglas Anderson <dianders@chromium.org>
      Reported-by: Guenter Roeck <linux@roeck-us.net>
      Cc: Jonas Karlman <jonas@kwiboo.se>
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
      Reviewed-by: Guenter Roeck <linux@roeck-us.net>
      Tested-by: Guenter Roeck <linux@roeck-us.net>
    • Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 64677779
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "One minor fix and two patches reworking the ata dma drain for the
        !CONFIG_LIBATA case. The latter is a 5.7 regression fix"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: Wire up ata_scsi_dma_need_drain for SAS HBA drivers
        scsi: libata: Provide an ata_scsi_dma_need_drain stub for !CONFIG_ATA
        scsi: ufs-bsg: Fix runtime PM imbalance on error
    • Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · a5c6a1f0
      Linus Torvalds authored
      Pull i2c fixes from Wolfram Sang:
      
       - a small collection of remaining API conversion patches (all acked)
         which allow to finally remove the deprecated API
      
       - some documentation fixes and a MAINTAINERS addition
      
      * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        MAINTAINERS: Add robert and myself as qcom i2c cci maintainers
        i2c: smbus: Fix spelling mistake in the comments
        Documentation/i2c: SMBus start signal is S not A
        i2c: remove deprecated i2c_new_device API
        Documentation: media: convert to use i2c_new_client_device()
        video: backlight: tosa_lcd: convert to use i2c_new_client_device()
        x86/platform/intel-mid: convert to use i2c_new_client_device()
        drm: encoder_slave: use new I2C API
        drm: encoder_slave: fix refcouting error for modules
  12. 20 Jun, 2020 13 commits
    • pinctrl: single: fix function name in documentation · 25fae752
      Drew Fustini authored
      Use the correct function name in the documentation for
      "pcs_parse_one_pinctrl_entry()".
      
      "smux_parse_one_pinctrl_entry()" appears to be an artifact from the
      development of a prior patch series ("simple pinmux driver") which
      transformed into pinctrl-single.

      Signed-off-by: Drew Fustini <drew@beagleboard.org>
      Link: https://lore.kernel.org/r/20200612112758.GA3407886@x1
      Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
    • Merge tag 'trace-v5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace · 8b6ddd10
      Linus Torvalds authored
      Pull tracing fixes from Steven Rostedt:
      
       - Have recordmcount work with > 64K sections (to support LTO)
      
       - kprobe RCU fixes
      
       - Correct a kprobe critical section with missing mutex
      
       - Remove redundant arch_disarm_kprobe() call
      
       - Fix lockup when kretprobe triggers within kprobe_flush_task()
      
       - Fix memory leak in fetch_op_data operations
      
       - Fix sleep in atomic in ftrace trace array sample code
      
       - Free up memory on failure in sample trace array code
      
       - Fix incorrect reporting of function_graph fields in format file
      
       - Fix quote within quote parsing in bootconfig
      
       - Fix return value of bootconfig tool
      
       - Add testcases for bootconfig tool
      
       - Fix maybe uninitialized warning in ftrace pid file code
      
       - Remove unused variable in tracing_iter_reset()
      
       - Fix some typos
      
      * tag 'trace-v5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        ftrace: Fix maybe-uninitialized compiler warning
        tools/bootconfig: Add testcase for show-command and quotes test
        tools/bootconfig: Fix to return 0 if succeeded to show the bootconfig
        tools/bootconfig: Fix to use correct quotes for value
        proc/bootconfig: Fix to use correct quotes for value
        tracing: Remove unused event variable in tracing_iter_reset
        tracing/probe: Fix memleak in fetch_op_data operations
        trace: Fix typo in allocate_ftrace_ops()'s comment
        tracing: Make ftrace packed events have align of 1
        sample-trace-array: Remove trace_array 'sample-instance'
        sample-trace-array: Fix sleeping function called from invalid context
        kretprobe: Prevent triggering kretprobe from within kprobe_flush_task
        kprobes: Remove redundant arch_disarm_kprobe() call
        kprobes: Fix to protect kick_kprobe_optimizer() by kprobe_mutex
        kprobes: Use non RCU traversal APIs on kprobe_tables if possible
        kprobes: Suppress the suspicious RCU warning on kprobes
        recordmcount: support >64k sections
    • Merge tag 'libnvdimm-for-5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm · eede2b9b
      Linus Torvalds authored
      Pull libnvdimm updates from Dan Williams:
       "A feature (papr_scm health retrieval) and a fix (sysfs attribute
        visibility) for v5.8.
      
        Vaibhav explains in the merge commit below why missing v5.8 would be
        painful and I agreed to try a -rc2 pull because only cosmetics kept
        this out of -rc1 and his initial versions were posted in more than
        enough time for v5.8 consideration:
      
         'These patches are tied to specific features that were committed to
          customers in upcoming distros releases (RHEL and SLES) whose
          time-lines are tied to 5.8 kernel release.
      
          Being able to track the health of an nvdimm is critical for our
          customers that are running workloads leveraging papr-scm nvdimms.
          Missing the 5.8 kernel would mean missing the distro timelines and
          shifting forward the availability of this feature in distro kernels
          by at least 6 months'
      
        Summary:
      
         - Fix the visibility of the region 'align' attribute.
      
           The new unit tests for region alignment handling caught a corner
           case where the alignment cannot be specified if the region is
           converted from static to dynamic provisioning at runtime.
      
         - Add support for device health retrieval for the persistent memory
           supported by the papr_scm driver.
      
           This includes both the standard sysfs "health flags" that the nfit
           persistent memory driver publishes and a mechanism for the ndctl
           tool to retrieve a health-command payload"
      
      * tag 'libnvdimm-for-5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
        nvdimm/region: always show the 'align' attribute
        powerpc/papr_scm: Implement support for PAPR_PDSM_HEALTH
        ndctl/papr_scm,uapi: Add support for PAPR nvdimm specific methods
        powerpc/papr_scm: Improve error logging and handling papr_scm_ndctl()
        powerpc/papr_scm: Fetch nvdimm health information from PHYP
        seq_buf: Export seq_buf_printf
        powerpc: Document details on H_SCM_HEALTH hcall
    • pinctrl: qcom: ipq6018 Add missing pins in qpic pin group · 7f5f4de8
      Sivaprakash Murugesan authored
      The patch adds missing qpic data pins to qpic pingroup. These pins are
      necessary for the qpic nand to work.
      
      Fixes: ef1ea54e ("pinctrl: qcom: Add ipq6018 pinctrl driver")
      Signed-off-by: Sivaprakash Murugesan <sivaprak@codeaurora.org>
      Link: https://lore.kernel.org/r/1592541089-17700-1-git-send-email-sivaprak@codeaurora.org
      Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
      7f5f4de8
    • Haibo Chen's avatar
      Revert "pinctrl: freescale: imx: Use 'devm_of_iomap()' to avoid a resource... · 13f2d25b
      Haibo Chen authored
      Revert "pinctrl: freescale: imx: Use 'devm_of_iomap()' to avoid a resource leak in case of error in 'imx_pinctrl_probe()'"
      
      This reverts commit ba403242.
      
      After commit 26d8cde5 ("pinctrl: freescale: imx: add shared
      input select reg support"), i.MX7D has two iomux controllers,
      iomuxc and iomuxc-lpsr, which share the select_input register for
      daisy chain settings.
      With 'devm_of_iomap()', probing iomuxc-lpsr calls
      devm_request_mem_region() for the region <0x30330000-0x3033ffff>
      for the first time. When iomuxc is probed next,
      devm_platform_ioremap_resource() also calls
      devm_request_mem_region() for the same shared region
      <0x30330000-0x3033ffff>, which fails, as the log below shows:
      
      [    0.179561] imx7d-pinctrl 302c0000.iomuxc-lpsr: initialized IMX pinctrl driver
      [    0.191742] imx7d-pinctrl 30330000.pinctrl: can't request region for resource [mem 0x30330000-0x3033ffff]
      [    0.191842] imx7d-pinctrl: probe of 30330000.pinctrl failed with error -16
      
      Fixes: ba403242 ("pinctrl: freescale: imx: Use 'devm_of_iomap()' to avoid a resource leak in case of error in 'imx_pinctrl_probe()'")
      Signed-off-by: Haibo Chen <haibo.chen@nxp.com>
      Reviewed-by: Dong Aisheng <aisheng.dong@nxp.com>
      Link: https://lore.kernel.org/r/1591673223-1680-1-git-send-email-haibo.chen@nxp.com
      Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
      13f2d25b
    • Linus Torvalds's avatar
      Merge tag 's390-5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 1566feea
      Linus Torvalds authored
      Pull s390 fixes from Vasily Gorbik:
      
       - a few ptrace fixes mostly for strace and seccomp_bpf kernel tests
         findings
      
       - cleanup unused pm callbacks in virtio ccw
      
       - replace kmalloc + memset with kzalloc in crypto
      
       - use $(LD) for vDSO linkage to make clang happy
      
       - fix vDSO clock_getres() to preserve the same behaviour as
         posix_get_hrtimer_res()
      
       - fix workqueue cpumask warning when NUMA=n and nr_node_ids=2
      
       - reduce SLSB writes during input processing, improve warnings and
         cleanup qdio_data usage in qdio
      
       - a few fixes to use scnprintf() instead of snprintf()
      
      * tag 's390-5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        s390: fix syscall_get_error for compat processes
        s390/qdio: warn about unexpected SLSB states
        s390/qdio: clean up usage of qdio_data
        s390/numa: let NODES_SHIFT depend on NEED_MULTIPLE_NODES
        s390/vdso: fix vDSO clock_getres()
        s390/vdso: Use $(LD) instead of $(CC) to link vDSO
        s390/protvirt: use scnprintf() instead of snprintf()
        s390: use scnprintf() in sys_##_prefix##_##_name##_show
        s390/crypto: use scnprintf() instead of snprintf()
        s390/zcrypt: use kzalloc
        s390/virtio: remove unused pm callbacks
        s390/qdio: reduce SLSB writes during Input Queue processing
        selftests/seccomp: s390 shares the syscall and return value register
        s390/ptrace: fix setting syscall number
        s390/ptrace: pass invalid syscall numbers to tracing
        s390/ptrace: return -ENOSYS when invalid syscall is supplied
        s390/seccomp: pass syscall arguments via seccomp_data
        s390/qdio: fine-tune SLSB update
      1566feea
    • Linus Torvalds's avatar
      Merge tag 'riscv-for-linus-5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux · 7fdfbe08
      Linus Torvalds authored
      Pull RISC-V fixes from Palmer Dabbelt:
      
       - a workaround for a compiler surprise related to the "r" inline
         assembly that allows LLVM to boot.
      
       - a fix to avoid WX-only mappings, which the ISA does not allow. While
         this probably manifests in many ways, the bug was found in stress-ng.
      
       - a missing lock in set_direct_map_*(), which due to a recent lockdep
         change started asserting.
      
      * tag 'riscv-for-linus-5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
        RISC-V: Acquire mmap lock before invoking walk_page_range
        RISC-V: Don't allow write+exec only page mapping request in mmap
        riscv/atomic: Fix sign extension for RV64I
      7fdfbe08
    • Linus Torvalds's avatar
      Merge tag 'linux-kselftest-5.8-rc2' of... · 27c27605
      Linus Torvalds authored
      Merge tag 'linux-kselftest-5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
      
      Pull kselftest cleanups from Shuah Khan:
      
       - ftrace "requires:" list for simplifying and unifying requirement
         checks for each test case, adding "requires:" line instead of
         checking required ftrace interfaces in each test case.
      
       - a minor spelling correction patch
      
      * tag 'linux-kselftest-5.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
        selftests/ftrace: Support ":README" suffix for requires
        selftests/ftrace: Support ":tracer" suffix for requires
        selftests/ftrace: Convert check_filter_file() with requires list
        selftests/ftrace: Convert required interface checks into requires list
        selftests/ftrace: Add "requires:" list support
        selftests/ftrace: Return unsupported for the unconfigured features
        selftests/ftrace: Allow ":" in description
        tools: testing: ftrace: trigger: fix spelling mistake
      27c27605
    • David Howells's avatar
      afs: Fix hang on rmmod due to outstanding timer · 5481fc6e
      David Howells authored
      The fileserver probe timer, net->fs_probe_timer, isn't cancelled when
      the kafs module is being removed and so the count it holds on
      net->servers_outstanding doesn't get dropped.
      
      This causes rmmod to wait forever.  The hung process shows a stack like:
      
      	afs_purge_servers+0x1b5/0x23c [kafs]
      	afs_net_exit+0x44/0x6e [kafs]
      	ops_exit_list+0x72/0x93
      	unregister_pernet_operations+0x14c/0x1ba
      	unregister_pernet_subsys+0x1d/0x2a
      	afs_exit+0x29/0x6f [kafs]
      	__do_sys_delete_module.isra.0+0x1a2/0x24b
      	do_syscall_64+0x51/0x95
      	entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fix this by:
      
       (1) Attempting to cancel the probe timer and, if successful, dropping
           the count that the timer was holding.
      
       (2) Making the timer function just drop the count and not schedule the
           prober if the afs portion of net namespace is being destroyed.
      
      Also, whilst we're at it, make the following changes:
      
       (3) Initialise net->servers_outstanding to 1 and decrement it before
           waiting on it so that it doesn't generate wake up events by being
           decremented to 0 until we're cleaning up.
      
       (4) Switch the atomic_dec() on ->servers_outstanding for ->fs_timer in
           afs_purge_servers() to use the helper function for that.
      
      Fixes: f6cbb368 ("afs: Actively poll fileservers to maintain NAT or firewall openings")
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      5481fc6e
    • David Howells's avatar
      afs: Fix afs_do_lookup() to call correct fetch-status op variant · f8ea5c7b
      David Howells authored
      Fix afs_do_lookup()'s fallback case for when FS.InlineBulkStatus isn't
      supported by the server.
      
      In the fallback, it calls FS.FetchStatus for the specific vnode it's
      meant to be looking up.  Commit b6489a49 broke this by renaming one
      of the two identically-named afs_fetch_status_operation descriptors to
      something else so that one of them could be made non-static.  The site
      that used the renamed one, however, wasn't renamed and didn't produce
      any warning because the other was declared in a header.
      
      Fix this by making afs_do_lookup() use the renamed variant.
      
      Note that there are two variants of the success method because one is
      called from ->lookup() where we may or may not have an inode, but can't
      call iget until after we've talked to the server - whereas the other is
      called from within iget where we have an inode, but it may or may not be
      initialised.
      
      The latter variant expects there to be an inode, but because it's being
      called from the former case, there might not be - resulting in an oops
      like the following:
      
        BUG: kernel NULL pointer dereference, address: 00000000000000b0
        ...
        RIP: 0010:afs_fetch_status_success+0x27/0x7e
        ...
        Call Trace:
          afs_wait_for_operation+0xda/0x234
          afs_do_lookup+0x2fe/0x3c1
          afs_lookup+0x3c5/0x4bd
          __lookup_slow+0xcd/0x10f
          walk_component+0xa2/0x10c
          path_lookupat.isra.0+0x80/0x110
          filename_lookup+0x81/0x104
          vfs_statx+0x76/0x109
          __do_sys_newlstat+0x39/0x6b
          do_syscall_64+0x4c/0x78
          entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: b6489a49 ("afs: Fix silly rename")
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f8ea5c7b
    • Christophe Leroy's avatar
      powerpc/8xx: Provide ptep_get() with 16k pages · c0e1c8c2
      Christophe Leroy authored
      READ_ONCE() now enforces atomic read, which leads to:
      
        CC      mm/gup.o
      In file included from ./include/linux/kernel.h:11:0,
                       from mm/gup.c:2:
      In function 'gup_hugepte.constprop',
          inlined from 'gup_huge_pd.isra.79' at mm/gup.c:2465:8:
      ./include/linux/compiler.h:392:38: error: call to '__compiletime_assert_222' declared with attribute error: Unsupported access size for {READ,WRITE}_ONCE().
        _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
                                            ^
      ./include/linux/compiler.h:373:4: note: in definition of macro '__compiletime_assert'
          prefix ## suffix();    \
          ^
      ./include/linux/compiler.h:392:2: note: in expansion of macro '_compiletime_assert'
        _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
        ^
      ./include/linux/compiler.h:405:2: note: in expansion of macro 'compiletime_assert'
        compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \
        ^
      ./include/linux/compiler.h:291:2: note: in expansion of macro 'compiletime_assert_rwonce_type'
        compiletime_assert_rwonce_type(x);    \
        ^
      mm/gup.c:2428:8: note: in expansion of macro 'READ_ONCE'
        pte = READ_ONCE(*ptep);
              ^
      In function 'gup_get_pte',
          inlined from 'gup_pte_range' at mm/gup.c:2228:9,
          inlined from 'gup_pmd_range' at mm/gup.c:2613:15,
          inlined from 'gup_pud_range' at mm/gup.c:2641:15,
          inlined from 'gup_p4d_range' at mm/gup.c:2666:15,
          inlined from 'gup_pgd_range' at mm/gup.c:2694:15,
          inlined from 'internal_get_user_pages_fast' at mm/gup.c:2795:3:
      ./include/linux/compiler.h:392:38: error: call to '__compiletime_assert_219' declared with attribute error: Unsupported access size for {READ,WRITE}_ONCE().
        _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
                                            ^
      ./include/linux/compiler.h:373:4: note: in definition of macro '__compiletime_assert'
          prefix ## suffix();    \
          ^
      ./include/linux/compiler.h:392:2: note: in expansion of macro '_compiletime_assert'
        _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
        ^
      ./include/linux/compiler.h:405:2: note: in expansion of macro 'compiletime_assert'
        compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \
        ^
      ./include/linux/compiler.h:291:2: note: in expansion of macro 'compiletime_assert_rwonce_type'
        compiletime_assert_rwonce_type(x);    \
        ^
      mm/gup.c:2199:9: note: in expansion of macro 'READ_ONCE'
        return READ_ONCE(*ptep);
               ^
      make[2]: *** [mm/gup.o] Error 1
      
      Define ptep_get() on 8xx when using 16k pages.
      
      Fixes: 9e343b46 ("READ_ONCE: Enforce atomicity for {READ,WRITE}_ONCE() memory accesses")
      Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
      Acked-by: Will Deacon <will@kernel.org>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/341688399c1b102756046d19ea6ce39db1ae4742.1592225558.git.christophe.leroy@csgroup.eu
      c0e1c8c2
    • Christophe Leroy's avatar
      mm: Allow arches to provide ptep_get() · 481e980a
      Christophe Leroy authored
      Since commit 9e343b46 ("READ_ONCE: Enforce atomicity for
      {READ,WRITE}_ONCE() memory accesses") it is not possible anymore to
      use READ_ONCE() to access complex page table entries like the one
      defined for powerpc 8xx with 16k size pages.
      
      Define a ptep_get() helper that architectures can override instead
      of performing a READ_ONCE() on the page table entry pointer.
      
      Fixes: 9e343b46 ("READ_ONCE: Enforce atomicity for {READ,WRITE}_ONCE() memory accesses")
      Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
      Acked-by: Will Deacon <will@kernel.org>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/087fa12b6e920e32315136b998aa834f99242695.1592225558.git.christophe.leroy@csgroup.eu
      481e980a
    • Christophe Leroy's avatar
      mm/gup: Use huge_ptep_get() in gup_hugepte() · 55ca2263
      Christophe Leroy authored
      gup_hugepte() reads hugepage table entries. It can't read them
      directly; huge_ptep_get() must be used instead.
      
      Fixes: 9e343b46 ("READ_ONCE: Enforce atomicity for {READ,WRITE}_ONCE() memory accesses")
      Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
      Acked-by: Will Deacon <will@kernel.org>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/ffc3714334c3bfaca6f13788ad039e8759ae413f.1592225558.git.christophe.leroy@csgroup.eu
      55ca2263
  13. 19 Jun, 2020 1 commit
    • Dan Williams's avatar
      Merge branch 'for-5.8/papr_scm' into libnvdimm-for-next · 9df24eae
      Dan Williams authored
      Include the papr_scm health retrieval feature for v5.8-rc2. The
      functionality was initially posted well in advance of the merge window,
      but review comments and a late build-bot warning kept them out of the
      v5.8-rc1 libnvdimm pull request.
      
      Vaibhav notes:
      These patches are tied to specific features that were committed to
      customers in upcoming distros releases (RHEL and SLES) whose time-lines
      are tied to 5.8 kernel release.
      
      Being able to track the health of an nvdimm is critical for our
      customers that are running workloads leveraging papr-scm nvdimms.
      Missing the 5.8 kernel would mean missing the distro timelines and
      shifting forward the availability of this feature in distro kernels by
      at least 6 months.
      9df24eae