Commits · 95a1b3ff9a3e4ea2f26c4e802067d58831f415db · nexedi / linux

29 Oct, 2019 13 commits

io_uring: Fix mm_fault with READ/WRITE_FIXED · 95a1b3ff

Pavel Begunkov authored Oct 27, 2019

Commit fb5ccc98 ("io_uring: Fix broken links with offloading")
introduced a potential performance regression with unconditionally
taking mm even for READ/WRITE_FIXED operations.

Return the logic handling it back. mm-faulted requests will go through
the generic submission path, so honoring links and drains, but will
fail further on req->has_user check.

Fixes: fb5ccc98 ("io_uring: Fix broken links with offloading")
Cc: stable@vger.kernel.org # v5.4
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

95a1b3ff

io_uring: remove index from sqe_submit · fa456228

Pavel Begunkov authored Oct 27, 2019

submit->index is used only for inbound check in submission path (i.e.
head < ctx->sq_entries). However, it always will be true, as
1. it's already validated by io_get_sqring()
2. ctx->sq_entries can't be changedd in between, because of held
ctx->uring_lock and ctx->refs.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

fa456228

io_uring: add set of tracing events · c826bd7a

Dmitrii Dolgov authored Oct 15, 2019

To trace io_uring activity one can get an information from workqueue and
io trace events, but looks like some parts could be hard to identify via
this approach. Making what happens inside io_uring more transparent is
important to be able to reason about many aspects of it, hence introduce
the set of tracing events.

All such events could be roughly divided into two categories:

* those, that are helping to understand correctness (from both kernel
  and an application point of view). E.g. a ring creation, file
  registration, or waiting for available CQE. Proposed approach is to
  get a pointer to an original structure of interest (ring context, or
  request), and then find relevant events. io_uring_queue_async_work
  also exposes a pointer to work_struct, to be able to track down
  corresponding workqueue events.

* those, that provide performance related information. Mostly it's about
  events that change the flow of requests, e.g. whether an async work
  was queued, or delayed due to some dependencies. Another important
  case is how io_uring optimizations (e.g. registered files) are
  utilized.
Signed-off-by: Dmitrii Dolgov <9erthalion6@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

c826bd7a

io_uring: add support for canceling timeout requests · 11365043

Jens Axboe authored Oct 16, 2019

We might have cases where the need for a specific timeout is gone, add
support for canceling an existing timeout operation. This works like the
POLL_REMOVE command, where the application passes in the user_data of
the timeout it wishes to cancel in the sqe->addr field.
Signed-off-by: Jens Axboe <axboe@kernel.dk>

11365043

io_uring: add support for absolute timeouts · a41525ab

Jens Axboe authored Oct 15, 2019

This is a pretty trivial addition on top of the relative timeouts
we have now, but it's handy for ensuring tighter timing for those
that are building scheduling primitives on top of io_uring.
Signed-off-by: Jens Axboe <axboe@kernel.dk>

a41525ab

io_uring: replace s->needs_lock with s->in_async · ba5290cc

Jackie Liu authored Oct 09, 2019

There is no function change, just to clean up the code, use s->in_async
to make the code know where it is.
Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

ba5290cc

io_uring: allow application controlled CQ ring size · 33a107f0

Jens Axboe authored Oct 04, 2019

We currently size the CQ ring as twice the SQ ring, to allow some
flexibility in not overflowing the CQ ring. This is done because the
SQE life time is different than that of the IO request itself, the SQE
is consumed as soon as the kernel has seen the entry.

Certain application don't need a huge SQ ring size, since they just
submit IO in batches. But they may have a lot of requests pending, and
hence need a big CQ ring to hold them all. By allowing the application
to control the CQ ring size multiplier, we can cater to those
applications more efficiently.

If an application wants to define its own CQ ring size, it must set
IORING_SETUP_CQSIZE in the setup flags, and fill out
io_uring_params->cq_entries. The value must be a power of two.
Signed-off-by: Jens Axboe <axboe@kernel.dk>

33a107f0

io_uring: add support for IORING_REGISTER_FILES_UPDATE · c3a31e60

Jens Axboe authored Oct 03, 2019

Allows the application to remove/replace/add files to/from a file set.
Passes in a struct:

struct io_uring_files_update {
	__u32 offset;
	__s32 *fds;
};

that holds an array of fds, size of array passed in through the usual
nr_args part of the io_uring_register() system call. The logic is as
follows:

1) If ->fds[i] is -1, the existing file at i + ->offset is removed from
   the set.
2) If ->fds[i] is a valid fd, the existing file at i + ->offset is
   replaced with ->fds[i].

For case #2, is the existing file is currently empty (fd == -1), the
new fd is simply added to the array.
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

c3a31e60

io_uring: allow sparse fixed file sets · 08a45173

Jens Axboe authored Oct 03, 2019

This is in preparation for allowing updates to fixed file sets without
requiring a full unregister+register.
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

08a45173

io_uring: run dependent links inline if possible · ba816ad6

Jens Axboe authored Sep 28, 2019

Currently any dependent link is executed from a new workqueue context,
which means that we'll be doing a context switch per link in the chain.
If we are running the completion of the current request from our async
workqueue and find that the next request is a link, then run it directly
from the workqueue context instead of forcing another switch.

This improves the performance of linked SQEs, and reduces the CPU
overhead.
Reviewed-by: Jackie Liu <liuyun01@kylinos.cn>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

ba816ad6

um-ubd: Entrust re-queue to the upper layers · d848074b

Anton Ivanov authored Oct 29, 2019

Fixes crashes due to ubd requeue logic conflicting with the block-mq
logic. Crash is reproducible in 5.0 - 5.3.

Fixes: 53766def ("um: Clean-up command processing in UML UBD driver")
Cc: stable@vger.kernel.org # v5.0+
Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

d848074b

nvme-multipath: remove unused groups_only mode in ana log · 86cccfbf

Anton Eidelman authored Oct 18, 2019

groups_only mode in nvme_read_ana_log() is no longer used: remove it.
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Anton Eidelman <anton@lightbitslabs.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

86cccfbf

nvme-multipath: fix possible io hang after ctrl reconnect · af8fd042

Anton Eidelman authored Oct 18, 2019

The following scenario results in an IO hang:
1) ctrl completes a request with NVME_SC_ANA_TRANSITION.
   NVME_NS_ANA_PENDING bit in ns->flags is set and ana_work is triggered.
2) ana_work: nvme_read_ana_log() tries to get the ANA log page from the ctrl.
   This fails because ctrl disconnects.
   Therefore nvme_update_ns_ana_state() is not called
   and NVME_NS_ANA_PENDING bit in ns->flags is not cleared.
3) ctrl reconnects: nvme_mpath_init(ctrl,...) calls
   nvme_read_ana_log(ctrl, groups_only=true).
   However, nvme_update_ana_state() does not update namespaces
   because nr_nsids = 0 (due to groups_only mode).
4) scan_work calls nvme_validate_ns() finds the ns and re-validates OK.

Result:
The ctrl is now live but NVME_NS_ANA_PENDING bit in ns->flags is still set.
Consequently ctrl will never be considered a viable path by __nvme_find_path().
IO will hang if ctrl is the only or the last path to the namespace.

More generally, while ctrl is reconnecting, its ANA state may change.
And because nvme_mpath_init() requests ANA log in groups_only mode,
these changes are not propagated to the existing ctrl namespaces.
This may result in a mal-function or an IO hang.

Solution:
nvme_mpath_init() will nvme_read_ana_log() with groups_only set to false.
This will not harm the new ctrl case (no namespaces present),
and will make sure the ANA state of namespaces gets updated after reconnect.

Note: Another option would be for nvme_mpath_init() to invoke
nvme_parse_ana_log(..., nvme_set_ns_ana_state) for each existing namespace.
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Anton Eidelman <anton@lightbitslabs.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

af8fd042

28 Oct, 2019 2 commits

io_uring: don't touch ctx in setup after ring fd install · 044c1ab3

Jens Axboe authored Oct 28, 2019

syzkaller reported an issue where it looks like a malicious app can
trigger a use-after-free of reading the ctx ->sq_array and ->rings
value right after having installed the ring fd in the process file
table.

Defer ring fd installation until after we're done reading those
values.

Fixes: 75b28aff ("io_uring: allocate the two rings together")
Reported-by: syzbot+6f03d895a6cd0d06187f@syzkaller.appspotmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

044c1ab3

io_uring: Fix leaked shadow_req · 7b20238d

Pavel Begunkov authored Oct 27, 2019

io_queue_link_head() owns shadow_req after taking it as an argument.
By not freeing it in case of an error, it can leak the request along
with taken ctx->refs.
Reviewed-by: Jackie Liu <liuyun01@kylinos.cn>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

7b20238d

27 Oct, 2019 7 commits

Linux 5.4-rc5 · d6d5df1d
Linus Torvalds authored Oct 27, 2019

d6d5df1d

Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 153a971f

Linus Torvalds authored Oct 27, 2019

Pull x86 fixes from Thomas Gleixner:
 "Two fixes for the VMWare guest support:

   - Unbreak VMWare platform detection which got wreckaged by converting
     an integer constant to a string constant.

   - Fix the clang build of the VMWAre hypercall by explicitely
     specifying the ouput register for INL instead of using the short
     form"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/cpu/vmware: Fix platform detection VMWARE_PORT macro
  x86/cpu/vmware: Use the full form of INL in VMWARE_HYPERCALL, for clang/llvm

153a971f

Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 2b776b54

Linus Torvalds authored Oct 27, 2019

Pull timer fixes from Thomas Gleixner:
 "A small set of fixes for time(keeping):

   - Add a missing include to prevent compiler warnings.

   - Make the VDSO implementation of clock_getres() POSIX compliant
     again. A recent change dropped the NULL pointer guard which is
     required as NULL is a valid pointer value for this function.

   - Fix two function documentation typos"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  posix-cpu-timers: Fix two trivial comments
  timers/sched_clock: Include local timekeeping.h for missing declarations
  lib/vdso: Make clock_getres() POSIX compliant again

2b776b54

Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · a8a31fdc

Linus Torvalds authored Oct 27, 2019

Pull perf fixes from Thomas Gleixner:
 "A set of perf fixes:

  kernel:

   - Unbreak the tracking of auxiliary buffer allocations which got
     imbalanced causing recource limit failures.

   - Fix the fallout of splitting of ToPA entries which missed to shift
     the base entry PA correctly.

   - Use the correct context to lookup the AUX event when unmapping the
     associated AUX buffer so the event can be stopped and the buffer
     reference dropped.

  tools:

   - Fix buildiid-cache mode setting in copyfile_mode_ns() when copying
     /proc/kcore

   - Fix freeing id arrays in the event list so the correct event is
     closed.

   - Sync sched.h anc kvm.h headers with the kernel sources.

   - Link jvmti against tools/lib/ctype.o to have weak strlcpy().

   - Fix multiple memory and file descriptor leaks, found by coverity in
     perf annotate.

   - Fix leaks in error handling paths in 'perf c2c', 'perf kmem', found
     by a static analysis tool"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/aux: Fix AUX output stopping
  perf/aux: Fix tracking of auxiliary trace buffer allocation
  perf/x86/intel/pt: Fix base for single entry topa
  perf kmem: Fix memory leak in compact_gfp_flags()
  tools headers UAPI: Sync sched.h with the kernel
  tools headers kvm: Sync kvm.h headers with the kernel sources
  tools headers kvm: Sync kvm headers with the kernel sources
  tools headers kvm: Sync kvm headers with the kernel sources
  perf c2c: Fix memory leak in build_cl_output()
  perf tools: Fix mode setting in copyfile_mode_ns()
  perf annotate: Fix multiple memory and file descriptor leaks
  perf tools: Fix resource leak of closedir() on the error paths
  perf evlist: Fix fix for freed id arrays
  perf jvmti: Link against tools/lib/ctype.h to have weak strlcpy()

a8a31fdc

Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 1e1ac1cb

Linus Torvalds authored Oct 27, 2019

Pull irq fixes from Thomas Gleixner:
 "Two fixes for interrupt controller drivers:

   - Skip IRQ_M_EXT entries in the device tree when initializing the
     RISCV PLIC controller to avoid a double init attempt.

   - Use the correct ITS list when issuing the VMOVP synchronization
     command so the operation works only on the ITS instances which are
     associated to a VM"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/sifive-plic: Skip contexts except supervisor in plic_init()
  irqchip/gic-v3-its: Use the exact ITSList for VMOVP

1e1ac1cb

Merge tag '5.4-rc5-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6 · c9a2e4a8

Linus Torvalds authored Oct 27, 2019

Pull cifs fixes from Steve French:
 "Seven cifs/smb3 fixes, including three for stable"

* tag '5.4-rc5-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: Fix cifsInodeInfo lock_sem deadlock when reconnect occurs
  CIFS: Fix use after free of file info structures
  CIFS: Fix retry mid list corruption on reconnects
  cifs: Fix missed free operations
  CIFS: avoid using MID 0xFFFF
  cifs: clarify comment about timestamp granularity for old servers
  cifs: Handle -EINPROGRESS only when noblockcnt is set

c9a2e4a8

Merge tag 'riscv/for-v5.4-rc5-b' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux · 6995a6a5

Linus Torvalds authored Oct 27, 2019

Pull RISC-V fixes from Paul Walmsley:
 "Several minor fixes and cleanups for v5.4-rc5:

   - Three build fixes for various SPARSEMEM-related kernel
     configurations

   - Two cleanup patches for the kernel bug and breakpoint trap handler
     code"

* tag 'riscv/for-v5.4-rc5-b' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: cleanup do_trap_break
  riscv: cleanup <asm/bug.h>
  riscv: Fix undefined reference to vmemmap_populate_basepages
  riscv: Fix implicit declaration of 'page_to_section'
  riscv: fix fs/proc/kcore.c compilation with sparsemem enabled

6995a6a5

26 Oct, 2019 13 commits

Merge tag 'mips_fixes_5.4_3' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux · 5a1e843c

Linus Torvalds authored Oct 26, 2019

Pull MIPS fixes from Paul Burton:
 "A few MIPS fixes:

   - Fix VDSO time-related function behavior for systems where we need
     to fall back to syscalls, but were instead returning bogus results.

   - A fix to TLB exception handlers for Cavium Octeon systems where
     they would inadvertently clobber the $1/$at register.

   - A build fix for bcm63xx configurations.

   - Switch to using my @kernel.org email address"

* tag 'mips_fixes_5.4_3' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
  MIPS: tlbex: Fix build_restore_pagemask KScratch restore
  MIPS: bmips: mark exception vectors as char arrays
  mips: vdso: Fix __arch_get_hw_counter()
  MAINTAINERS: Use @kernel.org address for Paul Burton

5a1e843c

Merge tag 'tty-5.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty · 29768954

Linus Torvalds authored Oct 26, 2019

Pull tty/serial driver fix from Greg KH:
 "Here is a single tty/serial driver fix for 5.4-rc5 that resolves a
  reported issue.

  It has been in linux-next for a while with no problems"

* tag 'tty-5.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  8250-men-mcb: fix error checking when get_num_ports returns -ENODEV

29768954

Merge tag 'staging-5.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging · 228bd624

Linus Torvalds authored Oct 26, 2019

Pull staging driver fix from Greg KH:
 "Here is a single staging driver fix, for the wlan-ng driver, that
  resolves a reported issue.

  It is been in linux-next for a while with no reported issues"

* tag 'staging-5.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  staging: wlan-ng: fix exit return when sme->key_idx >= NUM_WEPKEYS

228bd624

Merge tag 'driver-core-5.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core · 13fa692e

Linus Torvalds authored Oct 26, 2019

Pull driver core fix from Greg KH:
 "Here is a single sysfs fix for 5.4-rc5.

  It resolves an error if you actually try to use the __BIN_ATTR_WO()
  macro, seems I never tested it properly before :(

  This has been in linux-next for a while with no reported issues"

* tag 'driver-core-5.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  sysfs: Fixes __BIN_ATTR_WO() macro

13fa692e

Merge tag 'char-misc-5.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc · a03885d5

Linus Torvalds authored Oct 26, 2019

Pull binder fix from Greg KH:
 "This is a single binder fix to resolve a reported issue by Jann. It's
  been in linux-next for a while with no reported issues"

* tag 'char-misc-5.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  binder: Don't modify VMA bounds in ->mmap handler

a03885d5

Merge tag 'usb-5.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · 0ecdd78c

Linus Torvalds authored Oct 26, 2019

Pull USB fixes from Greg KH:
 "Here are a number of small USB driver fixes for 5.4-rc5.

  More "fun" with some of the misc USB drivers as found by syzbot, and
  there are a number of other small bugfixes in here for reported
  issues.

  All have been in linux-next for a while with no reported issues"

* tag 'usb-5.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  usb: cdns3: Error out if USB_DR_MODE_UNKNOWN in cdns3_core_init_role()
  USB: ldusb: fix read info leaks
  USB: serial: ti_usb_3410_5052: clean up serial data access
  USB: serial: ti_usb_3410_5052: fix port-close races
  USB: usblp: fix use-after-free on disconnect
  usb: udc: lpc32xx: fix bad bit shift operation
  usb: cdns3: Fix dequeue implementation.
  USB: legousbtower: fix a signedness bug in tower_probe()
  USB: legousbtower: fix memleak on disconnect
  USB: ldusb: fix memleak on disconnect

0ecdd78c

Merge branch 'i2c/for-current-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · 992cb107

Linus Torvalds authored Oct 26, 2019

Pull i2c fixes from Wolfram Sang:
 "A few driver fixes for the I2C subsystem"

* 'i2c/for-current-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: stm32f7: remove warning when compiling with W=1
  i2c: stm32f7: fix a race in slave mode with arbitration loss irq
  i2c: stm32f7: fix first byte to send in slave mode
  i2c: mt65xx: fix NULL ptr dereference
  i2c: aspeed: fix master pending state handling

992cb107

Merge tag 'for-linus-2019-10-26' of git://git.kernel.dk/linux-block · acf913b7

Linus Torvalds authored Oct 26, 2019

Pull block and io_uring fixes from Jens Axboe:
 "A bit bigger than usual at this point in time, mostly due to some good
  bug hunting work by Pavel that resulted in three io_uring fixes from
  him and two from me. Anyway, this pull request contains:

   - Revert of the submit-and-wait optimization for io_uring, it can't
     always be done safely. It depends on commands always making
     progress on their own, which isn't necessarily the case outside of
     strict file IO. (me)

   - Series of two patches from me and three from Pavel, fixing issues
     with shared data and sequencing for io_uring.

   - Lastly, two timeout sequence fixes for io_uring (zhangyi)

   - Two nbd patches fixing races (Josef)

   - libahci regulator_get_optional() fix (Mark)"

* tag 'for-linus-2019-10-26' of git://git.kernel.dk/linux-block:
  nbd: verify socket is supported during setup
  ata: libahci_platform: Fix regulator_get_optional() misuse
  nbd: handle racing with error'ed out commands
  nbd: protect cmd->status with cmd->lock
  io_uring: fix bad inflight accounting for SETUP_IOPOLL|SETUP_SQTHREAD
  io_uring: used cached copies of sq->dropped and cq->overflow
  io_uring: Fix race for sqes with userspace
  io_uring: Fix broken links with offloading
  io_uring: Fix corrupted user_data
  io_uring: correct timeout req sequence when inserting a new entry
  io_uring : correct timeout req sequence when waiting timeout
  io_uring: revert "io_uring: optimize submit_and_wait API"

acf913b7

Merge tag 's390-5.4-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · f877bee5

Linus Torvalds authored Oct 26, 2019

Pull s390 fixes from Vasily Gorbik:

 - Add R_390_GLOB_DAT relocation type support. This fixes boot problem
   on linux-next.

 - Fix memory leak in zcrypt

* tag 's390-5.4-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/kaslr: add support for R_390_GLOB_DAT relocation type
  s390/zcrypt: fix memleak at release

f877bee5

Merge tag 'for-linus-5.4-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · 4fac2407

Linus Torvalds authored Oct 26, 2019

Pull xen fixlet from Juergen Gross:
 "Just one patch for issuing a deprecation warning for 32-bit Xen pv
  guests"

* tag 'for-linus-5.4-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen: issue deprecation warning for 32-bit pv guest

4fac2407

Merge tag 'dma-mapping-5.4-2' of git://git.infradead.org/users/hch/dma-mapping · 964f9cfa

Linus Torvalds authored Oct 26, 2019

Pull dma-mapping fix from Christoph Hellwig:
 "Fix a regression in the intel-iommu get_required_mask conversion
  (Arvind Sankar)"

* tag 'dma-mapping-5.4-2' of git://git.infradead.org/users/hch/dma-mapping:
  iommu/vt-d: Return the correct dma mask when we are bypassing the IOMMU

964f9cfa

Merge tag 'dax-fix-5.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm · 485fc4b6

Linus Torvalds authored Oct 26, 2019

Pull dax fix from Dan Williams:
 "Fix a performance regression that followed from a fix to the
  conversion of the fsdax implementation to the xarray. v5.3 users
  report that they stop seeing huge page mappings on an application +
  filesystem layout that was seeing huge pages previously on v5.2"

* tag 'dax-fix-5.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  fs/dax: Fix pmd vs pte conflict detection

485fc4b6

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 1c4e395c

Linus Torvalds authored Oct 25, 2019

Pull SCSI fixes from James Bottomley:
 "Nine changes, eight to drivers (qla2xxx, hpsa, lpfc, alua, ch,
  53c710[x2], target) and one core change that tries to close a race
  between sysfs delete and module removal"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: lpfc: remove left-over BUILD_NVME defines
  scsi: core: try to get module before removing device
  scsi: hpsa: add missing hunks in reset-patch
  scsi: target: core: Do not overwrite CDB byte 1
  scsi: ch: Make it possible to open a ch device multiple times again
  scsi: fix kconfig dependency warning related to 53C700_LE_ON_BE
  scsi: sni_53c710: fix compilation error
  scsi: scsi_dh_alua: handle RTPG sense code correctly during state transitions
  scsi: qla2xxx: fix a potential NULL pointer dereference

1c4e395c

25 Oct, 2019 5 commits

riscv: cleanup do_trap_break · e8f44c50

Christoph Hellwig authored Oct 17, 2019

If we always compile the get_break_insn_length inline function we can
remove the ifdefs and let dead code elimination take care of the warn
branch that is now unreadable because the report_bug stub always
returns BUG_TRAP_TYPE_BUG.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>

e8f44c50

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input · b4b61b22

Linus Torvalds authored Oct 25, 2019

Pull input fix from Dmitry Torokhov:
 "A fix for st1232 driver to properly report coordinates for 2nd and
  subsequent fingers when more than one is on the surface"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: st1232 - fix reporting multitouch coordinates

b4b61b22

nbd: verify socket is supported during setup · cf1b2326

Mike Christie authored Oct 17, 2019

nbd requires socket families to support the shutdown method so the nbd
recv workqueue can be woken up from its sock_recvmsg call. If the socket
does not support the callout we will leave recv works running or get hangs
later when the device or module is removed.

This adds a check during socket connection/reconnection to make sure the
socket being passed in supports the needed callout.

Reported-by: syzbot+24c12fa8d218ed26011a@syzkaller.appspotmail.com
Fixes: e9e006f5 ("nbd: fix max number of supported devs")
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

cf1b2326

ata: libahci_platform: Fix regulator_get_optional() misuse · 962399bb

Mark Brown authored Oct 16, 2019

This driver is using regulator_get_optional() to handle all the supplies
that it handles, and only ever enables and disables all supplies en masse
without ever doing any other configuration of the device to handle missing
power. These are clear signs that the API is being misused - it should only
be used for supplies that may be physically absent from the system and in
these cases the hardware usually needs different configuration if the
supply is missing. Instead use normal regualtor_get(), if the supply is
not described in DT then the framework will substitute a dummy regulator in
so no special handling is needed by the consumer driver.

In the case of the PHY regulator the handling in the driver is a hack to
deal with integrated PHYs; the supplies are only optional in the sense
that that there's some confusion in the code about where they're bound to.
From a code point of view they function exactly as normal supplies so can
be treated as such. It'd probably be better to model this by instantiating
a PHY object for integrated PHYs.
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

962399bb

nbd: handle racing with error'ed out commands · 7ce23e8e

Josef Bacik authored Oct 21, 2019

We hit the following warning in production

print_req_error: I/O error, dev nbd0, sector 7213934408 flags 80700
------------[ cut here ]------------
refcount_t: underflow; use-after-free.
WARNING: CPU: 25 PID: 32407 at lib/refcount.c:190 refcount_sub_and_test_checked+0x53/0x60
Workqueue: knbd-recv recv_work [nbd]
RIP: 0010:refcount_sub_and_test_checked+0x53/0x60
Call Trace:
 blk_mq_free_request+0xb7/0xf0
 blk_mq_complete_request+0x62/0xf0
 recv_work+0x29/0xa1 [nbd]
 process_one_work+0x1f5/0x3f0
 worker_thread+0x2d/0x3d0
 ? rescuer_thread+0x340/0x340
 kthread+0x111/0x130
 ? kthread_create_on_node+0x60/0x60
 ret_from_fork+0x1f/0x30
---[ end trace b079c3c67f98bb7c ]---

This was preceded by us timing out everything and shutting down the
sockets for the device.  The problem is we had a request in the queue at
the same time, so we completed the request twice.  This can actually
happen in a lot of cases, we fail to get a ref on our config, we only
have one connection and just error out the command, etc.

Fix this by checking cmd->status in nbd_read_stat.  We only change this
under the cmd->lock, so we are safe to check this here and see if we've
already error'ed this command out, which would indicate that we've
completed it as well.
Reviewed-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

7ce23e8e