1. 03 Apr, 2021 7 commits
  2. 02 Apr, 2021 12 commits
    • Linus Torvalds's avatar
      Merge tag 'block-5.12-2021-04-02' of git://git.kernel.dk/linux-block · d93a0d43
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
      
       - Remove comment that never came to fruition in 22 years of development
         (Christoph)
      
       - Remove unused request flag (Christoph)
      
       - Fix for null_blk fake timeout handling (Damien)
      
       - Fix for IOCB_NOWAIT being ignored for O_DIRECT on raw bdevs (Pavel)
      
       - Error propagation fix for multiple split bios (Yufen)
      
      * tag 'block-5.12-2021-04-02' of git://git.kernel.dk/linux-block:
        block: remove the unused RQF_ALLOCED flag
        block: update a few comments in uapi/linux/blkpg.h
        block: don't ignore REQ_NOWAIT for direct IO
        null_blk: fix command timeout completion handling
        block: only update parent bi_status when bio fail
      d93a0d43
    • Linus Torvalds's avatar
      Merge tag 'io_uring-5.12-2021-04-02' of git://git.kernel.dk/linux-block · 1faccb63
      Linus Torvalds authored
      Pull io_uring fixes from Jens Axboe:
       "Nothing really major in here, and finally nothing really related to
        signals. A few minor fixups related to the threading changes, and some
        general fixes, that's it.
      
        There's the pending gdb-get-confused-about-arch, but that's more of a
        cosmetic issue, nothing that hinders use of it. And given that other
        archs will likely be affected by that oddity too, better to postpone
        any changes there until 5.13 imho"
      
      * tag 'io_uring-5.12-2021-04-02' of git://git.kernel.dk/linux-block:
        io_uring: move reissue into regular IO path
        io_uring: fix EIOCBQUEUED iter revert
        io_uring/io-wq: protect against sprintf overflow
        io_uring: don't mark S_ISBLK async work as unbounded
        io_uring: drop sqd lock before handling signals for SQPOLL
        io_uring: handle setup-failed ctx in kill_timeouts
        io_uring: always go for cancellation spin on exec
      1faccb63
    • Linus Torvalds's avatar
      Merge tag 'acpi-5.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 0a84c2e4
      Linus Torvalds authored
      Pull ACPI fixes from Rafael Wysocki:
       "These fix an ACPI tables management issue, an issue related to the
        ACPI enumeration of devices and CPU wakeup in the ACPI processor
        driver.
      
        Specifics:
      
         - Ensure that the memory occupied by ACPI tables on x86 will always
           be reserved to prevent it from being allocated for other purposes
           which was possible in some cases (Rafael Wysocki).
      
         - Fix the ACPI device enumeration code to prevent it from attempting
           to evaluate the _STA control method for devices with unmet
           dependencies which is likely to fail (Hans de Goede).
      
         - Fix the handling of CPU0 wakeup in the ACPI processor driver to
           prevent CPU0 online failures from occurring (Vitaly Kuznetsov)"
      
      * tag 'acpi-5.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        ACPI: processor: Fix CPU0 wakeup in acpi_idle_play_dead()
        ACPI: scan: Fix _STA getting called on devices with unmet dependencies
        ACPI: tables: x86: Reserve memory occupied by ACPI tables
      0a84c2e4
    • Linus Torvalds's avatar
      Merge tag 'pm-5.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 9314a0e9
      Linus Torvalds authored
      Pull power management fixes from Rafael Wysocki:
       "These fix a race condition and an ordering issue related to using
        device links in the runtime PM framework and two kerneldoc comments in
        cpufreq.
      
        Specifics:
      
         - Fix race condition related to the handling of supplier devices
           during consumer device probe and fix the order of decrementation of
           two related reference counters in the runtime PM core code handling
           supplier devices (Adrian Hunter).
      
         - Fix kerneldoc comments in cpufreq that have not been updated along
           with the functions documented by them (Geert Uytterhoeven)"
      
      * tag 'pm-5.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        PM: runtime: Fix race getting/putting suppliers at probe
        PM: runtime: Fix ordering in pm_runtime_get_suppliers()
        cpufreq: Fix scaling_{available,boost}_frequencies_show() comments
      9314a0e9
    • Christoph Hellwig's avatar
      block: remove the unused RQF_ALLOCED flag · f06c6096
      Christoph Hellwig authored
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      f06c6096
    • Christoph Hellwig's avatar
      block: update a few comments in uapi/linux/blkpg.h · b9c6cdc3
      Christoph Hellwig authored
      The big comment at the top of the file talks about grand plans that never
      happened, so remove it to avoid confusing readers.  Also mark the
      devname and volname fields as ignored as they were never used by the
      kernel.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      b9c6cdc3
    • Linus Torvalds's avatar
      Merge tag 'trace-v5.12-rc5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace · 05de4538
      Linus Torvalds authored
      Pull tracing fix from Steven Rostedt:
       "Fix stack trace entry size to stop showing garbage
      
        The macro that creates both the structure and the format displayed to
        user space for the stack trace event was changed a while ago to fix
        the parsing by user space tooling. But this change also modified the
        structure used to store the stack trace event. It changed the caller
        array field from [0] to [8].
      
        Even though the size in the ring buffer is dynamic and can be
        something other than 8 (user space knows how to handle this), the 8
        extra words were not accounted for when reserving the event on the ring
        buffer, and added 8 more entries, due to the calculation of
        "sizeof(*entry) + nr_entries * sizeof(long)", as the sizeof(*entry)
        now contains 8 entries.
      
        The size of the caller field needs to be subtracted from the size of
        the entry to create the correct allocation size"
      
      * tag 'trace-v5.12-rc5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        tracing: Fix stack trace event size
      05de4538
    • Jens Axboe's avatar
      io_uring: move reissue into regular IO path · 230d50d4
      Jens Axboe authored
      It's non-obvious how retry is done for block backed files, when it happens
      off the kiocb done path. It also makes it tricky to deal with the iov_iter
      handling.
      
      Just mark the req as needing a reissue, and handle it from the
      submission path instead. This makes it directly obvious that we're not
      re-importing the iovec from userspace past the submit point, and it means
      that we can just reuse our usual -EAGAIN retry path from the read/write
      handling.
      
      At some point in the future, we'll gain the ability to always reliably
      return -EAGAIN through the stack. A previous attempt on the block side
      didn't pan out and got reverted, hence the need to check for this
      information out-of-band right now.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      230d50d4
    • Rafael J. Wysocki's avatar
      Merge branches 'acpi-tables' and 'acpi-scan' · 91463ebf
      Rafael J. Wysocki authored
      * acpi-tables:
        ACPI: tables: x86: Reserve memory occupied by ACPI tables
      
      * acpi-scan:
        ACPI: scan: Fix _STA getting called on devices with unmet dependencies
      91463ebf
    • Rafael J. Wysocki's avatar
      Merge branch 'pm-cpufreq' · ac1790ad
      Rafael J. Wysocki authored
      * pm-cpufreq:
        cpufreq: Fix scaling_{available,boost}_frequencies_show() comments
      ac1790ad
    • Pavel Begunkov's avatar
      block: don't ignore REQ_NOWAIT for direct IO · f8b78caf
      Pavel Begunkov authored
      If IOCB_NOWAIT is set on submission, then that needs to get propagated to
      REQ_NOWAIT on the block side. Otherwise we completely lose this
      information, and any issuer of IOCB_NOWAIT IO will potentially end up
      blocking on eg request allocation on the storage side.
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      f8b78caf
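The flag propagation described in this commit can be pictured with a small user-space sketch. The `DEMO_` constants and the helper name below are hypothetical stand-ins chosen for illustration; they are not the kernel's actual bit values or functions.

```c
#include <stdint.h>

/* Hypothetical bit values, for illustration only; the kernel's real
 * IOCB_NOWAIT and REQ_NOWAIT flags use different definitions. */
#define DEMO_IOCB_NOWAIT (1u << 0)
#define DEMO_REQ_NOWAIT  (1u << 5)

/* Carry the "do not block" hint from the submission flags over to the
 * request flags, so that lower layers can still see it. */
static inline uint32_t demo_iocb_to_req_flags(uint32_t iocb_flags,
                                              uint32_t req_flags)
{
    if (iocb_flags & DEMO_IOCB_NOWAIT)
        req_flags |= DEMO_REQ_NOWAIT;
    return req_flags;
}
```

Without the conditional, the returned request flags would never carry the no-wait bit, which mirrors the information loss the commit message describes.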
    • Linus Torvalds's avatar
      Merge tag 'lto-v5.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · 1678e493
      Linus Torvalds authored
      Pull LTO fix from Kees Cook:
       "It seems that there is a bug in ld.bfd when doing module section
        merging.
      
        As explicit merging is only needed for LTO, the work-around is to only
        do it under LTO, leaving the original section layout choices alone
        under normal builds:
      
         - Only perform explicit module section merges under LTO (Sean
           Christopherson)"
      
      * tag 'lto-v5.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
        kbuild: lto: Merge module sections if and only if CONFIG_LTO_CLANG is enabled
      1678e493
  3. 01 Apr, 2021 21 commits
    • Sean Christopherson's avatar
      kbuild: lto: Merge module sections if and only if CONFIG_LTO_CLANG is enabled · 6a3193cd
      Sean Christopherson authored
      Merge module sections only when using Clang LTO. With ld.bfd, merging
      sections does not appear to update the symbol tables for the module,
      e.g. 'readelf -s' shows the value that a symbol would have had, if
      sections were not merged. ld.lld does not show this problem.
      
      The stale symbol table breaks gdb's function disassembler, and presumably
      other things, e.g.
      
        gdb -batch -ex "file arch/x86/kvm/kvm.ko" -ex "disassemble kvm_init"
      
      reads the wrong bytes and dumps garbage.
      
      Fixes: dd277622 ("kbuild: lto: merge module sections")
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
      Tested-by: Sami Tolvanen <samitolvanen@google.com>
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Link: https://lore.kernel.org/r/20210322234438.502582-1-seanjc@google.com
      6a3193cd
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 6905b1dc
      Linus Torvalds authored
      Pull kvm fixes from Paolo Bonzini:
       "It's a bit larger than I (and probably you) would like by the time we
        get to -rc6, but perhaps not entirely unexpected since the changes in
        the last merge window were larger than usual.
      
        x86:
         - Fixes for missing TLB flushes with TDP MMU
      
         - Fixes for race conditions in nested SVM
      
         - Fixes for lockdep splat with Xen emulation
      
         - Fix for kvmclock underflow
      
         - Fix srcdir != builddir builds
      
         - Other small cleanups
      
        ARM:
         - Fix GICv3 MMIO compatibility probing
      
         - Prevent guests from using the ARMv8.4 self-hosted tracing
           extension"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        selftests: kvm: Check that TSC page value is small after KVM_SET_CLOCK(0)
        KVM: x86: Prevent 'hv_clock->system_time' from going negative in kvm_guest_time_update()
        KVM: x86: disable interrupts while pvclock_gtod_sync_lock is taken
        KVM: x86: reduce pvclock_gtod_sync_lock critical sections
        KVM: SVM: ensure that EFER.SVME is set when running nested guest or on nested vmexit
        KVM: SVM: load control fields from VMCB12 before checking them
        KVM: x86/mmu: Don't allow TDP MMU to yield when recovering NX pages
        KVM: x86/mmu: Ensure TLBs are flushed for TDP MMU during NX zapping
        KVM: x86/mmu: Ensure TLBs are flushed when yielding during GFN range zap
        KVM: make: Fix out-of-source module builds
        selftests: kvm: make hardware_disable_test less verbose
        KVM: x86/vPMU: Forbid writing to MSR_F15H_PERF MSRs when guest doesn't have X86_FEATURE_PERFCTR_CORE
        KVM: x86: remove unused declaration of kvm_write_tsc()
        KVM: clean up the unused argument
        tools/kvm_stat: Add restart delay
        KVM: arm64: Fix CPU interface MMIO compatibility detection
        KVM: arm64: Disable guest access to trace filter controls
        KVM: arm64: Hide system instruction access to Trace registers
      6905b1dc
    • Linus Torvalds's avatar
      Merge tag 'drm-fixes-2021-04-02' of git://anongit.freedesktop.org/drm/drm · a80314c3
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Things have settled down in time for Easter, a random smattering of
        small fixes across a few drivers.
      
        I'm guessing though there might be some i915 and misc fixes out there
        I haven't gotten yet, but since today is a public holiday here, I'm
        sending this early so I can have the day off, I'll see if more
        requests come in and decide what to do with them later.
      
        amdgpu:
         - Polaris idle power fix
         - VM fix
         - Vangogh S3 fix
         - Fixes for non-4K page sizes
      
        amdkfd:
         - dqm fence memory corruption fix
      
        tegra:
         - lockdep warning fix
         - runtime PM reference fix
         - display controller fix
         - PLL Fix
      
        imx:
         - memory leak in error path fix
         - LDB driver channel registration fix
         - oob array warning in LDB driver
      
        exynos
         - unused header file removal"
      
      * tag 'drm-fixes-2021-04-02' of git://anongit.freedesktop.org/drm/drm:
        drm/amdgpu: check alignment on CPU page for bo map
        drm/amdgpu: Set a suitable dev_info.gart_page_size
        drm/amdgpu/vangogh: don't check for dpm in is_dpm_running when in suspend
        drm/amdkfd: dqm fence memory corruption
        drm/tegra: sor: Grab runtime PM reference across reset
        drm/tegra: dc: Restore coupling of display controllers
        gpu: host1x: Use different lock classes for each client
        drm/tegra: dc: Don't set PLL clock to 0Hz
        drm/amdgpu: fix offset calculation in amdgpu_vm_bo_clear_mappings()
        drm/amd/pm: no need to force MCLK to highest when no display connected
        drm/exynos/decon5433: Remove the unused include statements
        drm/imx: imx-ldb: fix out of bounds array access warning
        drm/imx: imx-ldb: Register LDB channel1 when it is the only channel to be used
        drm/imx: fix memory leak when fails to init
      a80314c3
    • Dave Airlie's avatar
      Merge tag 'imx-drm-fixes-2021-04-01' of git://git.pengutronix.de/git/pza/linux into drm-fixes · 6fdb8e5a
      Dave Airlie authored
      drm/imx: imx-drm-core and imx-ldb fixes
      
      Fix a memory leak in an error path during DRM device initialization,
      fix the LDB driver to register channel 1 even if channel 0 is unused,
      and fix an out of bounds array access warning in the LDB driver.
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      
      From: Philipp Zabel <p.zabel@pengutronix.de>
      Link: https://patchwork.freedesktop.org/patch/msgid/20210401092235.GA13586@pengutronix.de
      6fdb8e5a
    • Dave Airlie's avatar
      Merge tag 'drm/tegra/for-5.12-rc6' of ssh://git.freedesktop.org/git/tegra/linux into drm-fixes · a0497251
      Dave Airlie authored
      drm/tegra: Fixes for v5.12-rc6
      
      This contains a couple of fixes for various issues such as lockdep
      warnings, runtime PM references, coupled display controllers and
      misconfigured PLLs.
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      
      From: Thierry Reding <thierry.reding@gmail.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20210401163352.3348296-1-thierry.reding@gmail.com
      a0497251
    • Steven Rostedt (VMware)'s avatar
      tracing: Fix stack trace event size · 9deb193a
      Steven Rostedt (VMware) authored
      Commit cbc3b92c fixed an issue to modify the macros of the stack trace
      event so that user space could parse it properly. Originally the stack
      trace format to user space showed that the called stack was a dynamic
      array. But it is not actually a dynamic array, in the way that other
      dynamic event arrays worked, and this broke user space parsing for it. The
      update was to make the array look to have 8 entries in it. Helper
      functions were added to make it parse it correctly, as the stack was
      dynamic, but was determined by the size of the event stored.
      
      Although this fixed user space on how it read the event, it changed the
      internal structure used for the stack trace event. It changed the array
      size from [0] to [8] (added 8 entries). This increased the size of the
      stack trace event by 8 words. The size reserved on the ring buffer was the
      size of the stack trace event plus the number of stack entries found in
      the stack trace. That commit caused the amount to be 8 more than what was
      needed because it did not expect the caller field to have any size. This
      produced 8 entries of garbage (and reading random data) from the stack
      trace event:
      
                <idle>-0       [002] d... 1976396.837549: <stack trace>
       => trace_event_raw_event_sched_switch
       => __traceiter_sched_switch
       => __schedule
       => schedule_idle
       => do_idle
       => cpu_startup_entry
       => secondary_startup_64_no_verify
       => 0xc8c5e150ffff93de
       => 0xffff93de
       => 0
       => 0
       => 0xc8c5e17800000000
       => 0x1f30affff93de
       => 0x00000004
       => 0x200000000
      
      Instead, subtract the size of the caller field from the size of the event
      to make sure that only the amount needed to store the stack trace is
      reserved.
      
      Link: https://lore.kernel.org/lkml/your-ad-here.call-01617191565-ext-9692@work.hours/
      
      Cc: stable@vger.kernel.org
      Fixes: cbc3b92c ("tracing: Set kernel_stack's caller size properly")
      Reported-by: Vasily Gorbik <gor@linux.ibm.com>
      Tested-by: Vasily Gorbik <gor@linux.ibm.com>
      Acked-by: Vasily Gorbik <gor@linux.ibm.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      9deb193a
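The size arithmetic in this commit can be sketched in user space as follows. `struct stack_event` and both helper names are hypothetical stand-ins, not the kernel's actual structure or functions; only the `[8]` caller array and the `sizeof(*entry) + nr_entries * sizeof(long)` calculation come from the commit text.

```c
#include <stddef.h>

#define FIXED_CALLER_SLOTS 8 /* the [8] array size described above */

/* Hypothetical stand-in for the stack trace event layout. */
struct stack_event {
    int size;                                 /* number of valid entries */
    unsigned long caller[FIXED_CALLER_SLOTS]; /* was caller[0] originally */
};

/* Over-reserving calculation: sizeof(struct stack_event) already
 * contains eight caller slots, so eight extra words get reserved. */
static inline size_t reserve_size_buggy(size_t nr_entries)
{
    return sizeof(struct stack_event) + nr_entries * sizeof(unsigned long);
}

/* Corrected calculation: subtract the caller field first, then add
 * room for exactly the entries that were captured. */
static inline size_t reserve_size_fixed(size_t nr_entries)
{
    return sizeof(struct stack_event)
           - sizeof(((struct stack_event *)0)->caller)
           + nr_entries * sizeof(unsigned long);
}
```

For any entry count, the two helpers differ by exactly eight words, which corresponds to the eight garbage entries shown in the trace output above.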
    • Linus Torvalds's avatar
      Merge tag 'sound-5.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · ffd9fb54
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
       "Things seem to be calming down; only the usual device-specific fixes for
        HD-audio and USB-audio this time"
      
      * tag 'sound-5.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
        ALSA: hda/realtek: fix mute/micmute LEDs for HP 640 G8
        ALSA: hda: Add missing sanity checks in PM prepare/complete callbacks
        ALSA: hda: Re-add dropped snd_poewr_change_state() calls
        ALSA: usb-audio: Apply sample rate quirk to Logitech Connect
        ALSA: hda/realtek: call alc_update_headset_mode() in hp_automute_hook
        ALSA: hda/realtek: fix a determine_headset_type issue for a Dell AIO
      ffd9fb54
    • Linus Torvalds's avatar
      Merge tag 'tomoyo-pr-20210401' of git://git.osdn.net/gitroot/tomoyo/tomoyo-test1 · 5d17c1ba
      Linus Torvalds authored
      Pull tomoyo fix from Tetsuo Handa:
       "An update on 'tomoyo: recognize kernel threads correctly' from Jens
        Axboe to not special case PF_IO_WORKER for PF_KTHREAD"
      
      * tag 'tomoyo-pr-20210401' of git://git.osdn.net/gitroot/tomoyo/tomoyo-test1:
        tomoyo: don't special case PF_IO_WORKER for PF_KTHREAD
      5d17c1ba
    • Linus Torvalds's avatar
      Merge tag 'xarray-5.12' of git://git.infradead.org/users/willy/xarray · e8d18958
      Linus Torvalds authored
      Pull XArray fixes from Matthew Wilcox:
       "My apologies for the lateness of this. I had a bug reported in the
        test suite, and when I started working on it, I realised I had two
        fixes sitting in the xarray tree since last November. Anyway,
        everything here is fixes, apart from adding xa_limit_16b. The test
        suite passes.
      
        Summary:
      
         - Fix a bug when splitting to a non-zero order
      
         - Documentation fix
      
         - Add a predefined 16-bit allocation limit
      
         - Various test suite fixes"
      
      * tag 'xarray-5.12' of git://git.infradead.org/users/willy/xarray:
        idr test suite: Improve reporting from idr_find_test_1
        idr test suite: Create anchor before launching throbber
        idr test suite: Take RCU read lock in idr_find_test_1
        radix tree test suite: Register the main thread with the RCU library
        radix tree test suite: Fix compilation
        XArray: Add xa_limit_16b
        XArray: Fix splitting to non-zero orders
        XArray: Fix split documentation
      e8d18958
    • Pavel Begunkov's avatar
      io_uring: fix EIOCBQUEUED iter revert · 07204f21
      Pavel Begunkov authored
      iov_iter_revert() is done in completion handlers that happen before
      read/write returns -EIOCBQUEUED, so there is no need to repeat the revert
      afterwards. Moreover, even though it may appear to be just a no-op, it
      actually races with 1) the user forging a new iovec of a different size
      and 2) reissue, which is done via io-wq and continues completely
      asynchronously.
      
      Fixes: 3e6a0d3c ("io_uring: fix -EAGAIN retry with IOPOLL")
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      07204f21
    • Pavel Begunkov's avatar
      io_uring/io-wq: protect against sprintf overflow · 696ee88a
      Pavel Begunkov authored
      task_pid may be too large to fit into the space left in
      TASK_COMM_LEN-sized buffers, overflowing them in sprintf(). We do not
      care much about uniqueness, so replace it with the safer snprintf().
      Reported-by: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Link: https://lore.kernel.org/r/1702c6145d7e1c46fbc382f28334c02e1a3d3994.1617267273.git.asml.silence@gmail.com
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      696ee88a
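A minimal user-space sketch of the truncation behaviour this commit relies on. The helper name and the `"iou-wrk-"` prefix are assumptions made for illustration, and `TASK_COMM_LEN` is defined locally as 16 rather than taken from kernel headers.

```c
#include <assert.h>
#include <stdio.h>

#define TASK_COMM_LEN 16 /* local constant for this sketch */

/* Hypothetical helper: write a worker name into a fixed-size buffer.
 * snprintf() never writes more than TASK_COMM_LEN bytes and always
 * NUL-terminates, truncating the name if the pid has many digits.
 * The return value is the length the full name would have had. */
static inline int format_worker_name(char *buf, long pid)
{
    assert(buf != NULL);
    return snprintf(buf, TASK_COMM_LEN, "iou-wrk-%ld", pid);
}
```

With a ten-digit pid the full name needs 18 characters plus a terminator. `sprintf()` would write past a 16-byte buffer in that case, whereas `snprintf()` keeps the first 15 characters and reports the untruncated length through its return value.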
    • Jens Axboe's avatar
      io_uring: don't mark S_ISBLK async work as unbounded · 4b982bd0
      Jens Axboe authored
      S_ISBLK is marked as unbounded work for async preparation, because it
      doesn't match S_ISREG. That is incorrect, as any read/write to a block
      device is also a bounded operation. Fix it up and ensure that S_ISBLK
      isn't marked unbounded.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      4b982bd0
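The classification change can be illustrated with the standard file-type macros from `<sys/stat.h>`. Both function names are hypothetical; this is a sketch of the rule stated in the commit message, not the io_uring code itself.

```c
#define _DEFAULT_SOURCE /* expose the S_IF* constants on glibc in strict modes */
#include <stdbool.h>
#include <sys/stat.h>

/* Old rule as described above: only regular files count as bounded,
 * so block devices fall through to the unbounded class. */
static inline bool io_is_bounded_old(mode_t mode)
{
    return S_ISREG(mode);
}

/* Corrected rule: reads and writes on block devices are bounded too. */
static inline bool io_is_bounded(mode_t mode)
{
    return S_ISREG(mode) || S_ISBLK(mode);
}
```

A mode such as `S_IFBLK | 0600` is rejected by the old predicate and accepted by the new one, while sockets and pipes stay unbounded under both.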
    • Damien Le Moal's avatar
      null_blk: fix command timeout completion handling · de3510e5
      Damien Le Moal authored
      Memory backed or zoned null block devices may generate actual request
      timeout errors due to the submission path being blocked on memory
      allocation or zone locking. Unlike fake timeouts or injected timeouts,
      the request submission path will call blk_mq_complete_request() or
      blk_mq_end_request() for these real timeout errors, causing a double
      completion and use after free situation as the block layer timeout
      handler executes blk_mq_rq_timed_out() and __blk_mq_free_request() in
      blk_mq_check_expired(). This problem often triggers a NULL pointer
      dereference such as:
      
      BUG: kernel NULL pointer dereference, address: 0000000000000050
      RIP: 0010:blk_mq_sched_mark_restart_hctx+0x5/0x20
      ...
      Call Trace:
        dd_finish_request+0x56/0x80
        blk_mq_free_request+0x37/0x130
        null_handle_cmd+0xbf/0x250 [null_blk]
        ? null_queue_rq+0x67/0xd0 [null_blk]
        blk_mq_dispatch_rq_list+0x122/0x850
        __blk_mq_do_dispatch_sched+0xbb/0x2c0
        __blk_mq_sched_dispatch_requests+0x13d/0x190
        blk_mq_sched_dispatch_requests+0x30/0x60
        __blk_mq_run_hw_queue+0x49/0x90
        process_one_work+0x26c/0x580
        worker_thread+0x55/0x3c0
        ? process_one_work+0x580/0x580
        kthread+0x134/0x150
        ? kthread_create_worker_on_cpu+0x70/0x70
        ret_from_fork+0x1f/0x30
      
      This problem very often triggers when running the full btrfs xfstests
      on a memory-backed zoned null block device in a VM with limited amount
      of memory.
      
      Avoid this by executing blk_mq_complete_request() in null_timeout_rq()
      only for commands that are marked for a fake timeout completion using
      the fake_timeout boolean in struct null_cmd. For timeout errors injected
      through debugfs, the timeout handler will execute
      blk_mq_complete_request() as before. This is safe as the submission
      path does not complete requests in this case.
      
      In null_timeout_rq(), also make sure to set the command error field to
      BLK_STS_TIMEOUT and to propagate this error through to the request
      completion.
      Reported-by: Johannes Thumshirn <Johannes.Thumshirn@wdc.com>
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Tested-by: Johannes Thumshirn <Johannes.Thumshirn@wdc.com>
      Reviewed-by: Johannes Thumshirn <Johannes.Thumshirn@wdc.com>
      Link: https://lore.kernel.org/r/20210331225244.126426-1-damien.lemoal@wdc.com
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      de3510e5
    • Matthew Wilcox (Oracle)'s avatar
      idr test suite: Improve reporting from idr_find_test_1 · 2c7e57a0
      Matthew Wilcox (Oracle) authored
      Instead of just reporting an assertion failure, report enough information
      that we can start diagnosing exactly what went wrong.
      Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      2c7e57a0
    • Matthew Wilcox (Oracle)'s avatar
      idr test suite: Create anchor before launching throbber · 094ffbd1
      Matthew Wilcox (Oracle) authored
      The throbber could race with creation of the anchor entry and cause the
      IDR to have zero entries in it, which would cause the test to fail.
      Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      094ffbd1
    • Matthew Wilcox (Oracle)'s avatar
      idr test suite: Take RCU read lock in idr_find_test_1 · 70358641
      Matthew Wilcox (Oracle) authored
      When run on a single CPU, this test would frequently access already-freed
      memory.  Due to timing, this bug never showed up on multi-CPU tests.
      Reported-by: Chris von Recklinghausen <crecklin@redhat.com>
      Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      70358641
    • Matthew Wilcox (Oracle)'s avatar
      radix tree test suite: Register the main thread with the RCU library · 1bb4bd26
      Matthew Wilcox (Oracle) authored
      Several test runners register individual worker threads with the
      RCU library, but neglect to register the main thread, which can lead
      to objects being freed while the main thread is in what appears to be
      an RCU critical section.
      Reported-by: Chris von Recklinghausen <crecklin@redhat.com>
      Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      1bb4bd26
    • Vitaly Kuznetsov's avatar
      ACPI: processor: Fix CPU0 wakeup in acpi_idle_play_dead() · 8cdddd18
      Vitaly Kuznetsov authored
      Commit 496121c0 ("ACPI: processor: idle: Allow probing on platforms
      with one ACPI C-state") broke CPU0 hotplug on certain systems, e.g.
      I'm observing the following on AWS Nitro (e.g r5b.xlarge but other
      instance types are affected as well):
      
       # echo 0 > /sys/devices/system/cpu/cpu0/online
       # echo 1 > /sys/devices/system/cpu/cpu0/online
       <10 seconds delay>
       -bash: echo: write error: Input/output error
      
      In fact, the above mentioned commit only revealed the problem and did
      not introduce it. On x86, an NMI is used to wake up a CPU, and the
      hlt_play_dead()/mwait_play_dead() loops are prepared to handle it:
      
      	/*
      	 * If NMI wants to wake up CPU0, start CPU0.
      	 */
      	if (wakeup_cpu0())
      		start_cpu0();
      
      cpuidle_play_dead() -> acpi_idle_play_dead() (which is now being called on
      systems where it wasn't called before the above mentioned commit) serves
      the same purpose but it doesn't have a path for CPU0. What happens now on
      wakeup is:
       - NMI is sent to CPU0
       - wakeup_cpu0_nmi() works as expected
       - we get back to while (1) loop in acpi_idle_play_dead()
       - safe_halt() puts CPU0 to sleep again.
      
      The straightforward/minimal fix is to add the special handling for CPU0 on
      x86, and that's what the patch does.
      
      Fixes: 496121c0 ("ACPI: processor: idle: Allow probing on platforms with one ACPI C-state")
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: 5.10+ <stable@vger.kernel.org> # 5.10+
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      8cdddd18
    • Vitaly Kuznetsov's avatar
      selftests: kvm: Check that TSC page value is small after KVM_SET_CLOCK(0) · 55626ca9
      Vitaly Kuznetsov authored
      Add a test for the issue where a KVM_SET_CLOCK(0) call could cause the
      TSC page value to become very large because of a signedness issue around
      hv_clock->system_time.
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Message-Id: <20210326155551.17446-3-vkuznets@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      55626ca9
    • Vitaly Kuznetsov's avatar
      KVM: x86: Prevent 'hv_clock->system_time' from going negative in kvm_guest_time_update() · 77fcbe82
      Vitaly Kuznetsov authored
      When guest time is reset with KVM_SET_CLOCK(0), it is possible for
      'hv_clock->system_time' to become a small negative number. This happens
      because in KVM_SET_CLOCK handling we set 'kvm->arch.kvmclock_offset' based
      on get_kvmclock_ns(kvm) but when KVM_REQ_CLOCK_UPDATE is handled,
      kvm_guest_time_update() does (masterclock in use case):
      
      hv_clock.system_time = ka->master_kernel_ns + v->kvm->arch.kvmclock_offset;
      
      And 'master_kernel_ns' represents the last time when masterclock
      got updated, it can precede KVM_SET_CLOCK() call. Normally, this is not a
      problem, the difference is very small, e.g. I'm observing
      hv_clock.system_time = -70 ns. The issue comes from the fact that
      'hv_clock.system_time' is stored as unsigned and 'system_time / 100' in
      compute_tsc_page_parameters() becomes a very big number.
      
      Use 'master_kernel_ns' instead of get_kvmclock_ns() when masterclock is in
      use and get_kvmclock_base_ns() when it's not to prevent 'system_time' from
      going negative.
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Message-Id: <20210331124130.337992-2-vkuznets@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      77fcbe82
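The signedness problem described in this commit can be reproduced with plain integer arithmetic. The helper name below is hypothetical, and the sketch only models the "sum stored as unsigned, then divided by 100" step from the message, not the KVM code path.

```c
#include <stdint.h>

/* Model of the failure: the sum of the masterclock timestamp and the
 * kvmclock offset is stored in an unsigned 64-bit field, so a slightly
 * negative result wraps around before the division by 100. */
static inline uint64_t system_time_div100(int64_t master_kernel_ns,
                                          int64_t kvmclock_offset)
{
    uint64_t system_time = (uint64_t)(master_kernel_ns + kvmclock_offset);

    return system_time / 100;
}
```

With a timestamp of 1000 ns and an offset of -1070 ns, the intended value is -70 ns. Stored as unsigned it becomes 2^64 - 70, and dividing that by 100 yields a number on the order of 10^17 instead of a value near zero.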
    • Paolo Bonzini's avatar
      KVM: x86: disable interrupts while pvclock_gtod_sync_lock is taken · a83829f5
      Paolo Bonzini authored
      pvclock_gtod_sync_lock can be taken with interrupts disabled if the
      preempt notifier calls get_kvmclock_ns to update the Xen
      runstate information:
      
         spin_lock include/linux/spinlock.h:354 [inline]
         get_kvmclock_ns+0x25/0x390 arch/x86/kvm/x86.c:2587
         kvm_xen_update_runstate+0x3d/0x2c0 arch/x86/kvm/xen.c:69
         kvm_xen_update_runstate_guest+0x74/0x320 arch/x86/kvm/xen.c:100
         kvm_xen_runstate_set_preempted arch/x86/kvm/xen.h:96 [inline]
         kvm_arch_vcpu_put+0x2d8/0x5a0 arch/x86/kvm/x86.c:4062
      
      So change the users of the spinlock to spin_lock_irqsave and
      spin_unlock_irqrestore.
      
      Reported-by: syzbot+b282b65c2c68492df769@syzkaller.appspotmail.com
      Fixes: 30b5c851 ("KVM: x86/xen: Add support for vCPU runstate information")
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      a83829f5