1. 23 May, 2022 9 commits
    • Merge tag 'for-5.19/io_uring-passthrough-2022-05-22' of git://git.kernel.dk/linux-block · 9836e93c
      Linus Torvalds authored
      Pull io_uring NVMe command passthrough from Jens Axboe:
       "On top of everything else, this adds support for passthrough for
        io_uring.
      
        The initial feature for this is NVMe passthrough support, which allows
        non-filesystem based IO commands and admin commands.
      
        To support this, io_uring grows support for SQE and CQE members that
        are twice as big, allowing to pass in a full NVMe command without
        having to copy data around. And to complete with more than just a
        single 32-bit value as the output"
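
A rough sketch of what "twice as big" means for ring memory. It assumes the regular entry sizes are 64 bytes per SQE and 16 bytes per CQE; the names below are hypothetical and ring headers are ignored, so this is an illustration rather than the kernel's actual ring size calculation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed regular entry sizes; the doubled layouts are 128 and 32 bytes. */
enum {
    SKETCH_SQE_BYTES = 64,
    SKETCH_CQE_BYTES = 16
};

/* Bytes taken by `entries` submission entries; `big` selects the doubled layout. */
size_t sketch_sq_array_bytes(size_t entries, int big)
{
    size_t per_entry = big ? 2 * SKETCH_SQE_BYTES : SKETCH_SQE_BYTES;

    assert(entries <= SIZE_MAX / per_entry); /* guard against overflow */
    return entries * per_entry;
}

/* Bytes taken by `entries` completion entries; `big` selects the doubled layout. */
size_t sketch_cq_array_bytes(size_t entries, int big)
{
    size_t per_entry = big ? 2 * SKETCH_CQE_BYTES : SKETCH_CQE_BYTES;

    assert(entries <= SIZE_MAX / per_entry); /* guard against overflow */
    return entries * per_entry;
}
```

With 8 entries, the doubled submission array needs 1024 bytes instead of 512, and the doubled completion array 256 bytes instead of 128.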
      
      * tag 'for-5.19/io_uring-passthrough-2022-05-22' of git://git.kernel.dk/linux-block: (22 commits)
        io_uring: cleanup handling of the two task_work lists
        nvme: enable uring-passthrough for admin commands
        nvme: helper for uring-passthrough checks
        blk-mq: fix passthrough plugging
        nvme: add vectored-io support for uring-cmd
        nvme: wire-up uring-cmd support for io-passthru on char-device.
        nvme: refactor nvme_submit_user_cmd()
        block: wire-up support for passthrough plugging
        fs,io_uring: add infrastructure for uring-cmd
        io_uring: support CQE32 for nop operation
        io_uring: enable CQE32
        io_uring: support CQE32 in /proc info
        io_uring: add tracing for additional CQE32 fields
        io_uring: overflow processing for CQE32
        io_uring: flush completions for CQE32
        io_uring: modify io_get_cqe for CQE32
        io_uring: add CQE32 completion processing
        io_uring: add CQE32 setup processing
        io_uring: change ring size calculation for CQE32
        io_uring: store add. return values for CQE32
        ...
    • Merge tag 'for-5.19/io_uring-net-2022-05-22' of git://git.kernel.dk/linux-block · e1a8fde7
      Linus Torvalds authored
      Pull io_uring 'more data in socket' support from Jens Axboe:
       "To be able to fully utilize the 'poll first' support in the core
        io_uring branch, it's advantageous knowing if the socket was empty
        after a receive. This adds support for that"
      
      * tag 'for-5.19/io_uring-net-2022-05-22' of git://git.kernel.dk/linux-block:
        io_uring: return hint on whether more data is available after receive
        tcp: pass back data left in socket after receive
    • Merge tag 'for-5.19/io_uring-socket-2022-05-22' of git://git.kernel.dk/linux-block · 368da430
      Linus Torvalds authored
      Pull io_uring socket() support from Jens Axboe:
       "This adds support for socket(2) for io_uring. This is handy when using
        direct / registered file descriptors with io_uring.
      
        Outside of those two patches, a small series from Dylan on top that
        improves the tracing by providing a text representation of the opcode
        rather than needing to decode this by reading the header file every
        time.
      
        That sits in this branch as it was the last opcode added (until it
        wasn't...)"
      
      * tag 'for-5.19/io_uring-socket-2022-05-22' of git://git.kernel.dk/linux-block:
        io_uring: use the text representation of ops in trace
        io_uring: rename op -> opcode
        io_uring: add io_uring_get_opcode
        io_uring: add type to op enum
        io_uring: add socket(2) support
        net: add __sys_socket_file()
    • Merge tag 'for-5.19/io_uring-xattr-2022-05-22' of git://git.kernel.dk/linux-block · 09beaff7
      Linus Torvalds authored
      Pull io_uring xattr support from Jens Axboe:
       "Support for the xattr variants"
      
      * tag 'for-5.19/io_uring-xattr-2022-05-22' of git://git.kernel.dk/linux-block:
        io_uring: cleanup error-handling around io_req_complete
        io_uring: fix trace for reduced sqe padding
        io_uring: add fgetxattr and getxattr support
        io_uring: add fsetxattr and setxattr support
        fs: split off do_getxattr from getxattr
        fs: split off setxattr_copy and do_setxattr function from setxattr
    • Merge tag 'for-5.19/io_uring-2022-05-22' of git://git.kernel.dk/linux-block · 3a166bdb
      Linus Torvalds authored
      Pull io_uring updates from Jens Axboe:
       "Here are the main io_uring changes for 5.19. This contains:
      
         - Fixes for sparse type warnings (Christoph, Vasily)
      
         - Support for multi-shot accept (Hao)
      
         - Support for io_uring managed fixed files, rather than always
           needing the application to manage the indices (me)
      
         - Fix for a spurious poll wakeup (Dylan)
      
         - CQE overflow fixes (Dylan)
      
         - Support more types of cancelations (me)
      
         - Support for co-operative task_work signaling, rather than always
           forcing an IPI (me)
      
         - Support for doing poll first when appropriate, rather than always
           attempting a transfer first (me)
      
         - Provided buffer cleanups and support for mapped buffers (me)
      
         - Improve how io_uring handles inflight SCM files (Pavel)
      
         - Speedups for registered files (Pavel, me)
      
         - Organize the completion data in a struct in io_kiocb rather than
           keep it in separate spots (Pavel)
      
         - task_work improvements (Pavel)
      
         - Cleanup and optimize the submission path, in general and for
           handling links (Pavel)
      
         - Speedups for registered resource handling (Pavel)
      
         - Support sparse buffers and file maps (Pavel, me)
      
         - Various fixes and cleanups (Almog, Pavel, me)"
      
      * tag 'for-5.19/io_uring-2022-05-22' of git://git.kernel.dk/linux-block: (111 commits)
        io_uring: fix incorrect __kernel_rwf_t cast
        io_uring: disallow mixed provided buffer group registrations
        io_uring: initialize io_buffer_list head when shared ring is unregistered
        io_uring: add fully sparse buffer registration
        io_uring: use rcu_dereference in io_close
        io_uring: consistently use the EPOLL* defines
        io_uring: make apoll_events a __poll_t
        io_uring: drop a spurious inline on a forward declaration
        io_uring: don't use ERR_PTR for user pointers
        io_uring: use a rwf_t for io_rw.flags
        io_uring: add support for ring mapped supplied buffers
        io_uring: add io_pin_pages() helper
        io_uring: add buffer selection support to IORING_OP_NOP
        io_uring: fix locking state for empty buffer group
        io_uring: implement multishot mode for accept
        io_uring: let fast poll support multishot
        io_uring: add REQ_F_APOLL_MULTISHOT for requests
        io_uring: add IORING_ACCEPT_MULTISHOT for accept
        io_uring: only wake when the correct events are set
        io_uring: avoid io-wq -EAGAIN looping for !IOPOLL
        ...
    • Merge tag 'rcu.2022.05.19a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu · 1e57930e
      Linus Torvalds authored
      Pull RCU update from Paul McKenney:
      
       - Documentation updates
      
       - Miscellaneous fixes
      
       - Callback-offloading updates, mainly simplifications
      
       - RCU-tasks updates, including some -rt fixups, handling of systems
         with sparse CPU numbering, and a fix for a boot-time race-condition
         failure
      
       - Put SRCU on a memory diet in order to reduce the size of the
         srcu_struct structure
      
       - Torture-test updates fixing some bugs in tests and closing some
         testing holes
      
       - Torture-test updates for the RCU tasks flavors, most notably ensuring
         that building rcutorture and friends does not change the
         RCU-tasks-related Kconfig options
      
       - Torture-test scripting updates
      
       - Expedited grace-period updates, most notably providing
         milliseconds-scale (not all that) soft real-time response from
         synchronize_rcu_expedited().
      
         This is also the first time in almost 30 years of RCU that someone
         other than me has pushed for a reduction in the RCU CPU stall-warning
         timeout, in this case by more than three orders of magnitude from 21
         seconds to 20 milliseconds. This tighter timeout applies only to
         expedited grace periods
      
      * tag 'rcu.2022.05.19a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (80 commits)
        rcu: Move expedited grace period (GP) work to RT kthread_worker
        rcu: Introduce CONFIG_RCU_EXP_CPU_STALL_TIMEOUT
        srcu: Drop needless initialization of sdp in srcu_gp_start()
        srcu: Prevent expedited GPs and blocking readers from consuming CPU
        srcu: Add contention check to call_srcu() srcu_data ->lock acquisition
        srcu: Automatically determine size-transition strategy at boot
        rcutorture: Make torture.sh allow for --kasan
        rcutorture: Make torture.sh refscale and rcuscale specify Tasks Trace RCU
        rcutorture: Make kvm.sh allow more memory for --kasan runs
        torture: Save "make allmodconfig" .config file
        scftorture: Remove extraneous "scf" from per_version_boot_params
        rcutorture: Adjust scenarios' Kconfig options for CONFIG_PREEMPT_DYNAMIC
        torture: Enable CSD-lock stall reports for scftorture
        torture: Skip vmlinux check for kvm-again.sh runs
        scftorture: Adjust for TASKS_RCU Kconfig option being selected
        rcuscale: Allow rcuscale without RCU Tasks Rude/Trace
        rcuscale: Allow rcuscale without RCU Tasks
        refscale: Allow refscale without RCU Tasks Rude/Trace
        refscale: Allow refscale without RCU Tasks
        rcutorture: Allow specifying per-scenario stat_interval
        ...
    • Merge tag 'lkmm.2022.05.20a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu · b2f02e9c
      Linus Torvalds authored
      Pull LKMM update from Paul McKenney:
       "This updates the klitmus7 compatibility table to indicate that
        herdtools7 7.56.1 or better is required for Linux kernel v5.17 or
        later"
      
      * tag 'lkmm.2022.05.20a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
        tools/memory-model/README: Update klitmus7 compat table
    • Merge tag 'nolibc.2022.05.20a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu · f814957b
      Linus Torvalds authored
      Pull nolibc library updates from Paul McKenney:
       "This adds a number of library functions and splits this library into
        multiple files"
      
      * tag 'nolibc.2022.05.20a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (61 commits)
        tools/nolibc/string: Implement `strdup()` and `strndup()`
        tools/nolibc/string: Implement `strnlen()`
        tools/nolibc/stdlib: Implement `malloc()`, `calloc()`, `realloc()` and `free()`
        tools/nolibc/types: Implement `offsetof()` and `container_of()` macro
        tools/nolibc/sys: Implement `mmap()` and `munmap()`
        tools/nolibc: i386: Implement syscall with 6 arguments
        tools/nolibc: Remove .global _start from the entry point code
        tools/nolibc: Replace `asm` with `__asm__`
        tools/nolibc: x86-64: Update System V ABI document link
        tools/nolibc/stdlib: only reference the external environ when inlined
        tools/nolibc/string: do not use __builtin_strlen() at -O0
        tools/nolibc: add the nolibc subdir to the common Makefile
        tools/nolibc: add a makefile to install headers
        tools/nolibc/types: add poll() and waitpid() flag definitions
        tools/nolibc/sys: add syscall definition for getppid()
        tools/nolibc/string: add strcmp() and strncmp()
        tools/nolibc/stdio: add support for '%p' to vfprintf()
        tools/nolibc/stdlib: add a simple getenv() implementation
        tools/nolibc/stdio: make printf(%s) accept NULL
        tools/nolibc/stdlib: implement abort()
        ...
    • Merge tag 'efi-next-for-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi · bf243102
      Linus Torvalds authored
      Pull EFI updates from Ard Biesheuvel:
      
       - Allow runtime services to be re-enabled at boot on RT kernels.
      
       - Provide access to secrets injected into the boot image by CoCo
         hypervisors (COnfidential COmputing)
      
       - Use DXE services on x86 to make the boot image executable after
         relocation, if needed.
      
       - Prefer mirrored memory for randomized allocations.
      
       - Only randomize the placement of the kernel image on arm64 if the
         loader has not already done so.
      
       - Add support for obtaining the boot hartid from EFI on RISC-V.
      
      * tag 'efi-next-for-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
        riscv/efi_stub: Add support for RISCV_EFI_BOOT_PROTOCOL
        efi: stub: prefer mirrored memory for randomized allocations
        efi/arm64: libstub: run image in place if randomized by the loader
        efi: libstub: pass image handle to handle_kernel_image()
        efi: x86: Set the NX-compatibility flag in the PE header
        efi: libstub: ensure allocated memory to be executable
        efi: libstub: declare DXE services table
        efi: Add missing prototype for efi_capsule_setup_info
        docs: security: Add secrets/coco documentation
        efi: Register efi_secret platform device if EFI secret area is declared
        virt: Add efi_secret module to expose confidential computing secrets
        efi: Save location of EFI confidential computing area
        efi: Allow to enable EFI runtime services by default on RT
  2. 22 May, 2022 4 commits
    • Linux 5.18 · 4b0986a3
      Linus Torvalds authored
    • afs: Fix afs_getattr() to refetch file status if callback break occurred · 2aeb8c86
      David Howells authored
      If a callback break occurs (change notification), afs_getattr() needs to
      issue an FS.FetchStatus RPC operation to update the status of the file
      being examined by the stat-family of system calls.
      
      Fix afs_getattr() to do this if AFS_VNODE_CB_PROMISED has been cleared
      on a vnode by a callback break.  Skip this if AT_STATX_DONT_SYNC is set.
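
The decision described above can be sketched as follows. The flag names and bit values are hypothetical stand-ins for AFS_VNODE_CB_PROMISED and AT_STATX_DONT_SYNC, and the helper is illustrative only, not the code in afs_getattr().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; the kernel's real definitions differ. */
#define SKETCH_VNODE_CB_PROMISED (1u << 0) /* server still promises a callback */
#define SKETCH_STATX_DONT_SYNC   (1u << 1) /* caller accepts cached attributes */

/*
 * Returns true when a status fetch should be issued before answering
 * a stat-family call: the callback promise is gone and the caller has
 * not asked to avoid synchronising with the server.
 */
bool sketch_needs_fetch_status(unsigned int vnode_flags, unsigned int query_flags)
{
    if (query_flags & SKETCH_STATX_DONT_SYNC)
        return false;
    return !(vnode_flags & SKETCH_VNODE_CB_PROMISED);
}
```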
      
      This can be tested by appending to a file on one AFS client and then
      using "stat -L" to examine its length on a machine running kafs.  This
      can also be watched through tracing on the kafs machine.  The callback
      break is seen:
      
           kworker/1:1-46      [001] .....   978.910812: afs_cb_call: c=0000005f YFSCB.CallBack
           kworker/1:1-46      [001] ...1.   978.910829: afs_cb_break: 100058:23b4c:242d2c2 b=2 s=1 break-cb
           kworker/1:1-46      [001] .....   978.911062: afs_call_done:    c=0000005f ret=0 ab=0 [0000000082994ead]
      
      And then the stat command generated no traffic if unpatched, but with
      this change a call to fetch the status can be observed:
      
                  stat-4471    [000] .....   986.744122: afs_make_fs_call: c=000000ab 100058:023b4c:242d2c2 YFS.FetchStatus
                  stat-4471    [000] .....   986.745578: afs_call_done:    c=000000ab ret=0 ab=0 [0000000087fc8c84]
      
      Fixes: 08e0e7c8 ("[AF_RXRPC]: Make the in-kernel AFS filesystem use AF_RXRPC.")
      Reported-by: Markus Suvanto <markus.suvanto@gmail.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
      Tested-by: Markus Suvanto <markus.suvanto@gmail.com>
      Tested-by: kafs-testing+fedora34_64checkkafs-build-496@auristor.com
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=216010
      Link: https://lore.kernel.org/r/165308359800.162686.14122417881564420962.stgit@warthog.procyon.org.uk/ # v1
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · 978df3e1
      Linus Torvalds authored
      Pull i2c fixes from Wolfram Sang:
       "Some I2C driver bugfixes for 5.18. Nothing spectacular but worth
        fixing"
      
      * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        drivers: i2c: thunderx: Allow driver to work with ACPI defined TWSI controllers
        i2c: ismt: Provide a DMA buffer for Interrupt Cause Logging
        i2c: mt7621: fix missing clk_disable_unprepare() on error in mtk_i2c_probe()
    • Merge tag 'perf-tools-fixes-for-v5.18-2022-05-21' of... · eaea45fc
      Linus Torvalds authored
      Merge tag 'perf-tools-fixes-for-v5.18-2022-05-21' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
      
      Pull perf tools fixes from Arnaldo Carvalho de Melo:
      
       - Fix and validate CPU map inputs in synthetic PERF_RECORD_STAT events
         in 'perf stat'.
      
       - Fix x86's arch__intr_reg_mask() for the hybrid platform.
      
       - Address 'perf bench numa' compiler error on s390.
      
       - Fix check for btf__load_from_kernel_by_id() in libbpf.
      
       - Fix "all PMU test" 'perf test' to skip hv_24x7/hv_gpci tests on
         powerpc.
      
       - Fix session topology test to skip the test in guest environment.
      
       - Skip BPF 'perf test' if clang is not present.
      
       - Avoid shell test description infinite loop in 'perf test'.
      
       - Fix Intel LBR callstack entries and nr print message.
      
      * tag 'perf-tools-fixes-for-v5.18-2022-05-21' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
        perf session: Fix Intel LBR callstack entries and nr print message
        perf test bpf: Skip test if clang is not present
        perf test session topology: Fix test to skip the test in guest environment
        perf bench numa: Address compiler error on s390
        perf test: Avoid shell test description infinite loop
        perf regs x86: Fix arch__intr_reg_mask() for the hybrid platform
        perf test: Fix "all PMU test" to skip hv_24x7/hv_gpci tests on powerpc
        perf stat: Fix and validate CPU map inputs in synthetic PERF_RECORD_STAT events
        perf build: Fix check for btf__load_from_kernel_by_id() in libbpf
  3. 21 May, 2022 16 commits
    • Merge tag 'input-for-v5.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input · 4c493b1a
      Linus Torvalds authored
      Pull input fixes from Dmitry Torokhov:
       "A small fixup to ili210x touchscreen driver, and updated maintainer
        entry for the device tree binding of Mediatek 6779 keypad:
      
         - fix reset timing of Ilitek touchscreens
      
         - update maintainer entry of DT binding of Mediatek 6779 keypad"
      
      * tag 'input-for-v5.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
        Input: ili210x - use one common reset implementation
        Input: ili210x - fix reset timing
        dt-bindings: input: mediatek,mt6779-keypad: update maintainer
    • Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 36ed2da7
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "Two patches, both in drivers.
      
        The iscsi one is fixing the cpumask issue you commented on and the ufs
        one is a late arriving fix for conditions that can occur in Host
        Performance Booster reads"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: ufs: core: Fix referencing invalid rsp field
        scsi: target: Fix incorrect use of cpumask_t
    • perf session: Fix Intel LBR callstack entries and nr print message · 51d0bf99
      Chengdong Li authored
      When generating callstack information from branch_stack (Intel LBR), the
      number of callstack entries should be one more than the number of
      branch_stack entries, for example:
      
      	branch_stack records:
      		B() -> C()
      		A() -> B()
      	converted callstack records should be:
      		C()
      		B()
      		A()
      That is, the number of callstack entries equals
      the number of branch stack entries plus 1.
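
A simplified sketch of the counting argument. It assumes branch records are ordered newest first and that each "from" address stands for the calling function; the struct and function names are hypothetical and this is not the conversion perf itself performs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified branch record. */
struct sketch_branch {
    uint64_t from; /* caller */
    uint64_t to;   /* callee */
};

/*
 * Flatten `nr` branch records (newest first) into a callstack.
 * The newest callee comes first, followed by every caller, so the
 * result holds nr + 1 entries. Returns the number of entries written.
 */
size_t sketch_branch_to_callstack(const struct sketch_branch *br, size_t nr,
                                  uint64_t *out, size_t out_cap)
{
    size_t i, n = 0;

    if (nr == 0)
        return 0;
    assert(out_cap >= nr + 1); /* one more slot than branch records */

    out[n++] = br[0].to;
    for (i = 0; i < nr; i++)
        out[n++] = br[i].from;
    return n;
}
```

For the records B() -> C() and A() -> B(), the function writes C, B, A: three entries from two branch records.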
      
      This patch fixes the above issue in branch_stack__printf(). For example,
      
      	# echo 'scale=2000; 4*a(1)' > cmd
      	# perf record --call-graph lbr bc -l < cmd
      
      Before applying this patch, `perf script -D` output:
      
      	1220022677386876 0x2a40 [0xd8]: PERF_RECORD_SAMPLE(IP, 0x4002): 17990/17990: 0x40a6d6 period: 894172 addr: 0
      	... LBR call chain: nr:8
      	.....  0: fffffffffffffe00
      	.....  1: 000000000040a410
      	.....  2: 000000000040573c
      	.....  3: 0000000000408650
      	.....  4: 00000000004022f2
      	.....  5: 00000000004015f5
      	.....  6: 00007f5ed6dcb553
      	.....  7: 0000000000401698
      	... FP chain: nr:2
      	.....  0: fffffffffffffe00
      	.....  1: 000000000040a6d8
      	... branch callstack: nr:6    # which is not consistent with LBR records.
      	.....  0: 000000000040a410
      	.....  1: 0000000000408650    # ditto
      	.....  2: 00000000004022f2
      	.....  3: 00000000004015f5
      	.....  4: 00007f5ed6dcb553
      	.....  5: 0000000000401698
      	 ... thread: bc:17990
      	 ...... dso: /usr/bin/bc
      	bc 17990 1220022.677386:     894172 cycles:
      			  40a410 [unknown] (/usr/bin/bc)
      			  40573c [unknown] (/usr/bin/bc)
      			  408650 [unknown] (/usr/bin/bc)
      			  4022f2 [unknown] (/usr/bin/bc)
      			  4015f5 [unknown] (/usr/bin/bc)
      		    7f5ed6dcb553 __libc_start_main+0xf3 (/usr/lib64/libc-2.17.so)
      			  401698 [unknown] (/usr/bin/bc)
      
      After applying this patch:
      
      	1220022677386876 0x2a40 [0xd8]: PERF_RECORD_SAMPLE(IP, 0x4002): 17990/17990: 0x40a6d6 period: 894172 addr: 0
      	... LBR call chain: nr:8
      	.....  0: fffffffffffffe00
      	.....  1: 000000000040a410
      	.....  2: 000000000040573c
      	.....  3: 0000000000408650
      	.....  4: 00000000004022f2
      	.....  5: 00000000004015f5
      	.....  6: 00007f5ed6dcb553
      	.....  7: 0000000000401698
      	... FP chain: nr:2
      	.....  0: fffffffffffffe00
      	.....  1: 000000000040a6d8
      	... branch callstack: nr:7
      	.....  0: 000000000040a410
      	.....  1: 000000000040573c
      	.....  2: 0000000000408650
      	.....  3: 00000000004022f2
      	.....  4: 00000000004015f5
      	.....  5: 00007f5ed6dcb553
      	.....  6: 0000000000401698
      	 ... thread: bc:17990
      	 ...... dso: /usr/bin/bc
      	bc 17990 1220022.677386:     894172 cycles:
      			  40a410 [unknown] (/usr/bin/bc)
      			  40573c [unknown] (/usr/bin/bc)
      			  408650 [unknown] (/usr/bin/bc)
      			  4022f2 [unknown] (/usr/bin/bc)
      			  4015f5 [unknown] (/usr/bin/bc)
      		    7f5ed6dcb553 __libc_start_main+0xf3 (/usr/lib64/libc-2.17.so)
      			  401698 [unknown] (/usr/bin/bc)
      
      Change from v1:
      	- refined code style according to Jiri's review comments.
      Signed-off-by: Chengdong Li <chengdongli@tencent.com>
      Acked-by: Namhyung Kim <namhyung@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: German Gomez <german.gomez@arm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Cc: likexu@tencent.com
      Link: https://lore.kernel.org/r/20220517015726.96131-1-chengdongli@tencent.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf test bpf: Skip test if clang is not present · 8994e97b
      Athira Rajeev authored
      The perf BPF filter test fails in environments where "clang" is not
      installed.
      
      Test failure logs:
      
      <<>>
       42: BPF filter                    :
       42.1: Basic BPF filtering         : Skip
       42.2: BPF pinning                 : FAILED!
       42.3: BPF prologue generation     : FAILED!
      <<>>
      
      Enabling the verbose option produced debug logs saying that clang/llvm
      needs to be installed. Snippet of verbose logs:
      
      <<>>
       42.2: BPF pinning                  :
       --- start ---
      test child forked, pid 61423
      ERROR:	unable to find clang.
      Hint:	Try to install latest clang/llvm to support BPF.
              Check your $PATH
      
      <<logs_here>>
      
      Failed to compile test case: 'Basic BPF llvm compile'
      Unable to get BPF object, fix kbuild first
      test child finished with -1
       ---- end ----
      BPF filter subtest 2: FAILED!
      <<>>
      
      Here the subtests "BPF pinning" and "BPF prologue generation" failed and
      the logs show that clang/llvm is needed. After installing clang, the
      testcase passes.

      Reason why the subtests fail even though the logs carry proper debug
      information:

      The main function __test__bpf calls test_llvm__fetch_bpf_obj, passing
      the 4th argument as true (the 4th argument maps to the parameter
      "force" in test_llvm__fetch_bpf_obj). This causes
      test_llvm__fetch_bpf_obj to skip the check for clang/llvm.
      
      Snippet of code part which checks for clang based on
      parameter "force" in test_llvm__fetch_bpf_obj:
      
      <<>>
      if (!force && (!llvm_param.user_set_param &&
      <<>>
      
      Since force is set to "true", the test won't get skipped and fails to
      compile the test case. The BPF code compilation needs clang, so pass the
      fourth argument as "false" and also skip the test if the reason for
      return is "TEST_SKIP".
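
The effect of the "force" parameter can be sketched as below. The enum, its values and the function are hypothetical illustrations of the pattern, not the body of test_llvm__fetch_bpf_obj; clang availability is passed in as a plain flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result codes for the sketch. */
enum sketch_result {
    SKETCH_TEST_OK = 0,
    SKETCH_TEST_FAIL = -1,
    SKETCH_TEST_SKIP = -2
};

/*
 * With force == false a missing compiler leads to a skip.
 * With force == true the availability check is bypassed, so a
 * missing compiler surfaces later as a compile failure.
 */
enum sketch_result sketch_fetch_bpf_obj(bool force, bool clang_found)
{
    if (!force && !clang_found)
        return SKETCH_TEST_SKIP;

    /* Stand-in for the actual compile step. */
    return clang_found ? SKETCH_TEST_OK : SKETCH_TEST_FAIL;
}
```

Under these assumptions, a caller that passes force as false and propagates SKETCH_TEST_SKIP reports "Skip" instead of "FAILED!" when clang is absent.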
      
      After the patch:
      
      <<>>
       42: BPF filter                    :
       42.1: Basic BPF filtering         : Skip
       42.2: BPF pinning                 : Skip
       42.3: BPF prologue generation     : Skip
      <<>>
      
      Fixes: ba1fae43 ("perf test: Add 'perf test BPF'")
      Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
      Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: https://lore.kernel.org/r/20220511115438.84032-1-atrajeev@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf test session topology: Fix test to skip the test in guest environment · cfd7092c
      Athira Rajeev authored
      The session topology test fails on the powerpc pSeries platform.
      
      Test logs:
      
        <<>>
        Session topology : FAILED!
        <<>>
      
      This testcase tests the cpu topology by checking the core_id and
      socket_id stored in perf_env from the perf session. The data from the
      perf session is compared with the cpu topology information from
      "/sys/devices/system/cpu/cpuX/topology" like core_id,
      physical_package_id.

      In a virtual environment, details like physical_package_id are not
      exposed. Hence physical_package_id is set to -1. The testcase fails on
      such platforms since socket_id can't be fetched from the topology info.
      
      Skip the testcase in powerpc if physical_package_id returns -1.
      Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
      Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Tested-by: Disha Goel <disgoel@linux.vnet.ibm.com>
      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
      Cc: linuxppc-dev@lists.ozlabs.org
      Link: https://lore.kernel.org/r/20220511114959.84002-1-atrajeev@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf bench numa: Address compiler error on s390 · f8ac1c47
      Thomas Richter authored
      The compilation on s390 results in this error:
      
        # make DEBUG=y bench/numa.o
        ...
        bench/numa.c: In function ‘__bench_numa’:
        bench/numa.c:1749:81: error: ‘%d’ directive output may be truncated
                    writing between 1 and 11 bytes into a region of size between
                    10 and 20 [-Werror=format-truncation=]
        1749 |        snprintf(tname, sizeof(tname), "process%d:thread%d", p, t);
                                                                     ^~
        ...
        bench/numa.c:1749:64: note: directive argument in the range
                       [-2147483647, 2147483646]
        ...
        #
      
      The maximum length of the %d replacement is 11 characters because of the
      negative sign.  Therefore extend the array by two more characters.
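
The worst-case length can be checked with a small sketch. It assumes a 32-bit int and uses the snprintf(NULL, 0, ...) idiom to measure the formatted length; the helper name is hypothetical and the actual array size in bench/numa.c is not restated here.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/*
 * Returns the buffer size, including the terminating NUL, needed to
 * format "process%d:thread%d" for the given values without truncation.
 */
size_t sketch_tname_bytes_needed(int p, int t)
{
    /* With a zero-sized buffer, snprintf only reports the length. */
    int len = snprintf(NULL, 0, "process%d:thread%d", p, t);

    assert(len >= 0);
    return (size_t)len + 1;
}
```

With both arguments at INT_MIN, each %d expands to 11 characters, so the string is 7 + 11 + 7 + 11 = 36 characters long and needs 37 bytes including the NUL.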
      
      Output after:
      
        # make  DEBUG=y bench/numa.o > /dev/null 2>&1; ll bench/numa.o
        -rw-r--r-- 1 root root 418320 May 19 09:11 bench/numa.o
        #
      
      Fixes: 3aff8ba0 ("perf bench numa: Avoid possible truncation when using snprintf()")
      Suggested-by: Namhyung Kim <namhyung@gmail.com>
      Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
      Cc: Sven Schnelle <svens@linux.ibm.com>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Link: https://lore.kernel.org/r/20220520081158.2990006-1-tmricht@linux.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf test: Avoid shell test description infinite loop · caaaa554
      Ian Rogers authored
      for_each_shell_test() is already strict in expecting tests to be files
      and executable. It is sometimes possible when it iterates over all files
      that it finds one that is executable and lacks a newline character. When
      this happens the loop never terminates as it doesn't check for EOF.
      
      Add the EOF check to make this loop at least bounded by the file size.
      
      If the description is returned as NULL then also skip the test.
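
A minimal sketch of a line-skipping loop bounded by EOF. It uses stdio for brevity, which is an assumption and not how perf reads the shell test files; the function name is hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Consume characters up to and including the next newline.
 * Returns 0 if a newline was found and -1 if end of file (or a read
 * error) came first, so the loop is bounded by the file size.
 */
int sketch_skip_line(FILE *fp)
{
    int ch; /* int, not char, so EOF stays distinguishable */

    assert(fp != NULL);
    do {
        ch = fgetc(fp);
    } while (ch != '\n' && ch != EOF);

    return ch == EOF ? -1 : 0;
}
```

A caller that sees -1 can treat the description as missing and skip the test, in line with the NULL handling described above.
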
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Marco Elver <elver@google.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Riccardo Mancini <rickyman7@gmail.com>
      Cc: Sohaib Mohamed <sohaib.amhmd@gmail.com>
      Cc: Stephane Eranian <eranian@google.com>
      Link: https://lore.kernel.org/r/20220517204144.645913-1-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      caaaa554
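The EOF check described in the commit above amounts to the loop shape below. This is a sketch under assumptions: `read_first_line()` is a hypothetical helper, not the perf function, and it only shows how testing for EOF bounds the loop by the file size.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: copy the first line of fp into buf (at most size - 1
 * characters) and return its length, or -1 if there was nothing to read.
 */
static int read_first_line(FILE *fp, char *buf, size_t size)
{
    size_t len = 0;
    int ch;

    assert(fp != NULL && buf != NULL && size > 0);

    /* Stop on newline or EOF, so a file without '\n' cannot loop forever. */
    while ((ch = fgetc(fp)) != EOF && ch != '\n') {
        if (len + 1 < size)
            buf[len++] = (char)ch;
    }
    buf[len] = '\0';

    /* Empty stream: report "no description" so the caller can skip it. */
    if (len == 0 && ch == EOF)
        return -1;
    return (int)len;
}
```

Returning -1 for an empty stream mirrors the commit's second point, that a missing description should cause the test to be skipped.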
    • Kan Liang's avatar
      perf regs x86: Fix arch__intr_reg_mask() for the hybrid platform · 01b28e4a
      Kan Liang authored
      The X86-specific arch__intr_reg_mask() checks whether the kernel
      and hardware can collect XMM registers, but it doesn't work on some
      hybrid platforms.
      
      Without the patch on ADL-N:
      
        $ perf record -I?
        available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10
        R11 R12 R13 R14 R15
      
      The config of the test event doesn't contain the PMU information, so
      the kernel may fail to initialize it on the correct hybrid PMU and
      wrongly report that the registers are not supported.
      
      Add the PMU information into the config for the hybrid platform. The
      same register set is supported among different hybrid PMUs. Checking
      the first available one is good enough.
      
      With the patch on ADL-N:
      
        $ perf record -I?
        available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10
        R11 R12 R13 R14 R15 XMM0 XMM1 XMM2 XMM3 XMM4 XMM5 XMM6 XMM7 XMM8 XMM9
        XMM10 XMM11 XMM12 XMM13 XMM14 XMM15
      
      Fixes: 6466ec14 ("perf regs x86: Add X86 specific arch__intr_reg_mask()")
      Reported-by: Ammy Yi <ammy.yi@intel.com>
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
      Link: https://lore.kernel.org/r/20220518145125.1494156-1-kan.liang@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      01b28e4a
    • Athira Rajeev's avatar
      perf test: Fix "all PMU test" to skip hv_24x7/hv_gpci tests on powerpc · 451ed805
      Athira Rajeev authored
      "perf all PMU test" picks the input events from "perf list --raw-dump
      pmu" list and runs "perf stat -e" for each event in the list. On
      powerpc, the PowerVM environment supports events from the hv_24x7
      and hv_gpci PMUs, which use a format like the examples below:
      
      - hv_24x7/CPM_ADJUNCT_INST,domain=?,core=?/
      - hv_gpci/event,partition_id=?/
      
      The value for "?" needs to be filled in depending on the system and
      the respective event. CPM_ADJUNCT_INST needs core and domain
      values, and the hv_gpci event needs a partition_id.  Similarly, other
      hv_24x7 and hv_gpci events have "?" in their event format. Hence skip
      these events on the powerpc platform, since values like partition_id
      and domain are specific to the system and event.
      
      Fixes: 3d5ac9ef ("perf test: Workload test of all PMUs")
      Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
      Link: https://lore.kernel.org/r/20220520101236.17249-1-atrajeev@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      451ed805
    • Jens Axboe's avatar
      io_uring: cleanup handling of the two task_work lists · 3fe07bcd
      Jens Axboe authored
      Rather than pass in a bool for whether or not this work item needs to go
      into the priority list, provide separate helpers for it. For most
      use cases, this also gets rid of the branch for non-priority task
      work.
      
      While at it, rename the prior_task_list to prio_task_list. Prior is
      a confusing name for it, as it would seem to indicate that this is the
      previous task_work list. prio makes it clear that this is a priority
      task_work list.
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      3fe07bcd
    • Piyush Malgujar's avatar
      drivers: i2c: thunderx: Allow driver to work with ACPI defined TWSI controllers · 03a35bc8
      Piyush Malgujar authored
      Due to i2c->adap.dev.fwnode not being set, ACPI_COMPANION() wasn't properly
      found for TWSI controllers.
      Signed-off-by: Szymon Balcerak <sbalcerak@marvell.com>
      Signed-off-by: Piyush Malgujar <pmalgujar@marvell.com>
      Signed-off-by: Wolfram Sang <wsa@kernel.org>
      03a35bc8
    • Mika Westerberg's avatar
      i2c: ismt: Provide a DMA buffer for Interrupt Cause Logging · 17a0f3ac
      Mika Westerberg authored
      Before sending an MSI the hardware writes information pertinent to the
      interrupt cause to a memory location pointed to by the SMTICL register.
      This memory holds three double words where the least significant bit tells
      whether the interrupt cause of master/target/error is valid. The driver
      does not use this but we need to set it up because otherwise the hardware
      will perform a DMA write to the default address (0) and this will cause an
      IOMMU fault such as below:
      
        DMAR: DRHD: handling fault status reg 2
        DMAR: [DMA Write] Request device [00:12.0] PASID ffffffff fault addr 0
              [fault reason 05] PTE Write access is not set
      
      To prevent this from happening, provide a proper DMA buffer for this
      that then gets mapped by the IOMMU accordingly.
      Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
      Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Signed-off-by: Wolfram Sang <wsa@kernel.org>
      17a0f3ac
    • Yang Yingliang's avatar
      i2c: mt7621: fix missing clk_disable_unprepare() on error in mtk_i2c_probe() · a2537c98
      Yang Yingliang authored
      Fix the missing clk_disable_unprepare() before return
      from mtk_i2c_probe() in the error handling case.
      
      Fixes: d04913ec ("i2c: mt7621: Add MediaTek MT7621/7628/7688 I2C driver")
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      Reviewed-by: Stefan Roese <sr@denx.de>
      Signed-off-by: Wolfram Sang <wsa@kernel.org>
      a2537c98
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 6c3f5bec
      Linus Torvalds authored
      Pull kvm fixes from Paolo Bonzini:
       "ARM:
      
         - Correctly expose GICv3 support even if no irqchip is created so
           that userspace doesn't observe it changing pointlessly (fixing a
           regression with QEMU)
      
         - Don't issue a hypercall to set the id-mapped vectors when protected
           mode is enabled (fix for pKVM in combination with CPUs affected by
           Spectre-v3a)
      
        x86 (five oneliners, of which the most interesting two are):
      
         - a NULL pointer dereference on INVPCID executed with paging
           disabled, but only if KVM is using shadow paging
      
         - an incorrect bsearch comparison function which could truncate the
           result and apply PMU event filtering incorrectly. This one comes
           with a selftests update too"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: x86/mmu: fix NULL pointer dereference on guest INVPCID
        KVM: x86: hyper-v: fix type of valid_bank_mask
        KVM: Free new dirty bitmap if creating a new memslot fails
        KVM: eventfd: Fix false positive RCU usage warning
        selftests: kvm/x86: Verify the pmu event filter matches the correct event
        selftests: kvm/x86: Add the helper function create_pmu_event_filter
        kvm: x86/pmu: Fix the compare function used by the pmu event filter
        KVM: arm64: Don't hypercall before EL2 init
        KVM: arm64: vgic-v3: Consistently populate ID_AA64PFR0_EL1.GIC
        KVM: x86/mmu: Update number of zapped pages even if page list is stable
      6c3f5bec
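The bsearch comparator problem mentioned in the x86 notes above can be illustrated with a small sketch. It assumes the truncation came from returning a 64-bit difference as an `int`, which is a common form of this bug; all function names here are hypothetical and this is not the KVM code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Buggy pattern (assumed for illustration): the 64-bit difference is
 * converted to int, so keys that differ only in the upper 32 bits can
 * compare as equal on common compilers.
 */
static int cmp_u64_truncating(const void *a, const void *b)
{
    return (int)(*(const uint64_t *)a - *(const uint64_t *)b);
}

/* Safe pattern: derive the sign from comparisons only. */
static int cmp_u64(const void *a, const void *b)
{
    uint64_t x = *(const uint64_t *)a;
    uint64_t y = *(const uint64_t *)b;

    return (x > y) - (x < y);
}

/* Hypothetical lookup over a sorted array of 64-bit event keys. */
static int key_in_filter(const uint64_t *sorted, size_t n, uint64_t key)
{
    assert(sorted != NULL || n == 0);
    return bsearch(&key, sorted, n, sizeof(key), cmp_u64) != NULL;
}
```

With the truncating comparator, a lookup can report a match for a key that is not in the filter, which is the kind of misapplied filtering the pull request describes.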
    • Linus Torvalds's avatar
      Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux · b3454ce0
      Linus Torvalds authored
      Pull clk fixes from Stephen Boyd:
       "Three clk driver fixes to close out the release
      
         - Fix a divider calculation breaking boot on Broadcom bcm2835
      
         - Fix HDMI output on Tanix TX6 mini board by reverting a patch
      
         - Fix clk_set_rate_range() calls on at91 by considering the range
           while calculating the divisor"
      
      * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
        clk: at91: generated: consider range when calculating best rate
        Revert "clk: sunxi-ng: sun6i-rtc: Add support for H6"
        clk: bcm2835: fix bcm2835_clock_choose_div
      b3454ce0
    • Linus Torvalds's avatar
      Merge tag 'drm-fixes-2022-05-21' of git://anongit.freedesktop.org/drm/drm · 93413c84
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Few final fixes for 5.18, one amdgpu, core dp mst leak fix, dma-buf
        two fixes, and i915 has a few fixes, one for a regression on older
        GM45 chipsets.
      
        dma-buf:
         - ioctl userspace use fix
         - fix dma-buf sysfs name generation
      
        core:
         - dp/mst leak fix
      
        amdgpu:
         - suspend/resume regression fix
      
        i915:
         - fix for #5806: GPU hangs and display artifacts on Intel GM45
         - reject DMC with out-of-spec MMIO
         - correctly mark guilty contexts on GuC reset"
      
      * tag 'drm-fixes-2022-05-21' of git://anongit.freedesktop.org/drm/drm:
        drm/i915: Use i915_gem_object_ggtt_pin_ww for reloc_iomap
        drm/amd: Don't reset dGPUs if the system is going to s2idle
        drm/dp/mst: fix a possible memory leak in fetch_monitor_name()
        dma-buf: fix use of DMA_BUF_SET_NAME_{A,B} in userspace
        i915/guc/reset: Make __guc_reset_context aware of guilty engines
        drm/i915/dmc: Add MMIO range restrictions
        dma-buf: ensure unique directory name for dmabuf stats
      93413c84
  4. 20 May, 2022 11 commits
    • Dave Airlie's avatar
      Merge tag 'drm-intel-fixes-2022-05-20' of... · 64eea680
      Dave Airlie authored
      Merge tag 'drm-intel-fixes-2022-05-20' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
      
      - fix for #5806: GPU hangs and display artifacts on 5.18-rc3 on Intel GM45
      - reject DMC with out-of-spec MMIO (Cc: stable)
      - correctly mark guilty contexts on GuC reset.
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/YocqqvG6PbYx3QgJ@jlahtine-mobl.ger.corp.intel.com
      64eea680
    • Dave Airlie's avatar
      Merge tag 'drm-misc-fixes-2022-05-20' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes · 6e4a61cd
      Dave Airlie authored
      Fix for a memory leak in dp_mst, a (userspace) build fix for
      DMA_BUF_SET_NAME defines and a directory name generation fix for dmabuf
      stats
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      
      From: Maxime Ripard <maxime@cerno.tech>
      Link: https://patchwork.freedesktop.org/patch/msgid/20220520072408.cpjzy2taugagvrh7@houat
      6e4a61cd
    • Peter Zijlstra's avatar
      perf: Fix sys_perf_event_open() race against self · 3ac6487e
      Peter Zijlstra authored
      Norbert reported that it's possible to race sys_perf_event_open() such
      that the loser ends up in another context from the group leader,
      triggering many WARNs.
      
      The move_group case checks for races against itself, but the
      !move_group case doesn't, seemingly relying on the previous
      group_leader->ctx == ctx check. However, that check is racy due to not
      holding any locks at that time.
      
      Therefore, re-check the result after acquiring locks and bail out
      if they no longer match.
      
      Additionally, clarify the not_move_group case from the
      move_group-vs-move_group race.
      
      Fixes: f63a8daa ("perf: Fix event->ctx locking")
      Reported-by: Norbert Slusarek <nslusarek@gmx.net>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      3ac6487e
    • Linus Torvalds's avatar
      Merge tag 'gpio-fixes-for-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux · 3b5e1590
      Linus Torvalds authored
      Pull gpio fixes from Bartosz Golaszewski:
      
       - fix bitops logic in gpio-vf610
      
       - return an error if the user tries to use inverted polarity in
         gpio-mvebu
      
      * tag 'gpio-fixes-for-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
        gpio: mvebu/pwm: Refuse requests with inverted polarity
        gpio: gpio-vf610: do not touch other bits when set the target bit
      3b5e1590
    • Linus Torvalds's avatar
      Merge tag 'mmc-v5.18-rc4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc · 317de3db
      Linus Torvalds authored
      Pull MMC fix from Ulf Hansson:
       "MMC core:
      
         - Fix busy polling for MMC_SEND_OP_COND again"
      
      * tag 'mmc-v5.18-rc4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
        mmc: core: Fix busy polling for MMC_SEND_OP_COND again
      317de3db
    • Linus Torvalds's avatar
      Merge tag 'ceph-for-5.18-rc8' of https://github.com/ceph/ceph-client · b851c1f8
      Linus Torvalds authored
      Pull ceph fix from Ilya Dryomov:
       "A fix for a nasty use-after-free, marked for stable"
      
      * tag 'ceph-for-5.18-rc8' of https://github.com/ceph/ceph-client:
        libceph: fix misleading ceph_osdc_cancel_request() comment
        libceph: fix potential use-after-free on linger ping and resends
      b851c1f8
    • Linus Torvalds's avatar
      Merge tag 'riscv-for-linus-5.18-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux · 265f34c2
      Linus Torvalds authored
      Pull RISC-V fixes from Palmer Dabbelt:
      
       - fix the fu540-c000 device tree to avoid a schema check failure on the
         DMA node name
      
       - fix typo in the PolarFire SOC device tree
      
      * tag 'riscv-for-linus-5.18-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
        riscv: dts: microchip: fix gpio1 reg property typo
        riscv: dts: sifive: fu540-c000: align dma node name with dtschema
      265f34c2
    • Linus Torvalds's avatar
      Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · a956f4e2
      Linus Torvalds authored
      Pull arm64 fixes from Will Deacon:
       "Three arm64 fixes for -rc8/final.
      
        The MTE and stolen time fixes have been doing the rounds for a little
        while, but review and testing feedback was ongoing until earlier this
        week. The kexec fix showed up on Monday and addresses a failure
        observed under Qemu.
      
        Summary:
      
         - Add missing write barrier to publish MTE tags before a pte update
      
         - Fix kexec relocation clobbering its own data structures
      
         - Fix stolen time crash if a timer IRQ fires during CPU hotplug"
      
      * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
        arm64: mte: Ensure the cleared tags are visible before setting the PTE
        arm64: kexec: load from kimage prior to clobbering
        arm64: paravirt: Use RCU read locks to guard stolen_time
      a956f4e2
    • Paolo Bonzini's avatar
      KVM: x86/mmu: fix NULL pointer dereference on guest INVPCID · 9f46c187
      Paolo Bonzini authored
      With shadow paging enabled, the INVPCID instruction results in a call
      to kvm_mmu_invpcid_gva.  If INVPCID is executed with CR0.PG=0, the
      invlpg callback is not set and the result is a NULL pointer dereference.
      Fix it trivially by checking for mmu->invlpg before every call.
      
      There are other possibilities:
      
      - check for CR0.PG, because KVM (like all Intel processors after P5)
        flushes guest TLB on CR0.PG changes so that INVPCID/INVLPG are a
        nop with paging disabled
      
      - check for EFER.LMA, because KVM syncs and flushes when switching
        MMU contexts outside of 64-bit mode
      
      All of these are tricky, so go for the simple solution.  This is CVE-2022-1789.
      Reported-by: Yongkang Jia <kangel@zju.edu.cn>
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      9f46c187
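The fix described in the commit above is a guard on an optional callback. A minimal sketch of that pattern follows; `struct mmu_ctx`, `record_flush()` and `flush_gva()` are hypothetical stand-ins and do not mirror the KVM structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an MMU context with an optional callback. */
struct mmu_ctx {
    void (*invlpg)(struct mmu_ctx *mmu, unsigned long gva);
    unsigned long last_flushed;
};

/* Example callback: remember which address was flushed. */
static void record_flush(struct mmu_ctx *mmu, unsigned long gva)
{
    mmu->last_flushed = gva;
}

/* Returns 1 if the callback ran, 0 if it was unset and the call was skipped. */
static int flush_gva(struct mmu_ctx *mmu, unsigned long gva)
{
    assert(mmu != NULL);

    /* The callback may be unset (e.g. paging disabled), so check first. */
    if (!mmu->invlpg)
        return 0;

    mmu->invlpg(mmu, gva);
    return 1;
}
```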
    • Yury Norov's avatar
      KVM: x86: hyper-v: fix type of valid_bank_mask · ea8c66fe
      Yury Norov authored
      In kvm_hv_flush_tlb(), valid_bank_mask is declared as unsigned long,
      but is used as u64, which is wrong for i386, and has been spotted by
      LKP after applying "KVM: x86: hyper-v: replace bitmap_weight() with
      hweight64()"
      
      https://lore.kernel.org/lkml/20220510154750.212913-12-yury.norov@gmail.com/
      
      But it's wrong even without that patch because now bitmap_weight()
      dereferences a word after valid_bank_mask on i386.
      
      >> include/asm-generic/bitops/const_hweight.h:21:76: warning: right shift count >= width of type
      +[-Wshift-count-overflow]
            21 | #define __const_hweight64(w) (__const_hweight32(w) + __const_hweight32((w) >> 32))
               |                                                                            ^~
         include/asm-generic/bitops/const_hweight.h:10:16: note: in definition of macro '__const_hweight8'
            10 |          ((!!((w) & (1ULL << 0))) +     \
               |                ^
         include/asm-generic/bitops/const_hweight.h:20:31: note: in expansion of macro '__const_hweight16'
            20 | #define __const_hweight32(w) (__const_hweight16(w) + __const_hweight16((w) >> 16))
               |                               ^~~~~~~~~~~~~~~~~
         include/asm-generic/bitops/const_hweight.h:21:54: note: in expansion of macro '__const_hweight32'
            21 | #define __const_hweight64(w) (__const_hweight32(w) + __const_hweight32((w) >> 32))
               |                                                      ^~~~~~~~~~~~~~~~~
         include/asm-generic/bitops/const_hweight.h:29:49: note: in expansion of macro '__const_hweight64'
            29 | #define hweight64(w) (__builtin_constant_p(w) ? __const_hweight64(w) : __arch_hweight64(w))
               |                                                 ^~~~~~~~~~~~~~~~~
         arch/x86/kvm/hyperv.c:1983:36: note: in expansion of macro 'hweight64'
          1983 |                 if (hc->var_cnt != hweight64(valid_bank_mask))
               |                                    ^~~~~~~~~
      
      CC: Borislav Petkov <bp@alien8.de>
      CC: Dave Hansen <dave.hansen@linux.intel.com>
      CC: H. Peter Anvin <hpa@zytor.com>
      CC: Ingo Molnar <mingo@redhat.com>
      CC: Jim Mattson <jmattson@google.com>
      CC: Joerg Roedel <joro@8bytes.org>
      CC: Paolo Bonzini <pbonzini@redhat.com>
      CC: Sean Christopherson <seanjc@google.com>
      CC: Thomas Gleixner <tglx@linutronix.de>
      CC: Vitaly Kuznetsov <vkuznets@redhat.com>
      CC: Wanpeng Li <wanpengli@tencent.com>
      CC: kvm@vger.kernel.org
      CC: linux-kernel@vger.kernel.org
      CC: x86@kernel.org
      Reported-by: kernel test robot <lkp@intel.com>
      Signed-off-by: Yury Norov <yury.norov@gmail.com>
      Message-Id: <20220519171504.1238724-1-yury.norov@gmail.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      ea8c66fe
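Why the type matters in the commit above can be shown with a small sketch. It models a 32-bit `unsigned long` (as on i386) with `uint32_t`; `popcount64()` and `banks_with_32bit_storage()` are hypothetical helpers, not the kernel's `hweight64()`.

```c
#include <assert.h>
#include <stdint.h>

/* Portable population count for a 64-bit value (clears one set bit per pass). */
static unsigned int popcount64(uint64_t w)
{
    unsigned int n = 0;

    while (w) {
        w &= w - 1;     /* clear the lowest set bit */
        assert(n < 64); /* a 64-bit word has at most 64 set bits */
        n++;
    }
    return n;
}

/* Models keeping the bank mask in a 32-bit variable: the upper banks vanish. */
static unsigned int banks_with_32bit_storage(uint64_t mask)
{
    uint32_t narrow = (uint32_t)mask;

    return popcount64(narrow);
}
```

A mask with a bit set above bit 31 yields a different count depending on the storage width, so a check that compares the count against an expected number of banks can give the wrong answer.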
    • Sean Christopherson's avatar
      KVM: Free new dirty bitmap if creating a new memslot fails · c87661f8
      Sean Christopherson authored
      Fix a goof in kvm_prepare_memory_region() where KVM fails to free the
      new memslot's dirty bitmap during a CREATE action if
      kvm_arch_prepare_memory_region() fails.  The logic is supposed to detect
      if the bitmap was allocated and thus needs to be freed, versus if the
      bitmap was inherited from the old memslot and thus needs to be kept.  If
      there is no old memslot, then obviously the bitmap can't have been
      inherited.
      
      The bug was exposed by commit 86931ff7 ("KVM: x86/mmu: Do not create
      SPTEs for GFNs that exceed host.MAXPHYADDR"), which made it trivially easy
      for syzkaller to trigger failure during kvm_arch_prepare_memory_region(),
      but the bug can be hit other ways too, e.g. due to -ENOMEM when
      allocating x86's memslot metadata.
      
      The backtrace from kmemleak:
      
        __vmalloc_node_range+0xb40/0xbd0 mm/vmalloc.c:3195
        __vmalloc_node mm/vmalloc.c:3232 [inline]
        __vmalloc+0x49/0x50 mm/vmalloc.c:3246
        __vmalloc_array mm/util.c:671 [inline]
        __vcalloc+0x49/0x70 mm/util.c:694
        kvm_alloc_dirty_bitmap virt/kvm/kvm_main.c:1319
        kvm_prepare_memory_region virt/kvm/kvm_main.c:1551
        kvm_set_memslot+0x1bd/0x690 virt/kvm/kvm_main.c:1782
        __kvm_set_memory_region+0x689/0x750 virt/kvm/kvm_main.c:1949
        kvm_set_memory_region virt/kvm/kvm_main.c:1962
        kvm_vm_ioctl_set_memory_region virt/kvm/kvm_main.c:1974
        kvm_vm_ioctl+0x377/0x13a0 virt/kvm/kvm_main.c:4528
        vfs_ioctl fs/ioctl.c:51
        __do_sys_ioctl fs/ioctl.c:870
        __se_sys_ioctl fs/ioctl.c:856
        __x64_sys_ioctl+0xfc/0x140 fs/ioctl.c:856
        do_syscall_x64 arch/x86/entry/common.c:50
        do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
        entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      And the relevant sequence of KVM events:
      
        ioctl(3, KVM_CREATE_VM, 0)              = 4
        ioctl(4, KVM_SET_USER_MEMORY_REGION, {slot=0,
                                              flags=KVM_MEM_LOG_DIRTY_PAGES,
                                              guest_phys_addr=0x10000000000000,
                                              memory_size=4096,
                                              userspace_addr=0x20fe8000}
             ) = -1 EINVAL (Invalid argument)
      
      Fixes: 244893fa ("KVM: Dynamically allocate "new" memslots from the get-go")
      Cc: stable@vger.kernel.org
      Reported-by: syzbot+8606b8a9cc97a63f1c87@syzkaller.appspotmail.com
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20220518003842.1341782-1-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      c87661f8
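The ownership rule in the commit above (free the bitmap on failure only if it was newly allocated, never if it was inherited) can be sketched as below. Everything here is a simplified assumption: `struct slot`, `arch_prepare()` and `prepare_region()` are hypothetical and far smaller than the KVM code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, heavily simplified memslot. */
struct slot {
    unsigned long *dirty_bitmap;
};

/* Hypothetical arch hook that fails on request. */
static int arch_prepare(int should_fail)
{
    return should_fail ? -1 : 0;
}

static int prepare_region(const struct slot *old, struct slot *new_slot,
                          int should_fail)
{
    int r;

    assert(new_slot != NULL);

    if (old && old->dirty_bitmap) {
        new_slot->dirty_bitmap = old->dirty_bitmap;   /* inherited */
    } else {
        new_slot->dirty_bitmap = calloc(8, sizeof(unsigned long));
        if (!new_slot->dirty_bitmap)
            return -1;
    }

    r = arch_prepare(should_fail);

    /* With no old slot the bitmap cannot be inherited, so it must be freed. */
    if (r && new_slot->dirty_bitmap && (!old || !old->dirty_bitmap)) {
        free(new_slot->dirty_bitmap);
        new_slot->dirty_bitmap = NULL;
    }
    return r;
}
```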