1. 29 Jan, 2022 6 commits
    • Merge tag 'block-5.17-2022-01-28' of git://git.kernel.dk/linux-block · cb323ee7
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
      
       - NVMe pull request
            - add the IGNORE_DEV_SUBNQN quirk for Intel P4500/P4600 SSDs (Wu
              Zheng)
            - remove the unneeded ret variable in nvmf_dev_show (Changcheng
              Deng)
      
       - Fix for a hang regression introduced with a patch in the merge
         window, where low queue depth devices would not always get woken
         correctly (Laibin)
      
       - Small series fixing an IO accounting issue with bio backed dm devices
         (Mike, Yu)
      
      * tag 'block-5.17-2022-01-28' of git://git.kernel.dk/linux-block:
        dm: properly fix redundant bio-based IO accounting
        dm: revert partial fix for redundant bio-based IO accounting
        block: add bio_start_io_acct_time() to control start_time
        blk-mq: Fix wrong wakeup batch configuration which will cause hang
        nvme-fabrics: remove the unneeded ret variable in nvmf_dev_show
        nvme-pci: add the IGNORE_DEV_SUBNQN quirk for Intel P4500/P4600 SSDs
        blk-mq: fix missing blk_account_io_done() in error path
        block: fix memory leak in disk_register_independent_access_ranges
    • Merge tag 'io_uring-5.17-2022-01-28' of git://git.kernel.dk/linux-block · 3b58e9f3
      Linus Torvalds authored
      Pull io_uring fixes from Jens Axboe:
       "Just two small fixes this time:
      
         - Fix a bug that can lead to node registration taking 1 second, when
           it should finish much quicker (Dylan)
      
         - Remove an unused argument from a function (Usama)"
      
      * tag 'io_uring-5.17-2022-01-28' of git://git.kernel.dk/linux-block:
        io_uring: remove unused argument from io_rsrc_node_alloc
        io_uring: fix bug in slow unregistering of nodes
    • Merge tag 'powerpc-5.17-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · d66c1e79
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
      
       - Fix VM debug warnings on boot triggered via __set_fixmap().
      
       - Fix a debug warning in the 64-bit Book3S PMU handling code.
      
       - Fix nested guest HFSCR handling with multiple vCPUs on Power9 or
         later.
      
       - Fix decrementer storm caused by a recent change, seen with some
         configs.
      
      Thanks to Alexey Kardashevskiy, Athira Rajeev, Christophe Leroy,
      Fabiano Rosas, Maxime Bizon, Nicholas Piggin, and Sachin Sant.
      
      * tag 'powerpc-5.17-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/64s/interrupt: Fix decrementer storm
        KVM: PPC: Book3S HV Nested: Fix nested HFSCR being clobbered with multiple vCPUs
        powerpc/perf: Fix power_pmu_disable to call clear_pmi_irq_pending only if PMI is pending
        powerpc/fixmap: Fix VM debug warning on unmap
    • Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · 216e2aed
      Linus Torvalds authored
      Pull arm64 fixes from Catalin Marinas:
      
       - Errata workarounds for Cortex-A510: broken hardware dirty bit
         management, detection code for the TRBE (tracing) bugs with the
         actual fixes going in via the CoreSight tree.
      
       - Cortex-X2 errata handling for TRBE (inheriting the workarounds from
         Cortex-A710).
      
       - Fix ex_handler_load_unaligned_zeropad() to use the correct struct
         members.
      
       - A couple of kselftest fixes for FPSIMD.
      
       - Silence the vdso "no previous prototype" warning.
      
       - Mark start_backtrace() notrace and NOKPROBE_SYMBOL.
      
      * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
        arm64: cpufeature: List early Cortex-A510 parts as having broken dbm
        kselftest/arm64: Correct logging of FPSIMD register read via ptrace
        kselftest/arm64: Skip VL_INHERIT tests for unsupported vector types
        arm64: errata: Add detection for TRBE trace data corruption
        arm64: errata: Add detection for TRBE invalid prohibited states
        arm64: errata: Add detection for TRBE ignored system register writes
        arm64: Add Cortex-A510 CPU part definition
        arm64: extable: fix load_unaligned_zeropad() reg indices
        arm64: Mark start_backtrace() notrace and NOKPROBE_SYMBOL
        arm64: errata: Update ARM64_ERRATUM_[2119858|2224489] with Cortex-X2 ranges
        arm64: Add Cortex-X2 CPU part definition
        arm64: vdso: Fix "no previous prototype" warning
    • Merge tag 'fixes-v5.17-lsm-ceph-null' of... · d1e7f091
      Linus Torvalds authored
      Merge tag 'fixes-v5.17-lsm-ceph-null' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
      
      Pull security subsystem fix from James Morris:
       "Fix NULL pointer crash in LSM via Ceph, from Vivek Goyal"
      
      * tag 'fixes-v5.17-lsm-ceph-null' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
        security, lsm: dentry_init_security() Handle multi LSM registration
    • Merge tag 'docs-5.17-3' of git://git.lwn.net/linux · 246e179d
      Linus Torvalds authored
      Pull documentation fixes from Jonathan Corbet:
       "A few documentation fixes for 5.17"
      
      * tag 'docs-5.17-3' of git://git.lwn.net/linux:
        docs/vm: Fix typo in *harden*
        Documentation: arm: marvell: Extend Avanta list
        docs: fix typo in Documentation/kernel-hacking/locking.rst
        docs: Hook the RTLA documents into the kernel docs build
  2. 28 Jan, 2022 34 commits
    • dm: properly fix redundant bio-based IO accounting · b879f915
      Mike Snitzer authored
      Record the start_time for a bio but defer the starting block core's IO
      accounting until after IO is submitted using bio_start_io_acct_time().
      
      This approach avoids the need to mess around with any of the
      individual IO stats in response to a bio_split() that follows bio
      submission.
      
      Reported-by: Bud Brown <bubrown@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Cc: stable@vger.kernel.org
      Depends-on: e45c47d1 ("block: add bio_start_io_acct_time() to control start_time")
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Link: https://lore.kernel.org/r/20220128155841.39644-4-snitzer@redhat.com
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • dm: revert partial fix for redundant bio-based IO accounting · f524d9c9
      Mike Snitzer authored
      Reverts a1e1cb72 ("dm: fix redundant IO accounting for bios that
      need splitting") because it was too narrow in scope (only addressed
      redundant 'sectors[]' accounting and not ios, nsecs[], etc).
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Link: https://lore.kernel.org/r/20220128155841.39644-3-snitzer@redhat.com
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • block: add bio_start_io_acct_time() to control start_time · e45c47d1
      Mike Snitzer authored
      The bio_start_io_acct_time() interface is like bio_start_io_acct(), except
      that it allows start_time to be passed in. This gives drivers the ability to
      defer starting accounting until after IO is issued (but possibly not
      entirely, due to bio splitting).
      
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Link: https://lore.kernel.org/r/20220128155841.39644-2-snitzer@redhat.com
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 169387e2
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "Sixteen patches, mostly minor fixes and updates; however there are
        substantive driver bug fixes in pm8001, bnx2fc, zfcp, myrs and qedf"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: myrs: Fix crash in error case
        scsi: 53c700: Remove redundant assignment to pointer SCp
        scsi: ufs: Treat link loss as fatal error
        scsi: ufs: Use generic error code in ufshcd_set_dev_pwr_mode()
        scsi: bfa: Remove useless DMA-32 fallback configuration
        scsi: hisi_sas: Remove useless DMA-32 fallback configuration
        scsi: 3w-sas: Remove useless DMA-32 fallback configuration
        scsi: bnx2fc: Flush destroy_work queue before calling bnx2fc_interface_put()
        scsi: zfcp: Fix failed recovery on gone remote port with non-NPIV FCP devices
        scsi: pm8001: Fix bogus FW crash for maxcpus=1
        scsi: qedf: Change context reset messages to ratelimited
        scsi: qedf: Fix refcount issue when LOGO is received during TMF
        scsi: qedf: Add stag_work to all the vports
        scsi: ufs: ufshcd-pltfrm: Check the return value of devm_kstrdup()
        scsi: target: iscsi: Make sure the np under each tpg is unique
        scsi: elx: efct: Don't use GFP_KERNEL under spin lock
    • Merge tag 'efi-urgent-for-v5.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi · 073819e0
      Linus Torvalds authored
      Pull EFI fixes from Ard Biesheuvel:
      
       - avoid UEFI v2.00+ runtime services on Apple Mac systems, as they have
         been reported to cause crashes, and most Macs claim to be EFI v1.10
         anyway
      
       - avoid a spurious boot time warning on arm64 systems with 64k pages
      
      * tag 'efi-urgent-for-v5.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
        efi: runtime: avoid EFIv2 runtime services on Apple x86 machines
        efi/libstub: arm64: Fix image check alignment at entry
    • security, lsm: dentry_init_security() Handle multi LSM registration · 7f5056b9
      Vivek Goyal authored
      A ceph user has reported that ceph is crashing with kernel NULL pointer
      dereference. Following is the backtrace.
      
      /proc/version: Linux version 5.16.2-arch1-1 (linux@archlinux) (gcc (GCC)
      11.1.0, GNU ld (GNU Binutils) 2.36.1) #1 SMP PREEMPT Thu, 20 Jan 2022
      16:18:29 +0000
      distro / arch: Arch Linux / x86_64
      SELinux is not enabled
      ceph cluster version: 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503)
      
      relevant dmesg output:
      [   30.947129] BUG: kernel NULL pointer dereference, address:
      0000000000000000
      [   30.947206] #PF: supervisor read access in kernel mode
      [   30.947258] #PF: error_code(0x0000) - not-present page
      [   30.947310] PGD 0 P4D 0
      [   30.947342] Oops: 0000 [#1] PREEMPT SMP PTI
      [   30.947388] CPU: 5 PID: 778 Comm: touch Not tainted 5.16.2-arch1-1 #1
      86fbf2c313cc37a553d65deb81d98e9dcc2a3659
      [   30.947486] Hardware name: Gigabyte Technology Co., Ltd. B365M
      DS3H/B365M DS3H, BIOS F5 08/13/2019
      [   30.947569] RIP: 0010:strlen+0x0/0x20
      [   30.947616] Code: b6 07 38 d0 74 16 48 83 c7 01 84 c0 74 05 48 39 f7 75
      ec 31 c0 31 d2 89 d6 89 d7 c3 48 89 f8 31 d2 89 d6 89 d7 c3 0
      f 1f 40 00 <80> 3f 00 74 12 48 89 f8 48 83 c0 01 80 38 00 75 f7 48 29 f8 31
      ff
      [   30.947782] RSP: 0018:ffffa4ed80ffbbb8 EFLAGS: 00010246
      [   30.947836] RAX: 0000000000000000 RBX: ffffa4ed80ffbc60 RCX:
      0000000000000000
      [   30.947904] RDX: 0000000000000000 RSI: 0000000000000000 RDI:
      0000000000000000
      [   30.947971] RBP: ffff94b0d15c0ae0 R08: 0000000000000000 R09:
      0000000000000000
      [   30.948040] R10: 0000000000000000 R11: 0000000000000000 R12:
      0000000000000000
      [   30.948106] R13: 0000000000000001 R14: ffffa4ed80ffbc60 R15:
      0000000000000000
      [   30.948174] FS:  00007fc7520f0740(0000) GS:ffff94b7ced40000(0000)
      knlGS:0000000000000000
      [   30.948252] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   30.948308] CR2: 0000000000000000 CR3: 0000000104a40001 CR4:
      00000000003706e0
      [   30.948376] Call Trace:
      [   30.948404]  <TASK>
      [   30.948431]  ceph_security_init_secctx+0x7b/0x240 [ceph
      49f9c4b9bf5be8760f19f1747e26da33920bce4b]
      [   30.948582]  ceph_atomic_open+0x51e/0x8a0 [ceph
      49f9c4b9bf5be8760f19f1747e26da33920bce4b]
      [   30.948708]  ? get_cached_acl+0x4d/0xa0
      [   30.948759]  path_openat+0x60d/0x1030
      [   30.948809]  do_filp_open+0xa5/0x150
      [   30.948859]  do_sys_openat2+0xc4/0x190
      [   30.948904]  __x64_sys_openat+0x53/0xa0
      [   30.948948]  do_syscall_64+0x5c/0x90
      [   30.948989]  ? exc_page_fault+0x72/0x180
      [   30.949034]  entry_SYSCALL_64_after_hwframe+0x44/0xae
      [   30.949091] RIP: 0033:0x7fc7521e25bb
      [   30.950849] Code: 25 00 00 41 00 3d 00 00 41 00 74 4b 64 8b 04 25 18 00
      00 00 85 c0 75 67 44 89 e2 48 89 ee bf 9c ff ff ff b8 01 01 0
      0 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 91 00 00 00 48 8b 54 24 28 64 48 2b 14
      25
      
      Core of the problem is that ceph checks for return code from
      security_dentry_init_security() and if return code is 0, it assumes
      everything is fine and continues to call strlen(name), which crashes.
      
      Typically the SELinux LSM returns 0 and sets name to "security.selinux", and
      it is not a problem. Or, if SELinux is not compiled in or is disabled, it
      returns -EOPNOTSUPP and ceph deals with it.
      
      But somehow in this configuration, 0 is being returned and "name" is
      not being initialized and that's creating the problem.
      
      Our suspicion is that BPF LSM is registering a hook for
      dentry_init_security() and returns hook default of 0.
      
      LSM_HOOK(int, 0, dentry_init_security, struct dentry *dentry,...)
      
      I have not been able to reproduce it just by doing CONFIG_BPF_LSM=y.
      Stephen has tested the patch though and confirms it solves the problem
      for him.
      
      dentry_init_security() is written in such a way that it expects only one
      LSM to register the hook. At least that's the expectation with current code.
      
      If another LSM registers a hook and returns the default, it will simply return
      0 as of now and that will break ceph.
      
      Hence, the suggestion is to change the semantics of this hook a bit. If there
      are no LSMs or no LSM is taking ownership and initializing security context,
      then return -EOPNOTSUPP. Also allow at most one LSM to initialize security
      context. This hook can't deal with multiple LSMs trying to init security
      context. This patch implements this new behavior.
      
      Reported-by: Stephen Muth <smuth4@gmail.com>
      Tested-by: Stephen Muth <smuth4@gmail.com>
      Suggested-by: Casey Schaufler <casey@schaufler-ca.com>
      Acked-by: Casey Schaufler <casey@schaufler-ca.com>
      Reviewed-by: Serge Hallyn <serge@hallyn.com>
      Cc: Jeff Layton <jlayton@kernel.org>
      Cc: Christian Brauner <brauner@kernel.org>
      Cc: Paul Moore <paul@paul-moore.com>
      Cc: <stable@vger.kernel.org> # 5.16.0
      Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      Acked-by: Paul Moore <paul@paul-moore.com>
      Acked-by: Christian Brauner <brauner@kernel.org>
      Signed-off-by: James Morris <jmorris@namei.org>
    • Merge tag 'pm-5.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · a7b4b007
      Linus Torvalds authored
      Pull power management fixes from Rafael Wysocki:
       "These make the buffer handling in pm_show_wakelocks() more robust and
        drop an unused hibernation-related function.
      
        Specifics:
      
         - Make the buffer handling in pm_show_wakelocks() more robust by
           using sysfs_emit_at() in it to generate output (Greg
           Kroah-Hartman).
      
         - Drop register_nosave_region_late() which is not used (Amadeusz
           Sławiński)"
      
      * tag 'pm-5.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        PM: hibernate: Remove register_nosave_region_late()
        PM: wakeup: simplify the output logic of pm_show_wakelocks()
    • Merge tag 'trace-v5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace · df000154
      Linus Torvalds authored
      Pull tracing fixes from Steven Rostedt:
      
       - Limit mcount build time sorting to only those archs that we know it
         works for.
      
       - Fix memory leak in error path of histogram setup
      
       - Fix and clean up rel_loc array out of bounds issue
      
       - tools/rtla documentation fixes
      
       - Fix issues with histogram logic
      
      * tag 'trace-v5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        tracing: Don't inc err_log entry count if entry allocation fails
        tracing: Propagate is_signed to expression
        tracing: Fix smatch warning for do while check in event_hist_trigger_parse()
        tracing: Fix smatch warning for null glob in event_hist_trigger_parse()
        tools/tracing: Update Makefile to build rtla
        rtla: Make doc build optional
        tracing/perf: Avoid -Warray-bounds warning for __rel_loc macro
        tracing: Avoid -Warray-bounds warning for __rel_loc macro
        tracing/histogram: Fix a potential memory leak for kstrdup()
        ftrace: Have architectures opt-in for mcount build time sorting
    • Merge branch 'ucount-rlimit-fixes-for-v5.17-rc2' of... · 76fcbc9c
      Linus Torvalds authored
      Merge branch 'ucount-rlimit-fixes-for-v5.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
      
      Pull ucount rlimit fix from Eric Biederman.
      
      Make sure the ucounts have a reference to the user namespace they refer
      to, so that users that don't themselves carry such a reference around
      can safely use the ucount functions.
      
      * 'ucount-rlimit-fixes-for-v5.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
        ucount:  Make get_ucount a safe get_user replacement
    • Merge tag 'rcu-urgent.2022.01.26a' of... · a773abf7
      Linus Torvalds authored
      Merge tag 'rcu-urgent.2022.01.26a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
      
      Pull RCU fix from Paul McKenney:
       "This fixes a brown-paper-bag bug in RCU tasks that causes things like
        BPF and ftrace to fail miserably on systems with non-power-of-two
        numbers of CPUs.
      
        It fixes a math error added in 7a30871b ("rcu-tasks: Introduce
        ->percpu_enqueue_shift for dynamic queue selection") during the v5.17
        merge window. This commit works correctly only on systems with a
        power-of-two number of CPUs, which just so happens to be the kind that
        rcutorture always uses by default.
      
        This pull request fixes the math so that things also work on systems
        that don't happen to have a power-of-two number of CPUs"
      
      * tag 'rcu-urgent.2022.01.26a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
        rcu-tasks: Fix computation of CPU-to-list shift counts
    • Merge tag 'hyperv-fixes-signed-20220128' of... · 56a14c69
      Linus Torvalds authored
      Merge tag 'hyperv-fixes-signed-20220128' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
      
      Pull hyperv fixes from Wei Liu:
      
       - Fix screen resolution for hyperv framebuffer (Michael Kelley)
      
       - Fix packet header accounting for balloon driver (Yanming Liu)
      
      * tag 'hyperv-fixes-signed-20220128' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
        video: hyperv_fb: Fix validation of screen resolution
        Drivers: hv: balloon: account for vmbus packet header in max_pkt_size
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 3cd7cd8a
      Linus Torvalds authored
      Pull kvm fixes from Paolo Bonzini:
       "Two larger x86 series:
      
         - Redo incorrect fix for SEV/SMAP erratum
      
         - Windows 11 Hyper-V workaround
      
        Other x86 changes:
      
         - Various x86 cleanups
      
         - Re-enable access_tracking_perf_test
      
         - Fix for #GP handling on SVM
      
         - Fix for CPUID leaf 0Dh in KVM_GET_SUPPORTED_CPUID
      
         - Fix for ICEBP in interrupt shadow
      
         - Avoid false-positive RCU splat
      
         - Enable Enlightened MSR-Bitmap support for real
      
        ARM:
      
         - Correctly update the shadow register on exception injection when
           running in nVHE mode
      
         - Correctly use the mm_ops indirection when performing cache
           invalidation from the page-table walker
      
         - Restrict the vgic-v3 workaround for SEIS to the two known broken
           implementations
      
        Generic code changes:
      
         - Dead code cleanup"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (43 commits)
        KVM: eventfd: Fix false positive RCU usage warning
        KVM: nVMX: Allow VMREAD when Enlightened VMCS is in use
        KVM: nVMX: Implement evmcs_field_offset() suitable for handle_vmread()
        KVM: nVMX: Rename vmcs_to_field_offset{,_table}
        KVM: nVMX: eVMCS: Filter out VM_EXIT_SAVE_VMX_PREEMPTION_TIMER
        KVM: nVMX: Also filter MSR_IA32_VMX_TRUE_PINBASED_CTLS when eVMCS
        selftests: kvm: check dynamic bits against KVM_X86_XCOMP_GUEST_SUPP
        KVM: x86: add system attribute to retrieve full set of supported xsave states
        KVM: x86: Add a helper to retrieve userspace address from kvm_device_attr
        selftests: kvm: move vm_xsave_req_perm call to amx_test
        KVM: x86: Sync the states size with the XCR0/IA32_XSS at, any time
        KVM: x86: Update vCPU's runtime CPUID on write to MSR_IA32_XSS
        KVM: x86: Keep MSR_IA32_XSS unchanged for INIT
        KVM: x86: Free kvm_cpuid_entry2 array on post-KVM_RUN KVM_SET_CPUID{,2}
        KVM: nVMX: WARN on any attempt to allocate shadow VMCS for vmcs02
        KVM: selftests: Don't skip L2's VMCALL in SMM test for SVM guest
        KVM: x86: Check .flags in kvm_cpuid_check_equal() too
        KVM: x86: Forcibly leave nested virt when SMM state is toggled
        KVM: SVM: drop unnecessary code in svm_hv_vmcb_dirty_nested_enlightenments()
        KVM: SVM: hyper-v: Enable Enlightened MSR-Bitmap support for real
        ...
    • Merge tag 'mips-fixes-5.17_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux · e0152705
      Linus Torvalds authored
      Pull MIPS build fix from Thomas Bogendoerfer:
       "Fix for allmodconfig build"
      
      * tag 'mips-fixes-5.17_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
        MIPS: Fix build error due to PTR used in more places
    • Merge tag 's390-5.17-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 7eb36254
      Linus Torvalds authored
      Pull s390 fixes from Vasily Gorbik:
      
       - Fix loading of modules with lots of relocations and add a regression
         test for it.
      
       - Fix machine check handling for vector validity and guarded storage
         validity failures in KVM guests.
      
       - Fix hypervisor performance data to include z/VM guests with access
         control group set.
      
       - Fix z900 build problem in uaccess code.
      
       - Update defconfigs.
      
      * tag 's390-5.17-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        s390/hypfs: include z/VM guests with access control group set
        s390: update defconfigs
        s390/module: test loading modules with a lot of relocations
        s390/module: fix loading modules with a lot of relocations
        s390/uaccess: fix compile error
        s390/nmi: handle vector validity failures for KVM guests
        s390/nmi: handle guarded storage validity failures for KVM guests
    • Merge tag 'ceph-for-5.17-rc2' of git://github.com/ceph/ceph-client · 8157f470
      Linus Torvalds authored
      Pull ceph fixes from Ilya Dryomov:
       "A ZERO_SIZE_PTR dereference fix from Xiubo and two fixes for async
        creates interacting with pool namespace-constrained OSD permissions
        from Jeff (marked for stable)"
      
      * tag 'ceph-for-5.17-rc2' of git://github.com/ceph/ceph-client:
        ceph: set pool_ns in new inode layout for async creates
        ceph: properly put ceph_string reference after async create attempt
        ceph: put the requests/sessions when it fails to alloc memory
    • arm64: cpufeature: List early Cortex-A510 parts as having broken dbm · 297ae1eb
      James Morse authored
      Versions of Cortex-A510 before r0p3 are affected by a hardware erratum
      where the hardware update of the dirty bit is not correctly ordered.
      
      Add these cpus to the cpu_has_broken_dbm list.
      
      Signed-off-by: James Morse <james.morse@arm.com>
      Link: https://lore.kernel.org/r/20220125154040.549272-3-james.morse@arm.com
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • ocfs2: fix subdirectory registration with register_sysctl() · f6a26318
      Linus Torvalds authored
      The kernel test robot reports that commit c42ff46f ("ocfs2: simplify
      subdirectory registration with register_sysctl()") is broken, and
      results in kernel warning messages like
      
        sysctl table check failed: fs/ocfs2/nm Not a file
        sysctl table check failed: fs/ocfs2/nm No proc_handler
        sysctl table check failed: fs/ocfs2/nm bogus .mode 0555
      
      and in fact this was already reported back in linux-next, but nobody
      seems to have reacted to that report.  Possibly that original report
      only ever made it to the lkp list.
      
      The problem seems to be that the simplification didn't actually go far
      enough, and should have converted the whole directory path to the final
      sysctl file, rather than just the two first components.
      
      So take that last step.
      
      Fixes: c42ff46f ("ocfs2: simplify subdirectory registration with register_sysctl()")
      Reported-by: kernel test robot <oliver.sang@intel.com>
      Link: https://lore.kernel.org/all/20220128065310.GF8421@xsang-OptiPlex-9020/
      Link: https://lists.01.org/hyperkitty/list/lkp@lists.01.org/thread/KQ2F6TPJWMDVEXJM4WTUC4DU3EH3YJVT/
      Tested-by: Jan Kara <jack@suse.cz>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Cc: Luis Chamberlain <mcgrof@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge tag 'trbe-cortex-a510-errata' of... · df205970
      Catalin Marinas authored
      Merge tag 'trbe-cortex-a510-errata' of gitolite.kernel.org:pub/scm/linux/kernel/git/coresight/linux into for-next/fixes
      
      coresight: trbe: Workaround Cortex-A510 erratas
      
      This pull request provides arm64 definitions to support the
      TRBE Cortex-A510 errata.
      
      Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
      
      * tag 'trbe-cortex-a510-errata' of gitolite.kernel.org:pub/scm/linux/kernel/git/coresight/linux:
        arm64: errata: Add detection for TRBE trace data corruption
        arm64: errata: Add detection for TRBE invalid prohibited states
        arm64: errata: Add detection for TRBE ignored system register writes
        arm64: Add Cortex-A510 CPU part definition
    • Merge tag 'fsnotify_for_v5.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs · 4897e722
      Linus Torvalds authored
      Pull fsnotify fixes from Jan Kara:
       "Fixes for userspace breakage caused by fsnotify changes ~3 years ago
        and one fanotify cleanup"
      
      * tag 'fsnotify_for_v5.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
        fsnotify: fix fsnotify hooks in pseudo filesystems
        fsnotify: invalidate dcache before IN_DELETE event
        fanotify: remove variable set but not used
    • Merge tag 'fs_for_v5.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs · c2b19fd7
      Linus Torvalds authored
      Pull udf and quota fixes from Jan Kara:
       "Fixes for crashes in UDF when inode expansion fails and one quota
        cleanup"
      
      * tag 'fs_for_v5.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
        quota: cleanup double word in comment
        udf: Restore i_lenAlloc when inode expansion fails
        udf: Fix NULL ptr deref when converting from inline format
    • Merge tag 'kvmarm-fixes-5.17-1' of... · 17179d00
      Paolo Bonzini authored
      Merge tag 'kvmarm-fixes-5.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
      
      KVM/arm64 fixes for 5.17, take #1
      
      - Correctly update the shadow register on exception injection when
        running in nVHE mode
      
      - Correctly use the mm_ops indirection when performing cache invalidation
        from the page-table walker
      
      - Restrict the vgic-v3 workaround for SEIS to the two known broken
        implementations
    • KVM: eventfd: Fix false positive RCU usage warning · 6a0c6170
      Hou Wenlong authored
      Fix the following false positive warning:
       =============================
       WARNING: suspicious RCU usage
       5.16.0-rc4+ #57 Not tainted
       -----------------------------
       arch/x86/kvm/../../../virt/kvm/eventfd.c:484 RCU-list traversed in non-reader section!!
      
       other info that might help us debug this:
      
       rcu_scheduler_active = 2, debug_locks = 1
       3 locks held by fc_vcpu 0/330:
        #0: ffff8884835fc0b0 (&vcpu->mutex){+.+.}-{3:3}, at: kvm_vcpu_ioctl+0x88/0x6f0 [kvm]
        #1: ffffc90004c0bb68 (&kvm->srcu){....}-{0:0}, at: vcpu_enter_guest+0x600/0x1860 [kvm]
        #2: ffffc90004c0c1d0 (&kvm->irq_srcu){....}-{0:0}, at: kvm_notify_acked_irq+0x36/0x180 [kvm]
      
       stack backtrace:
       CPU: 26 PID: 330 Comm: fc_vcpu 0 Not tainted 5.16.0-rc4+
       Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
       Call Trace:
        <TASK>
        dump_stack_lvl+0x44/0x57
        kvm_notify_acked_gsi+0x6b/0x70 [kvm]
        kvm_notify_acked_irq+0x8d/0x180 [kvm]
        kvm_ioapic_update_eoi+0x92/0x240 [kvm]
        kvm_apic_set_eoi_accelerated+0x2a/0xe0 [kvm]
        handle_apic_eoi_induced+0x3d/0x60 [kvm_intel]
        vmx_handle_exit+0x19c/0x6a0 [kvm_intel]
        vcpu_enter_guest+0x66e/0x1860 [kvm]
        kvm_arch_vcpu_ioctl_run+0x438/0x7f0 [kvm]
        kvm_vcpu_ioctl+0x38a/0x6f0 [kvm]
        __x64_sys_ioctl+0x89/0xc0
        do_syscall_64+0x3a/0x90
        entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      Since kvm_unregister_irq_ack_notifier() does synchronize_srcu(&kvm->irq_srcu),
      kvm->irq_ack_notifier_list is protected by kvm->irq_srcu. In fact,
      kvm->irq_srcu SRCU read lock is held in kvm_notify_acked_irq(), making it
      a false positive warning. So use hlist_for_each_entry_srcu() instead of
      hlist_for_each_entry_rcu().
      Reviewed-by: Sean Christopherson <seanjc@google.com>
      Signed-off-by: Hou Wenlong <houwenlong93@linux.alibaba.com>
      Message-Id: <f98bac4f5052bad2c26df9ad50f7019e40434512.1643265976.git.houwenlong.hwl@antgroup.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      6a0c6170
    • KVM: nVMX: Allow VMREAD when Enlightened VMCS is in use · 6cbbaab6
      Vitaly Kuznetsov authored
      Hyper-V TLFS explicitly forbids VMREAD and VMWRITE instructions when
      Enlightened VMCS interface is in use:
      
      "Any VMREAD or VMWRITE instructions while an enlightened VMCS is
      active is unsupported and can result in unexpected behavior."
      
      Windows 11 + WSL2 seems to ignore this: attempts to VMREAD VMCS field
      0x4404 ("VM-exit interruption information") are observed. Failing
      these attempts with nested_vmx_failInvalid() makes such guests
      unbootable.
      
      Microsoft confirms this is a Hyper-V bug and claims that it'll get fixed
      eventually, but for the time being we need a workaround. Temporarily allow
      VMREAD to get data from the currently loaded Enlightened VMCS.
      
      Note: VMWRITE instructions remain forbidden; it is not clear how to
      handle them properly, and hopefully they won't ever be needed.
      Reviewed-by: Sean Christopherson <seanjc@google.com>
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Message-Id: <20220112170134.1904308-6-vkuznets@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      6cbbaab6
    • KVM: nVMX: Implement evmcs_field_offset() suitable for handle_vmread() · 892a42c1
      Vitaly Kuznetsov authored
      In preparation for allowing reads from Enlightened VMCS from
      handle_vmread(), implement evmcs_field_offset() to get the correct
      read offset. get_evmcs_offset(), which is being used by KVM-on-Hyper-V,
      is almost what's needed but a few things need to be adjusted. First,
      WARN_ON() is unacceptable for handle_vmread() as any field can (in
      theory) be supplied by the guest and not all fields are defined in
      eVMCS v1. Second, we need to handle 'holes' in eVMCS (missing fields).
      It also sounds like a good idea to WARN_ON() if such fields are ever
      accessed by KVM-on-Hyper-V.
      
      Implement dedicated evmcs_field_offset() helper.
      
      No functional change intended.
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Message-Id: <20220112170134.1904308-5-vkuznets@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      892a42c1
    • KVM: nVMX: Rename vmcs_to_field_offset{,_table} · 2423a4c0
      Vitaly Kuznetsov authored
      vmcs_to_field_offset{,_table} may sound misleading as VMCS is an opaque
      blob which is not supposed to be accessed directly. In fact,
      vmcs_to_field_offset{,_table} are related to KVM defined VMCS12 structure.
      
      Rename vmcs_field_to_offset() to get_vmcs12_field_offset() for clarity.
      
      No functional change intended.
      Reviewed-by: Sean Christopherson <seanjc@google.com>
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Message-Id: <20220112170134.1904308-4-vkuznets@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      2423a4c0
    • KVM: nVMX: eVMCS: Filter out VM_EXIT_SAVE_VMX_PREEMPTION_TIMER · 7a601e2c
      Vitaly Kuznetsov authored
      Enlightened VMCS v1 doesn't have VMX_PREEMPTION_TIMER_VALUE field,
      PIN_BASED_VMX_PREEMPTION_TIMER is also filtered out already so it makes
      sense to filter out VM_EXIT_SAVE_VMX_PREEMPTION_TIMER too.
      
      Note, none of the currently existing Windows/Hyper-V versions are known
      to enable 'save VMX-preemption timer value' when eVMCS is in use, the
      change is aimed at making the filtering future proof.
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Message-Id: <20220112170134.1904308-3-vkuznets@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      7a601e2c
    • KVM: nVMX: Also filter MSR_IA32_VMX_TRUE_PINBASED_CTLS when eVMCS · f80ae0ef
      Vitaly Kuznetsov authored
      Similar to the MSR_IA32_VMX_EXIT_CTLS/MSR_IA32_VMX_TRUE_EXIT_CTLS and
      MSR_IA32_VMX_ENTRY_CTLS/MSR_IA32_VMX_TRUE_ENTRY_CTLS pairs,
      MSR_IA32_VMX_TRUE_PINBASED_CTLS needs to be filtered the same way
      MSR_IA32_VMX_PINBASED_CTLS is currently filtered, as guests may rely
      solely on 'true' MSR data.
      
      Note, none of the currently existing Windows/Hyper-V versions are known
      to stumble upon the unfiltered MSR_IA32_VMX_TRUE_PINBASED_CTLS, the change
      is aimed at making the filtering future proof.
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Message-Id: <20220112170134.1904308-2-vkuznets@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      f80ae0ef
    • selftests: kvm: check dynamic bits against KVM_X86_XCOMP_GUEST_SUPP · b19c99b9
      Paolo Bonzini authored
      Provide coverage for the new API.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      b19c99b9
    • KVM: x86: add system attribute to retrieve full set of supported xsave states · dd6e6312
      Paolo Bonzini authored
      Because KVM_GET_SUPPORTED_CPUID is meant to be passed (by simple-minded
      VMMs) to KVM_SET_CPUID2, it cannot include any dynamic xsave states that
      have not been enabled.  Probing those, for example so that they can be
      passed to ARCH_REQ_XCOMP_GUEST_PERM, requires a new ioctl or arch_prctl.
      The latter is in fact worse, even though that is what the rest of the
      API uses, because it would require supported_xcr0 to be moved from the
      KVM module to the kernel just for this use.  In addition, the value
      would be nonsensical (or an error would have to be returned) until
      the KVM module is loaded in.
      
      Therefore, to limit the growth of system ioctls, add a /dev/kvm
      variant of KVM_{GET,HAS}_DEVICE_ATTR, and implement it in x86
      with just one group (0) and attribute (KVM_X86_XCOMP_GUEST_SUPP).
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      dd6e6312
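As a rough illustration of how a VMM might probe the system attribute described above, the sketch below opens /dev/kvm and issues KVM_HAS_DEVICE_ATTR followed by KVM_GET_DEVICE_ATTR with group 0 and attribute KVM_X86_XCOMP_GUEST_SUPP. The function name probe_xcomp_guest_supp() is hypothetical, and the fallback #define (value 0) is an assumption so that the sketch still builds against uapi headers that predate this change. It is not taken from the commit itself.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/kvm.h>

/* Assumption: older uapi headers lack this constant; 0 is assumed here. */
#ifndef KVM_X86_XCOMP_GUEST_SUPP
#define KVM_X86_XCOMP_GUEST_SUPP 0
#endif

/*
 * Hypothetical helper: store the supported xsave-state mask in *mask.
 * Returns 0 on success or a negative errno value on failure (for example
 * when /dev/kvm is absent or the kernel does not implement the attribute).
 */
int probe_xcomp_guest_supp(uint64_t *mask)
{
    struct kvm_device_attr attr = {
        .group = 0,                          /* the only group, per the commit message */
        .attr  = KVM_X86_XCOMP_GUEST_SUPP,
        .addr  = (uint64_t)(uintptr_t)mask,  /* userspace address carried as a u64 */
    };
    int ret = 0;
    int fd = open("/dev/kvm", O_RDWR);

    if (fd < 0)
        return -errno;

    /* Check that the attribute exists before trying to read it. */
    if (ioctl(fd, KVM_HAS_DEVICE_ATTR, &attr) < 0 ||
        ioctl(fd, KVM_GET_DEVICE_ATTR, &attr) < 0)
        ret = -errno;

    close(fd);
    return ret;
}
```

On a kernel without the attribute, or on a machine without KVM, the helper simply returns a negative errno, so a caller can fall back to the static KVM_GET_SUPPORTED_CPUID data.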
    • KVM: x86: Add a helper to retrieve userspace address from kvm_device_attr · 56f289a8
      Sean Christopherson authored
      Add a helper to handle converting the u64 userspace address embedded in
      struct kvm_device_attr into a userspace pointer. It's all too easy to
      forget the intermediate "unsigned long" cast as well as the truncation
      check.
      
      No functional change intended.
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      56f289a8
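The cast-plus-truncation-check pattern mentioned above can be sketched in plain userspace C. This is only an analogue under stated assumptions: attr_addr_to_ptr() is a hypothetical name, uintptr_t stands in for the kernel's "unsigned long", and the code is not the helper added by the commit.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* The narrowing below only makes sense if pointers are at most 64 bits wide. */
static_assert(sizeof(uintptr_t) <= sizeof(uint64_t),
              "pointer-sized integers must fit in a u64");

/*
 * Hypothetical userspace analogue: convert a u64-encoded address into a
 * pointer. Returns 0 and stores the pointer in *out on success, or -EFAULT
 * if the value does not survive the round trip through a pointer-sized
 * integer (possible where pointers are narrower than 64 bits).
 */
int attr_addr_to_ptr(uint64_t addr, void **out)
{
    /* Intermediate integer cast: never cast a u64 straight to a pointer. */
    uintptr_t narrowed = (uintptr_t)addr;

    /* Truncation check: reject addresses whose upper bits were lost. */
    if ((uint64_t)narrowed != addr)
        return -EFAULT;

    *out = (void *)narrowed;
    return 0;
}
```

On a 64-bit build the check never fires; it matters for 32-bit builds, where a 64-bit address field can hold values that no pointer can represent.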
    • kselftest/arm64: Correct logging of FPSIMD register read via ptrace · 9ae279ec
      Mark Brown authored
      There's a cut'n'paste error in the logging for our test for reading register
      state back via ptrace. Correctly say that we did a read instead of a write.
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
      Link: https://lore.kernel.org/r/20220124175527.3260234-3-broonie@kernel.org
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      9ae279ec
    • kselftest/arm64: Skip VL_INHERIT tests for unsupported vector types · 50806fd9
      Mark Brown authored
      Currently we unconditionally test the ability to set the vector length
      inheritance flag via ptrace meaning that we generate false failures on
      systems that don't support SVE when we attempt to set the vector length
      there. Check the hwcap and mark the tests as skipped when it's not present.
      
      Fixes: 0ba1ce1e ("selftests: arm64: Add coverage of ptrace flags for SVE VL inheritance")
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
      Link: https://lore.kernel.org/r/20220124175527.3260234-2-broonie@kernel.org
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      50806fd9
    • Merge tag 'ata-5.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata · 145d9b49
      Linus Torvalds authored
      Pull ATA fix from Damien Le Moal:
       "A single fix for 5.17-rc2, adding a missing resource allocation error
        check in the pata_platform driver, from Zhou"
      
      * tag 'ata-5.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata:
        ata: pata_platform: Fix a NULL pointer dereference in __pata_platform_probe()
      145d9b49
    • Merge tag 'hwmon-for-v5.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging · 374630e3
      Linus Torvalds authored
      
      Pull hwmon fixes from Guenter Roeck:
      
       - Fix crash in nct6775 driver
      
       - Prevent divide by zero in adt7470 driver
      
       - Fix conditional compile warning in pmbus/ir38064 driver
      
       - Various minor fixes in lm90 driver
      
      * tag 'hwmon-for-v5.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
        hwmon: (nct6775) Fix crash in clear_caseopen
        hwmon: (adt7470) Prevent divide by zero in adt7470_fan_write()
        hwmon: (pmbus/ir38064) Mark ir38064_of_match as __maybe_unused
        hwmon: (lm90) Fix sysfs and udev notifications
        hwmon: (lm90) Mark alert as broken for MAX6646/6647/6649
        hwmon: (lm90) Mark alert as broken for MAX6680
        hwmon: (lm90) Mark alert as broken for MAX6654
        hwmon: (lm90) Re-enable interrupts after alert clears
        hwmon: (lm90) Reduce maximum conversion rate for G781
      374630e3