1. 20 Oct, 2024 7 commits
    • KVM: VMX: reset the segment cache after segment init in vmx_vcpu_reset() · 731285fb
      Maxim Levitsky authored
      Reset the segment cache after segment initialization in vmx_vcpu_reset()
      to harden KVM against caching stale/uninitialized data.  Without the
      recent fix to bypass the cache in kvm_arch_vcpu_put(), the following
      scenario is possible:
      
       - vCPU is just created, and the vCPU thread is preempted before
         SS.AR_BYTES is written in vmx_vcpu_reset().
      
       - When scheduling out the vCPU task, kvm_arch_vcpu_in_kernel() =>
         vmx_get_cpl() reads and caches '0' for SS.AR_BYTES.
      
       - vmx_vcpu_reset() => seg_setup() configures SS.AR_BYTES, but doesn't
         invoke vmx_segment_cache_clear() to invalidate the cache.
      
      As a result, KVM retains a stale value in the cache, which can be read,
      e.g. via KVM_GET_SREGS.  Usually this is not a problem because the VMX
      segment cache is reset on each VM-Exit, but if the userspace VMM (e.g. KVM
      selftests) reads and writes system registers just after the vCPU was
      created, _without_ modifying SS.AR_BYTES, userspace will write back the
      stale '0' value and ultimately will trigger a VM-Entry failure due to
      incorrect SS segment type.
      
      Invalidating the cache after writing the VMCS doesn't address the general
      issue of cache accesses from IRQ context being unsafe, but it does prevent
      KVM from clobbering the VMCS, i.e. mitigates the harm done _if_ KVM has a
      bug that results in an unsafe cache access.
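The stale-cache scenario described above can be sketched in plain C. This is a toy model, not KVM code: `struct demo_seg_cache` and the `demo_*` helpers are hypothetical names, and `0x93` is only an arbitrary illustrative access-rights value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for one VMCS field plus its software cache. */
struct demo_seg_cache {
    uint32_t hw_ss_ar;      /* the "hardware" value, i.e. what the VMCS holds */
    uint32_t cached_ss_ar;  /* last value read through the cache */
    bool     cached_valid;  /* set once the field has been cached */
};

/* Read through the cache, filling it on the first access. */
static uint32_t demo_read_ss_ar(struct demo_seg_cache *c)
{
    if (!c->cached_valid) {
        c->cached_ss_ar = c->hw_ss_ar;
        c->cached_valid = true;
    }
    return c->cached_ss_ar;
}

/* Write the "hardware" field; optionally invalidate the cache afterwards. */
static void demo_write_ss_ar(struct demo_seg_cache *c, uint32_t val,
                             bool clear_cache)
{
    c->hw_ss_ar = val;
    if (clear_cache)
        c->cached_valid = false;
}
```

In this model, a read that happens before the first write caches 0. A later write with `clear_cache == false` leaves that 0 visible to every subsequent cached read, whereas a write with `clear_cache == true` makes the next read pick up the new value.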
      Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
      Fixes: 2fb92db1 ("KVM: VMX: Cache vmcs segment fields")
      [sean: rework changelog to account for previous patch]
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-ID: <20241009175002.1118178-3-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: Clean up documentation for KVM_X86_QUIRK_SLOT_ZAP_ALL · 5a279842
      Sean Christopherson authored
      Massage the documentation for KVM_X86_QUIRK_SLOT_ZAP_ALL to call out that
      it applies to moved memslots as well as deleted memslots, to avoid KVM's
      "fast zap" terminology (which has no meaning for userspace), and to reword
      the documented targeted zap behavior to specifically say that KVM _may_
      zap a subset of all SPTEs.  As evidenced by the fix to zap non-leaf SPTEs
      with gPTEs, formally documenting KVM's exact internal behavior is risky
      and unnecessary.
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-ID: <20241009192345.1148353-4-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86/mmu: Add lockdep assert to enforce safe usage of kvm_unmap_gfn_range() · 28cf4978
      Sean Christopherson authored
      Add a lockdep assertion in kvm_unmap_gfn_range() to ensure that either
      mmu_invalidate_in_progress is elevated, or that the range is being zapped
      due to memslot removal (loosely detected by slots_lock being held).
      Zapping SPTEs without mmu_invalidate_{in_progress,seq} protection is unsafe
      as KVM's page fault path snapshots state before acquiring mmu_lock, and
      thus can create SPTEs with stale information if vCPUs aren't forced to
      retry faults (due to seeing an in-progress or past MMU invalidation).
      
      Memslot removal is a special case, as the memslot is retrieved outside of
      mmu_invalidate_seq, i.e. doesn't use the "standard" protections, and
      instead relies on SRCU synchronization to ensure any in-flight page faults
      are fully resolved before zapping SPTEs.
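The "snapshot, then retry" protection that the message relies on can be sketched as follows. This is a simplified userspace model under stated assumptions: `struct demo_mmu_state` and the `demo_*` functions are hypothetical names that only mirror the roles of mmu_invalidate_in_progress and mmu_invalidate_seq, and all locking is omitted.

```c
#include <stdbool.h>

/* Hypothetical model of the two invalidation trackers. */
struct demo_mmu_state {
    unsigned long invalidate_seq;         /* bumped when an invalidation ends */
    int           invalidate_in_progress; /* elevated while one is running */
};

/* A page fault snapshots the sequence before it would take the lock. */
static unsigned long demo_fault_snapshot(const struct demo_mmu_state *m)
{
    return m->invalidate_seq;
}

static void demo_invalidate_begin(struct demo_mmu_state *m)
{
    m->invalidate_in_progress++;
}

static void demo_invalidate_end(struct demo_mmu_state *m)
{
    m->invalidate_seq++;
    m->invalidate_in_progress--;
}

/* The fault must retry if it raced with an in-progress or past invalidation. */
static bool demo_fault_must_retry(const struct demo_mmu_state *m,
                                  unsigned long snapshot)
{
    return m->invalidate_in_progress > 0 || m->invalidate_seq != snapshot;
}
```

A zap that runs without going through `demo_invalidate_begin()`/`demo_invalidate_end()` would leave both trackers untouched, so a concurrent fault holding an old snapshot would see no reason to retry. That is the unsafe case the new assertion is meant to catch.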
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-ID: <20241009192345.1148353-3-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86/mmu: Zap only SPs that shadow gPTEs when deleting memslot · 58a20a94
      Sean Christopherson authored
      When performing a targeted zap on memslot removal, zap only MMU pages that
      shadow guest PTEs, as zapping all SPs that "match" the gfn is inexact and
      unnecessary.  Furthermore, for_each_gfn_valid_sp() arguably shouldn't
      exist, because it doesn't do what most people would expect it to do.
      The "round gfn for level" adjustment that is done for direct SPs (no gPTE)
      means that the exact gfn comparison will not get a match, even when a SP
      does "cover" a gfn, or was even created specifically for a gfn.
      
      For memslot deletion specifically, KVM's behavior will vary significantly
      based on the size and alignment of a memslot, and in weird ways.  E.g. for
      a 4KiB memslot, KVM will zap more SPs if the slot is 1GiB aligned than if
      it's only 4KiB aligned.  And as described below, zapping SPs in the
      aligned case overzaps for direct MMUs, as odds are good the upper-level
      SPs are serving other memslots.
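The alignment dependence described above can be illustrated with a small, self-contained sketch. It assumes x86-64 style paging with 512 entries per table (9 bits per level); the `demo_*` names are hypothetical, and the code models only the "rounded gfn versus exact gfn comparison" effect, not KVM's hash table.

```c
#include <stdint.h>

typedef uint64_t demo_gfn_t;

/* 4KiB pages covered at a level: level 1 = 1 page, 2 = 512, 3 = 512 * 512. */
static uint64_t demo_pages_per_level(int level)
{
    return 1ULL << (9 * (level - 1));
}

/* Round a gfn down to the alignment used at the given level. */
static demo_gfn_t demo_round_gfn_for_level(demo_gfn_t gfn, int level)
{
    return gfn & ~(demo_pages_per_level(level) - 1);
}

/*
 * Count the levels at which the rounded gfn is identical to the memslot's
 * own gfn, i.e. the levels at which an exact-match lookup would hit.
 */
static int demo_count_exact_matches(demo_gfn_t slot_gfn, int max_level)
{
    int level, matches = 0;

    for (level = 1; level <= max_level; level++) {
        if (demo_round_gfn_for_level(slot_gfn, level) == slot_gfn)
            matches++;
    }
    return matches;
}
```

For a gfn of `0x40000` (1GiB aligned), the exact comparison hits at all three levels, while for `0x40001` it hits only at level 1. This is the "more SPs zapped when the slot is 1GiB aligned" behavior in miniature.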
      
      To iterate over all potentially-relevant gfns, KVM would need to make a
      pass over the hash table for each level, with the gfn used for lookup
      rounded for said level.  And then check that the SP is of the correct
      level, too, e.g. to avoid over-zapping.
      
      But even then, KVM would massively overzap, as processing every level is
      all but guaranteed to zap SPs that serve other memslots, especially if the
      memslot being removed is relatively small.  KVM could mitigate that issue
      by processing only levels that can be possible guest huge pages, i.e. are
      less likely to be re-used for other memslots, but while somewhat logical,
      that's quite arbitrary and would be a bit of a mess to implement.
      
      So, zap only SPs with gPTEs, as the resulting behavior is easy to describe,
      is predictable, and is explicitly minimal, i.e. KVM only zaps SPs that
      absolutely must be zapped.
      
      Cc: Yan Zhao <yan.y.zhao@intel.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Reviewed-by: Yan Zhao <yan.y.zhao@intel.com>
      Tested-by: Yan Zhao <yan.y.zhao@intel.com>
      Message-ID: <20241009192345.1148353-2-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • x86/kvm: Override default caching mode for SEV-SNP and TDX · 8e690b81
      Kirill A. Shutemov authored
      AMD SEV-SNP and Intel TDX have limited access to MTRR: either it is not
      advertised in CPUID or it cannot be programmed (on TDX, due to #VE on
      CR0.CD clear).
      
      This results in guests using uncached mappings where they shouldn't and
      pmd/pud_set_huge() failures due to non-uniform memory type reported by
      mtrr_type_lookup().
      
      Override MTRR state, making it WB by default as the kernel does for
      Hyper-V guests.
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Suggested-by: Binbin Wu <binbin.wu@intel.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Reviewed-by: Juergen Gross <jgross@suse.com>
      Message-ID: <20241015095818.357915-1-kirill.shutemov@linux.intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: Remove unused kvm_vcpu_gfn_to_pfn_atomic · bc07eea2
      Dr. David Alan Gilbert authored
      The last use of kvm_vcpu_gfn_to_pfn_atomic was removed by commit
      1bbc60d0 ("KVM: x86/mmu: Remove MMU auditing")
      
      Remove it.
      Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
      Message-ID: <20241001141354.18009-3-linux@treblig.org>
      [Adjust Documentation/virt/kvm/locking.rst. - Paolo]
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: Remove unused kvm_vcpu_gfn_to_pfn · 88a387cf
      Dr. David Alan Gilbert authored
      The last use of kvm_vcpu_gfn_to_pfn was removed by commit
      b1624f99 ("KVM: Remove kvm_vcpu_gfn_to_page() and kvm_vcpu_gpa_to_page()")
      
      Remove it.
      Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
      Message-ID: <20241001141354.18009-2-linux@treblig.org>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  2. 06 Oct, 2024 3 commits
    • Merge tag 'kvmarm-fixes-6.12-1' of... · c8d430db
      Paolo Bonzini authored
      Merge tag 'kvmarm-fixes-6.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
      
      KVM/arm64 fixes for 6.12, take #1
      
      - Fix pKVM error path on init, making sure we do not change critical
        system registers as we're about to fail
      
      - Make sure that the host's vector length is capped by a value
        common to all CPUs
      
      - Fix kvm_has_feat*() handling of "negative" features, as the current
        code is pretty broken
      
      - Promote Joey to the status of official reviewer, while James steps
        down -- hopefully only temporarily
    • x86/reboot: emergency callbacks are now registered by common KVM code · 2a5fe5a0
      Paolo Bonzini authored
      Guard them with CONFIG_KVM_X86_COMMON rather than the two vendor modules.
      In practice this has no functional change, because CONFIG_KVM_X86_COMMON
      is set if and only if at least one vendor-specific module is being built.
      However, it is cleaner to specify CONFIG_KVM_X86_COMMON for functions that
      are used in kvm.ko.
      Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Fixes: 590b09b1 ("KVM: x86: Register "emergency disable" callbacks when virt is enabled")
      Fixes: 6d55a942 ("x86/reboot: Unconditionally define cpu_emergency_virt_cb typedef")
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: leave kvm.ko out of the build if no vendor module is requested · ea4290d7
      Paolo Bonzini authored
      kvm.ko is nothing but library code shared by kvm-intel.ko and kvm-amd.ko.
      It provides no functionality on its own and it is unnecessary unless one
      of the vendor-specific modules is compiled.  In particular, /dev/kvm is
      not created until one of kvm-intel.ko or kvm-amd.ko is loaded.
      
      Use CONFIG_KVM to decide if it is built-in or a module, but use the
      vendor-specific modules for the actual decision on whether to build it.
      
      This also fixes a build failure when CONFIG_KVM_INTEL and CONFIG_KVM_AMD
      are both disabled.  The cpu_emergency_register_virt_callback() function
      is called from kvm.ko, but it is only defined if at least one of
      CONFIG_KVM_INTEL and CONFIG_KVM_AMD is provided.
      
      Fixes: 590b09b1 ("KVM: x86: Register "emergency disable" callbacks when virt is enabled")
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  3. 03 Oct, 2024 2 commits
    • KVM: x86/mmu: fix KVM_X86_QUIRK_SLOT_ZAP_ALL for shadow MMU · fcd1ec9c
      Paolo Bonzini authored
      As was tried in commit 4e103134 ("KVM: x86/mmu: Zap only the relevant
      pages when removing a memslot"), all shadow pages, i.e. non-leaf SPTEs,
      need to be zapped.  All of the accounting for a shadow page is tied to the
      memslot, i.e. the shadow page holds a reference to the memslot, for all
      intents and purposes.  Deleting the memslot without removing all relevant
      shadow pages, as is done when KVM_X86_QUIRK_SLOT_ZAP_ALL is disabled,
      results in NULL pointer derefs when tearing down the VM.
      
      Reintroduce from that commit the code that walks the whole memslot when
      there are active shadow MMU pages.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: arm64: Fix kvm_has_feat*() handling of negative features · a1d402ab
      Marc Zyngier authored
      Oliver reports that the kvm_has_feat() helper is not behaving as
      expected for negative features. On investigation, the main issue
      seems to be caused by the following construct:
      
       #define get_idreg_field(kvm, id, fld)				\
       	(id##_##fld##_SIGNED ?					\
      	 get_idreg_field_signed(kvm, id, fld) :			\
      	 get_idreg_field_unsigned(kvm, id, fld))
      
      where one side of the expression evaluates as something signed,
      and the other as something unsigned. In retrospect, this is totally
      braindead, as the compiler converts this into an unsigned expression.
      When compared to something that is 0, the test is simply elided.
      
      Epic fail. Similar issue exists in the expand_field_sign() macro.
      
      The correct way to handle this is to choose between signed and unsigned
      comparisons, so that both sides of the ternary expression are of the
      same type (bool).
      
      In order to keep the code readable (sort of), we introduce new
      comparison primitives taking an operator as a parameter, and
      rewrite the kvm_has_feat*() helpers in terms of these primitives.
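The type problem described above can be reproduced in a few lines of standard C. The sketch below is not the arm64 code: `DEMO_GET_FIELD`, `DEMO_CMP` and the `demo_*` functions are hypothetical names that only mimic the shape of the buggy ternary and of a comparison primitive that takes the operator as a parameter.

```c
#include <stdbool.h>

/*
 * Buggy shape: one arm is int, the other unsigned int, so the usual
 * arithmetic conversions make the whole expression unsigned int.
 */
#define DEMO_GET_FIELD(is_signed, s, u)  ((is_signed) ? (s) : (u))

/*
 * Fixed shape: pick the comparison, not the value, so that both arms of
 * the ternary are plain truth values of the same type.
 */
#define DEMO_CMP(is_signed, s, u, op, lim)          \
    ((is_signed) ? ((s) op (int)(lim))              \
                 : ((u) op (unsigned int)(lim)))

/* A negative signed field is converted to a huge unsigned value here. */
static bool demo_buggy_ge(bool is_signed, int s, unsigned int u, int min)
{
    return DEMO_GET_FIELD(is_signed, s, u) >= min;
}

/* The signed arm compares as signed, the unsigned arm as unsigned. */
static bool demo_fixed_ge(bool is_signed, int s, unsigned int u, int min)
{
    return DEMO_CMP(is_signed, s, u, >=, min);
}
```

With a signed field value of -1 and a minimum of 0, `demo_buggy_ge()` reports true because -1 has become `UINT_MAX`, while `demo_fixed_ge()` reports false.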
      
      Fixes: c62d7a23 ("KVM: arm64: Add feature checking helpers")
      Reported-by: Oliver Upton <oliver.upton@linux.dev>
      Tested-by: Oliver Upton <oliver.upton@linux.dev>
      Cc: stable@vger.kernel.org
      Link: https://lore.kernel.org/r/20241002204239.2051637-1-maz@kernel.org
      Signed-off-by: Marc Zyngier <maz@kernel.org>
  4. 01 Oct, 2024 4 commits
  5. 29 Sep, 2024 12 commits
    • Linux 6.12-rc1 · 9852d85e
      Linus Torvalds authored
    • x86: kvm: fix build error · 3f749bef
      Linus Torvalds authored
      The cpu_emergency_register_virt_callback() function is used
      unconditionally by the x86 kvm code, but it is declared (and defined)
      conditionally:
      
        #if IS_ENABLED(CONFIG_KVM_INTEL) || IS_ENABLED(CONFIG_KVM_AMD)
        void cpu_emergency_register_virt_callback(cpu_emergency_virt_cb *callback);
        ...
      
      leading to a build error when neither KVM_INTEL nor KVM_AMD support is
      enabled:
      
        arch/x86/kvm/x86.c: In function ‘kvm_arch_enable_virtualization’:
        arch/x86/kvm/x86.c:12517:9: error: implicit declaration of function ‘cpu_emergency_register_virt_callback’ [-Wimplicit-function-declaration]
        12517 |         cpu_emergency_register_virt_callback(kvm_x86_ops.emergency_disable_virtualization_cpu);
              |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        arch/x86/kvm/x86.c: In function ‘kvm_arch_disable_virtualization’:
        arch/x86/kvm/x86.c:12522:9: error: implicit declaration of function ‘cpu_emergency_unregister_virt_callback’ [-Wimplicit-function-declaration]
        12522 |         cpu_emergency_unregister_virt_callback(kvm_x86_ops.emergency_disable_virtualization_cpu);
              |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      
      Fix the build by defining empty helper functions the same way the old
      cpu_emergency_disable_virtualization() function was dealt with for the
      same situation.
      
      Maybe we could instead have made the call sites conditional, since the
      callers (kvm_arch_{en,dis}able_virtualization()) have an empty weak
      fallback.  I'll leave that to the kvm people to argue about, this at
      least gets the build going for that particular config.
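The "empty helper" approach can be sketched in standalone C. Everything here is a hypothetical stand-in: `DEMO_CONFIG_VENDOR_MODULE` plays the role of the `IS_ENABLED(CONFIG_KVM_INTEL) || IS_ENABLED(CONFIG_KVM_AMD)` condition, and the `demo_*` names are not kernel symbols.

```c
#include <stddef.h>

typedef void (demo_virt_cb)(void);

/* Stays NULL until a real registration happens. */
static demo_virt_cb *demo_registered_cb;

#ifdef DEMO_CONFIG_VENDOR_MODULE
/* Real helpers, available only when a vendor module is configured. */
static void demo_register_virt_callback(demo_virt_cb *cb)
{
    demo_registered_cb = cb;
}

static void demo_unregister_virt_callback(demo_virt_cb *cb)
{
    if (demo_registered_cb == cb)
        demo_registered_cb = NULL;
}
#else
/* Empty stubs: callers still compile and link, but nothing is registered. */
static inline void demo_register_virt_callback(demo_virt_cb *cb) { (void)cb; }
static inline void demo_unregister_virt_callback(demo_virt_cb *cb) { (void)cb; }
#endif

static void demo_emergency_disable(void)
{
}

/* Unconditional callers, analogous to the common enable/disable paths. */
static void demo_enable_virtualization(void)
{
    demo_register_virt_callback(demo_emergency_disable);
}

static void demo_disable_virtualization(void)
{
    demo_unregister_virt_callback(demo_emergency_disable);
}
```

The alternative mentioned in the message, making the call sites conditional, would move the `#ifdef` into the two callers instead. The stub approach keeps the callers free of preprocessor conditionals.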
      
      Fixes: 590b09b1 ("KVM: x86: Register "emergency disable" callbacks when virt is enabled")
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: Kai Huang <kai.huang@intel.com>
      Cc: Chao Gao <chao.gao@intel.com>
      Cc: Farrah Chen <farrah.chen@intel.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge tag 'mailbox-v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/jassibrar/mailbox · e7ed3436
      Linus Torvalds authored
      Pull mailbox updates from Jassi Brar:
      
       - fix kconfig dependencies (mhu-v3, omap2+)
      
       - use device name instead of generic imx_mu_chan as interrupt name
         (imx)
      
       - enable sa8255p and qcs8300 ipc controllers (qcom)
      
       - Fix timeout during suspend mode (bcm2835)
      
       - convert to use of_property_match_string (mailbox)
      
       - enable mt8188 (mediatek)
      
       - use devm_clk_get_enabled helpers (spreadtrum)
      
       - fix device-id typo (rockchip)
      
      * tag 'mailbox-v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/jassibrar/mailbox:
        mailbox, remoteproc: omap2+: fix compile testing
        dt-bindings: mailbox: qcom-ipcc: Document QCS8300 IPCC
        dt-bindings: mailbox: qcom-ipcc: document the support for SA8255p
        dt-bindings: mailbox: mtk,adsp-mbox: Add compatible for MT8188
        mailbox: Use of_property_match_string() instead of open-coding
        mailbox: bcm2835: Fix timeout during suspend mode
        mailbox: sprd: Use devm_clk_get_enabled() helpers
        mailbox: rockchip: fix a typo in module autoloading
        mailbox: imx: use device name in interrupt name
        mailbox: ARM_MHU_V3 should depend on ARM64
    • Merge tag 'i2c-for-6.12-rc1-additional_fixes' of... · 907537f5
      Linus Torvalds authored
      Merge tag 'i2c-for-6.12-rc1-additional_fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
      
      Pull i2c fixes from Wolfram Sang:
      
       - fix DesignWare driver ENABLE-ABORT sequence, ensuring ABORT can
         always be sent when needed
      
       - check for PCLK in the SynQuacer controller as an optional clock,
         allowing ACPI to directly provide the clock rate
      
       - KEBA driver Kconfig dependency fix
      
       - fix XIIC driver power suspend sequence
      
      * tag 'i2c-for-6.12-rc1-additional_fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        i2c: xiic: Fix pm_runtime_set_suspended() with runtime pm enabled
        i2c: keba: I2C_KEBA should depend on KEBA_CP500
        i2c: synquacer: Deal with optional PCLK correctly
        i2c: designware: fix controller is holding SCL low while ENABLE bit is disabled
    • Merge tag 'dma-mapping-6.12-2024-09-29' of git://git.infradead.org/users/hch/dma-mapping · b81b78da
      Linus Torvalds authored
      Pull dma-mapping fix from Christoph Hellwig:
      
       - handle chained SGLs in the new tracing code (Christoph Hellwig)
      
      * tag 'dma-mapping-6.12-2024-09-29' of git://git.infradead.org/users/hch/dma-mapping:
        dma-mapping: fix DMA API tracing for chained scatterlists
    • Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 3ed7df08
      Linus Torvalds authored
      Pull more SCSI updates from James Bottomley:
       "These are mostly minor updates.
      
        There are two drivers (lpfc and mpi3mr) which missed the initial
        pull and a core change to retry a start/stop unit which affects
        suspend/resume"
      
      * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (32 commits)
        scsi: lpfc: Update lpfc version to 14.4.0.5
        scsi: lpfc: Support loopback tests with VMID enabled
        scsi: lpfc: Revise TRACE_EVENT log flag severities from KERN_ERR to KERN_WARNING
        scsi: lpfc: Ensure DA_ID handling completion before deleting an NPIV instance
        scsi: lpfc: Fix kref imbalance on fabric ndlps from dev_loss_tmo handler
        scsi: lpfc: Restrict support for 32 byte CDBs to specific HBAs
        scsi: lpfc: Update phba link state conditional before sending CMF_SYNC_WQE
        scsi: lpfc: Add ELS_RSP cmd to the list of WQEs to flush in lpfc_els_flush_cmd()
        scsi: mpi3mr: Update driver version to 8.12.0.0.50
        scsi: mpi3mr: Improve wait logic while controller transitions to READY state
        scsi: mpi3mr: Update MPI Headers to revision 34
        scsi: mpi3mr: Use firmware-provided timestamp update interval
        scsi: mpi3mr: Enhance the Enable Controller retry logic
        scsi: sd: Fix off-by-one error in sd_read_block_characteristics()
        scsi: pm8001: Do not overwrite PCI queue mapping
        scsi: scsi_debug: Remove a useless memset()
        scsi: pmcraid: Convert comma to semicolon
        scsi: sd: Retry START STOP UNIT commands
        scsi: mpi3mr: A performance fix
        scsi: ufs: qcom: Update MODE_MAX cfg_bw value
        ...
    • Merge tag 'bcachefs-2024-09-28' of git://evilpiepirate.org/bcachefs · 9f9a5347
      Linus Torvalds authored
      Pull more bcachefs updates from Kent Overstreet:
       "Assorted minor syzbot fixes, and for bigger stuff:
      
        Fix two disk accounting rewrite bugs:
      
         - Disk accounting keys use the version field of bkey so that journal
           replay can tell which updates have been applied to the btree.
      
           This is set in the transaction commit path, after we've gotten our
           journal reservation (and our time ordering), but the
           BCH_TRANS_COMMIT_skip_accounting_apply flag that journal replay
           uses was incorrectly skipping this for new updates generated prior
           to journal replay.
      
           This fixes the underlying cause of an assertion pop in
           disk_accounting_read.
      
         - A couple of fixes for disk accounting + device removal.
      
           Checking if accounting replicas entries were marked in the
           superblock was being done at the wrong point, when deltas in the
           journal could still zero them out, and then additionally we'd try
           to add a missing replicas entry to the superblock without checking
           if it referred to an invalid (removed) device.
      
        A whole slew of repair fixes:
      
         - fix infinite loop in propagate_key_to_snapshot_leaves(), this fixes
           an infinite loop when repairing a filesystem with many snapshots
      
         - fix incorrect transaction restart handling leading to occasional
           "fsck counted ..." warnings
      
         - fix warning in __bch2_fsck_err() for bkey fsck errors
      
         - check_inode() in fsck now correctly checks if the filesystem was
           clean
      
         - there shouldn't be pending logged ops if the fs was clean, we now
           check for this
      
         - remove_backpointer() doesn't remove a dirent that doesn't actually
           point to the inode
      
         - many more fsck errors are AUTOFIX"
      
      * tag 'bcachefs-2024-09-28' of git://evilpiepirate.org/bcachefs: (35 commits)
        bcachefs: check_subvol_path() now prints subvol root inode
        bcachefs: remove_backpointer() now checks if dirent points to inode
        bcachefs: dirent_points_to_inode() now warns on mismatch
        bcachefs: Fix lost wake up
        bcachefs: Check for logged ops when clean
        bcachefs: BCH_FS_clean_recovery
        bcachefs: Convert disk accounting BUG_ON() to WARN_ON()
        bcachefs: Fix BCH_TRANS_COMMIT_skip_accounting_apply
        bcachefs: Check for accounting keys with bversion=0
        bcachefs: rename version -> bversion
        bcachefs: Don't delete unlinked inodes before logged op resume
        bcachefs: Fix BCH_SB_ERRS() so we can reorder
        bcachefs: Fix fsck warnings from bkey validation
        bcachefs: Move transaction commit path validation to as late as possible
        bcachefs: Fix disk accounting attempting to mark invalid replicas entry
        bcachefs: Fix unlocked access to c->disk_sb.sb in bch2_replicas_entry_validate()
        bcachefs: Fix accounting read + device removal
        bcachefs: bch_accounting_mode
        bcachefs: fix transaction restart handling in check_extents(), check_dirents()
        bcachefs: kill inode_walker_entry.seen_this_pos
        ...
    • Merge tag 'x86-urgent-2024-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · d37421e6
      Linus Torvalds authored
      Pull x86 fixes from Ingo Molnar:
       "Fix TDX MMIO #VE fault handling, and add two new Intel model numbers
        for 'Pantherlake' and 'Diamond Rapids'"
      
      * tag 'x86-urgent-2024-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/cpu: Add two Intel CPU model numbers
        x86/tdx: Fix "in-kernel MMIO" check
    • Merge tag 'locking-urgent-2024-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · ec03de73
      Linus Torvalds authored
      Pull locking updates from Ingo Molnar:
       "lockdep:
          - Fix potential deadlock between lockdep and RCU (Zhiguo Niu)
          - Use str_plural() to address Coccinelle warning (Thorsten Blum)
          - Add debuggability enhancement (Luis Claudio R. Goncalves)
      
        static keys & calls:
          - Fix static_key_slow_dec() yet again (Peter Zijlstra)
          - Handle module init failure correctly in static_call_del_module()
            (Thomas Gleixner)
          - Replace pointless WARN_ON() in static_call_module_notify() (Thomas
            Gleixner)
      
        <linux/cleanup.h>:
          - Add usage and style documentation (Dan Williams)
      
        rwsems:
          - Move is_rwsem_reader_owned() and rwsem_owner() under
            CONFIG_DEBUG_RWSEMS (Waiman Long)
      
        atomic ops, x86:
          - Redeclare x86_32 arch_atomic64_{add,sub}() as void (Uros Bizjak)
          - Introduce the read64_nonatomic macro to x86_32 with cx8 (Uros
            Bizjak)"
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      
      * tag 'locking-urgent-2024-09-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        locking/rwsem: Move is_rwsem_reader_owned() and rwsem_owner() under CONFIG_DEBUG_RWSEMS
        jump_label: Fix static_key_slow_dec() yet again
        static_call: Replace pointless WARN_ON() in static_call_module_notify()
        static_call: Handle module init failure correctly in static_call_del_module()
        locking/lockdep: Simplify character output in seq_line()
        lockdep: fix deadlock issue between lockdep and rcu
        lockdep: Use str_plural() to fix Coccinelle warning
        cleanup: Add usage and style documentation
        lockdep: suggest the fix for "lockdep bfs error:-1" on print_bfs_bug
        locking/atomic/x86: Redeclare x86_32 arch_atomic64_{add,sub}() as void
        locking/atomic/x86: Introduce the read64_nonatomic macro to x86_32 with cx8
    • Merge tag 'cocci-for-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/jlawall/linux · 68e4b0e0
      Linus Torvalds authored
      Pull coccinelle updates from Julia Lawall:
       "Extend string_choices.cocci to use more available helpers
      
        Ten patches from Hongbo Li extending string_choices.cocci with the
        complete set of functions offered by include/linux/string_choices.h.
      
        One patch from myself reducing the number of redundant cases that are
        checked by Coccinelle, giving a small performance improvement"
      
      * tag 'cocci-for-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/jlawall/linux:
        Reduce Coccinelle choices in string_choices.cocci
        coccinelle: Remove unnecessary parentheses for only one possible change.
        coccinelle: Add rules to find str_yes_no() replacements
        coccinelle: Add rules to find str_on_off() replacements
        coccinelle: Add rules to find str_write_read() replacements
        coccinelle: Add rules to find str_read_write() replacements
        coccinelle: Add rules to find str_enable{d}_disable{d}() replacements
        coccinelle: Add rules to find str_lo{w}_hi{gh}() replacements
        coccinelle: Add rules to find str_hi{gh}_lo{w}() replacements
        coccinelle: Add rules to find str_false_true() replacements
        coccinelle: Add rules to find str_true_false() replacements
    • Merge tag 'linux_kselftest-next-6.12-rc1-fixes' of... · e7ebdb51
      Linus Torvalds authored
      Merge tag 'linux_kselftest-next-6.12-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
      
      Pull kselftest fix from Shuah Khan:
       "One urgent fix to vDSO as automated testing is failing due to this
        bug"
      
      * tag 'linux_kselftest-next-6.12-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
        selftests: vDSO: align stack for O2-optimized memcpy
    • Merge branch 'locking/core' into locking/urgent, to pick up pending commits · ae39e0bd
      Ingo Molnar authored
      Merge all pending locking commits into a single branch.
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  6. 28 Sep, 2024 12 commits
    • Reduce Coccinelle choices in string_choices.cocci · 4003ba66
      Julia Lawall authored
      The isomorphism neg_if_exp negates the test of a ?: conditional,
      making it unnecessary to have an explicit case for a negated test
      with the branches inverted.
      
      At the same time, we can disable neg_if_exp in cases where a
      different API function may be more suitable for a negated test.
      
      Finally, in the non-patch cases, E matches an expression with
      parentheses around it, so there is no need to mention ()
      explicitly in the pattern.  The () are still needed in the patch
      cases, because we want to drop them, if they are present.
      Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
    • coccinelle: Remove unnecessary parentheses for only one possible change. · f584e375
      Hongbo Li authored
      The parentheses are only needed if there is a disjunction, i.e. a
      set of possible changes. If there is only one pattern, we can
      remove these parentheses. Just like the format:
      
        -  x
        +  y
      
      not:
      
        (
        -  x
        +  y
        )
      Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
      Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
    • coccinelle: Add rules to find str_yes_no() replacements · 253244cd
      Hongbo Li authored
      As was done for the other rules, add rules for str_yes_no()
      to find the relevant replacement opportunities.
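For context, the kind of open-coded conditional that such a rule looks for, and the helper call it would suggest instead, can be sketched in userspace C. `demo_str_yes_no()` is a local stand-in written for this example; it assumes the kernel helper in include/linux/string_choices.h behaves like the ternary shown, and the two `demo_describe_*` functions are hypothetical callers.

```c
#include <stdbool.h>

/* Userspace stand-in for the kernel's str_yes_no() helper. */
static inline const char *demo_str_yes_no(bool v)
{
    return v ? "yes" : "no";
}

/* Before: the open-coded conditional a rule would match. */
static const char *demo_describe_open_coded(bool enabled)
{
    return enabled ? "yes" : "no";
}

/* After: the same result expressed through the helper. */
static const char *demo_describe_with_helper(bool enabled)
{
    return demo_str_yes_no(enabled);
}
```

Both functions return the same string for either input, which is what makes the replacement mechanical enough to automate.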
      Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
      Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
    • coccinelle: Add rules to find str_on_off() replacements · 9b5b4810
      Hongbo Li authored
      As other rules done, we add rules for str_on_off()
      to check the relative opportunities.
      Signed-off-by: default avatarHongbo Li <lihongbo22@huawei.com>
      Signed-off-by: default avatarJulia Lawall <Julia.Lawall@inria.fr>
      9b5b4810
    • coccinelle: Add rules to find str_write_read() replacements · c81ca023
      Hongbo Li authored
      As is done for the other rules, add rules for str_write_read()
      to find the corresponding replacement opportunities.
      Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
      Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
    • coccinelle: Add rules to find str_read_write() replacements · ba4b514a
      Hongbo Li authored
      As is done for the other rules, add rules for str_read_write()
      to find the corresponding replacement opportunities.
      Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
      Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
    • coccinelle: Add rules to find str_enable{d}_disable{d}() replacements · dd2275d3
      Hongbo Li authored
      As is done for the other rules, add rules for
      str_enable{d}_disable{d}() to find the corresponding
      replacement opportunities.
      Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
      Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
    • coccinelle: Add rules to find str_lo{w}_hi{gh}() replacements · 5b7ca450
      Hongbo Li authored
      As is done for the other rules, add rules for str_lo{w}_hi{gh}()
      to find the corresponding replacement opportunities.
      Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
      Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
    • coccinelle: Add rules to find str_hi{gh}_lo{w}() replacements · d4c75440
      Hongbo Li authored
      As is done for the other rules, add rules for str_hi{gh}_lo{w}()
      to find the corresponding replacement opportunities.
      Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
      Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
    • coccinelle: Add rules to find str_false_true() replacements · 8a0236ba
      Hongbo Li authored
      As done with str_true_false(), add checks for str_false_true()
      opportunities. A simple test finds more than 9 cases that
      currently exist in the tree.
      Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
      Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
    • coccinelle: Add rules to find str_true_false() replacements · 716bf84e
      Hongbo Li authored
      After str_true_false() has been introduced in the tree,
      we can add rules for finding places where str_true_false()
      can be used. A simple test can find over 10 locations.
      Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
      Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
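For orientation, the kind of rewrite these string_choices rules look for can be sketched in plain C. The helper bodies below are userspace stand-ins written for this sketch (the kernel ships its own definitions, which may differ in detail), and the describe_*() and check_*() functions are hypothetical names, not taken from any of the commits.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Userspace stand-ins for this sketch; not copied from the kernel. */
static const char *str_true_false(bool v)
{
	return v ? "true" : "false";
}

static const char *str_false_true(bool v)
{
	return str_true_false(!v);
}

/* Hypothetical caller, open-coded: the form a rule would report. */
static const char *describe_open_coded(bool ready)
{
	return ready ? "true" : "false";
}

/* The same caller after the suggested replacement. */
static const char *describe_with_helper(bool ready)
{
	return str_true_false(ready);
}

/* The rewrite must not change the resulting string for either input. */
static void check_rewrite_is_equivalent(void)
{
	assert(strcmp(describe_open_coded(true), describe_with_helper(true)) == 0);
	assert(strcmp(describe_open_coded(false), describe_with_helper(false)) == 0);
}
```

A negated test such as !ready ? "true" : "false" would map to str_false_true(ready) in the same way, which is why the negated and non-negated cases need care when choosing the helper.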
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 3efc5736
      Linus Torvalds authored
      Pull x86 kvm updates from Paolo Bonzini:
       "x86:
      
         - KVM currently invalidates the entirety of the page tables, not just
           those for the memslot being touched, when a memslot is moved or
           deleted.
      
           This does not traditionally have particularly noticeable overhead,
           but Intel's TDX will require the guest to re-accept private pages
           if they are dropped from the secure EPT, which is a non-starter.
      
           Actually, the only reason why this is not already being done is a
           bug which was never fully investigated and caused VM instability
           with assigned GeForce GPUs, so allow userspace to opt into the new
           behavior.
      
         - Advertise AVX10.1 to userspace (effectively prep work for the
           "real" AVX10 functionality that is on the horizon)
      
         - Rework common MSR handling code to suppress errors on userspace
           accesses to unsupported-but-advertised MSRs
      
           This will allow removing (almost?) all of KVM's exemptions for
           userspace access to MSRs that shouldn't exist based on the vCPU
           model (the actual cleanup is non-trivial future work)
      
         - Rework KVM's handling of x2APIC ICR, again, because AMD (x2AVIC)
           splits the 64-bit value into the legacy ICR and ICR2 storage,
           whereas Intel (APICv) stores the entire 64-bit value at the ICR
           offset
      
         - Fix a bug where KVM would fail to exit to userspace if one was
           triggered by a fastpath exit handler
      
         - Add fastpath handling of HLT VM-Exit to expedite re-entering the
           guest when there's already a pending wake event at the time of the
           exit
      
         - Fix a WARN caused by RSM entering a nested guest from SMM with
           invalid guest state, by forcing the vCPU out of guest mode prior to
           signalling SHUTDOWN (the SHUTDOWN hits the VM altogether, not the
           nested guest)
      
         - Overhaul the "unprotect and retry" logic to more precisely identify
           cases where retrying is actually helpful, and to harden all retry
           paths against putting the guest into an infinite retry loop
      
         - Add support for yielding, e.g. to honor NEED_RESCHED, when zapping
           rmaps in the shadow MMU
      
         - Refactor pieces of the shadow MMU related to aging SPTEs in
           preparation for adding multi-generation LRU support in KVM
      
         - Don't stuff the RSB after VM-Exit when RETPOLINE=y and AutoIBRS is
           enabled, i.e. when the CPU has already flushed the RSB
      
         - Trace the per-CPU host save area as a VMCB pointer to improve
           readability and clean up the retrieval of the SEV-ES host save area
      
         - Remove unnecessary accounting of temporary nested VMCB related
           allocations
      
         - Set FINAL/PAGE in the page fault error code for EPT violations if
           and only if the GVA is valid. If the GVA is NOT valid, there is no
           guest-side page table walk and so stuffing paging related metadata
           is nonsensical
      
         - Fix a bug where KVM would incorrectly synthesize a nested VM-Exit
           instead of emulating posted interrupt delivery to L2
      
         - Add a lockdep assertion to detect unsafe accesses of vmcs12
           structures
      
         - Harden eVMCS loading against an impossible NULL pointer deref
           (really truly should be impossible)
      
         - Minor SGX fix and a cleanup
      
         - Misc cleanups
      
        Generic:
      
         - Register KVM's cpuhp and syscore callbacks when enabling
           virtualization in hardware, as the sole purpose of said callbacks
           is to disable and re-enable virtualization as needed
      
         - Enable virtualization when KVM is loaded, not right before the
           first VM is created
      
           Together with the previous change, this greatly simplifies the
           logic of the callbacks, because their very existence implies
           virtualization is enabled
      
         - Fix a bug that results in KVM prematurely exiting to userspace for
           coalesced MMIO/PIO in many cases, clean up the related code, and
           add a testcase
      
         - Fix a bug in kvm_clear_guest() where it would trigger a buffer
           overflow _if_ the gpa+len crosses a page boundary, which thankfully
           is guaranteed to not happen in the current code base. Add WARNs in
           more helpers that read/write guest memory to detect similar bugs
      
        Selftests:
      
         - Fix a goof that caused some Hyper-V tests to be skipped when run on
           bare metal, i.e. NOT in a VM
      
         - Add a regression test for KVM's handling of SHUTDOWN for an SEV-ES
           guest
      
         - Explicitly include one-off assets in .gitignore. Past Sean was
           completely wrong about not being able to detect missing .gitignore
           entries
      
         - Verify userspace single-stepping works when KVM happens to handle a
           VM-Exit in its fastpath
      
         - Misc cleanups"
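The kvm_clear_guest() item in the Generic list above turns on whether a gpa+len range stays within one page. As a rough illustration only, the sketch below splits such a range into per-page chunks. It does not reproduce KVM's code: the sketch_* names are hypothetical, and a 4 KiB page size is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u /* assumption: 4 KiB pages */

/* Bytes reachable from gpa without leaving its page, capped at len. */
static size_t sketch_chunk_len(uint64_t gpa, size_t len)
{
	size_t room = SKETCH_PAGE_SIZE - (size_t)(gpa & (SKETCH_PAGE_SIZE - 1));

	return len < room ? len : room;
}

/* Number of per-page chunks that the range [gpa, gpa + len) splits into. */
static unsigned int sketch_count_chunks(uint64_t gpa, size_t len)
{
	unsigned int chunks = 0;

	while (len > 0) {
		size_t seg = sketch_chunk_len(gpa, len);

		/* Each step must make progress, or the loop would never end. */
		assert(seg > 0);
		gpa += seg;
		len -= seg;
		chunks++;
	}
	return chunks;
}
```

A range that starts at offset 4000 with length 200 yields a first chunk of 96 bytes and a second of 104; a helper that handled all 200 bytes as if they lay in the first page would run past the page end.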
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (127 commits)
        Documentation: KVM: fix warning in "make htmldocs"
        s390: Enable KVM_S390_UCONTROL config in debug_defconfig
        selftests: kvm: s390: Add VM run test case
        KVM: SVM: let alternatives handle the cases when RSB filling is required
        KVM: VMX: Set PFERR_GUEST_{FINAL,PAGE}_MASK if and only if the GVA is valid
        KVM: x86/mmu: Use KVM_PAGES_PER_HPAGE() instead of an open coded equivalent
        KVM: x86/mmu: Add KVM_RMAP_MANY to replace open coded '1' and '1ul' literals
        KVM: x86/mmu: Fold mmu_spte_age() into kvm_rmap_age_gfn_range()
        KVM: x86/mmu: Morph kvm_handle_gfn_range() into an aging specific helper
        KVM: x86/mmu: Honor NEED_RESCHED when zapping rmaps and blocking is allowed
        KVM: x86/mmu: Add a helper to walk and zap rmaps for a memslot
        KVM: x86/mmu: Plumb a @can_yield parameter into __walk_slot_rmaps()
        KVM: x86/mmu: Move walk_slot_rmaps() up near for_each_slot_rmap_range()
        KVM: x86/mmu: WARN on MMIO cache hit when emulating write-protected gfn
        KVM: x86/mmu: Detect if unprotect will do anything based on invalid_list
        KVM: x86/mmu: Subsume kvm_mmu_unprotect_page() into the and_retry() version
        KVM: x86: Rename reexecute_instruction()=>kvm_unprotect_and_retry_on_failure()
        KVM: x86: Update retry protection fields when forcing retry on emulation failure
        KVM: x86: Apply retry protection to "unprotect on failure" path
        KVM: x86: Check EMULTYPE_WRITE_PF_TO_SP before unprotecting gfn
        ...