1. 13 Jul, 2022 4 commits
  2. 11 Jul, 2022 24 commits
  3. 24 Jun, 2022 12 commits
    • KVM: selftests: Enhance handling WRMSR ICR register in x2APIC mode · 4b88b1a5
      Zeng Guang authored
      In some circumstances, e.g. when Intel IPI virtualization is enabled,
      hardware writes the x2APIC ICR register directly instead of going
      through software emulation. This behavior makes the normal
      reserved-bit checks apply: the reserved bits must be written as zero,
      otherwise a #GP is raised. So we need to mask those reserved bits out
      of the data written to the vICR register.
      
      Remove the Delivery Status bit emulation from the test case, as this
      flag is invalid and not needed in x2APIC mode. KVM may skip clearing
      it during interrupt dispatch, which would lead to a spurious test
      failure.
      
      Opportunistically correct the vector number for the test that sends
      an IPI to non-existent vCPUs.
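      The masking described above can be sketched as follows. The valid-bit
      layout is taken from the Intel SDM's x2APIC ICR format; the macro and
      function names are illustrative, not the selftest's actual identifiers.

```c
#include <stdint.h>

/*
 * Illustrative sketch of the reserved-bit masking described above.
 * Valid x2APIC ICR fields (per the Intel SDM register layout):
 * vector [7:0], delivery mode [10:8], destination mode [11],
 * level [14], trigger mode [15], destination shorthand [19:18],
 * destination [63:32]. Everything else, including the Delivery
 * Status bit (12), is reserved and must be written as zero to
 * avoid a #GP.
 */
#define X2APIC_ICR_VALID_MASK					\
	(0xffULL | (7ULL << 8) | (1ULL << 11) | (1ULL << 14) |	\
	 (1ULL << 15) | (3ULL << 18) | (0xffffffffULL << 32))

static inline uint64_t x2apic_sanitize_icr(uint64_t icr)
{
	/* Clear all reserved bits, including Delivery Status (bit 12). */
	return icr & X2APIC_ICR_VALID_MASK;
}
```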
      Signed-off-by: Zeng Guang <guang.zeng@intel.com>
      Message-Id: <20220623094511.26066-1-guang.zeng@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: selftests: Add a self test for CMCI and UCNA emulations. · eede2065
      Jue Wang authored
      This patch adds a self test that verifies that user space can inject
      UnCorrectable No Action required (UCNA) memory errors into the guest.
      It also verifies that incorrectly configured MSRs for Corrected
      Machine Check Interrupt (CMCI) emulation result in a #GP.
      Signed-off-by: Jue Wang <juew@google.com>
      Message-Id: <20220610171134.772566-9-juew@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: Enable CMCI capability by default and handle injected UCNA errors · aebc3ca1
      Jue Wang authored
      This patch enables MCG_CMCI_P by default in kvm_mce_cap_supported. It
      reuses the KVM_X86_SET_MCE ioctl to implement injection of
      UnCorrectable No Action required (UCNA) errors, signaled via
      Corrected Machine Check Interrupt (CMCI).
      
      Neither the CMCI nor the UCNA emulation depends on hardware.
      Signed-off-by: Jue Wang <juew@google.com>
      Message-Id: <20220610171134.772566-8-juew@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: Add emulation for MSR_IA32_MCx_CTL2 MSRs. · 281b5278
      Jue Wang authored
      This patch adds emulation of the IA32_MCi_CTL2 registers to KVM. A
      separate mci_ctl2_banks array is used to keep the existing mce_banks
      register layout intact.
      
      In the Machine Check Architecture, in addition to MCG_CMCI_P, bit 30
      of the per-bank register IA32_MCi_CTL2 controls whether Corrected
      Machine Check error reporting is enabled.
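      The per-bank control described above can be sketched like this; the
      bit positions follow the SDM's IA32_MCi_CTL2 layout, while the helper
      name is illustrative, not KVM's.

```c
#include <stdint.h>

/*
 * Sketch of the per-bank control described above. In IA32_MCi_CTL2,
 * bit 30 (CMCI_EN) enables Corrected Machine Check Interrupt
 * signaling for the bank, and bits 14:0 hold the corrected-error
 * count threshold.
 */
#define MCI_CTL2_CMCI_EN		(1ULL << 30)
#define MCI_CTL2_CMCI_THRESHOLD_MASK	0x7fffULL

static inline int cmci_enabled(uint64_t ctl2)
{
	return (ctl2 & MCI_CTL2_CMCI_EN) != 0;
}
```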
      Signed-off-by: Jue Wang <juew@google.com>
      Message-Id: <20220610171134.772566-7-juew@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: Use kcalloc to allocate the mce_banks array. · 087acc4e
      Jue Wang authored
      This patch updates the allocation of mce_banks with the array allocation
      API (kcalloc) as a precedent for the later mci_ctl2_banks to implement
      per-bank control of Corrected Machine Check Interrupt (CMCI).
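      As a userspace analogue of this change, the pattern looks like the
      following; the struct layout and names are stand-ins, not the
      kernel's mce_banks definition.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Illustrative stand-in for a per-bank MCE register block. */
struct mce_bank {
	uint64_t ctl, status, addr, misc;
};

/*
 * calloc() (the userspace counterpart of kcalloc) checks the
 * nbanks * sizeof() multiplication for overflow and returns
 * zeroed memory, unlike a hand-multiplied malloc().
 */
static struct mce_bank *alloc_banks(size_t nbanks)
{
	return calloc(nbanks, sizeof(struct mce_bank));
}
```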
      Suggested-by: Sean Christopherson <seanjc@google.com>
      Signed-off-by: Jue Wang <juew@google.com>
      Message-Id: <20220610171134.772566-6-juew@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: Add Corrected Machine Check Interrupt (CMCI) emulation to lapic. · 4b903561
      Jue Wang authored
      This patch calculates the number of LVT entries as part of
      KVM_X86_SETUP_MCE, conditioned on the presence of the MCG_CMCI_P bit
      in MCG_CAP, and stores the result in kvm_lapic. It translates from an
      APIC_LVTx register to an index in the lapic_lvt_entry enum. It
      extends the APIC_LVTx macro, as well as other lapic write/reset
      handling, to support the Corrected Machine Check Interrupt.
      Signed-off-by: Jue Wang <juew@google.com>
      Message-Id: <20220610171134.772566-5-juew@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: Add APIC_LVTx() macro. · 987f625e
      Jue Wang authored
      An APIC_LVTx macro is introduced to calculate the APIC_LVTx register
      offset based on the index in the lapic_lvt_entry enum. Later patches
      will extend the APIC_LVTx macro to support the APIC_LVTCMCI register
      in order to implement Corrected Machine Check Interrupt signaling.
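      The macro's shape can be sketched as follows. The offsets follow the
      local APIC register layout (the LVT registers sit at consecutive
      0x10-spaced offsets starting at the timer entry at 0x320); the enum
      names are illustrative, not the kernel's exact lapic_lvt_entry
      identifiers.

```c
/*
 * Sketch of the macro described above: an index into the LVT-entry
 * enum maps directly to a register offset because the LVT registers
 * are laid out consecutively, 0x10 apart, starting at the timer.
 */
#define APIC_LVTT	0x320	/* LVT timer register offset */

enum lvt_entry {
	LVT_TIMER,
	LVT_THERMAL,
	LVT_PERF,
	LVT_LINT0,
	LVT_LINT1,
	LVT_ERROR,
};

#define APIC_LVTx(i)	(APIC_LVTT + 0x10 * (i))
```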
      Suggested-by: Sean Christopherson <seanjc@google.com>
      Signed-off-by: Jue Wang <juew@google.com>
      Message-Id: <20220610171134.772566-4-juew@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: Fill apic_lvt_mask with enums / explicit entries. · 1d8c681f
      Jue Wang authored
      This patch defines a lapic_lvt_entry enum used as explicit indices
      into the apic_lvt_mask array. In later patches an LVT_CMCI entry will
      be added to implement Corrected Machine Check Interrupt signaling.
      Suggested-by: Sean Christopherson <seanjc@google.com>
      Signed-off-by: Jue Wang <juew@google.com>
      Message-Id: <20220610171134.772566-3-juew@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: Make APIC_VERSION capture only the magic 0x14UL. · 951ceb94
      Jue Wang authored
      Refactor APIC_VERSION so that the maximum number of LVT entries is
      inserted at runtime rather than compile time. This will be used in a
      subsequent commit to expose the LVT CMCI Register to VMs that support
      Corrected Machine Check error counting/signaling
      (IA32_MCG_CAP.MCG_CMCI_P=1).
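      The refactor can be sketched as follows. The field layout is the
      APIC version register's (version in bits 7:0, "Max LVT Entry" in
      bits 23:16); the function name is illustrative, not KVM's.

```c
/*
 * Sketch of the refactor described above: APIC_VERSION keeps only
 * the magic version 0x14, while the "Max LVT Entry" field (number
 * of LVT entries minus one, bits 23:16) is filled in at runtime.
 */
#define APIC_VERSION	0x14UL

static inline unsigned long apic_version(unsigned int nr_lvt_entries)
{
	return APIC_VERSION | ((unsigned long)(nr_lvt_entries - 1) << 16);
}
```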
      Suggested-by: Sean Christopherson <seanjc@google.com>
      Signed-off-by: Jue Wang <juew@google.com>
      Message-Id: <20220610171134.772566-2-juew@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86/mmu: Avoid unnecessary flush on eager page split · 03787394
      Paolo Bonzini authored
      The TLB flush before installing the newly-populated lower-level
      page table is unnecessary if the lower-level page table maps
      the huge page identically. KVM knows that it does if it did not
      reuse an existing shadow page table; tell drop_large_spte() to skip
      the flush in that case.
      
      Extracted from a patch by David Matlack.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86/mmu: Extend Eager Page Splitting to nested MMUs · ada51a9d
      David Matlack authored
      Add support for Eager Page Splitting of pages that are mapped by
      nested MMUs. Walk through the rmap, first splitting all 1GiB pages to
      2MiB pages, and then splitting all 2MiB pages to 4KiB pages.
      
      Note, Eager Page Splitting is limited to nested MMUs as a policy rather
      than due to any technical reason (the sp->role.guest_mode check could
      just be deleted and Eager Page Splitting would work correctly for all
      shadow MMU pages). There is really no reason to support Eager Page
      Splitting for tdp_mmu=N, since such support will eventually be phased
      out, and there is no current use case for Eager Page Splitting on
      hosts where TDP is either disabled or unavailable in hardware.
      Furthermore, future improvements to nested MMU scalability may diverge
      the code from the legacy shadow paging implementation. These
      improvements will be simpler to make if Eager Page Splitting does not
      have to worry about legacy shadow paging.
      
      Splitting huge pages mapped by nested MMUs requires dealing with some
      extra complexity beyond that of the TDP MMU:
      
      (1) The shadow MMU has a limit on the number of shadow pages that are
          allowed to be allocated. So, as a policy, Eager Page Splitting
          refuses to split if there are KVM_MIN_FREE_MMU_PAGES or fewer
          pages available.
      
      (2) Splitting a huge page may end up re-using an existing lower-level
          shadow page table. This is unlike the TDP MMU, which always
          allocates new shadow page tables when splitting.
      
      (3) When installing the lower level SPTEs, they must be added to the
          rmap which may require allocating additional pte_list_desc structs.
      
      Case (2) is especially interesting since it may require a TLB flush,
      unlike the TDP MMU which can fully split huge pages without any TLB
      flushes. Specifically, an existing lower level page table may point to
      even lower level page tables that are not fully populated, effectively
      unmapping a portion of the huge page, which requires a flush.  As of
      this commit, a flush is always done after dropping the huge page
      and before installing the lower level page table.
      
      This TLB flush could instead be delayed until the MMU lock is about to be
      dropped, which would batch flushes for multiple splits.  However these
      flushes should be rare in practice (a huge page must be aliased in
      multiple SPTEs and have been split for NX Huge Pages in only some of
      them). Flushing immediately is simpler to plumb and also reduces the
      chances of tripping over a CPU bug (e.g. see iTLB multihit).
      
      [ This commit is based off of the original implementation of Eager Page
        Splitting from Peter in Google's kernel from 2016. ]
      Suggested-by: Peter Feiner <pfeiner@google.com>
      Signed-off-by: David Matlack <dmatlack@google.com>
      Message-Id: <20220516232138.1783324-23-dmatlack@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: Allow for different capacities in kvm_mmu_memory_cache structs · 837f66c7
      David Matlack authored
      Allow the capacity of the kvm_mmu_memory_cache struct to be chosen at
      declaration time rather than being fixed for all declarations. This will
      be used in a follow-up commit to declare a cache in x86 with a capacity
      of 512+ objects without having to increase the capacity of all caches in
      KVM.
      
      This change requires that each cache now specify its capacity at runtime,
      since the cache struct itself no longer has a fixed capacity known at
      compile time. To protect against someone accidentally defining a
      kvm_mmu_memory_cache struct directly (without the extra storage), this
      commit includes a WARN_ON() in kvm_mmu_topup_memory_cache().
      
      In order to support different capacities, this commit changes the
      objects pointer array to be dynamically allocated the first time the
      cache is topped-up.
      
      While here, opportunistically clean up the stack-allocated
      kvm_mmu_memory_cache structs in riscv and arm64 to use designated
      initializers.
      
      No functional change intended.
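      The scheme can be sketched with a simplified stand-in for
      kvm_mmu_memory_cache (the struct and function names here are
      illustrative, not the kernel's): the capacity is chosen per
      declaration via a designated initializer, and the objects array is
      allocated lazily on the first top-up.

```c
#include <stdlib.h>

struct memory_cache {
	int capacity;	/* chosen at declaration time */
	int nobjs;
	void **objects;	/* allocated lazily on first top-up */
};

static int cache_topup(struct memory_cache *mc)
{
	if (!mc->objects) {
		if (mc->capacity <= 0)
			return -1; /* the kernel version would WARN here */
		mc->objects = calloc(mc->capacity, sizeof(*mc->objects));
	}
	return mc->objects ? 0 : -1;
}
```

      A declaration then reads, e.g., `struct memory_cache cache = { .capacity = 512 };`,
      with all other fields left zeroed.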
      Reviewed-by: Marc Zyngier <maz@kernel.org>
      Signed-off-by: David Matlack <dmatlack@google.com>
      Message-Id: <20220516232138.1783324-22-dmatlack@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>