1. 04 Feb, 2021 27 commits
  2. 03 Feb, 2021 2 commits
    • Paolo Bonzini's avatar
      KVM: x86: cleanup CR3 reserved bits checks · c1c35cf7
      Paolo Bonzini authored
      If not in long mode, the low bits of CR3 are reserved but not enforced to
      be zero, so remove those checks.  If in long mode, however, the MBZ bits
      extend down to the highest physical address bit of the guest, excluding
      the encryption bit.
      
      Make the checks consistent with the above, and match them between
      nested_vmcb_checks and KVM_SET_SREGS.
      
      Cc: stable@vger.kernel.org
      Fixes: 761e4169 ("KVM: nSVM: Check that MBZ bits in CR3 and CR4 are not set on vmrun of nested guests")
      Fixes: a780a3ea ("KVM: X86: Fix reserved bits check for MOV to CR3")
      Reviewed-by: default avatarSean Christopherson <seanjc@google.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      c1c35cf7
    • Sean Christopherson's avatar
      KVM: SVM: Treat SVM as unsupported when running as an SEV guest · ccd85d90
      Sean Christopherson authored
      Don't let KVM load when running as an SEV guest, regardless of what
      CPUID says.  Memory is encrypted with a key that is not accessible to
      the host (L0), thus it's impossible for L0 to emulate SVM, e.g. it'll
      see garbage when reading the VMCB.
      
      Technically, KVM could decrypt all memory that needs to be accessible to
      the L0 and use shadow paging so that L0 does not need to shadow NPT, but
      exposing such information to L0 largely defeats the purpose of running as
      an SEV guest.  This can always be revisited if someone comes up with a
      use case for running VMs inside SEV guests.
      
      Note, VMLOAD, VMRUN, etc... will also #GP on GPAs with C-bit set, i.e. KVM
      is doomed even if the SEV guest is debuggable and the hypervisor is willing
      to decrypt the VMCB.  This may or may not be fixed on CPUs that have the
      SVME_ADDR_CHK fix.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarSean Christopherson <seanjc@google.com>
      Message-Id: <20210202212017.2486595-1-seanjc@google.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      ccd85d90
  3. 02 Feb, 2021 1 commit
    • Sean Christopherson's avatar
      KVM: x86: Update emulator context mode if SYSENTER xfers to 64-bit mode · 943dea8a
      Sean Christopherson authored
      Set the emulator context to PROT64 if SYSENTER transitions from 32-bit
      userspace (compat mode) to a 64-bit kernel, otherwise the RIP update at
      the end of x86_emulate_insn() will incorrectly truncate the new RIP.
      
      Note, this bug is mostly limited to running an Intel virtual CPU model on
      an AMD physical CPU, as other combinations of virtual and physical CPUs
      do not trigger full emulation.  On Intel CPUs, SYSENTER in compatibility
      mode is legal, and unconditionally transitions to 64-bit mode.  On AMD
      CPUs, SYSENTER is illegal in compatibility mode and #UDs.  If the vCPU is
      AMD, KVM injects a #UD on SYSENTER in compat mode.  If the pCPU is Intel,
      SYSENTER will execute natively and not trigger #UD->VM-Exit (ignoring
      guest TLB shenanigans).
      
      Fixes: fede8076 ("KVM: x86: handle wrap around 32-bit address space")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarJonny Barker <jonny@jonnybarker.com>
      [sean: wrote changelog]
      Signed-off-by: default avatarSean Christopherson <seanjc@google.com>
      Message-Id: <20210202165546.2390296-1-seanjc@google.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      943dea8a
  4. 01 Feb, 2021 3 commits
    • Vitaly Kuznetsov's avatar
      KVM: x86: Supplement __cr4_reserved_bits() with X86_FEATURE_PCID check · 4683d758
      Vitaly Kuznetsov authored
      Commit 7a873e45 ("KVM: selftests: Verify supported CR4 bits can be set
      before KVM_SET_CPUID2") reveals that KVM allows to set X86_CR4_PCIDE even
      when PCID support is missing:
      
      ==== Test Assertion Failure ====
        x86_64/set_sregs_test.c:41: rc
        pid=6956 tid=6956 - Invalid argument
           1	0x000000000040177d: test_cr4_feature_bit at set_sregs_test.c:41
           2	0x00000000004014fc: main at set_sregs_test.c:119
           3	0x00007f2d9346d041: ?? ??:0
           4	0x000000000040164d: _start at ??:?
        KVM allowed unsupported CR4 bit (0x20000)
      
      Add X86_FEATURE_PCID feature check to __cr4_reserved_bits() to make
      kvm_is_valid_cr4() fail.
      Signed-off-by: default avatarVitaly Kuznetsov <vkuznets@redhat.com>
      Message-Id: <20210201142843.108190-1-vkuznets@redhat.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      4683d758
    • Zheng Zhan Liang's avatar
      KVM/x86: assign hva with the right value to vm_munmap the pages · b66f9bab
      Zheng Zhan Liang authored
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Wanpeng Li <wanpengli@tencent.com>
      Cc: kvm@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: default avatarZheng Zhan Liang <zhengzhanliang@huorong.cn>
      Message-Id: <20210201055310.267029-1-zhengzhanliang@huorong.cn>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      b66f9bab
    • Paolo Bonzini's avatar
      KVM: x86: Allow guests to see MSR_IA32_TSX_CTRL even if tsx=off · 7131636e
      Paolo Bonzini authored
      Userspace that does not know about KVM_GET_MSR_FEATURE_INDEX_LIST
      will generally use the default value for MSR_IA32_ARCH_CAPABILITIES.
      When this happens and the host has tsx=on, it is possible to end up with
      virtual machines that have HLE and RTM disabled, but TSX_CTRL available.
      
      If the fleet is then switched to tsx=off, kvm_get_arch_capabilities()
      will clear the ARCH_CAP_TSX_CTRL_MSR bit and it will not be possible to
      use the tsx=off hosts as migration destinations, even though the guests
      do not have TSX enabled.
      
      To allow this migration, allow guests to write to their TSX_CTRL MSR,
      while keeping the host MSR unchanged for the entire life of the guests.
      This ensures that TSX remains disabled and also saves MSR reads and
      writes, and it's okay to do because with tsx=off we know that guests will
      not have the HLE and RTM features in their CPUID.  (If userspace sets
      bogus CPUID data, we do not expect HLE and RTM to work in guests anyway).
      
      Cc: stable@vger.kernel.org
      Fixes: cbbaa272 ("KVM: x86: fix presentation of TSX feature in ARCH_CAPABILITIES")
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      7131636e
  5. 28 Jan, 2021 4 commits
  6. 25 Jan, 2021 3 commits
    • Paolo Bonzini's avatar
      KVM: x86: allow KVM_REQ_GET_NESTED_STATE_PAGES outside guest mode for VMX · 9a78e158
      Paolo Bonzini authored
      VMX also uses KVM_REQ_GET_NESTED_STATE_PAGES for the Hyper-V eVMCS,
      which may need to be loaded outside guest mode.  Therefore we cannot
      WARN in that case.
      
      However, that part of nested_get_vmcs12_pages is _not_ needed at
      vmentry time.  Split it out of KVM_REQ_GET_NESTED_STATE_PAGES handling,
      so that both vmentry and migration (and in the latter case, independent
      of is_guest_mode) do the parts that are needed.
      
      Cc: <stable@vger.kernel.org> # 5.10.x: f2c7ef3b: KVM: nSVM: cancel KVM_REQ_GET_NESTED_STATE_PAGES
      Cc: <stable@vger.kernel.org> # 5.10.x
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      9a78e158
    • Sean Christopherson's avatar
      KVM: x86: Revert "KVM: x86: Mark GPRs dirty when written" · aed89418
      Sean Christopherson authored
      Revert the dirty/available tracking of GPRs now that KVM copies the GPRs
      to the GHCB on any post-VMGEXIT VMRUN, even if a GPR is not dirty.  Per
      commit de3cd117 ("KVM: x86: Omit caching logic for always-available
      GPRs"), tracking for GPRs noticeably impacts KVM's code footprint.
      
      This reverts commit 1c04d8c9.
      Signed-off-by: default avatarSean Christopherson <seanjc@google.com>
      Message-Id: <20210122235049.3107620-3-seanjc@google.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      aed89418
    • Sean Christopherson's avatar
      KVM: SVM: Unconditionally sync GPRs to GHCB on VMRUN of SEV-ES guest · 25009140
      Sean Christopherson authored
      Drop the per-GPR dirty checks when synchronizing GPRs to the GHCB, the
      GRPs' dirty bits are set from time zero and never cleared, i.e. will
      always be seen as dirty.  The obvious alternative would be to clear
      the dirty bits when appropriate, but removing the dirty checks is
      desirable as it allows reverting GPR dirty+available tracking, which
      adds overhead to all flavors of x86 VMs.
      
      Note, unconditionally writing the GPRs in the GHCB is tacitly allowed
      by the GHCB spec, which allows the hypervisor (or guest) to provide
      unnecessary info; it's the guest's responsibility to consume only what
      it needs (the hypervisor is untrusted after all).
      
        The guest and hypervisor can supply additional state if desired but
        must not rely on that additional state being provided.
      
      Cc: Brijesh Singh <brijesh.singh@amd.com>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Fixes: 291bd20d ("KVM: SVM: Add initial support for a VMGEXIT VMEXIT")
      Signed-off-by: default avatarSean Christopherson <seanjc@google.com>
      Message-Id: <20210122235049.3107620-2-seanjc@google.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      25009140