1. 15 May, 2020 6 commits
    • KVM: x86/mmu: Add a helper to consolidate root sp allocation · 8123f265
      Sean Christopherson authored
      Add a helper, mmu_alloc_root(), to consolidate the allocation of a root
      shadow page, which has the same basic mechanics for all flavors of TDP
      and shadow paging.
      
      Note, __pa(sp->spt) doesn't need to be protected by mmu_lock, as
      sp->spt points at a kernel page.
      
      No functional change intended.
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Message-Id: <20200428023714.31923-1-sean.j.christopherson@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86/mmu: Drop KVM's hugepage enums in favor of the kernel's enums · 3bae0459
      Sean Christopherson authored
      Replace KVM's PT_PAGE_TABLE_LEVEL, PT_DIRECTORY_LEVEL and PT_PDPE_LEVEL
      with the kernel's PG_LEVEL_4K, PG_LEVEL_2M and PG_LEVEL_1G.  KVM's
      enums are borderline impossible to remember and result in code that is
      visually difficult to audit, e.g.
      
              if (!enable_ept)
                      ept_lpage_level = 0;
              else if (cpu_has_vmx_ept_1g_page())
                      ept_lpage_level = PT_PDPE_LEVEL;
              else if (cpu_has_vmx_ept_2m_page())
                      ept_lpage_level = PT_DIRECTORY_LEVEL;
              else
                      ept_lpage_level = PT_PAGE_TABLE_LEVEL;
      
      versus
      
              if (!enable_ept)
                      ept_lpage_level = 0;
              else if (cpu_has_vmx_ept_1g_page())
                      ept_lpage_level = PG_LEVEL_1G;
              else if (cpu_has_vmx_ept_2m_page())
                      ept_lpage_level = PG_LEVEL_2M;
              else
                      ept_lpage_level = PG_LEVEL_4K;
      
      No functional change intended.
      Suggested-by: Barret Rhoden <brho@google.com>
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Message-Id: <20200428005422.4235-4-sean.j.christopherson@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86/mmu: Move max hugepage level to a separate #define · e662ec3e
      Sean Christopherson authored
      Rename PT_MAX_HUGEPAGE_LEVEL to KVM_MAX_HUGEPAGE_LEVEL and make it a
      separate define in anticipation of dropping KVM's PT_*_LEVEL enums in
      favor of the kernel's PG_LEVEL_* enums.
      
      No functional change intended.
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Message-Id: <20200428005422.4235-3-sean.j.christopherson@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86/mmu: Tweak PSE hugepage handling to avoid 2M vs 4M conundrum · b2f432f8
      Sean Christopherson authored
      Change the PSE hugepage handling in walk_addr_generic() to fire on any
      page level greater than PT_PAGE_TABLE_LEVEL, a.k.a. PG_LEVEL_4K.  PSE
      paging only has two levels, so "== 2" and "> 1" are functionally the
      same, i.e. this is a nop.
      
      A future patch will drop KVM's PT_*_LEVEL enums in favor of the kernel's
      PG_LEVEL_* enums, at which point "walker->level == PG_LEVEL_2M" is
      semantically incorrect (though still functionally ok).
      
      No functional change intended.
      Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Message-Id: <20200428005422.4235-2-sean.j.christopherson@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • kvm: x86: Cleanup vcpu->arch.guest_xstate_size · a71936ab
      Xiaoyao Li authored
      vcpu->arch.guest_xstate_size lost its only user since commit df1daba7
      ("KVM: x86: support XSAVES usage in the host"), so clean it up.
      Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
      Message-Id: <20200429154312.1411-1-xiaoyao.li@intel.com>
      Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: nVMX: Tweak handling of failure code for nested VM-Enter failure · 68cda40d
      Sean Christopherson authored
      Use an enum for passing around the failure code for a failed VM-Enter
      that results in VM-Exit to provide a level of indirection from the final
      resting place of the failure code, vmcs.EXIT_QUALIFICATION.  The exit
      qualification field is an unsigned long, e.g. passing around
      'u32 exit_qual' throws up red flags as it suggests KVM may be dropping
      bits when reporting errors to L1.  This is a red herring, because the
      only defined failure codes are 0, 2, 3, and 4, i.e. they don't come
      remotely close to overflowing a u32.
      
      Setting vmcs.EXIT_QUALIFICATION on entry failure is further complicated
      by the MSR load list, which returns the (1-based) entry that failed, and
      the number of MSRs to load is a 32-bit VMCS field.  At first blush, it
      would appear that overflowing a u32 is possible, but the number of MSRs
      that can be loaded is hardcapped at 4096 (limited by MSR_IA32_VMX_MISC).
      
      In other words, there are two completely disparate types of data that
      eventually get stuffed into vmcs.EXIT_QUALIFICATION, neither of which is
      an 'unsigned long' in nature.  This was presumably the reasoning for
      switching to 'u32' when the related code was refactored in commit
      ca0bde28 ("kvm: nVMX: Split VMCS checks from nested_vmx_run()").
      
      Using an enum for the failure code addresses the technically-possible-
      but-will-never-happen scenario where Intel defines a failure code that
      doesn't fit in a 32-bit integer.  The enum variables and values will
      either be automatically sized (gcc 5.4 behavior) or be subjected to some
      combination of truncation.  The former case will simply work, while the
      latter will trigger a compile-time warning unless the compiler is being
      particularly unhelpful.
      
      Separating the failure code from the failed MSR entry allows for
      disassociating both from vmcs.EXIT_QUALIFICATION, which avoids the
      conundrum where KVM has to choose between 'u32 exit_qual' and tracking
      values as 'unsigned long' that have no business being tracked as such.
      To cement the split, set vmcs12->exit_qualification directly from the
      entry error code or failed MSR index instead of bouncing through a local
      variable.
      
      Opportunistically rename the variables in load_vmcs12_host_state() and
      vmx_set_nested_state() to call out that they're ignored, set exit_reason
      on demand on nested VM-Enter failure, and add a comment in
      nested_vmx_load_msr() to call out that returning 'i + 1' can't wrap.
      
      No functional change intended.
      Reported-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Cc: Jim Mattson <jmattson@google.com>
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Message-Id: <20200511220529.11402-1-sean.j.christopherson@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  2. 13 May, 2020 34 commits