    KVM: VMX: Flush all EPTP/VPID contexts on remote TLB flush · e8eff282
    Sean Christopherson authored
    Flush all EPTP/VPID contexts if a TLB flush _may_ have been triggered by
    a remote or deferred TLB flush, i.e. by KVM_REQ_TLB_FLUSH.  Remote TLB
    flushes require all contexts to be invalidated, not just the active
    contexts, e.g. all mappings in all contexts for a given HVA need to be
    invalidated on a mmu_notifier invalidation.  Similarly, the instigator
    of the deferred TLB flush may be expecting all contexts to be flushed,
    e.g. vmx_vcpu_load_vmcs().
    
    Without nested VMX, flushing only the current EPTP/VPID context isn't
    problematic because KVM uses a constant VPID for each vCPU, and
    mmu_alloc_direct_roots() all but guarantees KVM will use a single EPTP
    for L1.  In the rare case where a different EPTP is created or reused,
    KVM (currently) unconditionally flushes the new EPTP context prior to
    entering the guest.
    
    With nested VMX, KVM conditionally uses a different VPID for L2, and
    unconditionally uses a different EPTP for L2.  Because KVM doesn't
    _intentionally_ guarantee L2's EPTP/VPID context is flushed on nested
    VM-Enter, it'd be possible for a malicious L1 to attack the host and/or
    different VMs by exploiting the lack of flushing for L2.
    
      1) Launch nested guest from malicious L1.
    
      2) Nested VM-Enter to L2.
    
      3) L2 accesses target GPA 'g'.  CPU inserts TLB entry tagged with
         L2's ASID mapping 'g' to host PFN 'x'.
    
      4) Nested VM-Exit to L1.
    
      5) L1 triggers kernel same-page merging (KSM) by duplicating/zeroing
         the page for PFN 'x'.
    
      6) Host kernel merges PFN 'x' with PFN 'y', i.e. unmaps PFN 'x' and
         remaps the page to PFN 'y'.  mmu_notifier sends invalidate command,
         KVM flushes TLB only for L1's ASID.
    
      7) Host kernel reallocates PFN 'x' to some other task/guest.
    
      8) Nested VM-Enter to L2.  KVM does not invalidate L2's EPTP or VPID.
    
      9) L2 accesses GPA 'g' and gains read/write access to PFN 'x' via its
         stale TLB entry.
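    
    The sequence above can be replayed against a toy model of an
    ASID-tagged TLB.  This is only an illustrative sketch, not KVM code:
    the toy_tlb structure, the ASID_L1/ASID_L2 tags and every helper
    below are hypothetical names invented for the example, and a single
    integer tag stands in for the real EPTP/VPID context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model, NOT KVM code: entries are tagged by ASID. */
enum { ASID_L1 = 1, ASID_L2 = 2, TLB_SIZE = 8 };

struct tlb_entry {
	bool valid;
	int asid;		/* stands in for the EPTP/VPID context */
	unsigned long gpa;
	unsigned long pfn;
};

struct toy_tlb {
	struct tlb_entry e[TLB_SIZE];
};

static void tlb_insert(struct toy_tlb *t, int asid,
		       unsigned long gpa, unsigned long pfn)
{
	for (size_t i = 0; i < TLB_SIZE; i++) {
		if (!t->e[i].valid) {
			t->e[i] = (struct tlb_entry){ true, asid, gpa, pfn };
			return;
		}
	}
	assert(0 && "toy TLB full");
}

/* Returns true and stores the cached PFN on a hit for (asid, gpa). */
static bool tlb_lookup(const struct toy_tlb *t, int asid,
		       unsigned long gpa, unsigned long *pfn)
{
	for (size_t i = 0; i < TLB_SIZE; i++) {
		if (t->e[i].valid && t->e[i].asid == asid &&
		    t->e[i].gpa == gpa) {
			*pfn = t->e[i].pfn;
			return true;
		}
	}
	return false;
}

/* Single-context flush: drops only entries tagged with @asid. */
static void tlb_flush_context(struct toy_tlb *t, int asid)
{
	for (size_t i = 0; i < TLB_SIZE; i++)
		if (t->e[i].asid == asid)
			t->e[i].valid = false;
}

/* All-context flush: drops every entry regardless of tag. */
static void tlb_flush_all(struct toy_tlb *t)
{
	for (size_t i = 0; i < TLB_SIZE; i++)
		t->e[i].valid = false;
}

/*
 * Replays the scenario.  Returns true if L2 can still reach PFN 'x'
 * through a stale entry after the host has unmapped it.
 */
bool l2_sees_stale_pfn(bool flush_all_contexts)
{
	struct toy_tlb tlb = { 0 };
	const unsigned long g = 0x1000, x = 42;
	unsigned long pfn;

	/* L2 touches GPA 'g'; entry is tagged with L2's ASID. */
	tlb_insert(&tlb, ASID_L2, g, x);

	/* Back in L1: host unmaps 'x', mmu_notifier invalidation fires. */
	if (flush_all_contexts)
		tlb_flush_all(&tlb);
	else
		tlb_flush_context(&tlb, ASID_L1);

	/* Nested VM-Enter without flushing L2's context; L2 reads 'g'. */
	return tlb_lookup(&tlb, ASID_L2, g, &pfn) && pfn == x;
}
```

    In this model, l2_sees_stale_pfn(false) reports that the stale
    entry survives a flush of L1's context alone, while
    l2_sees_stale_pfn(true) reports that an all-context flush removes it.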
    
    However, current KVM unconditionally flushes L1's EPTP/VPID context on
    nested VM-Exit.  That behavior is mostly unintentional, though: KVM
    doesn't go out of its way to flush EPTP/VPID on nested
    VM-Enter/VM-Exit; rather, a TLB flush is guaranteed to occur prior to
    re-entering L1 due to __kvm_mmu_new_cr3() always being called with
    skip_tlb_flush=false.  On
    nested VM-Enter, this happens via kvm_init_shadow_ept_mmu() (nested EPT
    enabled) or in nested_vmx_load_cr3() (nested EPT disabled).  On nested
    VM-Exit it occurs via nested_vmx_load_cr3().
    
    This also fixes a bug where a deferred TLB flush in the context of L2,
    with EPT disabled, would flush L1's VPID instead of L2's VPID, as
    vmx_flush_tlb() flushes L1's VPID regardless of is_guest_mode().
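    
    The VPID mix-up can be sketched in the same spirit.  Again this is a
    hypothetical model rather than the real vmx_flush_tlb(): struct
    toy_vcpu, struct flush_set, the vpid01/vpid02 fields and the helper
    names are assumptions made for illustration.  It contrasts a flush
    that always targets L1's VPID with one that covers every VPID the
    vCPU may be using.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model, NOT the KVM data structures. */
struct toy_vcpu {
	bool guest_mode;	/* true while L2 is the active context */
	unsigned short vpid01;	/* VPID used when running L1 */
	unsigned short vpid02;	/* VPID used when running L2 */
};

/* Records which VPIDs a flush operation invalidated. */
struct flush_set {
	unsigned short vpid[2];
	int count;
};

/* Old behavior as described: L1's VPID is flushed no matter what. */
struct flush_set flush_l1_vpid_only(const struct toy_vcpu *v)
{
	struct flush_set s = { .vpid = { v->vpid01 }, .count = 1 };

	return s;
}

/* Flush every VPID context, so the active one is always covered. */
struct flush_set flush_all_vpids(const struct toy_vcpu *v)
{
	struct flush_set s = { .vpid = { v->vpid01, v->vpid02 }, .count = 2 };

	return s;
}

/* True if the VPID the vCPU is currently running with was flushed. */
bool active_context_flushed(const struct toy_vcpu *v, struct flush_set s)
{
	unsigned short active = v->guest_mode ? v->vpid02 : v->vpid01;

	for (int i = 0; i < s.count; i++)
		if (s.vpid[i] == active)
			return true;
	return false;
}
```

    With guest_mode set, active_context_flushed() is false for the
    result of flush_l1_vpid_only() and true for flush_all_vpids(), which
    mirrors the deferred-flush bug described above.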
    
    Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
    Cc: Ben Gardon <bgardon@google.com>
    Cc: Jim Mattson <jmattson@google.com>
    Cc: Junaid Shahid <junaids@google.com>
    Cc: Liran Alon <liran.alon@oracle.com>
    Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
    Cc: John Haxby <john.haxby@oracle.com>
    Reviewed-by: Liran Alon <liran.alon@oracle.com>
    Fixes: efebf0aa ("KVM: nVMX: Do not flush TLB on L1<->L2 transitions if L1 uses VPID and EPT")
    Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
    Message-Id: <20200320212833.3507-2-sean.j.christopherson@intel.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>