Commit f28e9c7f authored by Sean Christopherson's avatar Sean Christopherson Committed by Paolo Bonzini

KVM: x86/mmu: Fix wrong/misleading comments in TDP MMU fast zap

Fix misleading and arguably wrong comments in the TDP MMU's fast zap
flow.  The comments, and the fact that actually zapping invalid roots was
added separately, strongly suggests that zapping invalid roots is an
optimization and not required for correctness.  That is a lie.

KVM _must_ zap invalid roots before returning from kvm_mmu_zap_all_fast(),
because when it's called from kvm_mmu_invalidate_zap_pages_in_memslot(),
KVM is relying on it to fully remove all references to the memslot.  Once
the memslot is gone, KVM's mmu_notifier hooks will be unable to find the
stale references as the hva=>gfn translation is done via the memslots.
If KVM doesn't immediately zap SPTEs and userspace unmaps a range after
deleting a memslot, KVM will fail to zap in response to the mmu_notifier
due to not finding a memslot corresponding to the notifier's range, which
leads to a variation of use-after-free.

The other misleading comment (and code) explicitly states that roots
without a reference should be skipped.  While that's technically true,
it's also extremely misleading as it should be impossible for KVM to
encounter a defunct root on the list while holding mmu_lock for write.
Opportunistically add a WARN to enforce that invariant.

Fixes: b7cccd39 ("KVM: x86/mmu: Fast invalidation for TDP MMU")
Fixes: 4c6654bd ("KVM: x86/mmu: Tear down roots before kvm_mmu_zap_all_fast returns")
Signed-off-by: default avatarSean Christopherson <seanjc@google.com>
Reviewed-by: default avatarBen Gardon <bgardon@google.com>
Message-Id: <20220226001546.360188-4-seanjc@google.com>
Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
parent 3354ef5a
...@@ -5732,6 +5732,14 @@ static void kvm_mmu_zap_all_fast(struct kvm *kvm) ...@@ -5732,6 +5732,14 @@ static void kvm_mmu_zap_all_fast(struct kvm *kvm)
write_unlock(&kvm->mmu_lock); write_unlock(&kvm->mmu_lock);
/*
* Zap the invalidated TDP MMU roots, all SPTEs must be dropped before
* returning to the caller, e.g. if the zap is in response to a memslot
* deletion, mmu_notifier callbacks will be unable to reach the SPTEs
* associated with the deleted memslot once the update completes, and
* Deferring the zap until the final reference to the root is put would
* lead to use-after-free.
*/
if (is_tdp_mmu_enabled(kvm)) { if (is_tdp_mmu_enabled(kvm)) {
read_lock(&kvm->mmu_lock); read_lock(&kvm->mmu_lock);
kvm_tdp_mmu_zap_invalidated_roots(kvm); kvm_tdp_mmu_zap_invalidated_roots(kvm);
......
...@@ -833,12 +833,11 @@ static struct kvm_mmu_page *next_invalidated_root(struct kvm *kvm, ...@@ -833,12 +833,11 @@ static struct kvm_mmu_page *next_invalidated_root(struct kvm *kvm,
} }
/* /*
* Since kvm_tdp_mmu_zap_all_fast has acquired a reference to each * Zap all invalidated roots to ensure all SPTEs are dropped before the "fast
* invalidated root, they will not be freed until this function drops the * zap" completes. Since kvm_tdp_mmu_invalidate_all_roots() has acquired a
* reference. Before dropping that reference, tear down the paging * reference to each invalidated root, roots will not be freed until after this
* structure so that whichever thread does drop the last reference * function drops the gifted reference, e.g. so that vCPUs don't get stuck with
* only has to do a trivial amount of work. Since the roots are invalid, * tearing down paging structures.
* no new SPTEs should be created under them.
*/ */
void kvm_tdp_mmu_zap_invalidated_roots(struct kvm *kvm) void kvm_tdp_mmu_zap_invalidated_roots(struct kvm *kvm)
{ {
...@@ -877,21 +876,25 @@ void kvm_tdp_mmu_zap_invalidated_roots(struct kvm *kvm) ...@@ -877,21 +876,25 @@ void kvm_tdp_mmu_zap_invalidated_roots(struct kvm *kvm)
} }
/* /*
* Mark each TDP MMU root as invalid so that other threads * Mark each TDP MMU root as invalid to prevent vCPUs from reusing a root that
* will drop their references and allow the root count to * is about to be zapped, e.g. in response to a memslots update. The caller is
* go to 0. * responsible for invoking kvm_tdp_mmu_zap_invalidated_roots() to do the actual
* zapping.
* *
* Also take a reference on all roots so that this thread * Take a reference on all roots to prevent the root from being freed before it
* can do the bulk of the work required to free the roots * is zapped by this thread. Freeing a root is not a correctness issue, but if
* once they are invalidated. Without this reference, a * a vCPU drops the last reference to a root prior to the root being zapped, it
* vCPU thread might drop the last reference to a root and * will get stuck with tearing down the entire paging structure.
* get stuck with tearing down the entire paging structure.
* *
* Roots which have a zero refcount should be skipped as * Get a reference even if the root is already invalid,
* they're already being torn down. * kvm_tdp_mmu_zap_invalidated_roots() assumes it was gifted a reference to all
* Already invalid roots should be referenced again so that * invalid roots, e.g. there's no epoch to identify roots that were invalidated
* they aren't freed before kvm_tdp_mmu_zap_all_fast is * by a previous call. Roots stay on the list until the last reference is
* done with them. * dropped, so even though all invalid roots are zapped, a root may not go away
* for quite some time, e.g. if a vCPU blocks across multiple memslot updates.
*
* Because mmu_lock is held for write, it should be impossible to observe a
* root with zero refcount, i.e. the list of roots cannot be stale.
* *
* This has essentially the same effect for the TDP MMU * This has essentially the same effect for the TDP MMU
* as updating mmu_valid_gen does for the shadow MMU. * as updating mmu_valid_gen does for the shadow MMU.
...@@ -901,9 +904,10 @@ void kvm_tdp_mmu_invalidate_all_roots(struct kvm *kvm) ...@@ -901,9 +904,10 @@ void kvm_tdp_mmu_invalidate_all_roots(struct kvm *kvm)
struct kvm_mmu_page *root; struct kvm_mmu_page *root;
lockdep_assert_held_write(&kvm->mmu_lock); lockdep_assert_held_write(&kvm->mmu_lock);
list_for_each_entry(root, &kvm->arch.tdp_mmu_roots, link) list_for_each_entry(root, &kvm->arch.tdp_mmu_roots, link) {
if (refcount_inc_not_zero(&root->tdp_mmu_root_count)) if (!WARN_ON_ONCE(!kvm_tdp_mmu_get_root(root)))
root->role.invalid = true; root->role.invalid = true;
}
} }
/* /*
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment