    KVM: Fix multiple races in gfn=>pfn cache refresh · 58cd407c
    Sean Christopherson authored
    Rework the gfn=>pfn cache (gpc) refresh logic to address multiple races,
    both within the cache itself and between the cache and mmu_notifier events.
    
    The existing refresh code attempts to guard against races with the
    mmu_notifier by speculatively marking the cache valid, and then marking
    it invalid if an mmu_notifier invalidation occurs.  That handles the case
    where an invalidation occurs between dropping and re-acquiring gpc->lock,
    but it doesn't handle the scenario where the cache is refreshed after the
    cache was invalidated by the notifier, but before the notifier elevates
    mmu_notifier_count.  The gpc refresh can't use the "retry" helper because
    the cache is invalidated _before_ mmu_notifier_count is elevated and
    before mmu_notifier_range_start is set/updated.
    
      CPU0                                    CPU1
      ----                                    ----
    
      gfn_to_pfn_cache_invalidate_start()
      |
      -> gpc->valid = false;
                                              kvm_gfn_to_pfn_cache_refresh()
                                              |
                                              |-> gpc->valid = true;
    
                                              hva_to_pfn_retry()
                                              |
                                              -> acquire kvm->mmu_lock
                                                 kvm->mmu_notifier_count == 0
                                                 mmu_seq == kvm->mmu_notifier_seq
                                                 drop kvm->mmu_lock
                                                 return pfn 'X'
      acquire kvm->mmu_lock
      kvm_inc_notifier_count()
      drop kvm->mmu_lock
      kernel frees pfn 'X'
                                              kvm_gfn_to_pfn_cache_check()
                                              |
                                              |-> gpc->valid == true
    
                                              caller accesses freed pfn 'X'
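
The interleaving above can be replayed deterministically in a small
single-threaded model.  This is only a sketch: the struct and function
names (model_kvm, model_gpc, model_retry, replay_race) are hypothetical,
the "retry" check is reduced to the count and sequence comparison shown in
the diagram, and freeing the pfn is represented by a flag.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the KVM fields in the diagram. */
struct model_kvm {
    unsigned long mmu_notifier_seq;
    long mmu_notifier_count;
};

struct model_gpc {
    bool valid;
    unsigned long pfn;
};

/* Simplified "retry" test: retry if an invalidation is in progress
 * (count != 0) or one completed since the snapshot (seq changed). */
static bool model_retry(const struct model_kvm *kvm, unsigned long snap_seq)
{
    return kvm->mmu_notifier_count != 0 || kvm->mmu_notifier_seq != snap_seq;
}

/* Replays the CPU0/CPU1 steps in diagram order.  Returns true if the
 * cache ends up valid while pointing at a pfn that has been freed. */
static bool replay_race(void)
{
    struct model_kvm kvm = { 0, 0 };
    struct model_gpc gpc = { true, 0 };
    const unsigned long pfn_x = 42;   /* arbitrary illustrative value */
    unsigned long snap_seq;
    bool pfn_x_freed = false;

    /* CPU0: gfn_to_pfn_cache_invalidate_start() */
    gpc.valid = false;

    /* CPU1: refresh speculatively marks the cache valid */
    gpc.valid = true;

    /* CPU1: hva_to_pfn_retry() sees count == 0 and an unchanged seq */
    snap_seq = kvm.mmu_notifier_seq;
    if (!model_retry(&kvm, snap_seq))
        gpc.pfn = pfn_x;

    /* CPU0: the count is elevated only now, too late for CPU1 to notice */
    kvm.mmu_notifier_count++;
    pfn_x_freed = true;

    /* CPU1: the validity check passes and the caller would use pfn 'X' */
    return gpc.valid && gpc.pfn == pfn_x && pfn_x_freed;
}
```

In this model replay_race() returns true: the retry check cannot catch an
invalidation that has already cleared gpc->valid but has not yet elevated
mmu_notifier_count.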
    
    Key off of mn_active_invalidate_count to detect that a pfncache refresh
    needs to wait for an in-progress mmu_notifier invalidation.  While
    mn_active_invalidate_count is not guaranteed to be stable, it is
    guaranteed to be elevated prior to an invalidation acquiring gpc->lock,
    so either the refresh will see an active invalidation and wait, or the
    invalidation will run after the refresh completes.
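
As a rough illustration of that ordering argument, the sketch below models
the check with C11 atomics in userspace.  It is not the kernel code: the
model_* names are hypothetical, and bumping the sequence number when the
invalidation ends is a modelling assumption.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the struct kvm fields involved. */
struct model_kvm {
    atomic_long mn_active_invalidate_count;
    atomic_ulong mmu_notifier_seq;
};

static void model_kvm_init(struct model_kvm *kvm)
{
    atomic_init(&kvm->mn_active_invalidate_count, 0);
    atomic_init(&kvm->mmu_notifier_seq, 0);
}

/* Notifier side: elevate the count before any cache is touched.  Only
 * after this would the invalidation take gpc->lock and clear gpc->valid. */
static void model_invalidate_start(struct model_kvm *kvm)
{
    atomic_fetch_add(&kvm->mn_active_invalidate_count, 1);
}

/* Notifier side: bump the sequence, then drop the count (assumption). */
static void model_invalidate_end(struct model_kvm *kvm)
{
    atomic_fetch_add(&kvm->mmu_notifier_seq, 1);
    atomic_fetch_sub(&kvm->mn_active_invalidate_count, 1);
}

/* Refresh side: snapshot the sequence before resolving the pfn. */
static unsigned long model_snapshot(struct model_kvm *kvm)
{
    return atomic_load(&kvm->mmu_notifier_seq);
}

/* Refresh side: wait/retry if an invalidation is active, or if one
 * completed after the snapshot was taken. */
static bool model_refresh_must_retry(struct model_kvm *kvm, unsigned long snap)
{
    if (atomic_load(&kvm->mn_active_invalidate_count) != 0)
        return true;
    return atomic_load(&kvm->mmu_notifier_seq) != snap;
}
```

Because the count is raised before the invalidation can take gpc->lock, a
refresh that checks it while holding the lock either observes the elevated
count and retries, or finishes before the invalidation touches the cache.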
    
    Speculatively marking the cache valid is itself flawed, as a concurrent
    kvm_gfn_to_pfn_cache_check() would see a valid cache with stale pfn/khva
    values.  The KVM Xen use case explicitly allows/wants multiple users;
    even though the caches are allocated per vCPU, __kvm_xen_has_interrupt()
    can read a different vCPU (or vCPUs).  Address this race by invalidating
    the cache prior to dropping gpc->lock (this is made possible by fixing
    the above mmu_notifier race).
    
    Complicating all of this is the fact that both the hva=>pfn resolution
    and mapping of the kernel address can sleep, i.e. must be done outside
    of gpc->lock.
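
The resulting shape of a refresh (invalidate under the lock, drop the lock
for the work that can sleep, re-take it, and publish only if no
invalidation intervened) might look roughly like the userspace sketch
below.  A pthread mutex stands in for gpc->lock, and model_gpc,
model_refresh and the demo_* hooks are hypothetical names used only for
illustration; this is not the patch itself.

```c
#include <pthread.h>
#include <stdbool.h>

struct model_gpc {
    pthread_mutex_t lock;   /* stands in for gpc->lock */
    bool valid;
    unsigned long pfn;
};

/* Hypothetical hooks: resolve() may sleep, must_retry() reports whether
 * an invalidation was active or completed during resolution. */
typedef unsigned long (*resolve_fn)(void *ctx);
typedef bool (*retry_fn)(void *ctx);

static void model_gpc_init(struct model_gpc *gpc, unsigned long pfn)
{
    pthread_mutex_init(&gpc->lock, NULL);
    gpc->valid = true;
    gpc->pfn = pfn;
}

static void model_refresh(struct model_gpc *gpc, resolve_fn resolve,
                          retry_fn must_retry, void *ctx)
{
    unsigned long new_pfn;

    pthread_mutex_lock(&gpc->lock);
    /* Invalidate before dropping the lock, so a concurrent check never
     * sees a "valid" cache holding stale values. */
    gpc->valid = false;
    do {
        pthread_mutex_unlock(&gpc->lock);
        new_pfn = resolve(ctx);          /* may sleep: done unlocked */
        pthread_mutex_lock(&gpc->lock);
    } while (must_retry(ctx));           /* checked with the lock held */

    gpc->pfn = new_pfn;
    gpc->valid = true;                   /* publish only after the check */
    pthread_mutex_unlock(&gpc->lock);
}

/* Demo hooks for a single-threaded walk-through. */
struct demo_ctx {
    struct model_gpc *gpc;
    int retries_left;
    unsigned long next_pfn;
    bool saw_valid;   /* set if the cache looked valid mid-refresh */
};

static unsigned long demo_resolve(void *p)
{
    struct demo_ctx *ctx = p;

    /* Unlocked read is acceptable only because the demo is single-threaded. */
    if (ctx->gpc->valid)
        ctx->saw_valid = true;
    return ctx->next_pfn++;
}

static bool demo_retry(void *p)
{
    struct demo_ctx *ctx = p;

    return ctx->retries_left-- > 0;
}
```

With the demo hooks, a refresh that is forced to retry twice resolves the
pfn three times and publishes only the last result, and the cache is never
observed as valid while resolution is in flight.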
    
    Fix the above races in one fell swoop.  Trying to fix each individual race
    is largely pointless and essentially impossible to test, e.g. closing one
    hole just shifts the focus to the other hole.
    
    Fixes: 982ed0de ("KVM: Reinstate gfn_to_pfn_cache with invalidation support")
    Cc: stable@vger.kernel.org
    Cc: David Woodhouse <dwmw@amazon.co.uk>
    Cc: Mingwei Zhang <mizhang@google.com>
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Message-Id: <20220429210025.3293691-8-seanjc@google.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>