1. 20 Apr, 2021 21 commits
  2. 19 Apr, 2021 18 commits
  3. 17 Apr, 2021 1 commit
    • Sean Christopherson's avatar
      KVM: Take mmu_lock when handling MMU notifier iff the hva hits a memslot · 8931a454
      Sean Christopherson authored
      Defer acquiring mmu_lock in the MMU notifier paths until a "hit" has been
      detected in the memslots, i.e. don't take the lock for notifications that
      don't affect the guest.
      
      For small VMs, spurious locking is a minor annoyance.  And for "volatile"
      setups where the majority of notifications _are_ relevant, this barely
      qualifies as an optimization.
      
      But, for large VMs (hundreds of threads) with static setups, e.g. no
      page migration, no swapping, etc..., the vast majority of MMU notifier
      callbacks will be unrelated to the guest, e.g. will often be in response
      to the userspace VMM adjusting its own virtual address space.  In such
      large VMs, acquiring mmu_lock can be painful as it blocks vCPUs from
      handling page faults.  In some scenarios it can even be "fatal" in the
      sense that it causes unacceptable brownouts, e.g. when rebuilding huge
      pages after live migration, a significant percentage of vCPUs will be
      attempting to handle page faults.
      
      x86's TDP MMU implementation is especially susceptible to spurious
      locking due it taking mmu_lock for read when handling page faults.
      Because rwlock is fair, a single writer will stall future readers, while
      the writer is itself stalled waiting for in-progress readers to complete.
      This is exacerbated by the MMU notifiers often firing multiple times in
      quick succession, e.g. moving a page will (always?) invoke three separate
      notifiers: .invalidate_range_start(), invalidate_range_end(), and
      .change_pte().  Unnecessarily taking mmu_lock each time means even a
      single spurious sequence can be problematic.
      
      Note, this optimizes only the unpaired callbacks.  Optimizing the
      .invalidate_range_{start,end}() pairs is more complex and will be done in
      a future patch.
      Suggested-by: default avatarBen Gardon <bgardon@google.com>
      Signed-off-by: default avatarSean Christopherson <seanjc@google.com>
      Message-Id: <20210402005658.3024832-9-seanjc@google.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      8931a454