Commit eb298605 authored by David Matlack, committed by Paolo Bonzini

KVM: x86/mmu: Do not recover dirty-tracked NX Huge Pages

Do not recover (i.e. zap) an NX Huge Page that is being dirty tracked,
as it will just be faulted back in at the same 4KiB granularity when
accessed by a vCPU. This may need to be changed if KVM ever supports
2MiB (or larger) dirty tracking granularity, or faulting huge pages
during dirty tracking for reads/executes. However, for now, these zaps
are entirely wasteful.

To check whether this commit increases the CPU usage of the NX
recovery worker thread, I used a modified version of execute_perf_test
[1] that supports splitting guest memory into multiple slots and reports
/proc/pid/schedstat:se.sum_exec_runtime for the NX recovery worker just
before tearing down the VM. The goal was to force a large number of NX
Huge Page recoveries and see if the recovery worker used any more CPU.

Test Setup:

  echo 1000 > /sys/module/kvm/parameters/nx_huge_pages_recovery_period_ms
  echo 10 > /sys/module/kvm/parameters/nx_huge_pages_recovery_ratio

Test Command:

  ./execute_perf_test -v64 -s anonymous_hugetlb_1gb -x 16 -o
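
For reference, the same counter can also be sampled by hand from procfs. A
minimal sketch, not the test's own code: it assumes the worker thread is
identifiable by its 15-character truncated comm "kvm-nx-lpage-re" (as in the
table below) and that se.sum_exec_runtime is exposed via /proc/<tid>/sched:

  # Find the (first) NX recovery worker thread by its truncated comm and
  # print its accumulated CPU time.
  tid=$(ps -eLo lwp,comm | awk '$2 == "kvm-nx-lpage-re" { print $1; exit }')
  grep se.sum_exec_runtime /proc/$tid/sched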

Test Results:

          | kvm-nx-lpage-re:se.sum_exec_runtime      |
          | ---------------------------------------- |
  Run     | Before             | After               |
  ------- | ------------------ | ------------------- |
  1       | 730.084105         | 724.375314          |
  2       | 728.751339         | 740.581988          |
  3       | 736.264720         | 757.078163          |

Comparing the median results, this commit results in about a 1% increase
in the CPU usage of the NX recovery worker when testing a VM with 16
slots. However, the effect is negligible with the default halving time
of NX pages, which is 1 hour, rather than the contrived 10 seconds used
in the test (period_ms = 1000 and ratio = 10 recover 1/10 of the
possible NX Huge Pages every second, cycling through all of them every
10 seconds).
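
For comparison, the defaults behind that 1-hour figure can be inspected
through the same module parameters (the values noted in the comments assume
an unmodified upstream kvm module):

  # Assumed defaults: ratio = 60 and period_ms = 0; with period_ms = 0,
  # KVM derives the period as 60*60*1000/ratio ms (one minute for
  # ratio = 60), recovering 1/60 of the possible NX Huge Pages per minute.
  cat /sys/module/kvm/parameters/nx_huge_pages_recovery_ratio
  cat /sys/module/kvm/parameters/nx_huge_pages_recovery_period_ms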

[1] https://lore.kernel.org/kvm/20221019234050.3919566-2-dmatlack@google.com/

Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <20221103204421.1146958-1-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
parent 63d28a25
arch/x86/kvm/mmu/mmu.c:

@@ -6841,6 +6841,7 @@ static int set_nx_huge_pages_recovery_param(const char *val, const struct kernel
 static void kvm_recover_nx_huge_pages(struct kvm *kvm)
 {
         unsigned long nx_lpage_splits = kvm->stat.nx_lpage_splits;
+        struct kvm_memory_slot *slot;
         int rcu_idx;
         struct kvm_mmu_page *sp;
         unsigned int ratio;
@@ -6875,7 +6876,21 @@ static void kvm_recover_nx_huge_pages(struct kvm *kvm)
                                       struct kvm_mmu_page,
                                       possible_nx_huge_page_link);
                 WARN_ON_ONCE(!sp->nx_huge_page_disallowed);
-                if (is_tdp_mmu_page(sp))
+                WARN_ON_ONCE(!sp->role.direct);
+
+                slot = gfn_to_memslot(kvm, sp->gfn);
+                WARN_ON_ONCE(!slot);
+
+                /*
+                 * Unaccount and do not attempt to recover any NX Huge Pages
+                 * that are being dirty tracked, as they would just be faulted
+                 * back in as 4KiB pages. The NX Huge Pages in this slot will be
+                 * recovered, along with all the other huge pages in the slot,
+                 * when dirty logging is disabled.
+                 */
+                if (slot && kvm_slot_dirty_track_enabled(slot))
+                        unaccount_nx_huge_page(kvm, sp);
+                else if (is_tdp_mmu_page(sp))
                         flush |= kvm_tdp_mmu_zap_sp(kvm, sp);
                 else
                         kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list);