- 07 Apr, 2023 2 commits
-
-
Aaron Lewis authored
When counting "Instructions Retired" (0xc0) in a guest, KVM will occasionally increment the PMU counter regardless of if that event is being filtered. This is because some PMU events are incremented via kvm_pmu_trigger_event(), which doesn't know about the event filter. Add the event filter to kvm_pmu_trigger_event(), so events that are disallowed do not increment their counters. Fixes: 9cd803d4 ("KVM: x86: Update vPMCs when retiring instructions") Signed-off-by:
Aaron Lewis <aaronlewis@google.com> Reviewed-by:
Like Xu <likexu@tencent.com> Link: https://lore.kernel.org/r/20230307141400.1486314-2-aaronlewis@google.com [sean: prepend "pmc" to the new function] Signed-off-by:
Sean Christopherson <seanjc@google.com>
-
Like Xu authored
Fix a "reprogam" => "reprogram" typo in kvm_pmu_request_counter_reprogam(). Fixes: 68fb4757 ("KVM: x86/pmu: Defer reprogram_counter() to kvm_pmu_handle_event()") Signed-off-by:
Like Xu <likexu@tencent.com> Link: https://lore.kernel.org/r/20230310113349.31799-1-likexu@tencent.com [sean: trim the changelog] Signed-off-by:
Sean Christopherson <seanjc@google.com>
-
- 06 Apr, 2023 25 commits
-
-
Like Xu authored
A valid pmc is always tested before using pmu->reprogram_pmi. Eliminate this part of the redundancy by setting the counter's bitmask directly, and in addition, trigger KVM_REQ_PMU only once to save more cpu cycles. Signed-off-by:
Like Xu <likexu@tencent.com> Link: https://lore.kernel.org/r/20230214050757.9623-4-likexu@tencent.comSigned-off-by:
Sean Christopherson <seanjc@google.com>
-
Sean Christopherson authored
Invert the flows in intel_pmu_{g,s}et_msr()'s case statements so that they follow the kernel's preferred style of: if (<not valid>) return <error> <commit change> return <success> which is also the style used by every other {g,s}et_msr() helper (except AMD's PMU variant, which doesn't use a switch statement). Modify the "set" paths with costly side effects, i.e. that reprogram counters, to skip only the side effects, i.e. to perform reserved bits checks even if the value is unchanged. None of the reserved bits checks are expensive, so there's no strong justification for skipping them, and guarding only the side effect makes it slightly more obvious what is being skipped and why. No functional change intended (assuming no reserved bit bugs). Link: https://lkml.kernel.org/r/Y%2B6cfen%2FCpO3%2FdLO%40google.comSigned-off-by:
Sean Christopherson <seanjc@google.com>
-
Like Xu authored
The name of function pmc_is_enabled() is a bit misleading. A PMC can be disabled either by PERF_CLOBAL_CTRL or by its corresponding EVTSEL. Append global semantics to its name. Suggested-by:
Jim Mattson <jmattson@google.com> Signed-off-by:
Like Xu <likexu@tencent.com> Link: https://lore.kernel.org/r/20230214050757.9623-2-likexu@tencent.comSigned-off-by:
Sean Christopherson <seanjc@google.com>
-
Sean Christopherson authored
Verify that disabling the guest's vPMU via CPUID also disables LBRs. KVM has had at least one bug where LBRs would remain enabled even though the intent was to disable everything PMU related. Link: https://lore.kernel.org/r/20230311004618.920745-22-seanjc@google.comSigned-off-by:
Sean Christopherson <seanjc@google.com>
-
Sean Christopherson authored
Expand the immutable features sub-test for PERF_CAPABILITIES to verify KVM rejects any attempt to use a PEBS format other than the host's. Link: https://lore.kernel.org/r/20230311004618.920745-21-seanjc@google.comSigned-off-by:
Sean Christopherson <seanjc@google.com>
-
Sean Christopherson authored
Rework the LBR format test to use the bitfield instead of a separate mask macro, mainly so that adding a nearly-identical PEBS format test doesn't have to copy-paste-tweak the macro too. No functional change intended. Link: https://lore.kernel.org/r/20230311004618.920745-20-seanjc@google.comSigned-off-by:
Sean Christopherson <seanjc@google.com>
-
Sean Christopherson authored
Drop the arbitrary "done" message from the VMX PMU caps test, it's pretty obvious the test is done when the process exits. Link: https://lore.kernel.org/r/20230311004618.920745-19-seanjc@google.comSigned-off-by:
Sean Christopherson <seanjc@google.com>
-
Sean Christopherson authored
Now that KVM disallows changing PERF_CAPABILITIES after KVM_RUN, expand the host side checks to verify KVM rejects any attempts to change bits from userspace. Link: https://lore.kernel.org/r/20230311004618.920745-18-seanjc@google.comSigned-off-by:
Sean Christopherson <seanjc@google.com>
-
Sean Christopherson authored
Test that the guest can't write 0 to PERF_CAPABILITIES, can't write the current value, and can't toggle _any_ bits. There is no reason to special case the LBR format. Link: https://lore.kernel.org/r/20230311004618.920745-17-seanjc@google.comSigned-off-by:
Sean Christopherson <seanjc@google.com>
-
Sean Christopherson authored
Add negative testing of all immutable bits in PERF_CAPABILITIES, i.e. single bits that are reserved-0 or are effectively reserved-1 by KVM. Omit LBR and PEBS format bits from the test as it's easier to test them manually than it is to add safeguards to the comment path, e.g. toggling a single bit can yield a format of '0', which is legal as a "disable" value. Link: https://lore.kernel.org/r/20230311004618.920745-16-seanjc@google.comSigned-off-by:
Sean Christopherson <seanjc@google.com>
-
Sean Christopherson authored
Verify that userspace can set all fungible features in PERF_CAPABILITIES. Drop the now unused #define of the "full-width writes" flag. Link: https://lore.kernel.org/r/20230311004618.920745-15-seanjc@google.comSigned-off-by:
Sean Christopherson <seanjc@google.com>
-
Sean Christopherson authored
Now that vcpu_set_msr() verifies the expected "read what was wrote" semantics of all durable MSRs, including PERF_CAPABILITIES, drop the now-redundant manual checks in the VMX PMU caps test. Link: https://lore.kernel.org/r/20230311004618.920745-14-seanjc@google.comSigned-off-by:
Sean Christopherson <seanjc@google.com>
-
Sean Christopherson authored
Assert that KVM provides "read what you wrote" semantics for all "durable" MSRs (for lack of a better name). The extra coverage is cheap from a runtime performance perspective, and verifying the behavior in the common helper avoids gratuitous copy+paste in individual tests. Note, this affects all tests that set MSRs from userspace! Link: https://lore.kernel.org/r/20230311004618.920745-13-seanjc@google.comSigned-off-by:
Sean Christopherson <seanjc@google.com>
-
Sean Christopherson authored
Reimplement vcpu_set_msr() as a macro and pretty print the failing MSR (when possible) and the value if KVM_SET_MSRS fails instead of using the using the standard KVM_IOCTL_ERROR(). KVM_SET_MSRS is somewhat odd in that it returns the index of the last successful write, i.e. will be '0' on failure barring an entirely different KVM bug. And for writing MSRs, the MSR being written and the value being written are almost always relevant to the failure, i.e. just saying "failed!" doesn't help debug. Place the string goo in a separate macro in anticipation of using it to further expand MSR testing. Link: https://lore.kernel.org/r/20230311004618.920745-12-seanjc@google.comSigned-off-by:
Sean Christopherson <seanjc@google.com>
-
Sean Christopherson authored
KVM emulates full-width PMC writes in software, assert that KVM reports full-width writes as supported if PERF_CAPABILITIES is supported. Link: https://lore.kernel.org/r/20230311004618.920745-11-seanjc@google.comSigned-off-by:
Sean Christopherson <seanjc@google.com>
-
Sean Christopherson authored
Use a separate sub-test to verify userspace can clear PERF_CAPABILITIES and restore it to the KVM-supported value, as the testcase isn't unique to the LBR format. Link: https://lore.kernel.org/r/20230311004618.920745-10-seanjc@google.comSigned-off-by:
Sean Christopherson <seanjc@google.com>
-
Sean Christopherson authored
Zero out the LBR capabilities during PMU refresh to avoid exposing LBRs to the guest against userspace's wishes. If userspace modifies the guest's CPUID model or invokes KVM_CAP_PMU_CAPABILITY to disable vPMU after an initial KVM_SET_CPUID2, but before the first KVM_RUN, KVM will retain the previous LBR info due to bailing before refreshing the LBR descriptor. Note, this is a very theoretical bug, there is no known use case where a VMM would deliberately enable the vPMU via KVM_SET_CPUID2, and then later disable the vPMU. Link: https://lore.kernel.org/r/20230311004618.920745-9-seanjc@google.comSigned-off-by:
Sean Christopherson <seanjc@google.com>
-
Sean Christopherson authored
Now that KVM disallows changing feature MSRs, i.e. PERF_CAPABILITIES, after running a vCPU, WARN and bug the VM if the PMU is refreshed after the vCPU has run. Note, KVM has disallowed CPUID updates after running a vCPU since commit feb627e8 ("KVM: x86: Forbid KVM_SET_CPUID{,2} after KVM_RUN"), i.e. PERF_CAPABILITIES was the only remaining way to trigger a PMU refresh after KVM_RUN. Cc: Like Xu <like.xu.linux@gmail.com> Link: https://lore.kernel.org/r/20230311004618.920745-8-seanjc@google.comSigned-off-by:
Sean Christopherson <seanjc@google.com>
-
Sean Christopherson authored
Disallow writes to feature MSRs after KVM_RUN to prevent userspace from changing the vCPU model after running the vCPU. Similar to guest CPUID, KVM uses feature MSRs to configure intercepts, determine what operations are/aren't allowed, etc. Changing the capabilities while the vCPU is active will at best yield unpredictable guest behavior, and at worst could be dangerous to KVM. Allow writing the current value, e.g. so that userspace can blindly set all MSRs when emulating RESET, and unconditionally allow writes to MSR_IA32_UCODE_REV so that userspace can emulate patch loads. Special case the VMX MSRs to keep the generic list small, i.e. so that KVM can do a linear walk of the generic list without incurring meaningful overhead. Cc: Like Xu <like.xu.linux@gmail.com> Cc: Yu Zhang <yu.c.zhang@linux.intel.com> Reviewed-by:
Xiaoyao Li <xiaoyao.li@intel.com> Link: https://lore.kernel.org/r/20230311004618.920745-7-seanjc@google.comSigned-off-by:
Sean Christopherson <seanjc@google.com>
-
Sean Christopherson authored
Split the PERF_CAPABILITIES subtests into two parts so that the LBR format testcases don't execute after KVM_RUN. Similar to the guest CPUID model, KVM will soon disallow changing PERF_CAPABILITIES after KVM_RUN, at which point attempting to set the MSR after KVM_RUN will yield false positives and/or false negatives depending on what the test is trying to do. Land the LBR format test in a more generic "immutable features" test in anticipation of expanding its scope to other immutable features. Link: https://lore.kernel.org/r/20230311004618.920745-6-seanjc@google.comSigned-off-by:
Sean Christopherson <seanjc@google.com>
-
Sean Christopherson authored
Add VMX MSRs to the runtime list of feature MSRs by iterating over the range of emulated MSRs instead of manually defining each MSR in the "all" list. Using the range definition reduces the cost of emulating a new VMX MSR, e.g. prevents forgetting to add an MSR to the list. Extracting the VMX MSRs from the "all" list, which is a compile-time constant, also shrinks the list to the point where the compiler can heavily optimize code that iterates over the list. No functional change intended. Reviewed-by:
Xiaoyao Li <xiaoyao.li@intel.com> Link: https://lore.kernel.org/r/20230311004618.920745-5-seanjc@google.comSigned-off-by:
Sean Christopherson <seanjc@google.com>
-
Sean Christopherson authored
Add macros to track the range of VMX feature MSRs that are emulated by KVM to reduce the maintenance cost of extending the set of emulated MSRs. Note, KVM doesn't necessarily emulate all known/consumed VMX MSRs, e.g. PROCBASED_CTLS3 is consumed by KVM to enable IPI virtualization, but is not emulated as KVM doesn't emulate/virtualize IPI virtualization for nested guests. No functional change intended. Reviewed-by:
Xiaoyao Li <xiaoyao.li@intel.com> Link: https://lore.kernel.org/r/20230311004618.920745-4-seanjc@google.comSigned-off-by:
Sean Christopherson <seanjc@google.com>
-
Sean Christopherson authored
Add a helper to query if a vCPU has run so that KVM doesn't have to open code the check on last_vmentry_cpu being set to a magic value. No functional change intended. Suggested-by:
Xiaoyao Li <xiaoyao.li@intel.com> Cc: Like Xu <like.xu.linux@gmail.com> Reviewed-by:
Xiaoyao Li <xiaoyao.li@intel.com> Link: https://lore.kernel.org/r/20230311004618.920745-3-seanjc@google.comSigned-off-by:
Sean Christopherson <seanjc@google.com>
-
Sean Christopherson authored
Rename kvm_init_msr_list() to kvm_init_msr_lists() to clarify that it initializes multiple lists: MSRs to save, emulated MSRs, and feature MSRs. No functional change intended. Reviewed-by:
Xiaoyao Li <xiaoyao.li@intel.com> Link: https://lore.kernel.org/r/20230311004618.920745-2-seanjc@google.comSigned-off-by:
Sean Christopherson <seanjc@google.com>
-
Sean Christopherson authored
Disallow enabling LBR support if the CPU supports architectural LBRs. Traditional LBR support is absent on CPU models that have architectural LBRs, and KVM doesn't yet support arch LBRs, i.e. KVM will pass through non-existent MSRs if userspace enables LBRs for the guest. Cc: stable@vger.kernel.org Cc: Yang Weijiang <weijiang.yang@intel.com> Cc: Like Xu <like.xu.linux@gmail.com> Reported-by:
Paolo Bonzini <pbonzini@redhat.com> Fixes: be635e34 ("KVM: vmx/pmu: Expose LBR_FMT in the MSR_IA32_PERF_CAPABILITIES") Tested-by:
Like Xu <likexu@tencent.com> Link: https://lore.kernel.org/r/20230128001427.2548858-1-seanjc@google.comSigned-off-by:
Sean Christopherson <seanjc@google.com>
-
- 05 Apr, 2023 1 commit
-
-
Like Xu authored
The kvm_pmu_refresh() may be called repeatedly (e.g. configure guest CPUID repeatedly or update MSR_IA32_PERF_CAPABILITIES) and each call will use the last pmu->all_valid_pmc_idx value, with the residual bits introducing additional overhead later in the vPMU emulation. Fixes: b35e5548 ("KVM: x86/vPMU: Add lazy mechanism to release perf_event per vPMC") Suggested-by:
Sean Christopherson <seanjc@google.com> Signed-off-by:
Like Xu <likexu@tencent.com> Link: https://lore.kernel.org/r/20230404071759.75376-1-likexu@tencent.comSigned-off-by:
Sean Christopherson <seanjc@google.com>
-
- 23 Mar, 2023 1 commit
-
-
Mathias Krause authored
Move the 'version' member to the beginning of the structure to reuse an existing hole instead of introducing another one. This allows us to save 8 bytes for 64 bit builds. Signed-off-by:
Mathias Krause <minipli@grsecurity.net> Link: https://lore.kernel.org/r/20230217193336.15278-2-minipli@grsecurity.netSigned-off-by:
Sean Christopherson <seanjc@google.com>
-
- 16 Mar, 2023 9 commits
-
-
Thomas Huth authored
All kvm_arch_vm_ioctl() implementations now only deal with "int" types as return values, so we can change the return type of these functions to use "int" instead of "long". Signed-off-by:
Thomas Huth <thuth@redhat.com> Acked-by:
Anup Patel <anup@brainfault.org> Message-Id: <20230208140105.655814-7-thuth@redhat.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Thomas Huth authored
KVM functions use "long" return values for functions that are wired up to "struct file_operations", but otherwise use "int" return values for functions that can return 0/-errno in order to avoid unintentional divergences between 32-bit and 64-bit kernels. Some code still uses "long" in unnecessary spots, though, which can cause a little bit of confusion and unnecessary size casts. Let's change these spots to use "int" types, too. Signed-off-by:
Thomas Huth <thuth@redhat.com> Message-Id: <20230208140105.655814-6-thuth@redhat.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Thomas Huth authored
In case of success, this function returns the amount of handled bytes. However, this does not work for large values: The function is called from kvm_arch_vm_ioctl() (which still returns a long), which in turn is called from kvm_vm_ioctl() in virt/kvm/kvm_main.c. And that function stores the return value in an "int r" variable. So the upper 32-bits of the "long" return value are lost there. KVM ioctl functions should only return "int" values, so let's limit the amount of bytes that can be requested here to INT_MAX to avoid the problem with the truncated return value. We can then also change the return type of the function to "int" to make it clearer that it is not possible to return a "long" here. Fixes: f0376edb ("KVM: arm64: Add ioctl to fetch/store tags in a guest") Signed-off-by:
Thomas Huth <thuth@redhat.com> Reviewed-by:
Cornelia Huck <cohuck@redhat.com> Reviewed-by:
Gavin Shan <gshan@redhat.com> Reviewed-by:
Steven Price <steven.price@arm.com> Message-Id: <20230208140105.655814-5-thuth@redhat.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Thomas Huth authored
The KVM_GET_NR_MMU_PAGES ioctl is quite questionable on 64-bit hosts since it fails to return the full 64 bits of the value that can be set with the corresponding KVM_SET_NR_MMU_PAGES call. Its "long" return value is truncated into an "int" in the kvm_arch_vm_ioctl() function. Since this ioctl also never has been used by userspace applications (QEMU, Google's internal VMM, kvmtool and CrosVM have been checked), it's likely the best if we remove this badly designed ioctl before anybody really tries to use it. Signed-off-by:
Thomas Huth <thuth@redhat.com> Reviewed-by:
Sean Christopherson <seanjc@google.com> Message-Id: <20230208140105.655814-4-thuth@redhat.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Thomas Huth authored
These two functions only return normal integers, so it does not make sense to declare the return type as "long" here. Reviewed-by:
Claudio Imbrenda <imbrenda@linux.ibm.com> Signed-off-by:
Thomas Huth <thuth@redhat.com> Message-Id: <20230208140105.655814-3-thuth@redhat.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Thomas Huth authored
Most functions that are related to kvm_arch_vm_ioctl() already use "int" as return type to pass error values back to the caller. Some outlier functions use "long" instead for no good reason (they do not really require long values here). Let's standardize on "int" here to avoid casting the values back and forth between the two types. Reviewed-by:
Nicholas Piggin <npiggin@gmail.com> Signed-off-by:
Thomas Huth <thuth@redhat.com> Message-Id: <20230208140105.655814-2-thuth@redhat.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Emanuele Giuseppe Esposito authored
FLUSH_L1D was already added in 11e34e64, but the feature is not visible to userspace yet. The bit definition: CPUID.(EAX=7,ECX=0):EDX[bit 28] If the feature is supported by the host, kvm should support it too so that userspace can choose whether to expose it to the guest or not. One disadvantage of not exposing it is that the guest will report a non existing vulnerability in /sys/devices/system/cpu/vulnerabilities/mmio_stale_data because the mitigation is present only if the guest supports (FLUSH_L1D and MD_CLEAR) or FB_CLEAR. Signed-off-by:
Emanuele Giuseppe Esposito <eesposit@redhat.com> Message-Id: <20230201132905.549148-4-eesposit@redhat.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Emanuele Giuseppe Esposito authored
Expose IA32_FLUSH_CMD to the guest if the guest CPUID enumerates support for this MSR. As with IA32_PRED_CMD, permission for unintercepted writes to this MSR will be granted to the guest after the first non-zero write. Signed-off-by:
Emanuele Giuseppe Esposito <eesposit@redhat.com> Message-Id: <20230201132905.549148-3-eesposit@redhat.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Emanuele Giuseppe Esposito authored
Expose IA32_FLUSH_CMD to the guest if the guest CPUID enumerates support for this MSR. As with IA32_PRED_CMD, permission for unintercepted writes to this MSR will be granted to the guest after the first non-zero write. Co-developed-by:
Jim Mattson <jmattson@google.com> Signed-off-by:
Jim Mattson <jmattson@google.com> Signed-off-by:
Emanuele Giuseppe Esposito <eesposit@redhat.com> Message-Id: <20230201132905.549148-2-eesposit@redhat.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
- 14 Mar, 2023 2 commits
-
-
Sean Christopherson authored
Rename enable_evmcs to __kvm_is_using_evmcs to match its wrapper, and to avoid confusion with enabling eVMCS for nested virtualization, i.e. have "enable eVMCS" be reserved for "enable eVMCS support for L1". No functional change intended. Signed-off-by:
Sean Christopherson <seanjc@google.com> Message-Id: <20230211003534.564198-4-seanjc@google.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Wrap enable_evmcs in a helper and stub it out when CONFIG_HYPERV=n in order to eliminate the static branch nop placeholders. clang-14 is clever enough to elide the nop, but gcc-12 is not. Stubbing out the key reduces the size of kvm-intel.ko by ~7.5% (200KiB) when compiled with gcc-12 (there are a _lot_ of VMCS accesses throughout KVM). Reviewed-by:
Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by:
Sean Christopherson <seanjc@google.com> Message-Id: <20230211003534.564198-3-seanjc@google.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-