- 20 Jun, 2022 2 commits
-
-
Sean Christopherson authored
If a nested run isn't pending, snapshot vmcs01.GUEST_IA32_DEBUGCTL irrespective of whether or not VM_ENTRY_LOAD_DEBUG_CONTROLS is set in vmcs12. When restoring nested state, e.g. after migration, without a nested run pending, prepare_vmcs02() will propagate nested.vmcs01_debugctl to vmcs02, i.e. will load garbage/zeros into vmcs02.GUEST_IA32_DEBUGCTL. If userspace restores nested state before MSRs, then loading garbage is a non-issue as loading DEBUGCTL will also update vmcs02. But if usersepace restores MSRs first, then KVM is responsible for propagating L2's value, which is actually thrown into vmcs01, into vmcs02. Restoring L2 MSRs into vmcs01, i.e. loading all MSRs before nested state is all kinds of bizarre and ideally would not be supported. Sadly, some VMMs do exactly that and rely on KVM to make things work. Note, there's still a lurking SMM bug, as propagating vmcs01's DEBUGCTL to vmcs02 across RSM may corrupt L2's DEBUGCTL. But KVM's entire VMX+SMM emulation is flawed as SMI+RSM should not toouch _any_ VMCS when use the "default treatment of SMIs", i.e. when not using an SMI Transfer Monitor. Link: https://lore.kernel.org/all/Yobt1XwOfb5M6Dfa@google.com Fixes: 8fcc4b59 ("kvm: nVMX: Introduce KVM_CAP_NESTED_STATE") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220614215831.3762138-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
If a nested run isn't pending, snapshot vmcs01.GUEST_BNDCFGS irrespective of whether or not VM_ENTRY_LOAD_BNDCFGS is set in vmcs12. When restoring nested state, e.g. after migration, without a nested run pending, prepare_vmcs02() will propagate nested.vmcs01_guest_bndcfgs to vmcs02, i.e. will load garbage/zeros into vmcs02.GUEST_BNDCFGS. If userspace restores nested state before MSRs, then loading garbage is a non-issue as loading BNDCFGS will also update vmcs02. But if usersepace restores MSRs first, then KVM is responsible for propagating L2's value, which is actually thrown into vmcs01, into vmcs02. Restoring L2 MSRs into vmcs01, i.e. loading all MSRs before nested state is all kinds of bizarre and ideally would not be supported. Sadly, some VMMs do exactly that and rely on KVM to make things work. Note, there's still a lurking SMM bug, as propagating vmcs01.GUEST_BNDFGS to vmcs02 across RSM may corrupt L2's BNDCFGS. But KVM's entire VMX+SMM emulation is flawed as SMI+RSM should not toouch _any_ VMCS when use the "default treatment of SMIs", i.e. when not using an SMI Transfer Monitor. Link: https://lore.kernel.org/all/Yobt1XwOfb5M6Dfa@google.com Fixes: 62cf9bd8 ("KVM: nVMX: Fix emulation of VM_ENTRY_LOAD_BNDCFGS") Cc: stable@vger.kernel.org Cc: Lei Wang <lei4.wang@intel.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220614215831.3762138-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 15 Jun, 2022 13 commits
-
-
Uros Bizjak authored
Use try_cmpxchg64 instead of cmpxchg64 (*ptr, old, new) != old in fast_pf_fix_direct_spte. cmpxchg returns success in ZF flag, so this change saves a compare after cmpxchg (and related move instruction in front of cmpxchg). Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Wanpeng Li <wanpengli@tencent.com> Cc: Jim Mattson <jmattson@google.com> Cc: Joerg Roedel <joro@8bytes.org> Message-Id: <20220520144635.63134-1-ubizjak@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Uros Bizjak authored
Use try_cmpxchg64 instead of cmpxchg64 (*ptr, old, new) != old in pi_try_set_control. cmpxchg returns success in ZF flag, so this change saves a compare after cmpxchg (and related move instruction in front of cmpxchg): b9: 88 44 24 60 mov %al,0x60(%rsp) bd: 48 89 c8 mov %rcx,%rax c0: c6 44 24 62 f2 movb $0xf2,0x62(%rsp) c5: 48 8b 74 24 60 mov 0x60(%rsp),%rsi ca: f0 49 0f b1 34 24 lock cmpxchg %rsi,(%r12) d0: 48 39 c1 cmp %rax,%rcx d3: 75 cf jne a4 <vmx_vcpu_pi_load+0xa4> patched: c1: 88 54 24 60 mov %dl,0x60(%rsp) c5: c6 44 24 62 f2 movb $0xf2,0x62(%rsp) ca: 48 8b 54 24 60 mov 0x60(%rsp),%rdx cf: f0 48 0f b1 13 lock cmpxchg %rdx,(%rbx) d4: 75 d5 jne ab <vmx_vcpu_pi_load+0xab> Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Wanpeng Li <wanpengli@tencent.com> Cc: Jim Mattson <jmattson@google.com> Cc: Joerg Roedel <joro@8bytes.org> Reported-by: kernel test robot <lkp@intel.com> Message-Id: <20220520143737.62513-1-ubizjak@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Uros Bizjak authored
Use try_cmpxchg64 instead of cmpxchg64 (*ptr, old, new) != old in tdp_mmu_set_spte_atomic. cmpxchg returns success in ZF flag, so this change saves a compare after cmpxchg (and related move instruction in front of cmpxchg). Also, remove explicit assignment to iter->old_spte when cmpxchg fails, this is what try_cmpxchg does implicitly. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Sean Christopherson <seanjc@google.com> Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Reviewed-by: David Matlack <dmatlack@google.com> Message-Id: <20220518135111.3535-1-ubizjak@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
When handling userspace MSR filter updates, recompute interception for possible passthrough MSRs if and only if KVM wants to disabled interception. If KVM wants to intercept accesses, i.e. the associated bit is set in vmx->shadow_msr_intercept, then there's no need to set the intercept again as KVM will intercept the MSR regardless of userspace's wants. No functional change intended, the call to vmx_enable_intercept_for_msr() really is just a gigantic nop. Suggested-by: Aaron Lewis <aaronlewis@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220610214140.612025-1-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Drop the CMPXCHG macro from paging_tmpl.h, it's no longer used now that KVM uses a common uaccess helper to do 8-byte CMPXCHG. Fixes: f122dfe4 ("KVM: x86: Use __try_cmpxchg_user() to update guest PTE A/D bits") Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220613225723.2734132-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Lai Jiangshan authored
Use root_level in svm_load_mmu_pg() rather that looking up the root level in vcpu->arch.mmu->root_role.level. svm_load_mmu_pgd() has only one caller, kvm_mmu_load_pgd(), which always passes vcpu->arch.mmu->root_role.level as root_level. Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com> Message-Id: <20220605063417.308311-7-jiangshanlai@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Lai Jiangshan authored
Since the commit c5e2184d("KVM: x86/mmu: Remove the defunct update_pte() paging hook"), kvm_mmu_pte_write() no longer uses the rmap cache. So remove mmu_topup_memory_caches() in it. Cc: Sean Christopherson <seanjc@google.com> Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com> Message-Id: <20220605063417.308311-6-jiangshanlai@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Lai Jiangshan authored
Make it use the same verb as in kvm_kick_many_cpus(). Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com> Message-Id: <20220605063417.308311-5-jiangshanlai@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Lai Jiangshan authored
It is unused. Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com> Message-Id: <20220605063417.308311-3-jiangshanlai@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Janis Schoetterl-Glausch authored
Fix the inverted logic of the memop extension capability check. Fixes: 97da92c0 ("KVM: s390: selftests: Use TAP interface in the memop test") Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com> Message-Id: <20220614162635.3445019-1-scgl@linux.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Wrap the manipulation of @role and the manual mutex_{release,acquire}() invocations in CONFIG_PROVE_LOCKING=y to squash a clang-15 warning. When building with -Wunused-but-set-parameter and CONFIG_DEBUG_LOCK_ALLOC=n, clang-15 seees there's no usage of @role in mutex_lock_killable_nested() and yells. PROVE_LOCKING selects DEBUG_LOCK_ALLOC, and the only reason KVM manipulates @role is to make PROVE_LOCKING happy. To avoid true ugliness, use "i" and "j" to detect the first pass in the loops; the "idx" field that's used by kvm_for_each_vcpu() is guaranteed to be '0' on the first pass as it's simply the first entry in the vCPUs XArray, which is fully KVM controlled. kvm_for_each_vcpu() passes '0' for xa_for_each_range()'s "start", and xa_for_each_range() will not enter the loop if there's no entry at '0'. Fixes: 0c2c7c06 ("KVM: SEV: Mark nested locking of vcpu->lock") Reported-by: kernel test robot <lkp@intel.com> Cc: Peter Gonda <pgonda@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220613214237.2538266-1-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Paolo Bonzini authored
This caused a warning on 32-bit systems, but undoubtedly would have acted funny on 64-bit as well. The fix was applied directly on merge in 5.19, see commit 24625f7d ("Merge tag for-linus of git://git.kernel.org/pub/scm/virt/kvm/kvm"). Fixes: 3743c2f0 ("KVM: x86: inhibit APICv/AVIC on changes to APIC ID or APIC base") Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Shaoqin Huang authored
There are some parameter being removed in function but the parameter comments still exist, so remove them. Signed-off-by: Shaoqin Huang <shaoqin.huang@intel.com> Message-Id: <20220614224126.211054-1-shaoqin.huang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 14 Jun, 2022 5 commits
-
-
Sean Christopherson authored
Replace calls to kvm_check_cap() that treat its return as a boolean with calls to kvm_has_cap(). Several instances of kvm_check_cap() were missed when kvm_has_cap() was introduced. Reported-by: Andrew Jones <drjones@redhat.com> Fixes: 3ea9b809 ("KVM: selftests: Add kvm_has_cap() to provide syntactic sugar") Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220613161942.1586791-5-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Remove a duplicate TEST_ASSERT() on the number of runnable vCPUs in vm_nr_pages_required() that snuck in during a rebase gone bad. Reported-by: Andrew Jones <drjones@redhat.com> Fixes: 6e1d13bf ("KVM: selftests: Move per-VM/per-vCPU nr pages calculation to __vm_create()") Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220613161942.1586791-4-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Replace the goofy static_assert on the size of the @vm/@vcpu parameters with a call to a dummy helper, i.e. let the compiler naturally complain about an incompatible type instead of homebrewing a poor replacement. Reported-by: Andrew Jones <drjones@redhat.com> Fixes: fcba483e ("KVM: selftests: Sanity check input to ioctls() at build time") Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220613161942.1586791-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Add an apostrophe in a comment about it being the caller's, not callers, responsibility to free an object. Reported-by: Andrew Jones <drjones@redhat.com> Fixes: 768e9a61 ("KVM: selftests: Purge vm+vcpu_id == vcpu silliness") Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220613161942.1586791-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Andrew Jones authored
kvm_binary_stats_test accepts two arguments, the number of vms and number of vcpus. If these inputs are not equal then the test would likely crash for one reason or another due to using miscalculated indices for the vcpus array. Fix the index expressions by swapping the use of i and j. Signed-off-by: Andrew Jones <drjones@redhat.com> Message-Id: <20220614081041.2571511-1-drjones@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 11 Jun, 2022 20 commits
-
-
Sean Christopherson authored
Add a static assert to the KVM/VM/vCPU ioctl() helpers to verify that the size of the argument provided matches the expected size of the IOCTL. Because ioctl() ultimately takes a "void *", it's all too easy to pass in garbage and not detect the error until runtime. E.g. while working on a CPUID rework, selftests happily compiled when vcpu_set_cpuid() unintentionally passed the cpuid() function as the parameter to ioctl() (a local "cpuid" parameter was removed, but its use was not replaced with "vcpu->cpuid" as intended). Tweak a variety of benign issues that aren't compatible with the sanity check, e.g. passing a non-pointer for ioctls(). Note, static_assert() requires a string on older versions of GCC. Feed it an empty string to make the compiler happy. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Use the TAP-friendly ksft_exit_skip() instead of KVM's custom print_skip() when skipping a test via __TEST_REQUIRE. KVM's "skipping test" has no known benefit, whereas some setups rely on TAP output. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Add TEST_REQUIRE() and __TEST_REQUIRE() to replace the myriad open coded instances of selftests exiting with KSFT_SKIP after printing an informational message. In addition to reducing the amount of boilerplate code in selftests, the UPPERCASE macro names make it easier to visually identify a test's requirements. Convert usage that erroneously uses something other than print_skip() and/or "exits" with '0' or some other non-KSFT_SKIP value. Intentionally drop a kvm_vm_free() in aarch64/debug-exceptions.c as part of the conversion. All memory and file descriptors are freed on process exit, so the explicit free is superfluous. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Add kvm_has_cap() to wrap kvm_check_cap() and return a bool for the use cases where the caller only wants check if a capability is supported, i.e. doesn't care about the value beyond whether or not it's non-zero. The "check" terminology is somewhat ambiguous as the non-boolean return suggests that '0' might mean "success", i.e. suggests that the ioctl uses the 0/-errno pattern. Provide a wrapper instead of trying to find a new name for the raw helper; the "check" terminology is derived from the name of the ioctl, so using e.g. "get" isn't a clear win. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Return an 'unsigned int' instead of a signed 'int' from kvm_check_cap(), to make it more obvious that kvm_check_cap() can never return a negative value due to its assertion that the return is ">= 0". Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Remove DEFAULT_GUEST_PHY_PAGES and open code the magic number (with a comment) in vm_nr_pages_required(). Exposing DEFAULT_GUEST_PHY_PAGES to tests was a symptom of the VM creation APIs not cleanly supporting tests that create runnable vCPUs, but can't do so immediately. Now that tests don't have to manually compute the amount of memory needed for basic operation, make it harder for tests to do things that should be handled by the framework, i.e. force developers to improve the framework instead of hacking around flaws in individual tests. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Use vm->max_gfn to compute the highest gpa in vmx_apic_access_test, and blindly trust that the highest gfn/gpa will be well above the memory carved out for memslot0. The existing check is beyond paranoid; KVM doesn't support CPUs with host.MAXPHYADDR < 32, and the selftests are all kinds of hosed if memslot0 overlaps the local xAPIC, which resides above "lower" (below 4gb) DRAM. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Handle all memslot0 size adjustments in __vm_create(). Currently, the adjustments reside in __vm_create_with_vcpus(), which means tests that call vm_create() or __vm_create() directly are left to their own devices. Some tests just pass DEFAULT_GUEST_PHY_PAGES and don't bother with any adjustments, while others mimic the per-vCPU calculations. For vm_create(), and thus __vm_create(), take the number of vCPUs that will be runnable to calculate that number of per-vCPU pages needed for memslot0. To give readers a hint that neither vm_create() nor __vm_create() create vCPUs, name the parameter @nr_runnable_vcpus instead of @nr_vcpus. That also gives readers a hint as to why tests that create larger numbers of vCPUs but never actually run those vCPUs can skip straight to the vm_create_barebones() variant. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Drop @num_percpu_pages from __vm_create_with_vcpus(), all callers pass '0' and there's unlikely to be a test that allocates just enough memory that it needs a per-CPU allocation, but not so much that it won't just do its own memory management. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
All callers of __vm_create_with_vcpus() pass DEFAULT_GUEST_PHY_PAGES for @slot_mem_pages; drop the param and just hardcode the "default" as the base number of pages for slot0. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Drop a variety of 'struct kvm_vm' accessors that wrap a single variable now that tests can simply reference the variable directly. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Drop vcpu_state() now that all tests reference vcpu->run directly. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Drop vcpu_get() and rename vcpu_find() to vcpu_exists() to make it that much harder for a test to give meaning to a vCPU ID. I.e. force tests to capture a vCPU when the vCPU is created. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Take a vCPU directly instead of a VM+vcpu pair in all vCPU-scoped helpers and ioctls. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Require the caller of __vm_create_with_vcpus() to provide a non-NULL array of vCPUs now that all callers do so. It's extremely unlikely a test will have a legitimate use case for creating a VM with vCPUs without wanting to do something with those vCPUs, and if there is such a use case, requiring that one-off test to provide a dummy array is a minor annoyance. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Grab the vCPU from vm_vcpu_add() directly instead of doing vcpu_get() after the fact. This will allow removing vcpu_get() entirely. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Track vCPUs by their 'struct kvm_vcpu' object, and stop assuming that a vCPU's ID is the same as its index when referencing a vCPU's metadata. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
In preparation for taking a vCPU pointer in vCPU-scoped functions, grab the vCPU(s) created by __vm_vcpu_add() and use the ID from the vCPU object instead of hardcoding the ID in ioctl() invocations. Rename init1/init2 => init0/init1 to avoid having odd/confusing code where vcpu0 consumes init1 and vcpu1 consumes init2. Note, this change could easily be done when the functions are converted in the future, and/or the vcpu{0,1} vs. init{1,2} discrepancy could be ignored, but then there would be no opportunity to poke fun at the 1-based counting scheme. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Track the vCPU's 'struct kvm_vcpu' object in get-reg-list instead of hardcoding '0' everywhere. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Sean Christopherson authored
Track vCPUs by their 'struct kvm_vcpu' object in kvm_binary_stats_test, not by their ID. The per-vCPU helpers will soon take a vCPU instead of a VM+vcpu_id pair. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-