- 04 Jun, 2018 1 commit
-
-
Wanpeng Li authored
'Commit d0659d94 ("KVM: x86: add option to advance tscdeadline hrtimer expiration")' advances the tscdeadline (the timer is emulated by hrtimer) expiration in order that the latency which is incurred by hypervisor (apic_timer_fn -> vmentry) can be avoided. This patch adds the advance tscdeadline expiration support to which the tscdeadline timer is emulated by VMX preemption timer to reduce the hypervisor lantency (handle_preemption_timer -> vmentry). The guest can also set an expiration that is very small (for example in Linux if an hrtimer feeds a expiration in the past); in that case we set delta_tsc to 0, leading to an immediately vmexit when delta_tsc is not bigger than advance ns. This patch can reduce ~63% latency (~4450 cycles to ~1660 cycles on a haswell desktop) for kvm-unit-tests/tscdeadline_latency when testing busy waits. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 01 Jun, 2018 7 commits
-
-
Liran Alon authored
We can document other "Known Limitations" but the ones currently referenced don't hold anymore... Signed-off-by: Liran Alon <liran.alon@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Liran Alon authored
Fix outdated statement that KVM is not able to expose SLAT (Second-Layer-Address-Translation) to guests. This was implemented a long time ago... Signed-off-by: Liran Alon <liran.alon@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Greg Kroah-Hartman authored
When calling debugfs functions, there is no need to ever check the return value. The function can work or not, but the code logic should never do something different based on this. This cleans up the error handling a lot, as this code will never get hit. Cc: Paul Mackerras <paulus@ozlabs.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Christoffer Dall <christoffer.dall@arm.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim KrÄmáÅ
™ " <rkrcmar@redhat.com> Cc: Arvind Yadav <arvind.yadav.cs@gmail.com> Cc: Eric Auger <eric.auger@redhat.com> Cc: Andre Przywara <andre.przywara@arm.com> Cc: kvm-ppc@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-kernel@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: kvmarm@lists.cs.columbia.edu Cc: kvm@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> -
Marc Orr authored
The kvm struct has been bloating. For example, it's tens of kilo-bytes for x86, which turns out to be a large amount of memory to allocate contiguously via kzalloc. Thus, this patch does the following: 1. Uses architecture-specific routines to allocate the kvm struct via vzalloc for x86. 2. Switches arm to __KVM_HAVE_ARCH_VM_ALLOC so that it can use vzalloc when has_vhe() is true. Other architectures continue to default to kalloc, as they have a dependency on kalloc or have a small-enough struct kvm. Signed-off-by: Marc Orr <marcorr@google.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Souptick Joarder authored
Use new return type vm_fault_t for fault handler. For now, this is just documenting that the function returns a VM_FAULT value rather than an errno. Once all instances are converted, vm_fault_t will become a distinct type. commit 1c8f4220 ("mm: change return type to vm_fault_t") Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com> Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Paolo Bonzini authored
Merge tag 'kvm-s390-next-4.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD KVM: s390: Cleanups for 4.18 - cleanups for nested, clock handling, crypto, storage keys and control register bits
-
Paolo Bonzini authored
Merge tag 'kvmarm-for-v4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/ARM updates for 4.18 - Lazy context-switching of FPSIMD registers on arm64 - Allow virtual redistributors to be part of two or more MMIO ranges
-
- 26 May, 2018 10 commits
-
-
Liran Alon authored
Signed-off-by: Liran Alon <liran.alon@oracle.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
-
Jim Mattson authored
Document the subtle nuances that KVM_CAP_X86_DISABLE_EXITS induces in the KVM_GET_SUPPORTED_CPUID API. Fixes: 4d5422ce ("KVM: X86: Provide a capability to disable MWAIT intercepts") Signed-off-by: Jim Mattson <jmattson@google.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
-
Vitaly Kuznetsov authored
We need a new capability to indicate support for the newly added HvFlushVirtualAddress{List,Space}{,Ex} hypercalls. Upon seeing this capability, userspace is supposed to announce PV TLB flush features by setting the appropriate CPUID bits (if needed). Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
-
Vitaly Kuznetsov authored
Implement HvFlushVirtualAddress{List,Space}Ex hypercalls in the same way we've implemented non-EX counterparts. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> [Initialized valid_bank_mask to silence misguided GCC warnigs. - Radim] Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
-
Vitaly Kuznetsov authored
Implement HvFlushVirtualAddress{List,Space} hypercalls in a simplistic way: do full TLB flush with KVM_REQ_TLB_FLUSH and kick vCPUs which are currently IN_GUEST_MODE. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
-
Vitaly Kuznetsov authored
Hyper-V style PV TLB flush hypercalls inmplementation will use this API. To avoid memory allocation in CONFIG_CPUMASK_OFFSTACK case add cpumask_var_t argument. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
-
Vitaly Kuznetsov authored
Prepare to support TLB flush hypercalls, some of which are REP hypercalls. Also, return HV_STATUS_INVALID_HYPERCALL_INPUT as it seems more appropriate. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
-
Vitaly Kuznetsov authored
Avoid open-coding offsets for hypercall input parameters, we already have defines for them. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
-
Vitaly Kuznetsov authored
Hyper-V TLB flush hypercalls definitions will be required for KVM so move them hyperv-tlfs.h. Structures also need to be renamed as '_pcpu' suffix is irrelevant for a general-purpose definition. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipRadim Krčmář authored
To resolve conflicts with the PV TLB flush series.
-
- 25 May, 2018 22 commits
-
-
Eric Auger authored
Let's raise the number of supported vcpus along with vgic v3 now that HW is looming with more physical CPUs. Signed-off-by: Eric Auger <eric.auger@redhat.com> Acked-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
Eric Auger authored
Now all the internals are ready to handle multiple redistributor regions, let's allow the userspace to register them. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
Eric Auger authored
This new attribute allows the userspace to set the base address of a reditributor region, relaxing the constraint of having all consecutive redistibutor frames contiguous. Signed-off-by: Eric Auger <eric.auger@redhat.com> Acked-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
Eric Auger authored
On vcpu first run, we eventually know the actual number of vcpus. This is a synchronization point to check all redistributors were assigned. On kvm_vgic_map_resources() we check both dist and redist were set, eventually check potential base address inconsistencies. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
Eric Auger authored
As we are going to register several redist regions, vgic_register_all_redist_iodevs() may be called several times. We need to register a redist_iodev for a given vcpu only once. So let's check if the base address has already been set. Initialize this latter in kvm_vgic_vcpu_init(). Signed-off-by: Eric Auger <eric.auger@redhat.com> Acked-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
Eric Auger authored
kvm_vgic_vcpu_early_init gets called after kvm_vgic_cpu_init which is confusing. The call path is as follows: kvm_vm_ioctl_create_vcpu |_ kvm_arch_cpu_create |_ kvm_vcpu_init |_ kvm_arch_vcpu_init |_ kvm_vgic_vcpu_init |_ kvm_arch_vcpu_postcreate |_ kvm_vgic_vcpu_early_init Static initialization currently done in kvm_vgic_vcpu_early_init() can be moved to kvm_vgic_vcpu_init(). So let's move the code and remove kvm_vgic_vcpu_early_init(). kvm_arch_vcpu_postcreate() does nothing. Signed-off-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
Eric Auger authored
We introduce a new helper that creates and inserts a new redistributor region into the rdist region list. This helper both handles the case where the redistributor region size is known at registration time and the legacy case where it is not (eventually depending on the number of online vcpus). Depending on pfns, we perform all the possible checks that we can do: - end of memory crossing - incorrect alignment of the base address - collision with distributor region if already defined - collision with already registered rdist regions - check of the new index Rdist regions must be inserted by increasing order of indices. Indices must be contiguous. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
Eric Auger authored
vgic_v3_check_base() currently only handles the case of a unique legacy redistributor region whose size is not explicitly set but inferred, instead, from the number of online vcpus. We adapt it to handle the case of multiple redistributor regions with explicitly defined size. We rely on two new helpers: - vgic_v3_rdist_overlap() is used to detect overlap with the dist region if defined - vgic_v3_rd_region_size computes the size of the redist region, would it be a legacy unique region or a new explicitly sized region. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
Eric Auger authored
The TYPER of an redistributor reflects whether the rdist is the last one of the redistributor region. Let's compare the TYPER GPA against the address of the last occupied slot within the redistributor region. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
Eric Auger authored
We introduce vgic_v3_rdist_free_slot to help identifying where we can place a new 2x64KB redistributor. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
Eric Auger authored
At the moment KVM supports a single rdist region. We want to support several separate rdist regions so let's introduce a list of them. This patch currently only cares about a single entry in this list as the functionality to register several redist regions is not yet there. So this only translates the existing code into something functionally similar using that new data struct. The redistributor region handle is stored in the vgic_cpu structure to allow later computation of the TYPER last bit. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
Eric Auger authored
We introduce a new KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION attribute in KVM_DEV_ARM_VGIC_GRP_ADDR group. It allows userspace to provide the base address and size of a redistributor region Compared to KVM_VGIC_V3_ADDR_TYPE_REDIST, this new attribute allows to declare several separate redistributor regions. So the whole redist space does not need to be contiguous anymore. Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Acked-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
Eric Auger authored
in case kvm_vgic_map_resources() fails, typically if the vgic distributor is not defined, __kvm_vgic_destroy will be called several times. Indeed kvm_vgic_map_resources() is called on first vcpu run. As a result dist->spis is freeed more than once and on the second time it causes a "kernel BUG at mm/slub.c:3912!" Set dist->spis to NULL to avoid the crash. Fixes: ad275b8b ("KVM: arm/arm64: vgic-new: vgic_init: implement vgic_init") Signed-off-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
Dave Martin authored
The conversion of the FPSIMD context switch trap code to C has added some overhead to calling it, due to the need to save registers that the procedure call standard defines as caller-saved. So, perhaps it is no longer worth invoking this trap handler quite so early. Instead, we can invoke it from fixup_guest_exit(), with little likelihood of increasing the overhead much further. As a convenience, this patch gives __hyp_switch_fpsimd() the same return semantics fixup_guest_exit(). For now there is no possibility of a spurious FPSIMD trap, so the function always returns true, but this allows it to be tail-called with a single return statement. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
Dave Martin authored
The entire tail of fixup_guest_exit() is contained in if statements of the form if (x && *exit_code == ARM_EXCEPTION_TRAP). As a result, we can check just once and bail out of the function early, allowing the remaining if conditions to be simplified. The only awkward case is where *exit_code is changed to ARM_EXCEPTION_EL1_SERROR in the case of an illegal GICv2 CPU interface access: in that case, the GICv3 trap handling code is skipped using a goto. This avoids pointlessly evaluating the static branch check for the GICv3 case, even though we can't have vgic_v2_cpuif_trap and vgic_v3_cpuif_trap true simultaneously unless we have a GICv3 and GICv2 on the host: that sounds stupid, but I haven't satisfied myself that it can't happen. No functional change. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
Dave Martin authored
In fixup_guest_exit(), there are a couple of cases where after checking what the exit code was, we assign it explicitly with the value it already had. Assuming this is not indicative of a bug, these assignments are not needed. This patch removes the redundant assignments, and simplifies some if-nesting that becomes trivial as a result. No functional change. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Christoffer Dall <christoffer.dall@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
Dave Martin authored
Now that the host SVE context can be saved on demand from Hyp, there is no longer any need to save this state in advance before entering the guest. This patch removes the relevant call to kvm_fpsimd_flush_cpu_state(). Since the problem that function was intended to solve now no longer exists, the function and its dependencies are also deleted. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Christoffer Dall <christoffer.dall@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
Dave Martin authored
This patch adds SVE context saving to the hyp FPSIMD context switch path. This means that it is no longer necessary to save the host SVE state in advance of entering the guest, when in use. In order to avoid adding pointless complexity to the code, VHE is assumed if SVE is in use. VHE is an architectural prerequisite for SVE, so there is no good reason to turn CONFIG_ARM64_VHE off in kernels that support both SVE and KVM. Historically, software models exist that can expose the architecturally invalid configuration of SVE without VHE, so if this situation is detected at kvm_init() time then KVM will be disabled. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
Dave Martin authored
In order to make sve_save_state()/sve_load_state() more easily reusable and to get rid of a potential branch on context switch critical paths, this patch makes sve_pffr() inline and moves it to fpsimd.h. <asm/processor.h> must be included in fpsimd.h in order to make this work, and this creates an #include cycle that is tricky to avoid without modifying core code, due to the way the PR_SVE_*() prctl helpers are included in the core prctl implementation. Instead of breaking the cycle, this patch defers inclusion of <asm/fpsimd.h> in <asm/processor.h> until the point where it is actually needed: i.e., immediately before the prctl definitions. No functional change. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
Dave Martin authored
sve_pffr(), which is used to derive the base address used for low-level SVE save/restore routines, currently takes the relevant task_struct as an argument. The only accessed fields are actually part of thread_struct, so this patch changes the argument type accordingly. This is done in preparation for moving this function to a header, where we do not want to have to include <linux/sched.h> due to the consequent circular #include problems. No functional change. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
Dave Martin authored
Having read_zcr_features() inline in cpufeature.h results in that header requiring #includes which make it hard to include <asm/fpsimd.h> elsewhere without triggering header inclusion cycles. This is not a hot-path function and arguably should not be in cpufeature.h in the first place, so this patch moves it to fpsimd.c, compiled conditionally if CONFIG_ARM64_SVE=y. This allows some SVE-related #includes to be dropped from cpufeature.h, which will ease future maintenance. A couple of missing #includes of <asm/fpsimd.h> are exposed by this change under arch/arm64/. This patch adds the missing #includes as necessary. No functional change. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
Dave Martin authored
This patch refactors KVM to align the host and guest FPSIMD save/restore logic with each other for arm64. This reduces the number of redundant save/restore operations that must occur, and reduces the common-case IRQ blackout time during guest exit storms by saving the host state lazily and optimising away the need to restore the host state before returning to the run loop. Four hooks are defined in order to enable this: * kvm_arch_vcpu_run_map_fp(): Called on PID change to map necessary bits of current to Hyp. * kvm_arch_vcpu_load_fp(): Set up FP/SIMD for entering the KVM run loop (parse as "vcpu_load fp"). * kvm_arch_vcpu_ctxsync_fp(): Get FP/SIMD into a safe state for re-enabling interrupts after a guest exit back to the run loop. For arm64 specifically, this involves updating the host kernel's FPSIMD context tracking metadata so that kernel-mode NEON use will cause the vcpu's FPSIMD state to be saved back correctly into the vcpu struct. This must be done before re-enabling interrupts because kernel-mode NEON may be used by softirqs. * kvm_arch_vcpu_put_fp(): Save guest FP/SIMD state back to memory and dissociate from the CPU ("vcpu_put fp"). Also, the arm64 FPSIMD context switch code is updated to enable it to save back FPSIMD state for a vcpu, not just current. A few helpers drive this: * fpsimd_bind_state_to_cpu(struct user_fpsimd_state *fp): mark this CPU as having context fp (which may belong to a vcpu) currently loaded in its registers. This is the non-task equivalent of the static function fpsimd_bind_to_cpu() in fpsimd.c. * task_fpsimd_save(): exported to allow KVM to save the guest's FPSIMD state back to memory on exit from the run loop. * fpsimd_flush_state(): invalidate any context's FPSIMD state that is currently loaded. Used to disassociate the vcpu from the CPU regs on run loop exit. These changes allow the run loop to enable interrupts (and thus softirqs that may use kernel-mode NEON) without having to save the guest's FPSIMD state eagerly. Some new vcpu_arch fields are added to make all this work. Because host FPSIMD state can now be saved back directly into current's thread_struct as appropriate, host_cpu_context is no longer used for preserving the FPSIMD state. However, it is still needed for preserving other things such as the host's system registers. To avoid ABI churn, the redundant storage space in host_cpu_context is not removed for now. arch/arm is not addressed by this patch and continues to use its current save/restore logic. It could provide implementations of the helpers later if desired. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@arm.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-