- 05 May, 2017 1 commit
-
-
Jim Mattson authored
According to the SDM, if the "activate secondary controls" primary processor-based VM-execution control is 0, no checks are performed on the secondary processor-based VM-execution controls. Signed-off-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 04 May, 2017 1 commit
-
-
Paolo Bonzini authored
The #ifndef was removed in 75aaafb7, but it was also protecting smp_send_reschedule() in kvm_vcpu_kick(). Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 03 May, 2017 2 commits
-
-
Paolo Bonzini authored
This reverts commit bbd64115. I've been sitting on this revert for too long and it unfortunately missed 4.11. It's also the reason why I haven't merged ring-based dirty tracking for 4.12. Using kvm_vcpu_memslots in kvm_gfn_to_hva_cache_init and kvm_vcpu_write_guest_offset_cached means that the MSR value can now be used to access SMRAM, simply by making it point to an SMRAM physical address. This is problematic because it lets the guest OS overwrite memory that it shouldn't be able to touch. Cc: stable@vger.kernel.org Fixes: bbd64115Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Justin M. Forbes authored
The top level tools/Makefile includes kvm_stat as a target in help, but the actual target is missing. Signed-off-by: Justin M. Forbes <jforbes@fedoraproject.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 02 May, 2017 1 commit
-
-
David Hildenbrand authored
We needed the lock to avoid racing with creation of the irqchip on x86. As kvm_set_irq_routing() calls srcu_synchronize_expedited(), this lock might be held for a longer time. Let's introduce an arch specific callback to check if we can actually add irq routes. For x86, all we have to do is check if we have an irqchip in the kernel. We don't need kvm->lock at that point as the irqchip is marked as inititalized only when actually fully created. Reported-by: Steve Rutherford <srutherford@google.com> Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Fixes: 1df6dded ("KVM: x86: race between KVM_SET_GSI_ROUTING and KVM_CREATE_IRQCHIP") Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 28 Apr, 2017 1 commit
-
-
Jann Horn authored
Since commit 80f5b5e7 ("KVM: remove vm mmap method"), the VM mmap handler is gone. Remove the corresponding documentation. Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 27 Apr, 2017 12 commits
-
-
Paolo Bonzini authored
Merge tag 'kvm-arm-for-v4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/ARM Changes for v4.12. Changes include: - Using the common sysreg definitions between KVM and arm64 - Improved hyp-stub implementation with support for kexec and kdump on the 32-bit side - Proper PMU exception handling - Performance improvements of our GIC handling - Support for irqchip in userspace with in-kernel arch-timers and PMU support - A fix for a race condition in our PSCI code Conflicts: Documentation/virtual/kvm/api.txt include/uapi/linux/kvm.h
-
Jim Mattson authored
According to the Intel SDM, "Certain exceptions have priority over VM exits. These include invalid-opcode exceptions, faults based on privilege level*, and general-protection exceptions that are based on checking I/O permission bits in the task-state segment (TSS)." There is no need to check for faulting conditions that the hardware has already checked. * These include faults generated by attempts to execute, in virtual-8086 mode, privileged instructions that are not recognized in that mode. Signed-off-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Ladi Prosek authored
On AMD, the effect of set_nmi_mask called by emulate_iret_real and em_rsm on hflags is reverted later on in x86_emulate_instruction where hflags are overwritten with ctxt->emul_flags (the kvm_set_hflags call). This manifests as a hang when rebooting Windows VMs with QEMU, OVMF, and >1 vcpu. Instead of trying to merge ctxt->emul_flags into vcpu->arch.hflags after an instruction is emulated, this commit deletes emul_flags altogether and makes the emulator access vcpu->arch.hflags using two new accessors. This way all changes, on the emulator side as well as in functions called from the emulator and accessing vcpu state with emul_to_vcpu, are preserved. More details on the bug and its manifestation with Windows and OVMF: It's a KVM bug in the interaction between SMI/SMM and NMI, specific to AMD. I believe that the SMM part explains why we started seeing this only with OVMF. KVM masks and unmasks NMI when entering and leaving SMM. When KVM emulates the RSM instruction in em_rsm, the set_nmi_mask call doesn't stick because later on in x86_emulate_instruction we overwrite arch.hflags with ctxt->emul_flags, effectively reverting the effect of the set_nmi_mask call. The AMD-specific hflag of interest here is HF_NMI_MASK. When rebooting the system, Windows sends an NMI IPI to all but the current cpu to shut them down. Only after all of them are parked in HLT will the initiating cpu finish the restart. If NMI is masked, other cpus never get the memo and the initiating cpu spins forever, waiting for hal!HalpInterruptProcessorsStarted to drop. That's the symptom we observe. Fixes: a584539b ("KVM: x86: pass the whole hflags field to emulator and back") Signed-off-by: Ladi Prosek <lprosek@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Paolo Bonzini authored
kvm_make_all_requests() provides a synchronization that waits until all kicked VCPUs have acknowledged the kick. This is important for KVM_REQ_MMU_RELOAD as it prevents freeing while lockless paging is underway. This patch adds the synchronization property into all requests that are currently being used with kvm_make_all_requests() in order to preserve the current behavior and only introduce a new framework. Removing it from requests where it is not necessary is left for future patches. Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Radim Krčmář authored
No need to kick a VCPU that we have just woken up. Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Andrew Jones authored
kvm_vcpu_kick() must issue a general memory barrier prior to reading vcpu->mode in order to ensure correctness of the mutual-exclusion memory barrier pattern used with vcpu->requests. While the cmpxchg called from kvm_vcpu_kick(): kvm_vcpu_kick kvm_arch_vcpu_should_kick kvm_vcpu_exiting_guest_mode cmpxchg implies general memory barriers before and after the operation, that implication is only valid when cmpxchg succeeds. We need an explicit barrier for when it fails, otherwise a VCPU thread on its entry path that reads zero for vcpu->requests does not exclude the possibility the requesting thread sees !IN_GUEST_MODE when it reads vcpu->mode. kvm_make_all_cpus_request already had a barrier, so we remove it, as now it would be redundant. Signed-off-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Radim Krčmář authored
We want to have kvm_make_all_cpus_request() to be an optmized version of kvm_for_each_vcpu(i, vcpu, kvm) { kvm_make_request(vcpu, request); kvm_vcpu_kick(vcpu); } and kvm_vcpu_kick() wakes up the target vcpu. We know which requests do not need the wake up and use it to optimize the loop. Thanks to that, this patch doesn't change the behavior of current users (the all don't need the wake up) and only prepares for future where the wake up is going to be needed. I think that most requests do not need the wake up, so we would flip the bit then. Later on, kvm_make_request() will take care of kicking too, using this bit to make the decision whether to kick or not. Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Radim Krčmář authored
Some operations must ensure that the guest is not running with stale data, but if the guest is halted, then the update can wait until another event happens. kvm_make_all_requests() currently doesn't wake up, so we can mark all requests used with it. First 8 bits were arbitrarily reserved for request numbers. Most uses of requests have the request type as a constant, so a compiler will optimize the '&'. An alternative would be to have an inline function that would return whether the request needs a wake-up or not, but I like this one better even though it might produce worse assembly. Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Radim Krčmář authored
The #ifndef was protecting a missing halt_wakeup stat, but that is no longer necessary. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Radim Krčmář authored
Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Radim Krčmář authored
Users were expected to use kvm_check_request() for testing and clearing, but request have expanded their use since then and some users want to only test or do a faster clear. Make sure that requests are not directly accessed with bit operations. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Paolo Bonzini authored
Merge tag 'kvm-s390-next-4.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD KVM: s390: MSA8 feature for guests - Detect all function codes for KMA and export the features for use in the cpu model
-
- 26 Apr, 2017 4 commits
-
-
Jason J. Herne authored
msa6 and msa7 require no changes. msa8 adds kma instruction and feature area. Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
-
Jason J. Herne authored
Provide a kma instruction definition for use by callers of __cpacf_query. Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
-
Jason J. Herne authored
The new KMA instruction requires unique parameters. Update __cpacf_query to generate a compatible assembler instruction. Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com> Acked-by: Harald Freudenberger <freude@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
-
- 21 Apr, 2017 9 commits
-
-
Marcelo Tosatti authored
The disablement of interrupts at KVM_SET_CLOCK/KVM_GET_CLOCK attempts to disable software suspend from causing "non atomic behaviour" of the operation: Add a helper function to compute the kernel time and convert nanoseconds back to CPU specific cycles. Note that these must not be called in preemptible context, as that would mean the kernel could enter software suspend state, which would cause non-atomic operation. However, assume the kernel can enter software suspend at the following 2 points: ktime_get_ts(&ts); 1. hypothetical_ktime_get_ts(&ts) monotonic_to_bootbased(&ts); 2. monotonic_to_bootbased() should be correct relative to a ktime_get_ts(&ts) performed after point 1 (that is after resuming from software suspend), hypothetical_ktime_get_ts() Therefore it is also correct for the ktime_get_ts(&ts) before point 1, which is ktime_get_ts(&ts) = hypothetical_ktime_get_ts(&ts) + time-to-execute-suspend-code Note CLOCK_MONOTONIC does not count during suspension. So remove the irq disablement, which causes the following warning on -RT kernels: With this reasoning, and the -RT bug that the irq disablement causes (because spin_lock is now a sleeping lock), remove the IRQ protection as it causes: [ 1064.668109] in_atomic(): 0, irqs_disabled(): 1, pid: 15296, name:m [ 1064.668110] INFO: lockdep is turned off. [ 1064.668110] irq event stamp: 0 [ 1064.668112] hardirqs last enabled at (0): [< (null)>] ) [ 1064.668116] hardirqs last disabled at (0): [] c0 [ 1064.668118] softirqs last enabled at (0): [] c0 [ 1064.668118] softirqs last disabled at (0): [< (null)>] ) [ 1064.668121] CPU: 13 PID: 15296 Comm: qemu-kvm Not tainted 3.10.0-1 [ 1064.668121] Hardware name: Dell Inc. PowerEdge R730/0H21J3, BIOS 5 [ 1064.668123] ffff8c1796b88000 00000000afe7344c ffff8c179abf3c68 f3 [ 1064.668125] ffff8c179abf3c90 ffffffff930ccb3d ffff8c1b992b3610 f0 [ 1064.668126] 00007ffc1a26fbc0 ffff8c179abf3cb0 ffffffff9375f694 f0 [ 1064.668126] Call Trace: [ 1064.668132] [] dump_stack+0x19/0x1b [ 1064.668135] [] __might_sleep+0x12d/0x1f0 [ 1064.668138] [] rt_spin_lock+0x24/0x60 [ 1064.668155] [] __get_kvmclock_ns+0x36/0x110 [k] [ 1064.668159] [] ? futex_wait_queue_me+0x103/0x10 [ 1064.668171] [] kvm_arch_vm_ioctl+0xa2/0xd70 [k] [ 1064.668173] [] ? futex_wait+0x1ac/0x2a0 v2: notice get_kvmclock_ns with the same problem (Pankaj). v3: remove useless helper function (Pankaj). Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Michael S. Tsirkin authored
Guests that are heavy on futexes end up IPI'ing each other a lot. That can lead to significant slowdowns and latency increase for those guests when running within KVM. If only a single guest is needed on a host, we have a lot of spare host CPU time we can throw at the problem. Modern CPUs implement a feature called "MWAIT" which allows guests to wake up sleeping remote CPUs without an IPI - thus without an exit - at the expense of never going out of guest context. The decision whether this is something sensible to use should be up to the VM admin, so to user space. We can however allow MWAIT execution on systems that support it properly hardware wise. This patch adds a CAP to user space and a KVM cpuid leaf to indicate availability of native MWAIT execution. With that enabled, the worst a guest can do is waste as many cycles as a "jmp ." would do, so it's not a privilege problem. We consciously do *not* expose the feature in our CPUID bitmap, as most people will want to benefit from sleeping vCPUs to allow for over commit. Reported-by: "Gabriel L. Somlo" <gsomlo@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> [agraf: fix amd, change commit message] Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Kyle Huey authored
Hardware support for faulting on the cpuid instruction is not required to emulate it, because cpuid triggers a VM exit anyways. KVM handles the relevant MSRs (MSR_PLATFORM_INFO and MSR_MISC_FEATURES_ENABLE) and upon a cpuid-induced VM exit checks the cpuid faulting state and the CPL. kvm_require_cpl is even kind enough to inject the GP fault for us. Signed-off-by: Kyle Huey <khuey@kylehuey.com> Reviewed-by: David Matlack <dmatlack@google.com> [Return "1" from kvm_emulate_cpuid, it's not void. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Paolo Bonzini authored
Merge tag 'kvm-s390-next-4.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD KVM: s390: Guarded storage fixup and keyless subset mode - detect and use the keyless subset mode (guests without storage keys) - fix vSIE support for sdnxc - fix machine check data for guarded storage
-
Paolo Bonzini authored
Merge branch 'kvm-ppc-next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into HEAD
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipPaolo Bonzini authored
Required for KVM support of the CPUID faulting feature. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
David Hildenbrand authored
vmm_exclusive=0 leads to KVM setting X86_CR4_VMXE always and calling VMXON only when the vcpu is loaded. X86_CR4_VMXE is used as an indication in cpu_emergency_vmxoff() (called on kdump) if VMXOFF has to be called. This is obviously not the case if both are used independtly. Calling VMXOFF without a previous VMXON will result in an exception. In addition, X86_CR4_VMXE is used as a mean to test if VMX is already in use by another VMM in hardware_enable(). So there can't really be co-existance. If the other VMM is prepared for co-existance and does a similar check, only one VMM can exist. If the other VMM is not prepared and blindly sets/clears X86_CR4_VMXE, we will get inconsistencies with X86_CR4_VMXE. As we also had bug reports related to clearing of vmcs with vmm_exclusive=0 this seems to be pretty much untested. So let's better drop it. While at it, directly move setting/clearing X86_CR4_VMXE into kvm_cpu_vmxon/off. Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Farhan Ali authored
If the KSS facility is available on the machine, we also make it available for our KVM guests. The KSS facility bypasses storage key management as long as the guest does not issue a related instruction. When that happens, the control is returned to the host, which has to turn off KSS for a guest vcpu before retrying the instruction. Signed-off-by: Corey S. McQuay <csmcquay@linux.vnet.ibm.com> Signed-off-by: Farhan Ali <alifm@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
-
Farhan Ali authored
Let's detect the keyless subset facility. Signed-off-by: Farhan Ali <alifm@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
-
- 20 Apr, 2017 9 commits
-
-
Marc Zyngier authored
When entering the hyp stub implemented in the idmap, we try to be mindful of the fact that we could be running a Thumb-2 kernel by adding 1 to the address we compute. Unfortunately, the assembler also knows about this trick, and has already generated an address that has bit 0 set in the litteral pool. Our superfluous correction ends up confusing the CPU entierely, as we now branch to the stub in ARM mode instead of Thumb, and on a possibly unaligned address for good measure. From that point, nothing really good happens. The obvious fix in to remove this stupid target PC correction. Fixes: 6bebcecb ("ARM: KVM: Allow the main HYP code to use the init hyp stub implementation") Reported-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <cdall@linaro.org>
-
Marc Zyngier authored
The assembler defaults to emiting the short form of ADR, leading to an out-of-range immediate. Using the wide version solves this issue. Fixes: bc845e4f ("ARM: KVM: Implement HVC_RESET_VECTORS stub hypercall in the init code") Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <cdall@linaro.org>
-
Thomas Huth authored
According to the PowerISA 2.07, mtspr and mfspr should not always generate an illegal instruction exception when being used with an undefined SPR, but rather treat the instruction as a NOP or inject a privilege exception in some cases, too - depending on the SPR number. Also turn the printk here into a ratelimited print statement, so that the guest can not flood the dmesg log of the host by issueing lots of illegal mtspr/mfspr instruction here. Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
-
Alexey Kardashevskiy authored
This allows the host kernel to handle H_PUT_TCE, H_PUT_TCE_INDIRECT and H_STUFF_TCE requests targeted an IOMMU TCE table used for VFIO without passing them to user space which saves time on switching to user space and back. This adds H_PUT_TCE/H_PUT_TCE_INDIRECT/H_STUFF_TCE handlers to KVM. KVM tries to handle a TCE request in the real mode, if failed it passes the request to the virtual mode to complete the operation. If it a virtual mode handler fails, the request is passed to the user space; this is not expected to happen though. To avoid dealing with page use counters (which is tricky in real mode), this only accelerates SPAPR TCE IOMMU v2 clients which are required to pre-register the userspace memory. The very first TCE request will be handled in the VFIO SPAPR TCE driver anyway as the userspace view of the TCE table (iommu_table::it_userspace) is not allocated till the very first mapping happens and we cannot call vmalloc in real mode. If we fail to update a hardware IOMMU table unexpected reason, we just clear it and move on as there is nothing really we can do about it - for example, if we hot plug a VFIO device to a guest, existing TCE tables will be mirrored automatically to the hardware and there is no interface to report to the guest about possible failures. This adds new attribute - KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE - to the VFIO KVM device. It takes a VFIO group fd and SPAPR TCE table fd and associates a physical IOMMU table with the SPAPR TCE table (which is a guest view of the hardware IOMMU table). The iommu_table object is cached and referenced so we do not have to look up for it in real mode. This does not implement the UNSET counterpart as there is no use for it - once the acceleration is enabled, the existing userspace won't disable it unless a VFIO container is destroyed; this adds necessary cleanup to the KVM_DEV_VFIO_GROUP_DEL handler. This advertises the new KVM_CAP_SPAPR_TCE_VFIO capability to the user space. This adds real mode version of WARN_ON_ONCE() as the generic version causes problems with rcu_sched. Since we testing what vmalloc_to_phys() returns in the code, this also adds a check for already existing vmalloc_to_phys() call in kvmppc_rm_h_put_tce_indirect(). This finally makes use of vfio_external_user_iommu_id() which was introduced quite some time ago and was considered for removal. Tests show that this patch increases transmission speed from 220MB/s to 750..1020MB/s on 10Gb network (Chelsea CXGB3 10Gb ethernet card). Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Acked-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
-
Alexey Kardashevskiy authored
This reworks helpers for checking TCE update parameters in way they can be used in KVM. This should cause no behavioral change. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
-
Alexey Kardashevskiy authored
VFIO on sPAPR already implements guest memory pre-registration when the entire guest RAM gets pinned. This can be used to translate the physical address of a guest page containing the TCE list from H_PUT_TCE_INDIRECT. This makes use of the pre-registrered memory API to access TCE list pages in order to avoid unnecessary locking on the KVM memory reverse map as we know that all of guest memory is pinned and we have a flat array mapping GPA to HPA which makes it simpler and quicker to index into that array (even with looking up the kernel page tables in vmalloc_to_phys) than it is to find the memslot, lock the rmap entry, look up the user page tables, and unlock the rmap entry. Note that the rmap pointer is initialized to NULL where declared (not in this patch). If a requested chunk of memory has not been preregistered, this will fall back to non-preregistered case and lock rmap. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
-
Alexey Kardashevskiy authored
The guest view TCE tables are per KVM anyway (not per VCPU) so pass kvm* there. This will be used in the following patches where we will be attaching VFIO containers to LIOBNs via ioctl() to KVM (rather than to VCPU). Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
-
Alexey Kardashevskiy authored
It does not make much sense to have KVM in book3s-64 and not to have IOMMU bits for PCI pass through support as it costs little and allows VFIO to function on book3s KVM. Having IOMMU_API always enabled makes it unnecessary to have a lot of "#ifdef IOMMU_API" in arch/powerpc/kvm/book3s_64_vio*. With those ifdef's we could have only user space emulated devices accelerated (but not VFIO) which do not seem to be very useful. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
-
Alexey Kardashevskiy authored
This adds a capability number for in-kernel support for VFIO on SPAPR platform. The capability will tell the user space whether in-kernel handlers of H_PUT_TCE can handle VFIO-targeted requests or not. If not, the user space must not attempt allocating a TCE table in the host kernel via the KVM_CREATE_SPAPR_TCE KVM ioctl because in that case TCE requests will not be passed to the user space which is desired action in the situation like that. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
-