Commit fd7e9a88 authored by Linus Torvalds

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:
 "4.11 is going to be a relatively large release for KVM, with a little
  over 200 commits and noteworthy changes for most architectures.

  ARM:
   - GICv3 save/restore
   - cache flushing fixes
   - working MSI injection for GICv3 ITS
   - physical timer emulation

  MIPS:
   - various improvements under the hood
   - support for SMP guests
   - a large rewrite of MMU emulation. KVM MIPS can now use MMU
     notifiers to support copy-on-write, KSM, idle page tracking,
     swapping, ballooning and everything else. KVM_CAP_READONLY_MEM is
     also supported, so that writes to some memory regions can be
     treated as MMIO. The new MMU also paves the way for hardware
     virtualization support.

  PPC:
   - support for POWER9 using the radix-tree MMU for host and guest
   - resizable hashed page table
   - bugfixes.

  s390:
   - expose more features to the guest
   - more SIMD extensions
   - instruction execution protection
   - ESOP2

  x86:
   - improved hashing in the MMU
   - faster PageLRU tracking for Intel CPUs without EPT A/D bits
   - some refactoring of nested VMX entry/exit code, preparing for live
     migration support of nested hypervisors
   - expose yet another AVX512 CPUID bit
   - host-to-guest PTP support
   - refactoring of interrupt injection, with some optimizations thrown
     in and some duct tape removed.
   - remove lazy FPU handling
   - optimizations of user-mode exits
   - optimizations of vcpu_is_preempted() for KVM guests

  generic:
   - alternative signaling mechanism that doesn't pound on
     tsk->sighand->siglock"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (195 commits)
  x86/kvm: Provide optimized version of vcpu_is_preempted() for x86-64
  x86/paravirt: Change vcp_is_preempted() arg type to long
  KVM: VMX: use correct vmcs_read/write for guest segment selector/base
  x86/kvm/vmx: Defer TR reload after VM exit
  x86/asm/64: Drop __cacheline_aligned from struct x86_hw_tss
  x86/kvm/vmx: Simplify segment_base()
  x86/kvm/vmx: Get rid of segment_base() on 64-bit kernels
  x86/kvm/vmx: Don't fetch the TSS base from the GDT
  x86/asm: Define the kernel TSS limit in a macro
  kvm: fix page struct leak in handle_vmon
  KVM: PPC: Book3S HV: Disable HPT resizing on POWER9 for now
  KVM: Return an error code only as a constant in kvm_get_dirty_log()
  KVM: Return an error code only as a constant in kvm_get_dirty_log_protect()
  KVM: Return directly after a failed copy_from_user() in kvm_vm_compat_ioctl()
  KVM: x86: remove code for lazy FPU handling
  KVM: race-free exit from KVM_RUN without POSIX signals
  KVM: PPC: Book3S HV: Turn "KVM guest htab" message into a debug message
  KVM: PPC: Book3S PR: Ratelimit copy data failure error messages
  KVM: Support vCPU-based gfn->hva cache
  KVM: use separate generations for each address space
  ...
parents 5066e4a3 dd0fd8bc
...@@ -2061,6 +2061,8 @@ registers, find a list below: ...@@ -2061,6 +2061,8 @@ registers, find a list below:
MIPS | KVM_REG_MIPS_LO | 64 MIPS | KVM_REG_MIPS_LO | 64
MIPS | KVM_REG_MIPS_PC | 64 MIPS | KVM_REG_MIPS_PC | 64
MIPS | KVM_REG_MIPS_CP0_INDEX | 32 MIPS | KVM_REG_MIPS_CP0_INDEX | 32
MIPS | KVM_REG_MIPS_CP0_ENTRYLO0 | 64
MIPS | KVM_REG_MIPS_CP0_ENTRYLO1 | 64
MIPS | KVM_REG_MIPS_CP0_CONTEXT | 64 MIPS | KVM_REG_MIPS_CP0_CONTEXT | 64
MIPS | KVM_REG_MIPS_CP0_USERLOCAL | 64 MIPS | KVM_REG_MIPS_CP0_USERLOCAL | 64
MIPS | KVM_REG_MIPS_CP0_PAGEMASK | 32 MIPS | KVM_REG_MIPS_CP0_PAGEMASK | 32
...@@ -2071,9 +2073,11 @@ registers, find a list below: ...@@ -2071,9 +2073,11 @@ registers, find a list below:
MIPS | KVM_REG_MIPS_CP0_ENTRYHI | 64 MIPS | KVM_REG_MIPS_CP0_ENTRYHI | 64
MIPS | KVM_REG_MIPS_CP0_COMPARE | 32 MIPS | KVM_REG_MIPS_CP0_COMPARE | 32
MIPS | KVM_REG_MIPS_CP0_STATUS | 32 MIPS | KVM_REG_MIPS_CP0_STATUS | 32
MIPS | KVM_REG_MIPS_CP0_INTCTL | 32
MIPS | KVM_REG_MIPS_CP0_CAUSE | 32 MIPS | KVM_REG_MIPS_CP0_CAUSE | 32
MIPS | KVM_REG_MIPS_CP0_EPC | 64 MIPS | KVM_REG_MIPS_CP0_EPC | 64
MIPS | KVM_REG_MIPS_CP0_PRID | 32 MIPS | KVM_REG_MIPS_CP0_PRID | 32
MIPS | KVM_REG_MIPS_CP0_EBASE | 64
MIPS | KVM_REG_MIPS_CP0_CONFIG | 32 MIPS | KVM_REG_MIPS_CP0_CONFIG | 32
MIPS | KVM_REG_MIPS_CP0_CONFIG1 | 32 MIPS | KVM_REG_MIPS_CP0_CONFIG1 | 32
MIPS | KVM_REG_MIPS_CP0_CONFIG2 | 32 MIPS | KVM_REG_MIPS_CP0_CONFIG2 | 32
...@@ -2148,6 +2152,12 @@ patterns depending on whether they're 32-bit or 64-bit registers: ...@@ -2148,6 +2152,12 @@ patterns depending on whether they're 32-bit or 64-bit registers:
0x7020 0000 0001 00 <reg:5> <sel:3> (32-bit) 0x7020 0000 0001 00 <reg:5> <sel:3> (32-bit)
0x7030 0000 0001 00 <reg:5> <sel:3> (64-bit) 0x7030 0000 0001 00 <reg:5> <sel:3> (64-bit)
Note: KVM_REG_MIPS_CP0_ENTRYLO0 and KVM_REG_MIPS_CP0_ENTRYLO1 are the MIPS64
versions of the EntryLo registers regardless of the word size of the host
hardware, host kernel, guest, and whether XPA is present in the guest, i.e.
with the RI and XI bits (if they exist) in bits 63 and 62 respectively, and
the PFNX field starting at bit 30.
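As a minimal illustration of that layout, a userspace sketch decoding a value
already read through KVM_GET_ONE_REG could look like this (the width of the
PFNX field is not spelled out above, so the mask below is only an assumption):

#include <stdint.h>
#include <stdio.h>

/* Decode a MIPS64-format EntryLo value as described above: RI in bit 63,
 * XI in bit 62, PFNX starting at bit 30.  Illustrative sketch only.
 */
static void decode_entrylo(uint64_t entrylo)
{
        unsigned int ri = (entrylo >> 63) & 0x1;
        unsigned int xi = (entrylo >> 62) & 0x1;
        uint64_t pfnx = (entrylo >> 30) & 0xffffffffULL;  /* assumed width */

        printf("RI=%u XI=%u PFNX=0x%llx\n", ri, xi, (unsigned long long)pfnx);
}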
MIPS KVM control registers (see above) have the following id bit patterns: MIPS KVM control registers (see above) have the following id bit patterns:
0x7030 0000 0002 <reg:16> 0x7030 0000 0002 <reg:16>
...@@ -2443,18 +2453,20 @@ are, it will do nothing and return an EBUSY error. ...@@ -2443,18 +2453,20 @@ are, it will do nothing and return an EBUSY error.
The parameter is a pointer to a 32-bit unsigned integer variable The parameter is a pointer to a 32-bit unsigned integer variable
containing the order (log base 2) of the desired size of the hash containing the order (log base 2) of the desired size of the hash
table, which must be between 18 and 46. On successful return from the table, which must be between 18 and 46. On successful return from the
ioctl, it will have been updated with the order of the hash table that ioctl, the value will not be changed by the kernel.
was allocated.
If no hash table has been allocated when any vcpu is asked to run If no hash table has been allocated when any vcpu is asked to run
(with the KVM_RUN ioctl), the host kernel will allocate a (with the KVM_RUN ioctl), the host kernel will allocate a
default-sized hash table (16 MB). default-sized hash table (16 MB).
If this ioctl is called when a hash table has already been allocated, If this ioctl is called when a hash table has already been allocated,
the kernel will clear out the existing hash table (zero all HPTEs) and with a different order from the existing hash table, the existing hash
return the hash table order in the parameter. (If the guest is using table will be freed and a new one allocated. If this ioctl is
the virtualized real-mode area (VRMA) facility, the kernel will called when a hash table has already been allocated of the same order
re-create the VMRA HPTEs on the next KVM_RUN of any vcpu.) as specified, the kernel will clear out the existing hash table (zero
all HPTEs). In either case, if the guest is using the virtualized
real-mode area (VRMA) facility, the kernel will re-create the VRMA
HPTEs on the next KVM_RUN of any vcpu.
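For illustration, a userspace request for a specific hash table order could
look like the minimal sketch below; the ioctl name KVM_PPC_ALLOCATE_HTAB is an
assumption here, since only the parameter semantics are described above.

#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Request a hash page table of order 24 (16 MB) on a VM fd.  Per the text
 * above, the order must be between 18 and 46 and is not modified by the
 * kernel on success.
 */
static int request_hpt(int vm_fd)
{
        uint32_t order = 24;    /* log2 of the desired HPT size in bytes */

        return ioctl(vm_fd, KVM_PPC_ALLOCATE_HTAB, &order);
}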
4.77 KVM_S390_INTERRUPT 4.77 KVM_S390_INTERRUPT
...@@ -3177,7 +3189,7 @@ of IOMMU pages. ...@@ -3177,7 +3189,7 @@ of IOMMU pages.
The rest of functionality is identical to KVM_CREATE_SPAPR_TCE. The rest of functionality is identical to KVM_CREATE_SPAPR_TCE.
4.98 KVM_REINJECT_CONTROL 4.99 KVM_REINJECT_CONTROL
Capability: KVM_CAP_REINJECT_CONTROL Capability: KVM_CAP_REINJECT_CONTROL
Architectures: x86 Architectures: x86
...@@ -3201,7 +3213,7 @@ struct kvm_reinject_control { ...@@ -3201,7 +3213,7 @@ struct kvm_reinject_control {
pit_reinject = 0 (!reinject mode) is recommended, unless running an old pit_reinject = 0 (!reinject mode) is recommended, unless running an old
operating system that uses the PIT for timing (e.g. Linux 2.4.x). operating system that uses the PIT for timing (e.g. Linux 2.4.x).
4.99 KVM_PPC_CONFIGURE_V3_MMU 4.100 KVM_PPC_CONFIGURE_V3_MMU
Capability: KVM_CAP_PPC_RADIX_MMU or KVM_CAP_PPC_HASH_MMU_V3 Capability: KVM_CAP_PPC_RADIX_MMU or KVM_CAP_PPC_HASH_MMU_V3
Architectures: ppc Architectures: ppc
...@@ -3232,7 +3244,7 @@ process table, which is in the guest's space. This field is formatted ...@@ -3232,7 +3244,7 @@ process table, which is in the guest's space. This field is formatted
as the second doubleword of the partition table entry, as defined in as the second doubleword of the partition table entry, as defined in
the Power ISA V3.00, Book III section 5.7.6.1. the Power ISA V3.00, Book III section 5.7.6.1.
4.100 KVM_PPC_GET_RMMU_INFO 4.101 KVM_PPC_GET_RMMU_INFO
Capability: KVM_CAP_PPC_RADIX_MMU Capability: KVM_CAP_PPC_RADIX_MMU
Architectures: ppc Architectures: ppc
...@@ -3266,6 +3278,101 @@ The ap_encodings gives the supported page sizes and their AP field ...@@ -3266,6 +3278,101 @@ The ap_encodings gives the supported page sizes and their AP field
encodings, encoded with the AP value in the top 3 bits and the log encodings, encoded with the AP value in the top 3 bits and the log
base 2 of the page size in the bottom 6 bits. base 2 of the page size in the bottom 6 bits.
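A tiny decoding sketch for one ap_encodings entry, following the description
above (treating each entry as a 32-bit value is an assumption not stated here):

#include <stdint.h>

/* Split an ap_encodings entry into the AP value (top 3 bits) and the
 * log2 page size (bottom 6 bits), as described above.
 */
static void decode_ap_encoding(uint32_t enc, uint32_t *ap, uint32_t *page_shift)
{
        *ap = enc >> 29;                /* top 3 bits of a 32-bit entry */
        *page_shift = enc & 0x3f;       /* log base 2 of the page size */
}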
4.102 KVM_PPC_RESIZE_HPT_PREPARE
Capability: KVM_CAP_SPAPR_RESIZE_HPT
Architectures: powerpc
Type: vm ioctl
Parameters: struct kvm_ppc_resize_hpt (in)
Returns: 0 on successful completion,
>0 if a new HPT is being prepared, the value is an estimated
number of milliseconds until preparation is complete
-EFAULT if struct kvm_ppc_resize_hpt cannot be read,
-EINVAL if the supplied shift or flags are invalid
-ENOMEM if unable to allocate the new HPT
-ENOSPC if there was a hash collision when moving existing
HPT entries to the new HPT
-EIO on other error conditions
Used to implement the PAPR extension for runtime resizing of a guest's
Hashed Page Table (HPT). Specifically this starts, stops or monitors
the preparation of a new potential HPT for the guest, essentially
implementing the H_RESIZE_HPT_PREPARE hypercall.
If called with shift > 0 when there is no pending HPT for the guest,
this begins preparation of a new pending HPT of size 2^(shift) bytes.
It then returns a positive integer with the estimated number of
milliseconds until preparation is complete.
If called when there is a pending HPT whose size does not match that
requested in the parameters, this discards the existing pending HPT and
creates a new one as above.
If called when there is a pending HPT of the size requested, will:
* If preparation of the pending HPT is already complete, return 0
* If preparation of the pending HPT has failed, return an error
code, then discard the pending HPT.
* If preparation of the pending HPT is still in progress, return an
estimated number of milliseconds until preparation is complete.
If called with shift == 0, this discards any currently pending HPT and
returns 0 (i.e. it cancels any in-progress preparation).
flags is reserved for future expansion; currently, setting any bits in
flags will result in -EINVAL.
Normally this will be called repeatedly with the same parameters until
it returns <= 0. The first call will initiate preparation, subsequent
ones will monitor preparation until it completes or fails.
struct kvm_ppc_resize_hpt {
__u64 flags;
__u32 shift;
__u32 pad;
};
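A minimal userspace sketch of that polling pattern, assuming a VM fd and a
target shift (the sleep handling is arbitrary, not mandated by the interface):

#include <sys/ioctl.h>
#include <time.h>
#include <linux/kvm.h>

/* Poll KVM_PPC_RESIZE_HPT_PREPARE until the new HPT is ready (0) or an
 * error is reported (< 0); a positive return value is the estimated
 * number of milliseconds until preparation completes.
 */
static int prepare_hpt_resize(int vm_fd, unsigned int shift)
{
        struct kvm_ppc_resize_hpt rhpt = {
                .flags = 0,             /* must be zero for now */
                .shift = shift,
        };
        int ret;

        while ((ret = ioctl(vm_fd, KVM_PPC_RESIZE_HPT_PREPARE, &rhpt)) > 0) {
                struct timespec ts = {
                        .tv_sec = ret / 1000,
                        .tv_nsec = (ret % 1000) * 1000000L,
                };
                nanosleep(&ts, NULL);   /* wait roughly as long as suggested */
        }
        return ret;     /* 0 on success, < 0 on error (see the list above) */
}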
4.103 KVM_PPC_RESIZE_HPT_COMMIT
Capability: KVM_CAP_SPAPR_RESIZE_HPT
Architectures: powerpc
Type: vm ioctl
Parameters: struct kvm_ppc_resize_hpt (in)
Returns: 0 on successful completion,
-EFAULT if struct kvm_ppc_resize_hpt cannot be read,
-EINVAL if the supplied shift or flags are invalid
-ENXIO if there is no pending HPT, or the pending HPT doesn't
have the requested size
-EBUSY if the pending HPT is not fully prepared
-ENOSPC if there was a hash collision when moving existing
HPT entries to the new HPT
-EIO on other error conditions
Used to implement the PAPR extension for runtime resizing of a guest's
Hashed Page Table (HPT). Specifically this requests that the guest be
transferred to working with the new HPT, essentially implementing the
H_RESIZE_HPT_COMMIT hypercall.
This should only be called after KVM_PPC_RESIZE_HPT_PREPARE has
returned 0 with the same parameters. In other cases
KVM_PPC_RESIZE_HPT_COMMIT will return an error (usually -ENXIO or
-EBUSY, though others may be possible if the preparation was started
but failed).
This will have undefined effects on the guest if it has not already
placed itself in a quiescent state where no vcpu will make MMU-enabled
memory accesses.
On successful completion, the pending HPT will become the guest's active
HPT and the previous HPT will be discarded.
On failure, the guest will still be operating on its previous HPT.
struct kvm_ppc_resize_hpt {
__u64 flags;
__u32 shift;
__u32 pad;
};
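Continuing the sketch above (same headers), once KVM_PPC_RESIZE_HPT_PREPARE
has returned 0 for the desired shift and the guest has quiesced, the commit
step is simply:

#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Commit the previously prepared HPT.  The flags/shift must match the
 * values for which KVM_PPC_RESIZE_HPT_PREPARE returned 0, and the guest
 * must already have stopped MMU-enabled memory accesses.
 */
static int commit_hpt_resize(int vm_fd, unsigned int shift)
{
        struct kvm_ppc_resize_hpt rhpt = {
                .flags = 0,
                .shift = shift,
        };

        return ioctl(vm_fd, KVM_PPC_RESIZE_HPT_COMMIT, &rhpt);
}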
5. The kvm_run structure 5. The kvm_run structure
------------------------ ------------------------
...@@ -3282,7 +3389,18 @@ struct kvm_run { ...@@ -3282,7 +3389,18 @@ struct kvm_run {
Request that KVM_RUN return when it becomes possible to inject external Request that KVM_RUN return when it becomes possible to inject external
interrupts into the guest. Useful in conjunction with KVM_INTERRUPT. interrupts into the guest. Useful in conjunction with KVM_INTERRUPT.
__u8 padding1[7]; __u8 immediate_exit;
This field is polled once when KVM_RUN starts; if non-zero, KVM_RUN
exits immediately, returning -EINTR. In the common scenario where a
signal is used to "kick" a VCPU out of KVM_RUN, this field can be used
to avoid usage of KVM_SET_SIGNAL_MASK, which has worse scalability.
Rather than blocking the signal outside KVM_RUN, userspace can set up
a signal handler that sets run->immediate_exit to a non-zero value.
This field is ignored if KVM_CAP_IMMEDIATE_EXIT is not available.
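A hedged sketch of the signal-handler pattern described here; the handler name
and the way the mmap'ed run structure reaches the handler are illustrative
assumptions, and userspace would presumably clear the field again before it
next intends to enter the guest.

#include <signal.h>
#include <string.h>
#include <linux/kvm.h>

/* Illustrative only: pointer to the mmap'ed kvm_run structure of the VCPU
 * this thread drives; how it is plumbed to the handler is up to the
 * application.
 */
static struct kvm_run *current_run;

static void kick_handler(int sig)
{
        (void)sig;
        if (current_run)
                current_run->immediate_exit = 1; /* next KVM_RUN returns -EINTR */
}

static void install_kick_handler(int signo)
{
        struct sigaction sa;

        memset(&sa, 0, sizeof(sa));
        sa.sa_handler = kick_handler;
        sigemptyset(&sa.sa_mask);
        sigaction(signo, &sa, NULL);
}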
__u8 padding1[6];
/* out */ /* out */
__u32 exit_reason; __u32 exit_reason;
......
...@@ -118,7 +118,7 @@ Groups: ...@@ -118,7 +118,7 @@ Groups:
-EBUSY: One or more VCPUs are running -EBUSY: One or more VCPUs are running
KVM_DEV_ARM_VGIC_CPU_SYSREGS KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS
Attributes: Attributes:
The attr field of kvm_device_attr encodes two values: The attr field of kvm_device_attr encodes two values:
bits: | 63 .... 32 | 31 .... 16 | 15 .... 0 | bits: | 63 .... 32 | 31 .... 16 | 15 .... 0 |
...@@ -139,13 +139,15 @@ Groups: ...@@ -139,13 +139,15 @@ Groups:
All system regs accessed through this API are (rw, 64-bit) and All system regs accessed through this API are (rw, 64-bit) and
kvm_device_attr.addr points to a __u64 value. kvm_device_attr.addr points to a __u64 value.
KVM_DEV_ARM_VGIC_CPU_SYSREGS accesses the CPU interface registers for the KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS accesses the CPU interface registers for the
CPU specified by the mpidr field. CPU specified by the mpidr field.
Access to the CPU interface registers is not implemented for AArch32
mode; -ENXIO is returned when they are accessed in AArch32 mode.
Errors: Errors:
-ENXIO: Getting or setting this register is not yet supported -ENXIO: Getting or setting this register is not yet supported
-EBUSY: VCPU is running -EBUSY: VCPU is running
-EINVAL: Invalid mpidr supplied -EINVAL: Invalid mpidr or register value supplied
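For illustration, building the attr value and reading one register through the
device control API might look like the sketch below; it assumes bits 63..32 of
attr carry the MPIDR and bits 15..0 the system register encoding, matching the
KVM_DEV_ARM_VGIC_V3_MPIDR_* and KVM_DEV_ARM_VGIC_SYSREG_INSTR_MASK definitions
added elsewhere in this series.

#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Read one GICv3 CPU interface register of the CPU identified by mpidr;
 * sysreg is the 16-bit encoding placed in the low bits of attr.
 */
static int vgic_read_cpu_sysreg(int vgic_dev_fd, uint32_t mpidr,
                                uint16_t sysreg, uint64_t *val)
{
        struct kvm_device_attr attr = {
                .group = KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS,
                .attr = ((uint64_t)mpidr << 32) | sysreg,
                .addr = (uint64_t)(unsigned long)val,
        };

        return ioctl(vgic_dev_fd, KVM_GET_DEVICE_ATTR, &attr);
}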
KVM_DEV_ARM_VGIC_GRP_NR_IRQS KVM_DEV_ARM_VGIC_GRP_NR_IRQS
...@@ -204,3 +206,6 @@ Groups: ...@@ -204,3 +206,6 @@ Groups:
architecture defined MPIDR, and the field is encoded as follows: architecture defined MPIDR, and the field is encoded as follows:
| 63 .... 56 | 55 .... 48 | 47 .... 40 | 39 .... 32 | | 63 .... 56 | 55 .... 48 | 47 .... 40 | 39 .... 32 |
| Aff3 | Aff2 | Aff1 | Aff0 | | Aff3 | Aff2 | Aff1 | Aff0 |
Errors:
-EINVAL: vINTID is not a multiple of 32, or the
info field is not VGIC_LEVEL_INFO_LINE_LEVEL
...@@ -81,3 +81,38 @@ the vcpu to sleep until occurrence of an appropriate event. Another vcpu of the ...@@ -81,3 +81,38 @@ the vcpu to sleep until occurrence of an appropriate event. Another vcpu of the
same guest can wakeup the sleeping vcpu by issuing KVM_HC_KICK_CPU hypercall, same guest can wakeup the sleeping vcpu by issuing KVM_HC_KICK_CPU hypercall,
specifying APIC ID (a1) of the vcpu to be woken up. An additional argument (a0) specifying APIC ID (a1) of the vcpu to be woken up. An additional argument (a0)
is used in the hypercall for future use. is used in the hypercall for future use.
6. KVM_HC_CLOCK_PAIRING
------------------------
Architecture: x86
Status: active
Purpose: Hypercall used to synchronize host and guest clocks.
Usage:
a0: guest physical address where host copies
"struct kvm_clock_offset" structure.
a1: clock_type; currently only KVM_CLOCK_PAIRING_WALLCLOCK (0)
is supported (corresponding to the host's CLOCK_REALTIME clock).
struct kvm_clock_pairing {
__s64 sec;
__s64 nsec;
__u64 tsc;
__u32 flags;
__u32 pad[9];
};
Where:
* sec: seconds from clock_type clock.
* nsec: nanoseconds from clock_type clock.
* tsc: guest TSC value used to calculate the sec/nsec pair.
* flags: flags, currently unused (0).
The hypercall lets a guest compute a precise timestamp across
host and guest. The guest can use the returned TSC value to
compute CLOCK_REALTIME for its clock at the same instant.
Returns KVM_EOPNOTSUPP if the host does not use the TSC clocksource,
or if the clock type is different from KVM_CLOCK_PAIRING_WALLCLOCK.
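A guest-side sketch of how the returned values could be combined; the tsc_khz
conversion factor and the use of host integer types instead of __s64/__u64 are
assumptions of this illustration.

#include <stdint.h>

/* Layout as documented above. */
struct kvm_clock_pairing {
        int64_t sec;
        int64_t nsec;
        uint64_t tsc;
        uint32_t flags;
        uint32_t pad[9];
};

/* Given a pairing structure filled in by the KVM_HC_CLOCK_PAIRING
 * hypercall (a0 = guest physical address of the structure, a1 =
 * KVM_CLOCK_PAIRING_WALLCLOCK) and the guest TSC read "now", estimate
 * CLOCK_REALTIME in nanoseconds.  tsc_khz is the guest TSC frequency
 * in kHz, supplied by the caller.
 */
static uint64_t realtime_ns_now(const struct kvm_clock_pairing *p,
                                uint64_t tsc_now, uint64_t tsc_khz)
{
        uint64_t base_ns = (uint64_t)p->sec * 1000000000ULL + (uint64_t)p->nsec;
        uint64_t delta_ns = (tsc_now - p->tsc) * 1000000ULL / tsc_khz;

        return base_ns + delta_ns;
}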
...@@ -26,9 +26,16 @@ sections. ...@@ -26,9 +26,16 @@ sections.
Fast page fault: Fast page fault:
Fast page fault is the fast path which fixes the guest page fault out of Fast page fault is the fast path which fixes the guest page fault out of
the mmu-lock on x86. Currently, the page fault can be fast only if the the mmu-lock on x86. Currently, the page fault can be fast in one of the
shadow page table is present and it is caused by write-protect, that means following two cases:
we just need change the W bit of the spte.
1. Access Tracking: The SPTE is not present, but it is marked for access
tracking, i.e. the SPTE_SPECIAL_MASK is set. That means we need to
restore the saved R/X bits. This is described in more detail below.
2. Write-Protection: The SPTE is present and the fault is
caused by write-protect. That means we just need to change the W bit of the
spte.
What we use to avoid all the race is the SPTE_HOST_WRITEABLE bit and What we use to avoid all the race is the SPTE_HOST_WRITEABLE bit and
SPTE_MMU_WRITEABLE bit on the spte: SPTE_MMU_WRITEABLE bit on the spte:
...@@ -38,7 +45,8 @@ SPTE_MMU_WRITEABLE bit on the spte: ...@@ -38,7 +45,8 @@ SPTE_MMU_WRITEABLE bit on the spte:
page write-protection. page write-protection.
On fast page fault path, we will use cmpxchg to atomically set the spte W On fast page fault path, we will use cmpxchg to atomically set the spte W
bit if spte.SPTE_HOST_WRITEABLE = 1 and spte.SPTE_WRITE_PROTECT = 1, this bit if spte.SPTE_HOST_WRITEABLE = 1 and spte.SPTE_WRITE_PROTECT = 1, or
restore the saved R/X bits if VMX_EPT_TRACK_ACCESS mask is set, or both. This
is safe because whenever changing these bits can be detected by cmpxchg. is safe because whenever changing these bits can be detected by cmpxchg.
But we need carefully check these cases: But we need carefully check these cases:
...@@ -142,6 +150,21 @@ Since the spte is "volatile" if it can be updated out of mmu-lock, we always ...@@ -142,6 +150,21 @@ Since the spte is "volatile" if it can be updated out of mmu-lock, we always
atomically update the spte, the race caused by fast page fault can be avoided, atomically update the spte, the race caused by fast page fault can be avoided,
See the comments in spte_has_volatile_bits() and mmu_spte_update(). See the comments in spte_has_volatile_bits() and mmu_spte_update().
Lockless Access Tracking:
This is used for Intel CPUs that are using EPT but do not support the EPT A/D
bits. In this case, when the KVM MMU notifier is called to track accesses to a
page (via kvm_mmu_notifier_clear_flush_young), it marks the PTE as not-present
by clearing the RWX bits in the PTE and storing the original R & X bits in
some unused/ignored bits. In addition, the SPTE_SPECIAL_MASK is also set on the
PTE (using the ignored bit 62). When the VM tries to access the page later on,
a fault is generated and the fast page fault mechanism described above is used
to atomically restore the PTE to a Present state. The W bit is not saved when
the PTE is marked for access tracking and during restoration to the Present
state, the W bit is set depending on whether or not it was a write access. If
it wasn't, then the W bit will remain clear until a write access happens, at
which time it will be set using the Dirty tracking mechanism described above.
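A schematic sketch of the save/restore idea described here; the bit positions
of the saved R/X copies, the RWX bit numbers and the helper names are
illustrative assumptions, not the kernel's actual SPTE encoding.

#include <stdint.h>

/* Illustrative bit layout only; the real KVM encoding differs in detail. */
#define SPTE_R                  (1ULL << 0)
#define SPTE_W                  (1ULL << 1)
#define SPTE_X                  (1ULL << 2)
#define SPTE_SPECIAL_MASK       (1ULL << 62)    /* ignored bit used as marker */
#define SAVED_BITS_SHIFT        52              /* assumed free/ignored bits */

/* Mark an SPTE for access tracking: clear RWX, stash R and X, set marker. */
static uint64_t mark_for_access_tracking(uint64_t spte)
{
        uint64_t saved = spte & (SPTE_R | SPTE_X);

        spte &= ~(SPTE_R | SPTE_W | SPTE_X);
        spte |= saved << SAVED_BITS_SHIFT;
        spte |= SPTE_SPECIAL_MASK;
        return spte;
}

/* Restore on fast page fault: bring back R/X; grant W only on a write. */
static uint64_t restore_access_tracked(uint64_t spte, int is_write)
{
        spte |= (spte >> SAVED_BITS_SHIFT) & (SPTE_R | SPTE_X);
        spte &= ~((SPTE_R | SPTE_X) << SAVED_BITS_SHIFT);
        spte &= ~SPTE_SPECIAL_MASK;
        if (is_write)
                spte |= SPTE_W;
        return spte;
}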
3. Reference 3. Reference
------------ ------------
......
...@@ -60,9 +60,6 @@ struct kvm_arch { ...@@ -60,9 +60,6 @@ struct kvm_arch {
/* The last vcpu id that ran on each physical CPU */ /* The last vcpu id that ran on each physical CPU */
int __percpu *last_vcpu_ran; int __percpu *last_vcpu_ran;
/* Timer */
struct arch_timer_kvm timer;
/* /*
* Anything that is not used directly from assembly code goes * Anything that is not used directly from assembly code goes
* here. * here.
......
...@@ -129,8 +129,7 @@ static inline bool vcpu_has_cache_enabled(struct kvm_vcpu *vcpu) ...@@ -129,8 +129,7 @@ static inline bool vcpu_has_cache_enabled(struct kvm_vcpu *vcpu)
static inline void __coherent_cache_guest_page(struct kvm_vcpu *vcpu, static inline void __coherent_cache_guest_page(struct kvm_vcpu *vcpu,
kvm_pfn_t pfn, kvm_pfn_t pfn,
unsigned long size, unsigned long size)
bool ipa_uncached)
{ {
/* /*
* If we are going to insert an instruction page and the icache is * If we are going to insert an instruction page and the icache is
...@@ -150,18 +149,12 @@ static inline void __coherent_cache_guest_page(struct kvm_vcpu *vcpu, ...@@ -150,18 +149,12 @@ static inline void __coherent_cache_guest_page(struct kvm_vcpu *vcpu,
* and iterate over the range. * and iterate over the range.
*/ */
bool need_flush = !vcpu_has_cache_enabled(vcpu) || ipa_uncached;
VM_BUG_ON(size & ~PAGE_MASK); VM_BUG_ON(size & ~PAGE_MASK);
if (!need_flush && !icache_is_pipt())
goto vipt_cache;
while (size) { while (size) {
void *va = kmap_atomic_pfn(pfn); void *va = kmap_atomic_pfn(pfn);
if (need_flush) kvm_flush_dcache_to_poc(va, PAGE_SIZE);
kvm_flush_dcache_to_poc(va, PAGE_SIZE);
if (icache_is_pipt()) if (icache_is_pipt())
__cpuc_coherent_user_range((unsigned long)va, __cpuc_coherent_user_range((unsigned long)va,
...@@ -173,7 +166,6 @@ static inline void __coherent_cache_guest_page(struct kvm_vcpu *vcpu, ...@@ -173,7 +166,6 @@ static inline void __coherent_cache_guest_page(struct kvm_vcpu *vcpu,
kunmap_atomic(va); kunmap_atomic(va);
} }
vipt_cache:
if (!icache_is_pipt() && !icache_is_vivt_asid_tagged()) { if (!icache_is_pipt() && !icache_is_vivt_asid_tagged()) {
/* any kind of VIPT cache */ /* any kind of VIPT cache */
__flush_icache_all(); __flush_icache_all();
......
...@@ -181,10 +181,23 @@ struct kvm_arch_memory_slot { ...@@ -181,10 +181,23 @@ struct kvm_arch_memory_slot {
#define KVM_DEV_ARM_VGIC_GRP_CPU_REGS 2 #define KVM_DEV_ARM_VGIC_GRP_CPU_REGS 2
#define KVM_DEV_ARM_VGIC_CPUID_SHIFT 32 #define KVM_DEV_ARM_VGIC_CPUID_SHIFT 32
#define KVM_DEV_ARM_VGIC_CPUID_MASK (0xffULL << KVM_DEV_ARM_VGIC_CPUID_SHIFT) #define KVM_DEV_ARM_VGIC_CPUID_MASK (0xffULL << KVM_DEV_ARM_VGIC_CPUID_SHIFT)
#define KVM_DEV_ARM_VGIC_V3_MPIDR_SHIFT 32
#define KVM_DEV_ARM_VGIC_V3_MPIDR_MASK \
(0xffffffffULL << KVM_DEV_ARM_VGIC_V3_MPIDR_SHIFT)
#define KVM_DEV_ARM_VGIC_OFFSET_SHIFT 0 #define KVM_DEV_ARM_VGIC_OFFSET_SHIFT 0
#define KVM_DEV_ARM_VGIC_OFFSET_MASK (0xffffffffULL << KVM_DEV_ARM_VGIC_OFFSET_SHIFT) #define KVM_DEV_ARM_VGIC_OFFSET_MASK (0xffffffffULL << KVM_DEV_ARM_VGIC_OFFSET_SHIFT)
#define KVM_DEV_ARM_VGIC_SYSREG_INSTR_MASK (0xffff)
#define KVM_DEV_ARM_VGIC_GRP_NR_IRQS 3 #define KVM_DEV_ARM_VGIC_GRP_NR_IRQS 3
#define KVM_DEV_ARM_VGIC_GRP_CTRL 4 #define KVM_DEV_ARM_VGIC_GRP_CTRL 4
#define KVM_DEV_ARM_VGIC_GRP_REDIST_REGS 5
#define KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS 6
#define KVM_DEV_ARM_VGIC_GRP_LEVEL_INFO 7
#define KVM_DEV_ARM_VGIC_LINE_LEVEL_INFO_SHIFT 10
#define KVM_DEV_ARM_VGIC_LINE_LEVEL_INFO_MASK \
(0x3fffffULL << KVM_DEV_ARM_VGIC_LINE_LEVEL_INFO_SHIFT)
#define KVM_DEV_ARM_VGIC_LINE_LEVEL_INTID_MASK 0x3ff
#define VGIC_LEVEL_INFO_LINE_LEVEL 0
#define KVM_DEV_ARM_VGIC_CTRL_INIT 0 #define KVM_DEV_ARM_VGIC_CTRL_INIT 0
/* KVM_IRQ_LINE irq field index values */ /* KVM_IRQ_LINE irq field index values */
......
...@@ -7,7 +7,7 @@ ifeq ($(plus_virt),+virt) ...@@ -7,7 +7,7 @@ ifeq ($(plus_virt),+virt)
plus_virt_def := -DREQUIRES_VIRT=1 plus_virt_def := -DREQUIRES_VIRT=1
endif endif
ccflags-y += -Iarch/arm/kvm ccflags-y += -Iarch/arm/kvm -Ivirt/kvm/arm/vgic
CFLAGS_arm.o := -I. $(plus_virt_def) CFLAGS_arm.o := -I. $(plus_virt_def)
CFLAGS_mmu.o := -I. CFLAGS_mmu.o := -I.
...@@ -20,7 +20,7 @@ kvm-arm-y = $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o $(KVM)/eventfd.o $(KVM)/vf ...@@ -20,7 +20,7 @@ kvm-arm-y = $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o $(KVM)/eventfd.o $(KVM)/vf
obj-$(CONFIG_KVM_ARM_HOST) += hyp/ obj-$(CONFIG_KVM_ARM_HOST) += hyp/
obj-y += kvm-arm.o init.o interrupts.o obj-y += kvm-arm.o init.o interrupts.o
obj-y += arm.o handle_exit.o guest.o mmu.o emulate.o reset.o obj-y += arm.o handle_exit.o guest.o mmu.o emulate.o reset.o
obj-y += coproc.o coproc_a15.o coproc_a7.o mmio.o psci.o perf.o obj-y += coproc.o coproc_a15.o coproc_a7.o mmio.o psci.o perf.o vgic-v3-coproc.o
obj-y += $(KVM)/arm/aarch32.o obj-y += $(KVM)/arm/aarch32.o
obj-y += $(KVM)/arm/vgic/vgic.o obj-y += $(KVM)/arm/vgic/vgic.o
...@@ -33,5 +33,6 @@ obj-y += $(KVM)/arm/vgic/vgic-mmio-v2.o ...@@ -33,5 +33,6 @@ obj-y += $(KVM)/arm/vgic/vgic-mmio-v2.o
obj-y += $(KVM)/arm/vgic/vgic-mmio-v3.o obj-y += $(KVM)/arm/vgic/vgic-mmio-v3.o
obj-y += $(KVM)/arm/vgic/vgic-kvm-device.o obj-y += $(KVM)/arm/vgic/vgic-kvm-device.o
obj-y += $(KVM)/arm/vgic/vgic-its.o obj-y += $(KVM)/arm/vgic/vgic-its.o
obj-y += $(KVM)/arm/vgic/vgic-debug.o
obj-y += $(KVM)/irqchip.o obj-y += $(KVM)/irqchip.o
obj-y += $(KVM)/arm/arch_timer.o obj-y += $(KVM)/arm/arch_timer.o
...@@ -135,7 +135,6 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type) ...@@ -135,7 +135,6 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
goto out_free_stage2_pgd; goto out_free_stage2_pgd;
kvm_vgic_early_init(kvm); kvm_vgic_early_init(kvm);
kvm_timer_init(kvm);
/* Mark the initial VMID generation invalid */ /* Mark the initial VMID generation invalid */
kvm->arch.vmid_gen = 0; kvm->arch.vmid_gen = 0;
...@@ -207,6 +206,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext) ...@@ -207,6 +206,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
case KVM_CAP_ARM_PSCI_0_2: case KVM_CAP_ARM_PSCI_0_2:
case KVM_CAP_READONLY_MEM: case KVM_CAP_READONLY_MEM:
case KVM_CAP_MP_STATE: case KVM_CAP_MP_STATE:
case KVM_CAP_IMMEDIATE_EXIT:
r = 1; r = 1;
break; break;
case KVM_CAP_COALESCED_MMIO: case KVM_CAP_COALESCED_MMIO:
...@@ -301,7 +301,8 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu) ...@@ -301,7 +301,8 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu) int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu)
{ {
return kvm_timer_should_fire(vcpu); return kvm_timer_should_fire(vcpu_vtimer(vcpu)) ||
kvm_timer_should_fire(vcpu_ptimer(vcpu));
} }
void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu)
...@@ -604,6 +605,9 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run) ...@@ -604,6 +605,9 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
return ret; return ret;
} }
if (run->immediate_exit)
return -EINTR;
if (vcpu->sigset_active) if (vcpu->sigset_active)
sigprocmask(SIG_SETMASK, &vcpu->sigset, &sigsaved); sigprocmask(SIG_SETMASK, &vcpu->sigset, &sigsaved);
......
...@@ -1232,9 +1232,9 @@ void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm, ...@@ -1232,9 +1232,9 @@ void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm,
} }
static void coherent_cache_guest_page(struct kvm_vcpu *vcpu, kvm_pfn_t pfn, static void coherent_cache_guest_page(struct kvm_vcpu *vcpu, kvm_pfn_t pfn,
unsigned long size, bool uncached) unsigned long size)
{ {
__coherent_cache_guest_page(vcpu, pfn, size, uncached); __coherent_cache_guest_page(vcpu, pfn, size);
} }
static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
...@@ -1250,7 +1250,6 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, ...@@ -1250,7 +1250,6 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
struct vm_area_struct *vma; struct vm_area_struct *vma;
kvm_pfn_t pfn; kvm_pfn_t pfn;
pgprot_t mem_type = PAGE_S2; pgprot_t mem_type = PAGE_S2;
bool fault_ipa_uncached;
bool logging_active = memslot_is_logging(memslot); bool logging_active = memslot_is_logging(memslot);
unsigned long flags = 0; unsigned long flags = 0;
...@@ -1337,8 +1336,6 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, ...@@ -1337,8 +1336,6 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
if (!hugetlb && !force_pte) if (!hugetlb && !force_pte)
hugetlb = transparent_hugepage_adjust(&pfn, &fault_ipa); hugetlb = transparent_hugepage_adjust(&pfn, &fault_ipa);
fault_ipa_uncached = memslot->flags & KVM_MEMSLOT_INCOHERENT;
if (hugetlb) { if (hugetlb) {
pmd_t new_pmd = pfn_pmd(pfn, mem_type); pmd_t new_pmd = pfn_pmd(pfn, mem_type);
new_pmd = pmd_mkhuge(new_pmd); new_pmd = pmd_mkhuge(new_pmd);
...@@ -1346,7 +1343,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, ...@@ -1346,7 +1343,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
new_pmd = kvm_s2pmd_mkwrite(new_pmd); new_pmd = kvm_s2pmd_mkwrite(new_pmd);
kvm_set_pfn_dirty(pfn); kvm_set_pfn_dirty(pfn);
} }
coherent_cache_guest_page(vcpu, pfn, PMD_SIZE, fault_ipa_uncached); coherent_cache_guest_page(vcpu, pfn, PMD_SIZE);
ret = stage2_set_pmd_huge(kvm, memcache, fault_ipa, &new_pmd); ret = stage2_set_pmd_huge(kvm, memcache, fault_ipa, &new_pmd);
} else { } else {
pte_t new_pte = pfn_pte(pfn, mem_type); pte_t new_pte = pfn_pte(pfn, mem_type);
...@@ -1356,7 +1353,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, ...@@ -1356,7 +1353,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
kvm_set_pfn_dirty(pfn); kvm_set_pfn_dirty(pfn);
mark_page_dirty(kvm, gfn); mark_page_dirty(kvm, gfn);
} }
coherent_cache_guest_page(vcpu, pfn, PAGE_SIZE, fault_ipa_uncached); coherent_cache_guest_page(vcpu, pfn, PAGE_SIZE);
ret = stage2_set_pte(kvm, memcache, fault_ipa, &new_pte, flags); ret = stage2_set_pte(kvm, memcache, fault_ipa, &new_pte, flags);
} }
...@@ -1879,15 +1876,6 @@ void kvm_arch_free_memslot(struct kvm *kvm, struct kvm_memory_slot *free, ...@@ -1879,15 +1876,6 @@ void kvm_arch_free_memslot(struct kvm *kvm, struct kvm_memory_slot *free,
int kvm_arch_create_memslot(struct kvm *kvm, struct kvm_memory_slot *slot, int kvm_arch_create_memslot(struct kvm *kvm, struct kvm_memory_slot *slot,
unsigned long npages) unsigned long npages)
{ {
/*
* Readonly memslots are not incoherent with the caches by definition,
* but in practice, they are used mostly to emulate ROMs or NOR flashes
* that the guest may consider devices and hence map as uncached.
* To prevent incoherency issues in these cases, tag all readonly
* regions as incoherent.
*/
if (slot->flags & KVM_MEM_READONLY)
slot->flags |= KVM_MEMSLOT_INCOHERENT;
return 0; return 0;
} }
......
...@@ -37,6 +37,11 @@ static struct kvm_regs cortexa_regs_reset = { ...@@ -37,6 +37,11 @@ static struct kvm_regs cortexa_regs_reset = {
.usr_regs.ARM_cpsr = SVC_MODE | PSR_A_BIT | PSR_I_BIT | PSR_F_BIT, .usr_regs.ARM_cpsr = SVC_MODE | PSR_A_BIT | PSR_I_BIT | PSR_F_BIT,
}; };
static const struct kvm_irq_level cortexa_ptimer_irq = {
{ .irq = 30 },
.level = 1,
};
static const struct kvm_irq_level cortexa_vtimer_irq = { static const struct kvm_irq_level cortexa_vtimer_irq = {
{ .irq = 27 }, { .irq = 27 },
.level = 1, .level = 1,
...@@ -58,6 +63,7 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu) ...@@ -58,6 +63,7 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
{ {
struct kvm_regs *reset_regs; struct kvm_regs *reset_regs;
const struct kvm_irq_level *cpu_vtimer_irq; const struct kvm_irq_level *cpu_vtimer_irq;
const struct kvm_irq_level *cpu_ptimer_irq;
switch (vcpu->arch.target) { switch (vcpu->arch.target) {
case KVM_ARM_TARGET_CORTEX_A7: case KVM_ARM_TARGET_CORTEX_A7:
...@@ -65,6 +71,7 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu) ...@@ -65,6 +71,7 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
reset_regs = &cortexa_regs_reset; reset_regs = &cortexa_regs_reset;
vcpu->arch.midr = read_cpuid_id(); vcpu->arch.midr = read_cpuid_id();
cpu_vtimer_irq = &cortexa_vtimer_irq; cpu_vtimer_irq = &cortexa_vtimer_irq;
cpu_ptimer_irq = &cortexa_ptimer_irq;
break; break;
default: default:
return -ENODEV; return -ENODEV;
...@@ -77,5 +84,5 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu) ...@@ -77,5 +84,5 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
kvm_reset_coprocs(vcpu); kvm_reset_coprocs(vcpu);
/* Reset arch_timer context */ /* Reset arch_timer context */
return kvm_timer_vcpu_reset(vcpu, cpu_vtimer_irq); return kvm_timer_vcpu_reset(vcpu, cpu_vtimer_irq, cpu_ptimer_irq);
} }
/*
* VGIC system registers handling functions for AArch32 mode
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*/
#include <linux/kvm.h>
#include <linux/kvm_host.h>
#include <asm/kvm_emulate.h>
#include "vgic.h"
int vgic_v3_has_cpu_sysregs_attr(struct kvm_vcpu *vcpu, bool is_write, u64 id,
u64 *reg)
{
/*
* TODO: Implement for AArch32
*/
return -ENXIO;
}
int vgic_v3_cpu_sysregs_uaccess(struct kvm_vcpu *vcpu, bool is_write, u64 id,
u64 *reg)
{
/*
* TODO: Implement for AArch32
*/
return -ENXIO;
}
...@@ -70,9 +70,6 @@ struct kvm_arch { ...@@ -70,9 +70,6 @@ struct kvm_arch {
/* Interrupt controller */ /* Interrupt controller */
struct vgic_dist vgic; struct vgic_dist vgic;
/* Timer */
struct arch_timer_kvm timer;
}; };
#define KVM_NR_MEM_OBJS 40 #define KVM_NR_MEM_OBJS 40
......
...@@ -236,13 +236,11 @@ static inline bool vcpu_has_cache_enabled(struct kvm_vcpu *vcpu) ...@@ -236,13 +236,11 @@ static inline bool vcpu_has_cache_enabled(struct kvm_vcpu *vcpu)
static inline void __coherent_cache_guest_page(struct kvm_vcpu *vcpu, static inline void __coherent_cache_guest_page(struct kvm_vcpu *vcpu,
kvm_pfn_t pfn, kvm_pfn_t pfn,
unsigned long size, unsigned long size)
bool ipa_uncached)
{ {
void *va = page_address(pfn_to_page(pfn)); void *va = page_address(pfn_to_page(pfn));
if (!vcpu_has_cache_enabled(vcpu) || ipa_uncached) kvm_flush_dcache_to_poc(va, size);
kvm_flush_dcache_to_poc(va, size);
if (!icache_is_aliasing()) { /* PIPT */ if (!icache_is_aliasing()) { /* PIPT */
flush_icache_range((unsigned long)va, flush_icache_range((unsigned long)va,
......
...@@ -201,10 +201,23 @@ struct kvm_arch_memory_slot { ...@@ -201,10 +201,23 @@ struct kvm_arch_memory_slot {
#define KVM_DEV_ARM_VGIC_GRP_CPU_REGS 2 #define KVM_DEV_ARM_VGIC_GRP_CPU_REGS 2
#define KVM_DEV_ARM_VGIC_CPUID_SHIFT 32 #define KVM_DEV_ARM_VGIC_CPUID_SHIFT 32
#define KVM_DEV_ARM_VGIC_CPUID_MASK (0xffULL << KVM_DEV_ARM_VGIC_CPUID_SHIFT) #define KVM_DEV_ARM_VGIC_CPUID_MASK (0xffULL << KVM_DEV_ARM_VGIC_CPUID_SHIFT)
#define KVM_DEV_ARM_VGIC_V3_MPIDR_SHIFT 32
#define KVM_DEV_ARM_VGIC_V3_MPIDR_MASK \
(0xffffffffULL << KVM_DEV_ARM_VGIC_V3_MPIDR_SHIFT)
#define KVM_DEV_ARM_VGIC_OFFSET_SHIFT 0 #define KVM_DEV_ARM_VGIC_OFFSET_SHIFT 0
#define KVM_DEV_ARM_VGIC_OFFSET_MASK (0xffffffffULL << KVM_DEV_ARM_VGIC_OFFSET_SHIFT) #define KVM_DEV_ARM_VGIC_OFFSET_MASK (0xffffffffULL << KVM_DEV_ARM_VGIC_OFFSET_SHIFT)
#define KVM_DEV_ARM_VGIC_SYSREG_INSTR_MASK (0xffff)
#define KVM_DEV_ARM_VGIC_GRP_NR_IRQS 3 #define KVM_DEV_ARM_VGIC_GRP_NR_IRQS 3
#define KVM_DEV_ARM_VGIC_GRP_CTRL 4 #define KVM_DEV_ARM_VGIC_GRP_CTRL 4
#define KVM_DEV_ARM_VGIC_GRP_REDIST_REGS 5
#define KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS 6
#define KVM_DEV_ARM_VGIC_GRP_LEVEL_INFO 7
#define KVM_DEV_ARM_VGIC_LINE_LEVEL_INFO_SHIFT 10
#define KVM_DEV_ARM_VGIC_LINE_LEVEL_INFO_MASK \
(0x3fffffULL << KVM_DEV_ARM_VGIC_LINE_LEVEL_INFO_SHIFT)
#define KVM_DEV_ARM_VGIC_LINE_LEVEL_INTID_MASK 0x3ff
#define VGIC_LEVEL_INFO_LINE_LEVEL 0
#define KVM_DEV_ARM_VGIC_CTRL_INIT 0 #define KVM_DEV_ARM_VGIC_CTRL_INIT 0
/* Device Control API on vcpu fd */ /* Device Control API on vcpu fd */
......
...@@ -2,7 +2,7 @@ ...@@ -2,7 +2,7 @@
# Makefile for Kernel-based Virtual Machine module # Makefile for Kernel-based Virtual Machine module
# #
ccflags-y += -Iarch/arm64/kvm ccflags-y += -Iarch/arm64/kvm -Ivirt/kvm/arm/vgic
CFLAGS_arm.o := -I. CFLAGS_arm.o := -I.
CFLAGS_mmu.o := -I. CFLAGS_mmu.o := -I.
...@@ -19,6 +19,7 @@ kvm-$(CONFIG_KVM_ARM_HOST) += $(ARM)/psci.o $(ARM)/perf.o ...@@ -19,6 +19,7 @@ kvm-$(CONFIG_KVM_ARM_HOST) += $(ARM)/psci.o $(ARM)/perf.o
kvm-$(CONFIG_KVM_ARM_HOST) += inject_fault.o regmap.o kvm-$(CONFIG_KVM_ARM_HOST) += inject_fault.o regmap.o
kvm-$(CONFIG_KVM_ARM_HOST) += hyp.o hyp-init.o handle_exit.o kvm-$(CONFIG_KVM_ARM_HOST) += hyp.o hyp-init.o handle_exit.o
kvm-$(CONFIG_KVM_ARM_HOST) += guest.o debug.o reset.o sys_regs.o sys_regs_generic_v8.o kvm-$(CONFIG_KVM_ARM_HOST) += guest.o debug.o reset.o sys_regs.o sys_regs_generic_v8.o
kvm-$(CONFIG_KVM_ARM_HOST) += vgic-sys-reg-v3.o
kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/aarch32.o kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/aarch32.o
kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/vgic/vgic.o kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/vgic/vgic.o
...@@ -31,6 +32,7 @@ kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/vgic/vgic-mmio-v2.o ...@@ -31,6 +32,7 @@ kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/vgic/vgic-mmio-v2.o
kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/vgic/vgic-mmio-v3.o kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/vgic/vgic-mmio-v3.o
kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/vgic/vgic-kvm-device.o kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/vgic/vgic-kvm-device.o
kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/vgic/vgic-its.o kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/vgic/vgic-its.o
kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/vgic/vgic-debug.o
kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/irqchip.o kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/irqchip.o
kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/arch_timer.o kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/arch_timer.o
kvm-$(CONFIG_KVM_ARM_PMU) += $(KVM)/arm/pmu.o kvm-$(CONFIG_KVM_ARM_PMU) += $(KVM)/arm/pmu.o
...@@ -46,6 +46,11 @@ static const struct kvm_regs default_regs_reset32 = { ...@@ -46,6 +46,11 @@ static const struct kvm_regs default_regs_reset32 = {
COMPAT_PSR_I_BIT | COMPAT_PSR_F_BIT), COMPAT_PSR_I_BIT | COMPAT_PSR_F_BIT),
}; };
static const struct kvm_irq_level default_ptimer_irq = {
.irq = 30,
.level = 1,
};
static const struct kvm_irq_level default_vtimer_irq = { static const struct kvm_irq_level default_vtimer_irq = {
.irq = 27, .irq = 27,
.level = 1, .level = 1,
...@@ -104,6 +109,7 @@ int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext) ...@@ -104,6 +109,7 @@ int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext)
int kvm_reset_vcpu(struct kvm_vcpu *vcpu) int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
{ {
const struct kvm_irq_level *cpu_vtimer_irq; const struct kvm_irq_level *cpu_vtimer_irq;
const struct kvm_irq_level *cpu_ptimer_irq;
const struct kvm_regs *cpu_reset; const struct kvm_regs *cpu_reset;
switch (vcpu->arch.target) { switch (vcpu->arch.target) {
...@@ -117,6 +123,7 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu) ...@@ -117,6 +123,7 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
} }
cpu_vtimer_irq = &default_vtimer_irq; cpu_vtimer_irq = &default_vtimer_irq;
cpu_ptimer_irq = &default_ptimer_irq;
break; break;
} }
...@@ -130,5 +137,5 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu) ...@@ -130,5 +137,5 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
kvm_pmu_vcpu_reset(vcpu); kvm_pmu_vcpu_reset(vcpu);
/* Reset timer */ /* Reset timer */
return kvm_timer_vcpu_reset(vcpu, cpu_vtimer_irq); return kvm_timer_vcpu_reset(vcpu, cpu_vtimer_irq, cpu_ptimer_irq);
} }
...@@ -820,6 +820,61 @@ static bool access_pmuserenr(struct kvm_vcpu *vcpu, struct sys_reg_params *p, ...@@ -820,6 +820,61 @@ static bool access_pmuserenr(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
CRm((0b1100 | (((n) >> 3) & 0x3))), Op2(((n) & 0x7)), \ CRm((0b1100 | (((n) >> 3) & 0x3))), Op2(((n) & 0x7)), \
access_pmu_evtyper, reset_unknown, (PMEVTYPER0_EL0 + n), } access_pmu_evtyper, reset_unknown, (PMEVTYPER0_EL0 + n), }
static bool access_cntp_tval(struct kvm_vcpu *vcpu,
struct sys_reg_params *p,
const struct sys_reg_desc *r)
{
struct arch_timer_context *ptimer = vcpu_ptimer(vcpu);
u64 now = kvm_phys_timer_read();
if (p->is_write)
ptimer->cnt_cval = p->regval + now;
else
p->regval = ptimer->cnt_cval - now;
return true;
}
static bool access_cntp_ctl(struct kvm_vcpu *vcpu,
struct sys_reg_params *p,
const struct sys_reg_desc *r)
{
struct arch_timer_context *ptimer = vcpu_ptimer(vcpu);
if (p->is_write) {
/* ISTATUS bit is read-only */
ptimer->cnt_ctl = p->regval & ~ARCH_TIMER_CTRL_IT_STAT;
} else {
u64 now = kvm_phys_timer_read();
p->regval = ptimer->cnt_ctl;
/*
* Set ISTATUS bit if it's expired.
* Note that according to ARMv8 ARM Issue A.k, ISTATUS bit is
* UNKNOWN when ENABLE bit is 0, so we chose to set ISTATUS bit
* regardless of ENABLE bit for our implementation convenience.
*/
if (ptimer->cnt_cval <= now)
p->regval |= ARCH_TIMER_CTRL_IT_STAT;
}
return true;
}
static bool access_cntp_cval(struct kvm_vcpu *vcpu,
struct sys_reg_params *p,
const struct sys_reg_desc *r)
{
struct arch_timer_context *ptimer = vcpu_ptimer(vcpu);
if (p->is_write)
ptimer->cnt_cval = p->regval;
else
p->regval = ptimer->cnt_cval;
return true;
}
/* /*
* Architected system registers. * Architected system registers.
* Important: Must be sorted ascending by Op0, Op1, CRn, CRm, Op2 * Important: Must be sorted ascending by Op0, Op1, CRn, CRm, Op2
...@@ -1029,6 +1084,16 @@ static const struct sys_reg_desc sys_reg_descs[] = { ...@@ -1029,6 +1084,16 @@ static const struct sys_reg_desc sys_reg_descs[] = {
{ Op0(0b11), Op1(0b011), CRn(0b1101), CRm(0b0000), Op2(0b011), { Op0(0b11), Op1(0b011), CRn(0b1101), CRm(0b0000), Op2(0b011),
NULL, reset_unknown, TPIDRRO_EL0 }, NULL, reset_unknown, TPIDRRO_EL0 },
/* CNTP_TVAL_EL0 */
{ Op0(0b11), Op1(0b011), CRn(0b1110), CRm(0b0010), Op2(0b000),
access_cntp_tval },
/* CNTP_CTL_EL0 */
{ Op0(0b11), Op1(0b011), CRn(0b1110), CRm(0b0010), Op2(0b001),
access_cntp_ctl },
/* CNTP_CVAL_EL0 */
{ Op0(0b11), Op1(0b011), CRn(0b1110), CRm(0b0010), Op2(0b010),
access_cntp_cval },
/* PMEVCNTRn_EL0 */ /* PMEVCNTRn_EL0 */
PMU_PMEVCNTR_EL0(0), PMU_PMEVCNTR_EL0(0),
PMU_PMEVCNTR_EL0(1), PMU_PMEVCNTR_EL0(1),
...@@ -1795,6 +1860,17 @@ static bool index_to_params(u64 id, struct sys_reg_params *params) ...@@ -1795,6 +1860,17 @@ static bool index_to_params(u64 id, struct sys_reg_params *params)
} }
} }
const struct sys_reg_desc *find_reg_by_id(u64 id,
struct sys_reg_params *params,
const struct sys_reg_desc table[],
unsigned int num)
{
if (!index_to_params(id, params))
return NULL;
return find_reg(params, table, num);
}
/* Decode an index value, and find the sys_reg_desc entry. */ /* Decode an index value, and find the sys_reg_desc entry. */
static const struct sys_reg_desc *index_to_sys_reg_desc(struct kvm_vcpu *vcpu, static const struct sys_reg_desc *index_to_sys_reg_desc(struct kvm_vcpu *vcpu,
u64 id) u64 id)
...@@ -1807,11 +1883,8 @@ static const struct sys_reg_desc *index_to_sys_reg_desc(struct kvm_vcpu *vcpu, ...@@ -1807,11 +1883,8 @@ static const struct sys_reg_desc *index_to_sys_reg_desc(struct kvm_vcpu *vcpu,
if ((id & KVM_REG_ARM_COPROC_MASK) != KVM_REG_ARM64_SYSREG) if ((id & KVM_REG_ARM_COPROC_MASK) != KVM_REG_ARM64_SYSREG)
return NULL; return NULL;
if (!index_to_params(id, &params))
return NULL;
table = get_target_table(vcpu->arch.target, true, &num); table = get_target_table(vcpu->arch.target, true, &num);
r = find_reg(&params, table, num); r = find_reg_by_id(id, &params, table, num);
if (!r) if (!r)
r = find_reg(&params, sys_reg_descs, ARRAY_SIZE(sys_reg_descs)); r = find_reg(&params, sys_reg_descs, ARRAY_SIZE(sys_reg_descs));
...@@ -1918,10 +1991,8 @@ static int get_invariant_sys_reg(u64 id, void __user *uaddr) ...@@ -1918,10 +1991,8 @@ static int get_invariant_sys_reg(u64 id, void __user *uaddr)
struct sys_reg_params params; struct sys_reg_params params;
const struct sys_reg_desc *r; const struct sys_reg_desc *r;
if (!index_to_params(id, &params)) r = find_reg_by_id(id, &params, invariant_sys_regs,
return -ENOENT; ARRAY_SIZE(invariant_sys_regs));
r = find_reg(&params, invariant_sys_regs, ARRAY_SIZE(invariant_sys_regs));
if (!r) if (!r)
return -ENOENT; return -ENOENT;
...@@ -1935,9 +2006,8 @@ static int set_invariant_sys_reg(u64 id, void __user *uaddr) ...@@ -1935,9 +2006,8 @@ static int set_invariant_sys_reg(u64 id, void __user *uaddr)
int err; int err;
u64 val = 0; /* Make sure high bits are 0 for 32-bit regs */ u64 val = 0; /* Make sure high bits are 0 for 32-bit regs */
if (!index_to_params(id, &params)) r = find_reg_by_id(id, &params, invariant_sys_regs,
return -ENOENT; ARRAY_SIZE(invariant_sys_regs));
r = find_reg(&params, invariant_sys_regs, ARRAY_SIZE(invariant_sys_regs));
if (!r) if (!r)
return -ENOENT; return -ENOENT;
......
...@@ -136,6 +136,10 @@ static inline int cmp_sys_reg(const struct sys_reg_desc *i1, ...@@ -136,6 +136,10 @@ static inline int cmp_sys_reg(const struct sys_reg_desc *i1,
return i1->Op2 - i2->Op2; return i1->Op2 - i2->Op2;
} }
const struct sys_reg_desc *find_reg_by_id(u64 id,
struct sys_reg_params *params,
const struct sys_reg_desc table[],
unsigned int num);
#define Op0(_x) .Op0 = _x #define Op0(_x) .Op0 = _x
#define Op1(_x) .Op1 = _x #define Op1(_x) .Op1 = _x
......
/*
* VGIC system registers handling functions for AArch64 mode
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*/
#include <linux/irqchip/arm-gic-v3.h>
#include <linux/kvm.h>
#include <linux/kvm_host.h>
#include <asm/kvm_emulate.h>
#include "vgic.h"
#include "sys_regs.h"
static bool access_gic_ctlr(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
const struct sys_reg_desc *r)
{
u32 host_pri_bits, host_id_bits, host_seis, host_a3v, seis, a3v;
struct vgic_cpu *vgic_v3_cpu = &vcpu->arch.vgic_cpu;
struct vgic_vmcr vmcr;
u64 val;
vgic_get_vmcr(vcpu, &vmcr);
if (p->is_write) {
val = p->regval;
/*
* Disallow restoring VM state if not supported by this
* hardware.
*/
host_pri_bits = ((val & ICC_CTLR_EL1_PRI_BITS_MASK) >>
ICC_CTLR_EL1_PRI_BITS_SHIFT) + 1;
if (host_pri_bits > vgic_v3_cpu->num_pri_bits)
return false;
vgic_v3_cpu->num_pri_bits = host_pri_bits;
host_id_bits = (val & ICC_CTLR_EL1_ID_BITS_MASK) >>
ICC_CTLR_EL1_ID_BITS_SHIFT;
if (host_id_bits > vgic_v3_cpu->num_id_bits)
return false;
vgic_v3_cpu->num_id_bits = host_id_bits;
host_seis = ((kvm_vgic_global_state.ich_vtr_el2 &
ICH_VTR_SEIS_MASK) >> ICH_VTR_SEIS_SHIFT);
seis = (val & ICC_CTLR_EL1_SEIS_MASK) >>
ICC_CTLR_EL1_SEIS_SHIFT;
if (host_seis != seis)
return false;
host_a3v = ((kvm_vgic_global_state.ich_vtr_el2 &
ICH_VTR_A3V_MASK) >> ICH_VTR_A3V_SHIFT);
a3v = (val & ICC_CTLR_EL1_A3V_MASK) >> ICC_CTLR_EL1_A3V_SHIFT;
if (host_a3v != a3v)
return false;
/*
* Here set VMCR.CTLR in ICC_CTLR_EL1 layout.
* The vgic_set_vmcr() will convert to ICH_VMCR layout.
*/
vmcr.ctlr = val & ICC_CTLR_EL1_CBPR_MASK;
vmcr.ctlr |= val & ICC_CTLR_EL1_EOImode_MASK;
vgic_set_vmcr(vcpu, &vmcr);
} else {
val = 0;
val |= (vgic_v3_cpu->num_pri_bits - 1) <<
ICC_CTLR_EL1_PRI_BITS_SHIFT;
val |= vgic_v3_cpu->num_id_bits << ICC_CTLR_EL1_ID_BITS_SHIFT;
val |= ((kvm_vgic_global_state.ich_vtr_el2 &
ICH_VTR_SEIS_MASK) >> ICH_VTR_SEIS_SHIFT) <<
ICC_CTLR_EL1_SEIS_SHIFT;
val |= ((kvm_vgic_global_state.ich_vtr_el2 &
ICH_VTR_A3V_MASK) >> ICH_VTR_A3V_SHIFT) <<
ICC_CTLR_EL1_A3V_SHIFT;
/*
* The VMCR.CTLR value is in ICC_CTLR_EL1 layout.
* Extract it directly using ICC_CTLR_EL1 reg definitions.
*/
val |= vmcr.ctlr & ICC_CTLR_EL1_CBPR_MASK;
val |= vmcr.ctlr & ICC_CTLR_EL1_EOImode_MASK;
p->regval = val;
}
return true;
}
static bool access_gic_pmr(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
const struct sys_reg_desc *r)
{
struct vgic_vmcr vmcr;
vgic_get_vmcr(vcpu, &vmcr);
if (p->is_write) {
vmcr.pmr = (p->regval & ICC_PMR_EL1_MASK) >> ICC_PMR_EL1_SHIFT;
vgic_set_vmcr(vcpu, &vmcr);
} else {
p->regval = (vmcr.pmr << ICC_PMR_EL1_SHIFT) & ICC_PMR_EL1_MASK;
}
return true;
}
static bool access_gic_bpr0(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
const struct sys_reg_desc *r)
{
struct vgic_vmcr vmcr;
vgic_get_vmcr(vcpu, &vmcr);
if (p->is_write) {
vmcr.bpr = (p->regval & ICC_BPR0_EL1_MASK) >>
ICC_BPR0_EL1_SHIFT;
vgic_set_vmcr(vcpu, &vmcr);
} else {
p->regval = (vmcr.bpr << ICC_BPR0_EL1_SHIFT) &
ICC_BPR0_EL1_MASK;
}
return true;
}
static bool access_gic_bpr1(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
const struct sys_reg_desc *r)
{
struct vgic_vmcr vmcr;
if (!p->is_write)
p->regval = 0;
vgic_get_vmcr(vcpu, &vmcr);
if (!((vmcr.ctlr & ICH_VMCR_CBPR_MASK) >> ICH_VMCR_CBPR_SHIFT)) {
if (p->is_write) {
vmcr.abpr = (p->regval & ICC_BPR1_EL1_MASK) >>
ICC_BPR1_EL1_SHIFT;
vgic_set_vmcr(vcpu, &vmcr);
} else {
p->regval = (vmcr.abpr << ICC_BPR1_EL1_SHIFT) &
ICC_BPR1_EL1_MASK;
}
} else {
if (!p->is_write)
p->regval = min((vmcr.bpr + 1), 7U);
}
return true;
}
static bool access_gic_grpen0(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
const struct sys_reg_desc *r)
{
struct vgic_vmcr vmcr;
vgic_get_vmcr(vcpu, &vmcr);
if (p->is_write) {
vmcr.grpen0 = (p->regval & ICC_IGRPEN0_EL1_MASK) >>
ICC_IGRPEN0_EL1_SHIFT;
vgic_set_vmcr(vcpu, &vmcr);
} else {
p->regval = (vmcr.grpen0 << ICC_IGRPEN0_EL1_SHIFT) &
ICC_IGRPEN0_EL1_MASK;
}
return true;
}
static bool access_gic_grpen1(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
const struct sys_reg_desc *r)
{
struct vgic_vmcr vmcr;
vgic_get_vmcr(vcpu, &vmcr);
if (p->is_write) {
vmcr.grpen1 = (p->regval & ICC_IGRPEN1_EL1_MASK) >>
ICC_IGRPEN1_EL1_SHIFT;
vgic_set_vmcr(vcpu, &vmcr);
} else {
p->regval = (vmcr.grpen1 << ICC_IGRPEN1_EL1_SHIFT) &
ICC_IGRPEN1_EL1_MASK;
}
return true;
}
static void vgic_v3_access_apr_reg(struct kvm_vcpu *vcpu,
struct sys_reg_params *p, u8 apr, u8 idx)
{
struct vgic_v3_cpu_if *vgicv3 = &vcpu->arch.vgic_cpu.vgic_v3;
uint32_t *ap_reg;
if (apr)
ap_reg = &vgicv3->vgic_ap1r[idx];
else
ap_reg = &vgicv3->vgic_ap0r[idx];
if (p->is_write)
*ap_reg = p->regval;
else
p->regval = *ap_reg;
}
static bool access_gic_aprn(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
const struct sys_reg_desc *r, u8 apr)
{
struct vgic_cpu *vgic_v3_cpu = &vcpu->arch.vgic_cpu;
u8 idx = r->Op2 & 3;
/*
* num_pri_bits are initialized with HW supported values.
* We can rely safely on num_pri_bits even if VM has not
* restored ICC_CTLR_EL1 before restoring APnR registers.
*/
switch (vgic_v3_cpu->num_pri_bits) {
case 7:
vgic_v3_access_apr_reg(vcpu, p, apr, idx);
break;
case 6:
if (idx > 1)
goto err;
vgic_v3_access_apr_reg(vcpu, p, apr, idx);
break;
default:
if (idx > 0)
goto err;
vgic_v3_access_apr_reg(vcpu, p, apr, idx);
}
return true;
err:
if (!p->is_write)
p->regval = 0;
return false;
}
static bool access_gic_ap0r(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
const struct sys_reg_desc *r)
{
return access_gic_aprn(vcpu, p, r, 0);
}
static bool access_gic_ap1r(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
const struct sys_reg_desc *r)
{
return access_gic_aprn(vcpu, p, r, 1);
}
static bool access_gic_sre(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
const struct sys_reg_desc *r)
{
struct vgic_v3_cpu_if *vgicv3 = &vcpu->arch.vgic_cpu.vgic_v3;
/* Validate SRE bit */
if (p->is_write) {
if (!(p->regval & ICC_SRE_EL1_SRE))
return false;
} else {
p->regval = vgicv3->vgic_sre;
}
return true;
}
static const struct sys_reg_desc gic_v3_icc_reg_descs[] = {
/* ICC_PMR_EL1 */
{ Op0(3), Op1(0), CRn(4), CRm(6), Op2(0), access_gic_pmr },
/* ICC_BPR0_EL1 */
{ Op0(3), Op1(0), CRn(12), CRm(8), Op2(3), access_gic_bpr0 },
/* ICC_AP0R0_EL1 */
{ Op0(3), Op1(0), CRn(12), CRm(8), Op2(4), access_gic_ap0r },
/* ICC_AP0R1_EL1 */
{ Op0(3), Op1(0), CRn(12), CRm(8), Op2(5), access_gic_ap0r },
/* ICC_AP0R2_EL1 */
{ Op0(3), Op1(0), CRn(12), CRm(8), Op2(6), access_gic_ap0r },
/* ICC_AP0R3_EL1 */
{ Op0(3), Op1(0), CRn(12), CRm(8), Op2(7), access_gic_ap0r },
/* ICC_AP1R0_EL1 */
{ Op0(3), Op1(0), CRn(12), CRm(9), Op2(0), access_gic_ap1r },
/* ICC_AP1R1_EL1 */
{ Op0(3), Op1(0), CRn(12), CRm(9), Op2(1), access_gic_ap1r },
/* ICC_AP1R2_EL1 */
{ Op0(3), Op1(0), CRn(12), CRm(9), Op2(2), access_gic_ap1r },
/* ICC_AP1R3_EL1 */
{ Op0(3), Op1(0), CRn(12), CRm(9), Op2(3), access_gic_ap1r },
/* ICC_BPR1_EL1 */
{ Op0(3), Op1(0), CRn(12), CRm(12), Op2(3), access_gic_bpr1 },
/* ICC_CTLR_EL1 */
{ Op0(3), Op1(0), CRn(12), CRm(12), Op2(4), access_gic_ctlr },
/* ICC_SRE_EL1 */
{ Op0(3), Op1(0), CRn(12), CRm(12), Op2(5), access_gic_sre },
/* ICC_IGRPEN0_EL1 */
{ Op0(3), Op1(0), CRn(12), CRm(12), Op2(6), access_gic_grpen0 },
/* ICC_IGRPEN1_EL1 */
{ Op0(3), Op1(0), CRn(12), CRm(12), Op2(7), access_gic_grpen1 },
};
int vgic_v3_has_cpu_sysregs_attr(struct kvm_vcpu *vcpu, bool is_write, u64 id,
u64 *reg)
{
struct sys_reg_params params;
u64 sysreg = (id & KVM_DEV_ARM_VGIC_SYSREG_MASK) | KVM_REG_SIZE_U64;
params.regval = *reg;
params.is_write = is_write;
params.is_aarch32 = false;
params.is_32bit = false;
if (find_reg_by_id(sysreg, &params, gic_v3_icc_reg_descs,
ARRAY_SIZE(gic_v3_icc_reg_descs)))
return 0;
return -ENXIO;
}
int vgic_v3_cpu_sysregs_uaccess(struct kvm_vcpu *vcpu, bool is_write, u64 id,
u64 *reg)
{
struct sys_reg_params params;
const struct sys_reg_desc *r;
u64 sysreg = (id & KVM_DEV_ARM_VGIC_SYSREG_MASK) | KVM_REG_SIZE_U64;
if (is_write)
params.regval = *reg;
params.is_write = is_write;
params.is_aarch32 = false;
params.is_32bit = false;
r = find_reg_by_id(sysreg, &params, gic_v3_icc_reg_descs,
ARRAY_SIZE(gic_v3_icc_reg_descs));
if (!r)
return -ENXIO;
if (!r->access(vcpu, &params, r))
return -EINVAL;
if (!is_write)
*reg = params.regval;
return 0;
}
...@@ -29,9 +29,11 @@ do { \ ...@@ -29,9 +29,11 @@ do { \
} \ } \
} while (0) } while (0)
extern void tlbmiss_handler_setup_pgd(unsigned long);
/* Note: This is also implemented with uasm in arch/mips/kvm/entry.c */
#define TLBMISS_HANDLER_SETUP_PGD(pgd) \ #define TLBMISS_HANDLER_SETUP_PGD(pgd) \
do { \ do { \
extern void tlbmiss_handler_setup_pgd(unsigned long); \
tlbmiss_handler_setup_pgd((unsigned long)(pgd)); \ tlbmiss_handler_setup_pgd((unsigned long)(pgd)); \
htw_set_pwbase((unsigned long)pgd); \ htw_set_pwbase((unsigned long)pgd); \
} while (0) } while (0)
...@@ -97,17 +99,12 @@ static inline void enter_lazy_tlb(struct mm_struct *mm, struct task_struct *tsk) ...@@ -97,17 +99,12 @@ static inline void enter_lazy_tlb(struct mm_struct *mm, struct task_struct *tsk)
static inline void static inline void
get_new_mmu_context(struct mm_struct *mm, unsigned long cpu) get_new_mmu_context(struct mm_struct *mm, unsigned long cpu)
{ {
extern void kvm_local_flush_tlb_all(void);
unsigned long asid = asid_cache(cpu); unsigned long asid = asid_cache(cpu);
if (!((asid += cpu_asid_inc()) & cpu_asid_mask(&cpu_data[cpu]))) { if (!((asid += cpu_asid_inc()) & cpu_asid_mask(&cpu_data[cpu]))) {
if (cpu_has_vtag_icache) if (cpu_has_vtag_icache)
flush_icache_all(); flush_icache_all();
#ifdef CONFIG_KVM
kvm_local_flush_tlb_all(); /* start new asid cycle */
#else
local_flush_tlb_all(); /* start new asid cycle */ local_flush_tlb_all(); /* start new asid cycle */
#endif
if (!asid) /* fix version if needed */ if (!asid) /* fix version if needed */
asid = asid_first_version(cpu); asid = asid_first_version(cpu);
} }
......
...@@ -19,6 +19,8 @@ ...@@ -19,6 +19,8 @@
* Some parts derived from the x86 version of this file. * Some parts derived from the x86 version of this file.
*/ */
#define __KVM_HAVE_READONLY_MEM
/* /*
* for KVM_GET_REGS and KVM_SET_REGS * for KVM_GET_REGS and KVM_SET_REGS
* *
......
...@@ -20,7 +20,9 @@ config KVM
 	select EXPORT_UASM
 	select PREEMPT_NOTIFIERS
 	select ANON_INODES
+	select KVM_GENERIC_DIRTYLOG_READ_PROTECT
 	select KVM_MMIO
+	select MMU_NOTIFIER
 	select SRCU
 	---help---
 	  Support for hosting Guest kernels.
...
...@@ -13,6 +13,7 @@
 #include <linux/err.h>
 #include <linux/highmem.h>
 #include <linux/kvm_host.h>
+#include <linux/uaccess.h>
 #include <linux/vmalloc.h>
 #include <linux/fs.h>
 #include <linux/bootmem.h>
...@@ -29,28 +30,37 @@
 static int kvm_mips_trans_replace(struct kvm_vcpu *vcpu, u32 *opc,
 				  union mips_instruction replace)
 {
-	unsigned long paddr, flags;
-	void *vaddr;
-
-	if (KVM_GUEST_KSEGX((unsigned long)opc) == KVM_GUEST_KSEG0) {
-		paddr = kvm_mips_translate_guest_kseg0_to_hpa(vcpu,
-							      (unsigned long)opc);
-		vaddr = kmap_atomic(pfn_to_page(PHYS_PFN(paddr)));
-		vaddr += paddr & ~PAGE_MASK;
-		memcpy(vaddr, (void *)&replace, sizeof(u32));
-		local_flush_icache_range((unsigned long)vaddr,
-					 (unsigned long)vaddr + 32);
-		kunmap_atomic(vaddr);
-	} else if (KVM_GUEST_KSEGX((unsigned long) opc) == KVM_GUEST_KSEG23) {
-		local_irq_save(flags);
-		memcpy((void *)opc, (void *)&replace, sizeof(u32));
-		__local_flush_icache_user_range((unsigned long)opc,
-						(unsigned long)opc + 32);
-		local_irq_restore(flags);
-	} else {
-		kvm_err("%s: Invalid address: %p\n", __func__, opc);
-		return -EFAULT;
+	unsigned long vaddr = (unsigned long)opc;
+	int err;
+
+retry:
+	/* The GVA page table is still active so use the Linux TLB handlers */
+	kvm_trap_emul_gva_lockless_begin(vcpu);
+	err = put_user(replace.word, opc);
+	kvm_trap_emul_gva_lockless_end(vcpu);
+
+	if (unlikely(err)) {
+		/*
+		 * We write protect clean pages in GVA page table so normal
+		 * Linux TLB mod handler doesn't silently dirty the page.
+		 * Its also possible we raced with a GVA invalidation.
+		 * Try to force the page to become dirty.
+		 */
+		err = kvm_trap_emul_gva_fault(vcpu, vaddr, true);
+		if (unlikely(err)) {
+			kvm_info("%s: Address unwriteable: %p\n",
+				 __func__, opc);
+			return -EFAULT;
+		}
+
+		/*
+		 * Try again. This will likely trigger a TLB refill, which will
+		 * fetch the new dirty entry from the GVA page table, which
+		 * should then succeed.
+		 */
+		goto retry;
 	}
+	__local_flush_icache_user_range(vaddr, vaddr + 4);
 
 	return 0;
 }
...
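A minimal sketch of a caller of the rewritten function above; the NOP replacement is illustrative only, the real dynamic-translation sites in this file build the replacement instruction from the trapping opcode.

/* Illustrative only: patch the guest instruction at @opc with a NOP
 * (sll zero, zero, 0 encodes as an all-zero word on MIPS).
 */
union mips_instruction nop = { .word = 0 };
int err = kvm_mips_trans_replace(vcpu, opc, nop);

if (err)
	kvm_err("%s: dyntrans patch at %p failed\n", __func__, opc);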
...@@ -12,8 +12,11 @@
  */
 
 #include <linux/kvm_host.h>
+#include <linux/log2.h>
+#include <asm/mmu_context.h>
 #include <asm/msa.h>
 #include <asm/setup.h>
+#include <asm/tlbex.h>
 #include <asm/uasm.h>
 
 /* Register names */
...@@ -50,6 +53,8 @@
 /* Some CP0 registers */
 #define C0_HWRENA	7, 0
 #define C0_BADVADDR	8, 0
+#define C0_BADINSTR	8, 1
+#define C0_BADINSTRP	8, 2
 #define C0_ENTRYHI	10, 0
 #define C0_STATUS	12, 0
 #define C0_CAUSE	13, 0
...@@ -89,6 +94,21 @@ static void *kvm_mips_build_ret_from_exit(void *addr);
 static void *kvm_mips_build_ret_to_guest(void *addr);
 static void *kvm_mips_build_ret_to_host(void *addr);
 
+/*
+ * The version of this function in tlbex.c uses current_cpu_type(), but for KVM
+ * we assume symmetry.
+ */
+static int c0_kscratch(void)
+{
+	switch (boot_cpu_type()) {
+	case CPU_XLP:
+	case CPU_XLR:
+		return 22;
+	default:
+		return 31;
+	}
+}
+
 /**
  * kvm_mips_entry_setup() - Perform global setup for entry code.
  *
...@@ -103,18 +123,21 @@ int kvm_mips_entry_setup(void)
 	 * We prefer to use KScratchN registers if they are available over the
 	 * defaults above, which may not work on all cores.
 	 */
-	unsigned int kscratch_mask = cpu_data[0].kscratch_mask & 0xfc;
+	unsigned int kscratch_mask = cpu_data[0].kscratch_mask;
+
+	if (pgd_reg != -1)
+		kscratch_mask &= ~BIT(pgd_reg);
 
 	/* Pick a scratch register for storing VCPU */
 	if (kscratch_mask) {
-		scratch_vcpu[0] = 31;
+		scratch_vcpu[0] = c0_kscratch();
 		scratch_vcpu[1] = ffs(kscratch_mask) - 1;
 		kscratch_mask &= ~BIT(scratch_vcpu[1]);
 	}
 
 	/* Pick a scratch register to use as a temp for saving state */
 	if (kscratch_mask) {
-		scratch_tmp[0] = 31;
+		scratch_tmp[0] = c0_kscratch();
 		scratch_tmp[1] = ffs(kscratch_mask) - 1;
 		kscratch_mask &= ~BIT(scratch_tmp[1]);
 	}
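A quick worked example of the selection logic above, assuming a hypothetical core that exposes KScratch2-KScratch5 (kscratch_mask == 0x3c) and already dedicates KScratch2 to the host pgd (pgd_reg == 2):

/* Illustrative only: which KScratch registers end up being picked. */
unsigned int mask = 0x3c & ~BIT(2);	/* 0x38: KScratch3..5 still free */
int vcpu_sel = ffs(mask) - 1;		/* 3: KScratch3 holds the VCPU   */
mask &= ~BIT(vcpu_sel);			/* 0x30 remains                  */
int tmp_sel = ffs(mask) - 1;		/* 4: KScratch4 is the temporary */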
...@@ -130,7 +153,7 @@ static void kvm_mips_build_save_scratch(u32 **p, unsigned int tmp,
 	UASM_i_SW(p, tmp, offsetof(struct pt_regs, cp0_epc), frame);
 
 	/* Save the temp scratch register value in cp0_cause of stack frame */
-	if (scratch_tmp[0] == 31) {
+	if (scratch_tmp[0] == c0_kscratch()) {
 		UASM_i_MFC0(p, tmp, scratch_tmp[0], scratch_tmp[1]);
 		UASM_i_SW(p, tmp, offsetof(struct pt_regs, cp0_cause), frame);
 	}
...@@ -146,7 +169,7 @@ static void kvm_mips_build_restore_scratch(u32 **p, unsigned int tmp,
 	UASM_i_LW(p, tmp, offsetof(struct pt_regs, cp0_epc), frame);
 	UASM_i_MTC0(p, tmp, scratch_vcpu[0], scratch_vcpu[1]);
 
-	if (scratch_tmp[0] == 31) {
+	if (scratch_tmp[0] == c0_kscratch()) {
 		UASM_i_LW(p, tmp, offsetof(struct pt_regs, cp0_cause), frame);
 		UASM_i_MTC0(p, tmp, scratch_tmp[0], scratch_tmp[1]);
 	}
...@@ -286,23 +309,26 @@ static void *kvm_mips_build_enter_guest(void *addr)
 	uasm_i_andi(&p, T0, T0, KSU_USER | ST0_ERL | ST0_EXL);
 	uasm_i_xori(&p, T0, T0, KSU_USER);
 	uasm_il_bnez(&p, &r, T0, label_kernel_asid);
-	UASM_i_ADDIU(&p, T1, K1,
-		     offsetof(struct kvm_vcpu_arch, guest_kernel_asid));
+	UASM_i_ADDIU(&p, T1, K1, offsetof(struct kvm_vcpu_arch,
+					  guest_kernel_mm.context.asid));
 	/* else user */
-	UASM_i_ADDIU(&p, T1, K1,
-		     offsetof(struct kvm_vcpu_arch, guest_user_asid));
+	UASM_i_ADDIU(&p, T1, K1, offsetof(struct kvm_vcpu_arch,
+					  guest_user_mm.context.asid));
 	uasm_l_kernel_asid(&l, p);
 
 	/* t1: contains the base of the ASID array, need to get the cpu id */
 	/* smp_processor_id */
 	uasm_i_lw(&p, T2, offsetof(struct thread_info, cpu), GP);
-	/* x4 */
-	uasm_i_sll(&p, T2, T2, 2);
+	/* index the ASID array */
+	uasm_i_sll(&p, T2, T2, ilog2(sizeof(long)));
 	UASM_i_ADDU(&p, T3, T1, T2);
-	uasm_i_lw(&p, K0, 0, T3);
+	UASM_i_LW(&p, K0, 0, T3);
 #ifdef CONFIG_MIPS_ASID_BITS_VARIABLE
-	/* x sizeof(struct cpuinfo_mips)/4 */
-	uasm_i_addiu(&p, T3, ZERO, sizeof(struct cpuinfo_mips)/4);
+	/*
+	 * reuse ASID array offset
+	 * cpuinfo_mips is a multiple of sizeof(long)
+	 */
+	uasm_i_addiu(&p, T3, ZERO, sizeof(struct cpuinfo_mips)/sizeof(long));
 	uasm_i_mul(&p, T2, T2, T3);
 
 	UASM_i_LA_mostly(&p, AT, (long)&cpu_data[0].asid_mask);
...@@ -312,7 +338,20 @@ static void *kvm_mips_build_enter_guest(void *addr)
 #else
 	uasm_i_andi(&p, K0, K0, MIPS_ENTRYHI_ASID);
 #endif
-	uasm_i_mtc0(&p, K0, C0_ENTRYHI);
+
+	/*
+	 * Set up KVM T&E GVA pgd.
+	 * This does roughly the same as TLBMISS_HANDLER_SETUP_PGD():
+	 * - call tlbmiss_handler_setup_pgd(mm->pgd)
+	 * - but skips write into CP0_PWBase for now
+	 */
+	UASM_i_LW(&p, A0, (int)offsetof(struct mm_struct, pgd) -
+			  (int)offsetof(struct mm_struct, context.asid), T1);
+	UASM_i_LA(&p, T9, (unsigned long)tlbmiss_handler_setup_pgd);
+	uasm_i_jalr(&p, RA, T9);
+	uasm_i_mtc0(&p, K0, C0_ENTRYHI);
+
 	uasm_i_ehb(&p);
 
 	/* Disable RDHWR access */
...@@ -347,6 +386,80 @@ static void *kvm_mips_build_enter_guest(void *addr)
	return p;
}
/**
* kvm_mips_build_tlb_refill_exception() - Assemble TLB refill handler.
* @addr: Address to start writing code.
* @handler: Address of common handler (within range of @addr).
*
* Assemble TLB refill exception fast path handler for guest execution.
*
* Returns: Next address after end of written function.
*/
void *kvm_mips_build_tlb_refill_exception(void *addr, void *handler)
{
u32 *p = addr;
struct uasm_label labels[2];
struct uasm_reloc relocs[2];
struct uasm_label *l = labels;
struct uasm_reloc *r = relocs;
memset(labels, 0, sizeof(labels));
memset(relocs, 0, sizeof(relocs));
/* Save guest k1 into scratch register */
UASM_i_MTC0(&p, K1, scratch_tmp[0], scratch_tmp[1]);
/* Get the VCPU pointer from the VCPU scratch register */
UASM_i_MFC0(&p, K1, scratch_vcpu[0], scratch_vcpu[1]);
/* Save guest k0 into VCPU structure */
UASM_i_SW(&p, K0, offsetof(struct kvm_vcpu, arch.gprs[K0]), K1);
/*
* Some of the common tlbex code uses current_cpu_type(). For KVM we
* assume symmetry and just disable preemption to silence the warning.
*/
preempt_disable();
/*
* Now for the actual refill bit. A lot of this can be common with the
* Linux TLB refill handler, however we don't need to handle so many
* cases. We only need to handle user mode refills, and user mode runs
* with 32-bit addressing.
*
* Therefore the branch to label_vmalloc generated by build_get_pmde64()
* that isn't resolved should never actually get taken and is harmless
* to leave in place for now.
*/
#ifdef CONFIG_64BIT
build_get_pmde64(&p, &l, &r, K0, K1); /* get pmd in K1 */
#else
build_get_pgde32(&p, K0, K1); /* get pgd in K1 */
#endif
/* we don't support huge pages yet */
build_get_ptep(&p, K0, K1);
build_update_entries(&p, K0, K1);
build_tlb_write_entry(&p, &l, &r, tlb_random);
preempt_enable();
/* Get the VCPU pointer from the VCPU scratch register again */
UASM_i_MFC0(&p, K1, scratch_vcpu[0], scratch_vcpu[1]);
/* Restore the guest's k0/k1 registers */
UASM_i_LW(&p, K0, offsetof(struct kvm_vcpu, arch.gprs[K0]), K1);
uasm_i_ehb(&p);
UASM_i_MFC0(&p, K1, scratch_tmp[0], scratch_tmp[1]);
/* Jump to guest */
uasm_i_eret(&p);
return p;
}
/**
 * kvm_mips_build_exception() - Assemble first level guest exception handler.
 * @addr: Address to start writing code.
...@@ -468,6 +581,18 @@ void *kvm_mips_build_exit(void *addr)
	uasm_i_mfc0(&p, K0, C0_CAUSE);
	uasm_i_sw(&p, K0, offsetof(struct kvm_vcpu_arch, host_cp0_cause), K1);
if (cpu_has_badinstr) {
uasm_i_mfc0(&p, K0, C0_BADINSTR);
uasm_i_sw(&p, K0, offsetof(struct kvm_vcpu_arch,
host_cp0_badinstr), K1);
}
if (cpu_has_badinstrp) {
uasm_i_mfc0(&p, K0, C0_BADINSTRP);
uasm_i_sw(&p, K0, offsetof(struct kvm_vcpu_arch,
host_cp0_badinstrp), K1);
}
	/* Now restore the host state just enough to run the handlers */

	/* Switch EBASE to the one used by Linux */
...
...@@ -183,10 +183,11 @@ int kvm_mips_irq_deliver_cb(struct kvm_vcpu *vcpu, unsigned int priority,
 			  (exccode << CAUSEB_EXCCODE));
 
 		/* XXXSL Set PC to the interrupt exception entry point */
+		arch->pc = kvm_mips_guest_exception_base(vcpu);
 		if (kvm_read_c0_guest_cause(cop0) & CAUSEF_IV)
-			arch->pc = KVM_GUEST_KSEG0 + 0x200;
+			arch->pc += 0x200;
 		else
-			arch->pc = KVM_GUEST_KSEG0 + 0x180;
+			arch->pc += 0x180;
 
 		clear_bit(priority, &vcpu->arch.pending_exceptions);
 	}
...
...@@ -22,6 +22,10 @@
 #include <asm/book3s/64/mmu-hash.h>
 
+/* Power architecture requires HPT is at least 256kiB, at most 64TiB */
+#define PPC_MIN_HPT_ORDER	18
+#define PPC_MAX_HPT_ORDER	46
+
 #ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
 static inline struct kvmppc_book3s_shadow_vcpu *svcpu_get(struct kvm_vcpu *vcpu)
 {
...@@ -356,6 +360,18 @@ extern void kvmppc_mmu_debugfs_init(struct kvm *kvm);
 extern void kvmhv_rm_send_ipi(int cpu);
 
+static inline unsigned long kvmppc_hpt_npte(struct kvm_hpt_info *hpt)
+{
+	/* HPTEs are 2**4 bytes long */
+	return 1UL << (hpt->order - 4);
+}
+
+static inline unsigned long kvmppc_hpt_mask(struct kvm_hpt_info *hpt)
+{
+	/* 128 (2**7) bytes in each HPTEG */
+	return (1UL << (hpt->order - 7)) - 1;
+}
+
 #endif /* CONFIG_KVM_BOOK3S_HV_POSSIBLE */
 
 #endif /* __ASM_KVM_BOOK3S_64_H__ */
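As a quick sanity check of the arithmetic above: for the minimum supported order of 18 (a 256 KiB HPT) the helpers give 2^14 HPTEs grouped into 2^11 HPTEGs.

/* Illustrative only: evaluate the helpers for PPC_MIN_HPT_ORDER. */
struct kvm_hpt_info hpt = { .order = PPC_MIN_HPT_ORDER };	/* 18 */

unsigned long npte = kvmppc_hpt_npte(&hpt);	/* 1UL << 14 == 16384 HPTEs */
unsigned long mask = kvmppc_hpt_mask(&hpt);	/* (1UL << 11) - 1 == 2047  */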
...@@ -241,12 +241,24 @@ struct kvm_arch_memory_slot {
 #endif /* CONFIG_KVM_BOOK3S_HV_POSSIBLE */
 };
 
+struct kvm_hpt_info {
+	/* Host virtual (linear mapping) address of guest HPT */
+	unsigned long virt;
+	/* Array of reverse mapping entries for each guest HPTE */
+	struct revmap_entry *rev;
+	/* Guest HPT size is 2**(order) bytes */
+	u32 order;
+	/* 1 if HPT allocated with CMA, 0 otherwise */
+	int cma;
+};
+
+struct kvm_resize_hpt;
+
 struct kvm_arch {
 	unsigned int lpid;
 #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
 	unsigned int tlb_sets;
-	unsigned long hpt_virt;
-	struct revmap_entry *revmap;
+	struct kvm_hpt_info hpt;
 	atomic64_t mmio_update;
 	unsigned int host_lpid;
 	unsigned long host_lpcr;
...@@ -256,20 +268,17 @@ struct kvm_arch {
 	unsigned long lpcr;
 	unsigned long vrma_slb_v;
 	int hpte_setup_done;
-	u32 hpt_order;
 	atomic_t vcpus_running;
 	u32 online_vcores;
-	unsigned long hpt_npte;
-	unsigned long hpt_mask;
 	atomic_t hpte_mod_interest;
 	cpumask_t need_tlb_flush;
 	cpumask_t cpu_in_guest;
-	int hpt_cma_alloc;
 	u8 radix;
 	pgd_t *pgtable;
 	u64 process_table;
 	struct dentry *debugfs_dir;
 	struct dentry *htab_dentry;
+	struct kvm_resize_hpt *resize_hpt; /* protected by kvm->lock */
 #endif /* CONFIG_KVM_BOOK3S_HV_POSSIBLE */
 #ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
 	struct mutex hpt_mutex;
...
...@@ -155,9 +155,10 @@ extern void kvmppc_core_destroy_mmu(struct kvm_vcpu *vcpu);
 extern int kvmppc_kvm_pv(struct kvm_vcpu *vcpu);
 extern void kvmppc_map_magic(struct kvm_vcpu *vcpu);
 
-extern long kvmppc_alloc_hpt(struct kvm *kvm, u32 *htab_orderp);
-extern long kvmppc_alloc_reset_hpt(struct kvm *kvm, u32 *htab_orderp);
-extern void kvmppc_free_hpt(struct kvm *kvm);
+extern int kvmppc_allocate_hpt(struct kvm_hpt_info *info, u32 order);
+extern void kvmppc_set_hpt(struct kvm *kvm, struct kvm_hpt_info *info);
+extern long kvmppc_alloc_reset_hpt(struct kvm *kvm, int order);
+extern void kvmppc_free_hpt(struct kvm_hpt_info *info);
 extern long kvmppc_prepare_vrma(struct kvm *kvm,
 				struct kvm_userspace_memory_region *mem);
 extern void kvmppc_map_vrma(struct kvm_vcpu *vcpu,
...@@ -186,8 +187,8 @@ extern long kvmppc_h_stuff_tce(struct kvm_vcpu *vcpu,
 			unsigned long tce_value, unsigned long npages);
 extern long kvmppc_h_get_tce(struct kvm_vcpu *vcpu, unsigned long liobn,
 			     unsigned long ioba);
-extern struct page *kvm_alloc_hpt(unsigned long nr_pages);
-extern void kvm_release_hpt(struct page *page, unsigned long nr_pages);
+extern struct page *kvm_alloc_hpt_cma(unsigned long nr_pages);
+extern void kvm_free_hpt_cma(struct page *page, unsigned long nr_pages);
 extern int kvmppc_core_init_vm(struct kvm *kvm);
 extern void kvmppc_core_destroy_vm(struct kvm *kvm);
 extern void kvmppc_core_free_memslot(struct kvm *kvm,
...@@ -214,6 +215,10 @@ extern void kvmppc_bookehv_exit(void);
 extern int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu);
 
 extern int kvm_vm_ioctl_get_htab_fd(struct kvm *kvm, struct kvm_get_htab_fd *);
+extern long kvm_vm_ioctl_resize_hpt_prepare(struct kvm *kvm,
+					    struct kvm_ppc_resize_hpt *rhpt);
+extern long kvm_vm_ioctl_resize_hpt_commit(struct kvm *kvm,
+					    struct kvm_ppc_resize_hpt *rhpt);
 
 int kvm_vcpu_ioctl_interrupt(struct kvm_vcpu *vcpu, struct kvm_interrupt *irq);
...
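A hedged user-space sketch of how these two resize entry points are meant to be driven through the corresponding KVM_PPC_RESIZE_HPT_PREPARE/COMMIT vm ioctls; error handling is trimmed and the retry interval returned by PREPARE is only an estimate in milliseconds.

/* Illustrative only: grow the guest HPT to 2^shift bytes. */
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/kvm.h>

static int resize_guest_hpt(int vm_fd, unsigned int shift)
{
	struct kvm_ppc_resize_hpt rhpt = { .flags = 0, .shift = shift };
	int rc;

	do {
		rc = ioctl(vm_fd, KVM_PPC_RESIZE_HPT_PREPARE, &rhpt);
		if (rc > 0)
			usleep(rc * 1000);	/* not ready yet, poll again */
	} while (rc > 0);

	if (rc < 0)
		return rc;			/* preparation failed */

	return ioctl(vm_fd, KVM_PPC_RESIZE_HPT_COMMIT, &rhpt);
}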
...@@ -633,5 +633,7 @@ struct kvm_ppc_rmmu_info {
 #define KVM_XICS_LEVEL_SENSITIVE	(1ULL << 40)
 #define KVM_XICS_MASKED			(1ULL << 41)
 #define KVM_XICS_PENDING		(1ULL << 42)
+#define KVM_XICS_PRESENTED		(1ULL << 43)
+#define KVM_XICS_QUEUED			(1ULL << 44)
 
 #endif /* __LINUX_KVM_POWERPC_H */
...@@ -224,7 +224,8 @@ static int kvmppc_mmu_book3s_32_xlate_pte(struct kvm_vcpu *vcpu, gva_t eaddr,
 	ptem = kvmppc_mmu_book3s_32_get_ptem(sre, eaddr, primary);
 
 	if(copy_from_user(pteg, (void __user *)ptegp, sizeof(pteg))) {
-		printk(KERN_ERR "KVM: Can't copy data from 0x%lx!\n", ptegp);
+		printk_ratelimited(KERN_ERR
+			"KVM: Can't copy data from 0x%lx!\n", ptegp);
 		goto no_page_found;
 	}
...
...@@ -265,7 +265,8 @@ static int kvmppc_mmu_book3s_64_xlate(struct kvm_vcpu *vcpu, gva_t eaddr,
 		goto no_page_found;
 
 	if(copy_from_user(pteg, (void __user *)ptegp, sizeof(pteg))) {
-		printk(KERN_ERR "KVM can't copy data from 0x%lx!\n", ptegp);
+		printk_ratelimited(KERN_ERR
+			"KVM: Can't copy data from 0x%lx!\n", ptegp);
 		goto no_page_found;
 	}
...
...@@ -171,6 +171,7 @@ long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm,
 		goto fail;
 	}
 
+	ret = -ENOMEM;
 	stt = kzalloc(sizeof(*stt) + npages * sizeof(struct page *),
 		      GFP_KERNEL);
 	if (!stt)
...
...@@ -52,19 +52,19 @@ static int __init early_parse_kvm_cma_resv(char *p)
 }
 early_param("kvm_cma_resv_ratio", early_parse_kvm_cma_resv);
 
-struct page *kvm_alloc_hpt(unsigned long nr_pages)
+struct page *kvm_alloc_hpt_cma(unsigned long nr_pages)
 {
 	VM_BUG_ON(order_base_2(nr_pages) < KVM_CMA_CHUNK_ORDER - PAGE_SHIFT);
 
 	return cma_alloc(kvm_cma, nr_pages, order_base_2(HPT_ALIGN_PAGES));
 }
-EXPORT_SYMBOL_GPL(kvm_alloc_hpt);
+EXPORT_SYMBOL_GPL(kvm_alloc_hpt_cma);
 
-void kvm_release_hpt(struct page *page, unsigned long nr_pages)
+void kvm_free_hpt_cma(struct page *page, unsigned long nr_pages)
 {
 	cma_release(kvm_cma, page, nr_pages);
 }
-EXPORT_SYMBOL_GPL(kvm_release_hpt);
+EXPORT_SYMBOL_GPL(kvm_free_hpt_cma);
 
 /**
  * kvm_cma_reserve() - reserve area for kvm hash pagetable
...
...@@ -31,16 +31,19 @@
 /* Priority value to use for disabling an interrupt */
 #define MASKED	0xff
 
+#define PQ_PRESENTED	1
+#define PQ_QUEUED	2
+
 /* State for one irq source */
 struct ics_irq_state {
 	u32 number;
 	u32 server;
+	u32 pq_state;
 	u8  priority;
 	u8  saved_priority;
 	u8  resend;
 	u8  masked_pending;
 	u8  lsi;		/* level-sensitive interrupt */
-	u8  asserted;		/* Only for LSI */
 	u8  exists;
 	int intr_cpu;
 	u32 host_irq;
...@@ -73,7 +76,6 @@ struct kvmppc_icp {
 	 */
 #define XICS_RM_KICK_VCPU	0x1
 #define XICS_RM_CHECK_RESEND	0x2
-#define XICS_RM_REJECT		0x4
 #define XICS_RM_NOTIFY_EOI	0x8
 	u32 rm_action;
 	struct kvm_vcpu *rm_kick_target;
...@@ -84,7 +86,6 @@ struct kvmppc_icp {
 	/* Counters for each reason we exited real mode */
 	unsigned long n_rm_kick_vcpu;
 	unsigned long n_rm_check_resend;
-	unsigned long n_rm_reject;
 	unsigned long n_rm_notify_eoi;
 
 	/* Counters for handling ICP processing in real mode */
 	unsigned long n_check_resend;
...
...@@ -441,5 +441,6 @@ int emulator_task_switch(struct x86_emulate_ctxt *ctxt,
 int emulate_int_real(struct x86_emulate_ctxt *ctxt, int irq);
 void emulator_invalidate_register_cache(struct x86_emulate_ctxt *ctxt);
 void emulator_writeback_register_cache(struct x86_emulate_ctxt *ctxt);
+bool emulator_can_use_gpa(struct x86_emulate_ctxt *ctxt);
 
 #endif /* _ASM_X86_KVM_X86_EMULATE_H */