Commit 44e98edc authored by Linus Torvalds's avatar Linus Torvalds

Merge tag 'kvm-4.3-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "A very small release for x86 and s390 KVM.

   - s390: timekeeping changes, cleanups and fixes

   - x86: support for Hyper-V MSRs to report crashes, and a bunch of
     cleanups.

  One interesting feature that was planned for 4.3 (emulating the local
  APIC in kernel while keeping the IOAPIC and 8254 in userspace) had to
  be delayed because Intel complained about my reading of the manual"

* tag 'kvm-4.3-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (42 commits)
  x86/kvm: Rename VMX's segment access rights defines
  KVM: x86/vPMU: Fix unnecessary signed extension for AMD PERFCTRn
  kvm: x86: Fix error handling in the function kvm_lapic_sync_from_vapic
  KVM: s390: Fix assumption that kvm_set_irq_routing is always run successfully
  KVM: VMX: drop ept misconfig check
  KVM: MMU: fully check zero bits for sptes
  KVM: MMU: introduce is_shadow_zero_bits_set()
  KVM: MMU: introduce the framework to check zero bits on sptes
  KVM: MMU: split reset_rsvds_bits_mask_ept
  KVM: MMU: split reset_rsvds_bits_mask
  KVM: MMU: introduce rsvd_bits_validate
  KVM: MMU: move FNAME(is_rsvd_bits_set) to mmu.c
  KVM: MMU: fix validation of mmio page fault
  KVM: MTRR: Use default type for non-MTRR-covered gfn before WARN_ON
  KVM: s390: host STP toleration for VMs
  KVM: x86: clean/fix memory barriers in irqchip_in_kernel
  KVM: document memory barriers for kvm->vcpus/kvm->online_vcpus
  KVM: x86: remove unnecessary memory barriers for shared MSRs
  KVM: move code related to KVM_SET_BOOT_CPU_ID to x86
  KVM: s390: log capability enablement and vm attribute changes
  ...
parents 64291f7d 4d283ec9
......@@ -16,8 +16,6 @@ Debugging390.txt
- hints for debugging on s390 systems.
driver-model.txt
- information on s390 devices and the driver model.
kvm.txt
- ioctl calls to /dev/kvm on s390.
monreader.txt
- information on accessing the z/VM monitor stream from Linux.
qeth.txt
......
*** BIG FAT WARNING ***
The kvm module is currently in EXPERIMENTAL state for s390. This means that
the interface to the module is not yet considered to remain stable. Thus, be
prepared that we keep breaking your userspace application and guest
compatibility over and over again until we feel happy with the result. Make sure
your guest kernel, your host kernel, and your userspace launcher are in a
consistent state.
This Documentation describes the unique ioctl calls to /dev/kvm, the resulting
kvm-vm file descriptors, and the kvm-vcpu file descriptors that differ from x86.
1. ioctl calls to /dev/kvm
KVM does support the following ioctls on s390 that are common with other
architectures and do behave the same:
KVM_GET_API_VERSION
KVM_CREATE_VM (*) see note
KVM_CHECK_EXTENSION
KVM_GET_VCPU_MMAP_SIZE
Notes:
* KVM_CREATE_VM may fail on s390, if the calling process has multiple
threads and has not called KVM_S390_ENABLE_SIE before.
In addition, on s390 the following architecture specific ioctls are supported:
ioctl: KVM_S390_ENABLE_SIE
args: none
see also: include/linux/kvm.h
This call causes the kernel to switch on PGSTE in the user page table. This
operation is needed in order to run a virtual machine, and it requires the
calling process to be single-threaded. Note that the first call to KVM_CREATE_VM
will implicitly try to switch on PGSTE if the user process has not called
KVM_S390_ENABLE_SIE before. User processes that want to launch multiple threads
before creating a virtual machine have to call KVM_S390_ENABLE_SIE, or will
observe an error calling KVM_CREATE_VM. Switching on PGSTE is a one-time
operation, is not reversible, and will persist over the entire lifetime of
the calling process. It does not have any user-visible effect other than a small
performance penalty.
2. ioctl calls to the kvm-vm file descriptor
KVM does support the following ioctls on s390 that are common with other
architectures and do behave the same:
KVM_CREATE_VCPU
KVM_SET_USER_MEMORY_REGION (*) see note
KVM_GET_DIRTY_LOG (**) see note
Notes:
* kvm does only allow exactly one memory slot on s390, which has to start
at guest absolute address zero and at a user address that is aligned on any
page boundary. This hardware "limitation" allows us to have a few unique
optimizations. The memory slot doesn't have to be filled
with memory actually, it may contain sparse holes. That said, with different
user memory layout this does still allow a large flexibility when
doing the guest memory setup.
** KVM_GET_DIRTY_LOG doesn't work properly yet. The user will receive an empty
log. This ioctl call is only needed for guest migration, and we intend to
implement this one in the future.
In addition, on s390 the following architecture specific ioctls for the kvm-vm
file descriptor are supported:
ioctl: KVM_S390_INTERRUPT
args: struct kvm_s390_interrupt *
see also: include/linux/kvm.h
This ioctl is used to submit a floating interrupt for a virtual machine.
Floating interrupts may be delivered to any virtual cpu in the configuration.
Only some interrupt types defined in include/linux/kvm.h make sense when
submitted as floating interrupts. The following interrupts are not considered
to be useful as floating interrupts, and a call to inject them will result in
-EINVAL error code: program interrupts and interprocessor signals. Valid
floating interrupts are:
KVM_S390_INT_VIRTIO
KVM_S390_INT_SERVICE
3. ioctl calls to the kvm-vcpu file descriptor
KVM does support the following ioctls on s390 that are common with other
architectures and do behave the same:
KVM_RUN
KVM_GET_REGS
KVM_SET_REGS
KVM_GET_SREGS
KVM_SET_SREGS
KVM_GET_FPU
KVM_SET_FPU
In addition, on s390 the following architecture specific ioctls for the
kvm-vcpu file descriptor are supported:
ioctl: KVM_S390_INTERRUPT
args: struct kvm_s390_interrupt *
see also: include/linux/kvm.h
This ioctl is used to submit an interrupt for a specific virtual cpu.
Only some interrupt types defined in include/linux/kvm.h make sense when
submitted for a specific cpu. The following interrupts are not considered
to be useful, and a call to inject them will result in -EINVAL error code:
service processor calls and virtio interrupts. Valid interrupt types are:
KVM_S390_PROGRAM_INT
KVM_S390_SIGP_STOP
KVM_S390_RESTART
KVM_S390_SIGP_SET_PREFIX
KVM_S390_INT_EMERGENCY
ioctl: KVM_S390_STORE_STATUS
args: unsigned long
see also: include/linux/kvm.h
This ioctl stores the state of the cpu at the guest real address given as
argument, unless one of the following values defined in include/linux/kvm.h
is given as argument:
KVM_S390_STORE_STATUS_NOADDR - the CPU stores its status to the save area in
absolute lowcore as defined by the principles of operation
KVM_S390_STORE_STATUS_PREFIXED - the CPU stores its status to the save area in
its prefix page just like the dump tool that comes with zipl. This is useful
to create a system dump for use with lkcdutils or crash.
ioctl: KVM_S390_SET_INITIAL_PSW
args: struct kvm_s390_psw *
see also: include/linux/kvm.h
This ioctl can be used to set the processor status word (psw) of a stopped cpu
prior to running it with KVM_RUN. Note that this call is not required to modify
the psw during sie intercepts that fall back to userspace because struct kvm_run
does contain the psw, and this value is evaluated during reentry of KVM_RUN
after the intercept exit was recognized.
ioctl: KVM_S390_INITIAL_RESET
args: none
see also: include/linux/kvm.h
This ioctl can be used to perform an initial cpu reset as defined by the
principles of operation. The target cpu has to be in stopped state.
......@@ -3277,6 +3277,7 @@ should put the acknowledged interrupt vector into the 'epr' field.
struct {
#define KVM_SYSTEM_EVENT_SHUTDOWN 1
#define KVM_SYSTEM_EVENT_RESET 2
#define KVM_SYSTEM_EVENT_CRASH 3
__u32 type;
__u64 flags;
} system_event;
......@@ -3296,6 +3297,10 @@ Valid values for 'type' are:
KVM_SYSTEM_EVENT_RESET -- the guest has requested a reset of the VM.
As with SHUTDOWN, userspace can choose to ignore the request, or
to schedule the reset to occur in the future and may call KVM_RUN again.
KVM_SYSTEM_EVENT_CRASH -- the guest crash occurred and the guest
has requested a crash condition maintenance. Userspace can choose
to ignore the request, or to gather VM memory core dump and/or
reset/shutdown of the VM.
/* Fix the size of the union. */
char padding[256];
......
......@@ -214,6 +214,9 @@ static inline int etr_ptff(void *ptff_block, unsigned int func)
void etr_switch_to_local(void);
void etr_sync_check(void);
/* notifier for syncs */
extern struct atomic_notifier_head s390_epoch_delta_notifier;
/* STP interruption parameter */
struct stp_irq_parm {
unsigned int _pad0 : 14;
......
......@@ -258,6 +258,9 @@ struct kvm_vcpu_stat {
u32 diagnose_10;
u32 diagnose_44;
u32 diagnose_9c;
u32 diagnose_258;
u32 diagnose_308;
u32 diagnose_500;
};
#define PGM_OPERATION 0x01
......@@ -630,7 +633,6 @@ extern char sie_exit;
static inline void kvm_arch_hardware_disable(void) {}
static inline void kvm_arch_check_processor_compat(void *rtn) {}
static inline void kvm_arch_exit(void) {}
static inline void kvm_arch_sync_events(struct kvm *kvm) {}
static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
......
......@@ -58,6 +58,9 @@ EXPORT_SYMBOL_GPL(sched_clock_base_cc);
static DEFINE_PER_CPU(struct clock_event_device, comparators);
ATOMIC_NOTIFIER_HEAD(s390_epoch_delta_notifier);
EXPORT_SYMBOL(s390_epoch_delta_notifier);
/*
* Scheduler clock - returns current time in nanosec units.
*/
......@@ -752,7 +755,7 @@ static void clock_sync_cpu(struct clock_sync_data *sync)
static int etr_sync_clock(void *data)
{
static int first;
unsigned long long clock, old_clock, delay, delta;
unsigned long long clock, old_clock, clock_delta, delay, delta;
struct clock_sync_data *etr_sync;
struct etr_aib *sync_port, *aib;
int port;
......@@ -789,6 +792,9 @@ static int etr_sync_clock(void *data)
delay = (unsigned long long)
(aib->edf2.etv - sync_port->edf2.etv) << 32;
delta = adjust_time(old_clock, clock, delay);
clock_delta = clock - old_clock;
atomic_notifier_call_chain(&s390_epoch_delta_notifier, 0,
&clock_delta);
etr_sync->fixup_cc = delta;
fixup_clock_comparator(delta);
/* Verify that the clock is properly set. */
......@@ -1526,7 +1532,7 @@ void stp_island_check(void)
static int stp_sync_clock(void *data)
{
static int first;
unsigned long long old_clock, delta;
unsigned long long old_clock, delta, new_clock, clock_delta;
struct clock_sync_data *stp_sync;
int rc;
......@@ -1551,7 +1557,11 @@ static int stp_sync_clock(void *data)
old_clock = get_tod_clock();
rc = chsc_sstpc(stp_page, STP_OP_SYNC, 0);
if (rc == 0) {
delta = adjust_time(old_clock, get_tod_clock(), 0);
new_clock = get_tod_clock();
delta = adjust_time(old_clock, new_clock, 0);
clock_delta = new_clock - old_clock;
atomic_notifier_call_chain(&s390_epoch_delta_notifier,
0, &clock_delta);
fixup_clock_comparator(delta);
rc = chsc_sstpi(stp_page, &stp_info,
sizeof(struct stp_sstpi));
......
......@@ -27,13 +27,13 @@ static int diag_release_pages(struct kvm_vcpu *vcpu)
start = vcpu->run->s.regs.gprs[(vcpu->arch.sie_block->ipa & 0xf0) >> 4];
end = vcpu->run->s.regs.gprs[vcpu->arch.sie_block->ipa & 0xf] + 4096;
vcpu->stat.diagnose_10++;
if (start & ~PAGE_MASK || end & ~PAGE_MASK || start >= end
|| start < 2 * PAGE_SIZE)
return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
VCPU_EVENT(vcpu, 5, "diag release pages %lX %lX", start, end);
vcpu->stat.diagnose_10++;
/*
* We checked for start >= end above, so lets check for the
......@@ -75,6 +75,9 @@ static int __diag_page_ref_service(struct kvm_vcpu *vcpu)
u16 rx = (vcpu->arch.sie_block->ipa & 0xf0) >> 4;
u16 ry = (vcpu->arch.sie_block->ipa & 0x0f);
VCPU_EVENT(vcpu, 3, "diag page reference parameter block at 0x%llx",
vcpu->run->s.regs.gprs[rx]);
vcpu->stat.diagnose_258++;
if (vcpu->run->s.regs.gprs[rx] & 7)
return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
rc = read_guest(vcpu, vcpu->run->s.regs.gprs[rx], rx, &parm, sizeof(parm));
......@@ -85,6 +88,9 @@ static int __diag_page_ref_service(struct kvm_vcpu *vcpu)
switch (parm.subcode) {
case 0: /* TOKEN */
VCPU_EVENT(vcpu, 3, "pageref token addr 0x%llx "
"select mask 0x%llx compare mask 0x%llx",
parm.token_addr, parm.select_mask, parm.compare_mask);
if (vcpu->arch.pfault_token != KVM_S390_PFAULT_TOKEN_INVALID) {
/*
* If the pagefault handshake is already activated,
......@@ -114,6 +120,7 @@ static int __diag_page_ref_service(struct kvm_vcpu *vcpu)
* the cancel, therefore to reduce code complexity, we assume
* all outstanding tokens are already pending.
*/
VCPU_EVENT(vcpu, 3, "pageref cancel addr 0x%llx", parm.token_addr);
if (parm.token_addr || parm.select_mask ||
parm.compare_mask || parm.zarch)
return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
......@@ -174,7 +181,8 @@ static int __diag_ipl_functions(struct kvm_vcpu *vcpu)
unsigned int reg = vcpu->arch.sie_block->ipa & 0xf;
unsigned long subcode = vcpu->run->s.regs.gprs[reg] & 0xffff;
VCPU_EVENT(vcpu, 5, "diag ipl functions, subcode %lx", subcode);
VCPU_EVENT(vcpu, 3, "diag ipl functions, subcode %lx", subcode);
vcpu->stat.diagnose_308++;
switch (subcode) {
case 3:
vcpu->run->s390_reset_flags = KVM_S390_RESET_CLEAR;
......@@ -202,6 +210,7 @@ static int __diag_virtio_hypercall(struct kvm_vcpu *vcpu)
{
int ret;
vcpu->stat.diagnose_500++;
/* No virtio-ccw notification? Get out quickly. */
if (!vcpu->kvm->arch.css_support ||
(vcpu->run->s.regs.gprs[1] != KVM_S390_VIRTIO_CCW_NOTIFY))
......
......@@ -473,10 +473,45 @@ static void filter_guest_per_event(struct kvm_vcpu *vcpu)
vcpu->arch.sie_block->iprcc &= ~PGM_PER;
}
#define pssec(vcpu) (vcpu->arch.sie_block->gcr[1] & _ASCE_SPACE_SWITCH)
#define hssec(vcpu) (vcpu->arch.sie_block->gcr[13] & _ASCE_SPACE_SWITCH)
#define old_ssec(vcpu) ((vcpu->arch.sie_block->tecmc >> 31) & 0x1)
#define old_as_is_home(vcpu) !(vcpu->arch.sie_block->tecmc & 0xffff)
void kvm_s390_handle_per_event(struct kvm_vcpu *vcpu)
{
int new_as;
if (debug_exit_required(vcpu))
vcpu->guest_debug |= KVM_GUESTDBG_EXIT_PENDING;
filter_guest_per_event(vcpu);
/*
* Only RP, SAC, SACF, PT, PTI, PR, PC instructions can trigger
* a space-switch event. PER events enforce space-switch events
* for these instructions. So if no PER event for the guest is left,
* we might have to filter the space-switch element out, too.
*/
if (vcpu->arch.sie_block->iprcc == PGM_SPACE_SWITCH) {
vcpu->arch.sie_block->iprcc = 0;
new_as = psw_bits(vcpu->arch.sie_block->gpsw).as;
/*
* If the AS changed from / to home, we had RP, SAC or SACF
* instruction. Check primary and home space-switch-event
* controls. (theoretically home -> home produced no event)
*/
if (((new_as == PSW_AS_HOME) ^ old_as_is_home(vcpu)) &&
(pssec(vcpu) || hssec(vcpu)))
vcpu->arch.sie_block->iprcc = PGM_SPACE_SWITCH;
/*
* PT, PTI, PR, PC instruction operate on primary AS only. Check
* if the primary-space-switch-event control was or got set.
*/
if (new_as == PSW_AS_PRIMARY && !old_as_is_home(vcpu) &&
(pssec(vcpu) || old_ssec(vcpu)))
vcpu->arch.sie_block->iprcc = PGM_SPACE_SWITCH;
}
}
This diff is collapsed.
This diff is collapsed.
......@@ -27,6 +27,13 @@ typedef int (*intercept_handler_t)(struct kvm_vcpu *vcpu);
#define TDB_FORMAT1 1
#define IS_ITDB_VALID(vcpu) ((*(char *)vcpu->arch.sie_block->itdba == TDB_FORMAT1))
extern debug_info_t *kvm_s390_dbf;
#define KVM_EVENT(d_loglevel, d_string, d_args...)\
do { \
debug_sprintf_event(kvm_s390_dbf, d_loglevel, d_string "\n", \
d_args); \
} while (0)
#define VM_EVENT(d_kvm, d_loglevel, d_string, d_args...)\
do { \
debug_sprintf_event(d_kvm->arch.dbf, d_loglevel, d_string "\n", \
......@@ -65,6 +72,8 @@ static inline u32 kvm_s390_get_prefix(struct kvm_vcpu *vcpu)
static inline void kvm_s390_set_prefix(struct kvm_vcpu *vcpu, u32 prefix)
{
VCPU_EVENT(vcpu, 3, "set prefix of cpu %03u to 0x%x", vcpu->vcpu_id,
prefix);
vcpu->arch.sie_block->prefix = prefix >> GUEST_PREFIX_SHIFT;
kvm_make_request(KVM_REQ_TLB_FLUSH, vcpu);
kvm_make_request(KVM_REQ_MMU_RELOAD, vcpu);
......@@ -217,8 +226,6 @@ void exit_sie(struct kvm_vcpu *vcpu);
void kvm_s390_sync_request(int req, struct kvm_vcpu *vcpu);
int kvm_s390_vcpu_setup_cmma(struct kvm_vcpu *vcpu);
void kvm_s390_vcpu_unsetup_cmma(struct kvm_vcpu *vcpu);
/* is cmma enabled */
bool kvm_s390_cmma_enabled(struct kvm *kvm);
unsigned long kvm_s390_fac_list_mask_size(void);
extern unsigned long kvm_s390_fac_list_mask[];
......
......@@ -53,11 +53,14 @@ static int handle_set_clock(struct kvm_vcpu *vcpu)
kvm_s390_set_psw_cc(vcpu, 3);
return 0;
}
VCPU_EVENT(vcpu, 3, "SCK: setting guest TOD to 0x%llx", val);
val = (val - hostclk) & ~0x3fUL;
mutex_lock(&vcpu->kvm->lock);
preempt_disable();
kvm_for_each_vcpu(i, cpup, vcpu->kvm)
cpup->arch.sie_block->epoch = val;
preempt_enable();
mutex_unlock(&vcpu->kvm->lock);
kvm_s390_set_psw_cc(vcpu, 0);
......@@ -98,8 +101,6 @@ static int handle_set_prefix(struct kvm_vcpu *vcpu)
return kvm_s390_inject_program_int(vcpu, PGM_ADDRESSING);
kvm_s390_set_prefix(vcpu, address);
VCPU_EVENT(vcpu, 5, "setting prefix to %x", address);
trace_kvm_s390_handle_prefix(vcpu, 1, address);
return 0;
}
......@@ -129,7 +130,7 @@ static int handle_store_prefix(struct kvm_vcpu *vcpu)
if (rc)
return kvm_s390_inject_prog_cond(vcpu, rc);
VCPU_EVENT(vcpu, 5, "storing prefix to %x", address);
VCPU_EVENT(vcpu, 3, "STPX: storing prefix 0x%x into 0x%llx", address, operand2);
trace_kvm_s390_handle_prefix(vcpu, 0, address);
return 0;
}
......@@ -155,7 +156,7 @@ static int handle_store_cpu_address(struct kvm_vcpu *vcpu)
if (rc)
return kvm_s390_inject_prog_cond(vcpu, rc);
VCPU_EVENT(vcpu, 5, "storing cpu address to %llx", ga);
VCPU_EVENT(vcpu, 3, "STAP: storing cpu address (%u) to 0x%llx", vcpu_id, ga);
trace_kvm_s390_handle_stap(vcpu, ga);
return 0;
}
......@@ -167,6 +168,7 @@ static int __skey_check_enable(struct kvm_vcpu *vcpu)
return rc;
rc = s390_enable_skey();
VCPU_EVENT(vcpu, 3, "%s", "enabling storage keys for guest");
trace_kvm_s390_skey_related_inst(vcpu);
vcpu->arch.sie_block->ictl &= ~(ICTL_ISKE | ICTL_SSKE | ICTL_RRBE);
return rc;
......@@ -370,7 +372,7 @@ static int handle_stfl(struct kvm_vcpu *vcpu)
&fac, sizeof(fac));
if (rc)
return rc;
VCPU_EVENT(vcpu, 5, "store facility list value %x", fac);
VCPU_EVENT(vcpu, 3, "STFL: store facility list 0x%x", fac);
trace_kvm_s390_handle_stfl(vcpu, fac);
return 0;
}
......@@ -468,7 +470,7 @@ static int handle_stidp(struct kvm_vcpu *vcpu)
if (rc)
return kvm_s390_inject_prog_cond(vcpu, rc);
VCPU_EVENT(vcpu, 5, "%s", "store cpu id");
VCPU_EVENT(vcpu, 3, "STIDP: store cpu id 0x%llx", stidp_data);
return 0;
}
......@@ -521,7 +523,7 @@ static int handle_stsi(struct kvm_vcpu *vcpu)
ar_t ar;
vcpu->stat.instruction_stsi++;
VCPU_EVENT(vcpu, 4, "stsi: fc: %x sel1: %x sel2: %x", fc, sel1, sel2);
VCPU_EVENT(vcpu, 3, "STSI: fc: %u sel1: %u sel2: %u", fc, sel1, sel2);
if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
......@@ -758,10 +760,10 @@ static int handle_essa(struct kvm_vcpu *vcpu)
struct gmap *gmap;
int i;
VCPU_EVENT(vcpu, 5, "cmma release %d pages", entries);
VCPU_EVENT(vcpu, 4, "ESSA: release %d pages", entries);
gmap = vcpu->arch.gmap;
vcpu->stat.instruction_essa++;
if (!kvm_s390_cmma_enabled(vcpu->kvm))
if (!vcpu->kvm->arch.use_cmma)
return kvm_s390_inject_program_int(vcpu, PGM_OPERATION);
if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
......@@ -829,7 +831,7 @@ int kvm_s390_handle_lctl(struct kvm_vcpu *vcpu)
if (ga & 3)
return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
VCPU_EVENT(vcpu, 5, "lctl r1:%x, r3:%x, addr:%llx", reg1, reg3, ga);
VCPU_EVENT(vcpu, 4, "LCTL: r1:%d, r3:%d, addr: 0x%llx", reg1, reg3, ga);
trace_kvm_s390_handle_lctl(vcpu, 0, reg1, reg3, ga);
nr_regs = ((reg3 - reg1) & 0xf) + 1;
......@@ -868,7 +870,7 @@ int kvm_s390_handle_stctl(struct kvm_vcpu *vcpu)
if (ga & 3)
return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
VCPU_EVENT(vcpu, 5, "stctl r1:%x, r3:%x, addr:%llx", reg1, reg3, ga);
VCPU_EVENT(vcpu, 4, "STCTL r1:%d, r3:%d, addr: 0x%llx", reg1, reg3, ga);
trace_kvm_s390_handle_stctl(vcpu, 0, reg1, reg3, ga);
reg = reg1;
......@@ -902,7 +904,7 @@ static int handle_lctlg(struct kvm_vcpu *vcpu)
if (ga & 7)
return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
VCPU_EVENT(vcpu, 5, "lctlg r1:%x, r3:%x, addr:%llx", reg1, reg3, ga);
VCPU_EVENT(vcpu, 4, "LCTLG: r1:%d, r3:%d, addr: 0x%llx", reg1, reg3, ga);
trace_kvm_s390_handle_lctl(vcpu, 1, reg1, reg3, ga);
nr_regs = ((reg3 - reg1) & 0xf) + 1;
......@@ -940,7 +942,7 @@ static int handle_stctg(struct kvm_vcpu *vcpu)
if (ga & 7)
return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
VCPU_EVENT(vcpu, 5, "stctg r1:%x, r3:%x, addr:%llx", reg1, reg3, ga);
VCPU_EVENT(vcpu, 4, "STCTG r1:%d, r3:%d, addr: 0x%llx", reg1, reg3, ga);
trace_kvm_s390_handle_stctl(vcpu, 1, reg1, reg3, ga);
reg = reg1;
......
......@@ -205,9 +205,6 @@ static int __sigp_set_prefix(struct kvm_vcpu *vcpu, struct kvm_vcpu *dst_vcpu,
*reg &= 0xffffffff00000000UL;
*reg |= SIGP_STATUS_INCORRECT_STATE;
return SIGP_CC_STATUS_STORED;
} else if (rc == 0) {
VCPU_EVENT(vcpu, 4, "set prefix of cpu %02x to %x",
dst_vcpu->vcpu_id, irq.u.prefix.address);
}
return rc;
......@@ -371,7 +368,8 @@ static int handle_sigp_dst(struct kvm_vcpu *vcpu, u8 order_code,
return rc;
}
static int handle_sigp_order_in_user_space(struct kvm_vcpu *vcpu, u8 order_code)
static int handle_sigp_order_in_user_space(struct kvm_vcpu *vcpu, u8 order_code,
u16 cpu_addr)
{
if (!vcpu->kvm->arch.user_sigp)
return 0;
......@@ -414,9 +412,8 @@ static int handle_sigp_order_in_user_space(struct kvm_vcpu *vcpu, u8 order_code)
default:
vcpu->stat.instruction_sigp_unknown++;
}
VCPU_EVENT(vcpu, 4, "sigp order %u: completely handled in user space",
order_code);
VCPU_EVENT(vcpu, 3, "SIGP: order %u for CPU %d handled in userspace",
order_code, cpu_addr);
return 1;
}
......@@ -435,7 +432,7 @@ int kvm_s390_handle_sigp(struct kvm_vcpu *vcpu)
return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
order_code = kvm_s390_get_base_disp_rs(vcpu, NULL);
if (handle_sigp_order_in_user_space(vcpu, order_code))
if (handle_sigp_order_in_user_space(vcpu, order_code, cpu_addr))
return -EOPNOTSUPP;
if (r1 % 2)
......
......@@ -105,11 +105,22 @@ TRACE_EVENT(kvm_s390_vcpu_start_stop,
{KVM_S390_PROGRAM_INT, "program interrupt"}, \
{KVM_S390_SIGP_SET_PREFIX, "sigp set prefix"}, \
{KVM_S390_RESTART, "sigp restart"}, \
{KVM_S390_INT_PFAULT_INIT, "pfault init"}, \
{KVM_S390_INT_PFAULT_DONE, "pfault done"}, \
{KVM_S390_MCHK, "machine check"}, \
{KVM_S390_INT_CLOCK_COMP, "clock comparator"}, \
{KVM_S390_INT_CPU_TIMER, "cpu timer"}, \
{KVM_S390_INT_VIRTIO, "virtio interrupt"}, \
{KVM_S390_INT_SERVICE, "sclp interrupt"}, \
{KVM_S390_INT_EMERGENCY, "sigp emergency"}, \
{KVM_S390_INT_EXTERNAL_CALL, "sigp ext call"}
#define get_irq_name(__type) \
(__type > KVM_S390_INT_IO_MAX ? \
__print_symbolic(__type, kvm_s390_int_type) : \
(__type & KVM_S390_INT_IO_AI_MASK ? \
"adapter I/O interrupt" : "subchannel I/O interrupt"))
TRACE_EVENT(kvm_s390_inject_vm,
TP_PROTO(__u64 type, __u32 parm, __u64 parm64, int who),
TP_ARGS(type, parm, parm64, who),
......@@ -131,22 +142,19 @@ TRACE_EVENT(kvm_s390_inject_vm,
TP_printk("inject%s: type:%x (%s) parm:%x parm64:%llx",
(__entry->who == 1) ? " (from kernel)" :
(__entry->who == 2) ? " (from user)" : "",
__entry->inttype,
__print_symbolic(__entry->inttype, kvm_s390_int_type),
__entry->inttype, get_irq_name(__entry->inttype),
__entry->parm, __entry->parm64)
);
TRACE_EVENT(kvm_s390_inject_vcpu,
TP_PROTO(unsigned int id, __u64 type, __u32 parm, __u64 parm64, \
int who),
TP_ARGS(id, type, parm, parm64, who),
TP_PROTO(unsigned int id, __u64 type, __u32 parm, __u64 parm64),
TP_ARGS(id, type, parm, parm64),
TP_STRUCT__entry(
__field(int, id)
__field(__u32, inttype)
__field(__u32, parm)
__field(__u64, parm64)
__field(int, who)
),
TP_fast_assign(
......@@ -154,15 +162,12 @@ TRACE_EVENT(kvm_s390_inject_vcpu,
__entry->inttype = type & 0x00000000ffffffff;
__entry->parm = parm;
__entry->parm64 = parm64;
__entry->who = who;
),
TP_printk("inject%s (vcpu %d): type:%x (%s) parm:%x parm64:%llx",
(__entry->who == 1) ? " (from kernel)" :
(__entry->who == 2) ? " (from user)" : "",
TP_printk("inject (vcpu %d): type:%x (%s) parm:%x parm64:%llx",
__entry->id, __entry->inttype,
__print_symbolic(__entry->inttype, kvm_s390_int_type),
__entry->parm, __entry->parm64)
get_irq_name(__entry->inttype), __entry->parm,
__entry->parm64)
);
/*
......@@ -189,8 +194,8 @@ TRACE_EVENT(kvm_s390_deliver_interrupt,
TP_printk("deliver interrupt (vcpu %d): type:%x (%s) " \
"data:%08llx %016llx",
__entry->id, __entry->inttype,
__print_symbolic(__entry->inttype, kvm_s390_int_type),
__entry->data0, __entry->data1)
get_irq_name(__entry->inttype), __entry->data0,
__entry->data1)
);
/*
......
......@@ -252,6 +252,11 @@ struct kvm_pio_request {
int size;
};
struct rsvd_bits_validate {
u64 rsvd_bits_mask[2][4];
u64 bad_mt_xwr;
};
/*
* x86 supports 3 paging modes (4-level 64-bit, 3-level 64-bit, and 2-level
* 32-bit). The kvm_mmu structure abstracts the details of the current mmu
......@@ -289,8 +294,15 @@ struct kvm_mmu {
u64 *pae_root;
u64 *lm_root;
u64 rsvd_bits_mask[2][4];
u64 bad_mt_xwr;
/*
* check zero bits on shadow page table entries, these
* bits include not only hardware reserved bits but also
* the bits spte never used.
*/
struct rsvd_bits_validate shadow_zero_check;
struct rsvd_bits_validate guest_rsvd_check;
/*
* Bitmap: bit set = last pte in walk
......@@ -358,6 +370,11 @@ struct kvm_mtrr {
struct list_head head;
};
/* Hyper-V per vcpu emulation context */
struct kvm_vcpu_hv {
u64 hv_vapic;
};
struct kvm_vcpu_arch {
/*
* rip and regs accesses must go through
......@@ -514,8 +531,7 @@ struct kvm_vcpu_arch {
/* used for guest single stepping over the given code position */
unsigned long singlestep_rip;
/* fields used by HYPER-V emulation */
u64 hv_vapic;
struct kvm_vcpu_hv hyperv;
cpumask_var_t wbinvd_dirty_mask;
......@@ -586,6 +602,17 @@ struct kvm_apic_map {
struct kvm_lapic *logical_map[16][16];
};
/* Hyper-V emulation context */
struct kvm_hv {
u64 hv_guest_os_id;
u64 hv_hypercall;
u64 hv_tsc_page;
/* Hyper-v based guest crash (NT kernel bugcheck) parameters */
u64 hv_crash_param[HV_X64_MSR_CRASH_PARAMS];
u64 hv_crash_ctl;
};
struct kvm_arch {
unsigned int n_used_mmu_pages;
unsigned int n_requested_mmu_pages;
......@@ -645,16 +672,14 @@ struct kvm_arch {
/* reads protected by irq_srcu, writes by irq_lock */
struct hlist_head mask_notifier_list;
/* fields used by HYPER-V emulation */
u64 hv_guest_os_id;
u64 hv_hypercall;
u64 hv_tsc_page;
struct kvm_hv hyperv;
#ifdef CONFIG_KVM_MMU_AUDIT
int audit_point;
#endif
bool boot_vcpu_runs_old_kvmclock;
u32 bsp_vcpu_id;
u64 disabled_quirks;
};
......@@ -1203,5 +1228,7 @@ int __x86_set_memory_region(struct kvm *kvm,
const struct kvm_userspace_memory_region *mem);
int x86_set_memory_region(struct kvm *kvm,
const struct kvm_userspace_memory_region *mem);
bool kvm_vcpu_is_reset_bsp(struct kvm_vcpu *vcpu);
bool kvm_vcpu_is_bsp(struct kvm_vcpu *vcpu);
#endif /* _ASM_X86_KVM_HOST_H */
......@@ -47,6 +47,7 @@
#define CPU_BASED_MOV_DR_EXITING 0x00800000
#define CPU_BASED_UNCOND_IO_EXITING 0x01000000
#define CPU_BASED_USE_IO_BITMAPS 0x02000000
#define CPU_BASED_MONITOR_TRAP_FLAG 0x08000000
#define CPU_BASED_USE_MSR_BITMAPS 0x10000000
#define CPU_BASED_MONITOR_EXITING 0x20000000
#define CPU_BASED_PAUSE_EXITING 0x40000000
......@@ -367,29 +368,29 @@ enum vmcs_field {
#define TYPE_PHYSICAL_APIC_EVENT (10 << 12)
#define TYPE_PHYSICAL_APIC_INST (15 << 12)
/* segment AR */
#define SEGMENT_AR_L_MASK (1 << 13)
#define AR_TYPE_ACCESSES_MASK 1
#define AR_TYPE_READABLE_MASK (1 << 1)
#define AR_TYPE_WRITEABLE_MASK (1 << 2)
#define AR_TYPE_CODE_MASK (1 << 3)
#define AR_TYPE_MASK 0x0f
#define AR_TYPE_BUSY_64_TSS 11
#define AR_TYPE_BUSY_32_TSS 11
#define AR_TYPE_BUSY_16_TSS 3
#define AR_TYPE_LDT 2
#define AR_UNUSABLE_MASK (1 << 16)
#define AR_S_MASK (1 << 4)
#define AR_P_MASK (1 << 7)
#define AR_L_MASK (1 << 13)
#define AR_DB_MASK (1 << 14)
#define AR_G_MASK (1 << 15)
#define AR_DPL_SHIFT 5
#define AR_DPL(ar) (((ar) >> AR_DPL_SHIFT) & 3)
#define AR_RESERVD_MASK 0xfffe0f00
/* segment AR in VMCS -- these are different from what LAR reports */
#define VMX_SEGMENT_AR_L_MASK (1 << 13)
#define VMX_AR_TYPE_ACCESSES_MASK 1
#define VMX_AR_TYPE_READABLE_MASK (1 << 1)
#define VMX_AR_TYPE_WRITEABLE_MASK (1 << 2)
#define VMX_AR_TYPE_CODE_MASK (1 << 3)
#define VMX_AR_TYPE_MASK 0x0f
#define VMX_AR_TYPE_BUSY_64_TSS 11
#define VMX_AR_TYPE_BUSY_32_TSS 11
#define VMX_AR_TYPE_BUSY_16_TSS 3
#define VMX_AR_TYPE_LDT 2
#define VMX_AR_UNUSABLE_MASK (1 << 16)
#define VMX_AR_S_MASK (1 << 4)
#define VMX_AR_P_MASK (1 << 7)
#define VMX_AR_L_MASK (1 << 13)
#define VMX_AR_DB_MASK (1 << 14)
#define VMX_AR_G_MASK (1 << 15)
#define VMX_AR_DPL_SHIFT 5
#define VMX_AR_DPL(ar) (((ar) >> VMX_AR_DPL_SHIFT) & 3)
#define VMX_AR_RESERVD_MASK 0xfffe0f00
#define TSS_PRIVATE_MEMSLOT (KVM_USER_MEM_SLOTS + 0)
#define APIC_ACCESS_PAGE_PRIVATE_MEMSLOT (KVM_USER_MEM_SLOTS + 1)
......
......@@ -58,6 +58,7 @@
#define EXIT_REASON_INVALID_STATE 33
#define EXIT_REASON_MSR_LOAD_FAIL 34
#define EXIT_REASON_MWAIT_INSTRUCTION 36
#define EXIT_REASON_MONITOR_TRAP_FLAG 37
#define EXIT_REASON_MONITOR_INSTRUCTION 39
#define EXIT_REASON_PAUSE_INSTRUCTION 40
#define EXIT_REASON_MCE_DURING_VMENTRY 41
......@@ -106,6 +107,7 @@
{ EXIT_REASON_MSR_READ, "MSR_READ" }, \
{ EXIT_REASON_MSR_WRITE, "MSR_WRITE" }, \
{ EXIT_REASON_MWAIT_INSTRUCTION, "MWAIT_INSTRUCTION" }, \
{ EXIT_REASON_MONITOR_TRAP_FLAG, "MONITOR_TRAP_FLAG" }, \
{ EXIT_REASON_MONITOR_INSTRUCTION, "MONITOR_INSTRUCTION" }, \
{ EXIT_REASON_PAUSE_INSTRUCTION, "PAUSE_INSTRUCTION" }, \
{ EXIT_REASON_MCE_DURING_VMENTRY, "MCE_DURING_VMENTRY" }, \
......
......@@ -12,7 +12,9 @@ kvm-y += $(KVM)/kvm_main.o $(KVM)/coalesced_mmio.o \
kvm-$(CONFIG_KVM_ASYNC_PF) += $(KVM)/async_pf.o
kvm-y += x86.o mmu.o emulate.o i8259.o irq.o lapic.o \
i8254.o ioapic.o irq_comm.o cpuid.o pmu.o mtrr.o
i8254.o ioapic.o irq_comm.o cpuid.o pmu.o mtrr.o \
hyperv.o
kvm-$(CONFIG_KVM_DEVICE_ASSIGNMENT) += assigned-dev.o iommu.o
kvm-intel-y += vmx.o pmu_intel.o
kvm-amd-y += svm.o pmu_amd.o
......
/*
* KVM Microsoft Hyper-V emulation
*
* derived from arch/x86/kvm/x86.c
*
* Copyright (C) 2006 Qumranet, Inc.
* Copyright (C) 2008 Qumranet, Inc.
* Copyright IBM Corporation, 2008
* Copyright 2010 Red Hat, Inc. and/or its affiliates.
* Copyright (C) 2015 Andrey Smetanin <asmetanin@virtuozzo.com>
*
* Authors:
* Avi Kivity <avi@qumranet.com>
* Yaniv Kamay <yaniv@qumranet.com>
* Amit Shah <amit.shah@qumranet.com>
* Ben-Ami Yassour <benami@il.ibm.com>
* Andrey Smetanin <asmetanin@virtuozzo.com>
*
* This work is licensed under the terms of the GNU GPL, version 2. See
* the COPYING file in the top-level directory.
*
*/
#include "x86.h"
#include "lapic.h"
#include "hyperv.h"
#include <linux/kvm_host.h>
#include <trace/events/kvm.h>
#include "trace.h"
static bool kvm_hv_msr_partition_wide(u32 msr)
{
bool r = false;
switch (msr) {
case HV_X64_MSR_GUEST_OS_ID:
case HV_X64_MSR_HYPERCALL:
case HV_X64_MSR_REFERENCE_TSC:
case HV_X64_MSR_TIME_REF_COUNT:
case HV_X64_MSR_CRASH_CTL:
case HV_X64_MSR_CRASH_P0 ... HV_X64_MSR_CRASH_P4:
r = true;
break;
}
return r;
}
static int kvm_hv_msr_get_crash_data(struct kvm_vcpu *vcpu,
u32 index, u64 *pdata)
{
struct kvm_hv *hv = &vcpu->kvm->arch.hyperv;
if (WARN_ON_ONCE(index >= ARRAY_SIZE(hv->hv_crash_param)))
return -EINVAL;
*pdata = hv->hv_crash_param[index];
return 0;
}
static int kvm_hv_msr_get_crash_ctl(struct kvm_vcpu *vcpu, u64 *pdata)
{
struct kvm_hv *hv = &vcpu->kvm->arch.hyperv;
*pdata = hv->hv_crash_ctl;
return 0;
}
static int kvm_hv_msr_set_crash_ctl(struct kvm_vcpu *vcpu, u64 data, bool host)
{
struct kvm_hv *hv = &vcpu->kvm->arch.hyperv;
if (host)
hv->hv_crash_ctl = data & HV_X64_MSR_CRASH_CTL_NOTIFY;
if (!host && (data & HV_X64_MSR_CRASH_CTL_NOTIFY)) {
vcpu_debug(vcpu, "hv crash (0x%llx 0x%llx 0x%llx 0x%llx 0x%llx)\n",
hv->hv_crash_param[0],
hv->hv_crash_param[1],
hv->hv_crash_param[2],
hv->hv_crash_param[3],
hv->hv_crash_param[4]);
/* Send notification about crash to user space */
kvm_make_request(KVM_REQ_HV_CRASH, vcpu);
}
return 0;
}
static int kvm_hv_msr_set_crash_data(struct kvm_vcpu *vcpu,
u32 index, u64 data)
{
struct kvm_hv *hv = &vcpu->kvm->arch.hyperv;
if (WARN_ON_ONCE(index >= ARRAY_SIZE(hv->hv_crash_param)))
return -EINVAL;
hv->hv_crash_param[index] = data;
return 0;
}
static int kvm_hv_set_msr_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data,
bool host)
{
struct kvm *kvm = vcpu->kvm;
struct kvm_hv *hv = &kvm->arch.hyperv;
switch (msr) {
case HV_X64_MSR_GUEST_OS_ID:
hv->hv_guest_os_id = data;
/* setting guest os id to zero disables hypercall page */
if (!hv->hv_guest_os_id)
hv->hv_hypercall &= ~HV_X64_MSR_HYPERCALL_ENABLE;
break;
case HV_X64_MSR_HYPERCALL: {
u64 gfn;
unsigned long addr;
u8 instructions[4];
/* if guest os id is not set hypercall should remain disabled */
if (!hv->hv_guest_os_id)
break;
if (!(data & HV_X64_MSR_HYPERCALL_ENABLE)) {
hv->hv_hypercall = data;
break;
}
gfn = data >> HV_X64_MSR_HYPERCALL_PAGE_ADDRESS_SHIFT;
addr = gfn_to_hva(kvm, gfn);
if (kvm_is_error_hva(addr))
return 1;
kvm_x86_ops->patch_hypercall(vcpu, instructions);
((unsigned char *)instructions)[3] = 0xc3; /* ret */
if (__copy_to_user((void __user *)addr, instructions, 4))
return 1;
hv->hv_hypercall = data;
mark_page_dirty(kvm, gfn);
break;
}
case HV_X64_MSR_REFERENCE_TSC: {
u64 gfn;
HV_REFERENCE_TSC_PAGE tsc_ref;
memset(&tsc_ref, 0, sizeof(tsc_ref));
hv->hv_tsc_page = data;
if (!(data & HV_X64_MSR_TSC_REFERENCE_ENABLE))
break;
gfn = data >> HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT;
if (kvm_write_guest(
kvm,
gfn << HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT,
&tsc_ref, sizeof(tsc_ref)))
return 1;
mark_page_dirty(kvm, gfn);
break;
}
case HV_X64_MSR_CRASH_P0 ... HV_X64_MSR_CRASH_P4:
return kvm_hv_msr_set_crash_data(vcpu,
msr - HV_X64_MSR_CRASH_P0,
data);
case HV_X64_MSR_CRASH_CTL:
return kvm_hv_msr_set_crash_ctl(vcpu, data, host);
default:
vcpu_unimpl(vcpu, "Hyper-V uhandled wrmsr: 0x%x data 0x%llx\n",
msr, data);
return 1;
}
return 0;
}
static int kvm_hv_set_msr(struct kvm_vcpu *vcpu, u32 msr, u64 data)
{
struct kvm_vcpu_hv *hv = &vcpu->arch.hyperv;
switch (msr) {
case HV_X64_MSR_APIC_ASSIST_PAGE: {
u64 gfn;
unsigned long addr;
if (!(data & HV_X64_MSR_APIC_ASSIST_PAGE_ENABLE)) {
hv->hv_vapic = data;
if (kvm_lapic_enable_pv_eoi(vcpu, 0))
return 1;
break;
}
gfn = data >> HV_X64_MSR_APIC_ASSIST_PAGE_ADDRESS_SHIFT;
addr = kvm_vcpu_gfn_to_hva(vcpu, gfn);
if (kvm_is_error_hva(addr))
return 1;
if (__clear_user((void __user *)addr, PAGE_SIZE))
return 1;
hv->hv_vapic = data;
kvm_vcpu_mark_page_dirty(vcpu, gfn);
if (kvm_lapic_enable_pv_eoi(vcpu,
gfn_to_gpa(gfn) | KVM_MSR_ENABLED))
return 1;
break;
}
case HV_X64_MSR_EOI:
return kvm_hv_vapic_msr_write(vcpu, APIC_EOI, data);
case HV_X64_MSR_ICR:
return kvm_hv_vapic_msr_write(vcpu, APIC_ICR, data);
case HV_X64_MSR_TPR:
return kvm_hv_vapic_msr_write(vcpu, APIC_TASKPRI, data);
default:
vcpu_unimpl(vcpu, "Hyper-V uhandled wrmsr: 0x%x data 0x%llx\n",
msr, data);
return 1;
}
return 0;
}
static int kvm_hv_get_msr_pw(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata)
{
u64 data = 0;
struct kvm *kvm = vcpu->kvm;
struct kvm_hv *hv = &kvm->arch.hyperv;
switch (msr) {
case HV_X64_MSR_GUEST_OS_ID:
data = hv->hv_guest_os_id;
break;
case HV_X64_MSR_HYPERCALL:
data = hv->hv_hypercall;
break;
case HV_X64_MSR_TIME_REF_COUNT: {
data =
div_u64(get_kernel_ns() + kvm->arch.kvmclock_offset, 100);
break;
}
case HV_X64_MSR_REFERENCE_TSC:
data = hv->hv_tsc_page;
break;
case HV_X64_MSR_CRASH_P0 ... HV_X64_MSR_CRASH_P4:
return kvm_hv_msr_get_crash_data(vcpu,
msr - HV_X64_MSR_CRASH_P0,
pdata);
case HV_X64_MSR_CRASH_CTL:
return kvm_hv_msr_get_crash_ctl(vcpu, pdata);
default:
vcpu_unimpl(vcpu, "Hyper-V unhandled rdmsr: 0x%x\n", msr);
return 1;
}
*pdata = data;
return 0;
}
static int kvm_hv_get_msr(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata)
{
u64 data = 0;
struct kvm_vcpu_hv *hv = &vcpu->arch.hyperv;
switch (msr) {
case HV_X64_MSR_VP_INDEX: {
int r;
struct kvm_vcpu *v;
kvm_for_each_vcpu(r, v, vcpu->kvm) {
if (v == vcpu) {
data = r;
break;
}
}
break;
}
case HV_X64_MSR_EOI:
return kvm_hv_vapic_msr_read(vcpu, APIC_EOI, pdata);
case HV_X64_MSR_ICR:
return kvm_hv_vapic_msr_read(vcpu, APIC_ICR, pdata);
case HV_X64_MSR_TPR:
return kvm_hv_vapic_msr_read(vcpu, APIC_TASKPRI, pdata);
case HV_X64_MSR_APIC_ASSIST_PAGE:
data = hv->hv_vapic;
break;
default:
vcpu_unimpl(vcpu, "Hyper-V unhandled rdmsr: 0x%x\n", msr);
return 1;
}
*pdata = data;
return 0;
}
int kvm_hv_set_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 data, bool host)
{
if (kvm_hv_msr_partition_wide(msr)) {
int r;
mutex_lock(&vcpu->kvm->lock);
r = kvm_hv_set_msr_pw(vcpu, msr, data, host);
mutex_unlock(&vcpu->kvm->lock);
return r;
} else
return kvm_hv_set_msr(vcpu, msr, data);
}
int kvm_hv_get_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata)
{
if (kvm_hv_msr_partition_wide(msr)) {
int r;
mutex_lock(&vcpu->kvm->lock);
r = kvm_hv_get_msr_pw(vcpu, msr, pdata);
mutex_unlock(&vcpu->kvm->lock);
return r;
} else
return kvm_hv_get_msr(vcpu, msr, pdata);
}
bool kvm_hv_hypercall_enabled(struct kvm *kvm)
{
return kvm->arch.hyperv.hv_hypercall & HV_X64_MSR_HYPERCALL_ENABLE;
}
int kvm_hv_hypercall(struct kvm_vcpu *vcpu)
{
u64 param, ingpa, outgpa, ret;
uint16_t code, rep_idx, rep_cnt, res = HV_STATUS_SUCCESS, rep_done = 0;
bool fast, longmode;
/*
* hypercall generates UD from non zero cpl and real mode
* per HYPER-V spec
*/
if (kvm_x86_ops->get_cpl(vcpu) != 0 || !is_protmode(vcpu)) {
kvm_queue_exception(vcpu, UD_VECTOR);
return 0;
}
longmode = is_64_bit_mode(vcpu);
if (!longmode) {
param = ((u64)kvm_register_read(vcpu, VCPU_REGS_RDX) << 32) |
(kvm_register_read(vcpu, VCPU_REGS_RAX) & 0xffffffff);
ingpa = ((u64)kvm_register_read(vcpu, VCPU_REGS_RBX) << 32) |
(kvm_register_read(vcpu, VCPU_REGS_RCX) & 0xffffffff);
outgpa = ((u64)kvm_register_read(vcpu, VCPU_REGS_RDI) << 32) |
(kvm_register_read(vcpu, VCPU_REGS_RSI) & 0xffffffff);
}
#ifdef CONFIG_X86_64
else {
param = kvm_register_read(vcpu, VCPU_REGS_RCX);
ingpa = kvm_register_read(vcpu, VCPU_REGS_RDX);
outgpa = kvm_register_read(vcpu, VCPU_REGS_R8);
}
#endif
code = param & 0xffff;
fast = (param >> 16) & 0x1;
rep_cnt = (param >> 32) & 0xfff;
rep_idx = (param >> 48) & 0xfff;
trace_kvm_hv_hypercall(code, fast, rep_cnt, rep_idx, ingpa, outgpa);
switch (code) {
case HV_X64_HV_NOTIFY_LONG_SPIN_WAIT:
kvm_vcpu_on_spin(vcpu);
break;
default:
res = HV_STATUS_INVALID_HYPERCALL_CODE;
break;
}
ret = res | (((u64)rep_done & 0xfff) << 32);
if (longmode) {
kvm_register_write(vcpu, VCPU_REGS_RAX, ret);
} else {
kvm_register_write(vcpu, VCPU_REGS_RDX, ret >> 32);
kvm_register_write(vcpu, VCPU_REGS_RAX, ret & 0xffffffff);
}
return 1;
}
/*
* KVM Microsoft Hyper-V emulation
*
* derived from arch/x86/kvm/x86.c
*
* Copyright (C) 2006 Qumranet, Inc.
* Copyright (C) 2008 Qumranet, Inc.
* Copyright IBM Corporation, 2008
* Copyright 2010 Red Hat, Inc. and/or its affiliates.
* Copyright (C) 2015 Andrey Smetanin <asmetanin@virtuozzo.com>
*
* Authors:
* Avi Kivity <avi@qumranet.com>
* Yaniv Kamay <yaniv@qumranet.com>
* Amit Shah <amit.shah@qumranet.com>
* Ben-Ami Yassour <benami@il.ibm.com>
* Andrey Smetanin <asmetanin@virtuozzo.com>
*
* This work is licensed under the terms of the GNU GPL, version 2. See
* the COPYING file in the top-level directory.
*
*/
#ifndef __ARCH_X86_KVM_HYPERV_H__
#define __ARCH_X86_KVM_HYPERV_H__
int kvm_hv_set_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 data, bool host);
int kvm_hv_get_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata);
bool kvm_hv_hypercall_enabled(struct kvm *kvm);
int kvm_hv_hypercall(struct kvm_vcpu *vcpu);
#endif
......@@ -651,15 +651,10 @@ struct kvm_pic *kvm_create_pic(struct kvm *kvm)
return NULL;
}
void kvm_destroy_pic(struct kvm *kvm)
void kvm_destroy_pic(struct kvm_pic *vpic)
{
struct kvm_pic *vpic = kvm->arch.vpic;
if (vpic) {
kvm_io_bus_unregister_dev(kvm, KVM_PIO_BUS, &vpic->dev_master);
kvm_io_bus_unregister_dev(kvm, KVM_PIO_BUS, &vpic->dev_slave);
kvm_io_bus_unregister_dev(kvm, KVM_PIO_BUS, &vpic->dev_eclr);
kvm->arch.vpic = NULL;
kfree(vpic);
}
kvm_io_bus_unregister_dev(vpic->kvm, KVM_PIO_BUS, &vpic->dev_master);
kvm_io_bus_unregister_dev(vpic->kvm, KVM_PIO_BUS, &vpic->dev_slave);
kvm_io_bus_unregister_dev(vpic->kvm, KVM_PIO_BUS, &vpic->dev_eclr);
kfree(vpic);
}
......@@ -74,7 +74,7 @@ struct kvm_pic {
};
struct kvm_pic *kvm_create_pic(struct kvm *kvm);
void kvm_destroy_pic(struct kvm *kvm);
void kvm_destroy_pic(struct kvm_pic *vpic);
int kvm_pic_read_irq(struct kvm *kvm);
void kvm_pic_update_irq(struct kvm_pic *s);
......@@ -85,11 +85,11 @@ static inline struct kvm_pic *pic_irqchip(struct kvm *kvm)
static inline int irqchip_in_kernel(struct kvm *kvm)
{
int ret;
struct kvm_pic *vpic = pic_irqchip(kvm);
ret = (pic_irqchip(kvm) != NULL);
/* Read vpic before kvm->irq_routing. */
smp_rmb();
return ret;
return vpic != NULL;
}
void kvm_pic_reset(struct kvm_kpic_state *s);
......
......@@ -1900,8 +1900,9 @@ void kvm_lapic_sync_from_vapic(struct kvm_vcpu *vcpu)
if (!test_bit(KVM_APIC_CHECK_VAPIC, &vcpu->arch.apic_attention))
return;
kvm_read_guest_cached(vcpu->kvm, &vcpu->arch.apic->vapic_cache, &data,
sizeof(u32));
if (kvm_read_guest_cached(vcpu->kvm, &vcpu->arch.apic->vapic_cache, &data,
sizeof(u32)))
return;
apic_set_tpr(vcpu->arch.apic, data & 0xff);
}
......
......@@ -91,7 +91,7 @@ int kvm_hv_vapic_msr_read(struct kvm_vcpu *vcpu, u32 msr, u64 *data);
static inline bool kvm_hv_vapic_assist_page_enabled(struct kvm_vcpu *vcpu)
{
return vcpu->arch.hv_vapic & HV_X64_MSR_APIC_ASSIST_PAGE_ENABLE;
return vcpu->arch.hyperv.hv_vapic & HV_X64_MSR_APIC_ASSIST_PAGE_ENABLE;
}
int kvm_lapic_enable_pv_eoi(struct kvm_vcpu *vcpu, u64 data);
......
This diff is collapsed.
......@@ -50,9 +50,11 @@ static inline u64 rsvd_bits(int s, int e)
return ((1ULL << (e - s + 1)) - 1) << s;
}
int kvm_mmu_get_spte_hierarchy(struct kvm_vcpu *vcpu, u64 addr, u64 sptes[4]);
void kvm_mmu_set_mmio_spte_mask(u64 mmio_mask);
void
reset_shadow_zero_bits_mask(struct kvm_vcpu *vcpu, struct kvm_mmu *context);
/*
* Return values of handle_mmio_page_fault_common:
* RET_MMIO_PF_EMULATE: it is a real mmio page fault, emulate the instruction
......
......@@ -128,14 +128,6 @@ static inline void FNAME(protect_clean_gpte)(unsigned *access, unsigned gpte)
*access &= mask;
}
static bool FNAME(is_rsvd_bits_set)(struct kvm_mmu *mmu, u64 gpte, int level)
{
int bit7 = (gpte >> 7) & 1, low6 = gpte & 0x3f;
return (gpte & mmu->rsvd_bits_mask[bit7][level-1]) |
((mmu->bad_mt_xwr & (1ull << low6)) != 0);
}
static inline int FNAME(is_present_gpte)(unsigned long pte)
{
#if PTTYPE != PTTYPE_EPT
......@@ -172,7 +164,7 @@ static bool FNAME(prefetch_invalid_gpte)(struct kvm_vcpu *vcpu,
struct kvm_mmu_page *sp, u64 *spte,
u64 gpte)
{
if (FNAME(is_rsvd_bits_set)(&vcpu->arch.mmu, gpte, PT_PAGE_TABLE_LEVEL))
if (is_rsvd_bits_set(&vcpu->arch.mmu, gpte, PT_PAGE_TABLE_LEVEL))
goto no_present;
if (!FNAME(is_present_gpte)(gpte))
......@@ -353,8 +345,7 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker,
if (unlikely(!FNAME(is_present_gpte)(pte)))
goto error;
if (unlikely(FNAME(is_rsvd_bits_set)(mmu, pte,
walker->level))) {
if (unlikely(is_rsvd_bits_set(mmu, pte, walker->level))) {
errcode |= PFERR_RSVD_MASK | PFERR_PRESENT_MASK;
goto error;
}
......
......@@ -133,8 +133,6 @@ static int amd_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
/* MSR_K7_PERFCTRn */
pmc = get_gp_pmc(pmu, msr, MSR_K7_PERFCTR0);
if (pmc) {
if (!msr_info->host_initiated)
data = (s64)data;
pmc->counter += data - pmc_read_counter(pmc);
return 0;
}
......
......@@ -1173,6 +1173,10 @@ static u64 svm_get_mt_mask(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio)
if (!is_mmio && !kvm_arch_has_assigned_device(vcpu->kvm))
return 0;
if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_CD_NW_CLEARED) &&
kvm_read_cr0(vcpu) & X86_CR0_CD)
return _PAGE_NOCACHE;
mtrr = kvm_mtrr_get_guest_memory_type(vcpu, gfn);
return mtrr2protval[mtrr];
}
......@@ -1667,13 +1671,10 @@ static void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
if (!vcpu->fpu_active)
cr0 |= X86_CR0_TS;
/*
* re-enable caching here because the QEMU bios
* does not do it - this results in some delay at
* reboot
*/
if (kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_CD_NW_CLEARED))
cr0 &= ~(X86_CR0_CD | X86_CR0_NW);
/* These are emulated via page tables. */
cr0 &= ~(X86_CR0_CD | X86_CR0_NW);
svm->vmcb->save.cr0 = cr0;
mark_dirty(svm->vmcb, VMCB_CR);
update_cr0_intercept(svm);
......@@ -2106,6 +2107,7 @@ static void nested_svm_init_mmu_context(struct kvm_vcpu *vcpu)
vcpu->arch.mmu.get_pdptr = nested_svm_get_tdp_pdptr;
vcpu->arch.mmu.inject_page_fault = nested_svm_inject_npf_exit;
vcpu->arch.mmu.shadow_root_level = get_npt_level();
reset_shadow_zero_bits_mask(vcpu, &vcpu->arch.mmu);
vcpu->arch.walk_mmu = &vcpu->arch.nested_mmu;
}
......
This diff is collapsed.
This diff is collapsed.
......@@ -147,6 +147,11 @@ static inline void kvm_register_writel(struct kvm_vcpu *vcpu,
return kvm_register_write(vcpu, reg, val);
}
static inline u64 get_kernel_ns(void)
{
return ktime_get_boot_ns();
}
static inline bool kvm_check_has_quirk(struct kvm *kvm, u64 quirk)
{
return !(kvm->arch.disabled_quirks & quirk);
......
......@@ -139,6 +139,7 @@ static inline bool is_error_page(struct page *page)
#define KVM_REQ_DISABLE_IBS 24
#define KVM_REQ_APIC_PAGE_RELOAD 25
#define KVM_REQ_SMI 26
#define KVM_REQ_HV_CRASH 27
#define KVM_USERSPACE_IRQ_SOURCE_ID 0
#define KVM_IRQFD_RESAMPLE_IRQ_SOURCE_ID 1
......@@ -363,9 +364,6 @@ struct kvm {
struct kvm_memslots *memslots[KVM_ADDRESS_SPACE_NUM];
struct srcu_struct srcu;
struct srcu_struct irq_srcu;
#ifdef CONFIG_KVM_APIC_ARCHITECTURE
u32 bsp_vcpu_id;
#endif
struct kvm_vcpu *vcpus[KVM_MAX_VCPUS];
atomic_t online_vcpus;
int last_boosted_vcpu;
......@@ -424,8 +422,15 @@ struct kvm {
#define vcpu_unimpl(vcpu, fmt, ...) \
kvm_pr_unimpl("vcpu%i " fmt, (vcpu)->vcpu_id, ## __VA_ARGS__)
#define vcpu_debug(vcpu, fmt, ...) \
kvm_debug("vcpu%i " fmt, (vcpu)->vcpu_id, ## __VA_ARGS__)
static inline struct kvm_vcpu *kvm_get_vcpu(struct kvm *kvm, int i)
{
/* Pairs with smp_wmb() in kvm_vm_ioctl_create_vcpu, in case
* the caller has read kvm->online_vcpus before (as is the case
* for kvm_for_each_vcpu, for example).
*/
smp_rmb();
return kvm->vcpus[i];
}
......@@ -1055,22 +1060,9 @@ static inline int kvm_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args)
#endif /* CONFIG_HAVE_KVM_EVENTFD */
#ifdef CONFIG_KVM_APIC_ARCHITECTURE
static inline bool kvm_vcpu_is_reset_bsp(struct kvm_vcpu *vcpu)
{
return vcpu->kvm->bsp_vcpu_id == vcpu->vcpu_id;
}
static inline bool kvm_vcpu_is_bsp(struct kvm_vcpu *vcpu)
{
return (vcpu->arch.apic_base & MSR_IA32_APICBASE_BSP) != 0;
}
bool kvm_vcpu_compatible(struct kvm_vcpu *vcpu);
#else
static inline bool kvm_vcpu_compatible(struct kvm_vcpu *vcpu) { return true; }
#endif
static inline void kvm_make_request(int req, struct kvm_vcpu *vcpu)
......
......@@ -317,6 +317,7 @@ struct kvm_run {
struct {
#define KVM_SYSTEM_EVENT_SHUTDOWN 1
#define KVM_SYSTEM_EVENT_RESET 2
#define KVM_SYSTEM_EVENT_CRASH 3
__u32 type;
__u64 flags;
} system_event;
......@@ -481,6 +482,7 @@ struct kvm_s390_psw {
((ai) << 26))
#define KVM_S390_INT_IO_MIN 0x00000000u
#define KVM_S390_INT_IO_MAX 0xfffdffffu
#define KVM_S390_INT_IO_AI_MASK 0x04000000u
struct kvm_s390_interrupt {
......
......@@ -2206,6 +2206,11 @@ static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, u32 id)
}
kvm->vcpus[atomic_read(&kvm->online_vcpus)] = vcpu;
/*
* Pairs with smp_rmb() in kvm_get_vcpu. Write kvm->vcpus
* before kvm->online_vcpu's incremented value.
*/
smp_wmb();
atomic_inc(&kvm->online_vcpus);
......@@ -2618,9 +2623,6 @@ static long kvm_vm_ioctl_check_extension_generic(struct kvm *kvm, long arg)
case KVM_CAP_USER_MEMORY:
case KVM_CAP_DESTROY_MEMORY_REGION_WORKS:
case KVM_CAP_JOIN_MEMORY_REGIONS_WORKS:
#ifdef CONFIG_KVM_APIC_ARCHITECTURE
case KVM_CAP_SET_BOOT_CPU_ID:
#endif
case KVM_CAP_INTERNAL_ERROR_DATA:
#ifdef CONFIG_HAVE_KVM_MSI
case KVM_CAP_SIGNAL_MSI:
......@@ -2716,17 +2718,6 @@ static long kvm_vm_ioctl(struct file *filp,
r = kvm_ioeventfd(kvm, &data);
break;
}
#ifdef CONFIG_KVM_APIC_ARCHITECTURE
case KVM_SET_BOOT_CPU_ID:
r = 0;
mutex_lock(&kvm->lock);
if (atomic_read(&kvm->online_vcpus) != 0)
r = -EBUSY;
else
kvm->bsp_vcpu_id = arg;
mutex_unlock(&kvm->lock);
break;
#endif
#ifdef CONFIG_HAVE_KVM_MSI
case KVM_SIGNAL_MSI: {
struct kvm_msi msi;
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment