Commits · f95eec9bed76d42194c23153cb1cc8f186bf91cb · nexedi / linux

08 Jul, 2020 30 commits

KVM: x86/mmu: Don't put invalid SPs back on the list of active pages · f95eec9b

Sean Christopherson authored Jun 23, 2020

Delete a shadow page from the invalidation list instead of throwing it
back on the list of active pages when it's a root shadow page with
active users.  Invalid active root pages will be explicitly freed by
mmu_free_root_page() when the root_count hits zero, i.e. they don't need
to be put on the active list to avoid leakage.

Use sp->role.invalid to detect that a shadow page has already been
zapped, i.e. is not on a list.

WARN if an invalid page is encountered when zapping pages, as it should
now be impossible.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200623193542.7554-2-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

f95eec9b

KVM: x86/mmu: Optimize MMU page cache lookup for fully direct MMUs · fb58a9c3

Sean Christopherson authored Jun 23, 2020

Skip the unsync checks and the write flooding clearing for fully direct
MMUs, which are guaranteed to not have unsync'd or indirect pages (write
flooding detection only applies to indirect pages). For TDP, this
avoids unnecessary memory reads and writes, and for the write flooding
count will also avoid dirtying a cache line (unsync_child_bitmap itself
consumes a cache line, i.e. write_flooding_count is guaranteed to be in
a different cache line than parent_ptes).
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200623194027.23135-3-sean.j.christopherson@intel.com>
Reviewed-By: Jon Cargille <jcargill@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

fb58a9c3

KVM: x86/mmu: Avoid multiple hash lookups in kvm_get_mmu_page() · ac101b7c

Sean Christopherson authored Jun 23, 2020

Refactor for_each_valid_sp() to take the list of shadow pages instead of
retrieving it from a gfn to avoid doing the gfn->list hash and lookup
multiple times during kvm_get_mmu_page().

Cc: Peter Feiner <pfeiner@google.com>
Cc: Jon Cargille <jcargill@google.com>
Cc: Jim Mattson <jmattson@google.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200623194027.23135-2-sean.j.christopherson@intel.com>
Reviewed-By: Jon Cargille <jcargill@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

ac101b7c

KVM: x86: Use VMCALL and VMMCALL mnemonics in kvm_para.h · 4cb5b77e

Uros Bizjak authored Jun 23, 2020

Current minimum required version of binutils is 2.23,
which supports VMCALL and VMMCALL instruction mnemonics.

Replace the byte-wise specification of VMCALL and
VMMCALL with these proper mnemonics.
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20200623183439.5526-1-ubizjak@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

4cb5b77e

KVM: SVM: Rename svm_nested_virtualize_tpr() to nested_svm_virtualize_tpr() · 01c3b2b5

Joerg Roedel authored Jun 25, 2020

Match the naming with other nested svm functions.

No functional changes.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Message-Id: <20200625080325.28439-5-joro@8bytes.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

01c3b2b5

KVM: SVM: Add svm_ prefix to set/clr/is_intercept() · a284ba56

Joerg Roedel authored Jun 25, 2020

Make clear the symbols belong to the SVM code when they are built-in.

No functional changes.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Message-Id: <20200625080325.28439-4-joro@8bytes.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

a284ba56

KVM: SVM: Add vmcb_ prefix to mark_*() functions · 06e7852c

Joerg Roedel authored Jun 25, 2020

Make it more clear what data structure these functions operate on.

No functional changes.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Message-Id: <20200625080325.28439-3-joro@8bytes.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

06e7852c

KVM: SVM: Rename struct nested_state to svm_nested_state · 7693b3eb

Joerg Roedel authored Jun 25, 2020

Renaming is only needed in the svm.h header file.

No functional changes.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Message-Id: <20200625080325.28439-2-joro@8bytes.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

7693b3eb

KVM: nVMX: Wrap VM-Fail valid path in generic VM-Fail helper · b2656e4d

Sean Christopherson authored Jun 08, 2020

Add nested_vmx_fail() to wrap VM-Fail paths that _may_ result in VM-Fail
Valid to make it clear at the call sites that the Valid flavor isn't
guaranteed.
Suggested-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200609015607.6994-1-sean.j.christopherson@intel.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

b2656e4d

kvm: x86: Set last_vmentry_cpu in vcpu_enter_guest · c967118d

Jim Mattson authored Jun 03, 2020

Since this field is now in kvm_vcpu_arch, clean things up a little by
setting it in vendor-agnostic code: vcpu_enter_guest. Note that it
must be set after the call to kvm_x86_ops.run(), since it can't be
updated before pre_sev_run().
Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Oliver Upton <oupton@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
Message-Id: <20200603235623.245638-7-jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

c967118d

kvm: x86: Move last_cpu into kvm_vcpu_arch as last_vmentry_cpu · 8a14fe4f

Jim Mattson authored Jun 03, 2020

Both the vcpu_vmx structure and the vcpu_svm structure have a
'last_cpu' field. Move the common field into the kvm_vcpu_arch
structure. For clarity, rename it to 'last_vmentry_cpu.'
Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Oliver Upton <oupton@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
Message-Id: <20200603235623.245638-6-jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

8a14fe4f

kvm: x86: Add "last CPU" to some KVM_EXIT information · 1aa561b1

Jim Mattson authored Jun 03, 2020

More often than not, a failed VM-entry in an x86 production
environment is induced by a defective CPU. To help identify the bad
hardware, include the id of the last logical CPU to run a vCPU in the
information provided to userspace on a KVM exit for failed VM-entry or
for KVM internal errors not associated with emulation. The presence of
this additional information is indicated by a new capability,
KVM_CAP_LAST_CPU.
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Oliver Upton <oupton@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
Message-Id: <20200603235623.245638-5-jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

1aa561b1

kvm: vmx: Add last_cpu to struct vcpu_vmx · 80a1684c

Jim Mattson authored Jun 03, 2020

As we already do in svm, record the last logical processor on which a
vCPU has run, so that it can be communicated to userspace for
potential hardware errors.
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Oliver Upton <oupton@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
Message-Id: <20200603235623.245638-4-jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

80a1684c

kvm: svm: Always set svm->last_cpu on VMRUN · 24263634

Jim Mattson authored Jun 03, 2020

Previously, this field was only set when using SEV. Set it for all
vCPU configurations, so that it can be communicated to userspace for
diagnosing potential hardware errors.
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Oliver Upton <oupton@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
Message-Id: <20200603235623.245638-3-jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

24263634

kvm: svm: Prefer vcpu->cpu to raw_smp_processor_id() · 73cd6e5f

Jim Mattson authored Jun 03, 2020

The current logical processor id is cached in vcpu->cpu. Use it
instead of raw_smp_processor_id() when a kvm_vcpu struct is available.
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Oliver Upton <oupton@google.com>
Message-Id: <20200603235623.245638-2-jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

73cd6e5f

KVM: x86: report sev_pin_memory errors with PTR_ERR · a8d908b5

Paolo Bonzini authored Jun 23, 2020

Callers of sev_pin_memory() treat
NULL differently:

sev_launch_secret()/svm_register_enc_region() return -ENOMEM
sev_dbg_crypt() returns -EFAULT.

Switching to ERR_PTR() preserves the error and enables cleaner reporting of
different kinds of failures.
Suggested-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

a8d908b5

KVM: SVM: convert get_user_pages() --> pin_user_pages() · dc42c8ae

John Hubbard authored May 25, 2020

This code was using get_user_pages*(), in a "Case 2" scenario
(DMA/RDMA), using the categorization from [1]. That means that it's
time to convert the get_user_pages*() + put_page() calls to
pin_user_pages*() + unpin_user_pages() calls.

There is some helpful background in [2]: basically, this is a small
part of fixing a long-standing disconnect between pinning pages, and
file systems' use of those pages.

[1] Documentation/core-api/pin_user_pages.rst

[2] "Explicit pinning of user-space pages":
    https://lwn.net/Articles/807108/

Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Wanpeng Li <wanpengli@tencent.com>
Cc: Jim Mattson <jmattson@google.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: x86@kernel.org
Cc: kvm@vger.kernel.org
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Message-Id: <20200526062207.1360225-3-jhubbard@nvidia.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

dc42c8ae

KVM: SVM: fix svn_pin_memory()'s use of get_user_pages_fast() · 78824fab

John Hubbard authored May 25, 2020

There are two problems in svn_pin_memory():

1) The return value of get_user_pages_fast() is stored in an
unsigned long, although the declared return value is of type int.
This will not cause any symptoms, but it is misleading.
Fix this by changing the type of npinned to "int".

2) The number of pages passed into get_user_pages_fast() is stored
in an unsigned long, even though get_user_pages_fast() accepts an
int. This means that it is possible to silently overflow the number
of pages.

Fix this by adding a WARN_ON_ONCE() and an early error return. The
npages variable is left as an unsigned long for convenience in
checking for overflow.

Fixes: 89c50580 ("KVM: SVM: Add support for KVM_SEV_LAUNCH_UPDATE_DATA command")
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Wanpeng Li <wanpengli@tencent.com>
Cc: Jim Mattson <jmattson@google.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: x86@kernel.org
Cc: kvm@vger.kernel.org
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Message-Id: <20200526062207.1360225-2-jhubbard@nvidia.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

78824fab

KVM: nSVM: Check that DR6[63:32] and DR7[64:32] are not set on vmrun of nested guests · 1aef8161

Krish Sadhukhan authored May 22, 2020

According to section "Canonicalization and Consistency Checks" in APM vol. 2
the following guest state is illegal:

    "DR6[63:32] are not zero."
    "DR7[63:32] are not zero."
    "Any MBZ bit of EFER is set."
Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Message-Id: <20200522221954.32131-3-krish.sadhukhan@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

1aef8161

KVM: x86: Move the check for upper 32 reserved bits of DR6 to separate function · f5f6145e

Krish Sadhukhan authored May 22, 2020

Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Message-Id: <20200522221954.32131-2-krish.sadhukhan@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

f5f6145e

KVM: X86: Do the same ignore_msrs check for feature msrs · 12bc2132

Peter Xu authored Jun 22, 2020

Logically the ignore_msrs and report_ignored_msrs should also apply to feature
MSRs.  Add them in.
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20200622220442.21998-3-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

12bc2132

KVM: X86: Move ignore_msrs handling upper the stack · 6abe9c13

Peter Xu authored Jun 22, 2020

MSR accesses can be one of:

  (1) KVM internal access,
  (2) userspace access (e.g., via KVM_SET_MSRS ioctl),
  (3) guest access.

The ignore_msrs was previously handled by kvm_get_msr_common() and
kvm_set_msr_common(), which is the bottom of the msr access stack.  It's
working in most cases, however it could dump unwanted warning messages to dmesg
even if kvm get/set the msrs internally when calling __kvm_set_msr() or
__kvm_get_msr() (e.g. kvm_cpuid()).  Ideally we only want to trap cases (2)
or (3), but not (1) above.

To achieve this, move the ignore_msrs handling upper until the callers of
__kvm_get_msr() and __kvm_set_msr().  To identify the "msr missing" event, a
new return value (KVM_MSR_RET_INVALID==2) is used for that.
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20200622220442.21998-2-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

6abe9c13

KVM: x86/mmu: Make .write_log_dirty a nested operation · 02f5fb2e

Sean Christopherson authored Jun 22, 2020

Move .write_log_dirty() into kvm_x86_nested_ops to help differentiate it
from the non-nested dirty log hooks.  And because it's a nested-only
operation.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200622215832.22090-5-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

02f5fb2e

KVM: nVMX: WARN if PML emulation helper is invoked outside of nested guest · 2f1d48aa

Sean Christopherson authored Jun 22, 2020

WARN if vmx_write_pml_buffer() is called outside of guest mode instead
of silently ignoring the condition.  The only caller is nested EPT's
ept_update_accessed_dirty_bits(), which should only be reachable when
L2 is active.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200622215832.22090-4-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

2f1d48aa

KVM: x86/mmu: Drop kvm_arch_write_log_dirty() wrapper · f25a9dec

Sean Christopherson authored Jun 22, 2020

Drop kvm_arch_write_log_dirty() in favor of invoking .write_log_dirty()
directly from FNAME(update_accessed_dirty_bits). "kvm_arch" is usually
used for x86 functions that are invoked from generic KVM, and implies
that there are external callers, neither of which is true.

Remove the check for a non-NULL kvm_x86_ops hook as the call is wrapped
in PTTYPE_EPT and is unconditionally set by VMX.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200622215832.22090-3-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

f25a9dec

KVM: async_pf: change kvm_setup_async_pf()/kvm_arch_setup_async_pf() return type to bool · e8c22266

Vitaly Kuznetsov authored Jun 15, 2020

Unlike normal 'int' functions returning '0' on success, kvm_setup_async_pf()/
kvm_arch_setup_async_pf() return '1' when a job to handle page fault
asynchronously was scheduled and '0' otherwise. To avoid the confusion
change return type to 'bool'.

No functional change intended.
Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20200615121334.91300-1-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

e8c22266

KVM: x86: drop KVM_PV_REASON_PAGE_READY case from kvm_handle_page_fault() · 9ce372b3

Vitaly Kuznetsov authored May 07, 2020

KVM guest code in Linux enables APF only when KVM_FEATURE_ASYNC_PF_INT
is supported, this means we will never see KVM_PV_REASON_PAGE_READY
when handling page fault vmexit in KVM.

While on it, make sure we only follow genuine page fault path when
APF reason is zero. If we happen to see something else this means
that the underlying hypervisor is misbehaving. Leave WARN_ON_ONCE()
to catch that.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

9ce372b3

Merge branch 'kvm-master' into HEAD · 6c6165f8
Paolo Bonzini authored Jul 08, 2020
```
Merge 5.8-rc bugfixes.
```
6c6165f8
Merge branch 'kvm-async-pf-int' into HEAD · 26d05b36
Paolo Bonzini authored Jun 15, 2020

26d05b36

KVM: MIPS: fix spelling mistake "Exteneded" -> "Extended" · 0ed076c7

Colin Ian King authored Jun 15, 2020

There is a spelling mistake in a couple of kvm_err messages. Fix them.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Message-Id: <20200615082636.7004-1-colin.king@canonical.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

0ed076c7

06 Jul, 2020 3 commits

Merge tag 'kvmarm-fixes-5.8-3' of... · 8038a922

Paolo Bonzini authored Jul 06, 2020

Merge tag 'kvmarm-fixes-5.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master

KVM/arm fixes for 5.8, take #3

- Disable preemption on context-switching PMU EL0 state happening
  on system register trap
- Don't clobber X0 when tearing down KVM via a soft reset (kexec)

8038a922

KVM: arm64: Stop clobbering x0 for HVC_SOFT_RESTART · b9e10d4a

Andrew Scull authored Jul 06, 2020

HVC_SOFT_RESTART is given values for x0-2 that it should installed
before exiting to the new address so should not set x0 to stub HVC
success or failure code.

Fixes: af42f204 ("arm64: hyp-stub: Zero x0 on successful stub handling")
Cc: stable@vger.kernel.org
Signed-off-by: Andrew Scull <ascull@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200706095259.1338221-1-ascull@google.com

b9e10d4a

KVM: arm64: PMU: Fix per-CPU access in preemptible context · 146f76cc

Marc Zyngier authored Jul 04, 2020

Commit 07da1ffa ("KVM: arm64: Remove host_cpu_context
member from vcpu structure") has, by removing the host CPU
context pointer, exposed that kvm_vcpu_pmu_restore_guest
is called in preemptible contexts:

[  266.932442] BUG: using smp_processor_id() in preemptible [00000000] code: qemu-system-aar/779
[  266.939721] caller is debug_smp_processor_id+0x20/0x30
[  266.944157] CPU: 2 PID: 779 Comm: qemu-system-aar Tainted: G            E     5.8.0-rc3-00015-g8d4aa58b2fe3 #1374
[  266.954268] Hardware name: amlogic w400/w400, BIOS 2020.04 05/22/2020
[  266.960640] Call trace:
[  266.963064]  dump_backtrace+0x0/0x1e0
[  266.966679]  show_stack+0x20/0x30
[  266.969959]  dump_stack+0xe4/0x154
[  266.973338]  check_preemption_disabled+0xf8/0x108
[  266.977978]  debug_smp_processor_id+0x20/0x30
[  266.982307]  kvm_vcpu_pmu_restore_guest+0x2c/0x68
[  266.986949]  access_pmcr+0xf8/0x128
[  266.990399]  perform_access+0x8c/0x250
[  266.994108]  kvm_handle_sys_reg+0x10c/0x2f8
[  266.998247]  handle_exit+0x78/0x200
[  267.001697]  kvm_arch_vcpu_ioctl_run+0x2ac/0xab8

Note that the bug was always there, it is only the switch to
using percpu accessors that made it obvious.
The fix is to wrap these accesses in a preempt-disabled section,
so that we sample a coherent context on trap from the guest.

Fixes: 435e53fb ("arm64: KVM: Enable VHE support for :G/:H perf event modifiers")
Cc:: Andrew Murray <amurray@thegoodpenguin.co.uk>
Signed-off-by: Marc Zyngier <maz@kernel.org>

146f76cc

03 Jul, 2020 3 commits

KVM: VMX: Use KVM_POSSIBLE_CR*_GUEST_BITS to initialize guest/host masks · fa71e952

Sean Christopherson authored Jul 02, 2020

Use the "common" KVM_POSSIBLE_CR*_GUEST_BITS defines to initialize the
CR0/CR4 guest host masks instead of duplicating most of the CR4 mask and
open coding the CR0 mask. SVM doesn't utilize the masks, i.e. the masks
are effectively VMX specific even if they're not named as such. This
avoids duplicate code, better documents the guest owned CR0 bit, and
eliminates the need for a build-time assertion to keep VMX and x86
synchronized.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200703040422.31536-3-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

fa71e952

KVM: x86: Mark CR4.TSD as being possibly owned by the guest · 7c83d096

Sean Christopherson authored Jul 02, 2020

Mark CR4.TSD as being possibly owned by the guest as that is indeed the
case on VMX.  Without TSD being tagged as possibly owned by the guest, a
targeted read of CR4 to get TSD could observe a stale value.  This bug
is benign in the current code base as the sole consumer of TSD is the
emulator (for RDTSC) and the emulator always "reads" the entirety of CR4
when grabbing bits.

Add a build-time assertion in to ensure VMX doesn't hand over more CR4
bits without also updating x86.

Fixes: 52ce3c21 ("x86,kvm,vmx: Don't trap writes to CR4.TSD")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200703040422.31536-2-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

7c83d096

KVM: x86: Inject #GP if guest attempts to toggle CR4.LA57 in 64-bit mode · d74fcfc1

Sean Christopherson authored Jul 02, 2020

Inject a #GP on MOV CR4 if CR4.LA57 is toggled in 64-bit mode, which is
illegal per Intel's SDM:

  CR4.LA57
    57-bit linear addresses (bit 12 of CR4) ... blah blah blah ...
    This bit cannot be modified in IA-32e mode.

Note, the pseudocode for MOV CR doesn't call out the fault condition,
which is likely why the check was missed during initial development.
This is arguably an SDM bug and will hopefully be fixed in future
release of the SDM.

Fixes: fd8cb433 ("KVM: MMU: Expose the LA57 feature to VM.")
Cc: stable@vger.kernel.org
Reported-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200703021714.5549-1-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

d74fcfc1

02 Jul, 2020 1 commit

kvm: use more precise cast and do not drop __user · 1393b4aa

Paolo Bonzini authored Jul 02, 2020

Sparse complains on a call to get_compat_sigset, fix it.  The "if"
right above explains that sigmask_arg->sigset is basically a
compat_sigset_t.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

1393b4aa

01 Jul, 2020 1 commit

Merge tag 'kvmarm-fixes-5.8-2' of... · 6e1d72f1

Paolo Bonzini authored Jul 01, 2020

Merge tag 'kvmarm-fixes-5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master

KVM/arm fixes for 5.8, take #2

- Make sure a vcpu becoming non-resident doesn't race against the doorbell delivery
- Only advertise pvtime if accounting is enabled
- Return the correct error code if reset fails with SVE
- Make sure that pseudo-NMI functions are annotated as __always_inline

6e1d72f1

30 Jun, 2020 1 commit

KVM: x86: bit 8 of non-leaf PDPEs is not reserved · 5ecad245

Paolo Bonzini authored Jun 30, 2020

Bit 8 would be the "global" bit, which does not quite make sense for non-leaf
page table entries. Intel ignores it; AMD ignores it in PDEs and PDPEs, but
reserves it in PML4Es.

Probably, earlier versions of the AMD manual documented it as reserved in PDPEs
as well, and that behavior made it into KVM as well as kvm-unit-tests; fix it.

Cc: stable@vger.kernel.org
Reported-by: Nadav Amit <namit@vmware.com>
Fixes: a0c0feb5 ("KVM: x86: reserve bit 8 of non-leaf PDPEs and PML4Es in 64-bit mode on AMD", 2014-09-03)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

5ecad245

29 Jun, 2020 1 commit

KVM: X86: Fix async pf caused null-ptr-deref · 9d3c447c

Wanpeng Li authored Jun 29, 2020

Syzbot reported that:

  CPU: 1 PID: 6780 Comm: syz-executor153 Not tainted 5.7.0-syzkaller #0
  Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
  RIP: 0010:__apic_accept_irq+0x46/0xb80
  Call Trace:
   kvm_arch_async_page_present+0x7de/0x9e0
   kvm_check_async_pf_completion+0x18d/0x400
   kvm_arch_vcpu_ioctl_run+0x18bf/0x69f0
   kvm_vcpu_ioctl+0x46a/0xe20
   ksys_ioctl+0x11a/0x180
   __x64_sys_ioctl+0x6f/0xb0
   do_syscall_64+0xf6/0x7d0
   entry_SYSCALL_64_after_hwframe+0x49/0xb3

The testcase enables APF mechanism in MSR_KVM_ASYNC_PF_EN with ASYNC_PF_INT
enabled w/o setting MSR_KVM_ASYNC_PF_INT before, what's worse, interrupt
based APF 'page ready' event delivery depends on in kernel lapic, however,
we didn't bail out when lapic is not in kernel during guest setting
MSR_KVM_ASYNC_PF_EN which causes the null-ptr-deref in host later.
This patch fixes it.

Reported-by: syzbot+1bf777dfdde86d64b89b@syzkaller.appspotmail.com
Fixes: 2635b5c4 (KVM: x86: interrupt based APF 'page ready' event delivery)
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Message-Id: <1593426391-8231-1-git-send-email-wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

9d3c447c