- 08 Nov, 2020 5 commits
-
-
Maxim Levitsky authored
Recent introduction of the userspace msr filtering added code that uses negative error codes for cases that result in either #GP delivery to the guest, or handled by the userspace msr filtering. This breaks an assumption that a negative error code returned from the msr emulation code is a semi-fatal error which should be returned to userspace via KVM_RUN ioctl and usually kill the guest. Fix this by reusing the already existing KVM_MSR_RET_INVALID error code, and by adding a new KVM_MSR_RET_FILTERED error code for the userspace filtered msrs. Fixes: 291f35fb2c1d1 ("KVM: x86: report negative values from wrmsr emulation to userspace") Reported-by: Qian Cai <cai@redhat.com> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20201101115523.115780-1-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Peter Xu authored
Should be squashed into 66570e96. Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20201023183358.50607-3-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Peter Xu authored
It should be an accident when rebase, since we've already have section 8.25 (which is KVM_CAP_S390_DIAG318). Fix the number. Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20201001012044.5151-2-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Li RongQing authored
Fix an off-by-one style bug in pte_list_add() where it failed to account the last full set of SPTEs, i.e. when desc->sptes is full and desc->more is NULL. Merge the two "PTE_LIST_EXT-1" checks as part of the fix to avoid an extra comparison. Signed-off-by: Li RongQing <lirongqing@baidu.com> Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <1601196297-24104-1-git-send-email-lirongqing@baidu.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Paolo Bonzini authored
Merge tag 'kvmarm-fixes-5.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 fixes for v5.10, take #2 - Fix compilation error when PMD and PUD are folded - Fix regresssion of the RAZ behaviour of ID_AA64ZFR0_EL1
-
- 06 Nov, 2020 5 commits
-
-
Andrew Jones authored
The AA64ZFR0_EL1 accessors are just the general accessors with its visibility function open-coded. It also skips the if-else chain in read_id_reg, but there's no reason not to go there. Indeed consolidating ID register accessors and removing lines of code make it worthwhile. Remove the AA64ZFR0_EL1 accessors, replacing them with the general accessors for sanitized ID registers. No functional change intended. Signed-off-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20201105091022.15373-5-drjones@redhat.com
-
Andrew Jones authored
The instruction encodings of ID registers are preallocated. Until an encoding is assigned a purpose the register is RAZ. KVM's general ID register accessor functions already support both paths, RAZ or not. If for each ID register we can determine if it's RAZ or not, then all ID registers can build on the general functions. The register visibility function allows us to check whether a register should be completely hidden or not, extending it to also report when the register should be RAZ or not allows us to use it for ID registers as well. Check for RAZ visibility in the ID register accessor functions, allowing the RAZ case to be handled in a generic way for all system registers. The new REG_RAZ flag will be used in a later patch. This patch has no intended functional change. Signed-off-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20201105091022.15373-4-drjones@redhat.com
-
Andrew Jones authored
REG_HIDDEN_GUEST and REG_HIDDEN_USER are always used together. Consolidate them into a single REG_HIDDEN flag. We can always add another flag later if some register needs to expose itself differently to the guest than it does to userspace. No functional change intended. Signed-off-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20201105091022.15373-3-drjones@redhat.com
-
Andrew Jones authored
ID registers are RAZ until they've been allocated a purpose, but that doesn't mean they should be removed from the KVM_GET_REG_LIST list. So far we only have one register, SYS_ID_AA64ZFR0_EL1, that is hidden from userspace when its function, SVE, is not present. Expose SYS_ID_AA64ZFR0_EL1 to userspace as RAZ when SVE is not implemented. Removing the userspace visibility checks is enough to reexpose it, as it will already return zero to userspace when SVE is not present. The register already behaves as RAZ for the guest when SVE is not present. Fixes: 73433762 ("KVM: arm64/sve: System register context switch and access support") Reported-by: 张东旭 <xu910121@sina.com> Signed-off-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org#v5.2+ Link: https://lore.kernel.org/r/20201105091022.15373-2-drjones@redhat.com
-
Gavin Shan authored
The PUD and PMD are folded into PGD when the following options are enabled. In that case, PUD_SHIFT is equal to PMD_SHIFT and we fail to build with the indicated errors: CONFIG_ARM64_VA_BITS_42=y CONFIG_ARM64_PAGE_SHIFT=16 CONFIG_PGTABLE_LEVELS=3 arch/arm64/kvm/mmu.c: In function ‘user_mem_abort’: arch/arm64/kvm/mmu.c:798:2: error: duplicate case value case PMD_SHIFT: ^~~~ arch/arm64/kvm/mmu.c:791:2: note: previously used here case PUD_SHIFT: ^~~~ This fixes the issue by skipping the check on PUD huge page when PUD and PMD are folded into PGD. Fixes: 2f40c460 ("KVM: arm64: Use fallback mapping sizes for contiguous huge page sizes") Reported-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Gavin Shan <gshan@redhat.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20201103003009.32955-1-gshan@redhat.com
-
- 31 Oct, 2020 4 commits
-
-
Paolo Bonzini authored
Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Andrew Jones authored
Unless we want to test with THP, then we shouldn't require it to be configured by the host kernel. Unfortunately, even advising with MADV_NOHUGEPAGE does require it, so check for THP first in order to avoid madvise failing with EINVAL. Signed-off-by: Andrew Jones <drjones@redhat.com> Message-Id: <20201029201703.102716-2-drjones@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Vitaly Kuznetsov authored
It was noticed that evmcs_sanitize_exec_ctrls() is not being executed nowadays despite the code checking 'enable_evmcs' static key looking correct. Turns out, static key magic doesn't work in '__init' section (and it is unclear when things changed) but setup_vmcs_config() is called only once per CPU so we don't really need it to. Switch to checking 'enlightened_vmcs' instead, it is supposed to be in sync with 'enable_evmcs'. Opportunistically make evmcs_sanitize_exec_ctrls '__init' and drop unneeded extra newline from it. Reported-by: Yang Weijiang <weijiang.yang@intel.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20201014143346.2430936-1-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Jim Mattson authored
Add a regression test for commit 671ddc70 ("KVM: nVMX: Don't leak L1 MMIO regions to L2"). First, check to see that an L2 guest can be launched with a valid APIC-access address that is backed by a page of L1 physical memory. Next, set the APIC-access address to a (valid) L1 physical address that is not backed by memory. KVM can't handle this situation, so resuming L2 should result in a KVM exit for internal error (emulation). Signed-off-by: Jim Mattson <jmattson@google.com> Reviewed-by: Ricardo Koller <ricarkol@google.com> Reviewed-by: Peter Shier <pshier@google.com> Message-Id: <20201026180922.3120555-1-jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 30 Oct, 2020 7 commits
-
-
Takashi Iwai authored
The newly introduced kvm_msr_ignored_check() tries to print error or debug messages via vcpu_*() macros, but those may cause Oops when NULL vcpu is passed for KVM_GET_MSRS ioctl. Fix it by replacing the print calls with kvm_*() macros. (Note that this will leave vcpu argument completely unused in the function, but I didn't touch it to make the fix as small as possible. A clean up may be applied later.) Fixes: 12bc2132 ("KVM: X86: Do the same ignore_msrs check for feature msrs") BugLink: https://bugzilla.suse.com/show_bug.cgi?id=1178280 Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Message-Id: <20201030151414.20165-1-tiwai@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Paolo Bonzini authored
Even though the compiler is able to replace static const variables with their value, it will warn about them being unused when Linux is built with W=1. Use good old macros instead, this is not C++. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Paolo Bonzini authored
Merge tag 'kvmarm-fixes-5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 fixes for 5.10, take #1 - Force PTE mapping on device pages provided via VFIO - Fix detection of cacheable mapping at S2 - Fallback to PMD/PTE mappings for composite huge pages - Fix accounting of Stage-2 PGD allocation - Fix AArch32 handling of some of the debug registers - Simplify host HYP entry - Fix stray pointer conversion on nVHE TLB invalidation - Fix initialization of the nVHE code - Simplify handling of capabilities exposed to HYP - Nuke VCPUs caught using a forbidden AArch32 EL0
-
Qais Yousef authored
On a system without uniform support for AArch32 at EL0, it is possible for the guest to force run AArch32 at EL0 and potentially cause an illegal exception if running on a core without AArch32. Add an extra check so that if we catch the guest doing that, then we prevent it from running again by resetting vcpu->arch.target and return ARM_EXCEPTION_IL. We try to catch this misbehaviour as early as possible and not rely on an illegal exception occuring to signal the problem. Attempting to run a 32bit app in the guest will produce an error from QEMU if the guest exits while running in AArch32 EL0. Tested on Juno by instrumenting the host to fake asym aarch32 and instrumenting KVM to make the asymmetry visible to the guest. [will: Incorporated feedback from Marc] Signed-off-by: Qais Yousef <qais.yousef@arm.com> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: James Morse <james.morse@arm.com> Cc: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20201021104611.2744565-2-qais.yousef@arm.com Link: https://lore.kernel.org/r/20201027215118.27003-2-will@kernel.org
-
Mark Rutland authored
We finalize caps before initializing kvm hyp code, and any use of cpus_have_const_cap() in kvm hyp code generates redundant and potentially unsound code to read the cpu_hwcaps array. A number of helper functions used in both hyp context and regular kernel context use cpus_have_const_cap(), as some regular kernel code runs before the capabilities are finalized. It's tedious and error-prone to write separate copies of these for hyp and non-hyp code. So that we can avoid the redundant code, let's automatically upgrade cpus_have_const_cap() to cpus_have_final_cap() when used in hyp context. With this change, there's never a reason to access to cpu_hwcaps array from hyp code, and we don't need to create an NVHE alias for this. This should have no effect on non-hyp code. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Will Deacon <will@kernel.org> Cc: David Brazdil <dbrazdil@google.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201026134931.28246-4-mark.rutland@arm.com
-
Mark Rutland authored
In a subsequent patch we'll modify cpus_have_const_cap() to call cpus_have_final_cap(), and hence we need to define cpus_have_final_cap() first. To make subsequent changes easier to follow, this patch reorders the two without making any other changes. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Will Deacon <will@kernel.org> Cc: David Brazdil <dbrazdil@google.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201026134931.28246-3-mark.rutland@arm.com
-
Mark Rutland authored
Currently has_vhe() detects whether it is being compiled for VHE/NVHE hyp code based on preprocessor definitions, and uses this knowledge to avoid redundant runtime checks. There are other cases where we'd like to use this knowledge, so let's factor the preprocessor checks out into separate helpers. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Will Deacon <will@kernel.org> Cc: David Brazdil <dbrazdil@google.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201026134931.28246-2-mark.rutland@arm.com
-
- 29 Oct, 2020 8 commits
-
-
Santosh Shukla authored
VFIO allows a device driver to resolve a fault by mapping a MMIO range. This can be subsequently result in user_mem_abort() to try and compute a huge mapping based on the MMIO pfn, which is a sure recipe for things to go wrong. Instead, force a PTE mapping when the pfn faulted in has a device mapping. Fixes: 6d674e28 ("KVM: arm/arm64: Properly handle faulting of device mappings") Suggested-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Santosh Shukla <sashukla@nvidia.com> [maz: rewritten commit message] Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Gavin Shan <gshan@redhat.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/1603711447-11998-2-git-send-email-sashukla@nvidia.com
-
Gavin Shan authored
Although huge pages can be created out of multiple contiguous PMDs or PTEs, the corresponding sizes are not supported at Stage-2 yet. Instead of failing the mapping, fall back to the nearer supported mapping size (CONT_PMD to PMD and CONT_PTE to PTE respectively). Suggested-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Gavin Shan <gshan@redhat.com> [maz: rewritten commit message] Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20201025230626.18501-1-gshan@redhat.com
-
Will Deacon authored
stage2_pte_cacheable() tries to figure out whether the mapping installed in its 'pte' parameter is cacheable or not. Unfortunately, it fails miserably because it extracts the memory attributes from the entry using FIELD_GET(), which returns the attributes shifted down to bit 0, but then compares this with the unshifted value generated by the PAGE_S2_MEMATTR() macro. A direct consequence of this bug is that cache maintenance is silently skipped, which in turn causes 32-bit guests to crash early on when their set/way maintenance is trapped but not emulated correctly. Fix the broken masks by avoiding the use of FIELD_GET() altogether. Fixes: 6d9d2115 ("KVM: arm64: Add support for stage-2 map()/unmap() in generic page-table") Reported-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: Quentin Perret <qperret@google.com> Link: https://lore.kernel.org/r/20201029144716.30476-1-will@kernel.org
-
Marc Zyngier authored
The DBGD{CCINT,SCRext} and DBGVCR register entries in the cp14 array are missing their target register, resulting in all accesses being targetted at the guard sysreg (indexed by __INVALID_SYSREG__). Point the emulation code at the actual register entries. Fixes: bdfb4b38 ("arm64: KVM: add trap handlers for AArch32 debug registers") Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20201029172409.2768336-1-maz@kernel.org
-
Will Deacon authored
For consistency with the rest of the stage-2 page-table page allocations (performing using a kvm_mmu_memory_cache), ensure that __GFP_ACCOUNT is included in the GFP flags for the PGD pages. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Gavin Shan <gshan@redhat.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Quentin Perret <qperret@google.com> Link: https://lore.kernel.org/r/20201026144423.24683-1-will@kernel.org
-
Marc Zyngier authored
Setting PSTATE.PAN when entering EL2 on nVHE doesn't make much sense as this bit only means something for translation regimes that include EL0. This obviously isn't the case in the nVHE case, so let's drop this setting. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Vladimir Murzin <vladimir.murzin@arm.com> Link: https://lore.kernel.org/r/20201026095116.72051-4-maz@kernel.org
-
Marc Zyngier authored
The new calling convention says that pointers coming from the SMCCC interface are turned into their HYP version in the host HVC handler. However, there is still a stray kern_hyp_va() in the TLB invalidation code, which could result in a corrupted pointer. Drop the spurious conversion. Fixes: a071261d ("KVM: arm64: nVHE: Fix pointers during SMCCC convertion") Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20201026095116.72051-3-maz@kernel.org
-
Marc Zyngier authored
The hyp-init code starts by stashing a register in TPIDR_EL2 in in order to free a register. This happens no matter if the HVC call is legal or not. Although nothing wrong seems to come out of it, it feels odd to alter the EL2 state for something that eventually returns an error. Instead, use the fact that we know exactly which bits of the __kvm_hyp_init call are non-zero to perform the check with a series of EOR/ROR instructions, combined with a build-time check that the value is the one we expect. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20201026095116.72051-2-maz@kernel.org
-
- 28 Oct, 2020 1 commit
-
-
David Woodhouse authored
No functional change; just reserve the feature bit for now so that VMMs can start to implement it. This will allow the host to indicate that MSI emulation supports 15-bit destination IDs, allowing up to 32768 CPUs without interrupt remapping. cf. https://patchwork.kernel.org/patch/11816693/ for qemu Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <4cd59bed05f4b7410d3d1ffd1e997ab53683874d.camel@infradead.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
- 25 Oct, 2020 10 commits
-
-
Linus Torvalds authored
-
Joe Perches authored
Use a more generic form for __section that requires quotes to avoid complications with clang and gcc differences. Remove the quote operator # from compiler_attributes.h __section macro. Convert all unquoted __section(foo) uses to quoted __section("foo"). Also convert __attribute__((section("foo"))) uses to __section("foo") even if the __attribute__ has multiple list entry forms. Conversion done using the script at: https://lore.kernel.org/lkml/75393e5ddc272dc7403de74d645e6c6e0f4e70eb.camel@perches.com/2-convert_section.plSigned-off-by: Joe Perches <joe@perches.com> Reviewed-by: Nick Desaulniers <ndesaulniers@gooogle.com> Reviewed-by: Miguel Ojeda <ojeda@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Rasmus Villemoes authored
tid_addr is not a "pointer to (pointer to int in userspace)"; it is in fact a "pointer to (pointer to int in userspace) in userspace". So sparse rightfully complains about passing a kernel pointer to put_user(). Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Eric Biggers authored
Commit 453431a5 ("mm, treewide: rename kzfree() to kfree_sensitive()") renamed kzfree() to kfree_sensitive(), but it left a compatibility definition of kzfree() to avoid being too disruptive. Since then a few more instances of kzfree() have slipped in. Just get rid of them and remove the compatibility definition once and for all. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Joe Perches authored
If set, use the environment variable GIT_DIR to change the default .git location of the kernel git tree. If GIT_DIR is unset, keep using the current ".git" default. Link: https://lkml.kernel.org/r/c5e23b45562373d632fccb8bc04e563abba4dd1d.camel@perches.comSigned-off-by: Joe Perches <joe@perches.com> Tested-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull timer fixes from Thomas Gleixner: "A time namespace fix and a matching selftest. The futex absolute timeouts which are based on CLOCK_MONOTONIC require time namespace corrected. This was missed in the original time namesapce support" * tag 'timers-urgent-2020-10-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: selftests/timens: Add a test for futex() futex: Adjust absolute futex timeouts with per time namespace offset
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull scheduler fixes from Thomas Gleixner: "Two scheduler fixes: - A trivial build fix for sched_feat() to compile correctly with CONFIG_JUMP_LABEL=n - Replace a zero lenght array with a flexible array" * tag 'sched-urgent-2020-10-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/features: Fix !CONFIG_JUMP_LABEL case sched: Replace zero-length array with flexible-array
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull perf fix from Thomas Gleixner: "A single fix to compute the field offset of the SNOOPX bit in the data source bitmask of perf events correctly" * tag 'perf-urgent-2020-10-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf: correct SNOOPX field offset
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull locking fix from Thomas Gleixner: "Just a trivial fix for kernel-doc warnings" * tag 'locking-urgent-2020-10-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/seqlocks: Fix kernel-doc warnings
-
git://github.com/jonmason/ntbLinus Torvalds authored
Pull NTB fixes from Jon Mason. * tag 'ntb-5.10' of git://github.com/jonmason/ntb: NTB: Use struct_size() helper in devm_kzalloc() ntb: intel: Fix memleak in intel_ntb_pci_probe NTB: hw: amd: fix an issue about leak system resources
-