1. 15 May, 2022 2 commits
    • Quentin Perret's avatar
      KVM: arm64: Don't hypercall before EL2 init · 2e403167
      Quentin Perret authored
      Will reported the following splat when running with Protected KVM
      enabled:
      
      [    2.427181] ------------[ cut here ]------------
      [    2.427668] WARNING: CPU: 3 PID: 1 at arch/arm64/kvm/mmu.c:489 __create_hyp_private_mapping+0x118/0x1ac
      [    2.428424] Modules linked in:
      [    2.429040] CPU: 3 PID: 1 Comm: swapper/0 Not tainted 5.18.0-rc2-00084-g8635adc4efc7 #1
      [    2.429589] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
      [    2.430286] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
      [    2.430734] pc : __create_hyp_private_mapping+0x118/0x1ac
      [    2.431091] lr : create_hyp_exec_mappings+0x40/0x80
      [    2.431377] sp : ffff80000803baf0
      [    2.431597] x29: ffff80000803bb00 x28: 0000000000000000 x27: 0000000000000000
      [    2.432156] x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
      [    2.432561] x23: ffffcd96c343b000 x22: 0000000000000000 x21: ffff80000803bb40
      [    2.433004] x20: 0000000000000004 x19: 0000000000001800 x18: 0000000000000000
      [    2.433343] x17: 0003e68cf7efdd70 x16: 0000000000000004 x15: fffffc81f602a2c8
      [    2.434053] x14: ffffdf8380000000 x13: ffffcd9573200000 x12: ffffcd96c343b000
      [    2.434401] x11: 0000000000000004 x10: ffffcd96c1738000 x9 : 0000000000000004
      [    2.434812] x8 : ffff80000803bb40 x7 : 7f7f7f7f7f7f7f7f x6 : 544f422effff306b
      [    2.435136] x5 : 000000008020001e x4 : ffff207d80a88c00 x3 : 0000000000000005
      [    2.435480] x2 : 0000000000001800 x1 : 000000014f4ab800 x0 : 000000000badca11
      [    2.436149] Call trace:
      [    2.436600]  __create_hyp_private_mapping+0x118/0x1ac
      [    2.437576]  create_hyp_exec_mappings+0x40/0x80
      [    2.438180]  kvm_init_vector_slots+0x180/0x194
      [    2.458941]  kvm_arch_init+0x80/0x274
      [    2.459220]  kvm_init+0x48/0x354
      [    2.459416]  arm_init+0x20/0x2c
      [    2.459601]  do_one_initcall+0xbc/0x238
      [    2.459809]  do_initcall_level+0x94/0xb4
      [    2.460043]  do_initcalls+0x54/0x94
      [    2.460228]  do_basic_setup+0x1c/0x28
      [    2.460407]  kernel_init_freeable+0x110/0x178
      [    2.460610]  kernel_init+0x20/0x1a0
      [    2.460817]  ret_from_fork+0x10/0x20
      [    2.461274] ---[ end trace 0000000000000000 ]---
      
      Indeed, the Protected KVM mode promotes __create_hyp_private_mapping()
      to a hypercall as EL1 no longer has access to the hypervisor's stage-1
      page-table. However, the call from kvm_init_vector_slots() happens after
      pKVM has been initialized on the primary CPU, but before it has been
      initialized on secondaries. As such, if the KVM initcall procedure is
      migrated from one CPU to another in this window, the hypercall may end up
      running on a CPU for which EL2 has not been initialized.
      
      Fortunately, the pKVM hypervisor doesn't rely on the host to re-map the
      vectors in the private range, so the hypercall in question is in fact
      superfluous. Skip it when pKVM is enabled.
      Reported-by: default avatarWill Deacon <will@kernel.org>
      Signed-off-by: default avatarQuentin Perret <qperret@google.com>
      [maz: simplified the checks slightly]
      Signed-off-by: default avatarMarc Zyngier <maz@kernel.org>
      Link: https://lore.kernel.org/r/20220513092607.35233-1-qperret@google.com
      2e403167
    • Marc Zyngier's avatar
      KVM: arm64: vgic-v3: Consistently populate ID_AA64PFR0_EL1.GIC · 5163373a
      Marc Zyngier authored
      When adding support for the slightly wonky Apple M1, we had to
      populate ID_AA64PFR0_EL1.GIC==1 to present something to the guest,
      as the HW itself doesn't advertise the feature.
      
      However, we gated this on the in-kernel irqchip being created.
      This causes some trouble for QEMU, which snapshots the state of
      the registers before creating a virtual GIC, and then tries to
      restore these registers once the GIC has been created.  Obviously,
      between the two stages, ID_AA64PFR0_EL1.GIC has changed value,
      and the write fails.
      
      The fix is to actually emulate the HW, and always populate the
      field if the HW is capable of it.
      
      Fixes: 562e530f ("KVM: arm64: Force ID_AA64PFR0_EL1.GIC=1 when exposing a virtual GICv3")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarMarc Zyngier <maz@kernel.org>
      Reported-by: default avatarPeter Maydell <peter.maydell@linaro.org>
      Reviewed-by: default avatarOliver Upton <oupton@google.com>
      Link: https://lore.kernel.org/r/20220503211424.3375263-1-maz@kernel.org
      5163373a
  2. 27 Apr, 2022 3 commits
    • Marc Zyngier's avatar
      KVM: arm64: Inject exception on out-of-IPA-range translation fault · 85ea6b1e
      Marc Zyngier authored
      When taking a translation fault for an IPA that is outside of
      the range defined by the hypervisor (between the HW PARange and
      the IPA range), we stupidly treat it as an IO and forward the access
      to userspace. Of course, userspace can't do much with it, and things
      end badly.
      
      Arguably, the guest is braindead, but we should at least catch the
      case and inject an exception.
      
      Check the faulting IPA against:
      - the sanitised PARange: inject an address size fault
      - the IPA size: inject an abort
      Reported-by: default avatarChristoffer Dall <christoffer.dall@arm.com>
      Signed-off-by: default avatarMarc Zyngier <maz@kernel.org>
      85ea6b1e
    • Alexandru Elisei's avatar
      KVM/arm64: Don't emulate a PMU for 32-bit guests if feature not set · 8f6379e2
      Alexandru Elisei authored
      kvm->arch.arm_pmu is set when userspace attempts to set the first PMU
      attribute. As certain attributes are mandatory, arm_pmu ends up always
      being set to a valid arm_pmu, otherwise KVM will refuse to run the VCPU.
      However, this only happens if the VCPU has the PMU feature. If the VCPU
      doesn't have the feature bit set, kvm->arch.arm_pmu will be left
      uninitialized and equal to NULL.
      
      KVM doesn't do ID register emulation for 32-bit guests and accesses to the
      PMU registers aren't gated by the pmu_visibility() function. This is done
      to prevent injecting unexpected undefined exceptions in guests which have
      detected the presence of a hardware PMU. But even though the VCPU feature
      is missing, KVM still attempts to emulate certain aspects of the PMU when
      PMU registers are accessed. This leads to a NULL pointer dereference like
      this one, which happens on an odroid-c4 board when running the
      kvm-unit-tests pmu-cycle-counter test with kvmtool and without the PMU
      feature being set:
      
      [  454.402699] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000150
      [  454.405865] Mem abort info:
      [  454.408596]   ESR = 0x96000004
      [  454.411638]   EC = 0x25: DABT (current EL), IL = 32 bits
      [  454.416901]   SET = 0, FnV = 0
      [  454.419909]   EA = 0, S1PTW = 0
      [  454.423010]   FSC = 0x04: level 0 translation fault
      [  454.427841] Data abort info:
      [  454.430687]   ISV = 0, ISS = 0x00000004
      [  454.434484]   CM = 0, WnR = 0
      [  454.437404] user pgtable: 4k pages, 48-bit VAs, pgdp=000000000c924000
      [  454.443800] [0000000000000150] pgd=0000000000000000, p4d=0000000000000000
      [  454.450528] Internal error: Oops: 96000004 [#1] PREEMPT SMP
      [  454.456036] Modules linked in:
      [  454.459053] CPU: 1 PID: 267 Comm: kvm-vcpu-0 Not tainted 5.18.0-rc4 #113
      [  454.465697] Hardware name: Hardkernel ODROID-C4 (DT)
      [  454.470612] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
      [  454.477512] pc : kvm_pmu_event_mask.isra.0+0x14/0x74
      [  454.482427] lr : kvm_pmu_set_counter_event_type+0x2c/0x80
      [  454.487775] sp : ffff80000a9839c0
      [  454.491050] x29: ffff80000a9839c0 x28: ffff000000a83a00 x27: 0000000000000000
      [  454.498127] x26: 0000000000000000 x25: 0000000000000000 x24: ffff00000a510000
      [  454.505198] x23: ffff000000a83a00 x22: ffff000003b01000 x21: 0000000000000000
      [  454.512271] x20: 000000000000001f x19: 00000000000003ff x18: 0000000000000000
      [  454.519343] x17: 000000008003fe98 x16: 0000000000000000 x15: 0000000000000000
      [  454.526416] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
      [  454.533489] x11: 000000008003fdbc x10: 0000000000009d20 x9 : 000000000000001b
      [  454.540561] x8 : 0000000000000000 x7 : 0000000000000d00 x6 : 0000000000009d00
      [  454.547633] x5 : 0000000000000037 x4 : 0000000000009d00 x3 : 0d09000000000000
      [  454.554705] x2 : 000000000000001f x1 : 0000000000000000 x0 : 0000000000000000
      [  454.561779] Call trace:
      [  454.564191]  kvm_pmu_event_mask.isra.0+0x14/0x74
      [  454.568764]  kvm_pmu_set_counter_event_type+0x2c/0x80
      [  454.573766]  access_pmu_evtyper+0x128/0x170
      [  454.577905]  perform_access+0x34/0x80
      [  454.581527]  kvm_handle_cp_32+0x13c/0x160
      [  454.585495]  kvm_handle_cp15_32+0x1c/0x30
      [  454.589462]  handle_exit+0x70/0x180
      [  454.592912]  kvm_arch_vcpu_ioctl_run+0x1c4/0x5e0
      [  454.597485]  kvm_vcpu_ioctl+0x23c/0x940
      [  454.601280]  __arm64_sys_ioctl+0xa8/0xf0
      [  454.605160]  invoke_syscall+0x48/0x114
      [  454.608869]  el0_svc_common.constprop.0+0xd4/0xfc
      [  454.613527]  do_el0_svc+0x28/0x90
      [  454.616803]  el0_svc+0x34/0xb0
      [  454.619822]  el0t_64_sync_handler+0xa4/0x130
      [  454.624049]  el0t_64_sync+0x18c/0x190
      [  454.627675] Code: a9be7bfd 910003fd f9000bf3 52807ff3 (b9415001)
      [  454.633714] ---[ end trace 0000000000000000 ]---
      
      In this particular case, Linux hasn't detected the presence of a hardware
      PMU because the PMU node is missing from the DTB, so userspace would have
      been unable to set the VCPU PMU feature even if it attempted it. What
      happens is that the 32-bit guest reads ID_DFR0, which advertises the
      presence of the PMU, and when it tries to program a counter, it triggers
      the NULL pointer dereference because kvm->arch.arm_pmu is NULL.
      
      kvm-arch.arm_pmu was introduced by commit 46b18782 ("KVM: arm64:
      Keep a per-VM pointer to the default PMU"). Until that commit, this
      error would be triggered instead:
      
      [   73.388140] ------------[ cut here ]------------
      [   73.388189] Unknown PMU version 0
      [   73.390420] WARNING: CPU: 1 PID: 264 at arch/arm64/kvm/pmu-emul.c:36 kvm_pmu_event_mask.isra.0+0x6c/0x74
      [   73.399821] Modules linked in:
      [   73.402835] CPU: 1 PID: 264 Comm: kvm-vcpu-0 Not tainted 5.17.0 #114
      [   73.409132] Hardware name: Hardkernel ODROID-C4 (DT)
      [   73.414048] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
      [   73.420948] pc : kvm_pmu_event_mask.isra.0+0x6c/0x74
      [   73.425863] lr : kvm_pmu_event_mask.isra.0+0x6c/0x74
      [   73.430779] sp : ffff80000a8db9b0
      [   73.434055] x29: ffff80000a8db9b0 x28: ffff000000dbaac0 x27: 0000000000000000
      [   73.441131] x26: ffff000000dbaac0 x25: 00000000c600000d x24: 0000000000180720
      [   73.448203] x23: ffff800009ffbe10 x22: ffff00000b612000 x21: 0000000000000000
      [   73.455276] x20: 000000000000001f x19: 0000000000000000 x18: ffffffffffffffff
      [   73.462348] x17: 000000008003fe98 x16: 0000000000000000 x15: 0720072007200720
      [   73.469420] x14: 0720072007200720 x13: ffff800009d32488 x12: 00000000000004e6
      [   73.476493] x11: 00000000000001a2 x10: ffff800009d32488 x9 : ffff800009d32488
      [   73.483565] x8 : 00000000ffffefff x7 : ffff800009d8a488 x6 : ffff800009d8a488
      [   73.490638] x5 : ffff0000f461a9d8 x4 : 0000000000000000 x3 : 0000000000000001
      [   73.497710] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000000dbaac0
      [   73.504784] Call trace:
      [   73.507195]  kvm_pmu_event_mask.isra.0+0x6c/0x74
      [   73.511768]  kvm_pmu_set_counter_event_type+0x2c/0x80
      [   73.516770]  access_pmu_evtyper+0x128/0x16c
      [   73.520910]  perform_access+0x34/0x80
      [   73.524532]  kvm_handle_cp_32+0x13c/0x160
      [   73.528500]  kvm_handle_cp15_32+0x1c/0x30
      [   73.532467]  handle_exit+0x70/0x180
      [   73.535917]  kvm_arch_vcpu_ioctl_run+0x20c/0x6e0
      [   73.540489]  kvm_vcpu_ioctl+0x2b8/0x9e0
      [   73.544283]  __arm64_sys_ioctl+0xa8/0xf0
      [   73.548165]  invoke_syscall+0x48/0x114
      [   73.551874]  el0_svc_common.constprop.0+0xd4/0xfc
      [   73.556531]  do_el0_svc+0x28/0x90
      [   73.559808]  el0_svc+0x28/0x80
      [   73.562826]  el0t_64_sync_handler+0xa4/0x130
      [   73.567054]  el0t_64_sync+0x1a0/0x1a4
      [   73.570676] ---[ end trace 0000000000000000 ]---
      [   73.575382] kvm: pmu event creation failed -2
      
      The root cause remains the same: kvm->arch.pmuver was never set to
      something sensible because the VCPU feature itself was never set.
      
      The odroid-c4 is somewhat of a special case, because Linux doesn't probe
      the PMU. But the above errors can easily be reproduced on any hardware,
      with or without a PMU driver, as long as userspace doesn't set the PMU
      feature.
      
      Work around the fact that KVM advertises a PMU even when the VCPU feature
      is not set by gating all PMU emulation on the feature. The guest can still
      access the registers without KVM injecting an undefined exception.
      Signed-off-by: default avatarAlexandru Elisei <alexandru.elisei@arm.com>
      Signed-off-by: default avatarMarc Zyngier <maz@kernel.org>
      Link: https://lore.kernel.org/r/20220425145530.723858-1-alexandru.elisei@arm.com
      8f6379e2
    • Will Deacon's avatar
      KVM: arm64: Handle host stage-2 faults from 32-bit EL0 · 2a50fc5f
      Will Deacon authored
      When pKVM is enabled, host memory accesses are translated by an identity
      mapping at stage-2, which is populated lazily in response to synchronous
      exceptions from 64-bit EL1 and EL0.
      
      Extend this handling to cover exceptions originating from 32-bit EL0 as
      well. Although these are very unlikely to occur in practice, as the
      kernel typically ensures that user pages are initialised before mapping
      them in, drivers could still map previously untouched device pages into
      userspace and expect things to work rather than panic the system.
      
      Cc: Quentin Perret <qperret@google.com>
      Cc: Marc Zyngier <maz@kernel.org>
      Signed-off-by: default avatarWill Deacon <will@kernel.org>
      Signed-off-by: default avatarMarc Zyngier <maz@kernel.org>
      Link: https://lore.kernel.org/r/20220427171332.13635-1-will@kernel.org
      2a50fc5f
  3. 07 Apr, 2022 4 commits
  4. 06 Apr, 2022 7 commits
  5. 03 Apr, 2022 8 commits
  6. 02 Apr, 2022 16 commits