1. 16 Jan, 2018 13 commits
    • KVM: arm64: Save ESR_EL2 on guest SError · c60590b5
      James Morse authored
      When we exit a guest due to an SError the vcpu fault info isn't updated
      with the ESR. Today this is only done for traps.
      
      The v8.2 RAS Extensions define ISS values for SError. Update the vcpu's
      fault_info with the ESR on SError so that handle_exit() can determine
      if this was a RAS SError and decode its severity.
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: James Morse <james.morse@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • KVM: arm64: Save/Restore guest DISR_EL1 · c773ae2b
      James Morse authored
      If we deliver a virtual SError to the guest, the guest may defer it
      with an ESB instruction. The guest reads the deferred value via DISR_EL1,
      but the guest's view of DISR_EL1 is re-mapped to VDISR_EL2 when HCR_EL2.AMO
      is set.
      
      Add the KVM code to save/restore VDISR_EL2, and make it accessible to
      userspace as DISR_EL1.
      Signed-off-by: James Morse <james.morse@arm.com>
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
      Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • KVM: arm64: Set an impdef ESR for Virtual-SError using VSESR_EL2. · 4715c14b
      James Morse authored
      Prior to v8.2's RAS Extensions, the HCR_EL2.VSE 'virtual SError' feature
      generated an SError with an implementation defined ESR_EL1.ISS, because we
      had no mechanism to specify the ESR value.
      
      On Juno this generates an all-zero ESR. The most significant bit, 'ISV',
      is clear, indicating that the remainder of the ISS field is invalid.
      
      With the RAS Extensions we have a mechanism to specify this value, and the
      most significant bit has a new meaning: 'IDS - Implementation Defined
      Syndrome'. An all-zero SError ESR now means: 'RAS error: Uncategorized'
      instead of 'no valid ISS'.
      
      Add KVM support for the VSESR_EL2 register to specify an ESR value when
      HCR_EL2.VSE generates a virtual SError. Change kvm_inject_vabt() to
      specify an implementation-defined value.
      
      We only need to restore the VSESR_EL2 value when HCR_EL2.VSE is set. KVM
      saves/restores this bit during __{,de}activate_traps() and hardware clears the
      bit once the guest has consumed the virtual SError.
      
      Future patches may add an API (or KVM CAP) to pend a virtual SError with
      a specified ESR.
      
      Cc: Dongjiu Geng <gengdongjiu@huawei.com>
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: James Morse <james.morse@arm.com>
      Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
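The conditional restore described in this commit can be sketched as a small user-space model. This is only an illustration, not the KVM code: the struct names, `inject_vabt()`, `activate_traps()` and `writes_for()` are hypothetical, and the bit positions (VSE as bit 8 of HCR_EL2, IDS as bit 24 of the ESR) are assumptions that should be checked against the architecture documentation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions; verify against the Arm architecture manual. */
#define HCR_VSE  (1ULL << 8)    /* pend a virtual SError */
#define ESR_IDS  (1ULL << 24)   /* implementation defined syndrome */

/* Hypothetical software copy of the per-vcpu register state. */
struct vcpu_sketch {
    uint64_t hcr_el2;
    uint64_t vsesr_el2;
};

/* Hypothetical stand-in for the hardware register, counting writes. */
struct hw_sketch {
    uint64_t vsesr_el2;
    unsigned int vsesr_writes;
};

/* Pend a virtual SError carrying an implementation-defined syndrome. */
static void inject_vabt(struct vcpu_sketch *v)
{
    v->vsesr_el2 = ESR_IDS;
    v->hcr_el2 |= HCR_VSE;
}

/* On guest entry, only touch VSESR_EL2 if a virtual SError is pending. */
static void activate_traps(const struct vcpu_sketch *v, struct hw_sketch *hw,
                           bool has_ras)
{
    if (has_ras && (v->hcr_el2 & HCR_VSE)) {
        hw->vsesr_el2 = v->vsesr_el2;
        hw->vsesr_writes++;
    }
}

/* Helper for experimenting: how many register writes did one entry cost? */
static unsigned int writes_for(bool pend, bool has_ras)
{
    struct vcpu_sketch v = { 0, 0 };
    struct hw_sketch hw = { 0, 0 };

    if (pend)
        inject_vabt(&v);
    activate_traps(&v, &hw, has_ras);
    return hw.vsesr_writes;
}
```

In this model the register is written only when an SError is pending and the CPU has the RAS extensions, which is the saving the commit message argues for.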
    • KVM: arm/arm64: mask/unmask daif around VHE guests · 4f5abad9
      James Morse authored
      Non-VHE systems take an exception to EL2 in order to world-switch into the
      guest. When returning from the guest KVM implicitly restores the DAIF
      flags when it returns to the kernel at EL1.
      
      With VHE none of this exception-level jumping happens, so KVM's
      world-switch code is exposed to the host kernel's DAIF values, and KVM
      spills the guest-exit DAIF values back into the host kernel.
      On entry to a guest we have Debug and SError exceptions unmasked, KVM
      has switched VBAR but isn't prepared to handle these. On guest exit
      Debug exceptions are left disabled once we return to the host and will
      stay this way until we enter user space.
      
      Add a helper to mask/unmask DAIF around VHE guests. The unmask can only
      happen after the host's VBAR value has been synchronised by the isb in
      __vhe_hyp_call (via kvm_call_hyp()). Masking could be as late as
      setting KVM's VBAR value, but is kept here for symmetry.
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: James Morse <james.morse@arm.com>
      Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • arm64: kernel: Prepare for a DISR user · 68ddbf09
      James Morse authored
      KVM would like to consume any pending SError (or RAS error) after guest
      exit. Today it has to unmask SError and use dsb+isb to synchronise the
      CPU. With the RAS extensions we can use ESB to synchronise any pending
      SError.
      
      Add the necessary macros to allow DISR to be read and converted to an
      ESR.
      
      We clear the DISR register when we enable the RAS cpufeature, and the
      kernel has not executed any ESB instructions. Any value we find in DISR
      must have belonged to firmware. Executing an ESB instruction is the
      only way to update DISR, so we can expect firmware to have handled
      any deferred SError. By the same logic we clear DISR in the idle path.
      Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
      Signed-off-by: James Morse <james.morse@arm.com>
      Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
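A DISR-to-ESR conversion of the kind this commit mentions might look like the sketch below. The field positions and masks are assumptions taken from a reading of the architecture documentation, not from this log, and the helper name `disr_to_esr()` is used here only for illustration.

```c
#include <stdint.h>

/* Assumed field layout; verify against the Arm architecture manual. */
#define ESR_ELx_EC_SHIFT   26
#define ESR_ELx_EC_SERROR  0x2FULL            /* exception class: SError */
#define ESR_ELx_ISS_MASK   0x01FFFFFFULL      /* ISS, bits [24:0] */
#define ESR_ELx_FSC        0x3FULL            /* fault status, bits [5:0] */
#define ESR_ELx_EA         (1ULL << 9)        /* external abort type */
#define ESR_ELx_AET        (0x7ULL << 10)     /* error type, bits [12:10] */
#define DISR_EL1_IDS       (1ULL << 24)       /* implementation defined syndrome */
#define DISR_EL1_ESR_MASK  (ESR_ELx_AET | ESR_ELx_EA | ESR_ELx_FSC)

/*
 * Build an SError ESR from a deferred-error value read from DISR.
 * With IDS clear only the architected fields are copied; with IDS set
 * the whole ISS is implementation defined and is copied as-is.
 */
static inline uint32_t disr_to_esr(uint64_t disr)
{
    uint64_t esr = ESR_ELx_EC_SERROR << ESR_ELx_EC_SHIFT;

    if ((disr & DISR_EL1_IDS) == 0)
        esr |= disr & DISR_EL1_ESR_MASK;
    else
        esr |= disr & ESR_ELx_ISS_MASK;

    return (uint32_t)esr;
}
```

The result can then be passed to the same decoding path used for an SError that was taken as an exception, so both routes share one severity check.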
    • arm64: Unconditionally enable IESB on exception entry/return for firmware-first · f751daa4
      James Morse authored
      ARM v8.2 has a feature to add implicit error synchronization barriers
      whenever the CPU enters or returns from an exception level. Add this to the
      features we always enable. CPUs that don't support this feature will treat
      the bit as RES0.
      
      This feature causes RAS errors that are not yet visible to software to
      become pending SErrors. We expect to have firmware-first RAS support
      so synchronised RAS errors will be taken immediately to EL3.
      Any system without firmware-first handling of errors will take the SError
      either immediately after exception return, or when we unmask SError after
      entry.S's work.
      
      Adding IESB to the ELx flags causes it to be enabled by KVM and kexec
      too.
      
      Platform level RAS support may require additional firmware support.
      
      Cc: Christoffer Dall <christoffer.dall@linaro.org>
      Suggested-by: Will Deacon <will.deacon@arm.com>
      Link: https://www.spinics.net/lists/kvm-arm/msg28192.html
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: James Morse <james.morse@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • arm64: kernel: Survive corrected RAS errors notified by SError · 6bf0dcfd
      James Morse authored
      Prior to v8.2, SError is an uncontainable fatal exception. The v8.2 RAS
      extensions use SError to notify software about RAS errors; these can be
      contained by the Error Synchronization Barrier.
      
      An ACPI system with firmware-first may use SError as its 'SEI'
      notification. Future patches may add code to 'claim' this SError as a
      notification.
      
      Other systems can distinguish these RAS errors from the SError ESR and
      use the AET bits and additional data from RAS-Error registers to handle
      the error. Future patches may add this kernel-first handling.
      
      Without support for either of these we will panic(), even if we received
      a corrected error. Add code to decode the severity of RAS errors. We can
      safely ignore contained errors where the CPU can continue to make
      progress. For all other errors we continue to panic().
      Signed-off-by: James Morse <james.morse@arm.com>
      Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
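The severity decode described in this commit can be sketched roughly as follows. The AET encodings and bit positions are assumptions based on a reading of the RAS extension documentation, and the function names are hypothetical. Treating only corrected and restartable errors as survivable is one plausible reading of "contained errors where the CPU can continue to make progress".

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed SError ISS layout; verify against the Arm architecture manual. */
#define ESR_ELx_IDS         (1U << 24)     /* implementation defined syndrome */
#define ESR_ELx_FSC         0x3FU          /* fault status code, bits [5:0] */
#define ESR_ELx_FSC_SERROR  0x11U          /* asynchronous SError */
#define ESR_ELx_AET         (0x7U << 10)   /* asynchronous error type */
#define ESR_ELx_AET_UC      (0x0U << 10)   /* uncontainable */
#define ESR_ELx_AET_UEU     (0x1U << 10)   /* unrecoverable */
#define ESR_ELx_AET_UEO     (0x2U << 10)   /* restartable */
#define ESR_ELx_AET_UER     (0x3U << 10)   /* recoverable */
#define ESR_ELx_AET_CE      (0x6U << 10)   /* corrected */

/* Does this SError syndrome use the architected RAS encoding? */
static bool serror_is_ras(uint32_t esr)
{
    if (esr & ESR_ELx_IDS)
        return false;
    return (esr & ESR_ELx_FSC) == ESR_ELx_FSC_SERROR;
}

/* Returns true when the only safe response is to panic. */
static bool ras_serror_is_fatal(uint32_t esr)
{
    if (!serror_is_ras(esr))
        return true;    /* unknown syndrome: keep the pre-v8.2 behaviour */

    switch (esr & ESR_ELx_AET) {
    case ESR_ELx_AET_CE:    /* corrected, nothing was lost */
    case ESR_ELx_AET_UEO:   /* restartable, not yet consumed */
        return false;
    default:                /* UC, UEU, UER and reserved values */
        return true;
    }
}
```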
    • arm64: cpufeature: Detect CPU RAS Extentions · 64c02720
      Xie XiuQi authored
      ARM's v8.2 Extensions add support for Reliability, Availability and
      Serviceability (RAS). On CPUs with these extensions system software
      can use additional barriers to isolate errors and determine if faults
      are pending. Add cpufeature detection.
      
      Platform level RAS support may require additional firmware support.
      Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
      Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com>
      [Rebased, added config option, reworded commit message]
      Signed-off-by: James Morse <james.morse@arm.com>
      Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • arm64: sysreg: Move to use definitions for all the SCTLR bits · 7a00d68e
      James Morse authored
      __cpu_setup() configures SCTLR_EL1 using some hard coded hex masks,
      and el2_setup() duplicates some of this when setting RES1 bits.
      
      Let's make this the same as KVM's hyp_init, which uses named bits.
      
      First, we add definitions for all the SCTLR_EL{1,2} bits, the RES{1,0}
      bits, and those we want to set or clear.
      
      Add build_bug checks to ensure all bits are either set or clear.
      This means we don't need to preserve endian-ness configuration
      generated elsewhere.
      
      Finally, move the head.S and proc.S users of these hard-coded masks
      over to the macro versions.
      Signed-off-by: James Morse <james.morse@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
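The "every bit is either set or clear" build check can be illustrated with a toy register. The 8-bit layout and all `CTRL_*` names below are invented for the example and do not correspond to real SCTLR bits. The point is only that a compile-time check on `SET ^ CLEAR` proves the two masks are disjoint and together cover the whole register.

```c
#include <stdint.h>

/* Hypothetical 8-bit control register, for illustration only. */
#define CTRL_M     (1u << 0)
#define CTRL_A     (1u << 1)
#define CTRL_C     (1u << 2)
#define CTRL_EE    (1u << 3)                 /* endianness */
#define CTRL_RES1  ((1u << 4) | (1u << 5))   /* reserved, must be one */
#define CTRL_RES0  ((1u << 6) | (1u << 7))   /* reserved, must be zero */
#define CTRL_ALL   0xFFu

/* Each named bit appears in exactly one of these two masks. */
#define CTRL_SET    (CTRL_M | CTRL_C | CTRL_RES1)
#define CTRL_CLEAR  (CTRL_A | CTRL_EE | CTRL_RES0)

/* XOR equals ALL only if the masks are disjoint and cover every bit. */
_Static_assert((CTRL_SET ^ CTRL_CLEAR) == CTRL_ALL,
               "every bit must be in exactly one of CTRL_SET or CTRL_CLEAR");

/*
 * Because every bit is decided, the initial value is a constant and no
 * read-modify-write of an earlier configuration is needed.
 */
static inline uint32_t ctrl_initial_value(void)
{
    return CTRL_SET;
}
```

Forgetting to list a new bit in either mask makes the build fail, rather than silently inheriting whatever value an earlier boot stage left behind.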
    • arm64: cpufeature: __this_cpu_has_cap() shouldn't stop early · edf298cf
      James Morse authored
      this_cpu_has_cap() tests caps->desc not caps->matches, so it stops
      walking the list when it finds a 'silent' feature, instead of
      walking to the end of the list.
      
      Prior to v4.6's 644c2ae1 ("arm64: cpufeature: Test 'matches' pointer
      to find the end of the list") we always tested desc to find the end of
      a capability list. This was changed for dubious things like PAN_NOT_UAO.
      v4.7's e3661b12 ("arm64: Allow a capability to be checked on
      single CPU") added this_cpu_has_cap() using the old desc style test.
      
      CC: Suzuki K Poulose <suzuki.poulose@arm.com>
      Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: James Morse <james.morse@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
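The early-stop bug can be reproduced with a small sentinel-terminated list. The structure and function names below are hypothetical simplifications. Only the idea comes from the commit message: a 'silent' entry has no description, so a walker that uses `desc` as the terminator test never reaches the entries after it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical capability entry. */
struct cap_sketch {
    const char *desc;                            /* NULL for 'silent' features */
    int id;
    bool (*matches)(const struct cap_sketch *);  /* NULL only in the terminator */
};

static bool always_matches(const struct cap_sketch *c)
{
    (void)c;
    return true;
}

static const struct cap_sketch example_caps[] = {
    { "feature A", 1, always_matches },
    { NULL,        2, always_matches },   /* silent feature, no description */
    { "feature C", 3, always_matches },
    { NULL,        0, NULL },             /* end of list */
};

/* Old-style walk: stops at the first entry without a description. */
static bool has_cap_by_desc(const struct cap_sketch *list, int id)
{
    for (const struct cap_sketch *c = list; c->desc; c++)
        if (c->id == id)
            return c->matches(c);
    return false;
}

/* Fixed walk: only the real terminator has a NULL matches pointer. */
static bool has_cap_by_matches(const struct cap_sketch *list, int id)
{
    for (const struct cap_sketch *c = list; c->matches; c++)
        if (c->id == id)
            return c->matches(c);
    return false;
}
```

With this list, looking up id 3 fails with the `desc` test and succeeds with the `matches` test, because the silent entry with id 2 sits in between.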
    • arm64: fpsimd: Fix state leakage when migrating after sigreturn · 0abdeff5
      Dave Martin authored
      When refactoring the sigreturn code to handle SVE, I changed the
      sigreturn implementation to store the new FPSIMD state from the
      user sigframe into task_struct before reloading the state into the
      CPU regs.  This makes it easier to convert the data for SVE when
      needed.
      
      However, it turns out that the fpsimd_state structure passed into
      fpsimd_update_current_state is not fully initialised, so assigning
      the structure as a whole corrupts current->thread.fpsimd_state.cpu
      with uninitialised data.
      
      This means that if the garbage data written to .cpu happens to be a
      valid cpu number, and the task is subsequently migrated to the cpu
      identified by that number, and then tries to enter userspace,
      the CPU FPSIMD regs will be assumed to be correct for the task and
      not reloaded as they should be.  This can result in returning to
      userspace with the FPSIMD registers containing data that is stale or
      that belongs to another task or to the kernel.
      
      Knowingly handing around a kernel structure that is incompletely
      initialised with user data is a potential source of mistakes,
      especially across source file boundaries.  To help avoid a repeat
      of this issue, this patch adapts the relevant internal API to hand
      around the user-accessible subset only: struct user_fpsimd_state.
      
      To avoid future surprises, this patch also converts all uses of
      struct fpsimd_state that really only access the user subset, to use
      struct user_fpsimd_state.  A few missing consts are added to
      function prototypes for good measure.
      
      Thanks to Will for spotting the cause of the bug here.
      Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: Dave Martin <Dave.Martin@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
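The difference between assigning the whole structure and passing only the user-accessible subset can be modelled as below. The types are cut-down, hypothetical stand-ins for the kernel structures, and the "garbage" value is set explicitly so that the example stays well defined instead of reading uninitialised memory.

```c
#include <stdint.h>

/* Hypothetical stand-in for the user-visible register subset. */
struct user_regs_sketch {
    uint64_t vregs[4];
    uint32_t fpsr;
    uint32_t fpcr;
};

/* Hypothetical kernel-side state: user subset plus bookkeeping. */
struct kernel_state_sketch {
    struct user_regs_sketch user;
    unsigned int cpu;   /* last CPU this state was loaded on */
};

/* Buggy shape: whole-struct assignment also overwrites the bookkeeping. */
static void update_whole(struct kernel_state_sketch *dst,
                         const struct kernel_state_sketch *src)
{
    *dst = *src;
}

/* Fixed shape: the API can only ever see the user subset. */
static void update_user_subset(struct kernel_state_sketch *dst,
                               const struct user_regs_sketch *src)
{
    dst->user = *src;
}

/* Returns the task's .cpu field after a simulated sigreturn. */
static unsigned int cpu_after_update(int use_subset_api, unsigned int garbage_cpu)
{
    struct kernel_state_sketch task = { .cpu = 7 };
    struct kernel_state_sketch from_sigframe = {
        .user = { .fpsr = 1 },
        .cpu = garbage_cpu,   /* stands in for the uninitialised field */
    };

    if (use_subset_api)
        update_user_subset(&task, &from_sigframe.user);
    else
        update_whole(&task, &from_sigframe);

    return task.cpu;
}
```

Narrowing the parameter type means the compiler, rather than code review, prevents the bookkeeping field from being overwritten.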
    • arm64: Correct type for PUD macros · 29d9bef1
      Punit Agrawal authored
      The PUD macros (PUD_TABLE_BIT, PUD_TYPE_MASK, PUD_TYPE_SECT) use the
      pgdval_t even when pudval_t is available. Even though the underlying
      type for both (u64) is the same it is confusing and may lead to issues
      in the future.
      
      Fix this by using pudval_t to define the PUD_* macros.
      
      Fixes: 084bd298 ("ARM64: mm: HugeTLB support.")
      Fixes: 206a2a73 ("arm64: mm: Create gigabyte kernel logical mappings where possible")
      Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • arm64: Inform user if software PAN is in use · 894cfd14
      Stephen Boyd authored
      It isn't entirely obvious if we're using software PAN because we
      don't say anything about it in the boot log. But if we're using
      hardware PAN we'll print a nice CPU feature message indicating
      it. Add a print for software PAN too so we know if it's being
      used or not.
      Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
  2. 15 Jan, 2018 7 commits
  3. 14 Jan, 2018 9 commits
  4. 13 Jan, 2018 11 commits
    • firmware: arm_sdei: Add support for CPU and system power states · da351827
      James Morse authored
      When a CPU enters an idle lower-power state or is powering off, we
      need to mask SDE events so that no events can be delivered while we
      are messing with the MMU as the registered entry points won't be valid.
      
      If the system reboots, we want to unregister all events and mask the CPUs.
      For kexec this allows us to hand a clean slate to the next kernel
      instead of relying on it to call sdei_{private,system}_data_reset().
      
      For hibernate we unregister all events and re-register them on restore,
      in case we restored with the SDE code loaded at a different address
      (e.g. KASLR).
      
      Add all the notifiers necessary to do this. We only support shared events
      so all events are left registered and enabled over CPU hotplug.
      Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Signed-off-by: James Morse <james.morse@arm.com>
      [catalin.marinas@arm.com: added CPU_PM_ENTER_FAILED case]
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • arm64: kernel: Add arch-specific SDEI entry code and CPU masking · f5df2696
      James Morse authored
      The Software Delegated Exception Interface (SDEI) is an ARM standard
      for registering callbacks from the platform firmware into the OS.
      This is typically used to implement RAS notifications.
      
      Such notifications enter the kernel at the registered entry-point
      with the register values of the interrupted CPU context. Because this
      is not a CPU exception, it cannot reuse the existing entry code
      (crucially, we don't implicitly know which exception level we interrupted).
      
      Add the entry point to entry.S to set us up for calling into C code. If
      the event interrupted code that had interrupts masked, we always return
      to that location. Otherwise we pretend this was an IRQ, and use SDEI's
      complete_and_resume call to return to vbar_el1 + offset.
      
      This allows the kernel to deliver signals to user space processes. For
      KVM this triggers the world switch, a quick spin round vcpu_run, then
      back into the guest, unless there are pending signals.
      
      Add sdei_mask_local_cpu() calls to the smp_send_stop() code, this covers
      the panic() code-path, which doesn't invoke cpuhotplug notifiers.
      
      Because we can interrupt entry-from/exit-to another EL, we can't trust the
      value in sp_el0 or x29, even if we interrupted the kernel. In this case
      the code in entry.S will save/restore sp_el0 and use the value in
      __entry_task.
      
      When we have VMAP stacks we can interrupt the stack-overflow test, which
      stirs x0 into sp, meaning we have to have our own VMAP stacks. For now
      these are allocated when we probe the interface. Future patches will add
      refcounting hooks to allow the arch code to allocate them lazily.
      Signed-off-by: James Morse <james.morse@arm.com>
      Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • arm64: uaccess: Add PAN helper · e1281f56
      James Morse authored
      Add __uaccess_{en,dis}able_hw_pan() helpers to set/clear the PSTATE.PAN
      bit.
      Signed-off-by: James Morse <james.morse@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • arm64: Add vmap_stack header file · ed8b20d4
      James Morse authored
      Today the arm64 arch code allocates an extra IRQ stack per-cpu. If we
      also have SDEI and VMAP stacks we need two extra per-cpu VMAP stacks.
      
      Move the VMAP stack allocation out to a helper in a new header file.
      This avoids missing THREADINFO_GFP, or getting the all-important alignment
      wrong.
      Signed-off-by: James Morse <james.morse@arm.com>
      Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
      Reviewed-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • firmware: arm_sdei: Add driver for Software Delegated Exceptions · ad6eb31e
      James Morse authored
      The Software Delegated Exception Interface (SDEI) is an ARM standard
      for registering callbacks from the platform firmware into the OS.
      This is typically used to implement firmware notifications (such as
      firmware-first RAS) or to deliver an IRQ that has been promoted to a
      firmware-assisted NMI.
      
      Add the code for detecting the SDEI version and the framework for
      registering and unregistering events. Subsequent patches will add the
      arch-specific backend code and the necessary power management hooks.
      
      Only shared events are supported. Power management, private events and
      discovery for ACPI systems will be added by later patches.
      Signed-off-by: James Morse <james.morse@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • Docs: dt: add devicetree binding for describing arm64 SDEI firmware · 86f04f64
      James Morse authored
      The Software Delegated Exception Interface (SDEI) is an ARM standard
      for registering callbacks from the platform firmware into the OS.
      This is typically used to implement RAS notifications, or notifications
      from an IRQ that has been promoted to a firmware-assisted NMI.
      
      Add a new devicetree binding to describe the SDE firmware interface.
      Signed-off-by: James Morse <james.morse@arm.com>
      Acked-by: Rob Herring <robh@kernel.org>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • KVM: arm64: Stop save/restoring host tpidr_el1 on VHE · 1f742679
      James Morse authored
      Now that a VHE host uses tpidr_el2 for the cpu offset we no longer
      need KVM to save/restore tpidr_el1. Move this from the 'common' code
      into the non-vhe code. While we're at it, on VHE we don't need to
      save the ELR or SPSR as kernel_entry in entry.S will have pushed these
      onto the kernel stack, and will restore them from there. Move these
      to the non-vhe code as we need them to get back to the host.
      
      Finally remove the always-copy-tpidr we hid in the stage2 setup
      code. cpufeature's enable callback will do this for VHE, so we only
      need KVM to do it for non-vhe. Add the copy into kvm-init instead.
      Signed-off-by: James Morse <james.morse@arm.com>
      Reviewed-by: Christoffer Dall <cdall@linaro.org>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • arm64: alternatives: use tpidr_el2 on VHE hosts · 6d99b689
      James Morse authored
      Now that KVM uses tpidr_el2 in the same way as Linux's cpu_offset in
      tpidr_el1, merge the two. This saves KVM from save/restoring tpidr_el1
      on VHE hosts, and allows future code to blindly access per-cpu variables
      without triggering world-switch.
      Signed-off-by: James Morse <james.morse@arm.com>
      Reviewed-by: Christoffer Dall <cdall@linaro.org>
      Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • KVM: arm64: Change hyp_panic()s dependency on tpidr_el2 · c97e166e
      James Morse authored
      Make tpidr_el2 a cpu-offset for per-cpu variables in the same way the
      host uses tpidr_el1. This lets tpidr_el{1,2} have the same value, and
      on VHE they can be the same register.
      
      KVM calls hyp_panic() when anything unexpected happens. This may occur
      while a guest owns the EL1 registers. KVM stashes the vcpu pointer in
      tpidr_el2, which it uses to find the host context in order to restore
      the host EL1 registers before parachuting into the host's panic().
      
      The host context is a struct kvm_cpu_context allocated in the per-cpu
      area, and mapped to hyp. Given the per-cpu offset for this CPU, this is
      easy to find. Change hyp_panic() to take a pointer to the
      struct kvm_cpu_context. Wrap these calls with an asm function that
      retrieves the struct kvm_cpu_context from the host's per-cpu area.
      
      Copy the per-cpu offset from the host's tpidr_el1 into tpidr_el2 during
      kvm init. (Later patches will make this unnecessary for VHE hosts.)
      
      We print out the vcpu pointer as part of the panic message. Add a back
      reference to the 'running vcpu' in the host cpu context to preserve this.
      Signed-off-by: James Morse <james.morse@arm.com>
      Reviewed-by: Christoffer Dall <cdall@linaro.org>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
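The idea of locating a per-cpu structure from nothing but a per-cpu offset can be sketched in user space as below. This is a model only: the array of areas, the structure layout and the function names are all hypothetical, and a plain integer stands in for the value that would be held in tpidr_el2.

```c
#include <stdint.h>

#define NR_CPUS_SKETCH 4

/* Hypothetical host context with the back reference to the running vcpu. */
struct host_ctxt_sketch {
    uint64_t regs[4];
    void *running_vcpu;   /* kept so a panic message can still print it */
};

/* Hypothetical per-cpu area: one copy of every per-cpu variable per CPU. */
struct percpu_area_sketch {
    char other_vars[64];
    struct host_ctxt_sketch host_state;
};

static struct percpu_area_sketch areas[NR_CPUS_SKETCH];

/* Distance from the reference copy (CPU 0 here) to this CPU's copy. */
static uintptr_t per_cpu_offset_sketch(unsigned int cpu)
{
    return (uintptr_t)&areas[cpu] - (uintptr_t)&areas[0];
}

/*
 * Given only the offset (the value a register such as tpidr_el2 would
 * hold), find this CPU's host context: reference address plus offset.
 */
static struct host_ctxt_sketch *this_cpu_host_ctxt(uintptr_t offset)
{
    return (struct host_ctxt_sketch *)
        ((uintptr_t)&areas[0].host_state + offset);
}
```

Because the lookup needs no vcpu pointer, a panic path built this way can still find the host registers while a guest owns the EL1 state.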
    • KVM: arm/arm64: Convert kvm_host_cpu_state to a static per-cpu allocation · 36989e7f
      James Morse authored
      kvm_host_cpu_state is a per-cpu allocation made from kvm_arch_init()
      used to store the host EL1 registers when KVM switches to a guest.
      
      Make it easier for ASM to generate pointers into this per-cpu memory
      by making it a static allocation.
      Signed-off-by: James Morse <james.morse@arm.com>
      Acked-by: Christoffer Dall <cdall@linaro.org>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • KVM: arm64: Store vcpu on the stack during __guest_enter() · 32b03d10
      James Morse authored
      KVM uses tpidr_el2 as its private vcpu register, which makes sense for
      non-vhe world switch as only KVM can access this register. This means
      vhe Linux has to use tpidr_el1, which KVM has to save/restore as part
      of the host context.
      
      If the SDEI handler code runs behind KVM's back, it mustn't access any
      per-cpu variables. To allow this on systems with vhe we need to make
      the host use tpidr_el2, saving KVM from save/restoring it.
      
      __guest_enter() stores the host_ctxt on the stack; do the same with
      the vcpu.
      Signed-off-by: James Morse <james.morse@arm.com>
      Reviewed-by: Christoffer Dall <cdall@linaro.org>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>