1. 15 Dec, 2020 3 commits
    • arm64: Warn the user when a small VA_BITS value wastes memory · 31f80a4e
      Marc Zyngier authored
      The memblock code ignores any memory that doesn't fit in the
      linear mapping. In order to preserve the distance between two physical
      memory locations and their mappings in the linear map, any hole between
      two memory regions occupies the same space in the linear map.
      
      On most systems, this is hardly a problem (the memory banks are close
      together, and VA_BITS represents a large space compared to the available
      memory *and* the potential gaps).
      
      On NUMA systems, things are quite different: the gaps between the
      memory nodes can be pretty large compared to the memory size itself,
      and the range from memblock_start_of_DRAM() to memblock_end_of_DRAM()
      can exceed the space described by VA_BITS.
      
      Unfortunately, we're not very good at making this obvious to the user,
      and on a D05 system (two sockets and 4 nodes with 64GB each)
      accidentally configured with a 39-bit VA, we display something like this:
      
      [    0.000000] NUMA: NODE_DATA [mem 0x1ffbffe100-0x1ffbffffff]
      [    0.000000] NUMA: NODE_DATA [mem 0x2febfc1100-0x2febfc2fff]
      [    0.000000] NUMA: Initmem setup node 2 [<memory-less node>]
      [    0.000000] NUMA: NODE_DATA [mem 0x2febfbf200-0x2febfc10ff]
      [    0.000000] NUMA: NODE_DATA(2) on node 1
      [    0.000000] NUMA: Initmem setup node 3 [<memory-less node>]
      [    0.000000] NUMA: NODE_DATA [mem 0x2febfbd300-0x2febfbf1ff]
      [    0.000000] NUMA: NODE_DATA(3) on node 1
      
      which isn't very explicit, and doesn't tell the user why 128GB
      have suddenly disappeared.
      
      Let's add a warning message telling the user that memory has been
      truncated, and offer a potential solution (bumping VA_BITS up).
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      Acked-by: Mark Rutland <mark.rutland@arm.com>
      Link: https://lore.kernel.org/r/20201215152918.1511108-1-maz@kernel.org
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      31f80a4e
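      A minimal sketch of the kind of check this commit describes, assuming
      placement in the arm64 memblock init path (not the verbatim patch; the
      variable naming follows arch/arm64/mm/init.c conventions):

          /* Sketch: detect a DRAM span that the linear map cannot cover. */
          u64 linear_region_size = BIT(vabits_actual - 1); /* half the kernel VA space */
          u64 dram_span = memblock_end_of_DRAM() - memblock_start_of_DRAM();

          if (dram_span > linear_region_size)
                  pr_warn("memory beyond the linear mapping will be truncated; consider a larger VA_BITS\n");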
    • arm64: entry: suppress W=1 prototype warnings · bf023e76
      Mark Rutland authored
      When building with W=1, GCC complains that we haven't defined prototypes
      for a number of non-static functions in entry-common.c:
      
      | arch/arm64/kernel/entry-common.c:203:25: warning: no previous prototype for 'el1_sync_handler' [-Wmissing-prototypes]
      |   203 | asmlinkage void noinstr el1_sync_handler(struct pt_regs *regs)
      |       |                         ^~~~~~~~~~~~~~~~
      | arch/arm64/kernel/entry-common.c:377:25: warning: no previous prototype for 'el0_sync_handler' [-Wmissing-prototypes]
      |   377 | asmlinkage void noinstr el0_sync_handler(struct pt_regs *regs)
      |       |                         ^~~~~~~~~~~~~~~~
      | arch/arm64/kernel/entry-common.c:447:25: warning: no previous prototype for 'el0_sync_compat_handler' [-Wmissing-prototypes]
      |   447 | asmlinkage void noinstr el0_sync_compat_handler(struct pt_regs *regs)
      |       |                         ^~~~~~~~~~~~~~~~~~~~~~~
      
      ... and so automated build systems using W=1 end up sending a number of
      emails, despite this not being a real problem as the only callers are in
      entry.S where prototypes cannot matter.
      
      For similar cases in entry-common.c we added prototypes to
      asm/exception.h, so let's do the same thing here for consistency.
      
      Note that there are a number of other warnings printed with W=1, both
      under arch/arm64 and in core code, and this patch only addresses the
      cases in entry-common.c. Automated build systems typically filter these
      warnings such that they're only reported when changes are made nearby,
      so we don't need to solve them all at once.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20201214113353.44417-1-mark.rutland@arm.com
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      bf023e76
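      Per the message, the fix amounts to declaring the handlers in
      asm/exception.h so that entry-common.c picks up the prototypes; the
      signatures come straight from the warnings quoted above:

          /* arch/arm64/include/asm/exception.h */
          asmlinkage void el1_sync_handler(struct pt_regs *regs);
          asmlinkage void el0_sync_handler(struct pt_regs *regs);
          asmlinkage void el0_sync_compat_handler(struct pt_regs *regs);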
    • arm64: topology: Drop the useless update to per-cpu cycles · 51550a48
      Viresh Kumar authored
      The previous call to update_freq_counters_refs() has already updated the
      per-cpu variables, don't overwrite them with the same value again.
      
      Fixes: 4b9cf23c ("arm64: wrap and generalise counter read functions")
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com>
      Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
      Link: https://lore.kernel.org/r/7a171f710cdc0f808a2bfbd7db839c0d265527e7.1607579234.git.viresh.kumar@linaro.org
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      51550a48
  2. 09 Dec, 2020 8 commits
    • Merge remote-tracking branch 'arm64/for-next/fixes' into for-next/core · d8897975
      Catalin Marinas authored
      * arm64/for-next/fixes: (26 commits)
        arm64: mte: fix prctl(PR_GET_TAGGED_ADDR_CTRL) if TCF0=NONE
        arm64: mte: Fix typo in macro definition
        arm64: entry: fix EL1 debug transitions
        arm64: entry: fix NMI {user, kernel}->kernel transitions
        arm64: entry: fix non-NMI kernel<->kernel transitions
        arm64: ptrace: prepare for EL1 irq/rcu tracking
        arm64: entry: fix non-NMI user<->kernel transitions
        arm64: entry: move el1 irq/nmi logic to C
        arm64: entry: prepare ret_to_user for function call
        arm64: entry: move enter_from_user_mode to entry-common.c
        arm64: entry: mark entry code as noinstr
        arm64: mark idle code as noinstr
        arm64: syscall: exit userspace before unmasking exceptions
        arm64: pgtable: Ensure dirty bit is preserved across pte_wrprotect()
        arm64: pgtable: Fix pte_accessible()
        ACPI/IORT: Fix doc warnings in iort.c
        arm64/fpsimd: add <asm/insn.h> to <asm/kprobes.h> to fix fpsimd build
        arm64: cpu_errata: Apply Erratum 845719 to KRYO2XX Silver
        arm64: proton-pack: Add KRYO2XX silver CPUs to spectre-v2 safe-list
        arm64: kpti: Add KRYO2XX gold/silver CPU cores to kpti safelist
        ...
      
      # Conflicts:
      #	arch/arm64/include/asm/exception.h
      #	arch/arm64/kernel/sdei.c
      d8897975
    • Merge remote-tracking branch 'arm64/for-next/scs' into for-next/core · d45056ad
      Catalin Marinas authored
      * arm64/for-next/scs:
        arm64: sdei: Push IS_ENABLED() checks down to callee functions
        arm64: scs: use vmapped IRQ and SDEI shadow stacks
        scs: switch to vmapped shadow stacks
      d45056ad
    • Merge remote-tracking branch 'arm64/for-next/perf' into for-next/core · d8602f8b
      Catalin Marinas authored
      * arm64/for-next/perf:
        perf/imx_ddr: Add system PMU identifier for userspace
        bindings: perf: imx-ddr: add compatible string
        arm64: Fix build failure when HARDLOCKUP_DETECTOR_PERF is enabled
        arm64: Enable perf events based hard lockup detector
        perf/imx_ddr: Add stop event counters support for i.MX8MP
        perf/smmuv3: Support sysfs identifier file
        drivers/perf: hisi: Add identifier sysfs file
        perf: remove duplicate check on fwnode
        driver/perf: Add PMU driver for the ARM DMC-620 memory controller
      d8602f8b
    • Merge branch 'for-next/misc' into for-next/core · ba4259a6
      Catalin Marinas authored
      * for-next/misc:
        : Miscellaneous patches
        arm64: vmlinux.lds.S: Drop redundant *.init.rodata.*
        kasan: arm64: set TCR_EL1.TBID1 when enabled
        arm64: mte: optimize asynchronous tag check fault flag check
        arm64/mm: add fallback option to allocate virtually contiguous memory
        arm64/smp: Drop the macro S(x,s)
        arm64: consistently use reserved_pg_dir
        arm64: kprobes: Remove redundant kprobe_step_ctx
      
      # Conflicts:
      #	arch/arm64/kernel/vmlinux.lds.S
      ba4259a6
    • Merge branch 'for-next/uaccess' into for-next/core · e0f7a8d5
      Catalin Marinas authored
      * for-next/uaccess:
        : uaccess routines clean-up and set_fs() removal
        arm64: mark __system_matches_cap as __maybe_unused
        arm64: uaccess: remove vestigial UAO support
        arm64: uaccess: remove redundant PAN toggling
        arm64: uaccess: remove addr_limit_user_check()
        arm64: uaccess: remove set_fs()
        arm64: uaccess cleanup macro naming
        arm64: uaccess: split user/kernel routines
        arm64: uaccess: refactor __{get,put}_user
        arm64: uaccess: simplify __copy_user_flushcache()
        arm64: uaccess: rename privileged uaccess routines
        arm64: sdei: explicitly simulate PAN/UAO entry
        arm64: sdei: move uaccess logic to arch/arm64/
        arm64: head.S: always initialize PSTATE
        arm64: head.S: cleanup SCTLR_ELx initialization
        arm64: head.S: rename el2_setup -> init_kernel_el
        arm64: add C wrappers for SET_PSTATE_*()
        arm64: ensure ERET from kthread is illegal
      e0f7a8d5
    • Merge branches 'for-next/kvm-build-fix', 'for-next/va-refactor',... · 3c09ec59
      Catalin Marinas authored
      Merge branches 'for-next/kvm-build-fix', 'for-next/va-refactor', 'for-next/lto', 'for-next/mem-hotplug', 'for-next/cppc-ffh', 'for-next/pad-image-header', 'for-next/zone-dma-default-32-bit', 'for-next/signal-tag-bits' and 'for-next/cmdline-extended' into for-next/core
      
      * for-next/kvm-build-fix:
        : Fix KVM build issues with 64K pages
        KVM: arm64: Fix build error in user_mem_abort()
      
      * for-next/va-refactor:
        : VA layout changes
        arm64: mm: don't assume struct page is always 64 bytes
        Documentation/arm64: fix RST layout of memory.rst
        arm64: mm: tidy up top of kernel VA space
        arm64: mm: make vmemmap region a projection of the linear region
        arm64: mm: extend linear region for 52-bit VA configurations
      
      * for-next/lto:
        : Upgrade READ_ONCE() to RCpc acquire on arm64 with LTO
        arm64: lto: Strengthen READ_ONCE() to acquire when CONFIG_LTO=y
        arm64: alternatives: Remove READ_ONCE() usage during patch operation
        arm64: cpufeatures: Add capability for LDAPR instruction
        arm64: alternatives: Split up alternative.h
        arm64: uaccess: move uao_* alternatives to asm-uaccess.h
      
      * for-next/mem-hotplug:
        : Memory hotplug improvements
        arm64/mm/hotplug: Ensure early memory sections are all online
        arm64/mm/hotplug: Enable MEM_OFFLINE event handling
        arm64/mm/hotplug: Register boot memory hot remove notifier earlier
        arm64: mm: account for hotplug memory when randomizing the linear region
      
      * for-next/cppc-ffh:
        : Add CPPC FFH support using arm64 AMU counters
        arm64: abort counter_read_on_cpu() when irqs_disabled()
        arm64: implement CPPC FFH support using AMUs
        arm64: split counter validation function
        arm64: wrap and generalise counter read functions
      
      * for-next/pad-image-header:
        : Pad Image header to 64KB and unmap it
        arm64: head: tidy up the Image header definition
        arm64/head: avoid symbol names pointing into first 64 KB of kernel image
        arm64: omit [_text, _stext) from permanent kernel mapping
      
      * for-next/zone-dma-default-32-bit:
        : Default to 32-bit wide ZONE_DMA (previously reduced to 1GB for RPi4)
        of: unittest: Fix build on architectures without CONFIG_OF_ADDRESS
        mm: Remove examples from enum zone_type comment
        arm64: mm: Set ZONE_DMA size based on early IORT scan
        arm64: mm: Set ZONE_DMA size based on devicetree's dma-ranges
        of: unittest: Add test for of_dma_get_max_cpu_address()
        of/address: Introduce of_dma_get_max_cpu_address()
        arm64: mm: Move zone_dma_bits initialization into zone_sizes_init()
        arm64: mm: Move reserve_crashkernel() into mem_init()
        arm64: Force NO_BLOCK_MAPPINGS if crashkernel reservation is required
        arm64: Ignore any DMA offsets in the max_zone_phys() calculation
      
      * for-next/signal-tag-bits:
        : Expose the FAR_EL1 tag bits in siginfo
        arm64: expose FAR_EL1 tag bits in siginfo
        signal: define the SA_EXPOSE_TAGBITS bit in sa_flags
        signal: define the SA_UNSUPPORTED bit in sa_flags
        arch: provide better documentation for the arch-specific SA_* flags
        signal: clear non-uapi flag bits when passing/returning sa_flags
        arch: move SA_* definitions to generic headers
        parisc: start using signal-defs.h
        parisc: Drop parisc special case for __sighandler_t
      
      * for-next/cmdline-extended:
        : Add support for CONFIG_CMDLINE_EXTENDED
        arm64: Extend the kernel command line from the bootloader
        arm64: kaslr: Refactor early init command line parsing
      3c09ec59
    • perf/imx_ddr: Add system PMU identifier for userspace · 881b0520
      Joakim Zhang authored
      The DDR Perf driver for i.MX8 is a system PMU whose AXI ID differs from
      SoC to SoC. Expose a system PMU identifier for userspace, reported via
      /sys/bus/event_source/devices/<PMU DEVICE>/identifier.
      Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
      Reviewed-by: John Garry <john.garry@huawei.com>
      Link: https://lore.kernel.org/r/20201130114202.26057-3-qiangqing.zhang@nxp.com
      Signed-off-by: Will Deacon <will@kernel.org>
      881b0520
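      Userspace reads the identifier like any other sysfs attribute; a small
      sketch (the device name imx8_ddr0 is assumed for illustration):

          #include <stdio.h>

          int main(void)
          {
                  char id[64] = "";
                  /* Hypothetical device name; list /sys/bus/event_source/devices/ to find yours. */
                  FILE *f = fopen("/sys/bus/event_source/devices/imx8_ddr0/identifier", "r");

                  if (f && fgets(id, sizeof(id), f))
                          printf("PMU identifier: %s", id);
                  if (f)
                          fclose(f);
                  return 0;
          }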
  3. 04 Dec, 2020 1 commit
  4. 03 Dec, 2020 2 commits
  5. 02 Dec, 2020 17 commits
    • arm64: uaccess: remove vestigial UAO support · 1517c4fa
      Mark Rutland authored
      Now that arm64 no longer uses UAO, remove the vestigial feature detection
      code and Kconfig text.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: James Morse <james.morse@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20201202131558.39270-13-mark.rutland@arm.com
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      1517c4fa
    • arm64: uaccess: remove redundant PAN toggling · 7cf283c7
      Mark Rutland authored
      Some code (e.g. futex) needs to make privileged accesses to userspace
      memory, and uses uaccess_{enable,disable}_privileged() in order to
      permit this. All other uaccess primitives use LDTR/STTR, and never need
      to toggle PAN.
      
      Remove the redundant PAN toggling.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: James Morse <james.morse@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20201202131558.39270-12-mark.rutland@arm.com
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      7cf283c7
    • arm64: uaccess: remove addr_limit_user_check() · b5a5a01d
      Mark Rutland authored
      Now that set_fs() is gone, addr_limit_user_check() is redundant. Remove
      the checks and associated thread flag.
      
      To ensure that _TIF_WORK_MASK can be used as an immediate value in an
      AND instruction (as it is in `ret_to_user`), TIF_MTE_ASYNC_FAULT is
      renumbered to keep the constituent bits of _TIF_WORK_MASK contiguous.
      
      There should be no functional change as a result of this patch.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: James Morse <james.morse@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20201202131558.39270-11-mark.rutland@arm.com
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      b5a5a01d
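      The contiguity requirement exists because AArch64 logical instructions
      can only encode bitmask immediates; a comment-only sketch of the
      constraint (bit assignments assumed):

          /*
           * With TIF_MTE_ASYNC_FAULT renumbered so that all work bits are
           * adjacent, _TIF_WORK_MASK is a single contiguous run of set bits,
           * which is encodable as a logical-instruction immediate, e.g.:
           *
           *     and x2, x19, #_TIF_WORK_MASK    // no extra mov needed
           */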
    • arm64: uaccess: remove set_fs() · 3d2403fd
      Mark Rutland authored
      Now that the uaccess primitives don't take addr_limit into account, we
      have no need to manipulate this via set_fs() and get_fs(). Remove
      support for these, along with some infrastructure this renders
      redundant.
      
      We no longer need to flip UAO to access kernel memory under KERNEL_DS,
      and head.S unconditionally clears UAO for all kernel configurations via
      an ERET in init_kernel_el. Thus, we don't need to dynamically flip UAO,
      nor do we need to context-switch it. However, we still need to adjust
      PAN during SDEI entry.
      
      Masking of __user pointers no longer needs to use the dynamic value of
      addr_limit, and can use a constant derived from the maximum possible
      userspace task size. A new TASK_SIZE_MAX constant is introduced for
      this, which is also used by core code. In configurations supporting
      52-bit VAs, this may include a region of unusable VA space above a
      48-bit TTBR0 limit, but never includes any portion of TTBR1.
      
      Note that TASK_SIZE_MAX is an exclusive limit, while USER_DS and
      KERNEL_DS were inclusive limits, and is converted to a mask by
      subtracting one.
      
      As the SDEI entry code repurposes the otherwise unnecessary
      pt_regs::orig_addr_limit field to store the TTBR1 of the interrupted
      context, for now we rename that to pt_regs::sdei_ttbr1. In future we can
      consider factoring that out.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Acked-by: James Morse <james.morse@arm.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20201202131558.39270-10-mark.rutland@arm.com
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      3d2403fd
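      A sketch of the exclusive-limit-to-mask conversion mentioned above (the
      macro name is hypothetical):

          /*
           * TASK_SIZE_MAX is an exclusive limit, so the inclusive mask is one
           * less. Note that real pointer sanitisation must reject (e.g. zero)
           * out-of-range pointers rather than simply AND-wrap them back into
           * the user range.
           */
          #define USER_PTR_MAX    (TASK_SIZE_MAX - 1)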
    • arm64: uaccess cleanup macro naming · 7b90dc40
      Mark Rutland authored
      Now that the uaccess primitives use LDTR/STTR unconditionally, the
      uao_{ldp,stp,user_alternative} asm macros are misnamed, and have a
      redundant argument. Let's remove the redundant argument and rename these
      to user_{ldp,stp,ldst} respectively to clean this up.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Reviewed-by: Robin Murphy <robin.murphy@arm.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: James Morse <james.morse@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20201202131558.39270-9-mark.rutland@arm.com
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      7b90dc40
    • arm64: uaccess: split user/kernel routines · fc703d80
      Mark Rutland authored
      This patch separates arm64's user and kernel memory access primitives
      into distinct routines, adding new __{get,put}_kernel_nofault() helpers
      to access kernel memory, upon which core code builds larger copy
      routines.
      
      The kernel access routines (using LDR/STR) are not affected by PAN (when
      legitimately accessing kernel memory), nor are they affected by UAO.
      Switching to KERNEL_DS may set UAO, but this does not adversely affect
      the kernel access routines.
      
      The user access routines (using LDTR/STTR) are not affected by PAN (when
      legitimately accessing user memory), but are affected by UAO. As these
      are only legitimate to use under USER_DS with UAO clear, this should not
      be problematic.
      
      Routines performing atomics to user memory (futex and deprecated
      instruction emulation) still need to transiently clear PAN, and these
      are left as-is. These are never used on kernel memory.
      
      Subsequent patches will refactor the uaccess helpers to remove redundant
      code, and will also remove the redundant PAN/UAO manipulation.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: James Morse <james.morse@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20201202131558.39270-8-mark.rutland@arm.com
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      fc703d80
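      A sketch of how the new helper is used; __get_kernel_nofault() takes a
      goto label for the fault path (the wrapper below is hypothetical and
      omits the pagefault_disable() bookkeeping that copy_from_kernel_nofault()
      performs):

          static long read_kernel_word_nofault(long *dst, const long *src)
          {
                  long v;

                  __get_kernel_nofault(&v, src, long, Efault); /* jumps on fault */
                  *dst = v;
                  return 0;
          Efault:
                  return -EFAULT;
          }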
    • arm64: uaccess: refactor __{get,put}_user · f253d827
      Mark Rutland authored
      As a step towards implementing __{get,put}_kernel_nofault(), this patch
      splits most user-memory specific logic out of __{get,put}_user(), with
      the memory access and fault handling in new __{raw_get,put}_mem()
      helpers.
      
      For now the LDR/LDTR patching is left within the *get_mem() helpers, and
      will be removed in a subsequent patch.
      
      There should be no functional change as a result of this patch.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: James Morse <james.morse@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20201202131558.39270-7-mark.rutland@arm.com
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      f253d827
    • arm64: uaccess: simplify __copy_user_flushcache() · 9e94fdad
      Mark Rutland authored
      Currently __copy_user_flushcache() open-codes raw_copy_from_user(), and
      doesn't use uaccess_mask_ptr() on the user address. Let's have it call
      raw_copy_from_user(), which is both a simplification and ensures that
      user pointers are masked under speculation.
      
      There should be no functional change as a result of this patch.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Reviewed-by: Robin Murphy <robin.murphy@arm.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20201202131558.39270-6-mark.rutland@arm.com
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      9e94fdad
    • arm64: uaccess: rename privileged uaccess routines · 923e1e7d
      Mark Rutland authored
      We currently have many uaccess_*{enable,disable}*() variants, which
      subsequent patches will cut down as part of removing set_fs() and
      friends. Once this simplification is made, most uaccess routines will
      only need to ensure that the user page tables are mapped in TTBR0, as is
      currently dealt with by uaccess_ttbr0_{enable,disable}().
      
      The existing uaccess_{enable,disable}() routines ensure that user page
      tables are mapped in TTBR0, and also disable PAN protections, which is
      necessary to be able to use atomics on user memory, but also permits
      unrelated privileged accesses to access user memory.
      
      As preparatory step, let's rename uaccess_{enable,disable}() to
      uaccess_{enable,disable}_privileged(), highlighting this caveat and
      discouraging wider misuse. Subsequent patches can reuse the
      uaccess_{enable,disable}() naming for the common case of ensuring the
      user page tables are mapped in TTBR0.
      
      There should be no functional change as a result of this patch.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: James Morse <james.morse@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20201202131558.39270-5-mark.rutland@arm.com
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      923e1e7d
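      A sketch of the intended, deliberately narrow usage after the rename,
      e.g. for a futex-style atomic on user memory (the atomic helper is
      hypothetical):

          static int user_atomic_op(u32 __user *uaddr)
          {
                  int ret;

                  uaccess_enable_privileged();  /* clear PAN: privileged LDXR/STXR may touch user memory */
                  ret = do_user_atomic_sequence(uaddr); /* hypothetical LDXR/STXR loop */
                  uaccess_disable_privileged(); /* restore PAN protection */

                  return ret;
          }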
    • arm64: sdei: explicitly simulate PAN/UAO entry · 2376e75c
      Mark Rutland authored
      In preparation for removing addr_limit and set_fs() we must decouple the
      SDEI PAN/UAO manipulation from the uaccess code, and explicitly
      reinitialize these as required.
      
      SDEI enters the kernel with a non-architectural exception, and prior to
      the most recent revision of the specification (ARM DEN 0054B), PSTATE
      bits (e.g. PAN, UAO) are not manipulated in the same way as for
      architectural exceptions. Notably, older versions of the spec can be
      read ambiguously as to whether PSTATE bits are inherited unchanged from
      the interrupted context or whether they are generated from scratch, with
      TF-A doing the latter.
      
      We have three cases to consider:
      
      1) The existing TF-A implementation of SDEI will clear PAN and clear UAO
         (along with other bits in PSTATE) when delivering an SDEI exception.
      
      2) In theory, implementations of SDEI prior to revision B could inherit
         PAN and UAO (along with other bits in PSTATE) unchanged from the
         interrupted context. However, in practice such implementations do not
         exist.
      
      3) Going forward, new implementations of SDEI must clear UAO, and
         depending on SCTLR_ELx.SPAN must either inherit or set PAN.
      
      As we can ignore (2) we can assume that upon SDEI entry, UAO is always
      clear, though PAN may be clear, inherited, or set per SCTLR_ELx.SPAN.
      Therefore, we must explicitly initialize PAN, but do not need to do
      anything for UAO.
      
      Considering what we need to do:
      
      * When set_fs() is removed, force_uaccess_begin() will have no HW
        side-effects. As this only clears UAO, which we can assume has already
        been cleared upon entry, this is not a problem. We do not need to add
        code to manipulate UAO explicitly.
      
      * PAN may be cleared upon entry (in case 1 above), so where a kernel is
        built to use PAN and this is supported by all CPUs, the kernel must
        set PAN upon entry to ensure expected behaviour.
      
      * PAN may be inherited from the interrupted context (in case 3 above),
        and so where a kernel is not built to use PAN or where PAN support is
        not uniform across CPUs, the kernel must clear PAN to ensure expected
        behaviour.
      
      This patch reworks the SDEI code accordingly, explicitly setting PAN to
      the expected state in all cases. To cater for the cases where the kernel
      does not use PAN or this is not uniformly supported by hardware we add a
      new cpu_has_pan() helper which can be used regardless of whether the
      kernel is built to use PAN.
      
      The existing system_uses_ttbr0_pan() is redefined in terms of
      system_uses_hw_pan() both for clarity and as a minor optimization when
      HW PAN is not selected.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Reviewed-by: James Morse <james.morse@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20201202131558.39270-3-mark.rutland@arm.com
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      2376e75c
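      A sketch of the new cpu_has_pan() helper described above, reading the
      CPU's ID register directly so it works whether or not the kernel is
      built to use PAN (exact body assumed):

          static inline bool cpu_has_pan(void)
          {
                  u64 mmfr1 = read_cpuid(ID_AA64MMFR1_EL1);

                  return cpuid_feature_extract_unsigned_field(mmfr1,
                                                              ID_AA64MMFR1_PAN_SHIFT);
          }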
    • arm64: sdei: move uaccess logic to arch/arm64/ · a0ccf2ba
      Mark Rutland authored
      The SDEI support code is split across arch/arm64/ and drivers/firmware/,
      largely so that the arch-specific portions are under
      arch/arm64, and the management logic is under drivers/firmware/.
      However, exception entry fixups are currently under drivers/firmware.
      
      Let's move the exception entry fixups under arch/arm64/. This
      de-clutters the management logic, and puts all the arch-specific
      portions in one place. Doing this also allows the fixups to be applied
      earlier, so things like PAN and UAO will be in a known good state before
      we run other logic. This will also make subsequent refactoring easier.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Reviewed-by: James Morse <james.morse@arm.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20201202131558.39270-2-mark.rutland@arm.com
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      a0ccf2ba
    • arm64: head.S: always initialize PSTATE · d87a8e65
      Mark Rutland authored
      As with SCTLR_ELx and other control registers, some PSTATE bits are
      UNKNOWN out-of-reset, and we may not be able to rely on hardware or
      firmware to initialize them to our liking prior to entry to the kernel,
      e.g. in the primary/secondary boot paths and return from idle/suspend.
      
      It would be more robust (and easier to reason about) if we consistently
      initialized PSTATE to a default value, as we do with control registers.
      This will ensure that the kernel is not adversely affected by bits it is
      not aware of, e.g. when support for a feature such as PAN/UAO is
      disabled.
      
      This patch ensures that PSTATE is consistently initialized at boot time
      via an ERET. This is not intended to relax the existing requirements
      (e.g. DAIF bits must still be set prior to entering the kernel). For
      features detected dynamically (which may require system-wide support),
      it is still necessary to subsequently modify PSTATE.
      
      As ERET is not always a Context Synchronization Event, an ISB is placed
      before each exception return to ensure updates to control registers have
      taken effect. This handles the kernel being entered with SCTLR_ELx.EOS
      clear (or any future control bits being in an UNKNOWN state).
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: James Morse <james.morse@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20201113124937.20574-6-mark.rutland@arm.com
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      d87a8e65
    • arm64: head.S: cleanup SCTLR_ELx initialization · 2ffac9e3
      Mark Rutland authored
      Let's make SCTLR_ELx initialization a bit clearer by using meaningful
      names for the initialization values, following the same scheme for
      SCTLR_EL1 and SCTLR_EL2.
      
      These definitions will be used more widely in subsequent patches.
      
      There should be no functional change as a result of this patch.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: James Morse <james.morse@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20201113124937.20574-5-mark.rutland@arm.com
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      2ffac9e3
    • arm64: head.S: rename el2_setup -> init_kernel_el · ecbb11ab
      Mark Rutland authored
      For a while now el2_setup has performed some basic initialization of EL1
      even when the kernel is booted at EL1, so the name is a little
      misleading. Further, some comments are stale as with VHE it doesn't drop
      the CPU to EL1.
      
      To clarify things, rename el2_setup to init_kernel_el, and update
      comments to be clearer as to the function's purpose.
      
      There should be no functional change as a result of this patch.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: James Morse <james.morse@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20201113124937.20574-4-mark.rutland@arm.com
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      ecbb11ab
    • arm64: add C wrappers for SET_PSTATE_*() · 515d5c8a
      Mark Rutland authored
      To make callsites easier to read, add trivial C wrappers for the
      SET_PSTATE_*() helpers, and convert trivial uses over to these. The new
      wrappers will be used further in subsequent patches.
      
      There should be no functional change as a result of this patch.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: James Morse <james.morse@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20201113124937.20574-3-mark.rutland@arm.com
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      515d5c8a
    • arm64: ensure ERET from kthread is illegal · f80d0340
      Mark Rutland authored
      For consistency, all tasks have a pt_regs reserved at the highest
      portion of their task stack. Among other things, this ensures that a
      task's SP is always pointing within its stack rather than pointing
      immediately past the end.
      
      While it is never legitimate to ERET from a kthread, we take pains to
      initialize pt_regs for kthreads as if this were legitimate. As this is
      never legitimate, the effects of an erroneous return are rarely tested.
      
      Let's simplify things by initializing a kthread's pt_regs such that an
      ERET is caught as an illegal exception return, and removing the explicit
      initialization of other exception context. Note that as
      spectre_v4_enable_task_mitigation() only manipulates the PSTATE within
      the unused regs, this is safe to remove.
      
      As user tasks will have their exception context initialized via
      start_thread() or start_compat_thread(), this should only impact cases
      where something has gone very wrong and we'd like that to be clearly
      indicated.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: James Morse <james.morse@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20201113124937.20574-2-mark.rutland@arm.com
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      f80d0340
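      A sketch of the copy_thread() change this describes: zero the kthread's
      pt_regs and set PSTATE.IL so any buggy ERET is caught as an illegal
      exception return (exact flags assumed):

          /* in copy_thread(), for kernel threads */
          memset(childregs, 0, sizeof(struct pt_regs));
          childregs->pstate = PSR_MODE_EL1h | PSR_IL_BIT;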
    • of: unittest: Fix build on architectures without CONFIG_OF_ADDRESS · aed5041e
      Catalin Marinas authored
      of_dma_get_max_cpu_address() is not defined if !CONFIG_OF_ADDRESS, so
      return early in of_unittest_dma_get_max_cpu_address().
      
      Fixes: 07d13a1d ("of: unittest: Add test for of_dma_get_max_cpu_address()")
      Reported-by: kernel test robot <lkp@intel.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      aed5041e
  6. 01 Dec, 2020 3 commits
  7. 30 Nov, 2020 6 commits
    • arm64: mte: Fix typo in macro definition · 9e5344e0
      Vincenzo Frascino authored
      The UL suffix in the definition of SYS_TFSR_EL1_TF1 was misspelled,
      causing compilation issues when trying to implement in-kernel MTE async
      mode.
      
      Fix the macro by correcting the typo.
      
      Note: MTE async mode will be introduced with a future series.
      
      Fixes: c058b1c4 ("arm64: mte: system register definitions")
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Link: https://lore.kernel.org/r/20201130170709.22309-1-vincenzo.frascino@arm.com
      Signed-off-by: Will Deacon <will@kernel.org>
      9e5344e0
    • arm64: entry: fix EL1 debug transitions · 2a9b3e6a
      Mark Rutland authored
      In debug_exception_enter() and debug_exception_exit() we trace hardirqs
      on/off while RCU isn't guaranteed to be watching, and we don't save and
      restore the hardirq state, and so may return with this having changed.
      
      Handle this appropriately with new entry/exit helpers which do the bare
      minimum to ensure this is appropriately maintained, without marking
      debug exceptions as NMIs. These are placed in entry-common.c with the
      other entry/exit helpers.
      
      In future we'll want to reconsider whether some debug exceptions should
      be NMIs, but this will require a significant refactoring, and for now
      this should prevent issues with lockdep and RCU.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20201130115950.22492-12-mark.rutland@arm.com
      Signed-off-by: Will Deacon <will@kernel.org>
      2a9b3e6a
    • arm64: entry: fix NMI {user, kernel}->kernel transitions · f0cd5ac1
      Mark Rutland authored
      Exceptions which can be taken at (almost) any time are considered to be
      NMIs. On arm64 that includes:
      
      * SDEI events
      * GICv3 Pseudo-NMIs
      * Kernel stack overflows
      * Unexpected/unhandled exceptions
      
      ... but currently debug exceptions (BRKs, breakpoints, watchpoints,
      single-step) are not considered NMIs.
      
      As these can be taken at any time, kernel features (lockdep, RCU,
      ftrace) may not be in a consistent kernel state. For example, we may
      take an NMI from the idle code or partway through an entry/exit path.
      
      While nmi_enter() and nmi_exit() handle most of this state, notably they
      don't save/restore the lockdep state across an NMI being taken and
      handled. When interrupts are enabled and an NMI is taken, lockdep may
      see interrupts become disabled within the NMI code, but not see
      interrupts become enabled when returning from the NMI, leaving lockdep
      believing interrupts are disabled when they are actually enabled.
      
      The x86 code handles this in idtentry_{enter,exit}_nmi(), which will
      shortly be moved to the generic entry code. As we can't use either yet,
      we copy the x86 approach in arm64-specific helpers. All the NMI
      entrypoints are marked as noinstr to prevent any instrumentation
      handling code being invoked before the state has been corrected.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20201130115950.22492-11-mark.rutland@arm.com
      Signed-off-by: Will Deacon <will@kernel.org>
      f0cd5ac1
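      A sketch of the arm64 NMI entry helper along the lines of x86's
      idtentry_enter_nmi(), saving the lockdep state that nmi_enter() alone
      does not (exact sequence assumed):

          asmlinkage void noinstr arm64_enter_nmi(struct pt_regs *regs)
          {
                  /* preserve lockdep's view of IRQ state across the NMI */
                  regs->lockdep_hardirqs = lockdep_hardirqs_enabled();

                  __nmi_enter();
                  lockdep_hardirqs_off(CALLER_ADDR0);
                  lockdep_hardirq_enter();
                  rcu_nmi_enter();

                  trace_hardirqs_off_finish();
                  ftrace_nmi_enter();
          }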
    • arm64: entry: fix non-NMI kernel<->kernel transitions · 7cd1ea10
      Mark Rutland authored
      There are periods in kernel mode when RCU is not watching and/or the
      scheduler tick is disabled, but we can still take exceptions such as
      interrupts. The arm64 exception handlers do not account for this, and
      it's possible that RCU is not watching while an exception handler runs.
      
      The x86/generic entry code handles this by ensuring that all (non-NMI)
      kernel exception handlers call irqentry_enter() and irqentry_exit(),
      which handle RCU, lockdep, and IRQ flag tracing. We can't yet move to
      the generic entry code, and already handle the user<->kernel transitions
      elsewhere, so we add new kernel<->kernel transition helpers along the
      lines of the generic entry code.
      
      Since we now track interrupts becoming masked when an exception is
      taken, local_daif_inherit() is modified to track interrupts becoming
      re-enabled when the original context is inherited. To balance the
      entry/exit paths, each handler masks all DAIF exceptions before
      exit_to_kernel_mode().
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20201130115950.22492-10-mark.rutland@arm.com
      Signed-off-by: Will Deacon <will@kernel.org>
      7cd1ea10
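      A sketch of the kernel<->kernel entry helper along the lines of the
      generic irqentry_enter() (exact body assumed); the RCU-not-watching
      case is flagged in pt_regs so the exit path can balance it:

          static void noinstr enter_from_kernel_mode(struct pt_regs *regs)
          {
                  regs->exit_rcu = false;

                  if (!IS_ENABLED(CONFIG_TINY_RCU) && is_idle_task(current)) {
                          /* RCU wasn't watching (e.g. idle): make it watch */
                          lockdep_hardirqs_off(CALLER_ADDR0);
                          rcu_irq_enter();
                          trace_hardirqs_off_finish();

                          regs->exit_rcu = true;
                          return;
                  }

                  lockdep_hardirqs_off(CALLER_ADDR0);
                  rcu_irq_enter_check_tick();
                  trace_hardirqs_off_finish();
          }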
    • arm64: ptrace: prepare for EL1 irq/rcu tracking · 1ec2f2c0
      Mark Rutland authored
      Exceptions from EL1 may be taken when RCU isn't watching (e.g. in idle
      sequences), or when lockdep's hardirq state is transiently out-of-sync
      with the hardware state (e.g. in the middle of local_irq_enable()). To
      correctly handle these cases, we'll need to save/restore this state
      across some exceptions taken from EL1.
      
      A series of subsequent patches will update EL1 exception handlers to
      handle this. In preparation for this, and to avoid dependencies between
      those patches, this patch adds two new fields to struct pt_regs so that
      exception handlers can track this state.
      
      Note that this is placed in pt_regs as some entry/exit sequences such as
      el1_irq are invoked from assembly, which makes it very difficult to add
      a separate structure as with the irqentry_state used by x86. We can
      separate this once more of the exception logic is moved to C. While the
      fields only need to be bool, they are both made u64 to keep pt_regs
      16-byte aligned.
      
      There should be no functional change as a result of this patch.
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20201130115950.22492-9-mark.rutland@arm.com
      Signed-off-by: Will Deacon <will@kernel.org>
      1ec2f2c0
    • arm64: entry: fix non-NMI user<->kernel transitions · 23529049
      Mark Rutland authored
      When built with PROVE_LOCKING, NO_HZ_FULL, and CONTEXT_TRACKING_FORCE,
      the kernel will WARN() at boot time that interrupts are enabled when we
      call context_tracking_user_enter(), despite the DAIF flags indicating
      that IRQs are masked.
      
      The problem is that we're not tracking IRQ flag changes accurately, and
      so lockdep believes interrupts are enabled when they are not (and
      vice-versa). We can shuffle things to make this more accurate. For
      kernel->user transitions there are a number of constraints we need to
      consider:
      
      1) When we call __context_tracking_user_enter() HW IRQs must be disabled
         and lockdep must be up-to-date with this.
      
      2) Userspace should be treated as having IRQs enabled from the PoV of
         both lockdep and tracing.
      
      3) As context_tracking_user_enter() stops RCU from watching, we cannot
         use RCU after calling it.
      
      4) IRQ flag tracing and lockdep have state that must be manipulated
         before RCU is disabled.
      
      ... with similar constraints applying for user->kernel transitions, with
      the ordering reversed.
      
      The generic entry code has enter_from_user_mode() and
      exit_to_user_mode() helpers to handle this. We can't use those directly,
      so we add arm64 copies for now (without the instrumentation markers
      which aren't used on arm64). These replace the existing user_exit() and
      user_exit_irqoff() calls spread throughout handlers, and the exception
      unmasking is left as-is.
      
      Note that:
      
      * The accounting for debug exceptions from userspace now happens in
        el0_dbg() and ret_to_user(), so this is removed from
        debug_exception_enter() and debug_exception_exit(). As
        user_exit_irqoff() wakes RCU, the userspace-specific check is removed.
      
      * The accounting for syscalls now happens in el0_svc(),
        el0_svc_compat(), and ret_to_user(), so this is removed from
        el0_svc_common(). This does not adversely affect the workaround for
        erratum 1463225, as this does not depend on any of the state tracking.
      
      * In ret_to_user() we mask interrupts with local_daif_mask(), and so we
        need to inform lockdep and tracing. Here a trace_hardirqs_off() is
        sufficient and safe as we have not yet exited kernel context and RCU
        is usable.
      
      * As PROVE_LOCKING selects TRACE_IRQFLAGS, the ifdeffery in entry.S only
        needs to check for the latter.
      
      * EL0 SError handling will be dealt with in a subsequent patch, as this
        needs to be treated as an NMI.
      
      Prior to this patch, booting an appropriately-configured kernel would
      result in splats as below:
      
      | DEBUG_LOCKS_WARN_ON(lockdep_hardirqs_enabled())
      | WARNING: CPU: 2 PID: 1 at kernel/locking/lockdep.c:5280 check_flags.part.54+0x1dc/0x1f0
      | Modules linked in:
      | CPU: 2 PID: 1 Comm: init Not tainted 5.10.0-rc3 #3
      | Hardware name: linux,dummy-virt (DT)
      | pstate: 804003c5 (Nzcv DAIF +PAN -UAO -TCO BTYPE=--)
      | pc : check_flags.part.54+0x1dc/0x1f0
      | lr : check_flags.part.54+0x1dc/0x1f0
      | sp : ffff80001003bd80
      | x29: ffff80001003bd80 x28: ffff66ce801e0000
      | x27: 00000000ffffffff x26: 00000000000003c0
      | x25: 0000000000000000 x24: ffffc31842527258
      | x23: ffffc31842491368 x22: ffffc3184282d000
      | x21: 0000000000000000 x20: 0000000000000001
      | x19: ffffc318432ce000 x18: 0080000000000000
      | x17: 0000000000000000 x16: ffffc31840f18a78
      | x15: 0000000000000001 x14: ffffc3184285c810
      | x13: 0000000000000001 x12: 0000000000000000
      | x11: ffffc318415857a0 x10: ffffc318406614c0
      | x9 : ffffc318415857a0 x8 : ffffc31841f1d000
      | x7 : 647261685f706564 x6 : ffffc3183ff7c66c
      | x5 : ffff66ce801e0000 x4 : 0000000000000000
      | x3 : ffffc3183fe00000 x2 : ffffc31841500000
      | x1 : e956dc24146b3500 x0 : 0000000000000000
      | Call trace:
      |  check_flags.part.54+0x1dc/0x1f0
      |  lock_is_held_type+0x10c/0x188
      |  rcu_read_lock_sched_held+0x70/0x98
      |  __context_tracking_enter+0x310/0x350
      |  context_tracking_enter.part.3+0x5c/0xc8
      |  context_tracking_user_enter+0x6c/0x80
      |  finish_ret_to_user+0x2c/0x13c
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/20201130115950.22492-8-mark.rutland@arm.com
      Signed-off-by: Will Deacon <will@kernel.org>
      23529049