Commits · 156010ed9c2ac1e9df6c11b1f688cf8a6e0152e6 · Kirill Smelkov / linux

10 Feb, 2023 1 commit

Merge branches 'for-next/sysreg', 'for-next/sme', 'for-next/kselftest',... · 156010ed

Catalin Marinas authored Feb 10, 2023

Merge branches 'for-next/sysreg', 'for-next/sme', 'for-next/kselftest', 'for-next/misc', 'for-next/sme2', 'for-next/tpidr2', 'for-next/scs', 'for-next/compat-hwcap', 'for-next/ftrace', 'for-next/efi-boot-mmu-on', 'for-next/ptrauth' and 'for-next/pseudo-nmi', remote-tracking branch 'arm64/for-next/perf' into for-next/core

* arm64/for-next/perf:
  perf: arm_spe: Print the version of SPE detected
  perf: arm_spe: Add support for SPEv1.2 inverted event filtering
  perf: Add perf_event_attr::config3
  drivers/perf: fsl_imx8_ddr_perf: Remove set-but-not-used variable
  perf: arm_spe: Support new SPEv1.2/v8.7 'not taken' event
  perf: arm_spe: Use new PMSIDR_EL1 register enums
  perf: arm_spe: Drop BIT() and use FIELD_GET/PREP accessors
  arm64/sysreg: Convert SPE registers to automatic generation
  arm64: Drop SYS_ from SPE register defines
  perf: arm_spe: Use feature numbering for PMSEVFR_EL1 defines
  perf/marvell: Add ACPI support to TAD uncore driver
  perf/marvell: Add ACPI support to DDR uncore driver
  perf/arm-cmn: Reset DTM_PMU_CONFIG at probe
  drivers/perf: hisi: Extract initialization of "cpa_pmu->pmu"
  drivers/perf: hisi: Simplify the parameters of hisi_pmu_init()
  drivers/perf: hisi: Advertise the PERF_PMU_CAP_NO_EXCLUDE capability

* for-next/sysreg:
  : arm64 sysreg and cpufeature fixes/updates
  KVM: arm64: Use symbolic definition for ISR_EL1.A
  arm64/sysreg: Add definition of ISR_EL1
  arm64/sysreg: Add definition for ICC_NMIAR1_EL1
  arm64/cpufeature: Remove 4 bit assumption in ARM64_FEATURE_MASK()
  arm64/sysreg: Fix errors in 32 bit enumeration values
  arm64/cpufeature: Fix field sign for DIT hwcap detection

* for-next/sme:
  : SME-related updates
  arm64/sme: Optimise SME exit on syscall entry
  arm64/sme: Don't use streaming mode to probe the maximum SME VL
  arm64/ptrace: Use system_supports_tpidr2() to check for TPIDR2 support

* for-next/kselftest: (23 commits)
  : arm64 kselftest fixes and improvements
  kselftest/arm64: Don't require FA64 for streaming SVE+ZA tests
  kselftest/arm64: Copy whole EXTRA context
  kselftest/arm64: Fix enumeration of systems without 128 bit SME for SSVE+ZA
  kselftest/arm64: Fix enumeration of systems without 128 bit SME
  kselftest/arm64: Don't require FA64 for streaming SVE tests
  kselftest/arm64: Limit the maximum VL we try to set via ptrace
  kselftest/arm64: Correct buffer size for SME ZA storage
  kselftest/arm64: Remove the local NUM_VL definition
  kselftest/arm64: Verify simultaneous SSVE and ZA context generation
  kselftest/arm64: Verify that SSVE signal context has SVE_SIG_FLAG_SM set
  kselftest/arm64: Remove spurious comment from MTE test Makefile
  kselftest/arm64: Support build of MTE tests with clang
  kselftest/arm64: Initialise current at build time in signal tests
  kselftest/arm64: Don't pass headers to the compiler as source
  kselftest/arm64: Remove redundant _start labels from FP tests
  kselftest/arm64: Fix .pushsection for strings in FP tests
  kselftest/arm64: Run BTI selftests on systems without BTI
  kselftest/arm64: Fix test numbering when skipping tests
  kselftest/arm64: Skip non-power of 2 SVE vector lengths in fp-stress
  kselftest/arm64: Only enumerate power of two VLs in syscall-abi
  ...

* for-next/misc:
  : Miscellaneous arm64 updates
  arm64/mm: Intercept pfn changes in set_pte_at()
  Documentation: arm64: correct spelling
  arm64: traps: attempt to dump all instructions
  arm64: Apply dynamic shadow call stack patching in two passes
  arm64: el2_setup.h: fix spelling typo in comments
  arm64: Kconfig: fix spelling
  arm64: cpufeature: Use kstrtobool() instead of strtobool()
  arm64: Avoid repeated AA64MMFR1_EL1 register read on pagefault path
  arm64: make ARCH_FORCE_MAX_ORDER selectable

* for-next/sme2: (23 commits)
  : Support for arm64 SME 2 and 2.1
  arm64/sme: Fix __finalise_el2 SMEver check
  kselftest/arm64: Remove redundant _start labels from zt-test
  kselftest/arm64: Add coverage of SME 2 and 2.1 hwcaps
  kselftest/arm64: Add coverage of the ZT ptrace regset
  kselftest/arm64: Add SME2 coverage to syscall-abi
  kselftest/arm64: Add test coverage for ZT register signal frames
  kselftest/arm64: Teach the generic signal context validation about ZT
  kselftest/arm64: Enumerate SME2 in the signal test utility code
  kselftest/arm64: Cover ZT in the FP stress test
  kselftest/arm64: Add a stress test program for ZT0
  arm64/sme: Add hwcaps for SME 2 and 2.1 features
  arm64/sme: Implement ZT0 ptrace support
  arm64/sme: Implement signal handling for ZT
  arm64/sme: Implement context switching for ZT0
  arm64/sme: Provide storage for ZT0
  arm64/sme: Add basic enumeration for SME2
  arm64/sme: Enable host kernel to access ZT0
  arm64/sme: Manually encode ZT0 load and store instructions
  arm64/esr: Document ISS for ZT0 being disabled
  arm64/sme: Document SME 2 and SME 2.1 ABI
  ...

* for-next/tpidr2:
  : Include TPIDR2 in the signal context
  kselftest/arm64: Add test case for TPIDR2 signal frame records
  kselftest/arm64: Add TPIDR2 to the set of known signal context records
  arm64/signal: Include TPIDR2 in the signal context
  arm64/sme: Document ABI for TPIDR2 signal information

* for-next/scs:
  : arm64: harden shadow call stack pointer handling
  arm64: Stash shadow stack pointer in the task struct on interrupt
  arm64: Always load shadow stack pointer directly from the task struct

* for-next/compat-hwcap:
  : arm64: Expose compat ARMv8 AArch32 features (HWCAPs)
  arm64: Add compat hwcap SSBS
  arm64: Add compat hwcap SB
  arm64: Add compat hwcap I8MM
  arm64: Add compat hwcap ASIMDBF16
  arm64: Add compat hwcap ASIMDFHM
  arm64: Add compat hwcap ASIMDDP
  arm64: Add compat hwcap FPHP and ASIMDHP

* for-next/ftrace:
  : Add arm64 support for DYNAMICE_FTRACE_WITH_CALL_OPS
  arm64: avoid executing padding bytes during kexec / hibernation
  arm64: Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS
  arm64: ftrace: Update stale comment
  arm64: patching: Add aarch64_insn_write_literal_u64()
  arm64: insn: Add helpers for BTI
  arm64: Extend support for CONFIG_FUNCTION_ALIGNMENT
  ACPI: Don't build ACPICA with '-Os'
  Compiler attributes: GCC cold function alignment workarounds
  ftrace: Add DYNAMIC_FTRACE_WITH_CALL_OPS

* for-next/efi-boot-mmu-on:
  : Permit arm64 EFI boot with MMU and caches on
  arm64: kprobes: Drop ID map text from kprobes blacklist
  arm64: head: Switch endianness before populating the ID map
  efi: arm64: enter with MMU and caches enabled
  arm64: head: Clean the ID map and the HYP text to the PoC if needed
  arm64: head: avoid cache invalidation when entering with the MMU on
  arm64: head: record the MMU state at primary entry
  arm64: kernel: move identity map out of .text mapping
  arm64: head: Move all finalise_el2 calls to after __enable_mmu

* for-next/ptrauth:
  : arm64 pointer authentication cleanup
  arm64: pauth: don't sign leaf functions
  arm64: unify asm-arch manipulation

* for-next/pseudo-nmi:
  : Pseudo-NMI code generation optimisations
  arm64: irqflags: use alternative branches for pseudo-NMI logic
  arm64: add ARM64_HAS_GIC_PRIO_RELAXED_SYNC cpucap
  arm64: make ARM64_HAS_GIC_PRIO_MASKING depend on ARM64_HAS_GIC_CPUIF_SYSREGS
  arm64: rename ARM64_HAS_IRQ_PRIO_MASKING to ARM64_HAS_GIC_PRIO_MASKING
  arm64: rename ARM64_HAS_SYSREG_GIC_CPUIF to ARM64_HAS_GIC_CPUIF_SYSREGS

156010ed

07 Feb, 2023 6 commits

kselftest/arm64: Don't require FA64 for streaming SVE+ZA tests · 2c4192c0

Mark Brown authored Feb 02, 2023

During early development a dependedncy was added on having FA64
available so we could use the full FPSIMD register set in the signal
handler which got copied over into the SSVE+ZA registers test case.
Subsequently the ABI was finialised so the handler is run with streaming
mode disabled meaning this is redundant but the dependency was never
removed, do so now.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230202-arm64-kselftest-sve-za-fa64-v1-1-5c5f3dabe441@kernel.orgSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

2c4192c0

kselftest/arm64: Copy whole EXTRA context · 6012b820

Mark Brown authored Feb 02, 2023

When copying the EXTRA context our calculation of the amount of data we
need to copy is incorrect, we only calculate the amount of data needed
within uc_mcontext.__reserved, not taking account of the fixed portion
of the context. Add in the offset of the reserved data so that we copy
everything we should.

This will only cause test failures in cases where the last context in the
EXTRA context is smaller than the missing data since we don't currently
validate any of the register data and all the buffers we copy into are
statically allocated so default to zero meaning that if we walk beyond the
end of what we copied we'll encounter what looks like a context with magic
and length both 0 which is a valid terminator record.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230201-arm64-kselftest-full-extra-v1-1-93741f32dd29@kernel.orgSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

6012b820

arm64: kprobes: Drop ID map text from kprobes blacklist · a088cf8e

Ard Biesheuvel authored Feb 04, 2023

The ID mapped text region is never accessed via the normal kernel
mapping of text, and so it was moved into .rodata instead. This means it
is no longer considered as a suitable place for kprobes by default, and
the explicit blacklist is unnecessary, and actually results in an error
message at boot:

kprobes: Failed to populate blacklist (error -22), kprobes not restricted, be careful using them!

So stop blacklisting the ID map text explicitly.

Fixes: af7249b3 ("arm64: kernel: move identity map out of .text mapping")
Reported-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20230204101807.2862321-1-ardb@kernel.orgSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

a088cf8e

perf: arm_spe: Print the version of SPE detected · e8a709dc

Rob Herring authored Feb 06, 2023

There's up to 4 versions of SPE now. Let's add the version that's been
detected to the driver's informational print out.
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230206204746.1452942-1-robh@kernel.orgSigned-off-by: Will Deacon <will@kernel.org>

e8a709dc

perf: arm_spe: Add support for SPEv1.2 inverted event filtering · 8d9190f0

Rob Herring authored Jan 09, 2023

Arm SPEv1.2 (Arm v8.7/v9.2) adds a new feature called Inverted Event
Filter which excludes samples matching the event filter. The feature
mirrors the existing event filter in PMSEVFR_EL1 adding a new register,
PMSNEVFR_EL1, which has the same event bit assignments.
Tested-by: James Clark <james.clark@arm.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20220825-arm-spe-v8-7-v4-8-327f860daf28@kernel.orgSigned-off-by: Will Deacon <will@kernel.org>

8d9190f0

perf: Add perf_event_attr::config3 · 09519ec3

Rob Herring authored Jan 09, 2023

Arm SPEv1.2 adds another 64-bits of event filtering control. As the
existing perf_event_attr::configN fields are all used up for SPE PMU, an
additional field is needed. Add a new 'config3' field.
Tested-by: James Clark <james.clark@arm.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220825-arm-spe-v8-7-v4-7-327f860daf28@kernel.orgSigned-off-by: Will Deacon <will@kernel.org>

09519ec3

06 Feb, 2023 1 commit

arm64/sme: Fix __finalise_el2 SMEver check · 9442d05b

Marc Zyngier authored Feb 06, 2023

When checking for ID_AA64SMFR0_EL1.SMEver, __check_override assumes
that the ID_AA64SMFR0_EL1 value is in x1, and the intent of the code
is to reuse value read a few lines above.

However, as the comment says at the beginning of the macro, x1 will
be clobbered, and the checks always fails.

The easiest fix is just to reload the id register before checking it.

Fixes: f122576f ("arm64/sme: Enable host kernel to access ZT0")
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

9442d05b

03 Feb, 2023 1 commit

drivers/perf: fsl_imx8_ddr_perf: Remove set-but-not-used variable · 7f49b037

Sascha Hauer authored Feb 03, 2023

active_events is set but not used, remove it.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Link: https://lore.kernel.org/r/20230203121509.3580245-1-s.hauer@pengutronix.deSigned-off-by: Will Deacon <will@kernel.org>

7f49b037

01 Feb, 2023 3 commits

kselftest/arm64: Fix enumeration of systems without 128 bit SME for SSVE+ZA · a7db82f1

Mark Brown authored Jan 31, 2023

The current signal handling tests for SME do not account for the fact that
unlike SVE all SME vector lengths are optional so we can't guarantee that
we will encounter the minimum possible VL, they will hang enumerating VLs
on such systems. Abort enumeration when we find the lowest VL in the newly
added ssve_za_regs test.

Fixes: bc69da5f ("kselftest/arm64: Verify simultaneous SSVE and ZA context generation")
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230131-arm64-kselftest-sig-sme-no-128-v1-2-d47c13dc8e1e@kernel.orgSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

a7db82f1

kselftest/arm64: Fix enumeration of systems without 128 bit SME · 5f389238

Mark Brown authored Jan 31, 2023

Fixes: 4963aeb3 ("kselftest/arm64: signal: Add SME signal handling tests")
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230131-arm64-kselftest-sig-sme-no-128-v1-1-d47c13dc8e1e@kernel.orgSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

5f389238

kselftest/arm64: Don't require FA64 for streaming SVE tests · 4365eec8

Mark Brown authored Jan 31, 2023

During early development a dependedncy was added on having FA64
available so we could use the full FPSIMD register set in the signal
handler. Subsequently the ABI was finialised so the handler is run with
streaming mode disabled meaning this is redundant but the dependency was
never removed, do so now.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230131-arm64-kselfetest-ssve-fa64-v1-1-f418efcc2b60@kernel.orgSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

4365eec8

31 Jan, 2023 11 commits

arm64: irqflags: use alternative branches for pseudo-NMI logic · a5f61cc6

Mark Rutland authored Jan 30, 2023

Due to the way we use alternatives in the irqflags code, even when
CONFIG_ARM64_PSEUDO_NMI=n, we generate unused alternative code for
pseudo-NMI management. This patch reworks the irqflags code to remove
the redundant code when CONFIG_ARM64_PSEUDO_NMI=n, which benefits the
more common case, and will permit further rework of our DAIF management
(e.g. in preparation for ARMv8.8-A's NMI feature).

Prior to this patch a defconfig kernel has hundreds of redundant
instructions to access ICC_PMR_EL1 (which should only need to be
manipulated in setup code), which this patch removes:

| [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-before-defconfig | grep icc_pmr_el1 | wc -l
| 885
| [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-after-defconfig | grep icc_pmr_el1 | wc -l
| 5

Those instructions alone account for more than 3KiB of kernel text, and
will be associated with additional alt_instr entries, padding and
branches, etc.

These redundant instructions exist because we use alternative sequences
for to choose between DAIF / PMR management in irqflags.h, and even when
CONFIG_ARM64_PSEUDO_NMI=n, those alternative sequences will generate the
code for PMR management, along with alt_instr entries. We use
alternatives here as this was necessary to ensure that we never
encounter a mismatched local_irq_save() ... local_irq_restore() sequence
in the middle of patching, which was possible to see if we used static
keys to choose between DAIF and PMR management.

Since commit:

  21fb26bf ("arm64: alternatives: add alternative_has_feature_*()")

... we have a mechanism to use alternatives similarly to static keys,
allowing us to write the bulk of the logic in C code while also being
able to rely on all sites being patched in one go, and avoiding a
mismatched mismatched local_irq_save() ... local_irq_restore() sequence
during patching.

This patch rewrites arm64's local_irq_*() functions to use alternative
branches. This allows for the pseudo-NMI code to be entirely elided when
CONFIG_ARM64_PSEUDO_NMI=n, making a defconfig Image 64KiB smaller, and
not affectint the size of an Image with CONFIG_ARM64_PSEUDO_NMI=y:

| [mark@lakrids:~/src/linux]% ls -al vmlinux-*
| -rwxr-xr-x 1 mark mark 137473432 Jan 18 11:11 vmlinux-after-defconfig
| -rwxr-xr-x 1 mark mark 137918776 Jan 18 11:15 vmlinux-after-pnmi
| -rwxr-xr-x 1 mark mark 137380152 Jan 18 11:03 vmlinux-before-defconfig
| -rwxr-xr-x 1 mark mark 137523704 Jan 18 11:08 vmlinux-before-pnmi
| [mark@lakrids:~/src/linux]% ls -al Image-*
| -rw-r--r-- 1 mark mark 38646272 Jan 18 11:11 Image-after-defconfig
| -rw-r--r-- 1 mark mark 38777344 Jan 18 11:14 Image-after-pnmi
| -rw-r--r-- 1 mark mark 38711808 Jan 18 11:03 Image-before-defconfig
| -rw-r--r-- 1 mark mark 38777344 Jan 18 11:08 Image-before-pnmi

Some sensitive code depends on being run with interrupts enabled or with
interrupts disabled, and so when enabling or disabling interrupts we
must ensure that the compiler does not move such code around the actual
enable/disable. Before this patch, that was ensured by the combined asm
volatile blocks having memory clobbers (and any sensitive code either
being asm volatile, or touching memory). This patch consistently uses
explicit barrier() operations before and after the enable/disable, which
allows us to use the usual sysreg accessors (which are asm volatile) to
manipulate the interrupt masks. The use of pmr_sync() is pulled within
this critical section for consistency.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230130145429.903791-6-mark.rutland@arm.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

a5f61cc6

arm64: add ARM64_HAS_GIC_PRIO_RELAXED_SYNC cpucap · 8bf0a804

Mark Rutland authored Jan 30, 2023

When Priority Mask Hint Enable (PMHE) == 0b1, the GIC may use the PMR
value to determine whether to signal an IRQ to a PE, and consequently
after a change to the PMR value, a DSB SY may be required to ensure that
interrupts are signalled to a CPU in finite time. When PMHE == 0b0,
interrupts are always signalled to the relevant PE, and all masking
occurs locally, without requiring a DSB SY.

Since commit:

  f2266504 ("arm64: Relax ICC_PMR_EL1 accesses when ICC_CTLR_EL1.PMHE is clear")

... we handle this dynamically: in most cases a static key is used to
determine whether to issue a DSB SY, but the entry code must read from
ICC_CTLR_EL1 as static keys aren't accessible from plain assembly.

It would be much nicer to use an alternative instruction sequence for
the DSB, as this would avoid the need to read from ICC_CTLR_EL1 in the
entry code, and for most other code this will result in simpler code
generation with fewer instructions and fewer branches.

This patch adds a new ARM64_HAS_GIC_PRIO_RELAXED_SYNC cpucap which is
only set when ICC_CTLR_EL1.PMHE == 0b0 (and GIC priority masking is in
use). This allows us to replace the existing users of the
`gic_pmr_sync` static key with alternative sequences which default to a
DSB SY and are relaxed to a NOP when PMHE is not in use.

The entry assembly management of the PMR is slightly restructured to use
a branch (rather than multiple NOPs) when priority masking is not in
use. This is more in keeping with other alternatives in the entry
assembly, and permits the use of a separate alternatives for the
PMHE-dependent DSB SY (and removal of the conditional branch this
currently requires). For consistency I've adjusted both the save and
restore paths.

According to bloat-o-meter, when building defconfig +
CONFIG_ARM64_PSEUDO_NMI=y this shrinks the kernel text by ~4KiB:

| add/remove: 4/2 grow/shrink: 42/310 up/down: 332/-5032 (-4700)

The resulting vmlinux is ~66KiB smaller, though the resulting Image size
is unchanged due to padding and alignment:

| [mark@lakrids:~/src/linux]% ls -al vmlinux-*
| -rwxr-xr-x 1 mark mark 137508344 Jan 17 14:11 vmlinux-after
| -rwxr-xr-x 1 mark mark 137575440 Jan 17 13:49 vmlinux-before
| [mark@lakrids:~/src/linux]% ls -al Image-*
| -rw-r--r-- 1 mark mark 38777344 Jan 17 14:11 Image-after
| -rw-r--r-- 1 mark mark 38777344 Jan 17 13:49 Image-before

Prior to this patch we did not verify the state of ICC_CTLR_EL1.PMHE on
secondary CPUs. As of this patch this is verified by the cpufeature code
when using GIC priority masking (i.e. when using pseudo-NMIs).

Note that since commit:

  7e3a57fa ("arm64: Document ICC_CTLR_EL3.PMHE setting requirements")

... Documentation/arm64/booting.rst specifies:

|      - ICC_CTLR_EL3.PMHE (bit 6) must be set to the same value across
|        all CPUs the kernel is executing on, and must stay constant
|        for the lifetime of the kernel.

... so that should not adversely affect any compliant systems, and as
we'll only check for the absense of PMHE when using pseudo-NMIs, this
will only fire when such mismatch will adversely affect the system.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230130145429.903791-5-mark.rutland@arm.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

8bf0a804

arm64: make ARM64_HAS_GIC_PRIO_MASKING depend on ARM64_HAS_GIC_CPUIF_SYSREGS · 4b43f1cd

Mark Rutland authored Jan 30, 2023

Currently the arm64_cpu_capabilities structure for
ARM64_HAS_GIC_PRIO_MASKING open-codes the same CPU field definitions as
the arm64_cpu_capabilities structure for ARM64_HAS_GIC_CPUIF_SYSREGS, so
that can_use_gic_priorities() can use has_useable_gicv3_cpuif().

This duplication isn't ideal for the legibility of the code, and sets a
bad example for any ARM64_HAS_GIC_* definitions added by subsequent
patches.

Instead, have ARM64_HAS_GIC_PRIO_MASKING check for the
ARM64_HAS_GIC_CPUIF_SYSREGS cpucap, and add a comment explaining why
this is safe. Subsequent patches will use the same pattern where one
cpucap depends upon another.

There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230130145429.903791-4-mark.rutland@arm.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

4b43f1cd

arm64: rename ARM64_HAS_IRQ_PRIO_MASKING to ARM64_HAS_GIC_PRIO_MASKING · c888b7bd

Mark Rutland authored Jan 30, 2023

Subsequent patches will add more GIC-related cpucaps. When we do so, it
would be nice to give them a consistent HAS_GIC_* prefix.

In preparation for doing so, this patch renames the existing
ARM64_HAS_IRQ_PRIO_MASKING cap to ARM64_HAS_GIC_PRIO_MASKING.

The cpucaps file was hand-modified; all other changes were scripted
with:

  find . -type f -name '*.[chS]' -print0 | \
    xargs -0 sed -i 's/ARM64_HAS_IRQ_PRIO_MASKING/ARM64_HAS_GIC_PRIO_MASKING/'

There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230130145429.903791-3-mark.rutland@arm.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

c888b7bd

arm64: rename ARM64_HAS_SYSREG_GIC_CPUIF to ARM64_HAS_GIC_CPUIF_SYSREGS · 0e62ccb9

Mark Rutland authored Jan 30, 2023

Subsequent patches will add more GIC-related cpucaps. When we do so, it
would be nice to give them a consistent HAS_GIC_* prefix.

In preparation for doing so, this patch renames the existing
ARM64_HAS_SYSREG_GIC_CPUIF cap to ARM64_HAS_GIC_CPUIF_SYSREGS.

The 'CPUIF_SYSREGS' suffix is chosen so that this will be ordered ahead
of other ARM64_HAS_GIC_* definitions in subsequent patches.

The cpucaps file was hand-modified; all other changes were scripted
with:

  find . -type f -name '*.[chS]' -print0 | \
    xargs -0 sed -i
    's/ARM64_HAS_SYSREG_GIC_CPUIF/ARM64_HAS_GIC_CPUIF_SYSREGS/'

There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230130145429.903791-2-mark.rutland@arm.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

0e62ccb9

arm64: pauth: don't sign leaf functions · c68cf528

Mark Rutland authored Jan 31, 2023

Currently, when CONFIG_ARM64_PTR_AUTH_KERNEL=y (and
CONFIG_UNWIND_PATCH_PAC_INTO_SCS=n), we enable pointer authentication
for all functions, including leaf functions. This isn't necessary, and
is unfortunate for a few reasons:

* Any PACIASP instruction is implicitly a `BTI C` landing pad, and
  forcing the addition of a PACIASP in every function introduces a
  larger set of BTI gadgets than is necessary.

* The PACIASP and AUTIASP instructions make leaf functions larger than
  necessary, bloating the kernel Image. For a defconfig v6.2-rc3 kernel,
  this appears to add ~64KiB relative to not signing leaf functions,
  which is unfortunate but not entirely onerous.

* The PACIASP and AUTIASP instructions potentially make leaf functions
  more expensive in terms of performance and/or power. For many trivial
  leaf functions, this is clearly unnecessary, e.g.

  | <arch_local_save_flags>:
  |        d503233f        paciasp
  |        d53b4220        mrs     x0, daif
  |        d50323bf        autiasp
  |        d65f03c0        ret

  | <calibration_delay_done>:
  |        d503233f        paciasp
  |        d50323bf        autiasp
  |        d65f03c0        ret
  |        d503201f        nop

* When CONFIG_UNWIND_PATCH_PAC_INTO_SCS=y we disable pointer
  authentication for leaf functions, so clearly this is not functionally
  necessary, indicates we have an inconsistent threat model, and
  convolutes the Makefile logic.

We've used pointer authentication in leaf functions since the
introduction of in-kernel pointer authentication in commit:

  74afda40 ("arm64: compile the kernel with ptrauth return address signing")

... but at the time we had no rationale for signing leaf functions.

Subsequently, we considered avoiding signing leaf functions:

  https://lore.kernel.org/linux-arm-kernel/1586856741-26839-1-git-send-email-amit.kachhap@arm.com/
  https://lore.kernel.org/linux-arm-kernel/1588149371-20310-1-git-send-email-amit.kachhap@arm.com/

... however at the time we didn't have an abundance of reasons to avoid
signing leaf functions as above (e.g. the BTI case), we had no hardware
to make performance measurements, and it was reasoned that this gave
some level of protection against a limited set of code-reuse gadgets
which would fall through to a RET. We documented this in commit:

  717b938e ("arm64: Document why we enable PAC support for leaf functions")

Notably, this was before we supported any forward-edge CFI scheme (e.g.
Arm BTI, or Clang CFI/kCFI), which would prevent jumping into the middle
of a function.

In addition, even with signing forced for leaf functions, AUTIASP may be
placed before a number of instructions which might constitute such a
gadget, e.g.

| <user_regs_reset_single_step>:
|        f9400022        ldr     x2, [x1]
|        d503233f        paciasp
|        d50323bf        autiasp
|        f9408401        ldr     x1, [x0, #264]
|        720b005f        tst     w2, #0x200000
|        b26b0022        orr     x2, x1, #0x200000
|        926af821        and     x1, x1, #0xffffffffffdfffff
|        9a820021        csel    x1, x1, x2, eq  // eq = none
|        f9008401        str     x1, [x0, #264]
|        d65f03c0        ret

| <fpsimd_cpu_dead>:
|        2a0003e3        mov     w3, w0
|        9000ff42        adrp    x2, ffff800009ffd000 <xen_dynamic_chip+0x48>
|        9120e042        add     x2, x2, #0x838
|        52800000        mov     w0, #0x0                        // #0
|        d503233f        paciasp
|        f000d041        adrp    x1, ffff800009a20000 <this_cpu_vector>
|        d50323bf        autiasp
|        9102c021        add     x1, x1, #0xb0
|        f8635842        ldr     x2, [x2, w3, uxtw #3]
|        f821685f        str     xzr, [x2, x1]
|        d65f03c0        ret
|        d503201f        nop

So generally, trying to use AUTIASP to detect such gadgetization is not
robust, and this is dealt with far better by forward-edge CFI (which is
designed to prevent such cases). We should bite the bullet and stop
pretending that AUTIASP is a mitigation for such forward-edge
gadgetization.

For the above reasons, this patch has the kernel consistently sign
non-leaf functions and avoid signing leaf functions.

Considering a defconfig v6.2-rc3 kernel built with LLVM 15.0.6:

* The vmlinux is ~43KiB smaller:

  | [mark@lakrids:~/src/linux]% ls -al vmlinux-*
  | -rwxr-xr-x 1 mark mark 338547808 Jan 25 17:17 vmlinux-after
  | -rwxr-xr-x 1 mark mark 338591472 Jan 25 17:22 vmlinux-before

* The resulting Image is 64KiB smaller:

  | [mark@lakrids:~/src/linux]% ls -al Image-*
  | -rwxr-xr-x 1 mark mark 32702976 Jan 25 17:17 Image-after
  | -rwxr-xr-x 1 mark mark 32768512 Jan 25 17:22 Image-before

* There are ~400 fewer BTI gadgets:

  | [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-before 2> /dev/null | grep -ow 'paciasp\|bti\sc\?' | sort | uniq -c
  |    1219 bti     c
  |   61982 paciasp

  | [mark@lakrids:~/src/linux]% usekorg 12.1.0 aarch64-linux-objdump -d vmlinux-after 2> /dev/null | grep -ow 'paciasp\|bti\sc\?' | sort | uniq -c
  |   10099 bti     c
  |   52699 paciasp

  Which is +8880 BTIs, and -9283 PACIASPs, for -403 unnecessary BTI
  gadgets. While this is small relative to the total, distinguishing the
  two cases will make it easier to analyse and reduce this set further
  in future.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Cc: Amit Daniel Kachhap <amit.kachhap@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230131105809.991288-3-mark.rutland@arm.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

c68cf528

arm64: unify asm-arch manipulation · 1e249c41

Mark Rutland authored Jan 31, 2023

Assemblers will reject instructions not supported by a target
architecture version, and so we must explicitly tell the assembler the
latest architecture version for which we want to assemble instructions
from.

We've added a few AS_HAS_ARMV8_<N> definitions for this, in addition to
an inconsistently named AS_HAS_PAC definition, from which arm64's
top-level Makefile determines the architecture version that we intend to
target, and generates the `asm-arch` variable.

To make this a bit clearer and easier to maintain, this patch reworks
the Makefile to determine asm-arch in a single if-else-endif chain.
AS_HAS_PAC, which is defined when the assembler supports
`-march=armv8.3-a`, is renamed to AS_HAS_ARMV8_3.

As the logic for armv8.3-a is lifted out of the block handling pointer
authentication, `asm-arch` may now be set to armv8.3-a regardless of
whether support for pointer authentication is selected. This means that
it will be possible to assemble armv8.3-a instructions even if we didn't
intend to, but this is consistent with our handling of other
architecture versions, and the compiler won't generate armv8.3-a
instructions regardless.

For the moment there's no need for an CONFIG_AS_HAS_ARMV8_1, as the code
for LSE atomics and LDAPR use individual `.arch_extension` entries and
do not require the baseline asm arch to be bumped to armv8.1-a. The
other armv8.1-a features (e.g. PAN) do not require assembler support.

There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230131105809.991288-2-mark.rutland@arm.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

1e249c41

kselftest/arm64: Remove redundant _start labels from zt-test · b2ab432b

Mark Brown authored Jan 30, 2023

The newly added zt-test program copied the pattern from the other FP
stress test programs of having a redundant _start label which is
rejected by clang, as we did in a parallel series for the other tests
remove the label so we can build with clang.

No functional change.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230130-arm64-fix-sme2-clang-v1-1-3ce81d99ea8f@kernel.orgSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

b2ab432b

arm64/mm: Intercept pfn changes in set_pte_at() · 004fc58f

Anshuman Khandual authored Jan 30, 2023

Changing pfn on a user page table mapped entry, without first going through
break-before-make (BBM) procedure is unsafe. This just updates set_pte_at()
to intercept such changes, via an updated pgattr_change_is_safe(). This new
check happens via __check_racy_pte_update(), which has now been renamed as
__check_safe_pte_update().

Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20230130121457.1607675-1-anshuman.khandual@arm.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

004fc58f

Documentation: arm64: correct spelling · a70f00e7

Randy Dunlap authored Jan 26, 2023

Correct spelling problems for Documentation/arm64/ as reported
by codespell.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-doc@vger.kernel.org
Reviewed-by: Mukesh Ojha <quic_mojha@quicinc.com>
Link: https://lore.kernel.org/r/20230127064005.1558-3-rdunlap@infradead.orgSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

a70f00e7

kselftest/arm64: Limit the maximum VL we try to set via ptrace · 89ff30b9

Mark Brown authored Jan 11, 2023

When SVE was initially merged we chose to export the maximum VQ in the ABI
as being 512, rather more than the architecturally supported maximum of 16.
For the ptrace tests this results in us generating a lot of test cases and
hence log output which are redundant since a system couldn't possibly
support them. Instead only check values up to the current architectural
limit, plus one more so that we're covering the constraining of higher
vector lengths.

This makes no practical difference to our test coverage, speeds things up
on slower consoles and makes the output much more managable.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230111-arm64-kselftest-ptrace-max-vl-v1-1-8167f41d1ad8@kernel.orgSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

89ff30b9

27 Jan, 2023 2 commits

arm64: traps: attempt to dump all instructions · a873bb49

Mark Rutland authored Jan 27, 2023

Currently dump_kernel_instr() dumps a few instructions around the
pt_regs::pc value, dumping 4 instructions before the PC before dumping
the instruction at the PC. If an attempt to read an instruction fails,
it gives up and does not attempt to dump any subsequent instructions.

This is unfortunate when the pt_regs::pc value points to the start of a
page with a leading guard page, where the instruction at the PC can be
read, but prior instructions cannot.

This patch makes dump_kernel_instr() attempt to dump each instruction
regardless of whether reading a prior instruction could be read, which
gives a more useful code dump in such cases. When an instruction cannot
be read, it is reported as "????????", which cannot be confused with a
hex value,

For example, with a `UDF #0` (AKA 0x00000000) early in the kexec control
page, we'll now get the following code dump:

| Internal error: Oops - Undefined instruction: 0000000002000000 [#1] SMP
| Modules linked in:
| CPU: 0 PID: 261 Comm: kexec Not tainted 6.2.0-rc5+ #26
| Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
| pstate: 604003c5 (nZCv DAIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
| pc : 0x48c00000
| lr : machine_kexec+0x190/0x200
| sp : ffff80000d36ba80
| x29: ffff80000d36ba80 x28: ffff000002dfc380 x27: 0000000000000000
| x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
| x23: ffff80000a9f7858 x22: 000000004c460000 x21: 0000000000000010
| x20: 00000000ad821000 x19: ffff000000aa0000 x18: 0000000000000006
| x17: ffff8000758a2000 x16: ffff800008000000 x15: ffff80000d36b568
| x14: 0000000000000000 x13: ffff80000d36b707 x12: ffff80000a9bf6e0
| x11: 00000000ffffdfff x10: ffff80000aaaf8e0 x9 : ffff80000815eff8
| x8 : 000000000002ffe8 x7 : c0000000ffffdfff x6 : 00000000000affa8
| x5 : 0000000000001fff x4 : 0000000000000001 x3 : ffff80000a263008
| x2 : ffff80000a9e20f8 x1 : 0000000048c00000 x0 : ffff000000aa0000
| Call trace:
|  0x48c00000
|  kernel_kexec+0x88/0x138
|  __do_sys_reboot+0x108/0x288
|  __arm64_sys_reboot+0x2c/0x40
|  invoke_syscall+0x78/0x140
|  el0_svc_common.constprop.0+0x4c/0x100
|  do_el0_svc+0x34/0x80
|  el0_svc+0x34/0x140
|  el0t_64_sync_handler+0xf4/0x140
|  el0t_64_sync+0x194/0x1c0
| Code: ???????? ???????? ???????? ???????? (00000000)
| ---[ end trace 0000000000000000 ]---
| Kernel panic - not syncing: Oops - Undefined instruction: Fatal exception
| Kernel Offset: disabled
| CPU features: 0x002000,00050108,c8004203
| Memory Limit: none
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20230127121256.2141368-1-mark.rutland@arm.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

a873bb49

arm64: avoid executing padding bytes during kexec / hibernation · dc4824fa

Mark Rutland authored Jan 27, 2023

Currently we rely on the HIBERNATE_TEXT section starting with the entry
point to swsusp_arch_suspend_exit, and the KEXEC_TEXT section starting
with the entry point to arm64_relocate_new_kernel. In both cases we copy
the entire section into a dynamically-allocated page, and then later
branch to the start of this page.

SYM_FUNC_START() will align the function entry points to
CONFIG_FUNCTION_ALIGNMENT, and when the linker later processes the
assembled code it will place padding bytes before the function entry
point if the location counter was not already sufficiently aligned. The
linker happens to use the value zero for these padding bytes.

This padding may end up being applied whenever CONFIG_FUNCTION_ALIGNMENT
is greater than 4, which can be the case with
CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B=y or
CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS=y.

When such padding is applied, attempting to kexec or resume from
hibernate will result ina crash: the kernel will branch to the padding
bytes as the start of the dynamically-allocated page, and as those bytes
are zero they will decode as UDF #0, which reliably triggers an
UNDEFINED exception. For example:

| # ./kexec --reuse-cmdline -f Image
| [   46.965800] kexec_core: Starting new kernel
| [   47.143641] psci: CPU1 killed (polled 0 ms)
| [   47.233653] psci: CPU2 killed (polled 0 ms)
| [   47.323465] psci: CPU3 killed (polled 0 ms)
| [   47.324776] Bye!
| [   47.327072] Internal error: Oops - Undefined instruction: 0000000002000000 [#1] SMP
| [   47.328510] Modules linked in:
| [   47.329086] CPU: 0 PID: 259 Comm: kexec Not tainted 6.2.0-rc5+ #3
| [   47.330223] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
| [   47.331497] pstate: 604003c5 (nZCv DAIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
| [   47.332782] pc : 0x43a95000
| [   47.333338] lr : machine_kexec+0x190/0x1e0
| [   47.334169] sp : ffff80000d293b70
| [   47.334845] x29: ffff80000d293b70 x28: ffff000002cc0000 x27: 0000000000000000
| [   47.336292] x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
| [   47.337744] x23: ffff80000a837858 x22: 0000000048ec9000 x21: 0000000000000010
| [   47.339192] x20: 00000000adc83000 x19: ffff000000827000 x18: 0000000000000006
| [   47.340638] x17: ffff800075a61000 x16: ffff800008000000 x15: ffff80000d293658
| [   47.342085] x14: 0000000000000000 x13: ffff80000d2937f7 x12: ffff80000a7ff6e0
| [   47.343530] x11: 00000000ffffdfff x10: ffff80000a8ef8e0 x9 : ffff80000813ef00
| [   47.344976] x8 : 000000000002ffe8 x7 : c0000000ffffdfff x6 : 00000000000affa8
| [   47.346431] x5 : 0000000000001fff x4 : 0000000000000001 x3 : ffff80000a0a3008
| [   47.347877] x2 : ffff80000a8220f8 x1 : 0000000043a95000 x0 : ffff000000827000
| [   47.349334] Call trace:
| [   47.349834]  0x43a95000
| [   47.350338]  kernel_kexec+0x88/0x100
| [   47.351070]  __do_sys_reboot+0x108/0x268
| [   47.351873]  __arm64_sys_reboot+0x2c/0x40
| [   47.352689]  invoke_syscall+0x78/0x108
| [   47.353458]  el0_svc_common.constprop.0+0x4c/0x100
| [   47.354426]  do_el0_svc+0x34/0x50
| [   47.355102]  el0_svc+0x34/0x108
| [   47.355747]  el0t_64_sync_handler+0xf4/0x120
| [   47.356617]  el0t_64_sync+0x194/0x198
| [   47.357374] Code: bad PC value
| [   47.357999] ---[ end trace 0000000000000000 ]---
| [   47.358937] Kernel panic - not syncing: Oops - Undefined instruction: Fatal exception
| [   47.360515] Kernel Offset: disabled
| [   47.361230] CPU features: 0x002000,00050108,c8004203
| [   47.362232] Memory Limit: none

Note: Unfortunately the code dump reports "bad PC value" as it attempts
to dump some instructions prior to the UDF (i.e. before the start of the
page), and terminates early upon a fault, obscuring the problem.

This patch fixes this issue by aligning the section starter markes to
CONFIG_FUNCTION_ALIGNMENT using the ALIGN_FUNCTION() helper, which
ensures that the linker never needs to place padding bytes within the
section. Assertions are added to verify each section begins with the
function we expect, making our implicit requirement explicit.

In future it might be nice to rework the kexec and hibernation code to
decouple the section start from the entry point, but that involves much
more significant changes that come with a higher risk of error, so I've
tried to keep this fix as simple as possible for now.

Fixes: 47a15aa5 ("arm64: Extend support for CONFIG_FUNCTION_ALIGNMENT")
Reported-by: CKI Project <cki-project@redhat.com>
Link: https://lore.kernel.org/linux-arm-kernel/29992.123012504212600261@us-mta-139.us.mimecast.lan/Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

dc4824fa

26 Jan, 2023 4 commits

arm64: Apply dynamic shadow call stack patching in two passes · 54c968be

Ard Biesheuvel authored Dec 13, 2022

Code patching for the dynamically enabled shadow call stack comes down
to finding PACIASP and AUTIASP instructions -which behave as NOPs on
cores that do not implement pointer authentication- and converting them
into shadow call stack pushes and pops, respectively.

Due to past bad experiences with the highly complex and overengineered
DWARF standard that describes the unwind metadata that we are using to
locate these instructions, let's make this patching logic a little bit
more robust so that any issues with the unwind metadata detected at boot
time can de dealt with gracefully.

The DWARF annotations that are used for this are emitted at function
granularity, and due to the fact that the instructions we are patching
will simply behave as NOPs if left unpatched, we can abort on errors as
long as we don't leave any functions in a half-patched state.

So do a dry run of each FDE frame (covering a single function) before
performing the actual patching, and give up if the DWARF metadata cannot
be understood.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Link: https://lore.kernel.org/r/20221213142849.1629026-1-ardb@kernel.orgSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

54c968be

arm64: el2_setup.h: fix spelling typo in comments · bb457bdd

Prathu Baronia authored Jan 23, 2023

- "evailable" -> "available"
Signed-off-by: Prathu Baronia <prathubaronia2011@gmail.com>
Reviewed-by: Mukesh Ojha <quic_mojha@quicinc.com>
Link: https://lore.kernel.org/r/20230123110639.10473-1-prathubaronia2011@gmail.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

bb457bdd

arm64: Kconfig: fix spelling · 11fc944f

Randy Dunlap authored Jan 24, 2023

Fix spelling typos in arm64: (reported by codespell)
s/upto/up to/
s/familly/family/
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230124181605.14144-1-rdunlap@infradead.orgSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

11fc944f

arm64: head: Switch endianness before populating the ID map · 2ced0f30

Ard Biesheuvel authored Jan 25, 2023

Ensure that the endianness used for populating the ID map matches the
endianness that the running kernel will be using, as this is no longer
guaranteed now that create_idmap() is invoked before init_kernel_el().

Note that doing so is only safe if the MMU is off, as switching the
endianness with the MMU on results in the active ID map to become
invalid. So also clear the M bit when toggling the EE bit in SCTLR, and
mark the MMU as disabled at boot.

Note that the same issue has resulted in preserve_boot_args() recording
the contents of registers X0 ... X3 in the wrong byte order, although
this is arguably a very minor concern.

Fixes: 32b135a7 ("arm64: head: avoid cache invalidation when entering with the MMU on")
Reported-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20230125185910.962733-1-ardb@kernel.orgSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

2ced0f30

24 Jan, 2023 11 commits

efi: arm64: enter with MMU and caches enabled · 61786170

Ard Biesheuvel authored Jan 11, 2023

Instead of cleaning the entire loaded kernel image to the PoC and
disabling the MMU and caches before branching to the kernel's bare metal
entry point, we can leave the MMU and caches enabled, and rely on EFI's
cacheable 1:1 mapping of all of system RAM (which is mandated by the
spec) to populate the initial page tables.

This removes the need for managing coherency in software, which is
tedious and error prone.

Note that we still need to clean the executable region of the image to
the PoU if this is required for I/D coherency, but only if we actually
decided to move the image in memory, as otherwise, this will have been
taken care of by the loader.

This change affects both the builtin EFI stub as well as the zboot
decompressor, which now carries the entire EFI stub along with the
decompression code and the compressed image.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20230111102236.1430401-7-ardb@kernel.orgSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

61786170

arm64: head: Clean the ID map and the HYP text to the PoC if needed · 3dcf60bb

Ard Biesheuvel authored Jan 11, 2023

If we enter with the MMU and caches enabled, the bootloader may not have
performed any cache maintenance to the PoC. So clean the ID mapped page
to the PoC, to ensure that instruction and data accesses with the MMU
off see the correct data. For similar reasons, clean all the HYP text to
the PoC as well when entering at EL2 with the MMU and caches enabled.

Note that this means primary_entry() itself needs to be moved into the
ID map as well, as we will return from init_kernel_el() with the MMU and
caches off.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20230111102236.1430401-6-ardb@kernel.orgSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

3dcf60bb

arm64: head: avoid cache invalidation when entering with the MMU on · 32b135a7

Ard Biesheuvel authored Jan 11, 2023

If we enter with the MMU on, there is no need for explicit cache
invalidation for stores to memory, as they will be coherent with the
caches.

Let's take advantage of this, and create the ID map with the MMU still
enabled if that is how we entered, and avoid any cache invalidation
calls in that case.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20230111102236.1430401-5-ardb@kernel.orgSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

32b135a7

arm64: head: record the MMU state at primary entry · 9d7c13e5

Ard Biesheuvel authored Jan 11, 2023

Prepare for being able to deal with primary entry with the MMU and
caches enabled, by recording whether or not we entered with the MMU on
in register x19 and in a global variable. (Note that setting this
variable to '1' does not require cache invalidation, nor is it required
for storing the bootargs in that case, so omit the cache maintenance).

Since boot with the MMU and caches enabled is not permitted by the bare
metal boot protocol, ensure that a diagnostic is emitted and a taint bit
set if the MMU was found to be enabled on a non-EFI boot, and panic()
once the console is likely to be up. We will make an exception for EFI
boot later, which has strict requirements for the mapping of system
memory, permitting us to relax the boot protocol and hand over from the
EFI stub to the core kernel with MMU and caches left enabled.

While at it, add 'pre_disable_mmu_workaround' macro invocations to
init_kernel_el, as its manipulation of SCTLR_ELx may amount to disabling
of the MMU after subsequent patches.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20230111102236.1430401-4-ardb@kernel.orgSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

9d7c13e5

arm64: kernel: move identity map out of .text mapping · af7249b3

Ard Biesheuvel authored Jan 11, 2023

Reorganize the ID map slightly so that only code that is executed with
the MMU off or via the 1:1 mapping remains. This allows us to move the
identity map out of the .text segment, as it will no longer need
executable permissions via the kernel mapping.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20230111102236.1430401-3-ardb@kernel.orgSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

af7249b3

arm64: head: Move all finalise_el2 calls to after __enable_mmu · 82e49588

Ard Biesheuvel authored Jan 11, 2023

In the primary boot path, finalise_el2() is called much later than on
the secondary boot or resume-from-suspend paths, and this does not
appear to be intentional.

Since we aim to do as little as possible before enabling the MMU and
caches, align secondary and resume with primary boot, and defer the call
to after the MMU is turned on. This also removes the need to clean
finalise_el2() to the PoC once we enable support for booting with the
MMU on.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20230111102236.1430401-2-ardb@kernel.orgSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

82e49588

arm64: Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS · baaf553d

Mark Rutland authored Jan 23, 2023

This patch enables support for DYNAMIC_FTRACE_WITH_CALL_OPS on arm64.
This allows each ftrace callsite to provide an ftrace_ops to the common
ftrace trampoline, allowing each callsite to invoke distinct tracer
functions without the need to fall back to list processing or to
allocate custom trampolines for each callsite. This significantly speeds
up cases where multiple distinct trace functions are used and callsites
are mostly traced by a single tracer.

The main idea is to place a pointer to the ftrace_ops as a literal at a
fixed offset from the function entry point, which can be recovered by
the common ftrace trampoline. Using a 64-bit literal avoids branch range
limitations, and permits the ops to be swapped atomically without
special considerations that apply to code-patching. In future this will
also allow for the implementation of DYNAMIC_FTRACE_WITH_DIRECT_CALLS
without branch range limitations by using additional fields in struct
ftrace_ops.

As noted in the core patch adding support for
DYNAMIC_FTRACE_WITH_CALL_OPS, this approach allows for directly invoking
ftrace_ops::func even for ftrace_ops which are dynamically-allocated (or
part of a module), without going via ftrace_ops_list_func.

Currently, this approach is not compatible with CLANG_CFI, as the
presence/absence of pre-function NOPs changes the offset of the
pre-function type hash, and there's no existing mechanism to ensure a
consistent offset for instrumented and uninstrumented functions. When
CLANG_CFI is enabled, the existing scheme with a global ops->func
pointer is used, and there should be no functional change. I am
currently working with others to allow the two to work together in
future (though this will liekly require updated compiler support).

I've benchamrked this with the ftrace_ops sample module [1], which is
not currently upstream, but available at:

  https://lore.kernel.org/lkml/20230103124912.2948963-1-mark.rutland@arm.com
  git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git ftrace-ops-sample-20230109

Using that module I measured the total time taken for 100,000 calls to a
trivial instrumented function, with a number of tracers enabled with
relevant filters (which would apply to the instrumented function) and a
number of tracers enabled with irrelevant filters (which would not apply
to the instrumented function). I tested on an M1 MacBook Pro, running
under a HVF-accelerated QEMU VM (i.e. on real hardware).

Before this patch:

  Number of tracers     || Total time  | Per-call average time (ns)
  Relevant | Irrelevant || (ns)        | Total        | Overhead
  =========+============++=============+==============+============
         0 |          0 ||      94,583 |         0.95 |           -
         0 |          1 ||      93,709 |         0.94 |           -
         0 |          2 ||      93,666 |         0.94 |           -
         0 |         10 ||      93,709 |         0.94 |           -
         0 |        100 ||      93,792 |         0.94 |           -
  ---------+------------++-------------+--------------+------------
         1 |          1 ||   6,467,833 |        64.68 |       63.73
         1 |          2 ||   7,509,708 |        75.10 |       74.15
         1 |         10 ||  23,786,792 |       237.87 |      236.92
         1 |        100 || 106,432,500 |     1,064.43 |     1063.38
  ---------+------------++-------------+--------------+------------
         1 |          0 ||   1,431,875 |        14.32 |       13.37
         2 |          0 ||   6,456,334 |        64.56 |       63.62
        10 |          0 ||  22,717,000 |       227.17 |      226.22
       100 |          0 || 103,293,667 |      1032.94 |     1031.99
  ---------+------------++-------------+--------------+--------------

  Note: per-call overhead is estimated relative to the baseline case
  with 0 relevant tracers and 0 irrelevant tracers.

After this patch

  Number of tracers     || Total time  | Per-call average time (ns)
  Relevant | Irrelevant || (ns)        | Total        | Overhead
  =========+============++=============+==============+============
         0 |          0 ||      94,541 |         0.95 |           -
         0 |          1 ||      93,666 |         0.94 |           -
         0 |          2 ||      93,709 |         0.94 |           -
         0 |         10 ||      93,667 |         0.94 |           -
         0 |        100 ||      93,792 |         0.94 |           -
  ---------+------------++-------------+--------------+------------
         1 |          1 ||     281,000 |         2.81 |        1.86
         1 |          2 ||     281,042 |         2.81 |        1.87
         1 |         10 ||     280,958 |         2.81 |        1.86
         1 |        100 ||     281,250 |         2.81 |        1.87
  ---------+------------++-------------+--------------+------------
         1 |          0 ||     280,959 |         2.81 |        1.86
         2 |          0 ||   6,502,708 |        65.03 |       64.08
        10 |          0 ||  18,681,209 |       186.81 |      185.87
       100 |          0 || 103,550,458 |     1,035.50 |     1034.56
  ---------+------------++-------------+--------------+------------

  Note: per-call overhead is estimated relative to the baseline case
  with 0 relevant tracers and 0 irrelevant tracers.

As can be seen from the above:

a) Whenever there is a single relevant tracer function associated with a
   tracee, the overhead of invoking the tracer is constant, and does not
   scale with the number of tracers which are *not* associated with that
   tracee.

b) The overhead for a single relevant tracer has dropped to ~1/7 of the
   overhead prior to this series (from 13.37ns to 1.86ns). This is
   largely due to permitting calls to dynamically-allocated ftrace_ops
   without going through ftrace_ops_list_func.

I've run the ftrace selftests from v6.2-rc3, which reports:

| # of passed:  110
| # of failed:  0
| # of unresolved:  3
| # of untested:  0
| # of unsupported:  0
| # of xfailed:  1
| # of undefined(test bug):  0

... where the unresolved entries were the tests for DIRECT functions
(which are not supported), and the checkbashisms selftest (which is
irrelevant here):

| [8] Test ftrace direct functions against tracers        [UNRESOLVED]
| [9] Test ftrace direct functions against kprobes        [UNRESOLVED]
| [62] Meta-selftest: Checkbashisms       [UNRESOLVED]

... with all other tests passing (or failing as expected).
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Florent Revest <revest@chromium.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230123134603.1064407-9-mark.rutland@arm.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

baaf553d

arm64: ftrace: Update stale comment · 90955d77

Mark Rutland authored Jan 23, 2023

In commit:

  26299b3f ("ftrace: arm64: move from REGS to ARGS")

... we folded ftrace_regs_entry into ftrace_caller, and
ftrace_regs_entry no longer exists.

Update the comment accordingly.

There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Florent Revest <revest@chromium.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230123134603.1064407-8-mark.rutland@arm.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

90955d77

arm64: patching: Add aarch64_insn_write_literal_u64() · e4ecbe83

Mark Rutland authored Jan 23, 2023

In subsequent patches we'll need to atomically write to a
naturally-aligned 64-bit literal embedded within the kernel text.

Add a helper for this. For consistency with other text patching code we
use copy_to_kernel_nofault(), which is atomic for naturally-aligned
accesses up to 64-bits.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Florent Revest <revest@chromium.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230123134603.1064407-7-mark.rutland@arm.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

e4ecbe83

arm64: insn: Add helpers for BTI · 2bbbb401

Mark Rutland authored Jan 23, 2023

In subsequent patches we'd like to check whether an instruction is a
BTI. In preparation for this, add basic instruction helpers for BTI
instructions.

Per ARM DDI 0487H.a section C6.2.41, BTI is encoded in binary as
follows, MSB to LSB:

  1101 0101 000 0011 0010 0100 xx01 1111

Where the `xx` bits encode J/C/JC:

  00 : (omitted)
  01 : C
  10 : J
  11 : JC
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Florent Revest <revest@chromium.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230123134603.1064407-6-mark.rutland@arm.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

2bbbb401

arm64: Extend support for CONFIG_FUNCTION_ALIGNMENT · 47a15aa5

Mark Rutland authored Jan 23, 2023

On arm64 we don't align assembly function in the same way as C
functions. This somewhat limits the utility of
CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B for testing, and adds noise when
testing that we're correctly aligning functions as will be necessary for
ftrace in subsequent patches.

Follow the example of x86, and align assembly functions in the same way
as C functions. Selecting FUNCTION_ALIGNMENT_4B ensures
CONFIG_FUCTION_ALIGNMENT will be a minimum of 4 bytes, matching the
minimum alignment that __ALIGN and __ALIGN_STR provide prior to this
patch.

I've tested this by selecting CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B=y,
building and booting a kernel, and looking for misaligned text symbols:

Before, v6.2-rc3:
  # uname -rm
  6.2.0-rc3 aarch64
  # grep ' [Tt] ' /proc/kallsyms | grep -iv '[048c]0 [Tt] ' | wc -l
  5009

Before, v6.2-rc3 + fixed __cold:
  # uname -rm
  6.2.0-rc3-00001-g2a2bedf8bfa9 aarch64
  # grep ' [Tt] ' /proc/kallsyms | grep -iv '[048c]0 [Tt] ' | wc -l
  919

Before, v6.2-rc3 + fixed __cold + fixed ACPICA:
  # uname -rm
  6.2.0-rc3-00002-g267bddc38572 aarch64
  # grep ' [Tt] ' /proc/kallsyms | grep -iv '[048c]0 [Tt] ' | wc -l
  323
  # grep ' [Tt] ' /proc/kallsyms | grep -iv '[048c]0 [Tt] ' | grep acpi | wc -l
  0

After:
  # uname -rm
  6.2.0-rc3-00003-g71db61ee3ea1 aarch64
  # grep ' [Tt] ' /proc/kallsyms | grep -iv '[048c]0 [Tt] ' | wc -l
  112

Considering the remaining 112 unaligned text symbols:

* 20 are non-function KVM NVHE assembly symbols, which are never
  instrumented by ftrace:

  # grep ' [Tt] ' /proc/kallsyms | grep -iv '[048c]0 [Tt] ' | grep __kvm_nvhe | wc -l
  20
  # grep ' [Tt] ' /proc/kallsyms | grep -iv '[048c]0 [Tt] ' | grep __kvm_nvhe
  ffffbe6483f73784 t __kvm_nvhe___invalid
  ffffbe6483f73788 t __kvm_nvhe___do_hyp_init
  ffffbe6483f73ab0 t __kvm_nvhe_reset
  ffffbe6483f73b8c T __kvm_nvhe___hyp_idmap_text_end
  ffffbe6483f73b8c T __kvm_nvhe___hyp_text_start
  ffffbe6483f77864 t __kvm_nvhe___host_enter_restore_full
  ffffbe6483f77874 t __kvm_nvhe___host_enter_for_panic
  ffffbe6483f778a4 t __kvm_nvhe___host_enter_without_restoring
  ffffbe6483f81178 T __kvm_nvhe___guest_exit_panic
  ffffbe6483f811c8 T __kvm_nvhe___guest_exit
  ffffbe6483f81354 t __kvm_nvhe_abort_guest_exit_start
  ffffbe6483f81358 t __kvm_nvhe_abort_guest_exit_end
  ffffbe6483f81830 t __kvm_nvhe_wa_epilogue
  ffffbe6483f81844 t __kvm_nvhe_el1_trap
  ffffbe6483f81864 t __kvm_nvhe_el1_fiq
  ffffbe6483f81864 t __kvm_nvhe_el1_irq
  ffffbe6483f81884 t __kvm_nvhe_el1_error
  ffffbe6483f818a4 t __kvm_nvhe_el2_sync
  ffffbe6483f81920 t __kvm_nvhe_el2_error
  ffffbe6483f865c8 T __kvm_nvhe___start___kvm_ex_table

* 53 are position-independent functions only used during early boot, which are
  built with '-Os', but are never instrumented by ftrace:

  # grep ' [Tt] ' /proc/kallsyms | grep -iv '[048c]0 [Tt] ' | grep __pi | wc -l
  53

  We *could* drop '-Os' when building these for consistency, but that is
  not necessary to ensure that ftrace works correctly.

* The remaining 39 are non-function symbols, and 3 runtime BPF
  functions, which are never instrumented by ftrace:

  # grep ' [Tt] ' /proc/kallsyms | grep -iv '[048c]0 [Tt] ' | grep -v __kvm_nvhe | grep -v __pi | wc -l
  39
  # grep ' [Tt] ' /proc/kallsyms | grep -iv '[048c]0 [Tt] ' | grep -v __kvm_nvhe | grep -v __pi
  ffffbe6482e1009c T __irqentry_text_end
  ffffbe6482e10358 T __softirqentry_text_end
  ffffbe6482e1435c T __entry_text_end
  ffffbe6482e825f8 T __guest_exit_panic
  ffffbe6482e82648 T __guest_exit
  ffffbe6482e827d4 t abort_guest_exit_start
  ffffbe6482e827d8 t abort_guest_exit_end
  ffffbe6482e83030 t wa_epilogue
  ffffbe6482e83044 t el1_trap
  ffffbe6482e83064 t el1_fiq
  ffffbe6482e83064 t el1_irq
  ffffbe6482e83084 t el1_error
  ffffbe6482e830a4 t el2_sync
  ffffbe6482e83120 t el2_error
  ffffbe6482e93550 T sha256_block_neon
  ffffbe64830f3ae0 t e843419@01cc_00002a0c_3104
  ffffbe648378bd90 t e843419@09b3_0000d7cb_bc4
  ffffbe6483bdab20 t e843419@0c66_000116e2_34c8
  ffffbe6483f62c94 T __noinstr_text_end
  ffffbe6483f70a18 T __sched_text_end
  ffffbe6483f70b2c T __cpuidle_text_end
  ffffbe6483f722d4 T __lock_text_end
  ffffbe6483f73b8c T __hyp_idmap_text_end
  ffffbe6483f73b8c T __hyp_text_start
  ffffbe6483f865c8 T __start___kvm_ex_table
  ffffbe6483f870d0 t init_el1
  ffffbe6483f870f8 t init_el2
  ffffbe6483f87324 t pen
  ffffbe6483f87b48 T __idmap_text_end
  ffffbe64848eb010 T __hibernate_exit_text_start
  ffffbe64848eb124 T __hibernate_exit_text_end
  ffffbe64848eb124 T __relocate_new_kernel_start
  ffffbe64848eb260 T __relocate_new_kernel_end
  ffffbe648498a8e8 T _einittext
  ffffbe648498a8e8 T __exittext_begin
  ffffbe6484999d84 T __exittext_end
  ffff8000080756b4 t bpf_prog_6deef7357e7b4530    [bpf]
  ffff80000808dd78 t bpf_prog_6deef7357e7b4530    [bpf]
  ffff80000809d684 t bpf_prog_6deef7357e7b4530    [bpf]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Florent Revest <revest@chromium.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230123134603.1064407-5-mark.rutland@arm.comSigned-off-by: Catalin Marinas <catalin.marinas@arm.com>

47a15aa5