- 10 Feb, 2022 1 commit
-
-
Ard Biesheuvel authored
The SMC calling convention uses R7 as an argument register, which conflicts with its use as a frame pointer when building in Thumb2 mode. Given that Clang with ftrace does not permit frame pointers to be disabled, let's omit this compilation unit from ftrace instrumentation. Signed-off-by:
Ard Biesheuvel <ardb@kernel.org> Acked-by:
Nick Desaulniers <ndesaulniers@google.com>
-
- 09 Feb, 2022 10 commits
-
-
Ard Biesheuvel authored
Thumb2 uses R7 rather than R11 as the frame pointer, and even if we rarely use a frame pointer to begin with when building in Thumb2 mode, there are cases where it is required by the compiler (Clang when inserting profiling hooks via -pg) However, preserving and restoring the frame pointer is risky, as any unhandled exceptions raised in the mean time will produce a bogus backtrace, and it would be better not to touch the frame pointer at all. This is the case even when CONFIG_FRAME_POINTER is not set, as the unwind directive used by the unwinder may also use R7 or R11 as the unwind anchor, even if the frame pointer is not managed strictly according to the frame pointer ABI. So let's tweak the cacheflush asm code not to clobber R7 or R11 at all, so that we can drop R7 from the clobber lists of the inline asm blocks that call these routines, and remove the code that preserves/restores R11. Signed-off-by:
Ard Biesheuvel <ardb@kernel.org> Reviewed-by:
Nick Desaulniers <ndesaulniers@google.com>
-
Ard Biesheuvel authored
Thumb2 code uses R7 as the frame pointer rather than R11, because the opcodes to access it are generally shorter. This means that there are cases where we cannot simply add it to the clobber list of an asm() block, but need to preserve/restore it explicitly, or the compiler may complain in some cases (e.g., Clang builds with ftrace enabled). Since R11 is not special in that case, clobber it instead, and use it to preserve/restore the value of R7. Signed-off-by:
Ard Biesheuvel <ardb@kernel.org> Reviewed-by:
Masami Hiramatsu <mhiramat@kernel.org>
-
Ard Biesheuvel authored
Enable the function graph tracer in combination with the EABI unwinder, so that Thumb2 builds or Clang ARM builds can make use of it. This involves using the unwinder to locate the return address of an instrumented function on the stack, so that it can be overridden and made to refer to the ftrace handling routines that need to be called at function return. Given that for these builds, it is not guaranteed that the value of the link register is stored on the stack, fall back to the stack slot that will be used by the ftrace exit code to restore LR in the instrumented function's execution context. Signed-off-by:
Ard Biesheuvel <ardb@kernel.org> Reviewed-by:
Steven Rostedt (Google) <rostedt@goodmis.org>
-
Ard Biesheuvel authored
The ftrace graph tracer needs to override the return address of an instrumented function, in order to install a hook that gets invoked when the function returns again. Currently, we only support this when building for ARM using GCC with frame pointers, as in this case, it is guaranteed that the function will reload LR from [FP, #-4] in all cases, and we can simply pass that address to the ftrace code. In order to support this for configurations that rely on the EABI unwinder, such as Thumb2 builds, make the unwinder keep track of the address from which LR was unwound, permitting ftrace to make use of this in a subsequent patch. Drop the call to is_kernel_text_address(), which is problematic in terms of ftrace recursion, given that it may be instrumented itself. The call is redundant anyway, as no unwind directives will be found unless the PC points to memory that is known to contain executable code. Signed-off-by:
Ard Biesheuvel <ardb@kernel.org> Reviewed-by:
Nick Desaulniers <ndesaulniers@google.com>
-
Ard Biesheuvel authored
Fix the frame pointer handling in the function graph tracer entry and exit code so we can enable HAVE_FUNCTION_GRAPH_FP_TEST. Instead of using FP directly (which will have different values between the entry and exit pieces of the function graph tracer), use the value of SP at entry and exit, as we can derive the former value from the frame pointer. Signed-off-by:
Ard Biesheuvel <ardb@kernel.org> Reviewed-by:
Steven Rostedt (Google) <rostedt@goodmis.org>
-
Ard Biesheuvel authored
Avoid explicit literal loads and instead, use accessor macros that generate the optimal sequence depending on the architecture revision being targeted. Signed-off-by:
Ard Biesheuvel <ardb@kernel.org> Reviewed-by:
Nick Desaulniers <ndesaulniers@google.com> Reviewed-by:
Steven Rostedt (Google) <rostedt@goodmis.org>
-
Ard Biesheuvel authored
Tweak the ftrace return paths to avoid redundant loads of SP, as well as unnecessary clobbering of IP. This also fixes the inconsistency of using MOV to perform a function return, which is sub-optimal on recent micro-architectures but more importantly, does not perform an interworking return, unlike compiler generated function returns in Thumb2 builds. Let's fix this by popping PC from the stack like most ordinary code does. Signed-off-by:
Ard Biesheuvel <ardb@kernel.org> Reviewed-by:
Steven Rostedt (Google) <rostedt@goodmis.org>
-
Ard Biesheuvel authored
Kernel images that are large in comparison to the range of a direct branch may fail to work as expected with ftrace, as patching a direct branch to one of the core ftrace routines may not be possible from the .init.text section, if it is emitted too far away from the normal .text section. This is more likely to affect Thumb2 builds, given that its range is only -/+ 16 MiB (as opposed to ARM which has -/+ 32 MiB), but may occur in either ISA. To work around this, add a couple of trampolines to .init.text and swap these in when the ftrace patching code is operating on callers in .init.text. Signed-off-by:
Ard Biesheuvel <ardb@kernel.org> Reviewed-by:
Nick Desaulniers <ndesaulniers@google.com> Reviewed-by:
Linus Walleij <linus.walleij@linaro.org> Reviewed-by:
Steven Rostedt (Google) <rostedt@goodmis.org>
-
Ard Biesheuvel authored
The compiler emitted hook used for ftrace consists of a PUSH {LR} to preserve the link register, followed by a branch-and-link (BL) to __gnu_mount_nc. Dynamic ftrace patches away the latter to turn the combined sequence into a NOP, using a POP {LR} instruction. This is not necessary, since the link register does not get clobbered in this case, and simply adding #4 to the stack pointer is sufficient, and avoids a memory access that may take a few cycles to resolve depending on the micro-architecture. Signed-off-by:
Ard Biesheuvel <ardb@kernel.org> Reviewed-by:
Nick Desaulniers <ndesaulniers@google.com> Reviewed-by:
Linus Walleij <linus.walleij@linaro.org> Reviewed-by:
Steven Rostedt (Google) <rostedt@goodmis.org>
-
Ard Biesheuvel authored
Using ADR to take the address of 'ftrace_stub' via a local label produces an address that has the Thumb bit cleared, which means the subsequent comparison is guaranteed to fail. Instead, use the badr macro, which forces the Thumb bit to be set. Fixes: a3ba87a6 ("ARM: 6316/1: ftrace: add Thumb-2 support") Signed-off-by:
Ard Biesheuvel <ardb@kernel.org> Reviewed-by:
Nick Desaulniers <ndesaulniers@google.com> Reviewed-by:
Steven Rostedt (Google) <rostedt@goodmis.org> Reviewed-by:
Linus Walleij <linus.walleij@linaro.org>
-
- 31 Jan, 2022 2 commits
-
-
Russell King (Oracle) authored
Merge tag 'arm-vmap-stacks-v6' of git://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux into devel-stable ARM: support for IRQ and vmap'ed stacks [v6] This tag covers the changes between the version of vmap'ed + IRQ stacks support pulled into rmk/devel-stable [0] (which was dropped from v5.17 due to issues discovered too late in the cycle), and my v5 proposed for the v5.18 cycle [1]. [0] git://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git arm-irq-and-vmap-stacks-for-rmk [1] https://lore.kernel.org/linux-arm-kernel/20220124174744.1054712-1-ardb@kernel.org/
-
Ard Biesheuvel authored
The get_current() and __my_cpu_offset() accessors evaluate to only a single instruction emitted inline, but due to the size of the asm string that is created for SMP+v6 configurations, the compiler assumes otherwise, and may emit the functions out of line instead. So use __always_inline to avoid this. Signed-off-by:
Ard Biesheuvel <ardb@kernel.org> Reviewed-by:
Nick Desaulniers <ndesaulniers@google.com>
-
- 25 Jan, 2022 5 commits
-
-
Ard Biesheuvel authored
Only SMP systems use the secondary startup path by definition, so there is no need for SMP conditionals there. Signed-off-by:
Ard Biesheuvel <ardb@kernel.org>
-
Ard Biesheuvel authored
The build bots complain about iop_handle_irq() not being declared so let's make it static instead. Signed-off-by:
Ard Biesheuvel <ardb@kernel.org>
-
Ard Biesheuvel authored
Rework the vmalloc_seq handling so it can be used safely under SMP, as we started using it to ensure that vmap'ed stacks are guaranteed to be mapped by the active mm before switching to a task, and here we need to ensure that changes to the page tables are visible to other CPUs when they observe a change in the sequence count. Since LPAE needs none of this, fold a check against it into the vmalloc_seq counter check after breaking it out into a separate static inline helper. Given that vmap'ed stacks are now also supported on !SMP configurations, let's drop the WARN() that could potentially now fire spuriously. Signed-off-by:
Ard Biesheuvel <ardb@kernel.org>
-
Ard Biesheuvel authored
Avoid using R9 in the IRQ handler code, as the entry code uses it for tsk, and expects it to remain untouched between the IRQ entry and exit code. Signed-off-by:
Ard Biesheuvel <ardb@kernel.org>
-
Ard Biesheuvel authored
Use the SMP_ON_UP patching framework to elide HWCAP_TLS tests from the context switch and return to userspace code paths, as SMP systems are guaranteed to have this h/w capability. At the same time, omit the update of __entry_task if the system is detected to be UP at runtime, as in that case, the value is never used. Signed-off-by:
Ard Biesheuvel <ardb@kernel.org>
-
- 24 Jan, 2022 2 commits
-
-
Ard Biesheuvel authored
Nathan reports the group relocations go out of range in pathological cases such as allyesconfig kernels, which have little chance of actually booting but are still used in validation. So add a Kconfig symbol for this feature, and make it depend on !COMPILE_TEST. Signed-off-by:
Ard Biesheuvel <ardb@kernel.org>
-
Ard Biesheuvel authored
When onlining a CPU, switch to swapper_pg_dir as soon as possible so that it is guaranteed that the vmap'ed stack is mapped before it is used. Signed-off-by:
Ard Biesheuvel <ardb@kernel.org>
-
- 06 Jan, 2022 1 commit
-
-
Ard Biesheuvel authored
Nathan reports that the new get_current() and per-CPU offset accessors may cause problems at build time due to the use of a literal to hold the address of the respective variables. This is due to the fact that LLD before v14 does not support the PC-relative group relocations that are normally used for this, and the fallback relies on literals but does not emit the literal pools explictly using the .ltorg directive. ./arch/arm/include/asm/current.h:53:6: error: out of range pc-relative fixup value asm(LOAD_SYM_ARMV6(%0, __current) : "=r"(cur)); ^ ./arch/arm/include/asm/insn.h:25:2: note: expanded from macro 'LOAD_SYM_ARMV6' " ldr " #reg ", =" #sym " nt" ^ <inline asm>:1:3: note: instantiated into assembly here ldr r0, =__current ^ Since emitting a literal pool in this particular case is not possible, let's avoid the LOAD_SYM_ARMV6() entirely, and use the ordinary C assigment instead. As it turns out, there are other such cases, and here, using .ltorg to emit the literal pool within range of the LDR instruction would be possible due to the presence of an unconditional branch right after it. Unfortunately, putting .ltorg directives in subsections appears to confuse the Clang inline assembler, resulting in similar errors even though the .ltorg is most definitely within range. So let's fix this by emitting the literal explicitly, and not rely on the assembler to figure this out. This means we have move the fallback out of the LOAD_SYM_ARMV6() macro and into the callers. Link: https://github.com/ClangBuiltLinux/linux/issues/1551 Fixes: 9c46929e ("ARM: implement THREAD_INFO_IN_TASK for uniprocessor systems") Reported-by:
Nathan Chancellor <natechancellor@gmail.com> Tested-by:
Nathan Chancellor <nathan@kernel.org> Signed-off-by:
Ard Biesheuvel <ardb@kernel.org> Signed-off-by:
Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
-
- 05 Jan, 2022 1 commit
-
-
Ard Biesheuvel authored
There are several reports about the new vmap'ed stacks code breaking suspend/resume on Exynos, Renesas and Tegra SMP platforms. While this is under investigation, let's disable the vmap'ed stacks feature for the time being for SMP configurations that have suspend/resume enabled. [0] https://lore.kernel.org/linux-arm-kernel/20211122092816.2865873-8-ardb@kernel.org/ Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Jon Hunter <jonathanh@nvidia.com> Signed-off-by:
Ard Biesheuvel <ardb@kernel.org> Signed-off-by:
Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
-
- 17 Dec, 2021 1 commit
-
-
Russell King (Oracle) authored
Merge tag 'arm-irq-and-vmap-stacks-for-rmk' of git://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux into devel-stable ARM: support for IRQ and vmap'ed stacks This PR covers all the work related to implementing IRQ stacks and vmap'ed stacks for all 32-bit ARM systems that are currently supported by the Linux kernel, including RiscPC and Footbridge. It has been submitted for review in three different waves: - IRQ stacks support for v7 SMP systems [0], - vmap'ed stacks support for v7 SMP systems[1], - extending support for both IRQ stacks and vmap'ed stacks for all remaining configurations, including v6/v7 SMP multiplatform kernels and uniprocessor configurations including v7-M [2] [0] https://lore.kernel.org/linux-arm-kernel/20211115084732.3704393-1-ardb@kernel.org/ [1] https://lore.kernel.org/linux-arm-kernel/20211122092816.2865873-1-ardb@kernel.org/ [2] https://lore.kernel.org/linux-arm-kernel/20211206164659.1495084-1-ardb@kernel.org/
-
- 06 Dec, 2021 13 commits
-
-
Ard Biesheuvel authored
Enable support for IRQ stacks on !MMU, and add the code to the IRQ entry path to switch to the IRQ stack if not running from it already. Signed-off-by:
Ard Biesheuvel <ardb@kernel.org> Tested-by:
Marc Zyngier <maz@kernel.org> Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M
-
Ard Biesheuvel authored
On UP systems, only a single task can be 'current' at the same time, which means we can use a global variable to track it. This means we can also enable THREAD_INFO_IN_TASK for those systems, as in that case, thread_info is accessed via current rather than the other way around, removing the need to store thread_info at the base of the task stack. This, in turn, permits us to enable IRQ stacks and vmap'ed stacks on UP systems as well. To partially mitigate the performance overhead of this arrangement, use a ADD/ADD/LDR sequence with the appropriate PC-relative group relocations to load the value of current when needed. This means that accessing current will still only require a single load as before, avoiding the need for a literal to carry the address of the global variable in each function. However, accessing thread_info will now require this load as well. Acked-by:
Linus Walleij <linus.walleij@linaro.org> Acked-by:
Nicolas Pitre <nico@fluxnic.net> Signed-off-by:
Ard Biesheuvel <ardb@kernel.org> Tested-by:
Marc Zyngier <maz@kernel.org> Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M
-
Ard Biesheuvel authored
Defer TPIDURO updates for user space until exit also for CPU_V6+SMP configurations so that we can decide at runtime whether to use it to carry the current pointer, provided that we are running on a CPU that actually implements this register. This is needed for THREAD_INFO_IN_TASK support for UP systems, which requires that all SMP capable systems use the TPIDRURO based access to 'current' as the only remaining alternative will be a global variable which only works on UP. Acked-by:
Linus Walleij <linus.walleij@linaro.org> Acked-by:
Nicolas Pitre <nico@fluxnic.net> Signed-off-by:
Ard Biesheuvel <ardb@kernel.org> Tested-by:
Marc Zyngier <maz@kernel.org> Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M
-
Ard Biesheuvel authored
Enable the use of the TLS register to hold the 'current' pointer also on non-SMP configurations that target v6k or later CPUs. This will permit the use of THREAD_INFO_IN_TASK as well as IRQ stacks and vmap'ed stacks for such configurations. Acked-by:
Linus Walleij <linus.walleij@linaro.org> Acked-by:
Nicolas Pitre <nico@fluxnic.net> Acked-by:
Arnd Bergmann <arnd@arndb.de> Signed-off-by:
Ard Biesheuvel <ardb@kernel.org> Tested-by:
Marc Zyngier <maz@kernel.org> Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M
-
Ard Biesheuvel authored
Permit the use of the TPIDRPRW system register for carrying the per-CPU offset in generic SMP configurations that also target non-SMP capable ARMv6 cores. This uses the SMP_ON_UP code patching framework to turn all TPIDRPRW accesses into reads/writes of entry #0 in the __per_cpu_offset array. While at it, switch over some existing direct TPIDRPRW accesses in asm code to invocations of a new helper that is patched in the same way when necessary. Note that CPU_V6+SMP without SMP_ON_UP results in a kernel that does not boot on v6 CPUs without SMP extensions, so add this dependency to Kconfig as well. Acked-by:
Linus Walleij <linus.walleij@linaro.org> Acked-by:
Nicolas Pitre <nico@fluxnic.net> Signed-off-by:
Ard Biesheuvel <ardb@kernel.org> Tested-by:
Marc Zyngier <maz@kernel.org> Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M
-
Ard Biesheuvel authored
We will be adding variable loads to various hot paths, so it makes sense to add a helper macro that can load variables from asm code without the use of literal pool entries. On v7 or later, we can simply use MOVW/MOVT pairs, but on earlier cores, this requires a bit of hackery to emit a instruction sequence that implements this using a sequence of ADD/LDR instructions. Acked-by:
Linus Walleij <linus.walleij@linaro.org> Acked-by:
Nicolas Pitre <nico@fluxnic.net> Signed-off-by:
Ard Biesheuvel <ardb@kernel.org> Tested-by:
Marc Zyngier <maz@kernel.org> Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M
-
Ard Biesheuvel authored
Add support for the R_ARM_ALU_PC_Gn_NC and R_ARM_LDR_PC_G2 group relocations [0] so we can use them in modules. These will be used to load the current task pointer from a global variable without having to rely on a literal pool entry to carry the address of this variable, which may have a significant negative impact on cache utilization for variables that are used often and in many different places, as each occurrence will result in a literal pool entry and therefore a line in the D-cache. [0] 'ELF for the ARM architecture' https://github.com/ARM-software/abi-aa/releasesAcked-by:
Linus Walleij <linus.walleij@linaro.org> Acked-by:
Nicolas Pitre <nico@fluxnic.net> Signed-off-by:
Ard Biesheuvel <ardb@kernel.org> Tested-by:
Marc Zyngier <maz@kernel.org> Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M
-
Ard Biesheuvel authored
Tweak the UP stack protector handling code so that the thread info pointer is preserved in R7 until set_current is called. This is needed for a subsequent patch that implements THREAD_INFO_IN_TASK and set_current for UP as well. This also means we will prefer the per-task protector on UP systems that implement the thread ID registers, so tweak the preprocessor conditionals to reflect this. Acked-by:
Linus Walleij <linus.walleij@linaro.org> Acked-by:
Nicolas Pitre <nico@fluxnic.net> Signed-off-by:
Ard Biesheuvel <ardb@kernel.org> Tested-by:
Marc Zyngier <maz@kernel.org> Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M
-
Vladimir Murzin authored
Rather then restructuring the ARMv7M entrly logic per TODO, just move NVIC to GENERIC_IRQ_MULTI_HANDLER. Signed-off-by:
Vladimir Murzin <vladimir.murzin@arm.com> Acked-by:
Mark Rutland <mark.rutland@arm.com> Acked-by:
Arnd Bergmann <arnd@arndb.de> Acked-by:
Marc Zyngier <maz@kernel.org> Signed-off-by:
Ard Biesheuvel <ardb@kernel.org> Tested-by:
Marc Zyngier <maz@kernel.org> Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M
-
Arnd Bergmann authored
The last user of arch_irq_handler_default is gone now, so the entry-macro-multi.S file and all references to mach/entry-macro.S can be removed, as well as the asm_do_IRQ() entrypoint into the interrupt handling routines implemented in C. Note: The ARMv7-M entry still uses its own top-level IRQ entry, calling nvic_handle_irq() from assembly. This could be changed to go through generic_handle_arch_irq() as well, but it's unclear to me if there are any benefits. Signed-off-by:
Arnd Bergmann <arnd@arndb.de> [ardb: keep irq_handler macro as it carries all the IRQ stack handling] Signed-off-by:
Ard Biesheuvel <ardb@kernel.org> Tested-by:
Marc Zyngier <maz@kernel.org> Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M Reviewed-by:
Linus Walleij <linus.walleij@linaro.org>
-
Arnd Bergmann authored
iop32x uses the entry-macro.S file for both the IRQ entry and for hooking into the arch_ret_to_user code path. This is done because the cp6 registers have to be enabled before accessing any of the interrupt controller registers but have to be disabled when running in user space. There is also a lazy-enable logic in cp6.c, but during a hardirq, we know it has to be enabled. Both the cp6-enable code and the code to read the IRQ status can be lifted into the normal generic_handle_arch_irq() path, but the cp6-disable code has to remain in the user return code. As nothing other than iop32x uses this hook, just open-code it there with an ifdef for the platform that can eventually be removed when iop32x has reached the end of its life. The cp6-enable path in the IRQ entry has an extra cp_wait barrier that the trap version does not have, but it is harmless to do it in both cases to simplify the logic here at the cost of a few extra cycles for the trap. Signed-off-by:
Arnd Bergmann <arnd@arndb.de> Signed-off-by:
Ard Biesheuvel <ardb@kernel.org> Tested-by:
Marc Zyngier <maz@kernel.org> Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M
-
Arnd Bergmann authored
iop32x is one of the last platforms to use IRQ 0, and this has apparently stopped working in a 2014 cleanup without anyone noticing. This interrupt is used for the DMA engine, so most likely this has not actually worked in the past 7 years, but it's also not essential for using this board. I'm splitting out this change from my GENERIC_IRQ_MULTI_HANDLER conversion so it can be backported if anyone cares. Fixes: a71b092a ("ARM: Convert handle_IRQ to use __handle_domain_irq") Signed-off-by:
Arnd Bergmann <arnd@arndb.de> [ardb: take +1 offset into account in mask/unmask and init as well] Signed-off-by:
Ard Biesheuvel <ardb@kernel.org> Tested-by:
Marc Zyngier <maz@kernel.org> Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M Reviewed-by:
Linus Walleij <linus.walleij@linaro.org>
-
Arnd Bergmann authored
Footbridge still uses the classic IRQ entry path in assembler, but this is easily converted into an equivalent C version. In this case, the correlation between IRQ numbers and bits in the status register is non-obvious, and the priorities are handled by manually checking each bit in a static order, re-reading the status register after each handled event. I moved the code into the new file and edited the syntax without changing this sequence to keep the behavior as close as possible to what it traditionally did. Signed-off-by:
Arnd Bergmann <arnd@arndb.de> Signed-off-by:
Ard Biesheuvel <ardb@kernel.org> Tested-by:
Marc Zyngier <maz@kernel.org> Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M Reviewed-by:
Linus Walleij <linus.walleij@linaro.org>
-
- 03 Dec, 2021 4 commits
-
-
Arnd Bergmann authored
This is one of the last platforms using the old entry path. While this code path is spread over a few files, it is fairly straightforward to convert it into an equivalent C version, leaving the existing algorithm and all the priority handling the same. Unlike most irqchip drivers, this means reading the status register(s) in a loop and always handling the highest-priority irq first. The IOMD_IRQREQC and IOMD_IRQREQD registers are not actaully used here, but I left the code in place for the time being, to keep the conversion as direct as possible. It could be removed in a cleanup on top. Signed-off-by:
Arnd Bergmann <arnd@arndb.de> [ardb: drop obsolete IOMD_IRQREQC/IOMD_IRQREQD handling] Signed-off-by:
Ard Biesheuvel <ardb@kernel.org> Tested-by:
Marc Zyngier <maz@kernel.org> Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M
-
Ard Biesheuvel authored
IOMD_IRQREQC nor IOMD_IRQREQD are ever defined, so any conditionally compiled code that depends on them is dead code, and can be removed. Suggested-by:
Russell King <linux@armlinux.org.uk> Signed-off-by:
Ard Biesheuvel <ardb@kernel.org>
-
Ard Biesheuvel authored
Wire up the generic support for managing task stack allocations via vmalloc, and implement the entry code that detects whether we faulted because of a stack overrun (or future stack overrun caused by pushing the pt_regs array) While this adds a fair amount of tricky entry asm code, it should be noted that it only adds a TST + branch to the svc_entry path. The code implementing the non-trivial handling of the overflow stack is emitted out-of-line into the .text section. Since on ARM, we rely on do_translation_fault() to keep PMD level page table entries that cover the vmalloc region up to date, we need to ensure that we don't hit such a stale PMD entry when accessing the stack. So we do a dummy read from the new stack while still running from the old one on the context switch path, and bump the vmalloc_seq counter when PMD level entries in the vmalloc range are modified, so that the MM switch fetches the latest version of the entries. Note that we need to increase the per-mode stack by 1 word, to gain some space to stash a GPR until we know it is safe to touch the stack. However, due to the cacheline alignment of the struct, this does not actually increase the memory footprint of the struct stack array at all. Signed-off-by:
Ard Biesheuvel <ardb@kernel.org> Tested-by:
Keith Packard <keithpac@amazon.com> Tested-by:
Marc Zyngier <maz@kernel.org> Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M
-
Ard Biesheuvel authored
The original Thumb-2 enablement patches updated the stack realignment code in svc_entry to work around the lack of a STMIB instruction in Thumb-2, by subtracting 4 from the frame size, inverting the sense of the misaligment check, and changing to a STMIA instruction and a final stack push of a 4 byte quantity that results in the stack becoming aligned at the end of the sequence. It also pushes and pops R0 to the stack in order to have a temp register that Thumb-2 allows in general purpose ALU instructions, as TST using SP is not permitted. Both are a bit problematic for vmap'ed stacks, as using the stack is only permitted after we decide that we did not overflow the stack, or have already switched to the overflow stack. As for the alignment check: the current approach creates a corner case where, if the initial SUB of SP ends up right at the start of the stack, we will end up subtracting another 8 bytes and overflowing it. This means we would need to add the overflow check *after* the SUB that deliberately misaligns the stack. However, this would require us to keep local state (i.e., whether we performed the subtract or not) across the overflow check, but without any GPRs or stack available. So let's switch to an approach where we don't use the stack, and where the alignment check of the stack pointer occurs in the usual way, as this is guaranteed not to result in overflow. This means we will be able to do the overflow check first. While at it, switch to R1 so the mode stack pointer in R0 remains accessible. Acked-by:
Nicolas Pitre <nico@fluxnic.net> Signed-off-by:
Ard Biesheuvel <ardb@kernel.org> Tested-by:
Marc Zyngier <maz@kernel.org> Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M
-