- 12 Jul, 2018 17 commits
-
-
Mark Rutland authored
Using this helper allows us to avoid the in-kernel calls to the compat_sys_{f,}statfs64() syscalls, which are necessary for parameter mangling in arm64's compat handling. Following the example of ksys_* functions, kcompat_sys_* functions are intended to be a drop-in replacement for their compat_sys_* counterparts, with the same calling convention. This is necessary to enable conversion of arm64's syscall handling to use pt_regs wrappers. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Signed-off-by: Will Deacon <will.deacon@arm.com>
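A minimal sketch of the pattern (the exact prototypes here are illustrative, based on the fs/statfs.c compat handlers of that era): the body of the compat handler moves into a kcompat_sys_*() function with the same calling convention, the COMPAT_SYSCALL_DEFINE wrapper forwards to it, and in-kernel callers such as arm64's compat mangling code call the helper directly.

    int kcompat_sys_statfs64(const char __user *pathname, compat_size_t sz,
                             struct compat_statfs64 __user *buf)
    {
            /* former compat_sys_statfs64() body lives here */
    }

    COMPAT_SYSCALL_DEFINE3(statfs64, const char __user *, pathname,
                           compat_size_t, sz,
                           struct compat_statfs64 __user *, buf)
    {
            return kcompat_sys_statfs64(pathname, sz, buf);
    }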
-
Mark Rutland authored
Using this helper allows us to avoid the in-kernel call to the sys_personality() syscall. The ksys_ prefix denotes that this function is meant as a drop-in replacement for the syscall. In particular, it uses the same calling convention as sys_personality(). Since ksys_personality is trivial, it is implemented directly in <linux/syscalls.h>, as we do for ksys_close() and friends. This helper is necessary to enable conversion of arm64's syscall handling to use pt_regs wrappers. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@infradead.org> Cc: Dave Martin <dave.martin@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
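As a rough sketch (based on the documented behaviour of sys_personality(); treat the body as illustrative), the trivial inline helper looks like:

    /* Drop-in replacement for sys_personality(), callable from kernel code. */
    static inline long ksys_personality(unsigned int personality)
    {
            unsigned int old = current->personality;

            if (personality != 0xffffffff)
                    set_personality(personality);

            return old;
    }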
-
Mark Rutland authored
Our syscall tables are aligned to 4096 bytes, which allowed their addresses to be generated with a single adrp in entry.S. This has the unfortunate property of wasting space in .rodata for the necessary padding. Now that the address is generated by C code, we can rely on the compiler to do the right thing, and drop the alignment. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Mark Rutland authored
We can zero GPRs x0 - x29 upon entry from EL0 to make it harder for userspace to control values consumed by speculative gadgets. We don't blat x30, since this is stashed much later, and we'll blat it before invoking C code. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Mark Rutland authored
Now that all of the syscall logic works on the saved pt_regs, apply_ssbd can safely corrupt x0-x3 in the entry paths, and we no longer need to restore them. So let's remove the logic doing so. With that logic gone, we can fold the branch target into the macro, so that callers need not deal with this. GAS provides \@, which provides a unique value per macro invocation, which we can use to create a unique label. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Mark Rutland authored
Now that syscalls are invoked with pt_regs, we no longer need to ensure that the argument registers are live in the entry assembly, and it's fine to not restore them after context_tracking_user_exit() has corrupted them. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Mark Rutland authored
Now that the syscall invocation logic is in C, we can migrate the rest of the syscall entry logic over, so that the entry assembly needn't look at the register values at all. The SVE reset across syscall logic now unconditionally clears TIF_SVE, but sve_user_disable() will only write back to CPACR_EL1 when SVE is actually enabled. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Dave Martin <dave.martin@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
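A hedged sketch of the C-side SVE reset described above (the helper name sve_user_discard is the one used in arm64's syscall.c; the body is simplified):

    /* Unconditionally drop TIF_SVE on syscall entry; sve_user_disable()
     * only writes CPACR_EL1 if SVE access was actually enabled. */
    static inline void sve_user_discard(void)
    {
            if (!system_supports_sve())
                    return;

            clear_thread_flag(TIF_SVE);
            sve_user_disable();
    }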
-
Mark Rutland authored
Currently syscall tracing is a tricky assembly state machine, which can be rather difficult to follow, and even harder to modify. Before we start fiddling with it for pt_regs syscalls, let's convert it to C. This is not intended to have any functional change. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
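In outline, the C replacement for the trace state machine reads roughly like the sketch below (simplified; return-value handling and the various trace flags are elided):

    static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
                               const syscall_fn_t syscall_table[])
    {
            unsigned long flags = current_thread_info()->flags;

            if (has_syscall_work(flags)) {
                    /* the tracer may rewrite or skip the syscall */
                    scno = syscall_trace_enter(regs);
                    if (scno == NO_SYSCALL)
                            goto trace_exit;
            }

            invoke_syscall(regs, scno, sc_nr, syscall_table);

    trace_exit:
            syscall_trace_exit(regs);
    }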
-
Mark Rutland authored
As a first step towards invoking syscalls with a pt_regs argument, convert the raw syscall invocation logic to C. We end up with a bit more register shuffling, but the unified invocation logic means we can unify the tracing paths, too. Previously, assembly had to open-code calls to ni_sys() when the system call number was out-of-bounds for the relevant syscall table. This case is now handled by invoke_syscall(), and the assembly no longer needs to handle this case explicitly. This allows the tracing paths to be simplified and unified, as we no longer need the __ni_sys_trace path and the __sys_trace_return label. This only converts the invocation of the syscall. The rest of the syscall triage and tracing is left in assembly for now, and will be converted in subsequent patches. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
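A sketch of the unified invocation path (simplified from the real code):

    static long __invoke_syscall(struct pt_regs *regs, syscall_fn_t syscall_fn)
    {
            return syscall_fn(regs->regs[0], regs->regs[1], regs->regs[2],
                              regs->regs[3], regs->regs[4], regs->regs[5]);
    }

    static void invoke_syscall(struct pt_regs *regs, unsigned int scno,
                               unsigned int sc_nr,
                               const syscall_fn_t syscall_table[])
    {
            long ret;

            if (scno < sc_nr) {
                    /* bound the table index against speculation */
                    syscall_fn_t fn = syscall_table[array_index_nospec(scno, sc_nr)];
                    ret = __invoke_syscall(regs, fn);
            } else {
                    ret = do_ni_syscall(regs);
            }

            regs->regs[0] = ret;
    }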
-
Mark Rutland authored
In preparation for invoking arbitrary syscalls from C code, let's define a type for an arbitrary syscall, matching the parameter passing rules of the AAPCS. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
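For reference, a sketch of such a type, matching the AAPCS rule that the (up to six) syscall arguments are passed in x0-x5 and the result is returned in x0:

    typedef long (*syscall_fn_t)(unsigned long, unsigned long,
                                 unsigned long, unsigned long,
                                 unsigned long, unsigned long);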
-
Mark Rutland authored
The arm64 sigreturn* syscall handlers are non-standard. Rather than taking a number of user parameters in registers as per the AAPCS, they expect the pt_regs as their sole argument. To make this work, we override the syscall definitions to invoke wrappers written in assembly, which mov the SP into x0, and branch to their respective C functions. On other architectures (such as x86), the sigreturn* functions take no argument and instead use current_pt_regs() to acquire the user registers. This requires less boilerplate code, and allows for other features such as interposing C code in this path. This patch takes the same approach for arm64. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Tentatively-reviewed-by: Dave Martin <dave.martin@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
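An illustrative sketch of the resulting shape (frame-restore details elided):

    /* No assembly wrapper: the handler fetches the user registers itself. */
    SYSCALL_DEFINE0(rt_sigreturn)
    {
            struct pt_regs *regs = current_pt_regs();

            /* ... restore the signal frame from the user stack at regs->sp ... */

            return regs->regs[0];
    }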
-
Mark Rutland authored
In subsequent patches, we'll want to make use of sve_user_enable() and sve_user_disable() outside of kernel/fpsimd.c. Let's move these to <asm/fpsimd.h> where we can make use of them. To avoid ifdeffery in sequences like: if (system_supports_sve() && some_condition) sve_user_disable(); ... empty stubs are provided when support for SVE is not enabled. Note that system_supports_sve() contains an IS_ENABLED(CONFIG_ARM64_SVE) check, so the sve_user_disable() call should be optimized away entirely when CONFIG_ARM64_SVE is not selected. To ensure that this is the case, the stub definitions contain a BUILD_BUG(), as we do for other stubs for which calls should always be optimized away when the relevant config option is not selected. At the same time, the include list of <asm/fpsimd.h> is sorted while adding <asm/sysreg.h>. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Dave Martin <dave.martin@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
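Roughly, the moved helpers and their stubs look like this sketch:

    #ifdef CONFIG_ARM64_SVE
    static inline void sve_user_disable(void)
    {
            sysreg_clear_set(cpacr_el1, CPACR_EL1_ZEN_EL0EN, 0);
    }

    static inline void sve_user_enable(void)
    {
            sysreg_clear_set(cpacr_el1, 0, CPACR_EL1_ZEN_EL0EN);
    }
    #else /* !CONFIG_ARM64_SVE */
    /* Any call must be optimized away when CONFIG_ARM64_SVE is not selected. */
    static inline void sve_user_disable(void) { BUILD_BUG(); }
    static inline void sve_user_enable(void) { BUILD_BUG(); }
    #endif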
-
Mark Rutland authored
Now that we have sysreg_clear_set(), we can use this instead of change_cpacr(). Note that the order of the set and clear arguments differs between change_cpacr() and sysreg_clear_set(), so these are flipped as part of the conversion. Also, sve_user_enable() redundantly clears CPACR_EL1_ZEN_EL0EN before setting it; this is removed for clarity. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Dave Martin <dave.martin@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
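For example, the conversion has the following shape (a sketch; note the flipped clear/set argument order and the dropped redundant clear on the enable path):

    /* before: change_cpacr(val, mask) clears 'mask', then ORs in 'val' */
    change_cpacr(CPACR_EL1_ZEN_EL0EN, CPACR_EL1_ZEN_EL0EN);  /* enable  */
    change_cpacr(0, CPACR_EL1_ZEN_EL0EN);                    /* disable */

    /* after: sysreg_clear_set(reg, clear, set) */
    sysreg_clear_set(cpacr_el1, 0, CPACR_EL1_ZEN_EL0EN);     /* enable  */
    sysreg_clear_set(cpacr_el1, CPACR_EL1_ZEN_EL0EN, 0);     /* disable */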
-
Mark Rutland authored
Now that we have sysreg_clear_set(), we can consistently use this instead of config_sctlr_el1(). Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Dave Martin <dave.martin@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Mark Rutland authored
Currently we assert that the SCTLR_EL{1,2}_{SET,CLEAR} bits are self-consistent with an assertion in config_sctlr_el1(). This is a bit unusual, since config_sctlr_el1() doesn't make use of these definitions, and is far away from the definitions themselves. We can use the CPP #error directive to have equivalent assertions in <asm/sysreg.h>, next to the definitions of the set/clear bits, which is a bit clearer and simpler. At the same time, let's fill in the upper 32 bits for both registers in their respective RES0 definitions. This could be a little nicer with GENMASK_ULL(63, 32), but this currently lives in <linux/bitops.h>, which cannot safely be included from assembly, as <asm/sysreg.h> can. Note that when the preprocessor evaluates an expression for an #if directive, all signed or unsigned values are treated as intmax_t or uintmax_t respectively. To avoid ambiguity, we explicitly define the mask of all 64 bits. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Dave Martin <dave.martin@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
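A sketch of what such an assertion looks like next to the bit definitions, with the all-ones mask spelled out in full for the reason given above:

    /* Every bit of SCTLR_EL1 must be covered by exactly one of SET/CLEAR. */
    #if (SCTLR_EL1_SET ^ SCTLR_EL1_CLEAR) != 0xffffffffffffffff
    #error "Inconsistent SCTLR_EL1 set/clear bits"
    #endif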
-
Mark Rutland authored
In do_notify_resume, we manipulate thread_flags as a 32-bit unsigned int, whereas thread_info::flags is a 64-bit unsigned long, and elsewhere (e.g. in the entry assembly) we manipulate the flags as a 64-bit quantity. For consistency, and to avoid problems if we end up with more than 32 flags, let's make do_notify_resume take the flags as a 64-bit unsigned long. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Dave Martin <dave.martin@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Will Deacon authored
This reverts commit 7e7df71f. When unwinding out of the IRQ stack and onto the interrupted EL1 stack, we cannot rely on the frame pointer being strictly increasing, as this could terminate the backtrace early depending on how the stacks have been allocated. Signed-off-by: Will Deacon <will.deacon@arm.com>
-
- 11 Jul, 2018 2 commits
-
-
Will Deacon authored
The new rseq call arrived in 4.18-rc1, so provide it in the asm-generic unistd.h for architectures such as arm64. Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
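The asm-generic hunk has roughly this shape (a sketch; 293 was the next free slot in asm-generic's numbering at the time):

    #define __NR_rseq 293
    __SYSCALL(__NR_rseq, sys_rseq)

    #undef __NR_syscalls
    #define __NR_syscalls 294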
-
Will Deacon authored
Implement calls to rseq_signal_deliver, rseq_handle_notify_resume and rseq_syscall so that we can select HAVE_RSEQ on arm64. Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
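In outline, the hooks land in the resume path roughly as sketched below (rseq_syscall() additionally goes on the syscall path as a CONFIG_DEBUG_RSEQ check):

    /* Sketch: rseq hooks placed alongside the existing resume-time work. */
    asmlinkage void do_notify_resume(struct pt_regs *regs,
                                     unsigned long thread_flags)
    {
            if (thread_flags & _TIF_SIGPENDING)
                    do_signal(regs);        /* delivery calls rseq_signal_deliver() */

            if (thread_flags & _TIF_NOTIFY_RESUME) {
                    tracehook_notify_resume(regs);
                    rseq_handle_notify_resume(NULL, regs);
            }
    }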
-
- 10 Jul, 2018 1 commit
-
-
Arnd Bergmann authored
Building without NUMA but with FLATMEM results in a link error because mem_map[] is not available: aarch64-linux-ld -EB -maarch64elfb --no-undefined -X -pie -shared -Bsymbolic --no-apply-dynamic-relocs --build-id -o .tmp_vmlinux1 -T ./arch/arm64/kernel/vmlinux.lds --whole-archive built-in.a --no-whole-archive --start-group arch/arm64/lib/lib.a lib/lib.a --end-group init/do_mounts.o: In function `mount_block_root': do_mounts.c:(.init.text+0x1e8): undefined reference to `mem_map' arch/arm64/kernel/vdso.o: In function `vdso_init': vdso.c:(.init.text+0xb4): undefined reference to `mem_map' This uses the same trick as the other architectures, making flatmem depend on !NUMA to avoid the broken configuration. Fixes: e7d4bac4 ("arm64: add ARM64-specific support for flatmem") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Will Deacon <will.deacon@arm.com>
-
- 09 Jul, 2018 3 commits
-
-
Lorenzo Pieralisi authored
Current ACPI ARM64 NUMA initialization code in acpi_numa_gicc_affinity_init() carries out NUMA nodes creation and cpu<->node mappings at the same time in the arch backend so that a single SRAT walk is needed to parse both pieces of information. This implies that the cpu<->node mappings must be stashed in an array (sized NR_CPUS) so that SMP code can later use the stashed values to avoid another SRAT table walk to set-up the early cpu<->node mappings. If the kernel is configured with a NR_CPUS value less than the actual processor entries in the SRAT (and MADT), the logic in acpi_numa_gicc_affinity_init() is broken in that the cpu<->node mapping is carried out (and stashed for future use) only for a number of SRAT entries up to NR_CPUS, which do not necessarily correspond to the possible cpus detected at SMP initialization in acpi_map_gic_cpu_interface() (ie MADT and SRAT processor entries order is not enforced), which leaves the kernel with broken cpu<->node mappings. Furthermore, given the current ACPI NUMA code parsing logic in acpi_numa_gicc_affinity_init(), PXM domains for CPUs that are not parsed because they exceed NR_CPUS entries are not mapped to NUMA nodes (ie the PXM corresponding node is not created in the kernel) leaving the system with a broken NUMA topology. Rework the ACPI ARM64 NUMA initialization process so that the NUMA nodes creation and cpu<->node mappings are decoupled. cpu<->node mappings are moved to SMP initialization code (where they are needed), at the cost of an extra SRAT walk so that ACPI NUMA mappings can be batched before being applied, fixing current parsing pitfalls. Acked-by: Hanjun Guo <hanjun.guo@linaro.org> Tested-by: John Garry <john.garry@huawei.com> Fixes: d8b47fca ("arm64, ACPI, NUMA: NUMA support based on SRAT and SLIT") Link: http://lkml.kernel.org/r/1527768879-88161-2-git-send-email-xiexiuqi@huawei.com Reported-by: Xie XiuQi <xiexiuqi@huawei.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Punit Agrawal <punit.agrawal@arm.com> Cc: Jonathan Cameron <jonathan.cameron@huawei.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Hanjun Guo <guohanjun@huawei.com> Cc: Ganapatrao Kulkarni <gkulkarni@caviumnetworks.com> Cc: Jeremy Linton <jeremy.linton@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Xie XiuQi <xiexiuqi@huawei.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Nikunj Kela authored
Flatmem is useful in reducing kernel memory usage. One use case is the kdump kernel, where we are able to save ~14M by moving to the flatmem scheme. Cc: xe-kernel@external.cisco.com Cc: Nikunj Kela <nkela@cisco.com> Signed-off-by: Nikunj Kela <nkela@cisco.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Will Deacon authored
The arm-soc tree does a good job handling .dts files, so exclude them from the ARM64 entry in MAINTAINERS. Cc: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Olof Johansson <olof@lixom.net> Signed-off-by: Will Deacon <will.deacon@arm.com>
-
- 06 Jul, 2018 12 commits
-
-
Will Deacon authored
lkdtm calls flush_icache_range(), which results in an out-of-line call to __flush_icache_range(), which is not exported to modules. Export the symbol to modules to fix this build breakage. Fixes: 3b8c9f1c ("arm64: IPI each CPU after invalidating the I-cache for kernel mappings") Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Sudeep Holla authored
Commit 37c3ec2d ("arm64: topology: divorce MC scheduling domain from core_siblings") selected the smallest of LLC, socket siblings, and NUMA node siblings to ensure that the sched domain we build for the MC layer isn't larger than the DIE above it or it's shrunk to the socket or NUMA node if LLC exist acrosis NUMA node/chiplets. Commit acd32e52e4e0 ("arm64: topology: Avoid checking numa mask for scheduler MC selection") reverted the NUMA siblings checks since the CPU topology masks weren't updated on hotplug at that time. This patch re-introduces numa mask check as the CPU and NUMA topology is now updated in hotplug paths. Effectively, this patch does the partial revert of commit acd32e52e4e0. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Tested-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Tested-by: Hanjun Guo <hanjun.guo@linaro.org> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Sudeep Holla authored
Similar to core_sibling and thread_sibling, it's better to align and rename llc_siblings to llc_sibling. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Tested-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Tested-by: Hanjun Guo <hanjun.guo@linaro.org> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Sudeep Holla authored
We already repopulate the information on CPU hotplug-in, so we can safely remove the CPU topology and NUMA cpumap information during CPU hotplug out operation. This will help to provide the correct cpumask for scheduler domains. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Tested-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Tested-by: Hanjun Guo <hanjun.guo@linaro.org> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Sudeep Holla authored
It's incorrect to iterate over all the possible CPUs to update the sibling masks when any CPU is hotplugged in. If the topology sibling masks of a CPU are removed when it is hotplugged out, we end up updating those masks when one of its siblings is powered up again, which provides an inconsistent view. Further, since the CPU calling update_sibling_masks is yet to be set online, there's no need to compare itself with each online CPU when updating the sibling masks. This patch restricts updating of sibling masks to CPUs that are already online. It also drops the unnecessary cpuid check. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Tested-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Tested-by: Hanjun Guo <hanjun.guo@linaro.org> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Sudeep Holla authored
This patch adds support to remove all the CPU topology information using clear_cpu_topology and also resetting the sibling information on other sibling CPUs. This will be used in cpu_disable so that all the topology sibling information is removed on CPU hotplug out. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Tested-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Tested-by: Hanjun Guo <hanjun.guo@linaro.org> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Sudeep Holla authored
Currently numa_clear_node removes both cpu information from the NUMA node cpumap as well as the NUMA node id from the cpu. Similarly numa_store_cpu_info updates both percpu nodeid and NUMA cpumap. However we need to retain the numa node id for the cpu and only remove the cpu information from the numa node cpumap during CPU hotplug out. The same can be extended for hotplugging in the CPU. This patch separates out numa_{add,remove}_cpu from numa_clear_node and numa_store_cpu_info. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Reviewed-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Tested-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Tested-by: Hanjun Guo <hanjun.guo@linaro.org> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
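A simplified sketch of the split-out helpers; only the node cpumask is touched, while the percpu node id set up by numa_store_cpu_info() is retained:

    void numa_add_cpu(unsigned int cpu)
    {
            int nid = cpu_to_node(cpu);

            cpumask_set_cpu(cpu, node_to_cpumask_map[nid]);
    }

    void numa_remove_cpu(unsigned int cpu)
    {
            int nid = cpu_to_node(cpu);

            cpumask_clear_cpu(cpu, node_to_cpumask_map[nid]);
    }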
-
Sudeep Holla authored
Currently reset_cpu_topology clears all the CPU topology information and resets to default values. However we may need to just clear the information when we hotplug out the CPU. In preparation to add the support the same, let's refactor reset_cpu_topology to just reset the information and move clearing out the topology information to clear_cpu_topology. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Tested-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Tested-by: Hanjun Guo <hanjun.guo@linaro.org> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Will Deacon authored
The ERRATA_MIDR_REV_RANGE macro assigns ARM64_CPUCAP_LOCAL_CPU_ERRATUM to the '.type' field of the 'struct arm64_cpu_capabilities', so there's no need to assign it explicitly as well. Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Chintan Pandya authored
arm64 requires break-before-make. Originally, before setting up a new pmd/pud entry for a huge mapping, in a few cases the pmd/pud entry being modified was still valid and pointing to a next-level page table, as we only clear off leaf PTEs in the unmap leg. a) This resulted in stale entries in TLBs (as a few TLBs also cache intermediate mappings for performance reasons). b) Also, the modified pmd/pud was the only reference to the next-level page table, which was getting lost without being freed, so page tables were leaking. Implement pud_free_pmd_page() and pmd_free_pte_page() to enforce BBM and also free the leaking page tables. The implementation requires: 1) clearing the current pud/pmd entry, 2) invalidating the TLB, and 3) freeing the unused next-level page tables. Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Chintan Pandya <cpandya@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
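A sketch of the pmd-level helper following the three steps above (the pud-level variant is analogous):

    int pmd_free_pte_page(pmd_t *pmdp, unsigned long addr)
    {
            pte_t *table;
            pmd_t pmd = READ_ONCE(*pmdp);

            if (!pmd_table(pmd))
                    return 1;                       /* nothing to free */

            table = pte_offset_kernel(pmdp, addr);
            pmd_clear(pmdp);                        /* 1) clear the entry (BBM)  */
            __flush_tlb_kernel_pgtable(addr);       /* 2) invalidate the TLB     */
            pte_free_kernel(NULL, table);           /* 3) free the old pte table */

            return 1;
    }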
-
Chintan Pandya authored
Add an interface to invalidate intermediate page tables from TLB for kernel. Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Chintan Pandya <cpandya@codeaurora.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
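Roughly, the new helper is shaped like the sketch below (the exact TLBI operation and barriers follow arm64's tlbflush conventions and should be taken as illustrative):

    static inline void __flush_tlb_kernel_pgtable(unsigned long kaddr)
    {
            unsigned long addr = __TLBI_VADDR(kaddr, 0);

            dsb(ishst);
            __tlbi(vaae1is, addr);
            dsb(ish);
    }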
-
Will Deacon authored
Merge branch 'x86/mm' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into aarch64/for-next/core Pull in core ioremap changes from -tip, since we depend on these for re-enabling huge I/O mappings on arm64. Signed-off-by: Will Deacon <will.deacon@arm.com>
-
- 05 Jul, 2018 5 commits
-
-
Will Deacon authored
Patching kernel instructions at runtime requires other CPUs to undergo a context synchronisation event via an explicit ISB or an IPI in order to ensure that the new instructions are visible. This is required even for "hotpatch" instructions such as NOP and BL, so avoid optimising in this case and always go via stop_machine() when performing general patching. ftrace isn't quite as strict, so it can continue to call the nosync code directly. Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Will Deacon authored
When invalidating the instruction cache for a kernel mapping via flush_icache_range(), it is also necessary to flush the pipeline for other CPUs so that instructions fetched into the pipeline before the I-cache invalidation are discarded. For example, if module 'foo' is unloaded and then module 'bar' is loaded into the same area of memory, a CPU could end up executing instructions from 'foo' when branching into 'bar' if these instructions were fetched into the pipeline before 'foo' was unloaded. Whilst this is highly unlikely to occur in practice, particularly as any exception acts as a context-synchronizing operation, following the letter of the architecture requires us to execute an ISB on each CPU in order for the new instruction stream to be visible. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
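In outline, the kernel-mapping flush path then looks like this sketch:

    void flush_icache_range(unsigned long start, unsigned long end)
    {
            __flush_icache_range(start, end);

            /* IPI all online CPUs; the exception return from the IPI is the
             * context synchronization event that discards stale fetches. */
            kick_all_cpus_sync();
    }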
-
Mark Rutland authored
Now that users have been migrated to PSR_AA32, kill the unused COMPAT_PSR definitions. The only difference we need a definition for is COMPAT_PSR_DIT_BIT, which differs from PSR_AA32_DIT_BIT. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Mark Rutland authored
Some code cares about the SPSR_ELx format for exceptions taken from AArch32 to inspect or manipulate the SPSR_ELx value, which is already in the SPSR_ELx format, and not in the AArch32 PSR format. To separate these from cases where we care about the AArch32 PSR format, migrate these cases to use the PSR_AA32_* definitions rather than COMPAT_PSR_*. There should be no functional change as a result of this patch. Note that arm64 KVM does not support a compat KVM API, and always uses the SPSR_ELx format, even for AArch32 guests. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Christoffer Dall <christoffer.dall@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
-
Mark Rutland authored
Some code cares about the SPSR_ELx format for exceptions taken from AArch32 to inspect or manipulate the SPSR_ELx value, which is already in the SPSR_ELx format, and not in the AArch32 PSR format. To separate these from cases where we care about the AArch32 PSR format, migrate these cases to use the PSR_AA32_* definitions rather than COMPAT_PSR_*. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
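For instance, one such site changes along these lines (an illustrative before/after):

    /* before: AArch32 PSR-format name used on an SPSR_ELx value */
    regs->pstate &= ~COMPAT_PSR_E_BIT;

    /* after: the SPSR_ELx-format alias makes the intent explicit */
    regs->pstate &= ~PSR_AA32_E_BIT;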
-