Commit e0152e74 authored by Linus Torvalds

Merge tag 'riscv-for-linus-6.6-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Palmer Dabbelt:

 - Support for the new "riscv,isa-extensions" and "riscv,isa-base"
   device tree interfaces for probing extensions

 - Support for userspace access to the performance counters

 - Support for more instructions in kprobes

 - Crash kernels can be allocated above 4GiB

 - Support for KCFI

 - Support for ELFs in !MMU configurations

 - ARCH_KMALLOC_MINALIGN has been reduced to 8

 - mmap() defaults to sv48-sized addresses, with longer addresses hidden
   behind a hint (similar to Arm and Intel)

 - Also various fixes and cleanups

* tag 'riscv-for-linus-6.6-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (51 commits)
  lib/Kconfig.debug: Restrict DEBUG_INFO_SPLIT for RISC-V
  riscv: support PREEMPT_DYNAMIC with static keys
  riscv: Move create_tmp_mapping() to init sections
  riscv: Mark KASAN tmp* page tables variables as static
  riscv: mm: use bitmap_zero() API
  riscv: enable DEBUG_FORCE_FUNCTION_ALIGN_64B
  riscv: remove redundant mv instructions
  RISC-V: mm: Document mmap changes
  RISC-V: mm: Update pgtable comment documentation
  RISC-V: mm: Add tests for RISC-V mm
  RISC-V: mm: Restrict address space for sv39,sv48,sv57
  riscv: enable DMA_BOUNCE_UNALIGNED_KMALLOC for !dma_coherent
  riscv: allow kmalloc() caches aligned to the smallest value
  riscv: support the elf-fdpic binfmt loader
  binfmt_elf_fdpic: support 64-bit systems
  riscv: Allow CONFIG_CFI_CLANG to be selected
  riscv/purgatory: Disable CFI
  riscv: Add CFI error handling
  riscv: Add ftrace_stub_graph
  riscv: Add types to indirectly called assembly functions
  ...
parents a031eba2 89775a27
......@@ -873,7 +873,7 @@
memory region [offset, offset + size] for that kernel
image. If '@offset' is omitted, then a suitable offset
is selected automatically.
[KNL, X86-64, ARM64] Select a region under 4G first, and
[KNL, X86-64, ARM64, RISCV] Select a region under 4G first, and
fall back to reserve region above 4G when '@offset'
hasn't been specified.
See Documentation/admin-guide/kdump/kdump.rst for further details.
......@@ -886,14 +886,14 @@
Documentation/admin-guide/kdump/kdump.rst for an example.
crashkernel=size[KMG],high
[KNL, X86-64, ARM64] range could be above 4G. Allow kernel
to allocate physical memory region from top, so could
be above 4G if system have more than 4G ram installed.
Otherwise memory region will be allocated below 4G, if
available.
[KNL, X86-64, ARM64, RISCV] range could be above 4G.
Allow kernel to allocate physical memory region from top,
so could be above 4G if system have more than 4G ram
installed. Otherwise memory region will be allocated
below 4G, if available.
It will be ignored if crashkernel=X is specified.
crashkernel=size[KMG],low
[KNL, X86-64, ARM64] range under 4G. When crashkernel=X,high
[KNL, X86-64, ARM64, RISCV] range under 4G. When crashkernel=X,high
is passed, kernel could allocate physical memory region
above 4G, that cause second kernel crash on system
that require some amount of low memory, e.g. swiotlb
......@@ -904,6 +904,7 @@
size is platform dependent.
--> x86: max(swiotlb_size_or_default() + 8MiB, 256MiB)
--> arm64: 128MiB
--> riscv: 128MiB
This one lets the user specify own low range under 4G
for second kernel instead.
0: to disable low allocation.
......@@ -5554,6 +5555,13 @@
[KNL] Disable ring 3 MONITOR/MWAIT feature on supported
CPUs.
riscv_isa_fallback [RISCV]
When CONFIG_RISCV_ISA_FALLBACK is not enabled, permit
falling back to detecting extension support by parsing
"riscv,isa" property on devicetree systems when the
replacement properties are not found. See the Kconfig
entry for RISCV_ISA_FALLBACK.
ro [KNL] Mount root device read-only on boot
rodata= [KNL]
......
......@@ -941,16 +941,35 @@ enabled, otherwise writing to this file will return ``-EBUSY``.
The default value is 8.
perf_user_access (arm64 only)
=================================
perf_user_access (arm64 and riscv only)
=======================================
Controls user space access for reading perf event counters.
Controls user space access for reading perf event counters. When set to 1,
user space can read performance monitor counter registers directly.
arm64
=====
The default value is 0 (access disabled).
When set to 1, user space can read performance monitor counter registers
directly.
See Documentation/arch/arm64/perf.rst for more information.
riscv
=====
When set to 0, user space access is disabled.
The default value is 1: user space can read performance monitor counter
registers through perf, and any direct access without perf intervention will
trigger an illegal instruction.
When set to 2, legacy mode is enabled (user space has direct access to the
cycle and instret CSRs only). Note that this legacy value is deprecated and
will be removed once all user space applications are fixed.
Note that the time CSR is always directly accessible to all modes.
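As an illustration only (this sketch is not part of the patch), a program can
check whether it has been granted direct access by opening a hardware event,
mmapping the perf control page and inspecting ``cap_user_rdpmc``; the seqlock
loop and the actual read of the counter selected by ``index`` are left out::

  #include <stdio.h>
  #include <string.h>
  #include <unistd.h>
  #include <sys/mman.h>
  #include <sys/syscall.h>
  #include <sys/types.h>
  #include <linux/perf_event.h>

  /* perf_event_open() has no glibc wrapper. */
  static int perf_event_open(struct perf_event_attr *attr, pid_t pid,
                             int cpu, int group_fd, unsigned long flags)
  {
          return (int)syscall(SYS_perf_event_open, attr, pid, cpu, group_fd, flags);
  }

  int main(void)
  {
          struct perf_event_attr attr;
          struct perf_event_mmap_page *pc;
          int fd;

          memset(&attr, 0, sizeof(attr));
          attr.type = PERF_TYPE_HARDWARE;
          attr.size = sizeof(attr);
          attr.config = PERF_COUNT_HW_CPU_CYCLES;
          attr.exclude_kernel = 1;

          fd = perf_event_open(&attr, 0, -1, -1, 0);      /* this task, any CPU */
          if (fd < 0)
                  return 1;

          pc = mmap(NULL, sysconf(_SC_PAGESIZE), PROT_READ, MAP_SHARED, fd, 0);
          if (pc == MAP_FAILED)
                  return 1;

          /*
           * cap_user_rdpmc is only set when user access has been granted for
           * this event; pc->index then selects the counter to read directly.
           */
          printf("direct user read: %u, counter index: %u\n",
                 (unsigned int)pc->cap_user_rdpmc, (unsigned int)pc->index);

          munmap(pc, sysconf(_SC_PAGESIZE));
          close(fd);
          return 0;
  }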
pid_max
=======
......
......@@ -133,3 +133,25 @@ RISC-V Linux Kernel SV57
ffffffff00000000 | -4 GB | ffffffff7fffffff | 2 GB | modules, BPF
ffffffff80000000 | -2 GB | ffffffffffffffff | 2 GB | kernel
__________________|____________|__________________|_________|____________________________________________________________
Userspace VAs
--------------------
To maintain compatibility with software that relies on the VA space with a
maximum of 48 bits, the kernel will, by default, return virtual addresses to
userspace from a 48-bit range (sv48). This default behavior is achieved by
passing 0 into the hint address parameter of mmap. On CPUs with an address space
smaller than sv48, the CPU maximum supported address space will be the default.
Software can "opt-in" to receiving VAs from another VA space by providing
a hint address to mmap. A hint address passed to mmap will cause the largest
address space that fits entirely into the hint to be used, unless there is no
space left in the address space. If there is no space available in the requested
address space, an address in the next smallest available address space will be
returned.
For example, in order to obtain 48-bit VA space, a hint address greater than
:code:`1 << 47` must be provided. Note that this is 47 due to sv48 userspace
ending at :code:`1 << 47` and the addresses beyond this are reserved for the
kernel. Similarly, to obtain 57-bit VA space addresses, a hint address greater
than or equal to :code:`1 << 56` must be provided.
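A minimal sketch of the opt-in described above (illustrative only, not part of
this patch; the addresses actually returned also depend on the CPU's supported
address space and on address space randomization)::

  #define _GNU_SOURCE
  #include <stdio.h>
  #include <sys/mman.h>

  int main(void)
  {
          /* A hint of 0 (NULL) keeps the mapping within the default sv48 range. */
          void *def = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                           MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

          /* A hint at or above 1 << 56 opts in to sv57 addresses, if supported. */
          void *hi = mmap((void *)(1UL << 56), 4096, PROT_READ | PROT_WRITE,
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

          printf("default hint: %p\nsv57 hint:    %p\n", def, hi);
          return 0;
  }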
......@@ -35,6 +35,7 @@ config RISCV
select ARCH_HAS_SET_MEMORY if MMU
select ARCH_HAS_STRICT_KERNEL_RWX if MMU && !XIP_KERNEL
select ARCH_HAS_STRICT_MODULE_RWX if MMU && !XIP_KERNEL
select ARCH_HAS_SYSCALL_WRAPPER
select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
select ARCH_HAS_UBSAN_SANITIZE_ALL
select ARCH_HAS_VDSO_DATA
......@@ -42,12 +43,14 @@ config RISCV
select ARCH_OPTIONAL_KERNEL_RWX_DEFAULT
select ARCH_STACKWALK
select ARCH_SUPPORTS_ATOMIC_RMW
select ARCH_SUPPORTS_CFI_CLANG
select ARCH_SUPPORTS_DEBUG_PAGEALLOC if MMU
select ARCH_SUPPORTS_HUGETLBFS if MMU
select ARCH_SUPPORTS_PAGE_TABLE_CHECK if MMU
select ARCH_SUPPORTS_PER_VMA_LOCK if MMU
select ARCH_USE_MEMTEST
select ARCH_USE_QUEUED_RWLOCKS
select ARCH_USES_CFI_TRAPS if CFI_CLANG
select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT if MMU
select ARCH_WANT_FRAME_POINTERS
select ARCH_WANT_GENERAL_HUGETLB if !RISCV_ISA_SVNAPOT
......@@ -62,6 +65,7 @@ config RISCV
select COMMON_CLK
select CPU_PM if CPU_IDLE || HIBERNATION
select EDAC_SUPPORT
select FRAME_POINTER if PERF_EVENTS || (FUNCTION_TRACER && !DYNAMIC_FTRACE)
select GENERIC_ARCH_TOPOLOGY
select GENERIC_ATOMIC64 if !64BIT
select GENERIC_CLOCKEVENTS_BROADCAST if SMP
......@@ -130,6 +134,7 @@ config RISCV
select HAVE_PERF_REGS
select HAVE_PERF_USER_STACK_DUMP
select HAVE_POSIX_CPU_TIMERS_TASK_WORK
select HAVE_PREEMPT_DYNAMIC_KEY if !XIP_KERNEL
select HAVE_REGS_AND_STACK_ACCESS_API
select HAVE_RETHOOK if !XIP_KERNEL
select HAVE_RSEQ
......@@ -267,6 +272,7 @@ config RISCV_DMA_NONCOHERENT
select ARCH_HAS_SETUP_DMA_OPS
select ARCH_HAS_SYNC_DMA_FOR_CPU
select ARCH_HAS_SYNC_DMA_FOR_DEVICE
select DMA_BOUNCE_UNALIGNED_KMALLOC if SWIOTLB
select DMA_DIRECT_REMAP
config AS_HAS_INSN
......@@ -836,6 +842,24 @@ config XIP_PHYS_ADDR
be linked for and stored to. This address is dependent on your
own flash usage.
config RISCV_ISA_FALLBACK
bool "Permit falling back to parsing riscv,isa for extension support by default"
default y
help
Parsing the "riscv,isa" devicetree property has been deprecated and
replaced by a list of explicitly defined strings. For compatibility
with existing platforms, the kernel will fall back to parsing the
"riscv,isa" property if the replacements are not found.
Selecting N here will result in a kernel that does not use the
fallback, unless the commandline "riscv_isa_fallback" parameter is
present.
Please see the dt-binding, located at
Documentation/devicetree/bindings/riscv/extensions.yaml for details
on the replacement properties, "riscv,isa-base" and
"riscv,isa-extensions".
endmenu # "Boot options"
config BUILTIN_DTB
......
......@@ -87,9 +87,6 @@ endif
ifeq ($(CONFIG_CMODEL_MEDANY),y)
KBUILD_CFLAGS += -mcmodel=medany
endif
ifeq ($(CONFIG_PERF_EVENTS),y)
KBUILD_CFLAGS += -fno-omit-frame-pointer
endif
# Avoid generating .eh_frame sections.
KBUILD_CFLAGS += -fno-asynchronous-unwind-tables -fno-unwind-tables
......
......@@ -146,7 +146,7 @@
* vendor_id: The CPU vendor ID.
* patch_id: The patch ID (erratum ID or cpufeature ID).
* CONFIG_k: The Kconfig of this patch ID. When Kconfig is disabled, the old
* content will alwyas be executed.
* content will always be executed.
*/
#define ALTERNATIVE(old_content, new_content, vendor_id, patch_id, CONFIG_k) \
_ALTERNATIVE_CFG(old_content, new_content, vendor_id, patch_id, CONFIG_k)
......
......@@ -13,6 +13,7 @@
#ifdef CONFIG_RISCV_DMA_NONCOHERENT
#define ARCH_DMA_MINALIGN L1_CACHE_BYTES
#define ARCH_KMALLOC_MINALIGN (8)
#endif
/*
......@@ -23,4 +24,17 @@
#define ARCH_SLAB_MINALIGN 16
#endif
#ifndef __ASSEMBLY__
#ifdef CONFIG_RISCV_DMA_NONCOHERENT
extern int dma_cache_alignment;
#define dma_get_cache_alignment dma_get_cache_alignment
static inline int dma_get_cache_alignment(void)
{
return dma_cache_alignment;
}
#endif
#endif /* __ASSEMBLY__ */
#endif /* _ASM_RISCV_CACHE_H */
......@@ -58,8 +58,10 @@ void riscv_init_cbo_blocksizes(void);
#ifdef CONFIG_RISCV_DMA_NONCOHERENT
void riscv_noncoherent_supported(void);
void __init riscv_set_dma_cache_alignment(void);
#else
static inline void riscv_noncoherent_supported(void) {}
static inline void riscv_set_dma_cache_alignment(void) {}
#endif
/*
......
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef _ASM_RISCV_CFI_H
#define _ASM_RISCV_CFI_H
/*
* Clang Control Flow Integrity (CFI) support.
*
* Copyright (C) 2023 Google LLC
*/
#include <linux/cfi.h>
#ifdef CONFIG_CFI_CLANG
enum bug_trap_type handle_cfi_failure(struct pt_regs *regs);
#else
static inline enum bug_trap_type handle_cfi_failure(struct pt_regs *regs)
{
return BUG_TRAP_TYPE_NONE;
}
#endif /* CONFIG_CFI_CLANG */
#endif /* _ASM_RISCV_CFI_H */
......@@ -41,6 +41,7 @@ extern bool compat_elf_check_arch(Elf32_Ehdr *hdr);
#define compat_elf_check_arch compat_elf_check_arch
#define CORE_DUMP_USE_REGSET
#define ELF_FDPIC_CORE_EFLAGS 0
#define ELF_EXEC_PAGESIZE (PAGE_SIZE)
/*
......@@ -49,7 +50,7 @@ extern bool compat_elf_check_arch(Elf32_Ehdr *hdr);
* the loader. We need to make sure that it is out of the way of the program
* that it will "exec", and that there is sufficient room for the brk.
*/
#define ELF_ET_DYN_BASE ((TASK_SIZE / 3) * 2)
#define ELF_ET_DYN_BASE ((DEFAULT_MAP_WINDOW / 3) * 2)
#ifdef CONFIG_64BIT
#ifdef CONFIG_COMPAT
......@@ -69,6 +70,13 @@ extern bool compat_elf_check_arch(Elf32_Ehdr *hdr);
#define ELF_HWCAP riscv_get_elf_hwcap()
extern unsigned long elf_hwcap;
#define ELF_FDPIC_PLAT_INIT(_r, _exec_map_addr, _interp_map_addr, dynamic_addr) \
do { \
(_r)->a1 = _exec_map_addr; \
(_r)->a2 = _interp_map_addr; \
(_r)->a3 = dynamic_addr; \
} while (0)
/*
* This yields a string that ld.so will use to load implementation
* specific libraries for optimization. This is more specific in
......@@ -78,7 +86,6 @@ extern unsigned long elf_hwcap;
#define COMPAT_ELF_PLATFORM (NULL)
#ifdef CONFIG_MMU
#define ARCH_DLINFO \
do { \
/* \
......@@ -115,6 +122,8 @@ do { \
else \
NEW_AUX_ENT(AT_IGNORE, 0); \
} while (0)
#ifdef CONFIG_MMU
#define ARCH_HAS_SETUP_ADDITIONAL_PAGES
struct linux_binprm;
extern int arch_setup_additional_pages(struct linux_binprm *bprm,
......
......@@ -14,12 +14,17 @@
#include <uapi/asm/hwcap.h>
#define RISCV_ISA_EXT_a ('a' - 'a')
#define RISCV_ISA_EXT_b ('b' - 'a')
#define RISCV_ISA_EXT_c ('c' - 'a')
#define RISCV_ISA_EXT_d ('d' - 'a')
#define RISCV_ISA_EXT_f ('f' - 'a')
#define RISCV_ISA_EXT_h ('h' - 'a')
#define RISCV_ISA_EXT_i ('i' - 'a')
#define RISCV_ISA_EXT_j ('j' - 'a')
#define RISCV_ISA_EXT_k ('k' - 'a')
#define RISCV_ISA_EXT_m ('m' - 'a')
#define RISCV_ISA_EXT_p ('p' - 'a')
#define RISCV_ISA_EXT_q ('q' - 'a')
#define RISCV_ISA_EXT_s ('s' - 'a')
#define RISCV_ISA_EXT_u ('u' - 'a')
#define RISCV_ISA_EXT_v ('v' - 'a')
......@@ -55,7 +60,6 @@
#define RISCV_ISA_EXT_ZIHPM 42
#define RISCV_ISA_EXT_MAX 64
#define RISCV_ISA_EXT_NAME_LEN_MAX 32
#ifdef CONFIG_RISCV_M_MODE
#define RISCV_ISA_EXT_SxAIA RISCV_ISA_EXT_SMAIA
......@@ -70,12 +74,15 @@
unsigned long riscv_get_elf_hwcap(void);
struct riscv_isa_ext_data {
/* Name of the extension displayed to userspace via /proc/cpuinfo */
char uprop[RISCV_ISA_EXT_NAME_LEN_MAX];
/* The logical ISA extension ID */
unsigned int isa_ext_id;
const unsigned int id;
const char *name;
const char *property;
};
extern const struct riscv_isa_ext_data riscv_isa_ext[];
extern const size_t riscv_isa_ext_count;
extern bool riscv_isa_fallback;
unsigned long riscv_isa_extension_base(const unsigned long *isa_bitmap);
#define riscv_isa_extension_mask(ext) BIT_MASK(RISCV_ISA_EXT_##ext)
......
......@@ -63,6 +63,7 @@
#define RVG_RS1_OPOFF 15
#define RVG_RS2_OPOFF 20
#define RVG_RD_OPOFF 7
#define RVG_RS1_MASK GENMASK(4, 0)
#define RVG_RD_MASK GENMASK(4, 0)
/* The bit field of immediate value in RVC J instruction */
......@@ -130,6 +131,7 @@
#define RVC_C2_RS1_OPOFF 7
#define RVC_C2_RS2_OPOFF 2
#define RVC_C2_RD_OPOFF 7
#define RVC_C2_RS1_MASK GENMASK(4, 0)
/* parts of opcode for RVG*/
#define RVG_OPCODE_FENCE 0x0f
......@@ -289,6 +291,10 @@ static __always_inline bool riscv_insn_is_c_jalr(u32 code)
#define RV_X(X, s, mask) (((X) >> (s)) & (mask))
#define RVC_X(X, s, mask) RV_X(X, s, mask)
#define RV_EXTRACT_RS1_REG(x) \
({typeof(x) x_ = (x); \
(RV_X(x_, RVG_RS1_OPOFF, RVG_RS1_MASK)); })
#define RV_EXTRACT_RD_REG(x) \
({typeof(x) x_ = (x); \
(RV_X(x_, RVG_RD_OPOFF, RVG_RD_MASK)); })
......@@ -316,6 +322,10 @@ static __always_inline bool riscv_insn_is_c_jalr(u32 code)
(RV_X(x_, RV_B_IMM_11_OPOFF, RV_B_IMM_11_MASK) << RV_B_IMM_11_OFF) | \
(RV_IMM_SIGN(x_) << RV_B_IMM_SIGN_OFF); })
#define RVC_EXTRACT_C2_RS1_REG(x) \
({typeof(x) x_ = (x); \
(RV_X(x_, RVC_C2_RS1_OPOFF, RVC_C2_RS1_MASK)); })
#define RVC_EXTRACT_JTYPE_IMM(x) \
({typeof(x) x_ = (x); \
(RVC_X(x_, RVC_J_IMM_3_1_OPOFF, RVC_J_IMM_3_1_MASK) << RVC_J_IMM_3_1_OFF) | \
......
......@@ -20,6 +20,10 @@ typedef struct {
/* A local icache flush is needed before user execution can resume. */
cpumask_t icache_stale_mask;
#endif
#ifdef CONFIG_BINFMT_ELF_FDPIC
unsigned long exec_fdpic_loadmap;
unsigned long interp_fdpic_loadmap;
#endif
} mm_context_t;
void __init create_pgd_mapping(pgd_t *pgdp, uintptr_t va, phys_addr_t pa,
......
......@@ -62,11 +62,16 @@
* struct pages to map half the virtual address space. Then
* position vmemmap directly below the VMALLOC region.
*/
#define VA_BITS_SV32 32
#ifdef CONFIG_64BIT
#define VA_BITS_SV39 39
#define VA_BITS_SV48 48
#define VA_BITS_SV57 57
#define VA_BITS (pgtable_l5_enabled ? \
57 : (pgtable_l4_enabled ? 48 : 39))
VA_BITS_SV57 : (pgtable_l4_enabled ? VA_BITS_SV48 : VA_BITS_SV39))
#else
#define VA_BITS 32
#define VA_BITS VA_BITS_SV32
#endif
#define VMEMMAP_SHIFT \
......@@ -111,11 +116,27 @@
#include <asm/page.h>
#include <asm/tlbflush.h>
#include <linux/mm_types.h>
#include <asm/compat.h>
#define __page_val_to_pfn(_val) (((_val) & _PAGE_PFN_MASK) >> _PAGE_PFN_SHIFT)
#ifdef CONFIG_64BIT
#include <asm/pgtable-64.h>
#define VA_USER_SV39 (UL(1) << (VA_BITS_SV39 - 1))
#define VA_USER_SV48 (UL(1) << (VA_BITS_SV48 - 1))
#define VA_USER_SV57 (UL(1) << (VA_BITS_SV57 - 1))
#ifdef CONFIG_COMPAT
#define MMAP_VA_BITS_64 ((VA_BITS >= VA_BITS_SV48) ? VA_BITS_SV48 : VA_BITS)
#define MMAP_MIN_VA_BITS_64 (VA_BITS_SV39)
#define MMAP_VA_BITS (is_compat_task() ? VA_BITS_SV32 : MMAP_VA_BITS_64)
#define MMAP_MIN_VA_BITS (is_compat_task() ? VA_BITS_SV32 : MMAP_MIN_VA_BITS_64)
#else
#define MMAP_VA_BITS ((VA_BITS >= VA_BITS_SV48) ? VA_BITS_SV48 : VA_BITS)
#define MMAP_MIN_VA_BITS (VA_BITS_SV39)
#endif /* CONFIG_COMPAT */
#else
#include <asm/pgtable-32.h>
#endif /* CONFIG_64BIT */
......@@ -843,14 +864,16 @@ static inline pte_t pte_swp_clear_exclusive(pte_t pte)
* Task size is 0x4000000000 for RV64 or 0x9fc00000 for RV32.
* Note that PGDIR_SIZE must evenly divide TASK_SIZE.
* Task size is:
* - 0x9fc00000 (~2.5GB) for RV32.
* - 0x4000000000 ( 256GB) for RV64 using SV39 mmu
* - 0x800000000000 ( 128TB) for RV64 using SV48 mmu
* - 0x9fc00000 (~2.5GB) for RV32.
* - 0x4000000000 ( 256GB) for RV64 using SV39 mmu
* - 0x800000000000 ( 128TB) for RV64 using SV48 mmu
* - 0x100000000000000 ( 64PB) for RV64 using SV57 mmu
*
* Note that PGDIR_SIZE must evenly divide TASK_SIZE since "RISC-V
* Instruction Set Manual Volume II: Privileged Architecture" states that
* "load and store effective addresses, which are 64bits, must have bits
* 63–48 all equal to bit 47, or else a page-fault exception will occur."
* Similarly for SV57, bits 63–57 must be equal to bit 56.
*/
#ifdef CONFIG_64BIT
#define TASK_SIZE_64 (PGDIR_SIZE * PTRS_PER_PGD / 2)
......
......@@ -13,19 +13,59 @@
#include <asm/ptrace.h>
#ifdef CONFIG_64BIT
#define DEFAULT_MAP_WINDOW (UL(1) << (MMAP_VA_BITS - 1))
#define STACK_TOP_MAX TASK_SIZE_64
#define arch_get_mmap_end(addr, len, flags) \
({ \
unsigned long mmap_end; \
typeof(addr) _addr = (addr); \
if ((_addr) == 0 || (IS_ENABLED(CONFIG_COMPAT) && is_compat_task())) \
mmap_end = STACK_TOP_MAX; \
else if ((_addr) >= VA_USER_SV57) \
mmap_end = STACK_TOP_MAX; \
else if ((((_addr) >= VA_USER_SV48)) && (VA_BITS >= VA_BITS_SV48)) \
mmap_end = VA_USER_SV48; \
else \
mmap_end = VA_USER_SV39; \
mmap_end; \
})
#define arch_get_mmap_base(addr, base) \
({ \
unsigned long mmap_base; \
typeof(addr) _addr = (addr); \
typeof(base) _base = (base); \
unsigned long rnd_gap = DEFAULT_MAP_WINDOW - (_base); \
if ((_addr) == 0 || (IS_ENABLED(CONFIG_COMPAT) && is_compat_task())) \
mmap_base = (_base); \
else if (((_addr) >= VA_USER_SV57) && (VA_BITS >= VA_BITS_SV57)) \
mmap_base = VA_USER_SV57 - rnd_gap; \
else if ((((_addr) >= VA_USER_SV48)) && (VA_BITS >= VA_BITS_SV48)) \
mmap_base = VA_USER_SV48 - rnd_gap; \
else \
mmap_base = VA_USER_SV39 - rnd_gap; \
mmap_base; \
})
#else
#define DEFAULT_MAP_WINDOW TASK_SIZE
#define STACK_TOP_MAX TASK_SIZE
#endif
#define STACK_ALIGN 16
#define STACK_TOP DEFAULT_MAP_WINDOW
/*
* This decides where the kernel will search for a free chunk of vm
* space during mmap's.
*/
#define TASK_UNMAPPED_BASE PAGE_ALIGN(TASK_SIZE / 3)
#define STACK_TOP TASK_SIZE
#ifdef CONFIG_64BIT
#define STACK_TOP_MAX TASK_SIZE_64
#define TASK_UNMAPPED_BASE PAGE_ALIGN((UL(1) << MMAP_MIN_VA_BITS) / 3)
#else
#define STACK_TOP_MAX TASK_SIZE
#define TASK_UNMAPPED_BASE PAGE_ALIGN(TASK_SIZE / 3)
#endif
#define STACK_ALIGN 16
#ifndef __ASSEMBLY__
......
......@@ -75,7 +75,7 @@ static inline int syscall_get_arch(struct task_struct *task)
#endif
}
typedef long (*syscall_t)(ulong, ulong, ulong, ulong, ulong, ulong, ulong);
typedef long (*syscall_t)(const struct pt_regs *);
static inline void syscall_handler(struct pt_regs *regs, ulong syscall)
{
syscall_t fn;
......@@ -87,8 +87,7 @@ static inline void syscall_handler(struct pt_regs *regs, ulong syscall)
#endif
fn = sys_call_table[syscall];
regs->a0 = fn(regs->orig_a0, regs->a1, regs->a2,
regs->a3, regs->a4, regs->a5, regs->a6);
regs->a0 = fn(regs);
}
static inline bool arch_syscall_is_vdso_sigreturn(struct pt_regs *regs)
......
/* SPDX-License-Identifier: GPL-2.0 */
/*
* syscall_wrapper.h - riscv specific wrappers to syscall definitions
*
* Based on arch/arm64/include/syscall_wrapper.h
*/
#ifndef __ASM_SYSCALL_WRAPPER_H
#define __ASM_SYSCALL_WRAPPER_H
#include <asm/ptrace.h>
asmlinkage long __riscv_sys_ni_syscall(const struct pt_regs *);
#define SC_RISCV_REGS_TO_ARGS(x, ...) \
__MAP(x,__SC_ARGS \
,,regs->orig_a0,,regs->a1,,regs->a2 \
,,regs->a3,,regs->a4,,regs->a5,,regs->a6)
#ifdef CONFIG_COMPAT
#define COMPAT_SYSCALL_DEFINEx(x, name, ...) \
asmlinkage long __riscv_compat_sys##name(const struct pt_regs *regs); \
ALLOW_ERROR_INJECTION(__riscv_compat_sys##name, ERRNO); \
static long __se_compat_sys##name(__MAP(x,__SC_LONG,__VA_ARGS__)); \
static inline long __do_compat_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__)); \
asmlinkage long __riscv_compat_sys##name(const struct pt_regs *regs) \
{ \
return __se_compat_sys##name(SC_RISCV_REGS_TO_ARGS(x,__VA_ARGS__)); \
} \
static long __se_compat_sys##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \
{ \
return __do_compat_sys##name(__MAP(x,__SC_DELOUSE,__VA_ARGS__)); \
} \
static inline long __do_compat_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__))
#define COMPAT_SYSCALL_DEFINE0(sname) \
asmlinkage long __riscv_compat_sys_##sname(const struct pt_regs *__unused); \
ALLOW_ERROR_INJECTION(__riscv_compat_sys_##sname, ERRNO); \
asmlinkage long __riscv_compat_sys_##sname(const struct pt_regs *__unused)
#define COND_SYSCALL_COMPAT(name) \
asmlinkage long __weak __riscv_compat_sys_##name(const struct pt_regs *regs); \
asmlinkage long __weak __riscv_compat_sys_##name(const struct pt_regs *regs) \
{ \
return sys_ni_syscall(); \
}
#define COMPAT_SYS_NI(name) \
SYSCALL_ALIAS(__riscv_compat_sys_##name, sys_ni_posix_timers);
#endif /* CONFIG_COMPAT */
#define __SYSCALL_DEFINEx(x, name, ...) \
asmlinkage long __riscv_sys##name(const struct pt_regs *regs); \
ALLOW_ERROR_INJECTION(__riscv_sys##name, ERRNO); \
static long __se_sys##name(__MAP(x,__SC_LONG,__VA_ARGS__)); \
static inline long __do_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__)); \
asmlinkage long __riscv_sys##name(const struct pt_regs *regs) \
{ \
return __se_sys##name(SC_RISCV_REGS_TO_ARGS(x,__VA_ARGS__)); \
} \
static long __se_sys##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \
{ \
long ret = __do_sys##name(__MAP(x,__SC_CAST,__VA_ARGS__)); \
__MAP(x,__SC_TEST,__VA_ARGS__); \
__PROTECT(x, ret,__MAP(x,__SC_ARGS,__VA_ARGS__)); \
return ret; \
} \
static inline long __do_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__))
#define SYSCALL_DEFINE0(sname) \
SYSCALL_METADATA(_##sname, 0); \
asmlinkage long __riscv_sys_##sname(const struct pt_regs *__unused); \
ALLOW_ERROR_INJECTION(__riscv_sys_##sname, ERRNO); \
asmlinkage long __riscv_sys_##sname(const struct pt_regs *__unused)
#define COND_SYSCALL(name) \
asmlinkage long __weak __riscv_sys_##name(const struct pt_regs *regs); \
asmlinkage long __weak __riscv_sys_##name(const struct pt_regs *regs) \
{ \
return sys_ni_syscall(); \
}
#define SYS_NI(name) SYSCALL_ALIAS(__riscv_sys_##name, sys_ni_posix_timers);
#endif /* __ASM_SYSCALL_WRAPPER_H */
......@@ -10,6 +10,11 @@
#include <linux/types.h>
#define PTRACE_GETFDPIC 33
#define PTRACE_GETFDPIC_EXEC 0
#define PTRACE_GETFDPIC_INTERP 1
/*
* User-mode register state for core dumps, ptrace, sigcontext
*
......
......@@ -25,7 +25,7 @@ struct __sc_riscv_v_state {
* Signal context structure
*
* This contains the context saved before a signal handler is invoked;
* it is restored by sys_sigreturn / sys_rt_sigreturn.
* it is restored by sys_rt_sigreturn.
*/
struct sigcontext {
struct user_regs_struct sc_regs;
......
......@@ -91,6 +91,8 @@ obj-$(CONFIG_CRASH_CORE) += crash_core.o
obj-$(CONFIG_JUMP_LABEL) += jump_label.o
obj-$(CONFIG_CFI_CLANG) += cfi.o
obj-$(CONFIG_EFI) += efi.o
obj-$(CONFIG_COMPAT) += compat_syscall_table.o
obj-$(CONFIG_COMPAT) += compat_signal.o
......
// SPDX-License-Identifier: GPL-2.0
/*
* Clang Control Flow Integrity (CFI) support.
*
* Copyright (C) 2023 Google LLC
*/
#include <asm/cfi.h>
#include <asm/insn.h>
/*
* Returns the target address and the expected type when regs->epc points
* to a compiler-generated CFI trap.
*/
static bool decode_cfi_insn(struct pt_regs *regs, unsigned long *target,
u32 *type)
{
unsigned long *regs_ptr = (unsigned long *)regs;
int rs1_num;
u32 insn;
*target = *type = 0;
/*
* The compiler generates the following instruction sequence
* for indirect call checks:
*
*   lw t1, -4(<reg>)
* lui t2, <hi20>
* addiw t2, t2, <lo12>
* beq t1, t2, .Ltmp1
* ebreak ; <- regs->epc
* .Ltmp1:
* jalr <reg>
*
* We can read the expected type and the target address from the
* registers passed to the beq/jalr instructions.
*/
if (get_kernel_nofault(insn, (void *)regs->epc - 4))
return false;
if (!riscv_insn_is_beq(insn))
return false;
*type = (u32)regs_ptr[RV_EXTRACT_RS1_REG(insn)];
if (get_kernel_nofault(insn, (void *)regs->epc) ||
get_kernel_nofault(insn, (void *)regs->epc + GET_INSN_LENGTH(insn)))
return false;
if (riscv_insn_is_jalr(insn))
rs1_num = RV_EXTRACT_RS1_REG(insn);
else if (riscv_insn_is_c_jalr(insn))
rs1_num = RVC_EXTRACT_C2_RS1_REG(insn);
else
return false;
*target = regs_ptr[rs1_num];
return true;
}
/*
* Checks if the ebreak trap is because of a CFI failure, and handles the trap
* if needed. Returns a bug_trap_type value similarly to report_bug.
*/
enum bug_trap_type handle_cfi_failure(struct pt_regs *regs)
{
unsigned long target;
u32 type;
if (!is_cfi_trap(regs->epc))
return BUG_TRAP_TYPE_NONE;
if (!decode_cfi_insn(regs, &target, &type))
return report_cfi_failure_noaddr(regs, regs->epc);
return report_cfi_failure(regs, regs->epc, &target, type);
}
......@@ -9,11 +9,15 @@
#include <asm/syscall.h>
#undef __SYSCALL
#define __SYSCALL(nr, call) [nr] = (call),
#define __SYSCALL(nr, call) asmlinkage long __riscv_##call(const struct pt_regs *);
#include <asm/unistd.h>
#undef __SYSCALL
#define __SYSCALL(nr, call) [nr] = __riscv_##call,
asmlinkage long compat_sys_rt_sigreturn(void);
void * const compat_sys_call_table[__NR_syscalls] = {
[0 ... __NR_syscalls - 1] = sys_ni_syscall,
[0 ... __NR_syscalls - 1] = __riscv_sys_ni_syscall,
#include <asm/unistd.h>
};
......@@ -46,7 +46,7 @@ int riscv_of_processor_hartid(struct device_node *node, unsigned long *hart)
return 0;
}
int riscv_early_of_processor_hartid(struct device_node *node, unsigned long *hart)
int __init riscv_early_of_processor_hartid(struct device_node *node, unsigned long *hart)
{
const char *isa;
......@@ -66,16 +66,53 @@ int riscv_early_of_processor_hartid(struct device_node *node, unsigned long *har
return -ENODEV;
}
if (of_property_read_string(node, "riscv,isa-base", &isa))
goto old_interface;
if (IS_ENABLED(CONFIG_32BIT) && strncasecmp(isa, "rv32i", 5)) {
pr_warn("CPU with hartid=%lu does not support rv32i", *hart);
return -ENODEV;
}
if (IS_ENABLED(CONFIG_64BIT) && strncasecmp(isa, "rv64i", 5)) {
pr_warn("CPU with hartid=%lu does not support rv64i", *hart);
return -ENODEV;
}
if (!of_property_present(node, "riscv,isa-extensions"))
return -ENODEV;
if (of_property_match_string(node, "riscv,isa-extensions", "i") < 0 ||
of_property_match_string(node, "riscv,isa-extensions", "m") < 0 ||
of_property_match_string(node, "riscv,isa-extensions", "a") < 0) {
pr_warn("CPU with hartid=%lu does not support ima", *hart);
return -ENODEV;
}
return 0;
old_interface:
if (!riscv_isa_fallback) {
pr_warn("CPU with hartid=%lu is invalid: this kernel does not parse \"riscv,isa\"",
*hart);
return -ENODEV;
}
if (of_property_read_string(node, "riscv,isa", &isa)) {
pr_warn("CPU with hartid=%lu has no \"riscv,isa\" property\n", *hart);
pr_warn("CPU with hartid=%lu has no \"riscv,isa-base\" or \"riscv,isa\" property\n",
*hart);
return -ENODEV;
}
if (IS_ENABLED(CONFIG_32BIT) && strncasecmp(isa, "rv32ima", 7))
if (IS_ENABLED(CONFIG_32BIT) && strncasecmp(isa, "rv32ima", 7)) {
pr_warn("CPU with hartid=%lu does not support rv32ima", *hart);
return -ENODEV;
}
if (IS_ENABLED(CONFIG_64BIT) && strncasecmp(isa, "rv64ima", 7))
if (IS_ENABLED(CONFIG_64BIT) && strncasecmp(isa, "rv64ima", 7)) {
pr_warn("CPU with hartid=%lu does not support rv64ima", *hart);
return -ENODEV;
}
return 0;
}
......@@ -165,132 +202,46 @@ arch_initcall(riscv_cpuinfo_init);
#ifdef CONFIG_PROC_FS
#define __RISCV_ISA_EXT_DATA(UPROP, EXTID) \
{ \
.uprop = #UPROP, \
.isa_ext_id = EXTID, \
}
/*
* The canonical order of ISA extension names in the ISA string is defined in
* chapter 27 of the unprivileged specification.
*
* Ordinarily, for in-kernel data structures, this order is unimportant but
* isa_ext_arr defines the order of the ISA string in /proc/cpuinfo.
*
* The specification uses vague wording, such as should, when it comes to
* ordering, so for our purposes the following rules apply:
*
* 1. All multi-letter extensions must be separated from other extensions by an
* underscore.
*
* 2. Additional standard extensions (starting with 'Z') must be sorted after
* single-letter extensions and before any higher-privileged extensions.
* 3. The first letter following the 'Z' conventionally indicates the most
* closely related alphabetical extension category, IMAFDQLCBKJTPVH.
* If multiple 'Z' extensions are named, they must be ordered first by
* category, then alphabetically within a category.
*
* 4. Standard supervisor-level extensions (starting with 'S') must be listed
* after standard unprivileged extensions. If multiple supervisor-level
* extensions are listed, they must be ordered alphabetically.
*
* 5. Standard machine-level extensions (starting with 'Zxm') must be listed
* after any lower-privileged, standard extensions. If multiple
* machine-level extensions are listed, they must be ordered
* alphabetically.
*
* 6. Non-standard extensions (starting with 'X') must be listed after all
* standard extensions. If multiple non-standard extensions are listed, they
* must be ordered alphabetically.
*
* An example string following the order is:
* rv64imadc_zifoo_zigoo_zafoo_sbar_scar_zxmbaz_xqux_xrux
*
* New entries to this struct should follow the ordering rules described above.
*/
static struct riscv_isa_ext_data isa_ext_arr[] = {
__RISCV_ISA_EXT_DATA(zicbom, RISCV_ISA_EXT_ZICBOM),
__RISCV_ISA_EXT_DATA(zicboz, RISCV_ISA_EXT_ZICBOZ),
__RISCV_ISA_EXT_DATA(zicntr, RISCV_ISA_EXT_ZICNTR),
__RISCV_ISA_EXT_DATA(zicsr, RISCV_ISA_EXT_ZICSR),
__RISCV_ISA_EXT_DATA(zifencei, RISCV_ISA_EXT_ZIFENCEI),
__RISCV_ISA_EXT_DATA(zihintpause, RISCV_ISA_EXT_ZIHINTPAUSE),
__RISCV_ISA_EXT_DATA(zihpm, RISCV_ISA_EXT_ZIHPM),
__RISCV_ISA_EXT_DATA(zba, RISCV_ISA_EXT_ZBA),
__RISCV_ISA_EXT_DATA(zbb, RISCV_ISA_EXT_ZBB),
__RISCV_ISA_EXT_DATA(zbs, RISCV_ISA_EXT_ZBS),
__RISCV_ISA_EXT_DATA(smaia, RISCV_ISA_EXT_SMAIA),
__RISCV_ISA_EXT_DATA(ssaia, RISCV_ISA_EXT_SSAIA),
__RISCV_ISA_EXT_DATA(sscofpmf, RISCV_ISA_EXT_SSCOFPMF),
__RISCV_ISA_EXT_DATA(sstc, RISCV_ISA_EXT_SSTC),
__RISCV_ISA_EXT_DATA(svinval, RISCV_ISA_EXT_SVINVAL),
__RISCV_ISA_EXT_DATA(svnapot, RISCV_ISA_EXT_SVNAPOT),
__RISCV_ISA_EXT_DATA(svpbmt, RISCV_ISA_EXT_SVPBMT),
__RISCV_ISA_EXT_DATA("", RISCV_ISA_EXT_MAX),
};
static void print_isa_ext(struct seq_file *f)
static void print_isa(struct seq_file *f)
{
struct riscv_isa_ext_data *edata;
int i = 0, arr_sz;
arr_sz = ARRAY_SIZE(isa_ext_arr) - 1;
seq_puts(f, "isa\t\t: ");
/* No extension support available */
if (arr_sz <= 0)
return;
if (IS_ENABLED(CONFIG_32BIT))
seq_write(f, "rv32", 4);
else
seq_write(f, "rv64", 4);
for (i = 0; i <= arr_sz; i++) {
edata = &isa_ext_arr[i];
if (!__riscv_isa_extension_available(NULL, edata->isa_ext_id))
for (int i = 0; i < riscv_isa_ext_count; i++) {
if (!__riscv_isa_extension_available(NULL, riscv_isa_ext[i].id))
continue;
seq_printf(f, "_%s", edata->uprop);
}
}
/*
* These are the only valid base (single letter) ISA extensions as per the spec.
* It also specifies the canonical order in which it appears in the spec.
* Some of the extension may just be a place holder for now (B, K, P, J).
* This should be updated once corresponding extensions are ratified.
*/
static const char base_riscv_exts[13] = "imafdqcbkjpvh";
/* Only multi-letter extensions are split by underscores */
if (strnlen(riscv_isa_ext[i].name, 2) != 1)
seq_puts(f, "_");
static void print_isa(struct seq_file *f, const char *isa)
{
int i;
seq_puts(f, "isa\t\t: ");
/* Print the rv[64/32] part */
seq_write(f, isa, 4);
for (i = 0; i < sizeof(base_riscv_exts); i++) {
if (__riscv_isa_extension_available(NULL, base_riscv_exts[i] - 'a'))
/* Print only enabled the base ISA extensions */
seq_write(f, &base_riscv_exts[i], 1);
seq_printf(f, "%s", riscv_isa_ext[i].name);
}
print_isa_ext(f);
seq_puts(f, "\n");
}
static void print_mmu(struct seq_file *f)
{
char sv_type[16];
const char *sv_type;
#ifdef CONFIG_MMU
#if defined(CONFIG_32BIT)
strncpy(sv_type, "sv32", 5);
sv_type = "sv32";
#elif defined(CONFIG_64BIT)
if (pgtable_l5_enabled)
strncpy(sv_type, "sv57", 5);
sv_type = "sv57";
else if (pgtable_l4_enabled)
strncpy(sv_type, "sv48", 5);
sv_type = "sv48";
else
strncpy(sv_type, "sv39", 5);
sv_type = "sv39";
#endif
#else
strncpy(sv_type, "none", 5);
sv_type = "none";
#endif /* CONFIG_MMU */
seq_printf(f, "mmu\t\t: %s\n", sv_type);
}
......@@ -321,27 +272,21 @@ static int c_show(struct seq_file *m, void *v)
unsigned long cpu_id = (unsigned long)v - 1;
struct riscv_cpuinfo *ci = per_cpu_ptr(&riscv_cpuinfo, cpu_id);
struct device_node *node;
const char *compat, *isa;
const char *compat;
seq_printf(m, "processor\t: %lu\n", cpu_id);
seq_printf(m, "hart\t\t: %lu\n", cpuid_to_hartid_map(cpu_id));
print_isa(m);
print_mmu(m);
if (acpi_disabled) {
node = of_get_cpu_node(cpu_id, NULL);
if (!of_property_read_string(node, "riscv,isa", &isa))
print_isa(m, isa);
print_mmu(m);
if (!of_property_read_string(node, "compatible", &compat) &&
strcmp(compat, "riscv"))
seq_printf(m, "uarch\t\t: %s\n", compat);
of_node_put(node);
} else {
if (!acpi_get_riscv_isa(NULL, cpu_id, &isa))
print_isa(m, isa);
print_mmu(m);
}
seq_printf(m, "mvendorid\t: 0x%lx\n", ci->mvendorid);
......
......@@ -289,10 +289,6 @@ clear_bss:
blt a3, a4, clear_bss
clear_bss_done:
#endif
/* Save hart ID and DTB physical address */
mv s0, a0
mv s1, a1
la a2, boot_cpu_hartid
XIP_FIXUP_OFFSET a2
REG_S a0, (a2)
......@@ -306,7 +302,7 @@ clear_bss_done:
la a0, __dtb_start
XIP_FIXUP_OFFSET a0
#else
mv a0, s1
mv a0, a1
#endif /* CONFIG_BUILTIN_DTB */
call setup_vm
#ifdef CONFIG_MMU
......
......@@ -3,6 +3,7 @@
#include <linux/init.h>
#include <linux/linkage.h>
#include <linux/cfi_types.h>
#include <asm/asm.h>
#include <asm/csr.h>
#include <asm/unistd.h>
......@@ -47,15 +48,19 @@
addi sp, sp, 4*SZREG
.endm
ENTRY(ftrace_stub)
SYM_TYPED_FUNC_START(ftrace_stub)
#ifdef CONFIG_DYNAMIC_FTRACE
.global MCOUNT_NAME
.set MCOUNT_NAME, ftrace_stub
#endif
ret
ENDPROC(ftrace_stub)
SYM_FUNC_END(ftrace_stub)
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
SYM_TYPED_FUNC_START(ftrace_stub_graph)
ret
SYM_FUNC_END(ftrace_stub_graph)
ENTRY(return_to_handler)
/*
* On implementing the frame point test, the ideal way is to compare the
......
......@@ -29,13 +29,14 @@ riscv_probe_decode_insn(probe_opcode_t *addr, struct arch_probe_insn *api)
* TODO: the REJECTED ones below need to be implemented
*/
#ifdef CONFIG_RISCV_ISA_C
RISCV_INSN_REJECTED(c_j, insn);
RISCV_INSN_REJECTED(c_jr, insn);
RISCV_INSN_REJECTED(c_jal, insn);
RISCV_INSN_REJECTED(c_jalr, insn);
RISCV_INSN_REJECTED(c_beqz, insn);
RISCV_INSN_REJECTED(c_bnez, insn);
RISCV_INSN_REJECTED(c_ebreak, insn);
RISCV_INSN_SET_SIMULATE(c_j, insn);
RISCV_INSN_SET_SIMULATE(c_jr, insn);
RISCV_INSN_SET_SIMULATE(c_jalr, insn);
RISCV_INSN_SET_SIMULATE(c_beqz, insn);
RISCV_INSN_SET_SIMULATE(c_bnez, insn);
#endif
RISCV_INSN_SET_SIMULATE(jal, insn);
......
......@@ -188,3 +188,108 @@ bool __kprobes simulate_branch(u32 opcode, unsigned long addr, struct pt_regs *r
return true;
}
bool __kprobes simulate_c_j(u32 opcode, unsigned long addr, struct pt_regs *regs)
{
/*
* 15 13 12 2 1 0
* | funct3 | offset[11|4|9:8|10|6|7|3:1|5] | opcode |
* 3 11 2
*/
s32 offset;
offset = ((opcode >> 3) & 0x7) << 1;
offset |= ((opcode >> 11) & 0x1) << 4;
offset |= ((opcode >> 2) & 0x1) << 5;
offset |= ((opcode >> 7) & 0x1) << 6;
offset |= ((opcode >> 6) & 0x1) << 7;
offset |= ((opcode >> 9) & 0x3) << 8;
offset |= ((opcode >> 8) & 0x1) << 10;
offset |= ((opcode >> 12) & 0x1) << 11;
instruction_pointer_set(regs, addr + sign_extend32(offset, 11));
return true;
}
static bool __kprobes simulate_c_jr_jalr(u32 opcode, unsigned long addr, struct pt_regs *regs,
bool is_jalr)
{
/*
* 15 12 11 7 6 2 1 0
* | funct4 | rs1 | rs2 | op |
* 4 5 5 2
*/
unsigned long jump_addr;
u32 rs1 = (opcode >> 7) & 0x1f;
if (rs1 == 0) /* C.JR is only valid when rs1 != x0 */
return false;
if (!rv_insn_reg_get_val(regs, rs1, &jump_addr))
return false;
if (is_jalr && !rv_insn_reg_set_val(regs, 1, addr + 2))
return false;
instruction_pointer_set(regs, jump_addr);
return true;
}
bool __kprobes simulate_c_jr(u32 opcode, unsigned long addr, struct pt_regs *regs)
{
return simulate_c_jr_jalr(opcode, addr, regs, false);
}
bool __kprobes simulate_c_jalr(u32 opcode, unsigned long addr, struct pt_regs *regs)
{
return simulate_c_jr_jalr(opcode, addr, regs, true);
}
static bool __kprobes simulate_c_bnez_beqz(u32 opcode, unsigned long addr, struct pt_regs *regs,
bool is_bnez)
{
/*
* 15 13 12 10 9 7 6 2 1 0
* | funct3 | offset[8|4:3] | rs1' | offset[7:6|2:1|5] | op |
* 3 3 3 5 2
*/
s32 offset;
u32 rs1;
unsigned long rs1_val;
rs1 = 0x8 | ((opcode >> 7) & 0x7);
if (!rv_insn_reg_get_val(regs, rs1, &rs1_val))
return false;
if ((rs1_val != 0 && is_bnez) || (rs1_val == 0 && !is_bnez)) {
offset = ((opcode >> 3) & 0x3) << 1;
offset |= ((opcode >> 10) & 0x3) << 3;
offset |= ((opcode >> 2) & 0x1) << 5;
offset |= ((opcode >> 5) & 0x3) << 6;
offset |= ((opcode >> 12) & 0x1) << 8;
offset = sign_extend32(offset, 8);
} else {
offset = 2;
}
instruction_pointer_set(regs, addr + offset);
return true;
}
bool __kprobes simulate_c_bnez(u32 opcode, unsigned long addr, struct pt_regs *regs)
{
return simulate_c_bnez_beqz(opcode, addr, regs, true);
}
bool __kprobes simulate_c_beqz(u32 opcode, unsigned long addr, struct pt_regs *regs)
{
return simulate_c_bnez_beqz(opcode, addr, regs, false);
}
......@@ -24,5 +24,10 @@ bool simulate_auipc(u32 opcode, unsigned long addr, struct pt_regs *regs);
bool simulate_branch(u32 opcode, unsigned long addr, struct pt_regs *regs);
bool simulate_jal(u32 opcode, unsigned long addr, struct pt_regs *regs);
bool simulate_jalr(u32 opcode, unsigned long addr, struct pt_regs *regs);
bool simulate_c_j(u32 opcode, unsigned long addr, struct pt_regs *regs);
bool simulate_c_jr(u32 opcode, unsigned long addr, struct pt_regs *regs);
bool simulate_c_jalr(u32 opcode, unsigned long addr, struct pt_regs *regs);
bool simulate_c_bnez(u32 opcode, unsigned long addr, struct pt_regs *regs);
bool simulate_c_beqz(u32 opcode, unsigned long addr, struct pt_regs *regs);
#endif /* _RISCV_KERNEL_PROBES_SIMULATE_INSN_H */
......@@ -178,6 +178,11 @@ static void __init init_resources(void)
if (ret < 0)
goto error;
}
if (crashk_low_res.start != crashk_low_res.end) {
ret = add_resource(&iomem_resource, &crashk_low_res);
if (ret < 0)
goto error;
}
#endif
#ifdef CONFIG_CRASH_DUMP
......@@ -311,6 +316,7 @@ void __init setup_arch(char **cmdline_p)
if (IS_ENABLED(CONFIG_RISCV_ISA_ZICBOM) &&
riscv_isa_extension_available(NULL, ZICBOM))
riscv_noncoherent_supported();
riscv_set_dma_cache_alignment();
}
static int __init topology_init(void)
......
......@@ -5,6 +5,7 @@
*/
#include <linux/linkage.h>
#include <linux/cfi_types.h>
#include <asm/asm.h>
#include <asm/asm-offsets.h>
#include <asm/assembler.h>
......@@ -58,7 +59,7 @@ ENTRY(__cpu_suspend_enter)
ret
END(__cpu_suspend_enter)
ENTRY(__cpu_resume_enter)
SYM_TYPED_FUNC_START(__cpu_resume_enter)
/* Load the global pointer */
.option push
.option norelax
......@@ -94,4 +95,4 @@ ENTRY(__cpu_resume_enter)
/* Return to C code */
ret
END(__cpu_resume_enter)
SYM_FUNC_END(__cpu_resume_enter)
......@@ -335,3 +335,9 @@ SYSCALL_DEFINE5(riscv_hwprobe, struct riscv_hwprobe __user *, pairs,
return do_riscv_hwprobe(pairs, pair_count, cpu_count,
cpus, flags);
}
/* Not defined using SYSCALL_DEFINE0 to avoid error injection */
asmlinkage long __riscv_sys_ni_syscall(const struct pt_regs *__unused)
{
return -ENOSYS;
}
......@@ -10,9 +10,13 @@
#include <asm/syscall.h>
#undef __SYSCALL
#define __SYSCALL(nr, call) [nr] = (call),
#define __SYSCALL(nr, call) asmlinkage long __riscv_##call(const struct pt_regs *);
#include <asm/unistd.h>
#undef __SYSCALL
#define __SYSCALL(nr, call) [nr] = __riscv_##call,
void * const sys_call_table[__NR_syscalls] = {
[0 ... __NR_syscalls - 1] = sys_ni_syscall,
[0 ... __NR_syscalls - 1] = __riscv_sys_ni_syscall,
#include <asm/unistd.h>
};
......@@ -21,6 +21,7 @@
#include <asm/asm-prototypes.h>
#include <asm/bug.h>
#include <asm/cfi.h>
#include <asm/csr.h>
#include <asm/processor.h>
#include <asm/ptrace.h>
......@@ -271,7 +272,8 @@ void handle_break(struct pt_regs *regs)
== NOTIFY_STOP)
return;
#endif
else if (report_bug(regs->epc, regs) == BUG_TRAP_TYPE_WARN)
else if (report_bug(regs->epc, regs) == BUG_TRAP_TYPE_WARN ||
handle_cfi_failure(regs) == BUG_TRAP_TYPE_WARN)
regs->epc += get_break_insn_length(regs->epc);
else
die(regs, "Kernel BUG");
......
......@@ -67,7 +67,7 @@ static void __flush_context(void)
lockdep_assert_held(&context_lock);
/* Update the list of reserved ASIDs and the ASID bitmap. */
bitmap_clear(context_asid_map, 0, num_asids);
bitmap_zero(context_asid_map, num_asids);
/* Mark already active ASIDs as used */
for_each_possible_cpu(i) {
......
......@@ -11,6 +11,8 @@
#include <asm/cacheflush.h>
static bool noncoherent_supported __ro_after_init;
int dma_cache_alignment __ro_after_init = ARCH_DMA_MINALIGN;
EXPORT_SYMBOL_GPL(dma_cache_alignment);
void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
enum dma_data_direction dir)
......@@ -78,3 +80,9 @@ void riscv_noncoherent_supported(void)
"Non-coherent DMA support enabled without a block size\n");
noncoherent_supported = true;
}
void __init riscv_set_dma_cache_alignment(void)
{
if (!noncoherent_supported)
dma_cache_alignment = 1;
}
......@@ -1299,6 +1299,28 @@ static inline void setup_vm_final(void)
}
#endif /* CONFIG_MMU */
/* Reserve 128M low memory by default for swiotlb buffer */
#define DEFAULT_CRASH_KERNEL_LOW_SIZE (128UL << 20)
static int __init reserve_crashkernel_low(unsigned long long low_size)
{
unsigned long long low_base;
low_base = memblock_phys_alloc_range(low_size, PMD_SIZE, 0, dma32_phys_limit);
if (!low_base) {
pr_err("cannot allocate crashkernel low memory (size:0x%llx).\n", low_size);
return -ENOMEM;
}
pr_info("crashkernel low memory reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
low_base, low_base + low_size, low_size >> 20);
crashk_low_res.start = low_base;
crashk_low_res.end = low_base + low_size - 1;
return 0;
}
/*
* reserve_crashkernel() - reserves memory for crash kernel
*
......@@ -1310,8 +1332,12 @@ static void __init reserve_crashkernel(void)
{
unsigned long long crash_base = 0;
unsigned long long crash_size = 0;
unsigned long long crash_low_size = 0;
unsigned long search_start = memblock_start_of_DRAM();
unsigned long search_end = memblock_end_of_DRAM();
unsigned long search_end = (unsigned long)dma32_phys_limit;
char *cmdline = boot_command_line;
bool fixed_base = false;
bool high = false;
int ret = 0;
......@@ -1327,14 +1353,36 @@ static void __init reserve_crashkernel(void)
return;
}
ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
&crash_size, &crash_base);
if (ret || !crash_size)
if (ret == -ENOENT) {
/* Fallback to crashkernel=X,[high,low] */
ret = parse_crashkernel_high(cmdline, 0, &crash_size, &crash_base);
if (ret || !crash_size)
return;
/*
* crashkernel=Y,low is valid only when crashkernel=X,high
* is passed.
*/
ret = parse_crashkernel_low(cmdline, 0, &crash_low_size, &crash_base);
if (ret == -ENOENT)
crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
else if (ret)
return;
search_start = (unsigned long)dma32_phys_limit;
search_end = memblock_end_of_DRAM();
high = true;
} else if (ret || !crash_size) {
/* Invalid argument value specified */
return;
}
crash_size = PAGE_ALIGN(crash_size);
if (crash_base) {
fixed_base = true;
search_start = crash_base;
search_end = crash_base + crash_size;
}
......@@ -1347,12 +1395,37 @@ static void __init reserve_crashkernel(void)
* swiotlb can work on the crash kernel.
*/
crash_base = memblock_phys_alloc_range(crash_size, PMD_SIZE,
search_start,
min(search_end, (unsigned long)(SZ_4G - 1)));
search_start, search_end);
if (crash_base == 0) {
/* Try again without restricting region to 32bit addressible memory */
/*
* For crashkernel=size[KMG]@offset[KMG], print out failure
* message if can't reserve the specified region.
*/
if (fixed_base) {
pr_warn("crashkernel: allocating failed with given size@offset\n");
return;
}
if (high) {
/*
* For crashkernel=size[KMG],high, if the first attempt was
* for high memory, fall back to low memory.
*/
search_start = memblock_start_of_DRAM();
search_end = (unsigned long)dma32_phys_limit;
} else {
/*
* For crashkernel=size[KMG], if the first attempt was for
* low memory, fall back to high memory, the minimum required
* low memory will be reserved later.
*/
search_start = (unsigned long)dma32_phys_limit;
search_end = memblock_end_of_DRAM();
crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
}
crash_base = memblock_phys_alloc_range(crash_size, PMD_SIZE,
search_start, search_end);
search_start, search_end);
if (crash_base == 0) {
pr_warn("crashkernel: couldn't allocate %lldKB\n",
crash_size >> 10);
......@@ -1360,6 +1433,12 @@ static void __init reserve_crashkernel(void)
}
}
if ((crash_base >= dma32_phys_limit) && crash_low_size &&
reserve_crashkernel_low(crash_low_size)) {
memblock_phys_free(crash_base, crash_size);
return;
}
pr_info("crashkernel: reserved 0x%016llx - 0x%016llx (%lld MB)\n",
crash_base, crash_base + crash_size, crash_size >> 20);
......
......@@ -22,9 +22,9 @@
* region is not and then we have to go down to the PUD level.
*/
pgd_t tmp_pg_dir[PTRS_PER_PGD] __page_aligned_bss;
p4d_t tmp_p4d[PTRS_PER_P4D] __page_aligned_bss;
pud_t tmp_pud[PTRS_PER_PUD] __page_aligned_bss;
static pgd_t tmp_pg_dir[PTRS_PER_PGD] __page_aligned_bss;
static p4d_t tmp_p4d[PTRS_PER_P4D] __page_aligned_bss;
static pud_t tmp_pud[PTRS_PER_PUD] __page_aligned_bss;
static void __init kasan_populate_pte(pmd_t *pmd, unsigned long vaddr, unsigned long end)
{
......@@ -438,7 +438,7 @@ static void __init kasan_shallow_populate(void *start, void *end)
kasan_shallow_populate_pgd(vaddr, vend);
}
static void create_tmp_mapping(void)
static void __init create_tmp_mapping(void)
{
void *ptr;
p4d_t *base_p4d;
......
......@@ -77,6 +77,10 @@ ifdef CONFIG_STACKPROTECTOR_STRONG
PURGATORY_CFLAGS_REMOVE += -fstack-protector-strong
endif
ifdef CONFIG_CFI_CLANG
PURGATORY_CFLAGS_REMOVE += $(CC_FLAGS_CFI)
endif
CFLAGS_REMOVE_purgatory.o += $(PURGATORY_CFLAGS_REMOVE)
CFLAGS_purgatory.o += $(PURGATORY_CFLAGS)
......
......@@ -14,9 +14,81 @@
#include <linux/perf/riscv_pmu.h>
#include <linux/printk.h>
#include <linux/smp.h>
#include <linux/sched_clock.h>
#include <asm/sbi.h>
static bool riscv_perf_user_access(struct perf_event *event)
{
return ((event->attr.type == PERF_TYPE_HARDWARE) ||
(event->attr.type == PERF_TYPE_HW_CACHE) ||
(event->attr.type == PERF_TYPE_RAW)) &&
!!(event->hw.flags & PERF_EVENT_FLAG_USER_READ_CNT);
}
void arch_perf_update_userpage(struct perf_event *event,
struct perf_event_mmap_page *userpg, u64 now)
{
struct clock_read_data *rd;
unsigned int seq;
u64 ns;
userpg->cap_user_time = 0;
userpg->cap_user_time_zero = 0;
userpg->cap_user_time_short = 0;
userpg->cap_user_rdpmc = riscv_perf_user_access(event);
#ifdef CONFIG_RISCV_PMU
/*
* The counters are 64-bit but the priv spec doesn't mandate all the
* bits to be implemented: that's why, counter width can vary based on
* the cpu vendor.
*/
if (userpg->cap_user_rdpmc)
userpg->pmc_width = to_riscv_pmu(event->pmu)->ctr_get_width(event->hw.idx) + 1;
#endif
do {
rd = sched_clock_read_begin(&seq);
userpg->time_mult = rd->mult;
userpg->time_shift = rd->shift;
userpg->time_zero = rd->epoch_ns;
userpg->time_cycles = rd->epoch_cyc;
userpg->time_mask = rd->sched_clock_mask;
/*
* Subtract the cycle base, such that software that
* doesn't know about cap_user_time_short still 'works'
* assuming no wraps.
*/
ns = mul_u64_u32_shr(rd->epoch_cyc, rd->mult, rd->shift);
userpg->time_zero -= ns;
} while (sched_clock_read_retry(seq));
userpg->time_offset = userpg->time_zero - now;
/*
* time_shift is not expected to be greater than 31 due to
* the original published conversion algorithm shifting a
* 32-bit value (now specifies a 64-bit value) - refer
* perf_event_mmap_page documentation in perf_event.h.
*/
if (userpg->time_shift == 32) {
userpg->time_shift = 31;
userpg->time_mult >>= 1;
}
/*
* Internal timekeeping for enabled/running/stopped times
* is always computed with the sched_clock.
*/
userpg->cap_user_time = 1;
userpg->cap_user_time_zero = 1;
userpg->cap_user_time_short = 1;
}
static unsigned long csr_read_num(int csr_num)
{
#define switchcase_csr_read(__csr_num, __val) {\
......@@ -171,6 +243,8 @@ int riscv_pmu_event_set_period(struct perf_event *event)
local64_set(&hwc->prev_count, (u64)-left);
perf_event_update_userpage(event);
return overflow;
}
......@@ -264,6 +338,9 @@ static int riscv_pmu_event_init(struct perf_event *event)
hwc->idx = -1;
hwc->event_base = mapped_event;
if (rvpmu->event_init)
rvpmu->event_init(event);
if (!is_sampling_event(event)) {
/*
* For non-sampling runs, limit the sample_period to half
......@@ -280,6 +357,39 @@ static int riscv_pmu_event_init(struct perf_event *event)
return 0;
}
static int riscv_pmu_event_idx(struct perf_event *event)
{
struct riscv_pmu *rvpmu = to_riscv_pmu(event->pmu);
if (!(event->hw.flags & PERF_EVENT_FLAG_USER_READ_CNT))
return 0;
if (rvpmu->csr_index)
return rvpmu->csr_index(event) + 1;
return 0;
}
static void riscv_pmu_event_mapped(struct perf_event *event, struct mm_struct *mm)
{
struct riscv_pmu *rvpmu = to_riscv_pmu(event->pmu);
if (rvpmu->event_mapped) {
rvpmu->event_mapped(event, mm);
perf_event_update_userpage(event);
}
}
static void riscv_pmu_event_unmapped(struct perf_event *event, struct mm_struct *mm)
{
struct riscv_pmu *rvpmu = to_riscv_pmu(event->pmu);
if (rvpmu->event_unmapped) {
rvpmu->event_unmapped(event, mm);
perf_event_update_userpage(event);
}
}
struct riscv_pmu *riscv_pmu_alloc(void)
{
struct riscv_pmu *pmu;
......@@ -304,6 +414,9 @@ struct riscv_pmu *riscv_pmu_alloc(void)
}
pmu->pmu = (struct pmu) {
.event_init = riscv_pmu_event_init,
.event_mapped = riscv_pmu_event_mapped,
.event_unmapped = riscv_pmu_event_unmapped,
.event_idx = riscv_pmu_event_idx,
.add = riscv_pmu_add,
.del = riscv_pmu_del,
.start = riscv_pmu_start,
......
......@@ -13,7 +13,7 @@
#include <linux/platform_device.h>
#define RISCV_PMU_LEGACY_CYCLE 0
#define RISCV_PMU_LEGACY_INSTRET 1
#define RISCV_PMU_LEGACY_INSTRET 2
static bool pmu_init_done;
......@@ -71,6 +71,29 @@ static void pmu_legacy_ctr_start(struct perf_event *event, u64 ival)
local64_set(&hwc->prev_count, initial_val);
}
static uint8_t pmu_legacy_csr_index(struct perf_event *event)
{
return event->hw.idx;
}
static void pmu_legacy_event_mapped(struct perf_event *event, struct mm_struct *mm)
{
if (event->attr.config != PERF_COUNT_HW_CPU_CYCLES &&
event->attr.config != PERF_COUNT_HW_INSTRUCTIONS)
return;
event->hw.flags |= PERF_EVENT_FLAG_USER_READ_CNT;
}
static void pmu_legacy_event_unmapped(struct perf_event *event, struct mm_struct *mm)
{
if (event->attr.config != PERF_COUNT_HW_CPU_CYCLES &&
event->attr.config != PERF_COUNT_HW_INSTRUCTIONS)
return;
event->hw.flags &= ~PERF_EVENT_FLAG_USER_READ_CNT;
}
/*
* This is just a simple implementation to allow legacy implementations
* compatible with new RISC-V PMU driver framework.
......@@ -91,6 +114,9 @@ static void pmu_legacy_init(struct riscv_pmu *pmu)
pmu->ctr_get_width = NULL;
pmu->ctr_clear_idx = NULL;
pmu->ctr_read = pmu_legacy_read_ctr;
pmu->event_mapped = pmu_legacy_event_mapped;
pmu->event_unmapped = pmu_legacy_event_unmapped;
pmu->csr_index = pmu_legacy_csr_index;
perf_pmu_register(&pmu->pmu, "cpu", PERF_TYPE_RAW);
}
......
......@@ -24,6 +24,14 @@
#include <asm/sbi.h>
#include <asm/hwcap.h>
#define SYSCTL_NO_USER_ACCESS 0
#define SYSCTL_USER_ACCESS 1
#define SYSCTL_LEGACY 2
#define PERF_EVENT_FLAG_NO_USER_ACCESS BIT(SYSCTL_NO_USER_ACCESS)
#define PERF_EVENT_FLAG_USER_ACCESS BIT(SYSCTL_USER_ACCESS)
#define PERF_EVENT_FLAG_LEGACY BIT(SYSCTL_LEGACY)
PMU_FORMAT_ATTR(event, "config:0-47");
PMU_FORMAT_ATTR(firmware, "config:63");
......@@ -43,6 +51,9 @@ static const struct attribute_group *riscv_pmu_attr_groups[] = {
NULL,
};
/* Allow user mode access by default */
static int sysctl_perf_user_access __read_mostly = SYSCTL_USER_ACCESS;
/*
* RISC-V doesn't have heterogeneous harts yet. This need to be part of
* per_cpu in case of harts with different pmu counters
......@@ -301,6 +312,11 @@ int riscv_pmu_get_hpm_info(u32 *hw_ctr_width, u32 *num_hw_ctr)
}
EXPORT_SYMBOL_GPL(riscv_pmu_get_hpm_info);
static uint8_t pmu_sbi_csr_index(struct perf_event *event)
{
return pmu_ctr_list[event->hw.idx].csr - CSR_CYCLE;
}
static unsigned long pmu_sbi_get_filter_flags(struct perf_event *event)
{
unsigned long cflags = 0;
......@@ -329,18 +345,34 @@ static int pmu_sbi_ctr_get_idx(struct perf_event *event)
struct cpu_hw_events *cpuc = this_cpu_ptr(rvpmu->hw_events);
struct sbiret ret;
int idx;
uint64_t cbase = 0;
uint64_t cbase = 0, cmask = rvpmu->cmask;
unsigned long cflags = 0;
cflags = pmu_sbi_get_filter_flags(event);
/*
* In legacy mode, we have to force the fixed counters for those events
* but not in the user access mode as we want to use the other counters
* that support sampling/filtering.
*/
if (hwc->flags & PERF_EVENT_FLAG_LEGACY) {
if (event->attr.config == PERF_COUNT_HW_CPU_CYCLES) {
cflags |= SBI_PMU_CFG_FLAG_SKIP_MATCH;
cmask = 1;
} else if (event->attr.config == PERF_COUNT_HW_INSTRUCTIONS) {
cflags |= SBI_PMU_CFG_FLAG_SKIP_MATCH;
cmask = 1UL << (CSR_INSTRET - CSR_CYCLE);
}
}
/* retrieve the available counter index */
#if defined(CONFIG_32BIT)
ret = sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_CFG_MATCH, cbase,
rvpmu->cmask, cflags, hwc->event_base, hwc->config,
cmask, cflags, hwc->event_base, hwc->config,
hwc->config >> 32);
#else
ret = sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_CFG_MATCH, cbase,
rvpmu->cmask, cflags, hwc->event_base, hwc->config, 0);
cmask, cflags, hwc->event_base, hwc->config, 0);
#endif
if (ret.error) {
pr_debug("Not able to find a counter for event %lx config %llx\n",
......@@ -474,6 +506,22 @@ static u64 pmu_sbi_ctr_read(struct perf_event *event)
return val;
}
static void pmu_sbi_set_scounteren(void *arg)
{
struct perf_event *event = (struct perf_event *)arg;
csr_write(CSR_SCOUNTEREN,
csr_read(CSR_SCOUNTEREN) | (1 << pmu_sbi_csr_index(event)));
}
static void pmu_sbi_reset_scounteren(void *arg)
{
struct perf_event *event = (struct perf_event *)arg;
csr_write(CSR_SCOUNTEREN,
csr_read(CSR_SCOUNTEREN) & ~(1 << pmu_sbi_csr_index(event)));
}
static void pmu_sbi_ctr_start(struct perf_event *event, u64 ival)
{
struct sbiret ret;
......@@ -490,6 +538,10 @@ static void pmu_sbi_ctr_start(struct perf_event *event, u64 ival)
if (ret.error && (ret.error != SBI_ERR_ALREADY_STARTED))
pr_err("Starting counter idx %d failed with error %d\n",
hwc->idx, sbi_err_map_linux_errno(ret.error));
if ((hwc->flags & PERF_EVENT_FLAG_USER_ACCESS) &&
(hwc->flags & PERF_EVENT_FLAG_USER_READ_CNT))
pmu_sbi_set_scounteren((void *)event);
}
static void pmu_sbi_ctr_stop(struct perf_event *event, unsigned long flag)
......@@ -497,6 +549,10 @@ static void pmu_sbi_ctr_stop(struct perf_event *event, unsigned long flag)
struct sbiret ret;
struct hw_perf_event *hwc = &event->hw;
if ((hwc->flags & PERF_EVENT_FLAG_USER_ACCESS) &&
(hwc->flags & PERF_EVENT_FLAG_USER_READ_CNT))
pmu_sbi_reset_scounteren((void *)event);
ret = sbi_ecall(SBI_EXT_PMU, SBI_EXT_PMU_COUNTER_STOP, hwc->idx, 1, flag, 0, 0, 0);
if (ret.error && (ret.error != SBI_ERR_ALREADY_STOPPED) &&
flag != SBI_PMU_STOP_FLAG_RESET)
......@@ -704,10 +760,13 @@ static int pmu_sbi_starting_cpu(unsigned int cpu, struct hlist_node *node)
struct cpu_hw_events *cpu_hw_evt = this_cpu_ptr(pmu->hw_events);
/*
* Enable the access for CYCLE, TIME, and INSTRET CSRs from userspace,
* as is necessary to maintain uABI compatibility.
* We keep enabling userspace access to CYCLE, TIME and INSTRET via the
* legacy option but that will be removed in the future.
*/
csr_write(CSR_SCOUNTEREN, 0x7);
if (sysctl_perf_user_access == SYSCTL_LEGACY)
csr_write(CSR_SCOUNTEREN, 0x7);
else
csr_write(CSR_SCOUNTEREN, 0x2);
/* Stop all the counters so that they can be enabled from perf */
pmu_sbi_stop_all(pmu);
......@@ -838,6 +897,121 @@ static void riscv_pmu_destroy(struct riscv_pmu *pmu)
cpuhp_state_remove_instance(CPUHP_AP_PERF_RISCV_STARTING, &pmu->node);
}
static void pmu_sbi_event_init(struct perf_event *event)
{
/*
* The permissions are set at event_init so that we do not depend
* on the sysctl value that can change.
*/
if (sysctl_perf_user_access == SYSCTL_NO_USER_ACCESS)
event->hw.flags |= PERF_EVENT_FLAG_NO_USER_ACCESS;
else if (sysctl_perf_user_access == SYSCTL_USER_ACCESS)
event->hw.flags |= PERF_EVENT_FLAG_USER_ACCESS;
else
event->hw.flags |= PERF_EVENT_FLAG_LEGACY;
}
static void pmu_sbi_event_mapped(struct perf_event *event, struct mm_struct *mm)
{
if (event->hw.flags & PERF_EVENT_FLAG_NO_USER_ACCESS)
return;
if (event->hw.flags & PERF_EVENT_FLAG_LEGACY) {
if (event->attr.config != PERF_COUNT_HW_CPU_CYCLES &&
event->attr.config != PERF_COUNT_HW_INSTRUCTIONS) {
return;
}
}
/*
* The user mmapped the event to directly access it: this is where
* we determine based on sysctl_perf_user_access if we grant userspace
* the direct access to this event. That means that within the same
* task, some events may be directly accessible and some others may not,
* if the user changes the value of sysctl_perf_user_access in the
* meantime.
*/
event->hw.flags |= PERF_EVENT_FLAG_USER_READ_CNT;
/*
* We must enable userspace access *before* advertising in the user page
* that it is possible to do so, to avoid any race.
* And we must notify all cpus here because threads that currently run
* on other cpus will try to directly access the counter too without
* calling pmu_sbi_ctr_start.
*/
if (event->hw.flags & PERF_EVENT_FLAG_USER_ACCESS)
on_each_cpu_mask(mm_cpumask(mm),
pmu_sbi_set_scounteren, (void *)event, 1);
}
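As a concrete illustration of what these mapped/unmapped hooks gate (a hedged sketch, not code from this series): once the sysctl permits it, a self-monitoring task can mmap the event and read the counter without a syscall, using the usual perf_event_mmap_page seqlock protocol. The sketch only trusts the fast path when the exported index is 1, i.e. when the event sits on the fixed cycle counter under the csr_index()+1 convention this driver appears to use; sign extension by pmc_width and the read(fd) fallback are omitted for brevity.

/* Hedged sketch: self-monitoring, syscall-free read of a cycles event. */
#include <stdio.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <linux/perf_event.h>

static long perf_event_open(struct perf_event_attr *attr, pid_t pid,
			    int cpu, int group_fd, unsigned long flags)
{
	return syscall(SYS_perf_event_open, attr, pid, cpu, group_fd, flags);
}

static inline uint64_t rdcycle(void)
{
	uint64_t v;

	__asm__ __volatile__("rdcycle %0" : "=r"(v));
	return v;
}

int main(void)
{
	struct perf_event_attr attr = {
		.type = PERF_TYPE_HARDWARE,
		.size = sizeof(attr),
		.config = PERF_COUNT_HW_CPU_CYCLES,
		.exclude_kernel = 1,
	};
	struct perf_event_mmap_page *pc;
	uint64_t count;
	uint32_t seq, idx;
	int fd;

	fd = perf_event_open(&attr, 0, -1, -1, 0);
	if (fd < 0)
		return 1;
	pc = mmap(NULL, sysconf(_SC_PAGESIZE), PROT_READ, MAP_SHARED, fd, 0);
	if (pc == MAP_FAILED)
		return 1;

	do {
		seq = pc->lock;
		__sync_synchronize();
		idx = pc->index;
		count = pc->offset;
		if (pc->cap_user_rdpmc && idx == 1)	/* cycle counter only in this sketch */
			count += rdcycle();
		__sync_synchronize();
	} while (pc->lock != seq);

	printf("cycles so far: %llu\n", (unsigned long long)count);
	close(fd);
	return 0;
}

If perf_user_access forbids it, or the event is never mmapped, cap_user_rdpmc stays clear and the reader has to fall back to read(fd, ...).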
static void pmu_sbi_event_unmapped(struct perf_event *event, struct mm_struct *mm)
{
if (event->hw.flags & PERF_EVENT_FLAG_NO_USER_ACCESS)
return;
if (event->hw.flags & PERF_EVENT_FLAG_LEGACY) {
if (event->attr.config != PERF_COUNT_HW_CPU_CYCLES &&
event->attr.config != PERF_COUNT_HW_INSTRUCTIONS) {
return;
}
}
/*
* Here we can remove user access directly, since the user no longer has
* access to the user page, so we avoid the racy window where the user
* could have seen cap_user_rdpmc set right before we disable it.
*/
event->hw.flags &= ~PERF_EVENT_FLAG_USER_READ_CNT;
if (event->hw.flags & PERF_EVENT_FLAG_USER_ACCESS)
on_each_cpu_mask(mm_cpumask(mm),
pmu_sbi_reset_scounteren, (void *)event, 1);
}
static void riscv_pmu_update_counter_access(void *info)
{
if (sysctl_perf_user_access == SYSCTL_LEGACY)
csr_write(CSR_SCOUNTEREN, 0x7);
else
csr_write(CSR_SCOUNTEREN, 0x2);
}
static int riscv_pmu_proc_user_access_handler(struct ctl_table *table,
int write, void *buffer,
size_t *lenp, loff_t *ppos)
{
int prev = sysctl_perf_user_access;
int ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
/*
* Test against the previous value since we clear SCOUNTEREN when
* sysctl_perf_user_access is set to SYSCTL_USER_ACCESS, but we should
* not do that if that was already the case.
*/
if (ret || !write || prev == sysctl_perf_user_access)
return ret;
on_each_cpu(riscv_pmu_update_counter_access, NULL, 1);
return 0;
}
static struct ctl_table sbi_pmu_sysctl_table[] = {
{
.procname = "perf_user_access",
.data = &sysctl_perf_user_access,
.maxlen = sizeof(unsigned int),
.mode = 0644,
.proc_handler = riscv_pmu_proc_user_access_handler,
.extra1 = SYSCTL_ZERO,
.extra2 = SYSCTL_TWO,
},
{ }
};
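The knob surfaces as /proc/sys/kernel/perf_user_access, clamped to the 0..2 range by SYSCTL_ZERO/SYSCTL_TWO. A minimal sketch of switching policy from userspace, assuming (as the event_init() mapping above suggests) that 0, 1 and 2 correspond to the no-access, user-access and legacy flags respectively:

#include <stdio.h>

/* Hedged sketch: write 0, 1 or 2 to pick the policy applied to new events. */
static int set_perf_user_access(int mode)
{
	FILE *f = fopen("/proc/sys/kernel/perf_user_access", "w");

	if (!f)
		return -1;
	fprintf(f, "%d\n", mode);
	return fclose(f);
}

Per riscv_pmu_proc_user_access_handler() above, a write that actually changes the value also rewrites SCOUNTEREN on every CPU; events that already exist keep the flags they were given at event_init().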
static int pmu_sbi_device_probe(struct platform_device *pdev)
{
struct riscv_pmu *pmu = NULL;
......@@ -881,6 +1055,10 @@ static int pmu_sbi_device_probe(struct platform_device *pdev)
pmu->ctr_get_width = pmu_sbi_ctr_get_width;
pmu->ctr_clear_idx = pmu_sbi_ctr_clear_idx;
pmu->ctr_read = pmu_sbi_ctr_read;
pmu->event_init = pmu_sbi_event_init;
pmu->event_mapped = pmu_sbi_event_mapped;
pmu->event_unmapped = pmu_sbi_event_unmapped;
pmu->csr_index = pmu_sbi_csr_index;
ret = cpuhp_state_add_instance(CPUHP_AP_PERF_RISCV_STARTING, &pmu->node);
if (ret)
......@@ -894,6 +1072,8 @@ static int pmu_sbi_device_probe(struct platform_device *pdev)
if (ret)
goto out_unregister;
register_sysctl("kernel", sbi_pmu_sysctl_table);
return 0;
out_unregister:
......@@ -907,7 +1087,7 @@ static int pmu_sbi_device_probe(struct platform_device *pdev)
static struct platform_driver pmu_sbi_driver = {
.probe = pmu_sbi_device_probe,
.driver = {
.name = RISCV_PMU_PDEV_NAME,
.name = RISCV_PMU_SBI_PDEV_NAME,
},
};
......@@ -934,7 +1114,7 @@ static int __init pmu_sbi_devinit(void)
if (ret)
return ret;
pdev = platform_device_register_simple(RISCV_PMU_PDEV_NAME, -1, NULL, 0);
pdev = platform_device_register_simple(RISCV_PMU_SBI_PDEV_NAME, -1, NULL, 0);
if (IS_ERR(pdev)) {
platform_driver_unregister(&pmu_sbi_driver);
return PTR_ERR(pdev);
......
......@@ -58,7 +58,7 @@ config ARCH_USE_GNU_PROPERTY
config BINFMT_ELF_FDPIC
bool "Kernel support for FDPIC ELF binaries"
default y if !BINFMT_ELF
depends on ARM || ((M68K || SUPERH || XTENSA) && !MMU)
depends on ARM || ((M68K || RISCV || SUPERH || XTENSA) && !MMU)
select ELFCORE
help
ELF FDPIC binaries are based on ELF, but allow the individual load
......
......@@ -138,7 +138,7 @@ static int is_constdisp(struct elfhdr *hdr)
static int elf_fdpic_fetch_phdrs(struct elf_fdpic_params *params,
struct file *file)
{
struct elf32_phdr *phdr;
struct elf_phdr *phdr;
unsigned long size;
int retval, loop;
loff_t pos = params->hdr.e_phoff;
......@@ -560,8 +560,8 @@ static int create_elf_fdpic_tables(struct linux_binprm *bprm,
sp &= ~7UL;
/* stack the load map(s) */
len = sizeof(struct elf32_fdpic_loadmap);
len += sizeof(struct elf32_fdpic_loadseg) * exec_params->loadmap->nsegs;
len = sizeof(struct elf_fdpic_loadmap);
len += sizeof(struct elf_fdpic_loadseg) * exec_params->loadmap->nsegs;
sp = (sp - len) & ~7UL;
exec_params->map_addr = sp;
......@@ -571,8 +571,8 @@ static int create_elf_fdpic_tables(struct linux_binprm *bprm,
current->mm->context.exec_fdpic_loadmap = (unsigned long) sp;
if (interp_params->loadmap) {
len = sizeof(struct elf32_fdpic_loadmap);
len += sizeof(struct elf32_fdpic_loadseg) *
len = sizeof(struct elf_fdpic_loadmap);
len += sizeof(struct elf_fdpic_loadseg) *
interp_params->loadmap->nsegs;
sp = (sp - len) & ~7UL;
interp_params->map_addr = sp;
......@@ -740,13 +740,13 @@ static int elf_fdpic_map_file(struct elf_fdpic_params *params,
struct mm_struct *mm,
const char *what)
{
struct elf32_fdpic_loadmap *loadmap;
struct elf_fdpic_loadmap *loadmap;
#ifdef CONFIG_MMU
struct elf32_fdpic_loadseg *mseg;
struct elf_fdpic_loadseg *mseg;
unsigned long load_addr;
#endif
struct elf32_fdpic_loadseg *seg;
struct elf32_phdr *phdr;
struct elf_fdpic_loadseg *seg;
struct elf_phdr *phdr;
unsigned nloads, tmp;
unsigned long stop;
int loop, ret;
......@@ -766,7 +766,7 @@ static int elf_fdpic_map_file(struct elf_fdpic_params *params,
params->loadmap = loadmap;
loadmap->version = ELF32_FDPIC_LOADMAP_VERSION;
loadmap->version = ELF_FDPIC_LOADMAP_VERSION;
loadmap->nsegs = nloads;
/* map the requested LOADs into the memory space */
......@@ -839,8 +839,8 @@ static int elf_fdpic_map_file(struct elf_fdpic_params *params,
if (phdr->p_vaddr >= seg->p_vaddr &&
phdr->p_vaddr + phdr->p_memsz <=
seg->p_vaddr + seg->p_memsz) {
Elf32_Dyn __user *dyn;
Elf32_Sword d_tag;
Elf_Dyn __user *dyn;
Elf_Sword d_tag;
params->dynamic_addr =
(phdr->p_vaddr - seg->p_vaddr) +
......@@ -850,11 +850,11 @@ static int elf_fdpic_map_file(struct elf_fdpic_params *params,
* one item, and that the last item is a NULL
* entry */
if (phdr->p_memsz == 0 ||
phdr->p_memsz % sizeof(Elf32_Dyn) != 0)
phdr->p_memsz % sizeof(Elf_Dyn) != 0)
goto dynamic_error;
tmp = phdr->p_memsz / sizeof(Elf32_Dyn);
dyn = (Elf32_Dyn __user *)params->dynamic_addr;
tmp = phdr->p_memsz / sizeof(Elf_Dyn);
dyn = (Elf_Dyn __user *)params->dynamic_addr;
if (get_user(d_tag, &dyn[tmp - 1].d_tag) ||
d_tag != 0)
goto dynamic_error;
......@@ -923,8 +923,8 @@ static int elf_fdpic_map_file_constdisp_on_uclinux(
struct file *file,
struct mm_struct *mm)
{
struct elf32_fdpic_loadseg *seg;
struct elf32_phdr *phdr;
struct elf_fdpic_loadseg *seg;
struct elf_phdr *phdr;
unsigned long load_addr, base = ULONG_MAX, top = 0, maddr = 0;
int loop, ret;
......@@ -1007,8 +1007,8 @@ static int elf_fdpic_map_file_by_direct_mmap(struct elf_fdpic_params *params,
struct file *file,
struct mm_struct *mm)
{
struct elf32_fdpic_loadseg *seg;
struct elf32_phdr *phdr;
struct elf_fdpic_loadseg *seg;
struct elf_phdr *phdr;
unsigned long load_addr, delta_vaddr;
int loop, dvset;
......
......@@ -80,9 +80,21 @@ static __always_inline bool should_resched(int preempt_offset)
#ifdef CONFIG_PREEMPTION
extern asmlinkage void preempt_schedule(void);
#define __preempt_schedule() preempt_schedule()
extern asmlinkage void preempt_schedule_notrace(void);
#if defined(CONFIG_PREEMPT_DYNAMIC) && defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
void dynamic_preempt_schedule(void);
void dynamic_preempt_schedule_notrace(void);
#define __preempt_schedule() dynamic_preempt_schedule()
#define __preempt_schedule_notrace() dynamic_preempt_schedule_notrace()
#else /* !CONFIG_PREEMPT_DYNAMIC || !CONFIG_HAVE_PREEMPT_DYNAMIC_KEY*/
#define __preempt_schedule() preempt_schedule()
#define __preempt_schedule_notrace() preempt_schedule_notrace()
#endif /* CONFIG_PREEMPT_DYNAMIC && CONFIG_HAVE_PREEMPT_DYNAMIC_KEY*/
#endif /* CONFIG_PREEMPTION */
#endif /* __ASM_PREEMPT_H */
......@@ -10,13 +10,25 @@
#include <uapi/linux/elf-fdpic.h>
#if ELF_CLASS == ELFCLASS32
#define Elf_Sword Elf32_Sword
#define elf_fdpic_loadseg elf32_fdpic_loadseg
#define elf_fdpic_loadmap elf32_fdpic_loadmap
#define ELF_FDPIC_LOADMAP_VERSION ELF32_FDPIC_LOADMAP_VERSION
#else
#define Elf_Sword Elf64_Sxword
#define elf_fdpic_loadmap elf64_fdpic_loadmap
#define elf_fdpic_loadseg elf64_fdpic_loadseg
#define ELF_FDPIC_LOADMAP_VERSION ELF64_FDPIC_LOADMAP_VERSION
#endif
/*
* binfmt binary parameters structure
*/
struct elf_fdpic_params {
struct elfhdr hdr; /* ref copy of ELF header */
struct elf_phdr *phdrs; /* ref copy of PT_PHDR table */
struct elf32_fdpic_loadmap *loadmap; /* loadmap to be passed to userspace */
struct elf_fdpic_loadmap *loadmap; /* loadmap to be passed to userspace */
unsigned long elfhdr_addr; /* mapped ELF header user address */
unsigned long ph_addr; /* mapped PT_PHDR user address */
unsigned long map_addr; /* mapped loadmap user address */
......
......@@ -6,8 +6,8 @@
*
*/
#ifndef _ASM_RISCV_PERF_EVENT_H
#define _ASM_RISCV_PERF_EVENT_H
#ifndef _RISCV_PMU_H
#define _RISCV_PMU_H
#include <linux/perf_event.h>
#include <linux/ptrace.h>
......@@ -21,7 +21,7 @@
#define RISCV_MAX_COUNTERS 64
#define RISCV_OP_UNSUPP (-EOPNOTSUPP)
#define RISCV_PMU_PDEV_NAME "riscv-pmu"
#define RISCV_PMU_SBI_PDEV_NAME "riscv-pmu-sbi"
#define RISCV_PMU_LEGACY_PDEV_NAME "riscv-pmu-legacy"
#define RISCV_PMU_STOP_FLAG_RESET 1
......@@ -55,6 +55,10 @@ struct riscv_pmu {
void (*ctr_start)(struct perf_event *event, u64 init_val);
void (*ctr_stop)(struct perf_event *event, unsigned long flag);
int (*event_map)(struct perf_event *event, u64 *config);
void (*event_init)(struct perf_event *event);
void (*event_mapped)(struct perf_event *event, struct mm_struct *mm);
void (*event_unmapped)(struct perf_event *event, struct mm_struct *mm);
uint8_t (*csr_index)(struct perf_event *event);
struct cpu_hw_events __percpu *hw_events;
struct hlist_node node;
......@@ -81,4 +85,4 @@ int riscv_pmu_get_hpm_info(u32 *hw_ctr_width, u32 *num_hw_ctr);
#endif /* CONFIG_RISCV_PMU */
#endif /* _ASM_RISCV_PERF_EVENT_H */
#endif /* _RISCV_PMU_H */
......@@ -444,7 +444,8 @@ struct pmu {
/*
* Will return the value for perf_event_mmap_page::index for this event,
* if no implementation is provided it will default to: event->hw.idx + 1.
* if no implementation is provided it will default to 0 (see
* perf_event_idx_default).
*/
int (*event_idx) (struct perf_event *event); /*optional */
......
......@@ -32,4 +32,19 @@ struct elf32_fdpic_loadmap {
#define ELF32_FDPIC_LOADMAP_VERSION 0x0000
/* segment mappings for ELF FDPIC libraries/executables/interpreters */
struct elf64_fdpic_loadseg {
Elf64_Addr addr; /* core address to which mapped */
Elf64_Addr p_vaddr; /* VMA recorded in file */
Elf64_Word p_memsz; /* allocation size recorded in file */
};
struct elf64_fdpic_loadmap {
Elf64_Half version; /* version of these structures, just in case... */
Elf64_Half nsegs; /* number of segments */
struct elf64_fdpic_loadseg segs[];
};
#define ELF64_FDPIC_LOADMAP_VERSION 0x0000
#endif /* _UAPI_LINUX_ELF_FDPIC_H */
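To make the new 64-bit structures concrete (a hypothetical userspace sketch, not kernel code): an FDPIC-aware loader or debugger handed such a loadmap can translate a link-time VMA into its runtime address by walking the segments, using the field meanings from the comments above. The helper name is mine.

#include <stdint.h>
#include <linux/elf-fdpic.h>

static void *fdpic_vaddr_to_addr(const struct elf64_fdpic_loadmap *map,
				 uint64_t vaddr)
{
	for (uint16_t i = 0; i < map->nsegs; i++) {
		const struct elf64_fdpic_loadseg *seg = &map->segs[i];

		if (vaddr >= seg->p_vaddr && vaddr < seg->p_vaddr + seg->p_memsz)
			return (void *)(uintptr_t)(seg->addr + (vaddr - seg->p_vaddr));
	}
	return NULL;	/* not covered by any mapped segment */
}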
......@@ -355,6 +355,11 @@ endchoice # "Compressed Debug information"
config DEBUG_INFO_SPLIT
bool "Produce split debuginfo in .dwo files"
depends on $(cc-option,-gsplit-dwarf)
# RISC-V linker relaxation + -gsplit-dwarf has issues with LLVM, and with
# GCC prior to 12.x:
# https://github.com/llvm/llvm-project/issues/56642
# https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99090
depends on !RISCV || GCC_VERSION >= 120000
help
Generate debug info into separate .dwo files. This significantly
reduces the build directory size for builds with DEBUG_INFO,
......@@ -502,7 +507,7 @@ config SECTION_MISMATCH_WARN_ONLY
config DEBUG_FORCE_FUNCTION_ALIGN_64B
bool "Force all function address 64B aligned"
depends on EXPERT && (X86_64 || ARM64 || PPC32 || PPC64 || ARC || S390)
depends on EXPERT && (X86_64 || ARM64 || PPC32 || PPC64 || ARC || RISCV || S390)
select FUNCTION_ALIGNMENT_64B
help
There are cases that a commit from one domain changes the function
......
......@@ -392,6 +392,72 @@ static u64 read_perf_counter(unsigned int counter)
static u64 read_timestamp(void) { return read_sysreg(cntvct_el0); }
/* __riscv_xlen contains the width of the native base integer, here 64-bit */
#elif defined(__riscv) && __riscv_xlen == 64
/* TODO: implement rv32 support */
#define CSR_CYCLE 0xc00
#define CSR_TIME 0xc01
#define csr_read(csr) \
({ \
register unsigned long __v; \
__asm__ __volatile__ ("csrr %0, %1" \
: "=r" (__v) \
: "i" (csr) : ); \
__v; \
})
static unsigned long csr_read_num(int csr_num)
{
#define switchcase_csr_read(__csr_num, __val) {\
case __csr_num: \
__val = csr_read(__csr_num); \
break; }
#define switchcase_csr_read_2(__csr_num, __val) {\
switchcase_csr_read(__csr_num + 0, __val) \
switchcase_csr_read(__csr_num + 1, __val)}
#define switchcase_csr_read_4(__csr_num, __val) {\
switchcase_csr_read_2(__csr_num + 0, __val) \
switchcase_csr_read_2(__csr_num + 2, __val)}
#define switchcase_csr_read_8(__csr_num, __val) {\
switchcase_csr_read_4(__csr_num + 0, __val) \
switchcase_csr_read_4(__csr_num + 4, __val)}
#define switchcase_csr_read_16(__csr_num, __val) {\
switchcase_csr_read_8(__csr_num + 0, __val) \
switchcase_csr_read_8(__csr_num + 8, __val)}
#define switchcase_csr_read_32(__csr_num, __val) {\
switchcase_csr_read_16(__csr_num + 0, __val) \
switchcase_csr_read_16(__csr_num + 16, __val)}
unsigned long ret = 0;
switch (csr_num) {
switchcase_csr_read_32(CSR_CYCLE, ret)
default:
break;
}
return ret;
#undef switchcase_csr_read_32
#undef switchcase_csr_read_16
#undef switchcase_csr_read_8
#undef switchcase_csr_read_4
#undef switchcase_csr_read_2
#undef switchcase_csr_read
}
static u64 read_perf_counter(unsigned int counter)
{
return csr_read_num(CSR_CYCLE + counter);
}
static u64 read_timestamp(void)
{
return csr_read_num(CSR_TIME);
}
#else
static u64 read_perf_counter(unsigned int counter __maybe_unused) { return 0; }
static u64 read_timestamp(void) { return 0; }
......
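The switchcase_csr_read_*() pyramid above exists because csrr encodes its CSR number as an immediate (the "i" constraint in csr_read()), so a counter index known only at run time has to be unrolled into one case per possible CSR. Roughly, after macro expansion csr_read_num() behaves like the sketch below (illustrative, not literal preprocessor output):

static unsigned long csr_read_num_expanded(int csr_num)
{
	unsigned long ret = 0;

	switch (csr_num) {
	case 0xc00: ret = csr_read(0xc00); break;	/* cycle   */
	case 0xc01: ret = csr_read(0xc01); break;	/* time    */
	case 0xc02: ret = csr_read(0xc02); break;	/* instret */
	/* ...one case per CSR up to 0xc1f (hpmcounter31)... */
	default:
		break;
	}
	return ret;
}

As I read the rest of this file (not shown in the hunk), the generic self-read loop then calls read_perf_counter(perf_event_mmap_page::index - 1), so the index the kernel exports is understood to be the CSR offset from CSR_CYCLE plus one.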
......@@ -284,7 +284,8 @@ static struct test_case tests__basic_mmap[] = {
"permissions"),
TEST_CASE_REASON("User space counter reading of instructions",
mmap_user_read_instr,
#if defined(__i386__) || defined(__x86_64__) || defined(__aarch64__)
#if defined(__i386__) || defined(__x86_64__) || defined(__aarch64__) || \
(defined(__riscv) && __riscv_xlen == 64)
"permissions"
#else
"unsupported"
......@@ -292,7 +293,8 @@ static struct test_case tests__basic_mmap[] = {
),
TEST_CASE_REASON("User space counter reading of cycles",
mmap_user_read_cycles,
#if defined(__i386__) || defined(__x86_64__) || defined(__aarch64__)
#if defined(__i386__) || defined(__x86_64__) || defined(__aarch64__) || \
(defined(__riscv) && __riscv_xlen == 64)
"permissions"
#else
"unsupported"
......
......@@ -5,7 +5,7 @@
ARCH ?= $(shell uname -m 2>/dev/null || echo not)
ifneq (,$(filter $(ARCH),riscv))
RISCV_SUBTARGETS ?= hwprobe vector
RISCV_SUBTARGETS ?= hwprobe vector mm
else
RISCV_SUBTARGETS :=
endif
......
# SPDX-License-Identifier: GPL-2.0
# Copyright (C) 2021 ARM Limited
# Originally tools/testing/arm64/abi/Makefile
# Additional include paths needed by kselftest.h and local headers
CFLAGS += -D_GNU_SOURCE -std=gnu99 -I.
TEST_GEN_FILES := testcases/mmap_default testcases/mmap_bottomup
TEST_PROGS := testcases/run_mmap.sh
include ../../lib.mk
$(OUTPUT)/mm: testcases/mmap_default.c testcases/mmap_bottomup.c testcases/mmap_tests.h
$(CC) -o$@ $(CFLAGS) $(LDFLAGS) $^
// SPDX-License-Identifier: GPL-2.0-only
#include <sys/mman.h>
#include <testcases/mmap_test.h>
#include "../../kselftest_harness.h"
TEST(infinite_rlimit)
{
// Only works on 64-bit
#if __riscv_xlen == 64
struct addresses mmap_addresses;
EXPECT_EQ(BOTTOM_UP, memory_layout());
do_mmaps(&mmap_addresses);
EXPECT_NE(MAP_FAILED, mmap_addresses.no_hint);
EXPECT_NE(MAP_FAILED, mmap_addresses.on_37_addr);
EXPECT_NE(MAP_FAILED, mmap_addresses.on_38_addr);
EXPECT_NE(MAP_FAILED, mmap_addresses.on_46_addr);
EXPECT_NE(MAP_FAILED, mmap_addresses.on_47_addr);
EXPECT_NE(MAP_FAILED, mmap_addresses.on_55_addr);
EXPECT_NE(MAP_FAILED, mmap_addresses.on_56_addr);
EXPECT_GT(1UL << 47, (unsigned long)mmap_addresses.no_hint);
EXPECT_GT(1UL << 38, (unsigned long)mmap_addresses.on_37_addr);
EXPECT_GT(1UL << 38, (unsigned long)mmap_addresses.on_38_addr);
EXPECT_GT(1UL << 38, (unsigned long)mmap_addresses.on_46_addr);
EXPECT_GT(1UL << 47, (unsigned long)mmap_addresses.on_47_addr);
EXPECT_GT(1UL << 47, (unsigned long)mmap_addresses.on_55_addr);
EXPECT_GT(1UL << 56, (unsigned long)mmap_addresses.on_56_addr);
#endif
}
TEST_HARNESS_MAIN
// SPDX-License-Identifier: GPL-2.0-only
#include <sys/mman.h>
#include <testcases/mmap_test.h>
#include "../../kselftest_harness.h"
TEST(default_rlimit)
{
// Only works on 64-bit
#if __riscv_xlen == 64
struct addresses mmap_addresses;
EXPECT_EQ(TOP_DOWN, memory_layout());
do_mmaps(&mmap_addresses);
EXPECT_NE(MAP_FAILED, mmap_addresses.no_hint);
EXPECT_NE(MAP_FAILED, mmap_addresses.on_37_addr);
EXPECT_NE(MAP_FAILED, mmap_addresses.on_38_addr);
EXPECT_NE(MAP_FAILED, mmap_addresses.on_46_addr);
EXPECT_NE(MAP_FAILED, mmap_addresses.on_47_addr);
EXPECT_NE(MAP_FAILED, mmap_addresses.on_55_addr);
EXPECT_NE(MAP_FAILED, mmap_addresses.on_56_addr);
EXPECT_GT(1UL << 47, (unsigned long)mmap_addresses.no_hint);
EXPECT_GT(1UL << 38, (unsigned long)mmap_addresses.on_37_addr);
EXPECT_GT(1UL << 38, (unsigned long)mmap_addresses.on_38_addr);
EXPECT_GT(1UL << 38, (unsigned long)mmap_addresses.on_46_addr);
EXPECT_GT(1UL << 47, (unsigned long)mmap_addresses.on_47_addr);
EXPECT_GT(1UL << 47, (unsigned long)mmap_addresses.on_55_addr);
EXPECT_GT(1UL << 56, (unsigned long)mmap_addresses.on_56_addr);
#endif
}
TEST_HARNESS_MAIN
/* SPDX-License-Identifier: GPL-2.0-only */
#ifndef _TESTCASES_MMAP_TEST_H
#define _TESTCASES_MMAP_TEST_H
#include <sys/mman.h>
#include <sys/resource.h>
#include <stddef.h>
#define TOP_DOWN 0
#define BOTTOM_UP 1
struct addresses {
int *no_hint;
int *on_37_addr;
int *on_38_addr;
int *on_46_addr;
int *on_47_addr;
int *on_55_addr;
int *on_56_addr;
};
static inline void do_mmaps(struct addresses *mmap_addresses)
{
/*
* Place all of the hint addresses on the boundaries of the mmap regions
* for sv39, sv48 and sv57; user addresses end at 1 << 38, 1 << 47 and
* 1 << 56 respectively.
*/
void *on_37_bits = (void *)(1UL << 37);
void *on_38_bits = (void *)(1UL << 38);
void *on_46_bits = (void *)(1UL << 46);
void *on_47_bits = (void *)(1UL << 47);
void *on_55_bits = (void *)(1UL << 55);
void *on_56_bits = (void *)(1UL << 56);
int prot = PROT_READ | PROT_WRITE;
int flags = MAP_PRIVATE | MAP_ANONYMOUS;
mmap_addresses->no_hint =
mmap(NULL, 5 * sizeof(int), prot, flags, 0, 0);
mmap_addresses->on_37_addr =
mmap(on_37_bits, 5 * sizeof(int), prot, flags, 0, 0);
mmap_addresses->on_38_addr =
mmap(on_38_bits, 5 * sizeof(int), prot, flags, 0, 0);
mmap_addresses->on_46_addr =
mmap(on_46_bits, 5 * sizeof(int), prot, flags, 0, 0);
mmap_addresses->on_47_addr =
mmap(on_47_bits, 5 * sizeof(int), prot, flags, 0, 0);
mmap_addresses->on_55_addr =
mmap(on_55_bits, 5 * sizeof(int), prot, flags, 0, 0);
mmap_addresses->on_56_addr =
mmap(on_56_bits, 5 * sizeof(int), prot, flags, 0, 0);
}
static inline int memory_layout(void)
{
int prot = PROT_READ | PROT_WRITE;
int flags = MAP_PRIVATE | MAP_ANONYMOUS;
void *value1 = mmap(NULL, sizeof(int), prot, flags, 0, 0);
void *value2 = mmap(NULL, sizeof(int), prot, flags, 0, 0);
return value2 > value1;
}
#endif /* _TESTCASES_MMAP_TEST_H */
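Taken together, the assertions in mmap_default and mmap_bottomup encode the user-visible rule (as I read them): with no hint the mapping stays below 1UL << 47, a hint below 1UL << 47 confines it below 1UL << 38, a hint of at least 1UL << 47 but below 1UL << 56 allows the range below 1UL << 47, and only a hint of at least 1UL << 56 opens the full sv57 range. A small hypothetical example of opting in from an application (assumes sv57-capable hardware; on smaller systems the kernel simply returns an address from the range it does support):

#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
	size_t len = 1UL << 20;
	int prot = PROT_READ | PROT_WRITE;
	int flags = MAP_PRIVATE | MAP_ANONYMOUS;

	/* No hint: expected to land below 1UL << 47, as in mmap_default above. */
	void *lo = mmap(NULL, len, prot, flags, -1, 0);
	/* Hint at 1UL << 56: may land anywhere below 1UL << 56 on sv57 hardware. */
	void *hi = mmap((void *)(1UL << 56), len, prot, flags, -1, 0);

	printf("no hint: %p, high hint: %p\n", lo, hi);
	return 0;
}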
#!/bin/sh
# SPDX-License-Identifier: GPL-2.0
original_stack_limit=$(ulimit -s)
./mmap_default
# Force mmap_bottomup to be run with a bottom-up memory layout due to
# the unlimited stack
ulimit -s unlimited
./mmap_bottomup
ulimit -s $original_stack_limit