- 09 Dec, 2021 27 commits
-
-
Christophe Leroy authored
This adds KUAP support to 85xx in 32 bits mode. This is done by reading the content of SPRN_MAS1 and checking the TID at the time user pgtable is loaded. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/f8696f8980ca1532ada3a2f0e0a03e756269c7fe.1634627931.git.christophe.leroy@csgroup.eu
-
Christophe Leroy authored
This adds KUAP support to 40x. This is done by checking the content of SPRN_PID at the time user pgtable is loaded. 40x doesn't have KUEP, but KUAP implies KUEP because when the PID doesn't match the page's PID, the page cannot be read nor executed. So KUEP is now automatically selected when KUAP is selected and disabled when KUAP is disabled. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/aaefa91897ddc42ac11019dc0e1d1a525bd08e90.1634627931.git.christophe.leroy@csgroup.eu
-
Christophe Leroy authored
This adds KUAP support to 44x. This is done by checking the content of SPRN_PID at the time it is read and written into SPRN_MMUCR. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/7d6c3f1978a26feada74b084f651e8cf1e3b3a47.1634627931.git.christophe.leroy@csgroup.eu
-
Christophe Leroy authored
On booke/40x we don't have segments like book3s/32. On booke/40x we don't have access protection groups like 8xx. Use the PID register to provide user access protection. Kernel address space can be accessed with any PID. User address space has to be accessed with the PID of the user. User PID is always not null. Everytime the kernel is entered, set PID register to 0 and restore PID register when returning to user. Everytime kernel needs to access user data, PID is restored for the access. In TLB miss handlers, check the PID and bail out to data storage exception when PID is 0 and accessed address is in user space. Note that also forbids execution of user text by kernel except when user access is unlocked. But this shouldn't be a problem as the kernel is not supposed to ever run user text. This patch prepares the infrastructure but the real activation of KUAP is done by following patches for each processor type one by one. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/5d65576a8e31e9480415785a180c92dd4e72306d.1634627931.git.christophe.leroy@csgroup.eu
-
Christophe Leroy authored
PPC_KUAP_DEBUG is supported by all platforms doing PPC_KUAP, it doesn't depend on Radix on book3s/64. This will avoid adding one more dependency when implementing KUAP on book3e/64. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/a5ff6228a36e51783b83d8c10d058db76e450f63.1634627931.git.christophe.leroy@csgroup.eu
-
Christophe Leroy authored
Also call kuap_lock() and kuap_save_and_lock() from interrupt functions with CONFIG_PPC64. For book3s/64 we keep them empty as it is done in assembly. Also do the locked assert when switching task unless it is book3s/64. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1cbf94e26e6d6e2e028fd687588a7e6622d454a6.1634627931.git.christophe.leroy@csgroup.eu
-
Christophe Leroy authored
We have many functionnalities common to 40x and BOOKE, it leads to many places with #if defined(CONFIG_BOOKE) || defined(CONFIG_40x). We are going to add a few more with KUAP for booke/40x, so create a new symbol which is defined when either BOOKE or 40x is defined. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/9a3dbd60924cb25c9f944d3d8205ac5a0d15e229.1634627931.git.christophe.leroy@csgroup.eu
-
Christophe Leroy authored
In order to reuse it on booke/4xx, move KUAP setup routine out of 8xx.c Make them usable on SMP by removing the __init tag as it is called for each CPU. And use __prevent_user_access() instead of hard coding initial lock. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/ae35eec3426509efc2b8ae69586c822e2fe2642a.1634627931.git.christophe.leroy@csgroup.eu
-
Christophe Leroy authored
Add kuap_lock() and call it when entering interrupts from user. It is called kuap_lock() as it is similar to kuap_save_and_lock() without the save. However book3s/32 already have a kuap_lock(). Rename it kuap_lock_addr(). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/4437e2deb9f6f549f7089d45e9c6f96a7e77905a.1634627931.git.christophe.leroy@csgroup.eu
-
Christophe Leroy authored
__kuap_assert_locked() is redundant with __kuap_get_and_assert_locked(). Move the verification of CONFIG_PPC_KUAP_DEBUG in kuap_assert_locked() and make it call __kuap_get_and_assert_locked() directly. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1a60198a25d2ba38a37f1b92bc7d096435df4224.1634627931.git.christophe.leroy@csgroup.eu
-
Christophe Leroy authored
Today, every platform checks that KUAP is not de-activated before doing the real job. Move the verification out of platform specific functions. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/894f110397fcd248e125fb855d1e863e4e633a0d.1634627931.git.christophe.leroy@csgroup.eu
-
Christophe Leroy authored
Make the following functions generic to all platforms. - bad_kuap_fault() - kuap_assert_locked() - kuap_save_and_lock() (PPC32 only) - kuap_kernel_restore() - kuap_get_and_assert_locked() And for all platforms except book3s/64 - allow_user_access() - prevent_user_access() - prevent_user_access_return() - restore_user_access() Prepend __ in front of the name of platform specific ones. For now the generic just calls the platform specific, but next patch will move redundant parts of specific functions into the generic one. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/eaef143a8dae7288cd34565ffa7b49c16aee1ec3.1634627931.git.christophe.leroy@csgroup.eu
-
Christophe Leroy authored
Deactivating KUEP at boot time is unrelevant for PPC32 and BOOK3E/64. Remove it. It allows to refactor setup_kuep() via a __weak function that only PPC64s will overide for now. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> [mpe: Fix CONFIG_PPC_BOOKS_64 -> CONFIG_PPC_BOOK3S_64 typo] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/4c36df18b41c988c4512f45d96220486adbe4c99.1634627931.git.christophe.leroy@csgroup.eu
-
Christophe Leroy authored
Calling 'mfsr' to get the content of segment registers is heavy, in addition it requires clearing of the 'reserved' bits. In order to avoid this operation, save it in mm context and in thread struct. The saved sr0 is the one used by kernel, this means that on locking entry it can be used as is. For unlocking, the only thing to do is to clear SR_NX. This improves null_syscall selftest by 12 cycles, ie 4%. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/b02baf2ed8f09bad910dfaeeb7353b2ae6830525.1634627931.git.christophe.leroy@csgroup.eu
-
Christophe Leroy authored
When interrupt and syscall entries where converted to C, KUEP locking and unlocking was also converted. It improved performance by unrolling the loop, and allowed easily implementing boot time deactivation of KUEP. However, null_syscall selftest shows that KUEP is still heavy (361 cycles with KUEP, 212 cycles without). A way to improve more is to group 'mtsr's together, instead of repeating 'addi' + 'mtsr' several times. In order to do that, more registers need to be available. In C, GCC will always be able to provide the requested number of registers, but at the cost of saving some data on the stack, which is counter performant here. So let's do it in assembly, when we have full control of which register can be used. It also has the advantage of locking earlier and unlocking later and it helps GCC generating less tricky code. The only drawback is to make boot time deactivation less straight forward and require 'hand' instruction patching. Group 'mtsr's by 4. With this change, null_syscall selftest reports 336 cycles. Without the change it was 361 cycles, that's a 7% reduction. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/115cb279e9b9948dfd93a065e047081c59e3a2a6.1634627931.git.christophe.leroy@csgroup.eu
-
Christophe Leroy authored
Disabling KUEP at boottime makes things unnecessarily complex. Still allow disabling KUEP at build time, but when it's built-in it is always there. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/96f583f82423a29a4205c60b9721079111b35567.1634627931.git.christophe.leroy@csgroup.eu
-
Christophe Leroy authored
On book3e, - When using 64 bits PTE: User pages don't have the SX bit defined so KUEP is always active. - When using 32 bits PTE: Implement KUEP by clearing SX bit during TLB miss for user pages. The impact is minimal and worth neither boot time nor build time selection. Activate it at all time. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/e376b114283fb94504e2aa2de846780063252cde.1634627931.git.christophe.leroy@csgroup.eu
-
Christophe Leroy authored
On 44x, KUEP is implemented by clearing SX bit during TLB miss for user pages. The impact is minimal and not worth neither boot time nor build time selection. Activate it at all time. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/2414d662558e7fb27d1ed41c8e47c591d576acac.1634627931.git.christophe.leroy@csgroup.eu
-
Christophe Leroy authored
On the 8xx, there is absolutely no runtime impact with KUEP. Protection against execution of user code in kernel mode is set up at boot time by configuring the groups with contain all user pages as having swapped protection rights, in extenso EX for user and NA for supervisor. Configure KUEP at startup and force selection of CONFIG_PPC_KUEP. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/2129e86944323ffe9ed07fffbeafdfd2e363690a.1634627931.git.christophe.leroy@csgroup.eu
-
Christophe Leroy authored
This reverts commit 1791ebd1. setup_kup() was inlined to manage conflict between PPC32 marking setup_{kuap/kuep}() __init and PPC64 not marking them __init. But in fact PPC32 has removed the __init mark for all but 8xx in order to properly handle SMP. In order to make setup_kup() grow a bit, revert the commit mentioned above but remove __init for 8xx as well so that we don't have to mark setup_kup() as __ref. Also switch the order so that KUAP is initialised before KUEP because on the 40x, KUEP will depend on the activation of KUAP. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/7691088fd0994ee3c8db6298dc8c00259e3f6a7f.1634627931.git.christophe.leroy@csgroup.eu
-
Christophe Leroy authored
As reported by Carlo, 16Mbytes is not enough with modern kernels that tend to be a bit big, so map another 16M page at boot. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/89b5f974a7fa5011206682cd092e2c905530ff46.1632755552.git.christophe.leroy@csgroup.eu
-
Nicholas Piggin authored
Microwatt implements a subset of ISA v3.0 (which is equivalent to the POWER9_CPU option). It is radix-only, so does not require hash MMU support. This saves 20kB compressed dtbImage and 56kB vmlinux size. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211201144153.2456614-19-npiggin@gmail.com
-
Nicholas Piggin authored
Compiling out hash support code when CONFIG_PPC_64S_HASH_MMU=n saves 128kB kernel image size (90kB text) on powernv_defconfig minus KVM, 350kB on pseries_defconfig minus KVM, 40kB on a tiny config. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Fixup defined(ARCH_HAS_MEMREMAP_COMPAT_ALIGN), which needs CONFIG. Fix radix_enabled() use in setup_initial_memory_limit(). Add some stubs to reduce number of ifdefs.] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211201144153.2456614-18-npiggin@gmail.com
-
Nicholas Piggin authored
This adds Kconfig selection which allows 64s hash MMU support to be disabled. It can be disabled if radix support is enabled, the minimum supported CPU type is POWER9 (or higher), and KVM is not selected. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211201144153.2456614-17-npiggin@gmail.com
-
Nicholas Piggin authored
To avoid any functional changes to radix paths when building with hash MMU support disabled (and CONFIG_PPC_MM_SLICES=n), always define the arch get_unmapped_area calls on 64s platforms. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211201144153.2456614-16-npiggin@gmail.com
-
Nicholas Piggin authored
There are a few places that require MMU_FTR_HPTE_TABLE to be set even when running in radix mode. Fix those up. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211201144153.2456614-15-npiggin@gmail.com
-
Nicholas Piggin authored
mmu_linear_psize is only set at boot once on 64e, is not necessarily the correct size of the linear map pages, and is never used anywhere. Remove it. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Retain the extern, so we can use IS_ENABLED() for related code] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211201144153.2456614-14-npiggin@gmail.com
-
- 02 Dec, 2021 12 commits
-
-
Nicholas Piggin authored
memremap_compat_align is only relevant when ZONE_DEVICE is selected. ZONE_DEVICE depends on ARCH_HAS_PTE_DEVMAP, which is only selected by PPC_BOOK3S_64. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211201144153.2456614-13-npiggin@gmail.com
-
Nicholas Piggin authored
Radix never sets mmu_linear_psize so it's always 4K, which causes pcpu atom_size to always be PAGE_SIZE. 64e sets it to 1GB always. Make paths for these platforms to be explicit about what value they set atom_size to. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211201144153.2456614-12-npiggin@gmail.com
-
Nicholas Piggin authored
This file contains functions and data common to radix, so rename it to remove the hash_ prefix. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211201144153.2456614-11-npiggin@gmail.com
-
Nicholas Piggin authored
The radix code uses some of the psize variables. Move the common ones from hash_utils.c to pgtable.c. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211201144153.2456614-10-npiggin@gmail.com
-
Nicholas Piggin authored
The radix test can exclude slb_flush_all_realmode() from being called because flush_and_reload_slb() is only expected to flush ERAT when called by flush_erat(), which is only on pre-ISA v3.0 CPUs that do not support radix. This helps the later change to make hash support configurable to not introduce runtime changes to radix mode behaviour. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211201144153.2456614-9-npiggin@gmail.com
-
Nicholas Piggin authored
In preparation for making hash MMU support configurable, move THP trace point function definitions out of an otherwise hash-specific file. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211201144153.2456614-8-npiggin@gmail.com
-
Nicholas Piggin authored
This avoids a change in behaviour in the later patch making hash support configurable. This is possibly a user interface change, so the alternative would be a hard-coded slb_size=0 here. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211201144153.2456614-7-npiggin@gmail.com
-
Nicholas Piggin authored
This reduces ifdefs in a later change which makes hash support configurable. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211201144153.2456614-6-npiggin@gmail.com
-
Nicholas Piggin authored
slb.c is hash-specific SLB management, but do_bad_slb_fault deals with segment interrupts that occur with radix MMU as well. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211201144153.2456614-5-npiggin@gmail.com
-
Nicholas Piggin authored
The pseries platform does not use the native hash code but the PAPR virtualised hash interfaces, so remove PPC_HASH_MMU_NATIVE. This requires moving tlbiel code from hash_native.c to hash_utils.c. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211201144153.2456614-4-npiggin@gmail.com
-
Nicholas Piggin authored
PPC_NATIVE now only controls the native HPT code, so rename it to be more descriptive. Restrict it to Book3S only. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211201144153.2456614-3-npiggin@gmail.com
-
Nicholas Piggin authored
FW_FEATURE_NATIVE_ALWAYS and FW_FEATURE_NATIVE_POSSIBLE are always zero and never do anything. Remove them. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211201144153.2456614-2-npiggin@gmail.com
-
- 01 Dec, 2021 1 commit
-
-
Cédric Le Goater authored
The automatic "save & restore" of interrupt context is a POWER10/XIVE2 feature exploited by KVM under the PowerNV platform. It is not available under pSeries and the associated toggle should not be exposed under the XIVE debugfs directory. Introduce a platform handler for debugfs initialization and move the 'save-restore' entry under the native (PowerNV) backend to fix compile when !CONFIG_PPC_POWERNV. Fixes: 1e7684dc ("powerpc/xive: Add a debugfs toggle for save-restore") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211201165418.1041842-1-clg@kaod.org
-