- 30 Nov, 2016 7 commits
-
-
Thiago Jung Bauermann authored
This purgatory implementation is based on the versions from kexec-tools and kexec-lite, with additional changes. Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Thiago Jung Bauermann authored
This patch adds the support code needed for implementing kexec_file_load() on powerpc. This consists of functions to load the ELF kernel, either big or little endian, and setup the purgatory enviroment which switches from the first kernel to the second kernel. None of this code is built yet, as it depends on CONFIG_KEXEC_FILE which we have not yet defined. Although we could define CONFIG_KEXEC_FILE in this patch, we'd then have a window in history where the kconfig symbol is present but the syscall is not, which would be awkward. Signed-off-by: Josh Sklar <sklar@linux.vnet.ibm.com> Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Thiago Jung Bauermann authored
Commit 2965faa5 ("kexec: split kexec_load syscall from kexec core code") introduced CONFIG_KEXEC_CORE so that CONFIG_KEXEC means whether the kexec_load system call should be compiled-in and CONFIG_KEXEC_FILE means whether the kexec_file_load system call should be compiled-in. These options can be set independently from each other. Since until now powerpc only supported kexec_load, CONFIG_KEXEC and CONFIG_KEXEC_CORE were synonyms. That is not the case anymore, so we need to make a distinction. Almost all places where CONFIG_KEXEC was being used should be using CONFIG_KEXEC_CORE instead, since kexec_file_load also needs that code compiled in. Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Thiago Jung Bauermann authored
kexec_locate_mem_hole will be used by the PowerPC kexec_file_load implementation to find free memory for the purgatory stack. Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Acked-by: Dave Young <dyoung@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Thiago Jung Bauermann authored
This is done to simplify the kexec_add_buffer argument list. Adapt all callers to set up a kexec_buf to pass to kexec_add_buffer. In addition, change the type of kexec_buf.buffer from char * to void *. There is no particular reason for it to be a char *, and the change allows us to get rid of 3 existing casts to char * in the code. Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Acked-by: Dave Young <dyoung@redhat.com> Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Thiago Jung Bauermann authored
Allow architectures to specify a different memory walking function for kexec_add_buffer. x86 uses iomem to track reserved memory ranges, but PowerPC uses the memblock subsystem. Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Acked-by: Dave Young <dyoung@redhat.com> Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Balbir Singh authored
Aneesh/Ben reported that the change to do_page_fault() we made in commit 1d18ad02 ("powerpc/mm: Detect instruction fetch denied and report") needs to handle the case where CPU_FTR_COHERENT_ICACHE is missing but we have CPU_FTR_NOEXECUTE. In those cases the check added for SRR1_ISI_N_OR_G might trigger a false positive. This patch adds a check for CPU_FTR_COHERENT_ICACHE in addition to the MSR value. Fixes: 1d18ad02 ("powerpc/mm: Detect instruction fetch denied and report") Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 29 Nov, 2016 3 commits
-
-
Michael Ellerman authored
Now that we don't set ARCH incorrectly when calling the boot Makefile, we can use the generic cpp_lds_S rule for converting our zImage.lds.S into zImage.lds. The main advantage of using the generic rule is that it correctly uses if_changed, which means we correctly regenerate the linker script when switching endian. Fixing that means we are finally able to build one endian and then rebuild the other endian without requiring to clean between builds. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Michael Ellerman authored
If we're using if_changed then we must depend on FORCE, so that if_changed gets a chance to check if something changed. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Michael Ellerman authored
Back in 2005 when the ppc/ppc64 merge started, we used to build the kernel code in arch/powerpc but use the boot code from arch/ppc or arch/ppc64 depending on whether we were building for 32 or 64-bit. Originally we called the boot Makefile passing ARCH=$(OLDARCH), where OLDARCH was ppc or ppc64. In commit 20f62954 ("powerpc: Make building the boot image work for both 32-bit and 64-bit") (2005-10-11) we split the call for 32/64-bit using an ifeq check, because the two Makefiles took different targets, and explicitly passed ARCH=ppc64 for the 64-bit case and ARCH=ppc for the 32-bit case. Then in commit 94b212c2 ("powerpc: Move ppc64 boot wrapper code over to arch/powerpc") (2005-11-16) we moved the boot code into arch/powerpc and dropped the ppc case, but kept passing ARCH=ppc64 to arch/powerpc/boot/Makefile. Since then there have been several more boot targets added, all of which have copied the ARCH=ppc64 setting, such that now we have four targets using it. Currently it seems that nothing actually uses the ARCH value, but that's basically just luck, and in particular it prevents us from using the generic cpp_lds_S rule. It's also clearly wrong, ARCH=ppc64 is dead, buried and cremated. Fix it by dropping the setting of ARCH completely, the correct value is exported by the top level Makefile. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 28 Nov, 2016 9 commits
-
-
Aneesh Kumar K.V authored
This will improve the task exit case, by batching tlb invalidates. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Aneesh Kumar K.V authored
When we are updating a pte, we just need to flush the tlb mapping that pte. Right now we do a full mm flush because we don't track page size. Now that we have page size details in pte use that to do the optimized flush Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Aneesh Kumar K.V authored
When we are updating a pte, we just need to flush the tlb mapping that pte. Right now we do a full mm flush because we don't track the page size. Now that we have page size details in pte use that to do the optimized flush Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Aneesh Kumar K.V authored
Now that we have page size details encoded in pte using software pte bits, use that to find the page size needed for tlb flush. This function should only be used on P9 DD1, so give it a horrible name to make that clear. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Aneesh Kumar K.V authored
This patch adds a new software defined pte bit. We use the reserved fields of ISA 3.0 pte definition since we will only be using this on DD1 code paths. We can possibly look at removing this code later. The software bit will be used to differentiate between 64K/4K and 2M ptes. This helps in finding the page size mapping by a pte so that we can do efficient tlb flush. We don't support 1G hugetlb pages yet. So we add a DEBUG WARN_ON to catch wrong usage. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Aneesh Kumar K.V authored
W.r.t hash page table config, we support 16MB and 16GB as the hugepage size. Update the hstate_get_psize to handle 16M and 16G. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Aneesh Kumar K.V authored
We will start moving some book3s specific hugetlb functions there. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Nicholas Piggin authored
This converts one that was missed by b1576fec ("powerpc: No need to use dot symbols when branching to a function"). Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Nicholas Piggin authored
From 80f23935 ("powerpc: Convert cmp to cmpd in idle enter sequence"): PowerPC's "cmp" instruction has four operands. Normally people write "cmpw" or "cmpd" for the second cmp operand 0 or 1. But, frequently people forget, and write "cmp" with just three operands. With older binutils this is silently accepted as if this was "cmpw", while often "cmpd" is wanted. With newer binutils GAS will complain about this for 64-bit code. For 32-bit code it still silently assumes "cmpw" is what is meant. In this case, cmpwi is called for, so this is just a build fix for new toolchains. Cc: stable@vger.kernel.org # v3.0+ Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 26 Nov, 2016 1 commit
-
-
Balbir Singh authored
ISA 3 defines new encoded access authority that allows instruction access prevention in privileged mode and allows normal access to problem state. This patch just enables IAMR (Instruction Authority Mask Register), enabling AMR would require more work. I've tested this with a buggy driver and a simple payload. The payload is specific to the build I've tested. mpe: Also tested with LKDTM: # echo EXEC_USERSPACE > /sys/kernel/debug/provoke-crash/DIRECT lkdtm: Performing direct entry EXEC_USERSPACE lkdtm: attempting ok execution at c0000000005bf560 lkdtm: attempting bad execution at 00003fff8d940000 Unable to handle kernel paging request for instruction fetch Faulting instruction address: 0x3fff8d940000 Oops: Kernel access of bad area, sig: 11 [#1] NIP: 00003fff8d940000 LR: c0000000005bfa58 CTR: 00003fff8d940000 REGS: c0000000f1fcf900 TRAP: 0400 Not tainted (4.9.0-rc5-compiler_gcc-6.2.0-00109-g956dbc06232a) MSR: 9000000010009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 48002222 XER: 00000000 ... Call Trace: lkdtm_EXEC_USERSPACE+0x104/0x120 (unreliable) lkdtm_do_action+0x3c/0x80 direct_entry+0x100/0x1b0 full_proxy_write+0x94/0x100 __vfs_write+0x3c/0x1b0 vfs_write+0xcc/0x230 SyS_write+0x60/0x110 system_call+0x38/0xfc Signed-off-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 25 Nov, 2016 7 commits
-
-
Balbir Singh authored
ISA 3 allows for prevention of instruction fetch and execution of user mode pages. If such an error occurs, SRR1 bit 35 reports the error. We catch and report the error in do_page_fault(). Signed-off-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Balbir Singh authored
Setup AMOR (Authority Mask Override Register) in HV mode so that the host and guest kernel can in turn setup IAMR. This allows us to enable key 0 in a following patch. Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Gautham R. Shenoy authored
Ensure that PSSCR is set to a safe value corresponding to no state-loss each time a POWER9 CPU comes online. Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Acked-By: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Michael Ellerman authored
There is a nice interface for asking ftrace to dump all its tracing buffers. The only down side for use in xmon is that it uses printk. Depending on circumstances printk may not work when in xmon, but it also may, so add a 'dt' command which dumps the ftrace buffers, and add a note to the help to mentiont that it uses printk. Calling this routine also disables tracing, which is problematic if you return from xmon and expect the system to keep operating normally. So after we do the dump turn tracing back on. Both functions already have nop versions defined for when ftrace is not enabled, so we don't need any extra #ifdefs. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Geliang Tang authored
Use builtin_platform_driver() helper to simplify the code. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Acked-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Geliang Tang authored
Drop duplicate header sched.h from native.c. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Michael Ellerman authored
In commit d0563a12 ("powerpc: Implement {cmp}xchg for u8 and u16") we removed the volatile from __cmpxchg(). This is leading to warnings such as: drivers/gpu/drm/drm_lock.c: In function ‘drm_lock_take’: arch/powerpc/include/asm/cmpxchg.h:484:37: warning: passing argument 1 of ‘__cmpxchg’ discards ‘volatile’ qualifier from pointer target (__typeof__(*(ptr))) __cmpxchg((ptr), (unsigned long)_o_, \ There doesn't seem to be consensus across architectures whether the argument is volatile or not, so at least for now put the volatile back. Fixes: d0563a12 ("powerpc: Implement {cmp}xchg for u8 and u16") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 24 Nov, 2016 1 commit
-
-
Michael Ellerman authored
Merge the topic branch we're sharing with the kvm-ppc tree.
-
- 23 Nov, 2016 8 commits
-
-
Andrew Donnellan authored
Fix the following coccinelle warnings: drivers/misc/cxl/debugfs.c:46:0-23: WARNING: fops_io_x64 should be defined with DEFINE_DEBUGFS_ATTRIBUTE drivers/misc/cxl/guest.c:890:5-26: WARNING: Comparison to bool drivers/misc/cxl/irq.c:107:3-23: WARNING: Assignment of bool to 0/1 drivers/misc/cxl/native.c:57:2-3: Unneeded semicolon drivers/misc/cxl/native.c:170:2-3: Unneeded semicolon Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Christophe Leroy authored
Partially copied from commit df0698be ("ARM: stack protector: change the canary value per task") A new random value for the canary is stored in the task struct whenever a new task is forked. This is meant to allow for different canary values per task. On powerpc, GCC expects the canary value to be found in a global variable called __stack_chk_guard. So this variable has to be updated with the value stored in the task struct whenever a task switch occurs. Because the variable GCC expects is global, this cannot work on SMP unfortunately. So, on SMP, the same initial canary value is kept throughout, making this feature a bit less effective although it is still useful. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Christophe Leroy authored
Partialy copied from commit c743f380 ("ARM: initial stack protector (-fstack-protector) support") This is the very basic stuff without the changing canary upon task switch yet. Just the Kconfig option and a constant canary value initialized at boot time. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Pan Xinhui authored
Implement xchg{u8,u16}{local,relaxed}, and cmpxchg{u8,u16}{,local,acquire,relaxed}. It works on all ppc. remove volatile of first parameter in __cmpxchg_local and __cmpxchg Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com> Acked-by: Boqun Feng <boqun.feng@gmail.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Lars-Peter Clausen authored
There are no ibmebus driver that make use of legacy suspend/resume. This patch removes the support for it from ibmebus framework, new ibmebus driver (as unlikely as they are) wanting to use suspend/resume should use dev_pm_ops. Since there aren't any special bus specific things to do during suspend/resume and since the PM core will automatically fallback directly to using the device's PM ops if no bus PM ops are specified there is no need to have any special ibmebus PM ops at all. Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Naveen N. Rao authored
Invoke the kprobe handlers directly rather than through notify_die(), to reduce path taken for handling kprobes. Similar to commit 6f6343f5 ("kprobes/x86: Call exception handlers directly from do_int3/do_debug"). While at it, rename post_kprobe_handler() to kprobe_post_handler() for more uniform naming. Reported-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Naveen N. Rao authored
Commit 03465f89 ("powerpc: Use kprobe blacklist for exception handlers") removed __kprobes annotation from some of the prototypes, but left the kprobes header include directive unchanged. Remove it as it is no longer needed. Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Michael Neuling authored
Define and set the POWER9 HFSCR doorbell bit so that guests can use msgsndp. ISA 3.0 calls this MSGP, so name it accordingly in the code. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
- 22 Nov, 2016 4 commits
-
-
Michael Ellerman authored
ISA 3.0 defines a new PECE (Power-saving mode Exit Cause Enable) field in the LPCR (Logical Partitioning Control Register), called LPCR_PECE_HVEE (Hypervisor Virtualization Exit Enable). KVM code will need to know about this bit, so add a definition for it. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Suraj Jitindar Singh authored
ISA 3.00 adds the logical PVR value 0x0f000005, so add a definition for this. Define PCR_ARCH_207 to reflect ISA 2.07 compatibility mode in the processor compatibility register (PCR). [paulus@ozlabs.org - moved dummy PCR_ARCH_300 value into next patch] Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Paul Mackerras authored
This defines real-mode versions of opal_int_get_xirr(), opal_int_eoi() and opal_int_set_mfrr(), for use by KVM real-mode code. It also exports opal_int_set_mfrr() so that the modular part of KVM can use it to send IPIs. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Paul Mackerras authored
POWER9 requires the host to set up a partition table, which is a table in memory indexed by logical partition ID (LPID) which contains the pointers to page tables and process tables for the host and each guest. This factors out the initialization of the partition table into a single function. This code was previously duplicated between hash_utils_64.c and pgtable-radix.c. This provides a function for setting a partition table entry, which is used in early MMU initialization, and will be used by KVM whenever a guest is created. This function includes a tlbie instruction which will flush all TLB entries for the LPID and all caches of the partition table entry for the LPID, across the system. This also moves a call to memblock_set_current_limit(), which was in radix_init_partition_table(), but has nothing to do with the partition table. By analogy with the similar code for hash, the call gets moved to near the end of radix__early_init_mmu(). It now gets called when running as a guest, whereas previously it would only be called if the kernel is running as the host. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-