- 31 Aug, 2017 40 commits
-
-
Oliver O'Halloran authored
When building the CPU scheduler topology the kernel uses the ibm,chipid property from the devicetree to group logical CPUs. Currently the DT search for this property is open-coded in smp.c and this functionality is a duplication of what's in cpu_to_chip_id() already. This patch removes the existing search in favor of that. It's worth mentioning that the semantics of the search are different in cpu_to_chip_id(). When there is no ibm,chipid in the CPUs node it will also search /cpus and / for the property, but this should not effect the output topology. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Frederic Barrat authored
cxl keeps a driver use count, which is used with the hash memory model on p8 to know when to upgrade local TLBIs to global and to trigger callbacks to manage the MMU for PSL8. If a process opens a context and closes without attaching or fails the attachment, the driver use count is never decremented. As a consequence, TLB invalidations remain global, even if there are no active cxl contexts. We should increment the driver use count when the process is attaching to the cxl adapter, and not on open. It's not needed before the adapter starts using the context and the use count is decremented on the detach path, so it makes more sense. It affects only the user api. The kernel api is already doing The Right Thing. Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Cc: stable@vger.kernel.org # v4.2+ Fixes: 7bb5d91a ("cxl: Rework context lifetimes") Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Michael Neuling authored
Currently these tests won't build with a `--enable-default-pie` compiler as they require r30 to be clobbered. This gives an error: ptrace-tm-spd-gpr.c:41:2: error: PIC register clobbered by 'r30' in 'asm' This forces these tests to be built no-pie. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Hannes Reinecke authored
mpsc.c and mpc52xx-psc.c are platform-specific serial drivers, and should be compiled for the respective platforms only. Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Torsten Duwe <duwe@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Tobin C. Harding authored
.llong is an undocumented PPC specific directive. The generic equivalent is .quad, but even better (because it's self describing) is .8byte. Convert all .llong directives to .8byte. Signed-off-by: Tobin C. Harding <me@tobin.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Balbir Singh authored
Enable 64K page size and THP. I use ppc64le_defconfig when I need a single config across guest and host, but having 4K page size as default is not what I expect. I could move these over to server.config and merge if ppc64_defconfig is meant for systems that use 4k pages by default. Signed-off-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Murilo Opsfelder Araujo authored
drivers/watchdog/wdrtas.c is of interest of linuxppc maintainers. Signed-off-by: Murilo Opsfelder Araujo <mopsfelder@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Balbir Singh authored
Most (all?) distros turn these on, so it makes sense to enable them for testing coverage, and they're also useful for developers. Signed-off-by: Balbir Singh <bsingharora@gmail.com> Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> [mpe: Reword change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Balbir Singh authored
Add support for printing the PIDR/TIDR for ISA 300 and PSSCR and PTCR in ISA 3.0 hypervisor mode. SPRN_PSSCR_PR is the privileged mode access and is used when we are not in hypervisor mode. Signed-off-by: Balbir Singh <bsingharora@gmail.com> [mpe: Split out of larger patch] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Balbir Singh authored
This patch adds support to xmon for dumping the AMR, UAMOR, AMOR and IAMR SPRs based on their supported ISA revisions. Signed-off-by: Balbir Singh <bsingharora@gmail.com> [mpe: Split out of larger patch] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Balbir Singh authored
ISA 3.0 defines hypervisor decrementer to be 64 bits in length. This patch extends the print format for to be 64 bits. Signed-off-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Masahiro Yamada authored
Remove unneeded variables and assignments. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Michael Ellerman authored
When we map memory at boot we print out the ranges of real addresses that we mapped and the page size that was used. Currently it's a bit ugly: Mapped range 0x0 - 0x2000000000 with 0x40000000 Mapped range 0x200000000000 - 0x202000000000 with 0x40000000 Pad the addresses so they line up, and print the page size using actual units, eg: Mapped 0x0000000000000000-0x0000000001200000 with 64.0 KiB pages Mapped 0x0000000001200000-0x0000000040000000 with 2.00 MiB pages Mapped 0x0000000040000000-0x0000000100000000 with 1.00 GiB pages Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Michael Ellerman authored
Make the printks look a bit nicer by adding a prefix. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Bryant G. Ly authored
For a PCI device it's pci_dn can be retrieved from pdev->dev.archdata.firmware_data, PCI_DN(devnode), or parent's list. Thus, we should just use the existing function pci_get_pdn_by_devfn to get the pci_dn. Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com> Reviewed-by: Sam Bobroff <sam.bobroff@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Aneesh Kumar K.V authored
We need to add memory barrier so that the page table walk doesn't happen before the cpumask is set and made visible to the other cpus. We need to use a sync here instead of lwsync because lwsync is not sufficient for store/load ordering. We also need to add an if (mm) check so that we do the right thing when called with a kernel context. For kernel context, we have mm = NULL. W.r.t kernel address we can skip setting the mm cpumask. Fixes: 0f4bc093 ("powerpc/mm/cxl: Add the fault handling cpu to mm cpumask") Cc: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Sukadev Bhattiprolu authored
Define interfaces (wrappers) to the 'copy' and 'paste' instructions (which are new in PowerISA 3.0). These are intended to be used to by NX driver(s) to submit Coprocessor Request Blocks (CRBs) to the NX hardware engines. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Sukadev Bhattiprolu authored
Define an interface to open a VAS send window. This interface is intended to be used the Nest Accelerator (NX) driver(s) to open a send window and use it to submit compression/encryption requests to a VAS receive window. The receive window, identified by the [vasid, cop] parameters, must already be open in VAS (i.e connected to an NX engine). Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Sukadev Bhattiprolu authored
Define the vas_win_close() interface which should be used to close a send or receive windows. While the hardware configurations required to open send and receive windows differ, the configuration to close a window is the same for both. So we use a single interface to close the window. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Sukadev Bhattiprolu authored
Define the vas_rx_win_open() interface. This interface is intended to be used by the Nest Accelerator (NX) driver(s) to setup receive windows for one or more NX engines (which implement compression & encryption algorithms in the hardware). Follow-on patches will provide an interface to close the window and to open a send window that kernel subsystems can use to access the NX engines. The interface to open a receive window is expected to be invoked for each instance of VAS in the system. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Sukadev Bhattiprolu authored
Define helpers to allocate/free VAS window objects. These will be used in follow-on patches when opening/closing windows. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Sukadev Bhattiprolu authored
Define helpers to initialize window context registers of the VAS hardware. These will be used in follow-on patches when opening/closing VAS windows. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Sukadev Bhattiprolu authored
Define some helper functions to access the MMIO regions. We use these in follow-on patches to read/write VAS hardware registers. They are also used to later issue 'paste' instructions to submit requests to the NX hardware engines. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Sukadev Bhattiprolu authored
Implement vas_init() and vas_exit() functions for a new VAS module. This VAS module is essentially a library for other device drivers and kernel users of the NX coprocessors like NX-842 and NX-GZIP. In the future this will be extended to add support for user space to access the NX coprocessors. VAS is currently only supported with 64K page size. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Sukadev Bhattiprolu authored
Move the GET_FIELD and SET_FIELD macros to vas.h as VAS and other users of VAS, including NX-842 can use those macros. There is a lot of related code between the VAS/NX kernel drivers and skiboot. For consistency, switch the order of parameters in SET_FIELD to match the order in skiboot. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Reviewed-by: Dan Streetman <ddstreet@ieee.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Sukadev Bhattiprolu authored
Define macros for the VAS hardware registers and bit-fields as well as couple of data structures needed by the VAS driver. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> [mpe: Fixup include guard to use _ASM_POWERPC_VAS_H] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Balbir Singh authored
Convert 0.16x to 0.16lx. Otherwise we lose the top 8 nibbles and effectively print only the last 32 bits. Fixes: 1846193b ("powerpc/xmon: Dump ISA 2.06 SPRs") Signed-off-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Alexey Kardashevskiy authored
The check_req() helper uses pci_get_pdn() to get an OF node pointer. pci_get_pdn() returns a pci_dn pointer which either: 1) from the OF node returned by pci_device_to_OF_node(); 2) from the parent child_list where entries don't have OF node pointers. Since check_req() does not care about 2), it can call pci_device_to_OF_node() directly, hence the change. The find_pe_dn() helper uses embedded pci_dn to get an OF node which is also stored in edev->pdev so let's take a shortcut and call pci_device_to_OF_node() directly. With these 2 changes, we can finally get rid of the OF node back pointer. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Alexey Kardashevskiy authored
The pci_dn struct caches a OF device node pointer in order to access the "ibm,loc-code" property when EEH is recovering. However, when this happens in eeh_dev_check_failure(), we also have a pci_dev pointer which should have a valid pointer to the device node when pci_dn has one (both pointers are not NULL for physical functions and are NULL for virtual functions). This changes pci_remove_device_node_info() to look for a parent of the node being removed, just like pci_add_device_node_info() does when it references the parent node. This is the first step to get rid of pci_dn::node. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Alexey Kardashevskiy authored
The eeh_dev struct hold a config space address of an associated node and the very same address is also stored in the pci_dn struct which is always present during the eeh_dev lifetime. This uses bus:devfn directly from pci_dn instead of cached and packed config_addr. Since config_addr is made from device's bus:dev.fn, there is no point in keeping it in the debugfs either so remove that too. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Alexey Kardashevskiy authored
The eeh_dev struct already holds a pointer to pci_dn which it does not exist without and pci_dn itself holds the very same pointer so just use it. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Alexey Kardashevskiy authored
arch/powerpc/kernel/eeh_dev.c:57 is the only legit place where edev is allocated; other 2 places allocate it on stack and in the heap for a very short period of time to use eeh_pe_get() as takes edev. This changes eeh_pe_get() to receive required parameters explicitly. This removes unnecessary temporary allocation of edev. This uses the "pe_no" name instead of the "pe_config_addr" name as it actually is a PE number and not a config space address as it seemed. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Acked-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Alexey Kardashevskiy authored
pdev is always NULL, remove it. To make checkpatch.pl happy, this also removes the "out of memory" message. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Acked-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Arvind Yadav authored
clk_div_tables are not supposed to change at runtime. mpc512x_clk_divtable function working with const clk_div_table. So mark the non-const structs as const. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Dan Carpenter authored
My static checker complains that 0x00001800 >> 13 is zero. Looking at the context, it seems like a copy and paste bug from the line below and probably 0x3 << 13 or 0x00006000 was intended. Fixes: 2af59f7d ("[POWERPC] 4xx: Add 405GPr and 405EP support in boot wrapper") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Dan Carpenter authored
There is a cut and paste error here so we use sizeof(struct mpc83xx_pmc) to remap the memory for "clock_regs". That sizeof() is 20 bytes and we only need to remap 12 bytes. It presumably doesn't affect run time too much... I changed them to both use sizeof(*variable_name) because that's the preferred kernel style these days. Fixes: d49747bd ("powerpc/mpc83xx: Power Management support") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> [mpe: It will map at least one page anyway, but still a good cleanup] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Nicholas Piggin authored
Use nmi_enter similarly to system reset interrupts. This uses NMI printk NMI buffers and turns off various debugging facilities that helps avoid tripping on ourselves or other CPUs. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Nicholas Piggin authored
There are quite a few machine check exceptions that can be caused by kernel bugs. To make debugging easier, use the kernel crash path in cases of synchronous machine checks that occur in kernel mode, if that would not result in the machine going straight to panic or crash dump. There is a downside here that die()ing the process in kernel mode can still leave the system unstable. panic_on_oops will always force the system to fail-stop, so systems where that behaviour is important will still do the right thing. As a test, when triggering an i-side 0111b error (ifetch from foreign address) in kernel mode process context on POWER9, the kernel currently dies quickly like this: Severe Machine check interrupt [Not recovered] NIP [ffff000000000000]: 0xffff000000000000 Initiator: CPU Error type: Real address [Instruction fetch (foreign)] [ 127.426651616,0] OPAL: Reboot requested due to Platform error. Effective[ 127.426693712,3] OPAL: Reboot requested due to Platform error. address: ffff000000000000 opal: Reboot type 1 not supported Kernel panic - not syncing: PowerNV Unrecovered Machine Check CPU: 56 PID: 4425 Comm: syscall Tainted: G M 4.12.0-rc1-13857-ga4700a26-dirty #35 Call Trace: [ 128.017988928,4] IPMI: BUG: Dropping ESEL on the floor due to buggy/mising code in OPAL for this BMC Rebooting in 10 seconds.. Trying to free IRQ 496 from IRQ context! After this patch, the process is killed and the kernel continues with this message, which gives enough information to identify the offending branch (i.e., with CFAR): Severe Machine check interrupt [Not recovered] NIP [ffff000000000000]: 0xffff000000000000 Initiator: CPU Error type: Real address [Instruction fetch (foreign)] Effective address: ffff000000000000 Oops: Machine check, sig: 7 [#1] SMP NR_CPUS=2048 NUMA PowerNV Modules linked in: iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 ... CPU: 22 PID: 4436 Comm: syscall Tainted: G M 4.12.0-rc1-13857-ga4700a26-dirty #36 task: c000000932300000 task.stack: c000000932380000 NIP: ffff000000000000 LR: 00000000217706a4 CTR: ffff000000000000 REGS: c00000000fc8fd80 TRAP: 0200 Tainted: G M (4.12.0-rc1-13857-ga4700a26-dirty) MSR: 90000000001c1003 <SF,HV,ME,RI,LE> CR: 24000484 XER: 20000000 CFAR: c000000000004c80 DAR: 0000000021770a90 DSISR: 0a000000 SOFTE: 1 GPR00: 0000000000001ebe 00007fffce4818b0 0000000021797f00 0000000000000000 GPR04: 00007fff8007ac24 0000000044000484 0000000000004000 00007fff801405e8 GPR08: 900000000280f033 0000000024000484 0000000000000000 0000000000000030 GPR12: 9000000000001003 00007fff801bc370 0000000000000000 0000000000000000 GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR24: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR28: 00007fff801b0000 0000000000000000 00000000217707a0 00007fffce481918 NIP [ffff000000000000] 0xffff000000000000 LR [00000000217706a4] 0x217706a4 Call Trace: Instruction dump: XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Nicholas Piggin authored
Unrecovered MCE and HMI errors are sent through a special restart OPAL call to log the platform error. The downside is that they don't go through normal Linux crash paths, so they don't give much information to the Linux console. Change this by providing a special crash function which does some of the console flushing from the panic() path before calling firmware to reboot. The downside of this is a little more code to execute before reaching the firmware reboot. However in practice, it's critical to get the Linux console messages output in order to debug a problem. So this is a desirable tradeoff. Note on the implementation: It is difficult to plumb a custom reboot handler into the panic path, because panic does a little bit too much work. For example, it will try to delay with the timebase, but that may be corrupted in some cases resulting in a hang without reaching the platform reboot. Another problem is that panic can invoke the crash dump code which is not what we want in the case of a hardware platform error. Long-term the best solution will be to rework the panic path so it can be suitable for this kind of panic, but for now we just duplicate a bit of the code. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Nicholas Piggin authored
A system reset is a request to crash / debug the system rather than necessarily caused by encountering a BUG. So there is no need to serialize all CPUs behind the die lock, adding taints to all subsequent traces beyond the first, breaking console locks, etc. The system reset is NMI context which has its own printk buffers to prevent output being interleaved. Then it's better to have all secondaries print out their debug as quickly as possible and the primary will flush out all printk buffers during panic(). So remove the 0x100 path from die, and move it into system_reset. Name the crash/dump reasons "System Reset". This gives "not tained" traces when crashing an untainted kernel. It also gives the panic reason as "System Reset" as opposed to "Fatal exception in interrupt" (or "die oops" for fadump). Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-