- 31 Aug, 2017 40 commits
-
-
Michael Ellerman authored
Anton noticed that if we fault part way through emulating an unaligned instruction, we don't update the DAR to reflect that. The DAR value is eventually reported back to userspace as the address in the SEGV signal, and if userspace is using that value to demand fault then it can be confused by us not setting the value correctly. This patch is ugly as hell, but is intended to be the minimal fix and back ports easily. Cc: stable@vger.kernel.org Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Paul Mackerras <paulus@ozlabs.org>
-
John Allen authored
Check if an LMB is assigned before attempting to call dlpar_acquire_drc in order to avoid any unnecessary rtas calls. This substantially reduces the running time of memory hot add on lpars with large amounts of memory. [mpe: We need to explicitly set rc to 0 in the success case, otherwise the compiler might think we use rc without initialising it.] Fixes: c21f515c ("powerpc/pseries: Make the acquire/release of the drc for memory a seperate step") Cc: stable@vger.kernel.org # v4.11+ Signed-off-by: John Allen <jallen@linux.vnet.ibm.com> Reviewed-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Arvind Yadav authored
struct platform_suspend_ops are not supposed to change at runtime. Functions suspend_set_ops working with const platform_suspend_ops. So mark the non-const structs as const. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Oliver O'Halloran authored
In previous generations of Power processors each core had a private L2 cache. The Power 9 processor has a slightly different design where the L2 cache is shared among pairs of cores rather than being completely private. Making the scheduler aware of this cache sharing allows the scheduler to make better migration decisions. For example, if two CPU heavy tasks share a core then one task can be migrated to the paired core to improve throughput. Under the existing three level topology the task could be migrated to any core on the same chip, while with the new topology it would be preferentially migrated to the paired core so it remains cache-hot. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Oliver O'Halloran authored
We want to add an extra level to the CPU scheduler topology to account for cores which share a cache. To do this we need to build a cpumask for each CPU that indicates which CPUs share this cache to use as an input to the scheduler. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Oliver O'Halloran authored
The CPU scheduler topology is constructed from a number of per-cpu cpumasks which describe which sets of logical CPUs are related in some fashion. Current code that handles constructing these masks when CPUs are hot(un)plugged can be simplified a bit by exploiting the fact that the scheduler requires higher levels of the toplogy (e.g package level groupings) to be supersets of the lower levels (e.g. threas in a core). This patch reworks the cpumask construction to be simpler and easier to extend with extra topology levels. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> [mpe: Fix CONFIG_HOTPLUG_CPU=n build] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Oliver O'Halloran authored
When building the CPU scheduler topology the kernel uses the ibm,chipid property from the devicetree to group logical CPUs. Currently the DT search for this property is open-coded in smp.c and this functionality is a duplication of what's in cpu_to_chip_id() already. This patch removes the existing search in favor of that. It's worth mentioning that the semantics of the search are different in cpu_to_chip_id(). When there is no ibm,chipid in the CPUs node it will also search /cpus and / for the property, but this should not effect the output topology. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Frederic Barrat authored
cxl keeps a driver use count, which is used with the hash memory model on p8 to know when to upgrade local TLBIs to global and to trigger callbacks to manage the MMU for PSL8. If a process opens a context and closes without attaching or fails the attachment, the driver use count is never decremented. As a consequence, TLB invalidations remain global, even if there are no active cxl contexts. We should increment the driver use count when the process is attaching to the cxl adapter, and not on open. It's not needed before the adapter starts using the context and the use count is decremented on the detach path, so it makes more sense. It affects only the user api. The kernel api is already doing The Right Thing. Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Cc: stable@vger.kernel.org # v4.2+ Fixes: 7bb5d91a ("cxl: Rework context lifetimes") Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Michael Neuling authored
Currently these tests won't build with a `--enable-default-pie` compiler as they require r30 to be clobbered. This gives an error: ptrace-tm-spd-gpr.c:41:2: error: PIC register clobbered by 'r30' in 'asm' This forces these tests to be built no-pie. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Hannes Reinecke authored
mpsc.c and mpc52xx-psc.c are platform-specific serial drivers, and should be compiled for the respective platforms only. Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Torsten Duwe <duwe@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Tobin C. Harding authored
.llong is an undocumented PPC specific directive. The generic equivalent is .quad, but even better (because it's self describing) is .8byte. Convert all .llong directives to .8byte. Signed-off-by: Tobin C. Harding <me@tobin.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Balbir Singh authored
Enable 64K page size and THP. I use ppc64le_defconfig when I need a single config across guest and host, but having 4K page size as default is not what I expect. I could move these over to server.config and merge if ppc64_defconfig is meant for systems that use 4k pages by default. Signed-off-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Murilo Opsfelder Araujo authored
drivers/watchdog/wdrtas.c is of interest of linuxppc maintainers. Signed-off-by: Murilo Opsfelder Araujo <mopsfelder@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Balbir Singh authored
Most (all?) distros turn these on, so it makes sense to enable them for testing coverage, and they're also useful for developers. Signed-off-by: Balbir Singh <bsingharora@gmail.com> Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> [mpe: Reword change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Balbir Singh authored
Add support for printing the PIDR/TIDR for ISA 300 and PSSCR and PTCR in ISA 3.0 hypervisor mode. SPRN_PSSCR_PR is the privileged mode access and is used when we are not in hypervisor mode. Signed-off-by: Balbir Singh <bsingharora@gmail.com> [mpe: Split out of larger patch] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Balbir Singh authored
This patch adds support to xmon for dumping the AMR, UAMOR, AMOR and IAMR SPRs based on their supported ISA revisions. Signed-off-by: Balbir Singh <bsingharora@gmail.com> [mpe: Split out of larger patch] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Balbir Singh authored
ISA 3.0 defines hypervisor decrementer to be 64 bits in length. This patch extends the print format for to be 64 bits. Signed-off-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Masahiro Yamada authored
Remove unneeded variables and assignments. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Michael Ellerman authored
When we map memory at boot we print out the ranges of real addresses that we mapped and the page size that was used. Currently it's a bit ugly: Mapped range 0x0 - 0x2000000000 with 0x40000000 Mapped range 0x200000000000 - 0x202000000000 with 0x40000000 Pad the addresses so they line up, and print the page size using actual units, eg: Mapped 0x0000000000000000-0x0000000001200000 with 64.0 KiB pages Mapped 0x0000000001200000-0x0000000040000000 with 2.00 MiB pages Mapped 0x0000000040000000-0x0000000100000000 with 1.00 GiB pages Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Michael Ellerman authored
Make the printks look a bit nicer by adding a prefix. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Bryant G. Ly authored
For a PCI device it's pci_dn can be retrieved from pdev->dev.archdata.firmware_data, PCI_DN(devnode), or parent's list. Thus, we should just use the existing function pci_get_pdn_by_devfn to get the pci_dn. Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com> Reviewed-by: Sam Bobroff <sam.bobroff@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Aneesh Kumar K.V authored
We need to add memory barrier so that the page table walk doesn't happen before the cpumask is set and made visible to the other cpus. We need to use a sync here instead of lwsync because lwsync is not sufficient for store/load ordering. We also need to add an if (mm) check so that we do the right thing when called with a kernel context. For kernel context, we have mm = NULL. W.r.t kernel address we can skip setting the mm cpumask. Fixes: 0f4bc093 ("powerpc/mm/cxl: Add the fault handling cpu to mm cpumask") Cc: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Sukadev Bhattiprolu authored
Define interfaces (wrappers) to the 'copy' and 'paste' instructions (which are new in PowerISA 3.0). These are intended to be used to by NX driver(s) to submit Coprocessor Request Blocks (CRBs) to the NX hardware engines. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Sukadev Bhattiprolu authored
Define an interface to open a VAS send window. This interface is intended to be used the Nest Accelerator (NX) driver(s) to open a send window and use it to submit compression/encryption requests to a VAS receive window. The receive window, identified by the [vasid, cop] parameters, must already be open in VAS (i.e connected to an NX engine). Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Sukadev Bhattiprolu authored
Define the vas_win_close() interface which should be used to close a send or receive windows. While the hardware configurations required to open send and receive windows differ, the configuration to close a window is the same for both. So we use a single interface to close the window. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Sukadev Bhattiprolu authored
Define the vas_rx_win_open() interface. This interface is intended to be used by the Nest Accelerator (NX) driver(s) to setup receive windows for one or more NX engines (which implement compression & encryption algorithms in the hardware). Follow-on patches will provide an interface to close the window and to open a send window that kernel subsystems can use to access the NX engines. The interface to open a receive window is expected to be invoked for each instance of VAS in the system. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Sukadev Bhattiprolu authored
Define helpers to allocate/free VAS window objects. These will be used in follow-on patches when opening/closing windows. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Sukadev Bhattiprolu authored
Define helpers to initialize window context registers of the VAS hardware. These will be used in follow-on patches when opening/closing VAS windows. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Sukadev Bhattiprolu authored
Define some helper functions to access the MMIO regions. We use these in follow-on patches to read/write VAS hardware registers. They are also used to later issue 'paste' instructions to submit requests to the NX hardware engines. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Sukadev Bhattiprolu authored
Implement vas_init() and vas_exit() functions for a new VAS module. This VAS module is essentially a library for other device drivers and kernel users of the NX coprocessors like NX-842 and NX-GZIP. In the future this will be extended to add support for user space to access the NX coprocessors. VAS is currently only supported with 64K page size. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Sukadev Bhattiprolu authored
Move the GET_FIELD and SET_FIELD macros to vas.h as VAS and other users of VAS, including NX-842 can use those macros. There is a lot of related code between the VAS/NX kernel drivers and skiboot. For consistency, switch the order of parameters in SET_FIELD to match the order in skiboot. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Reviewed-by: Dan Streetman <ddstreet@ieee.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Sukadev Bhattiprolu authored
Define macros for the VAS hardware registers and bit-fields as well as couple of data structures needed by the VAS driver. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> [mpe: Fixup include guard to use _ASM_POWERPC_VAS_H] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Balbir Singh authored
Convert 0.16x to 0.16lx. Otherwise we lose the top 8 nibbles and effectively print only the last 32 bits. Fixes: 1846193b ("powerpc/xmon: Dump ISA 2.06 SPRs") Signed-off-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Alexey Kardashevskiy authored
The check_req() helper uses pci_get_pdn() to get an OF node pointer. pci_get_pdn() returns a pci_dn pointer which either: 1) from the OF node returned by pci_device_to_OF_node(); 2) from the parent child_list where entries don't have OF node pointers. Since check_req() does not care about 2), it can call pci_device_to_OF_node() directly, hence the change. The find_pe_dn() helper uses embedded pci_dn to get an OF node which is also stored in edev->pdev so let's take a shortcut and call pci_device_to_OF_node() directly. With these 2 changes, we can finally get rid of the OF node back pointer. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Alexey Kardashevskiy authored
The pci_dn struct caches a OF device node pointer in order to access the "ibm,loc-code" property when EEH is recovering. However, when this happens in eeh_dev_check_failure(), we also have a pci_dev pointer which should have a valid pointer to the device node when pci_dn has one (both pointers are not NULL for physical functions and are NULL for virtual functions). This changes pci_remove_device_node_info() to look for a parent of the node being removed, just like pci_add_device_node_info() does when it references the parent node. This is the first step to get rid of pci_dn::node. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Alexey Kardashevskiy authored
The eeh_dev struct hold a config space address of an associated node and the very same address is also stored in the pci_dn struct which is always present during the eeh_dev lifetime. This uses bus:devfn directly from pci_dn instead of cached and packed config_addr. Since config_addr is made from device's bus:dev.fn, there is no point in keeping it in the debugfs either so remove that too. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Alexey Kardashevskiy authored
The eeh_dev struct already holds a pointer to pci_dn which it does not exist without and pci_dn itself holds the very same pointer so just use it. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Alexey Kardashevskiy authored
arch/powerpc/kernel/eeh_dev.c:57 is the only legit place where edev is allocated; other 2 places allocate it on stack and in the heap for a very short period of time to use eeh_pe_get() as takes edev. This changes eeh_pe_get() to receive required parameters explicitly. This removes unnecessary temporary allocation of edev. This uses the "pe_no" name instead of the "pe_config_addr" name as it actually is a PE number and not a config space address as it seemed. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Acked-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Alexey Kardashevskiy authored
pdev is always NULL, remove it. To make checkpatch.pl happy, this also removes the "out of memory" message. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Acked-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Arvind Yadav authored
clk_div_tables are not supposed to change at runtime. mpc512x_clk_divtable function working with const clk_div_table. So mark the non-const structs as const. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-