Commits · 31bfdb036f1281831db2532178f0da41f4dc9bed · Kirill Smelkov / linux

01 Sep, 2017 16 commits

powerpc: Use instruction emulation infrastructure to handle alignment faults · 31bfdb03

Paul Mackerras authored Aug 30, 2017

This replaces almost all of the instruction emulation code in
fix_alignment() with calls to analyse_instr(), emulate_loadstore()
and emulate_dcbz().  The only emulation code left is the SPE
emulation code; analyse_instr() etc. do not handle SPE instructions
at present.

One result of this is that we can now handle alignment faults on
all the new VSX load and store instructions that were added in POWER9.
VSX loads/stores will take alignment faults for unaligned accesses
to cache-inhibited memory.

Another effect is that we no longer rely on the DAR and DSISR values
set by the processor.

With this, we now need to include the instruction emulation code
unconditionally.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

31bfdb03

powerpc: Separate out load/store emulation into its own function · a53d5182

Paul Mackerras authored Aug 30, 2017

This moves the parts of emulate_step() that deal with emulating
load and store instructions into a new function called
emulate_loadstore().  This is to make it possible to reuse this
code in the alignment handler.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

a53d5182

powerpc: Handle opposite-endian processes in emulation code · d955189a

Paul Mackerras authored Aug 30, 2017

This adds code to the load and store emulation code to byte-swap
the data appropriately when the process being emulated is set to
the opposite endianness to that of the kernel.

This also enables the emulation for the multiple-register loads
and stores (lmw, stmw, lswi, stswi, lswx, stswx) to work for
little-endian.  In little-endian mode, the partial word at the
end of a transfer for lsw*/stsw* (when the byte count is not a
multiple of 4) is loaded/stored at the least-significant end of
the register.  Additionally, this fixes a bug in the previous
code in that it could call read_mem/write_mem with a byte count
that was not 1, 2, 4 or 8.

Note that this only works correctly on processors with "true"
little-endian mode, such as IBM POWER processors from POWER6 on, not
the so-called "PowerPC" little-endian mode that uses address swizzling
as implemented on the old 32-bit 603, 604, 740/750, 74xx CPUs.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

d955189a

powerpc: Set regs->dar if memory access fails in emulate_step() · b9da9c8a

Paul Mackerras authored Aug 30, 2017

This adds code to the instruction emulation code to set regs->dar
to the address of any memory access that fails.  This address is
not necessarily the same as the effective address of the instruction,
because if the memory access is unaligned, it might cross a page
boundary and fault on the second page.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

b9da9c8a

powerpc: Emulate the dcbz instruction · b2543f7b

Paul Mackerras authored Aug 30, 2017

This adds code to analyse_instr() and emulate_step() to understand the
dcbz (data cache block zero) instruction.  The emulate_dcbz() function
is made public so it can be used by the alignment handler in future.
(The apparently unnecessary cropping of the address to 32 bits is
there because it will be needed in that situation.)
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

b2543f7b

powerpc: Emulate load/store floating double pair instructions · 1f41fb79

Paul Mackerras authored Aug 30, 2017

This adds lfdp[x] and stfdp[x] to the set of instructions that
analyse_instr() and emulate_step() understand.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

1f41fb79

powerpc: Emulate vector element load/store instructions · e61ccc7b

Paul Mackerras authored Aug 30, 2017

This adds code to analyse_instr() and emulate_step() to handle the
vector element loads and stores:

lvebx, lvehx, lvewx, stvebx, stvehx, stvewx.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

e61ccc7b

powerpc: Emulate FP/vector/VSX loads/stores correctly when regs not live · c22435a5

Paul Mackerras authored Aug 30, 2017

At present, the analyse_instr/emulate_step code checks for the
relevant MSR_FP/VEC/VSX bit being set when a FP/VMX/VSX load
or store is decoded, but doesn't recheck the bit before reading or
writing the relevant FP/VMX/VSX register in emulate_step().

Since we don't have preemption disabled, it is possible that we get
preempted between checking the MSR bit and doing the register access.
If that happened, then the registers would have been saved to the
thread_struct for the current process. Accesses to the CPU registers
would then potentially read stale values, or write values that would
never be seen by the user process.

Another way that the registers can become non-live is if a page
fault occurs when accessing user memory, and the page fault code
calls a copy routine that wants to use the VMX or VSX registers.

To fix this, the code for all the FP/VMX/VSX loads gets restructured
so that it forms an image in a local variable of the desired register
contents, then disables preemption, checks the MSR bit and either
sets the CPU register or writes the value to the thread struct.
Similarly, the code for stores checks the MSR bit, copies either the
CPU register or the thread struct to a local variable, then reenables
preemption and then copies the register image to memory.

If the instruction being emulated is in the kernel, then we must not
use the register values in the thread_struct. In this case, if the
relevant MSR enable bit is not set, then emulate_step refuses to
emulate the instruction.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

c22435a5

powerpc: Make load/store emulation use larger memory accesses · e0a0986b

Paul Mackerras authored Aug 30, 2017

At the moment, emulation of loads and stores of up to 8 bytes to
unaligned addresses on a little-endian system uses a sequence of
single-byte loads or stores to memory.  This is rather inefficient,
and the code is hard to follow because it has many ifdefs.
In addition, the Power ISA has requirements on how unaligned accesses
are performed, which are not met by doing all accesses as
sequences of single-byte accesses.

Emulation of VSX loads and stores uses __copy_{to,from}_user,
which means the emulation code has no control on the size of
accesses.

To simplify this, we add new copy_mem_in() and copy_mem_out()
functions for accessing memory.  These use a sequence of the largest
possible aligned accesses, up to 8 bytes (or 4 on 32-bit systems),
to copy memory between a local buffer and user memory.  We then
rewrite {read,write}_mem_unaligned and the VSX load/store
emulation using these new functions.

These new functions also simplify the code in do_fp_load() and
do_fp_store() for the unaligned cases.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

e0a0986b

powerpc: Add emulation for the addpcis instruction · 958465ee

Paul Mackerras authored Aug 30, 2017

The addpcis instruction puts the sum of the next instruction address
plus a constant into a register. Since the result depends on the
address of the instruction, it will give an incorrect result if it
is single-stepped out of line, which is what the *probes subsystem
will currently do if a probe is placed on an addpcis instruction.
This fixes the problem by adding emulation of it to analyse_instr().
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

958465ee

powerpc: Don't update CR0 in emulation of popcnt, prty, bpermd instructions · 5762e083

Paul Mackerras authored Aug 30, 2017

The architecture shows the least-significant bit of the instruction
word as reserved for the popcnt[bwd], prty[wd] and bpermd
instructions, that is, these instructions never update CR0.
Therefore this changes the emulation of these instructions to
skip the CR0 update.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

5762e083

powerpc: Fix emulation of the isel instruction · f1bbb99f

Paul Mackerras authored Aug 30, 2017

The case added for the isel instruction was added inside a switch
statement which uses the 10-bit minor opcode field in the 0x7fe
bits of the instruction word.  However, for the isel instruction,
the minor opcode field is only the 0x3e bits, and the 0x7c0 bits
are used for the "BC" field, which indicates which CR bit to use
to select the result.

Therefore, for the isel emulation to work correctly when BC != 0,
we need to match on ((instr >> 1) & 0x1f) == 15).  To do this, we
pull the isel case out of the switch statement and put it in an
if statement of its own.

Fixes: e27f71e5 ("powerpc/lib/sstep: Add isel instruction emulation")
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

f1bbb99f

powerpc/64: Fix update forms of loads and stores to write 64-bit EA · d120cdbc

Paul Mackerras authored Aug 30, 2017

When a 64-bit processor is executing in 32-bit mode, the update forms
of load and store instructions are required by the architecture to
write the full 64-bit effective address into the RA register, though
only the bottom 32 bits are used to address memory.  Currently,
the instruction emulation code writes the truncated address to the
RA register.  This fixes it by keeping the full 64-bit EA in the
instruction_op structure, truncating the address in emulate_step()
where it is used to address memory, rather than in the address
computations in analyse_instr().
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

d120cdbc

powerpc: Handle most loads and stores in instruction emulation code · 350779a2

Paul Mackerras authored Aug 30, 2017

This extends the instruction emulation infrastructure in sstep.c to
handle all the load and store instructions defined in the Power ISA
v3.0, except for the atomic memory operations, ldmx (which was never
implemented), lfdp/stfdp, and the vector element load/stores.

The instructions added are:

Integer loads and stores: lbarx, lharx, lqarx, stbcx., sthcx., stqcx.,
lq, stq.

VSX loads and stores: lxsiwzx, lxsiwax, stxsiwx, lxvx, lxvl, lxvll,
lxvdsx, lxvwsx, stxvx, stxvl, stxvll, lxsspx, lxsdx, stxsspx, stxsdx,
lxvw4x, lxsibzx, lxvh8x, lxsihzx, lxvb16x, stxvw4x, stxsibx, stxvh8x,
stxsihx, stxvb16x, lxsd, lxssp, lxv, stxsd, stxssp, stxv.

These instructions are handled both in the analyse_instr phase and in
the emulate_step phase.

The code for lxvd2ux and stxvd2ux has been taken out, as those
instructions were never implemented in any processor and have been
taken out of the architecture, and their opcodes have been reused for
other instructions in POWER9 (lxvb16x and stxvb16x).

The emulation for the VSX loads and stores uses helper functions
which don't access registers or memory directly, which can hopefully
be reused by KVM later.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

350779a2

powerpc: Don't check MSR FP/VMX/VSX enable bits in analyse_instr() · ee0a54d7

Paul Mackerras authored Aug 30, 2017

This removes the checks for the FP/VMX/VSX enable bits in the MSR
from analyse_instr() and adds them to emulate_step() instead.

The reason for this is that we may want to use analyse_instr() in
a situation where the FP/VMX/VSX register values are stored in the
current thread_struct and the FP/VMX/VSX enable bits in the MSR
image in the pt_regs are zero.  Since analyse_instr() doesn't make
any changes to register state, it is reasonable for it to indicate
what the effect of an instruction would be even though the relevant
enable bit is off.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

ee0a54d7

powerpc: Change analyse_instr so it doesn't modify *regs · 3cdfcbfd

Paul Mackerras authored Aug 30, 2017

The analyse_instr function currently doesn't just work out what an
instruction does, it also executes those instructions whose effect
is only to update CPU registers that are stored in struct pt_regs.
This is undesirable because optprobes uses analyse_instr to work out
if an instruction could be successfully emulated in future.

This changes analyse_instr so it doesn't modify *regs; instead it
stores information in the instruction_op structure to indicate what
registers (GPRs, CR, XER, LR) would be set and what value they would
be set to. A companion function called emulate_update_regs() can
then use that information to update a pt_regs struct appropriately.

As a minor cleanup, this replaces inline asm using the cntlzw and
cntlzd instructions with calls to __builtin_clz() and __builtin_clzl().
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

3cdfcbfd

31 Aug, 2017 24 commits

powerpc: Correct instruction code for xxlor instruction · 93b2d3cf

Paul Mackerras authored Aug 30, 2017

The instruction code for xxlor that commit 0016a4cf ("powerpc:
Emulate most Book I instructions in emulate_step()", 2010-06-15)
added is actually the code for xxlnor.  It is used in get_vsr()
and put_vsr() and the effect of the error is that if emulate_step
is used to emulate a VSX load or store from any register other
than vsr0, the bitwise complement of the correct value will be
loaded or stored.  This corrects the error.

Fixes: 0016a4cf ("powerpc: Emulate most Book I instructions in emulate_step()")
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

93b2d3cf

powerpc: Fix DAR reporting when alignment handler faults · f9effe92

Michael Ellerman authored Aug 24, 2017

Anton noticed that if we fault part way through emulating an unaligned
instruction, we don't update the DAR to reflect that.

The DAR value is eventually reported back to userspace as the address
in the SEGV signal, and if userspace is using that value to demand
fault then it can be confused by us not setting the value correctly.

This patch is ugly as hell, but is intended to be the minimal fix and
back ports easily.

Cc: stable@vger.kernel.org
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Paul Mackerras <paulus@ozlabs.org>

f9effe92

powerpc/pseries: Don't attempt to acquire drc during memory hot add for assigned lmbs · afb5519f

John Allen authored Aug 23, 2017

Check if an LMB is assigned before attempting to call dlpar_acquire_drc
in order to avoid any unnecessary rtas calls. This substantially
reduces the running time of memory hot add on lpars with large amounts
of memory.

[mpe: We need to explicitly set rc to 0 in the success case, otherwise
 the compiler might think we use rc without initialising it.]

Fixes: c21f515c ("powerpc/pseries: Make the acquire/release of the drc for memory a seperate step")
Cc: stable@vger.kernel.org # v4.11+
Signed-off-by: John Allen <jallen@linux.vnet.ibm.com>
Reviewed-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

afb5519f

powerpc/4xx: Constify cpm_suspend_ops · 7def9a24

Arvind Yadav authored Aug 30, 2017

struct platform_suspend_ops are not supposed to change at runtime.
Functions suspend_set_ops working with const platform_suspend_ops. So
mark the non-const structs as const.
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

7def9a24

powerpc/smp: Add Power9 scheduler topology · 96d91431

Oliver O'Halloran authored Jun 29, 2017

In previous generations of Power processors each core had a private L2
cache. The Power 9 processor has a slightly different design where the
L2 cache is shared among pairs of cores rather than being completely
private.

Making the scheduler aware of this cache sharing allows the scheduler to
make better migration decisions. For example, if two CPU heavy tasks
share a core then one task can be migrated to the paired core to improve
throughput. Under the existing three level topology the task could be
migrated to any core on the same chip, while with the new topology it
would be preferentially migrated to the paired core so it remains
cache-hot.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

96d91431

powerpc/smp: Add cpu_l2_cache_map · 2a636a56

Oliver O'Halloran authored Jun 29, 2017

We want to add an extra level to the CPU scheduler topology to account
for cores which share a cache. To do this we need to build a cpumask
for each CPU that indicates which CPUs share this cache to use as an
input to the scheduler.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

2a636a56

powerpc/smp: Rework CPU topology construction · df52f671

Oliver O'Halloran authored Jun 29, 2017

The CPU scheduler topology is constructed from a number of per-cpu
cpumasks which describe which sets of logical CPUs are related in some
fashion. Current code that handles constructing these masks when CPUs
are hot(un)plugged can be simplified a bit by exploiting the fact that
the scheduler requires higher levels of the toplogy (e.g package level
groupings) to be supersets of the lower levels (e.g.  threas in a core).
This patch reworks the cpumask construction to be simpler and easier to
extend with extra topology levels.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
[mpe: Fix CONFIG_HOTPLUG_CPU=n build]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

df52f671

powerpc/smp: Use cpu_to_chip_id() to find core siblings · e3d8b67e

Oliver O'Halloran authored Jun 29, 2017

When building the CPU scheduler topology the kernel uses the ibm,chipid
property from the devicetree to group logical CPUs. Currently the DT
search for this property is open-coded in smp.c and this functionality
is a duplication of what's in cpu_to_chip_id() already. This patch
removes the existing search in favor of that.

It's worth mentioning that the semantics of the search are different
in cpu_to_chip_id(). When there is no ibm,chipid in the CPUs node it
will also search /cpus and / for the property, but this should not
effect the output topology.
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

e3d8b67e

cxl: Fix driver use count · 197267d0

Frederic Barrat authored Aug 30, 2017

cxl keeps a driver use count, which is used with the hash memory model
on p8 to know when to upgrade local TLBIs to global and to trigger
callbacks to manage the MMU for PSL8.

If a process opens a context and closes without attaching or fails the
attachment, the driver use count is never decremented. As a
consequence, TLB invalidations remain global, even if there are no
active cxl contexts.

We should increment the driver use count when the process is attaching
to the cxl adapter, and not on open. It's not needed before the
adapter starts using the context and the use count is decremented on
the detach path, so it makes more sense.

It affects only the user api. The kernel api is already doing The
Right Thing.
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org # v4.2+
Fixes: 7bb5d91a ("cxl: Rework context lifetimes")
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

197267d0

selftests/powerpc: Force ptrace tests to build -fno-pie · a3c01050

Michael Neuling authored Aug 30, 2017

Currently these tests won't build with a `--enable-default-pie`
compiler as they require r30 to be clobbered. This gives
an error:
  ptrace-tm-spd-gpr.c:41:2: error: PIC register clobbered by 'r30' in 'asm'

This forces these tests to be built no-pie.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

a3c01050

powerpc: conditionally compile platform-specific serial drivers · 866bfc75

Hannes Reinecke authored Jun 27, 2017

mpsc.c and mpc52xx-psc.c are platform-specific serial drivers, and
should be compiled for the respective platforms only.
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Torsten Duwe <duwe@suse.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

866bfc75

powerpc/asm: Convert .llong directives to .8byte · eb039161

Tobin C. Harding authored Mar 09, 2017

.llong is an undocumented PPC specific directive. The generic
equivalent is .quad, but even better (because it's self describing) is
.8byte.

Convert all .llong directives to .8byte.
Signed-off-by: Tobin C. Harding <me@tobin.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

eb039161

powerpc/configs: Enable THP and 64K for ppc64(le)_defconfig · 5b593949

Balbir Singh authored Apr 13, 2017

Enable 64K page size and THP. I use ppc64le_defconfig when I need
a single config across guest and host, but having 4K page size
as default is not what I expect. I could move these over to
server.config and merge if ppc64_defconfig is meant for systems
that use 4k pages by default.
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

5b593949

MAINTAINERS: Add drivers/watchdog/wdrtas.c to powerpc section · d8895268

Murilo Opsfelder Araujo authored Jun 01, 2017

drivers/watchdog/wdrtas.c is of interest of linuxppc maintainers.
Signed-off-by: Murilo Opsfelder Araujo <mopsfelder@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

d8895268

powerpc/configs: Enable function trace by default · 539df7fc

Balbir Singh authored Apr 13, 2017

Most (all?) distros turn these on, so it makes sense to enable them
for testing coverage, and they're also useful for developers.
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
[mpe: Reword change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

539df7fc

powerpc/xmon: Add ISA v3.0 SPRs to SPR dump · d1e1b351

Balbir Singh authored Aug 30, 2017

Add support for printing the PIDR/TIDR for ISA 300 and PSSCR and PTCR
in ISA 3.0 hypervisor mode.

SPRN_PSSCR_PR is the privileged mode access and is used when we are
not in hypervisor mode.
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
[mpe: Split out of larger patch]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

d1e1b351

powerpc/xmon: Add AMR, UAMOR, AMOR, IAMR to SPR dump · 64d66aa0

Balbir Singh authored Aug 30, 2017

This patch adds support to xmon for dumping the AMR, UAMOR, AMOR and
IAMR SPRs based on their supported ISA revisions.
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
[mpe: Split out of larger patch]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

64d66aa0

powerpc/xmon: Dump all 64 bits of HDEC · cf9159c3

Balbir Singh authored Aug 30, 2017

ISA 3.0 defines hypervisor decrementer to be 64 bits in length.
This patch extends the print format for to be 64 bits.
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

cf9159c3

powerpc: Squash lines for simple wrapper functions · 7f2462ac

Masahiro Yamada authored Sep 06, 2016

Remove unneeded variables and assignments.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

7f2462ac

powerpc/mm/radix: Prettify mapped memory range print out · 6deb6b47

Michael Ellerman authored Aug 30, 2017

When we map memory at boot we print out the ranges of real addresses
that we mapped and the page size that was used.

Currently it's a bit ugly:

  Mapped range 0x0 - 0x2000000000 with 0x40000000
  Mapped range 0x200000000000 - 0x202000000000 with 0x40000000

Pad the addresses so they line up, and print the page size using
actual units, eg:

  Mapped 0x0000000000000000-0x0000000001200000 with 64.0 KiB pages
  Mapped 0x0000000001200000-0x0000000040000000 with 2.00 MiB pages
  Mapped 0x0000000040000000-0x0000000100000000 with 1.00 GiB pages
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

6deb6b47

powerpc/mm/radix: Add pr_fmt() to pgtable-radix.c · bd350f71

Michael Ellerman authored Aug 30, 2017

Make the printks look a bit nicer by adding a prefix.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

bd350f71

powerpc/kernel: Change retrieval of pci_dn · f9df74df

Bryant G. Ly authored Aug 29, 2017

For a PCI device it's pci_dn can be retrieved from
pdev->dev.archdata.firmware_data, PCI_DN(devnode), or parent's list.
Thus, we should just use the existing function pci_get_pdn_by_devfn
to get the pci_dn.
Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Reviewed-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

f9df74df

powerpc/mm/cxl: Add barrier when setting mm cpumask · 22259a6e

Aneesh Kumar K.V authored Aug 28, 2017

We need to add memory barrier so that the page table walk doesn't happen
before the cpumask is set and made visible to the other cpus. We need
to use a sync here instead of lwsync because lwsync is not sufficient for
store/load ordering.

We also need to add an if (mm) check so that we do the right thing when called
with a kernel context. For kernel context, we have mm = NULL. W.r.t kernel
address we can skip setting the mm cpumask.

Fixes: 0f4bc093 ("powerpc/mm/cxl: Add the fault handling cpu to mm cpumask")
Cc: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

22259a6e

powerpc/powernv/vas: Define copy/paste interfaces · 2392c8c8

Sukadev Bhattiprolu authored Aug 28, 2017

Define interfaces (wrappers) to the 'copy' and 'paste'
instructions (which are new in PowerISA 3.0). These are intended to be
used to by NX driver(s) to submit Coprocessor Request Blocks (CRBs) to
the NX hardware engines.
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

2392c8c8