Commits · eac1e731b59ee3b5f5e641a7765c7ed41ed26226 · Kirill Smelkov / linux

02 Sep, 2017 2 commits

powerpc/xive: guest exploitation of the XIVE interrupt controller · eac1e731

Cédric Le Goater authored Aug 30, 2017

This is the framework for using XIVE in a PowerVM guest. The support
is very similar to the native one in a much simpler form.

Each source is associated with an Event State Buffer (ESB). This is a
two bit state machine which is used to trigger events. The bits are
named "P" (pending) and "Q" (queued) and can be controlled by MMIO.
The Guest OS registers event (or notifications) queues on which the HW
will post event data for a target to notify.

Instead of OPAL calls, a set of Hypervisors call are used to configure
the interrupt sources and the event/notification queues of the guest:

 - H_INT_GET_SOURCE_INFO

   used to obtain the address of the MMIO page of the Event State
   Buffer (PQ bits) entry associated with the source.

 - H_INT_SET_SOURCE_CONFIG

   assigns a source to a "target".

 - H_INT_GET_SOURCE_CONFIG

   determines to which "target" and "priority" is assigned to a source

 - H_INT_GET_QUEUE_INFO

   returns the address of the notification management page associated
   with the specified "target" and "priority".

 - H_INT_SET_QUEUE_CONFIG

   sets or resets the event queue for a given "target" and "priority".
   It is also used to set the notification config associated with the
   queue, only unconditional notification for the moment.  Reset is
   performed with a queue size of 0 and queueing is disabled in that
   case.

 - H_INT_GET_QUEUE_CONFIG

   returns the queue settings for a given "target" and "priority".

 - H_INT_RESET

   resets all of the partition's interrupt exploitation structures to
   their initial state, losing all configuration set via the hcalls
   H_INT_SET_SOURCE_CONFIG and H_INT_SET_QUEUE_CONFIG.

 - H_INT_SYNC

   issue a synchronisation on a source to make sure sure all
   notifications have reached their queue.

As for XICS, the XIVE interface for the guest is described in the
device tree under the "interrupt-controller" node. A couple of new
properties are specific to XIVE :

 - "reg"

   contains the base address and size of the thread interrupt
   managnement areas (TIMA), also called rings, for the User level and
   for the Guest OS level. Only the Guest OS level is taken into
   account today.

 - "ibm,xive-eq-sizes"

   the size of the event queues. One cell per size supported, contains
   log2 of size, in ascending order.

 - "ibm,xive-lisn-ranges"

   the interrupt numbers ranges assigned to the guest. These are
   allocated using a simple bitmap.

and also :

 - "/ibm,plat-res-int-priorities"

   contains a list of priorities that the hypervisor has reserved for
   its own use.

Tested with a QEMU XIVE model for pseries and with the Power hypervisor.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

eac1e731

powerpc/xive: introduce a common routine xive_queue_page_alloc() · 994ea2f4

Cédric Le Goater authored Aug 30, 2017

This routine will be used in the spapr backend. Also introduce a short
xive_alloc_order() helper.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

994ea2f4

01 Sep, 2017 38 commits

powerpc/sstep: Avoid used uninitialized error · 3b79b261

Michael Ellerman authored Sep 02, 2017

Older compilers think val may be used uninitialized:

arch/powerpc/lib/sstep.c: In function 'emulate_loadstore':
arch/powerpc/lib/sstep.c:2758:23: error: 'val' may be used uninitialized in this function

We know better, but initialise val to 0 to avoid breaking the build.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

3b79b261

axonram: Return directly after a failed kzalloc() in axon_ram_probe() · fdbb9457

Markus Elfring authored Aug 03, 2017

* Return directly after a call of the function "kzalloc" failed
  at the beginning.

* Delete a repeated check for the local variable "bank"
  which became unnecessary with this refactoring.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

fdbb9457

axonram: Improve a size determination in axon_ram_probe() · a1bddf39

Markus Elfring authored Aug 03, 2017

Replace the specification of a data structure by a pointer dereference
as the parameter for the operator "sizeof" to make the corresponding size
determination a bit safer according to the Linux coding style convention.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

a1bddf39

axonram: Delete an error message for a failed memory allocation in axon_ram_probe() · c86a9397

Markus Elfring authored Aug 03, 2017

Omit an extra message for a memory allocation failure in this function.

This issue was detected by using the Coccinelle software.

Link: http://events.linuxfoundation.org/sites/events/files/slides/LCJ16-Refactor_Strings-WSang_0.pdfSigned-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

c86a9397

powerpc/powernv/npu: Move tlb flush before launching ATSD · bab9f954

Alistair Popple authored Aug 11, 2017

The nest MMU tlb flush needs to happen before the GPU translation
shootdown is launched to avoid the GPU refilling its tlb with stale
nmmu translations prior to the nmmu flush completing.

Fixes: 1ab66d1f ("powerpc/powernv: Introduce address translation services for Nvlink2")
Cc: stable@vger.kernel.org # v4.12+
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

bab9f954

powerpc/macintosh: constify wf_sensor_ops structures · de854e54

Julia Lawall authored Aug 02, 2017

The wf_sensor_ops structures are only stored in the ops field of a
wf_sensor structure, which is declared as const.  Thus the
wf_sensor_ops structures themselves can be const.

Done with the help of Coccinelle.

// <smpl>
@r disable optional_qualifier@
identifier i;
position p;
@@
static struct wf_sensor_ops i@p = { ... };

@ok1@
identifier r.i;
struct wf_sensor s;
position p;
@@
s.ops = &i@p

@ok2@
identifier r.i;
struct wf_sat_sensor s;
position p;
@@
s.sens.ops = &i@p

@bad@
position p != {r.p,ok1.p,ok2.p};
identifier r.i;
struct wf_sensor_ops e;
@@
e@i@p

@depends on !bad disable optional_qualifier@
identifier r.i;
@@
static
+const
 struct wf_sensor_ops i = { ... };
// </smpl>
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

de854e54

powerpc/iommu: Use permission-specific DEVICE_ATTR variants · 8a7aef2c

Julia Lawall authored Oct 29, 2016

Use DEVICE_ATTR_RW for read-write attributes.  This simplifies the
source code, improves readbility, and reduces the chance of
inconsistencies.
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

8a7aef2c

powerpc/eeh: Delete an error out of memory message at init time · 6ab41161

Markus Elfring authored Aug 04, 2017

Omit an extra message for a memory allocation failure in
eeh_dev_init().

This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
[mpe: Do not drop the message that can happen at runtime and lead to
 an event not being handled]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

6ab41161

powerpc/mm: Use seq_putc() in two functions · aae85e3c

Markus Elfring authored May 07, 2017

Two single characters (line breaks) should be put into a sequence.
Thus use the corresponding function "seq_putc".

This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

aae85e3c

macintosh: Convert to using %pOF instead of full_name · b6a945ae

Rob Herring authored Jul 18, 2017

Now that we have a custom printf format specifier, convert users of
full_name to use %pOF instead. This is preparation to remove storing
of the full path string for each node.
Signed-off-by: Rob Herring <robh@kernel.org>
[mpe: Also convert the two cases inside #if 0]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

b6a945ae

ide: pmac: Convert to using %pOF instead of full_name · 859420e3

Rob Herring authored Jul 18, 2017

Now that we have a custom printf format specifier, convert users of
full_name to use %pOF instead. This is preparation to remove storing
of the full path string for each node.
Signed-off-by: Rob Herring <robh@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: linux-ide@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

859420e3

crypto/nx: Add P9 NX support for 842 compression engine · b0d6c9ba

Haren Myneni authored Aug 31, 2017

This patch adds P9 NX support for 842 compression engine. Virtual
Accelerator Switchboard (VAS) is used to access 842 engine on P9.

For each NX engine per chip, setup receive window using
vas_rx_win_open() which configures RxFIFo with FIFO address, lpid,
pid and tid values. This unique (lpid, pid, tid) combination will
be used to identify the target engine.

For crypto open request, open send window on the NX engine for
the corresponding chip / cpu where the open request is executed.
This send window will be closed upon crypto close request.

NX provides high and normal priority FIFOs. For compression /
decompression requests, we use only hight priority FIFOs in kernel.

Each NX request will be communicated to VAS using copy/paste
instructions with vas_copy_crb() / vas_paste_crb() functions.
Signed-off-by: Haren Myneni <haren@us.ibm.com>
Reviewed-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

b0d6c9ba

crypto/nx: Add P9 NX specific error codes for 842 engine · 146e9f1b

Haren Myneni authored Aug 31, 2017

This patch adds changes for checking P9 specific 842 engine
error codes. These errros are reported in coprocessor status
block (CSB) for failures.
Signed-off-by: Haren Myneni <haren@us.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

146e9f1b

crypto/nx: Use kzalloc for workmem allocation · f0536833

Haren Myneni authored Aug 31, 2017

Send window is opened / closed for each crypto session.
So initializes txwin in workmem.
Signed-off-by: Haren Myneni <haren@us.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

f0536833

crypto/nx: Add nx842_add_coprocs_list function · cd38a8a8

Haren Myneni authored Aug 31, 2017

Updating coprocessor list is moved to nx842_add_coprocs_list().
This function will be used for both icswx and VAS functions.
Signed-off-by: Haren Myneni <haren@us.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

cd38a8a8

crypto/nx: Create nx842_delete_coprocs function · 1ee51b28

Haren Myneni authored Aug 31, 2017

Move deleting coprocessors info upon exit or failure to
nx842_delete_coprocs().
Signed-off-by: Haren Myneni <haren@us.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

1ee51b28

crypto/nx: Create nx842_configure_crb function · 56c10d5e

Haren Myneni authored Aug 31, 2017

Configure CRB is moved to nx842_configure_crb() so that it can
be used for icswx and VAS exec functions. VAS function will be
added later with P9 support.
Signed-off-by: Haren Myneni <haren@us.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

56c10d5e

crypto/nx: Rename nx842_powernv_function as icswx function · c97f8169

Haren Myneni authored Aug 31, 2017

Rename nx842_powernv_function to nx842_powernv_exec.
nx842_powernv_exec points to nx842_exec_icswx and
will be point to VAS exec function which will be added later
for P9 NX support.
Signed-off-by: Haren Myneni <haren@us.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

c97f8169

powerpc/32: remove a NOP from memset() · ad1b0122

Christophe Leroy authored Aug 23, 2017

memset() is patched after initialisation to activate the
optimised part which uses cache instructions.

Today we have a 'b 2f' to skip the optimised patch, which then gets
replaced by a NOP, implying a useless cycle consumption.
As we have a 'bne 2f' just before, we could use that instruction
for the live patching, hence removing the need to have a
dedicated 'b 2f' to be replaced by a NOP.

This patch changes the 'bne 2f' by a 'b 2f'. During init, that
'b 2f' is then replaced by 'bne 2f'
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

ad1b0122

powerpc/32: optimise memset() · 7bf6057b

Christophe Leroy authored Aug 23, 2017

There is no need to extend the set value to an int when the length
is lower than 4 as in that case we only do byte stores.
We can therefore immediately branch to the part handling it.
By separating it from the normal case, we are able to eliminate
a few actions on the destination pointer.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

7bf6057b

powerpc: fix location of two EXPORT_SYMBOL · c0622167

Christophe Leroy authored Aug 23, 2017

Commit 9445aa1a ("ppc: move exports to definitions")
added EXPORT_SYMBOL() for memset() and flush_hash_pages() in
the middle of the functions.

This patch moves them at the end of the two functions.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

c0622167

powerpc/32: add memset16() · da74f659

Christophe Leroy authored Aug 23, 2017

Commit 694fc88c ("powerpc/string: Implement optimized
memset variants") added memset16(), memset32() and memset64()
for the 64 bits PPC.

On 32 bits, memset64() is not relevant, and as shown below,
the generic version of memset32() gives a good code, so only
memset16() is candidate for an optimised version.

000009c0 <memset32>:
 9c0:   2c 05 00 00     cmpwi   r5,0
 9c4:   39 23 ff fc     addi    r9,r3,-4
 9c8:   4d 82 00 20     beqlr
 9cc:   7c a9 03 a6     mtctr   r5
 9d0:   94 89 00 04     stwu    r4,4(r9)
 9d4:   42 00 ff fc     bdnz    9d0 <memset32+0x10>
 9d8:   4e 80 00 20     blr

The last part of memset() handling the not 4-bytes multiples
operates on bytes, making it unsuitable for handling word without
modification. As it would increase memset() complexity, it is
better to implement memset16() from scratch. In addition it
has the advantage of allowing a more optimised memset16() than what
we would have by using the memset() function.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

da74f659

powerpc: Wrap register number correctly for string load/store instructions · 45f62159

Paul Mackerras authored Sep 01, 2017

Michael Ellerman reported that emulate_loadstore() was trying to
access element 32 of regs->gpr[], which doesn't exist, when
emulating a string store instruction.  This is because the string
load and store instructions (lswi, lswx, stswi and stswx) are
defined to wrap around from register 31 to register 0 if the number
of bytes being loaded or stored is sufficiently large.  This wrapping
was not implemented in the emulation code.  To fix it, we mask the
register number after incrementing it.
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Fixes: c9f6f4ed ("powerpc: Implement emulation of string loads and stores")
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

45f62159

powerpc: Emulate load/store floating point as integer word instructions · d2b65ac6

Paul Mackerras authored Aug 30, 2017

This adds emulation for the lfiwax, lfiwzx and stfiwx instructions.
This necessitated adding a new flag to indicate whether a floating
point or an integer conversion was needed for LOAD_FP and STORE_FP,
so this moves the size field in op->type up 4 bits.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

d2b65ac6

powerpc: Use instruction emulation infrastructure to handle alignment faults · 31bfdb03

Paul Mackerras authored Aug 30, 2017

This replaces almost all of the instruction emulation code in
fix_alignment() with calls to analyse_instr(), emulate_loadstore()
and emulate_dcbz().  The only emulation code left is the SPE
emulation code; analyse_instr() etc. do not handle SPE instructions
at present.

One result of this is that we can now handle alignment faults on
all the new VSX load and store instructions that were added in POWER9.
VSX loads/stores will take alignment faults for unaligned accesses
to cache-inhibited memory.

Another effect is that we no longer rely on the DAR and DSISR values
set by the processor.

With this, we now need to include the instruction emulation code
unconditionally.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

31bfdb03

powerpc: Separate out load/store emulation into its own function · a53d5182

Paul Mackerras authored Aug 30, 2017

This moves the parts of emulate_step() that deal with emulating
load and store instructions into a new function called
emulate_loadstore().  This is to make it possible to reuse this
code in the alignment handler.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

a53d5182

powerpc: Handle opposite-endian processes in emulation code · d955189a

Paul Mackerras authored Aug 30, 2017

This adds code to the load and store emulation code to byte-swap
the data appropriately when the process being emulated is set to
the opposite endianness to that of the kernel.

This also enables the emulation for the multiple-register loads
and stores (lmw, stmw, lswi, stswi, lswx, stswx) to work for
little-endian.  In little-endian mode, the partial word at the
end of a transfer for lsw*/stsw* (when the byte count is not a
multiple of 4) is loaded/stored at the least-significant end of
the register.  Additionally, this fixes a bug in the previous
code in that it could call read_mem/write_mem with a byte count
that was not 1, 2, 4 or 8.

Note that this only works correctly on processors with "true"
little-endian mode, such as IBM POWER processors from POWER6 on, not
the so-called "PowerPC" little-endian mode that uses address swizzling
as implemented on the old 32-bit 603, 604, 740/750, 74xx CPUs.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

d955189a

powerpc: Set regs->dar if memory access fails in emulate_step() · b9da9c8a

Paul Mackerras authored Aug 30, 2017

This adds code to the instruction emulation code to set regs->dar
to the address of any memory access that fails.  This address is
not necessarily the same as the effective address of the instruction,
because if the memory access is unaligned, it might cross a page
boundary and fault on the second page.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

b9da9c8a

powerpc: Emulate the dcbz instruction · b2543f7b

Paul Mackerras authored Aug 30, 2017

This adds code to analyse_instr() and emulate_step() to understand the
dcbz (data cache block zero) instruction.  The emulate_dcbz() function
is made public so it can be used by the alignment handler in future.
(The apparently unnecessary cropping of the address to 32 bits is
there because it will be needed in that situation.)
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

b2543f7b

powerpc: Emulate load/store floating double pair instructions · 1f41fb79

Paul Mackerras authored Aug 30, 2017

This adds lfdp[x] and stfdp[x] to the set of instructions that
analyse_instr() and emulate_step() understand.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

1f41fb79

powerpc: Emulate vector element load/store instructions · e61ccc7b

Paul Mackerras authored Aug 30, 2017

This adds code to analyse_instr() and emulate_step() to handle the
vector element loads and stores:

lvebx, lvehx, lvewx, stvebx, stvehx, stvewx.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

e61ccc7b

powerpc: Emulate FP/vector/VSX loads/stores correctly when regs not live · c22435a5

Paul Mackerras authored Aug 30, 2017

At present, the analyse_instr/emulate_step code checks for the
relevant MSR_FP/VEC/VSX bit being set when a FP/VMX/VSX load
or store is decoded, but doesn't recheck the bit before reading or
writing the relevant FP/VMX/VSX register in emulate_step().

Since we don't have preemption disabled, it is possible that we get
preempted between checking the MSR bit and doing the register access.
If that happened, then the registers would have been saved to the
thread_struct for the current process. Accesses to the CPU registers
would then potentially read stale values, or write values that would
never be seen by the user process.

Another way that the registers can become non-live is if a page
fault occurs when accessing user memory, and the page fault code
calls a copy routine that wants to use the VMX or VSX registers.

To fix this, the code for all the FP/VMX/VSX loads gets restructured
so that it forms an image in a local variable of the desired register
contents, then disables preemption, checks the MSR bit and either
sets the CPU register or writes the value to the thread struct.
Similarly, the code for stores checks the MSR bit, copies either the
CPU register or the thread struct to a local variable, then reenables
preemption and then copies the register image to memory.

If the instruction being emulated is in the kernel, then we must not
use the register values in the thread_struct. In this case, if the
relevant MSR enable bit is not set, then emulate_step refuses to
emulate the instruction.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

c22435a5

powerpc: Make load/store emulation use larger memory accesses · e0a0986b

Paul Mackerras authored Aug 30, 2017

At the moment, emulation of loads and stores of up to 8 bytes to
unaligned addresses on a little-endian system uses a sequence of
single-byte loads or stores to memory.  This is rather inefficient,
and the code is hard to follow because it has many ifdefs.
In addition, the Power ISA has requirements on how unaligned accesses
are performed, which are not met by doing all accesses as
sequences of single-byte accesses.

Emulation of VSX loads and stores uses __copy_{to,from}_user,
which means the emulation code has no control on the size of
accesses.

To simplify this, we add new copy_mem_in() and copy_mem_out()
functions for accessing memory.  These use a sequence of the largest
possible aligned accesses, up to 8 bytes (or 4 on 32-bit systems),
to copy memory between a local buffer and user memory.  We then
rewrite {read,write}_mem_unaligned and the VSX load/store
emulation using these new functions.

These new functions also simplify the code in do_fp_load() and
do_fp_store() for the unaligned cases.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

e0a0986b

powerpc: Add emulation for the addpcis instruction · 958465ee

Paul Mackerras authored Aug 30, 2017

The addpcis instruction puts the sum of the next instruction address
plus a constant into a register. Since the result depends on the
address of the instruction, it will give an incorrect result if it
is single-stepped out of line, which is what the *probes subsystem
will currently do if a probe is placed on an addpcis instruction.
This fixes the problem by adding emulation of it to analyse_instr().
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

958465ee

powerpc: Don't update CR0 in emulation of popcnt, prty, bpermd instructions · 5762e083

Paul Mackerras authored Aug 30, 2017

The architecture shows the least-significant bit of the instruction
word as reserved for the popcnt[bwd], prty[wd] and bpermd
instructions, that is, these instructions never update CR0.
Therefore this changes the emulation of these instructions to
skip the CR0 update.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

5762e083

powerpc: Fix emulation of the isel instruction · f1bbb99f

Paul Mackerras authored Aug 30, 2017

The case added for the isel instruction was added inside a switch
statement which uses the 10-bit minor opcode field in the 0x7fe
bits of the instruction word.  However, for the isel instruction,
the minor opcode field is only the 0x3e bits, and the 0x7c0 bits
are used for the "BC" field, which indicates which CR bit to use
to select the result.

Therefore, for the isel emulation to work correctly when BC != 0,
we need to match on ((instr >> 1) & 0x1f) == 15).  To do this, we
pull the isel case out of the switch statement and put it in an
if statement of its own.

Fixes: e27f71e5 ("powerpc/lib/sstep: Add isel instruction emulation")
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

f1bbb99f

powerpc/64: Fix update forms of loads and stores to write 64-bit EA · d120cdbc

Paul Mackerras authored Aug 30, 2017

When a 64-bit processor is executing in 32-bit mode, the update forms
of load and store instructions are required by the architecture to
write the full 64-bit effective address into the RA register, though
only the bottom 32 bits are used to address memory.  Currently,
the instruction emulation code writes the truncated address to the
RA register.  This fixes it by keeping the full 64-bit EA in the
instruction_op structure, truncating the address in emulate_step()
where it is used to address memory, rather than in the address
computations in analyse_instr().
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

d120cdbc

powerpc: Handle most loads and stores in instruction emulation code · 350779a2

Paul Mackerras authored Aug 30, 2017

This extends the instruction emulation infrastructure in sstep.c to
handle all the load and store instructions defined in the Power ISA
v3.0, except for the atomic memory operations, ldmx (which was never
implemented), lfdp/stfdp, and the vector element load/stores.

The instructions added are:

Integer loads and stores: lbarx, lharx, lqarx, stbcx., sthcx., stqcx.,
lq, stq.

VSX loads and stores: lxsiwzx, lxsiwax, stxsiwx, lxvx, lxvl, lxvll,
lxvdsx, lxvwsx, stxvx, stxvl, stxvll, lxsspx, lxsdx, stxsspx, stxsdx,
lxvw4x, lxsibzx, lxvh8x, lxsihzx, lxvb16x, stxvw4x, stxsibx, stxvh8x,
stxsihx, stxvb16x, lxsd, lxssp, lxv, stxsd, stxssp, stxv.

These instructions are handled both in the analyse_instr phase and in
the emulate_step phase.

The code for lxvd2ux and stxvd2ux has been taken out, as those
instructions were never implemented in any processor and have been
taken out of the architecture, and their opcodes have been reused for
other instructions in POWER9 (lxvb16x and stxvb16x).

The emulation for the VSX loads and stores uses helper functions
which don't access registers or memory directly, which can hopefully
be reused by KVM later.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

350779a2