1. 01 Sep, 2017 8 commits
    • powerpc: Make load/store emulation use larger memory accesses · e0a0986b
      Paul Mackerras authored
      At the moment, emulation of loads and stores of up to 8 bytes to
      unaligned addresses on a little-endian system uses a sequence of
      single-byte loads or stores to memory.  This is rather inefficient,
      and the code is hard to follow because it has many ifdefs.
      In addition, the Power ISA has requirements on how unaligned accesses
      are performed, which are not met by doing all accesses as
      sequences of single-byte accesses.
      
      Emulation of VSX loads and stores uses __copy_{to,from}_user,
      which means the emulation code has no control over the size of
      accesses.
      
      To simplify this, we add new copy_mem_in() and copy_mem_out()
      functions for accessing memory.  These use a sequence of the largest
      possible aligned accesses, up to 8 bytes (or 4 on 32-bit systems),
      to copy memory between a local buffer and user memory.  We then
      rewrite {read,write}_mem_unaligned and the VSX load/store
      emulation using these new functions.
      
      These new functions also simplify the code in do_fp_load() and
      do_fp_store() for the unaligned cases.
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
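The "largest possible aligned accesses" idea from this commit can be sketched in user-space C. This is only an illustration, not the kernel's copy_mem_in(): the names max_align_sketch and copy_in_sketch are hypothetical, an 8-byte machine word is assumed, ea is treated as an offset into a local array, and memcpy stands in for each aligned load.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Largest power-of-two access size (1, 2, 4 or 8) that divides x.
 * Assumption: 8-byte machine word, as on a 64-bit system. */
static unsigned long max_align_sketch(unsigned long x)
{
    x |= 8;          /* cap the result at 8 bytes */
    return x & -x;   /* isolate the lowest set bit */
}

/*
 * Copy nb bytes from mem[ea..] into dest using the largest aligned
 * chunks available.  The size of each access is recorded in sizes[]
 * (the caller provides room for nb entries).  Returns the number of
 * accesses made.
 */
static int copy_in_sketch(uint8_t *dest, const uint8_t *mem,
                          unsigned long ea, int nb, int *sizes)
{
    int n = 0;

    assert(nb >= 0);
    while (nb > 0) {
        unsigned long c = max_align_sketch(ea);

        /* Do not read past the end of the requested range. */
        if (c > (unsigned long)nb)
            c = max_align_sketch((unsigned long)nb);
        memcpy(dest, mem + ea, c);   /* stands in for one aligned load */
        sizes[n++] = (int)c;
        dest += c;
        ea += c;
        nb -= (int)c;
    }
    return n;
}
```

With ea = 3 and nb = 8, the sketch issues accesses of 1, 4, 1, 1 and 1 bytes instead of eight single-byte accesses. An 8-byte copy from an 8-byte-aligned ea takes a single access.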
    • powerpc: Add emulation for the addpcis instruction · 958465ee
      Paul Mackerras authored
      The addpcis instruction puts the sum of the next instruction address
      plus a constant into a register.  Since the result depends on the
      address of the instruction, it will give an incorrect result if it
      is single-stepped out of line, which is what the *probes subsystem
      will currently do if a probe is placed on an addpcis instruction.
      This fixes the problem by adding emulation of it to analyse_instr().
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
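To see why addpcis cannot be single-stepped out of line, it helps to write down the value it produces. The sketch below assumes the constant is a signed 16-bit value shifted left by 16 bits; the function name addpcis_result is hypothetical, and decoding the constant from the instruction word is left out.

```c
#include <stdint.h>

/*
 * Value that addpcis writes to its target register: the address of the
 * next instruction (nip + 4) plus the signed 16-bit constant d shifted
 * left by 16 bits.
 */
static uint64_t addpcis_result(uint64_t nip, int16_t d)
{
    /* Multiply rather than shift so that a negative d is well defined. */
    int64_t offset = (int64_t)d * 65536;

    return nip + 4 + (uint64_t)offset;
}
```

Because nip is part of the result, running a copy of the instruction at a different address changes the value written, which is the error the emulation avoids.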
    • powerpc: Don't update CR0 in emulation of popcnt, prty, bpermd instructions · 5762e083
      Paul Mackerras authored
      The architecture shows the least-significant bit of the instruction
      word as reserved for the popcnt[bwd], prty[wd] and bpermd
      instructions, that is, these instructions never update CR0.
      Therefore this changes the emulation of these instructions to
      skip the CR0 update.
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc: Fix emulation of the isel instruction · f1bbb99f
      Paul Mackerras authored
      The case for the isel instruction was added inside a switch
      statement which uses the 10-bit minor opcode field in the 0x7fe
      bits of the instruction word.  However, for the isel instruction,
      the minor opcode field is only the 0x3e bits, and the 0x7c0 bits
      are used for the "BC" field, which indicates which CR bit to use
      to select the result.
      
      Therefore, for the isel emulation to work correctly when BC != 0,
      we need to match on ((instr >> 1) & 0x1f) == 15.  To do this, we
      pull the isel case out of the switch statement and put it in an
      if statement of its own.
      
      Fixes: e27f71e5 ("powerpc/lib/sstep: Add isel instruction emulation")
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
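The difference between the two matches is easy to demonstrate on an encoded instruction word. The sketch below is illustrative only: encode_isel, matches_10bit and matches_5bit are hypothetical helpers, and the encoding assumes the usual layout with primary opcode 31 and the RT, RA and RB fields at shifts 21, 16 and 11.

```c
#include <stdbool.h>
#include <stdint.h>

/* Build an isel instruction word: primary opcode 31, minor opcode 15
 * in the 0x3e bits, BC in the 0x7c0 bits. */
static uint32_t encode_isel(unsigned rt, unsigned ra, unsigned rb, unsigned bc)
{
    return (31u << 26) | ((rt & 0x1f) << 21) | ((ra & 0x1f) << 16) |
           ((rb & 0x1f) << 11) | ((bc & 0x1f) << 6) | (15u << 1);
}

/* What a case label in a switch on the 10-bit field tests:
 * succeeds only when BC == 0. */
static bool matches_10bit(uint32_t instr)
{
    return ((instr >> 1) & 0x3ff) == 15;
}

/* Match on the 5-bit minor opcode only, ignoring BC. */
static bool matches_5bit(uint32_t instr)
{
    return ((instr >> 1) & 0x1f) == 15;
}
```

For BC = 0 both tests succeed. For any other BC value the 10-bit field also contains the BC bits, so only the 5-bit test recognises the instruction.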
    • powerpc/64: Fix update forms of loads and stores to write 64-bit EA · d120cdbc
      Paul Mackerras authored
      When a 64-bit processor is executing in 32-bit mode, the update forms
      of load and store instructions are required by the architecture to
      write the full 64-bit effective address into the RA register, though
      only the bottom 32 bits are used to address memory.  Currently,
      the instruction emulation code writes the truncated address to the
      RA register.  This fixes it by keeping the full 64-bit EA in the
      instruction_op structure, truncating the address in emulate_step()
      where it is used to address memory, rather than in the address
      computations in analyse_instr().
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
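The separation between the address used for memory and the value written back to RA can be sketched as follows. The struct and function names are hypothetical and the code only models the behaviour described in the commit message; it is not the kernel's instruction_op handling.

```c
#include <stdbool.h>
#include <stdint.h>

struct update_access {
    uint64_t mem_addr;  /* address used to access memory */
    uint64_t ra_value;  /* value written back to RA */
};

static struct update_access update_form_sketch(uint64_t base, int64_t disp,
                                               bool mode_64bit)
{
    struct update_access a;
    uint64_t ea = base + (uint64_t)disp;  /* full 64-bit effective address */

    a.ra_value = ea;                      /* RA always receives all 64 bits */
    /* Truncate only at the point where memory is addressed. */
    a.mem_addr = mode_64bit ? ea : (ea & 0xffffffffu);
    return a;
}
```

With base 0xfffffff0 and displacement 0x20 in 32-bit mode, memory is accessed at 0x10 while RA receives 0x100000010.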
    • powerpc: Handle most loads and stores in instruction emulation code · 350779a2
      Paul Mackerras authored
      This extends the instruction emulation infrastructure in sstep.c to
      handle all the load and store instructions defined in the Power ISA
      v3.0, except for the atomic memory operations, ldmx (which was never
      implemented), lfdp/stfdp, and the vector element load/stores.
      
      The instructions added are:
      
      Integer loads and stores: lbarx, lharx, lqarx, stbcx., sthcx., stqcx.,
      lq, stq.
      
      VSX loads and stores: lxsiwzx, lxsiwax, stxsiwx, lxvx, lxvl, lxvll,
      lxvdsx, lxvwsx, stxvx, stxvl, stxvll, lxsspx, lxsdx, stxsspx, stxsdx,
      lxvw4x, lxsibzx, lxvh8x, lxsihzx, lxvb16x, stxvw4x, stxsibx, stxvh8x,
      stxsihx, stxvb16x, lxsd, lxssp, lxv, stxsd, stxssp, stxv.
      
      These instructions are handled both in the analyse_instr phase and in
      the emulate_step phase.
      
      The code for lxvd2ux and stxvd2ux has been taken out, as those
      instructions were never implemented in any processor and have been
      taken out of the architecture, and their opcodes have been reused for
      other instructions in POWER9 (lxvb16x and stxvb16x).
      
      The emulation for the VSX loads and stores uses helper functions
      which don't access registers or memory directly, which can hopefully
      be reused by KVM later.
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
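The design point about helpers that "don't access registers or memory directly" can be illustrated with a word-splat load. This is a hypothetical sketch, not one of the kernel's helpers: struct vsr_image and load_word_splat_sketch are made-up names, and the register image is assumed to be held as 16 bytes in memory byte order.

```c
#include <stdint.h>
#include <string.h>

/* 16-byte image of a VSX register (hypothetical type). */
struct vsr_image {
    uint8_t b[16];
};

/*
 * Load-word-and-splat style helper: replicate the 4 bytes in mem into
 * all four word slots of the image.  It touches neither CPU registers
 * nor user memory; the caller fetches mem and later writes the image
 * into the real register.
 */
static void load_word_splat_sketch(struct vsr_image *reg, const uint8_t mem[4])
{
    for (int i = 0; i < 4; i++)
        memcpy(&reg->b[4 * i], mem, 4);
}
```

Keeping the helper free of register and memory accesses means a different caller, such as a hypervisor working on saved guest state, could supply its own buffers and reuse the same conversion logic.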
    • powerpc: Don't check MSR FP/VMX/VSX enable bits in analyse_instr() · ee0a54d7
      Paul Mackerras authored
      This removes the checks for the FP/VMX/VSX enable bits in the MSR
      from analyse_instr() and adds them to emulate_step() instead.
      
      The reason for this is that we may want to use analyse_instr() in
      a situation where the FP/VMX/VSX register values are stored in the
      current thread_struct and the FP/VMX/VSX enable bits in the MSR
      image in the pt_regs are zero.  Since analyse_instr() doesn't make
      any changes to register state, it is reasonable for it to indicate
      what the effect of an instruction would be even though the relevant
      enable bit is off.
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
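A minimal sketch of doing the enable-bit check at emulation time rather than at analysis time is given below. Everything here is illustrative: the enum, the SKETCH_MSR_* constants and both functions are hypothetical names, and the bit positions are chosen for the example rather than taken from the kernel headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative enable bits (assumed positions, for this sketch only). */
#define SKETCH_MSR_FP   (1ull << 13)
#define SKETCH_MSR_VSX  (1ull << 23)
#define SKETCH_MSR_VEC  (1ull << 25)

enum op_class { CLASS_INT, CLASS_FP, CLASS_VMX, CLASS_VSX };

/* Which enable bit, if any, an instruction class needs. */
static uint64_t required_msr_bit(enum op_class c)
{
    switch (c) {
    case CLASS_FP:  return SKETCH_MSR_FP;
    case CLASS_VMX: return SKETCH_MSR_VEC;
    case CLASS_VSX: return SKETCH_MSR_VSX;
    default:        return 0;
    }
}

/* Emulation-time check: analysis has already classified the instruction
 * without looking at the MSR; only here is the enable bit consulted. */
static bool emulate_permitted(enum op_class c, uint64_t msr)
{
    uint64_t need = required_msr_bit(c);

    return need == 0 || (msr & need) != 0;
}
```

In this arrangement the analysis step can describe an FP instruction even when the FP bit is clear, and only the step that would actually touch FP state refuses to proceed.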
    • powerpc: Change analyse_instr so it doesn't modify *regs · 3cdfcbfd
      Paul Mackerras authored
      The analyse_instr function currently doesn't just work out what an
      instruction does, it also executes those instructions whose effect
      is only to update CPU registers that are stored in struct pt_regs.
      This is undesirable because optprobes uses analyse_instr to work out
      if an instruction could be successfully emulated in future.
      
      This changes analyse_instr so it doesn't modify *regs; instead it
      stores information in the instruction_op structure to indicate what
      registers (GPRs, CR, XER, LR) would be set and what value they would
      be set to.  A companion function called emulate_update_regs() can
      then use that information to update a pt_regs struct appropriately.
      
      As a minor cleanup, this replaces inline asm using the cntlzw and
      cntlzd instructions with calls to __builtin_clz() and __builtin_clzl().
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
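The analyse-then-update split can be sketched with a single instruction, cntlzw, which also shows the builtin mentioned in the cleanup. The types and functions below are hypothetical stand-ins for pt_regs, instruction_op, analyse_instr() and emulate_update_regs(); the sketch assumes a GCC or Clang compiler with a 32-bit unsigned int.

```c
#include <stdint.h>

struct regs_sketch {
    uint64_t gpr[32];
};

struct op_sketch {
    int reg;        /* GPR that would be written */
    uint64_t val;   /* value it would receive */
};

/* Count leading zeros of a 32-bit value, as cntlzw does.
 * __builtin_clz(0) is undefined, so zero is handled separately. */
static uint64_t cntlzw_sketch(uint32_t v)
{
    return v ? (uint64_t)__builtin_clz(v) : 32;
}

/* Analysis: read-only with respect to regs; the effect goes into op. */
static void analyse_cntlzw_sketch(struct op_sketch *op,
                                  const struct regs_sketch *regs,
                                  int rs, int ra)
{
    op->reg = ra;
    op->val = cntlzw_sketch((uint32_t)regs->gpr[rs]);
}

/* Update: apply a previously analysed effect to regs. */
static void update_regs_sketch(struct regs_sketch *regs,
                               const struct op_sketch *op)
{
    regs->gpr[op->reg] = op->val;
}
```

A caller that only wants to know whether an instruction can be emulated runs the analysis step and discards op; a caller that wants the effect runs both steps in sequence.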
  2. 31 Aug, 2017 32 commits