1. 06 Nov, 2017 15 commits
    • powerpc/tm: Don't check for WARN in TM Bad Thing handling · 632f0574
      Michael Ellerman authored
      Currently when we take a TM Bad Thing program check exception, we
      search the bug table to see if the program check was generated by a
      WARN/WARN_ON etc.
      
      That makes no sense: the WARN macros use trap instructions, which
      should never generate a TM Bad Thing exception. If they ever did, that
      would be a bug and we should oops.
      
      We do have some hand-coded bugs in tm.S, using EMIT_BUG_ENTRY, but
      those are all BUGs not WARNs, and they all use trap instructions
      anyway. Almost certainly this check was incorrectly copied from the
      REASON_TRAP handling in the same function.
      
      Remove it.
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Acked-by: Michael Neuling <mikey@neuling.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      632f0574
    • powerpc/mm: Add a CONFIG option to choose if radix is used by default · 1fd6c022
      Michael Ellerman authored
      Currently if the hardware supports the radix MMU we will use
      it, *unless* "disable_radix" is passed on the kernel command line.
      
      However, some users would like the reverse semantics, i.e. the kernel
      uses the hash MMU by default, unless radix is explicitly requested on
      the command line.
      
      So add a CONFIG option to choose whether we use radix by default or
      not, and expand the disable_radix command line option to allow
      "disable_radix=no" which *enables* radix.
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      1fd6c022
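The option handling described in this commit can be sketched in plain C as below. This is a userspace model, not the kernel code: the function name `radix_wanted`, the `default_on` parameter standing in for the CONFIG choice, the acceptance of "yes", and the fallback for unrecognised values are all assumptions made for illustration.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Decide whether radix should be used, given the value of a
 * "disable_radix" style option (hypothetical helper).
 *   arg == NULL : option absent, use the build-time default
 *   arg == ""   : bare "disable_radix", radix is disabled
 *   "no"        : "disable_radix=no", radix is explicitly enabled
 *   "yes"       : assumed to behave like the bare option
 */
static bool radix_wanted(const char *arg, bool default_on)
{
    if (arg == NULL)
        return default_on;
    if (arg[0] == '\0' || strcmp(arg, "yes") == 0)
        return false;
    if (strcmp(arg, "no") == 0)
        return true;
    /* Unrecognised value: keep the default (an assumption of this sketch). */
    return default_on;
}
```

With `default_on` set to false, only an explicit `disable_radix=no` turns radix on, which is the reverse semantics some users asked for.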
    • powerpc/64s: Replace CONFIG_PPC_STD_MMU_64 with CONFIG_PPC_BOOK3S_64 · 4e003747
      Michael Ellerman authored
      CONFIG_PPC_STD_MMU_64 indicates support for the "standard" powerpc MMU
      on 64-bit CPUs. The "standard" MMU refers to the hash page table MMU
      found in "server" processors, from IBM mainly.
      
      Currently CONFIG_PPC_STD_MMU_64 is == CONFIG_PPC_BOOK3S_64. While it's
      annoying to have two symbols that always have the same value, it's not
      quite annoying enough to bother removing one.
      
      However with the arrival of Power9, we now have the situation where
      CONFIG_PPC_STD_MMU_64 is enabled, but the kernel is running using the
      Radix MMU - *not* the "standard" MMU. So it is now actively confusing
      to use it, because it implies that code is disabled or inactive when
      the Radix MMU is in use, however that is not necessarily true.
      
      So s/CONFIG_PPC_STD_MMU_64/CONFIG_PPC_BOOK3S_64/, and do some minor
      formatting updates of some of the affected lines.
      
      This will be a pain for backports, but c'est la vie.
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      4e003747
    • powerpc/64: Free up CPU_FTR_ICSWX · c1807e3f
      Michael Ellerman authored
      The last user of CPU_FTR_ICSWX was removed in commit
      6ff4d3e9 ("powerpc: Remove old unused icswx based coprocessor
      support"), so free the bit up for future use.
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      c1807e3f
    • powerpc/mm/hash: Add pr_fmt() to hash_utils64.c · 7f142661
      Aneesh Kumar K.V authored
      Make the printks look a bit nicer by adding a prefix.
      
      The radix config currently prints:
       radix-mmu: Page sizes from device-tree:
       radix-mmu: Page size shift = 12 AP=0x0
       radix-mmu: Page size shift = 16 AP=0x5
       radix-mmu: Page size shift = 21 AP=0x1
       radix-mmu: Page size shift = 30 AP=0x2
      
      This patch updates the hash config to produce similar dmesg output. With the patch we have:
      
       hash-mmu: Page sizes from device-tree:
       hash-mmu: base_shift=12: shift=12, sllp=0x0000, avpnm=0x00000000, tlbiel=1, penc=0
       hash-mmu: base_shift=12: shift=16, sllp=0x0000, avpnm=0x00000000, tlbiel=1, penc=7
       hash-mmu: base_shift=12: shift=24, sllp=0x0000, avpnm=0x00000000, tlbiel=1, penc=56
       hash-mmu: base_shift=16: shift=16, sllp=0x0110, avpnm=0x00000000, tlbiel=1, penc=1
       hash-mmu: base_shift=16: shift=24, sllp=0x0110, avpnm=0x00000000, tlbiel=1, penc=8
       hash-mmu: base_shift=20: shift=20, sllp=0x0111, avpnm=0x00000000, tlbiel=0, penc=2
       hash-mmu: base_shift=24: shift=24, sllp=0x0100, avpnm=0x00000001, tlbiel=0, penc=0
       hash-mmu: base_shift=34: shift=34, sllp=0x0120, avpnm=0x000007ff, tlbiel=0, penc=3
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      7f142661
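The prefixing idea behind `pr_fmt()` can be imitated in userspace roughly as follows. This is only a sketch: `pr_info_buf` and `format_page_size` are hypothetical helpers that write into a buffer instead of the kernel log, and the `##__VA_ARGS__` form relies on a GNU C extension supported by gcc and clang.

```c
#include <stdio.h>

/* Prefix glued onto every format string by literal concatenation. */
#define pr_fmt(fmt) "hash-mmu: " fmt

/* Userspace stand-in for pr_info(): formats into a caller-supplied buffer. */
#define pr_info_buf(buf, size, fmt, ...) \
    snprintf((buf), (size), pr_fmt(fmt), ##__VA_ARGS__)

/* Hypothetical helper producing a shortened version of the lines shown above. */
static int format_page_size(char *buf, size_t size, int base_shift, int shift)
{
    return pr_info_buf(buf, size, "base_shift=%d: shift=%d\n", base_shift, shift);
}
```

Because the prefix is applied by the macro, individual call sites do not need to repeat it.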
    • powerpc/ipic: Fix status get and status clear · 6b148a7c
      Christophe Leroy authored
      IPIC Status is provided by register IPIC_SERSR and not by IPIC_SERMR
      which is the mask register.
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      6b148a7c
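The distinction between the mask and status registers can be illustrated with a toy register model. The struct and function names below are hypothetical, and the clear operation simply drops the bit in the model; how the real hardware register is cleared is not stated in the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy register file: one mask register and one status register. */
struct ipic_regs {
    uint32_t sermr;  /* mask: which error sources may raise an interrupt */
    uint32_t sersr;  /* status: which error sources are currently pending */
};

/* Status must be read from the status register, not from the mask. */
static bool ipic_status_pending(const struct ipic_regs *r, uint32_t bit)
{
    return (r->sersr & bit) != 0;
}

/* Clear a pending bit in the model; the mask register is left untouched. */
static void ipic_status_clear(struct ipic_regs *r, uint32_t bit)
{
    r->sersr &= ~bit;
}
```

Reading the mask instead of the status would report a source as pending merely because it is enabled, which is the class of bug the commit describes.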
    • powerpc/powernv: Reserve a hole which appears after enabling IOV · d6f934fd
      Alexey Kardashevskiy authored
      In order to make generic IOV code work, the physical function IOV BAR
      should start from offset of the first VF. Since M64 segments share
      PE number space across PHB, and some PEs may be in use at the time
      when IOV is enabled, the existing code shifts the IOV BAR to the index
      of the first PE/VF. This creates a hole in IOMEM space which can be
      potentially taken by some other device.
      
      This reserves a temporary hole on a parent and releases it when IOV is
      disabled; the temporary resources are stored in pci_dn to avoid
      kmalloc/free.
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Acked-by: Bjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      d6f934fd
    • powerpc/pseries/vio: Dispose of virq mapping on vdevice unregister · b8f89fea
      Tyrel Datwyler authored
      When a vdevice is DLPAR removed from the system the vio subsystem
      doesn't bother unmapping the virq from the irq_domain. As a result we
      have a virq mapped to a hardware irq that is no longer valid for the
      irq_domain. A side effect is that we are left with /proc/irq/<irq#>
      affinity entries, and attempts to modify the smp_affinity of the irq
      will fail.
      
      In the following observed example the kernel log is spammed by
      ics_rtas_set_affinity errors after the removal of a VSCSI adapter.
      This is a result of irqbalance trying to adjust the affinity every 10
      seconds.
      
        rpadlpar_io: slot U8408.E8E.10A7ACV-V5-C25 removed
        ics_rtas_set_affinity: ibm,set-xive irq=655385 returns -3
        ics_rtas_set_affinity: ibm,set-xive irq=655385 returns -3
      
      This patch fixes the issue by calling irq_dispose_mapping() on the
      virq of the viodev on unregister.
      
      Fixes: f2ab6219 ("powerpc/pseries: Add PFO support to the VIO bus")
      Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      b8f89fea
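The effect of disposing the mapping on unregister can be shown with a toy mapping table. Everything here (`toy_domain`, `toy_viodev`, and the `toy_*` functions) is a hypothetical userspace model and does not reproduce the kernel's irq_domain API; it only shows why a stale entry remains when the unregister path skips the dispose step.

```c
#define MAX_VIRQ 8

/* Toy irq domain: virq index -> hardware irq number, 0 meaning unmapped. */
struct toy_domain {
    unsigned int hwirq[MAX_VIRQ];
};

/* Toy virtual device holding the virq it was given at registration. */
struct toy_viodev {
    unsigned int irq;
};

/* Map hwirq to the first free virq; virq 0 is reserved as "no irq". */
static unsigned int toy_create_mapping(struct toy_domain *d, unsigned int hwirq)
{
    for (unsigned int virq = 1; virq < MAX_VIRQ; virq++) {
        if (d->hwirq[virq] == 0) {
            d->hwirq[virq] = hwirq;
            return virq;
        }
    }
    return 0; /* table full */
}

/* Drop the virq -> hwirq association so the slot can be reused. */
static void toy_dispose_mapping(struct toy_domain *d, unsigned int virq)
{
    if (virq != 0 && virq < MAX_VIRQ)
        d->hwirq[virq] = 0;
}

/* Unregister path: dispose of the mapping before forgetting the virq. */
static void toy_viodev_unregister(struct toy_domain *d, struct toy_viodev *dev)
{
    toy_dispose_mapping(d, dev->irq);
    dev->irq = 0;
}
```

Without the dispose call, the table entry would keep pointing at a hardware irq that no longer exists, mirroring the leftover /proc/irq entries described above.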
    • powerpc/64s/radix: Fix process table entry cache invalidation · 30b49ec7
      Nicholas Piggin authored
      According to the architecture, the process table entry cache must be
      flushed with tlbie RIC=2.
      
      Currently the process table entry is set to invalid right before the
      PID is returned to the allocator, with no invalidation. This works on
      existing implementations that are known to not cache the process table
      entry for any except the current PIDR.
      
      It is architecturally correct and cleaner to invalidate with RIC=2
      after clearing the process table entry and before the PID is returned
      to the allocator. This can be done in arch_exit_mmap, which runs before
      the final flush, together with ensuring that the final flush (fullmm) is
      always a RIC=2 variant.
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      30b49ec7
    • powerpc/64s/radix: Improve preempt handling in TLB code · dffe8449
      Nicholas Piggin authored
      Preempt should be consistently disabled for mm_is_thread_local tests,
      so bring the rest of these under preempt_disable().
      
      Preempt does not need to be disabled for the mm->context.id tests,
      which allows simplification and removal of gotos.
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      dffe8449
    • powerpc/powernv: Use FIXUP_ENDIAN_HV in OPAL return · 63c9d8a4
      Nicholas Piggin authored
      Close the recoverability gap for OPAL calls by using FIXUP_ENDIAN_HV
      in the return path.
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      63c9d8a4
    • powerpc/book3s: Add an HV variant of FIXUP_ENDIAN that is recoverable · 8ca9c08d
      Nicholas Piggin authored
      Add an HV variant of FIXUP_ENDIAN which uses HSRR[01] and does not
      clear MSR[RI], which improves recoverability.
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      8ca9c08d
    • powerpc/64: Fix latency tracing for lazy irq replay · ff967900
      Nicholas Piggin authored
      When returning from an exception to a soft-enabled context, pending
      IRQs are replayed but IRQ tracing is not reset, so a number of them
      can get chained together into the same IRQ-disabled trace.
      
      Fix this by having __check_irq_replay re-set IRQ trace. This is
      conceptually where we respond to the next interrupt, so it fits the
      semantics of the IRQ tracer.
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      ff967900
    • KVM: PPC: Book3S HV: Handle host system reset in guest mode · 6de6638b
      Nicholas Piggin authored
      If the host takes a system reset interrupt while a guest is running,
      the CPU must exit the guest before processing the host exception
      handler.
      
      After this patch, taking a sysrq+x with a CPU running in a guest
      gives a trace like this:
      
         cpu 0x27: Vector: 100 (System Reset) at [c000000fdf5776f0]
             pc: c008000010158b80: kvmppc_run_core+0x16b8/0x1ad0 [kvm_hv]
             lr: c008000010158b80: kvmppc_run_core+0x16b8/0x1ad0 [kvm_hv]
             sp: c000000fdf577850
            msr: 9000000002803033
           current = 0xc000000fdf4b1e00
           paca    = 0xc00000000fd4d680	 softe: 3	 irq_happened: 0x01
             pid   = 6608, comm = qemu-system-ppc
         Linux version 4.14.0-rc7-01489-g47e1893a404a-dirty #26 SMP
         [c000000fdf577a00] c008000010159dd4 kvmppc_vcpu_run_hv+0x3dc/0x12d0 [kvm_hv]
         [c000000fdf577b30] c0080000100a537c kvmppc_vcpu_run+0x44/0x60 [kvm]
         [c000000fdf577b60] c0080000100a1ae0 kvm_arch_vcpu_ioctl_run+0x118/0x310 [kvm]
         [c000000fdf577c00] c008000010093e98 kvm_vcpu_ioctl+0x530/0x7c0 [kvm]
         [c000000fdf577d50] c000000000357bf8 do_vfs_ioctl+0xd8/0x8c0
         [c000000fdf577df0] c000000000358448 SyS_ioctl+0x68/0x100
         [c000000fdf577e30] c00000000000b220 system_call+0x58/0x6c
         --- Exception: c01 (System Call) at 00007fff76868df0
         SP (7fff7069baf0) is in userspace
      
      Fixes: e36d0a2e ("powerpc/powernv: Implement NMI IPI with OPAL_SIGNAL_SYSTEM_RESET")
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      6de6638b
  2. 22 Oct, 2017 12 commits
  3. 20 Oct, 2017 6 commits
    • powerpc/tm: P9 disable transactionally suspended sigcontexts · 92fb8690
      Michael Neuling authored
      Unfortunately userspace can construct a sigcontext which enables
      suspend. Thus userspace can force Linux into a path where trechkpt is
      executed.
      
      This patch blocks this from happening on POWER9 by sanity checking
      sigcontexts passed in.
      
      ptrace doesn't have this problem as only MSR SE and BE can be changed
      via ptrace.
      
      This patch also adds a number of WARN_ON()s in case we ever enter
      suspend when we shouldn't. This should not happen, but if it does the
      symptoms are soft lockup warnings which are not obviously TM related,
      so the WARN_ON()s should make it obvious what's happening.
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      92fb8690
    • powerpc/powernv: Enable TM without suspend if possible · 54820530
      Michael Ellerman authored
      Some Power9 revisions can run in a mode where TM operates without
      suspended state. If we find ourselves on a CPU that might be in this
      mode, we query OPAL to check, and if so we re-enable TM in CPU
      features, and enable a new user feature to signal to userspace that we
      are in this mode.
      
      We do not enable the "normal" user feature, PPC_FEATURE2_HTM, but we
      do enable PPC_FEATURE2_HTM_NOSC because that indicates to userspace
      that the kernel will abort transactions on syscall entry, which is
      true regardless of the suspend mode.
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      54820530
    • powerpc: Add PPC_FEATURE2_HTM_NO_SUSPEND · cba6ac48
      Michael Ellerman authored
      Some CPUs can operate in a mode where TM (Transactional Memory) is
      enabled but the suspended state of TM is disabled. In this mode
      tsuspend does not enter suspended state, instead the transaction is
      aborted. Similarly any other event that would lead to suspended state
      instead aborts the transaction.
      
      There is also an ABI change: in this mode processes are not
      allowed to sigreturn with an MSR that would lead to suspended state.
      Linux will instead return an error to the sigreturn syscall.
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      cba6ac48
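A userspace program could distinguish the two TM modes along the following lines. This is a minimal sketch: the `DEMO_*` bit positions are illustrative placeholders rather than the real `PPC_FEATURE2_*` values, and `classify_tm` is a hypothetical helper.

```c
/* Illustrative bit positions only; real code should take the
 * PPC_FEATURE2_* values from the kernel's uapi headers. */
#define DEMO_HTM            (1UL << 0)  /* stands in for PPC_FEATURE2_HTM */
#define DEMO_HTM_NO_SUSPEND (1UL << 1)  /* stands in for PPC_FEATURE2_HTM_NO_SUSPEND */

enum tm_mode { TM_NONE, TM_FULL, TM_NO_SUSPEND };

/* Classify the TM support advertised in a hwcap2-style bitmask. */
static enum tm_mode classify_tm(unsigned long hwcap2)
{
    if (hwcap2 & DEMO_HTM)
        return TM_FULL;          /* TM with suspended state */
    if (hwcap2 & DEMO_HTM_NO_SUSPEND)
        return TM_NO_SUSPEND;    /* tsuspend aborts the transaction */
    return TM_NONE;
}
```

On Linux the bitmask would typically come from `getauxval(AT_HWCAP2)` in `<sys/auxv.h>`; passing it in as a parameter keeps the sketch testable on any machine.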
    • powerpc/tm: Add commandline option to disable hardware transactional memory · 07fd1761
      Cyril Bur authored
      Currently the kernel relies on firmware to inform it whether or not the
      CPU supports HTM and as long as the kernel was built with
      CONFIG_PPC_TRANSACTIONAL_MEM=y then it will allow userspace to make
      use of the facility.
      
      There may be situations where it would be advantageous for the kernel
      to not allow userspace to use HTM. Currently the only way to achieve
      this is to recompile the kernel with CONFIG_PPC_TRANSACTIONAL_MEM=n.
      
      This patch adds a simple commandline option so that HTM can be
      disabled at boot time.
      Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
      [mpe: Simplify to a bool, move to prom.c, put doco in the right place.
       Always disable, regardless of initial state, to avoid user confusion.]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      07fd1761
    • Merge branch 'topic/ppc-kvm' into next · ddd46ed2
      Michael Ellerman authored
      Bring in some KVM commits we need (the TM one in particular).
      ddd46ed2
    • KVM: PPC: Tie KVM_CAP_PPC_HTM to the user-visible TM feature · 2a3d6553
      Michael Ellerman authored
      Currently we use CPU_FTR_TM to decide if the CPU/kernel can support
      TM (Transactional Memory), and if it's true we advertise that to
      Qemu (or similar) via KVM_CAP_PPC_HTM.
      
      PPC_FEATURE2_HTM is the user-visible feature bit, which indicates that
      the CPU and kernel can support TM. Currently CPU_FTR_TM and
      PPC_FEATURE2_HTM always have the same value, either true or false, so
      using the former for KVM_CAP_PPC_HTM is correct.
      
      However some Power9 CPUs can operate in a mode where TM is enabled but
      TM suspended state is disabled. In this mode CPU_FTR_TM is true, but
      PPC_FEATURE2_HTM is false. Instead a different PPC_FEATURE2 bit is
      set, to indicate that this different mode of TM is available.
      
      It is not safe to let guests use TM as-is, when the CPU is in this
      mode. So to prevent that from happening, use PPC_FEATURE2_HTM to
      determine the value of KVM_CAP_PPC_HTM.
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      2a3d6553
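The change in how the capability is derived can be modelled with two booleans. The struct and function names are hypothetical and the real kernel tests feature bits rather than struct fields; the sketch only shows where the old and new rules disagree.

```c
#include <stdbool.h>

/* Toy model of the two feature indicators discussed above. */
struct tm_state {
    bool cpu_ftr_tm;        /* models CPU_FTR_TM */
    bool user_feature_htm;  /* models PPC_FEATURE2_HTM */
};

/* Old rule: advertise the capability whenever the CPU feature is set. */
static bool cap_htm_old(const struct tm_state *s)
{
    return s->cpu_ftr_tm;
}

/* New rule: advertise only when the user-visible feature is set. */
static bool cap_htm_new(const struct tm_state *s)
{
    return s->user_feature_htm;
}
```

In the no-suspend mode (`cpu_ftr_tm` true, `user_feature_htm` false) the old rule would still advertise TM to guests, while the new rule does not.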
  4. 19 Oct, 2017 1 commit
  5. 16 Oct, 2017 6 commits