1. 21 Feb, 2019 31 commits
    • powerpc/8xx: hide itlbie and dtlbie symbols · 32ceaa6e
      Christophe Leroy authored
      When disassembling InstructionTLBError we get the following messy code:
      
      c000138c:       7d 84 63 78     mr      r4,r12
      c0001390:       75 25 58 00     andis.  r5,r9,22528
      c0001394:       75 2a 40 00     andis.  r10,r9,16384
      c0001398:       41 a2 00 08     beq     c00013a0 <itlbie>
      c000139c:       7c 00 22 64     tlbie   r4,r0
      
      c00013a0 <itlbie>:
      c00013a0:       39 40 04 01     li      r10,1025
      c00013a4:       91 4b 00 b0     stw     r10,176(r11)
      c00013a8:       39 40 10 32     li      r10,4146
      c00013ac:       48 00 cc 59     bl      c000e004 <transfer_to_handler>
      
      For a cleaner code dump, this patch replaces the itlbie and dtlbie
      symbols with local symbols:
      
      c000138c:       7d 84 63 78     mr      r4,r12
      c0001390:       75 25 58 00     andis.  r5,r9,22528
      c0001394:       75 2a 40 00     andis.  r10,r9,16384
      c0001398:       41 a2 00 08     beq     c00013a0 <InstructionTLBError+0xa0>
      c000139c:       7c 00 22 64     tlbie   r4,r0
      c00013a0:       39 40 04 01     li      r10,1025
      c00013a4:       91 4b 00 b0     stw     r10,176(r11)
      c00013a8:       39 40 10 32     li      r10,4146
      c00013ac:       48 00 cc 59     bl      c000e004 <transfer_to_handler>
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/selftest: fix type of mftb() in null_syscall · beb4f472
      Christophe Leroy authored
      All callers of mftb() expect 'unsigned long', and the function itself
      only returns the lower part of the TB, so it really is 'unsigned long',
      not 'unsigned long long'.
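
      A minimal sketch of what the corrected helper looks like, assuming the
      usual inline-asm idiom (illustrative, not the verbatim null_syscall
      code):

        static unsigned long mftb(void)
        {
                unsigned long tb;

                /* mftb reads the (lower, on 32-bit) timebase word into a
                 * GPR, so unsigned long is the natural return type */
                asm volatile ("mftb %0" : "=r" (tb));
                return tb;
        }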
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/powernv: Don't reprogram SLW image on every KVM guest entry/exit · 19f8a5b5
      Paul Mackerras authored
      Commit 24be85a2 ("powerpc/powernv: Clear PECE1 in LPCR via stop-api
      only on Hotplug", 2017-07-21) added two calls to opal_slw_set_reg()
      inside pnv_cpu_offline(), with the aim of changing the LPCR value in
      the SLW image to disable wakeups from the decrementer while a CPU is
      offline.  However, pnv_cpu_offline() gets called each time a secondary
      CPU thread is woken up to participate in running a KVM guest, that is,
      not just when a CPU is offlined.
      
      Since opal_slw_set_reg() is a very slow operation (with observed
      execution times around 20 milliseconds), this means that an offline
      secondary CPU can often be busy doing the opal_slw_set_reg() call
      when the primary CPU wants to grab all the secondary threads so that
      it can run a KVM guest.  This leads to messages like "KVM: couldn't
      grab CPU n" being printed and guest execution failing.
      
      There is no need to reprogram the SLW image on every KVM guest entry
      and exit.  So that we do it only when a CPU is really transitioning
      between online and offline, this moves the calls to
      pnv_program_cpu_hotplug_lpcr() into pnv_smp_cpu_kill_self().
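
      A hedged sketch of the intended placement; the function names come
      from the commit, but the bodies here are illustrative stand-ins, not
      the real powernv code:

        static void pnv_cpu_offline(unsigned int cpu)
        {
                /* Hot path: also entered whenever a secondary thread is
                 * grabbed to run a KVM guest, so no slow (~20ms)
                 * opal_slw_set_reg() calls in here any more */
        }

        static void pnv_smp_cpu_kill_self(void)
        {
                /* Only reached on a genuine online -> offline transition,
                 * so this is a safe place for the slow SLW image update
                 * (lpcr_val as computed by the caller, illustrative) */
                pnv_program_cpu_hotplug_lpcr(smp_processor_id(), lpcr_val);
        }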
      
      Fixes: 24be85a2 ("powerpc/powernv: Clear PECE1 in LPCR via stop-api only on Hotplug")
      Cc: stable@vger.kernel.org # v4.14+
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/64s: Fix logic when handling unknown CPU features · 8cfaf106
      Michael Ellerman authored
      In cpufeatures_process_feature(), if a provided CPU feature is unknown and
      enable_unknown is false, we erroneously print that the feature is being
      enabled and return true, even though no feature has been enabled, and
      may also set feature bits based on the last entry in the match table.
      
      Fix this so that we only set feature bits from the match table if we have
      actually enabled a feature from that table, and when failing to enable an
      unknown feature, always print the "not enabling" message and return false.
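
      A hedged sketch of the corrected control flow (structure and names
      are illustrative, not the exact dt_cpu_ftrs.c code):

        static bool process_feature(struct dt_cpu_feature *f)
        {
                const struct cpu_feature_match *m = find_match(f->name);

                if (!m) {                        /* unknown feature */
                        if (!enable_unknown) {
                                pr_info("not enabling: %s\n", f->name);
                                return false;    /* no bits set, report failure */
                        }
                        return true;             /* enabled, but no table bits */
                }

                /* Only set bits from the match table on a real match */
                cur_cpu_spec->cpu_features |= m->cpu_ftr_bit_mask;
                return true;
        }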
      
      Coincidentally, some older gccs (<GCC 7), when invoked with
      -fsanitize-coverage=trace-pc, cause a spurious uninitialised variable
      warning in this function:
      
        arch/powerpc/kernel/dt_cpu_ftrs.c: In function ‘cpufeatures_process_feature’:
        arch/powerpc/kernel/dt_cpu_ftrs.c:686:7: warning: ‘m’ may be used uninitialized in this function [-Wmaybe-uninitialized]
          if (m->cpu_ftr_bit_mask)
      
      An upcoming patch will enable support for kcov, which requires this option.
      This patch avoids the warning.
      
      Fixes: 5a61ef74 ("powerpc/64s: Support new device tree binding for discovering CPU features")
      Reported-by: Segher Boessenkool <segher@kernel.crashing.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      [ajd: add commit message]
      Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
    • 6fe243fe
      Nicholas Piggin authored
    • powerpc/smp: Fix NMI IPI xmon timeout · 88b9a3d1
      Nicholas Piggin authored
      The xmon debugger IPI handlers wait in the callback function while
      xmon is still active. This means they don't complete the IPI, and the
      initiator always times out waiting for them.
      
      Things manage to work after the timeout because there is some fallback
      logic to keep NMI IPI state sane in case of the timeout, but this is a
      bit ugly.
      
      This patch changes NMI IPI back to half-asynchronous (i.e., wait for
      everyone to call in, do not wait for the IPI function to complete),
      but the complexity is avoided by going one step further and allowing
      new IPIs to be issued before the IPI functions have all completed.
      
      If synchronization against that is required, it is left up to the
      caller, but current callers don't require that. In fact with the
      timeout handling, callers must be able to cope with this already.
      
      Fixes: 5b73151f ("powerpc: NMI IPI make NMI IPIs fully sychronous")
      Cc: stable@vger.kernel.org # v4.19+
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/smp: Fix NMI IPI timeout · 1b5fc84a
      Nicholas Piggin authored
      The NMI IPI timeout logic is broken: if __smp_send_nmi_ipi() times out
      on the first condition, delay_us will be zero, which sends it into the
      second spin loop with no timeout, so it will spin forever.
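
      An illustrative sketch of the bug pattern (simplified, not the
      verbatim smp.c code):

        while (!all_called_in() && delay_us) {   /* first condition */
                udelay(1);
                delay_us--;
        }

        while (!all_complete()) {                /* second spin loop */
                udelay(1);
                if (delay_us) {
                        delay_us--;
                        if (!delay_us)
                                goto timeout;
                }
                /* BUG: if the first loop consumed the whole budget,
                 * delay_us is already 0 here, the timeout check never
                 * fires and this loop can spin forever */
        }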
      
      Fixes: 5b73151f ("powerpc: NMI IPI make NMI IPIs fully sychronous")
      Cc: stable@vger.kernel.org # v4.19+
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc: Make PPC_64K_PAGES depend on only 44x or PPC_BOOK3S_64 · bba43630
      Michael Ellerman authored
      In commit 7820856a ("powerpc/mm/book3e/64: Remove unsupported
      64Kpage size from 64bit booke") we dropped the 64K page size support
      from the 64-bit nohash (Book3E) code.
      
      But we didn't update the dependencies of the PPC_64K_PAGES option,
      meaning a randconfig can still trigger this code and cause a build
      breakage, eg:
        arch/powerpc/include/asm/nohash/64/pgtable.h:14:2: error: #error "Page size not supported"
        arch/powerpc/include/asm/nohash/mmu-book3e.h:275:2: error: #error Unsupported page size
      
      So remove PPC_BOOK3E_64 from the dependencies. This also means we
      don't need to worry about PPC_FSL_BOOK3E, because that was just trying
      to prevent the PPC_BOOK3E_64=y && PPC_FSL_BOOK3E=y case.
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/64: Make sys_switch_endian() traceable · 81dac817
      Michael Ellerman authored
      We weren't using SYSCALL_DEFINE for sys_switch_endian(), which means
      it wasn't able to be traced by CONFIG_FTRACE_SYSCALLS.
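
      The change is essentially to declare the syscall via the standard
      macro; a hedged sketch (SYSCALL_DEFINE0 is the real macro from
      <linux/syscalls.h>, the helper shown is hypothetical):

        SYSCALL_DEFINE0(switch_endian)
        {
                /* same body as before: flip MSR_LE for the current task */
                return do_switch_endian();       /* hypothetical helper */
        }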
      
      By using the macro we create the right metadata and the syscall is
      visible. eg:
      
        # cd /sys/kernel/debug/tracing
        # echo 1 | tee events/syscalls/sys_*_switch_endian/enable
        # ~/switch_endian_test
        # cat trace
        ...
        switch_endian_t-3604  [009] ....   315.175164: sys_switch_endian()
        switch_endian_t-3604  [009] ....   315.175167: sys_switch_endian -> 0x5555aaaa5555aaaa
        switch_endian_t-3604  [009] ....   315.175169: sys_switch_endian()
        switch_endian_t-3604  [009] ....   315.175169: sys_switch_endian -> 0x5555aaaa5555aaaa
      
      Fixes: 529d235a ("powerpc: Add a proper syscall for switching endianness")
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/dts: Standardize DTS status assignments from "ok" to "okay" · 5c285dd7
      Robert P. J. Day authored
      While the current kernel drivers/of/ code allows developers to be
      sloppy and use a DTS status value of "ok", the current DTSpec 0.1
      makes it clear that the proper spelling is "okay", so fix the small
      number of PowerPC .dts files that do this.
      Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/book3s: Remove pgd/pud/pmd_set() interfaces · c746ca00
      Aneesh Kumar K.V authored
      When updating page tables, we need to make sure we set the valid bits
      in the page table entry. We do this by or'ing in one of
      PGD/PUD/PMD_VAL_BITS.
      
      The page table 'set' interfaces allow updating the raw value of page
      table entries without setting the valid bits, so remove those
      interfaces to avoid incorrect usage in future.
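
      A hedged illustration of the pattern the remaining interfaces enforce
      (PMD_VAL_BITS is real; the line shown is paraphrased from the book3s64
      pgalloc helpers):

        /* Good: the valid bits are or'ed in as the entry is written */
        *pmdp = __pmd(__pgtable_ptr_val(pte_page) | PMD_VAL_BITS);

        /* Removed: a raw pmd_set(pmdp, val) could write val without
         * PMD_VAL_BITS, creating an entry the MMU treats as invalid */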
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      [mpe: Reword commit message based on mailing list discussion]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc: Fix 32-bit KVM-PR lockup and host crash with MacOS guest · fe1ef6bc
      Mark Cave-Ayland authored
      Commit 8792468d "powerpc: Add the ability to save FPU without
      giving it up" unexpectedly removed the MSR_FE0 and MSR_FE1 bits from
      the bitmask used to update the MSR of the previous thread in
      __giveup_fpu() causing a KVM-PR MacOS guest to lockup and panic the
      host kernel.
      
      Leaving FE0/1 enabled means unrelated processes might receive FPEs
      when they're not expecting them and crash. In particular if this
      happens to init the host will then panic.
      
      eg (transcribed):
        qemu-system-ppc[837]: unhandled signal 8 at 12cc9ce4 nip 12cc9ce4 lr 12cc9ca4 code 0
        systemd[1]: unhandled signal 8 at 202f02e0 nip 202f02e0 lr 001003d4 code 0
        Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
      
      Reinstate these bits to the MSR bitmask to enable MacOS guests to run
      under 32-bit KVM-PR once again without issue.
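
      A hedged sketch of the fix's shape in __giveup_fpu() (the MSR_* bits
      are real; the surrounding code is simplified):

        unsigned long msr = tsk->thread.regs->msr;

        /* Clear FP-available *and* both FP exception mode bits, so the
         * previous thread can't be left taking unexpected FPEs */
        msr &= ~(MSR_FP | MSR_FE0 | MSR_FE1);
        tsk->thread.regs->msr = msr;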
      
      Fixes: 8792468d ("powerpc: Add the ability to save FPU without giving it up")
      Cc: stable@vger.kernel.org # v4.6+
      Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/pseries: export timebase register sample in lparcfg · 9f3ba362
      Tyrel Datwyler authored
      The Processor Utilization of Resource Registers (PURR) provide an
      estimate of resources used by a cpu thread. Section 7.6 in Book III of
      the ISA outlines how to calculate the percentage of shared resources
      for threads using the ratio of the PURR delta and Timebase Register
      delta for a sampled period.
      
      This calculation is currently done erroneously by the lparstat tool
      from the powerpc-utils package. This patch exports the current
      timebase value after we sample the PURRs and exposes it to userspace
      accounting tools via /proc/ppc64/lparcfg.
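
      A hedged sketch of how an accounting tool might use the new sample,
      assuming it reads matching PURR and timebase values out of two
      lparcfg snapshots:

        /* percentage of shared resources used over the sample period,
         * per the ratio described in Book III section 7.6 */
        double shared_resource_pct(unsigned long purr1, unsigned long tb1,
                                   unsigned long purr2, unsigned long tb2)
        {
                return 100.0 * (double)(purr2 - purr1) / (double)(tb2 - tb1);
        }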
      Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/44x: Force PCI on for CURRITUCK · aa7150ba
      Michael Ellerman authored
      The recent rework of PCI kconfig symbols exposed an existing bug in
      the CURRITUCK kconfig logic.
      
      It selects PPC4xx_PCI_EXPRESS which depends on PCI, but PCI is user
      selectable and might be disabled, leading to a warning:
      
        WARNING: unmet direct dependencies detected for PPC4xx_PCI_EXPRESS
          Depends on [n]: PCI [=n] && 4xx [=y]
          Selected by [y]:
          - CURRITUCK [=y] && PPC_47x [=y]
      
      Prior to commit eb01d42a ("PCI: consolidate PCI config entry in
      drivers/pci") PCI was enabled by default for currituck_defconfig so we
      didn't see the warning. The bad logic was still there, it just
      required someone disabling PCI in their .config to hit it.
      
      Fix it by forcing PCI on for CURRITUCK, which seems to have been the
      expectation all along.
      
      Fixes: eb01d42a ("PCI: consolidate PCI config entry in drivers/pci")
      Reported-by: Randy Dunlap <rdunlap@infradead.org>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/eeh: Add eeh_force_recover to debugfs · 954bd994
      Oliver O'Halloran authored
      This patch adds a debugfs interface to force scheduling a recovery event.
      This can be used to recover a specific PE or schedule a "special" recovery
      event that checks for errors at the PHB level.
      To force a recovery of a normal PE, use:
      
       echo '<#pe>:<#phb>' > /sys/kernel/debug/powerpc/eeh_force_recover
      
      To force a scan for broken PHBs:
      
       echo 'hwcheck' > /sys/kernel/debug/powerpc/eeh_force_recover
      Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/eeh: Allow disabling recovery · 6b493f60
      Oliver O'Halloran authored
      Currently when we detect an error we automatically invoke the EEH recovery
      handler. This can be annoying when debugging EEH problems, or when working
      on EEH itself, so this patch adds a debugfs knob that will prevent a
      recovery event from being queued up when an issue is detected.
      Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/pci: Add pci_find_controller_for_domain() · 67060cb1
      Oliver O'Halloran authored
      Add a helper to find the pci_controller structure based on the domain
      number / phb id.
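
      Presumably something like the following (assumed signature):

        struct pci_controller *pci_find_controller_for_domain(int domain_nr);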
      Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
      Reviewed-by: Sam Bobroff <sbobroff@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/eeh_cache: Bump log level of eeh_addr_cache_print() · c8f02f21
      Oliver O'Halloran authored
      To use this function at all, #define DEBUG needs to be set in
      eeh_cache.c. Printing at pr_debug is not all that useful, since it adds
      the additional hurdle of requiring you to enable the debug print if
      dynamic_debug is in use, so this patch bumps it to pr_info.
      Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
      Reviewed-by: Sam Bobroff <sbobroff@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/eeh_cache: Add a way to dump the EEH address cache · 5ca85ae6
      Oliver O'Halloran authored
      Adds a debugfs file that can be read to view the contents of the EEH
      address cache. This is pretty similar to the existing
      eeh_addr_cache_print() function, but that function is intended for
      debugging issues inside of the kernel, since it's #ifdef'ed out by
      default and writes into the kernel log.
      Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
      Reviewed-by: Sam Bobroff <sbobroff@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/eeh_cache: Add pr_debug() prints for insert/remove · e67fbbec
      Oliver O'Halloran authored
      The EEH address cache is used to map a physical MMIO address back to a
      PCI device. It's useful to know when it's being manipulated, but
      currently this requires recompiling with #define DEBUG set. This is
      unnecessary now that we have dynamic_debug, so remove the #ifdef guard
      and add a pr_debug() for the remove case too.
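
      With CONFIG_DYNAMIC_DEBUG enabled the new prints can then be turned
      on at runtime via the standard dynamic_debug control file, eg:

        # echo 'file eeh_cache.c +p' > /sys/kernel/debug/dynamic_debug/control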
      Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
      Reviewed-by: Sam Bobroff <sbobroff@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/eeh: Use debugfs_create_u32 for eeh_max_freezes · 46ee7c3c
      Oliver O'Halloran authored
      There's no need for the custom getter/setter functions, so we should
      remove them in favour of the generic ones. While we're here, change the
      type of eeh_max_freezes to u32 and print the value in decimal rather
      than hex, because printing it in hex makes no sense.
      Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
      Reviewed-by: Sam Bobroff <sbobroff@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc: drop unused GENERIC_CSUM Kconfig item · d065ee93
      Christophe Leroy authored
      Commit d4fde568 ("powerpc/64: Use optimized checksum routines on
      little-endian") converted the last powerpc user of GENERIC_CSUM.
      
      This patch does a final cleanup, dropping the GENERIC_CSUM Kconfig
      option, which is always 'n', and the associated piece of code in
      asm/checksum.h.
      
      Fixes: d4fde568 ("powerpc/64: Use optimized checksum routines on little-endian")
      Reported-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/64s/hash: Fix assert_slb_presence() use of the slbfee. instruction · 7104dccf
      Nicholas Piggin authored
      The slbfee. instruction must have bit 24 of RB clear; failure to do
      so can result in false negatives that lead to incorrect assertions.
      
      This is not obvious from the ISA v3.0B document, which only says:
      
          The hardware ignores the contents of RB 36:38 40:63 -- p.1032
      
      This patch fixes the bug and also clears all other bits from PPC bits
      36-63, which is good practice when dealing with reserved or ignored
      bits.
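
      A hedged sketch of the resulting probe (the kernel actually wraps the
      slbfee. encoding in a macro to support older assemblers; simplified
      here):

        static void assert_slb_presence(bool present, unsigned long ea)
        {
                unsigned long tmp;

                /* Clear PPC bits 36-63 (the low 28 bits of RB), which
                 * includes the must-be-clear bit 24 in LSB-0 numbering */
                ea &= ~((1UL << 28) - 1);

                asm volatile ("slbfee. %0, %1" : "=r" (tmp) : "r" (ea) : "cr0");
                WARN_ON(present == (tmp == 0));
        }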
      
      Fixes: e15a4fea ("powerpc/64s/hash: Add some SLB debugging tests")
      Cc: stable@vger.kernel.org # v4.20+
      Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/mm/hash: Increase vmalloc space to 512T with hash MMU · 3d8810e0
      Michael Ellerman authored
      This patch updates the kernel non-linear virtual map to 512TB when
      we're built with 64K page size and are using the hash MMU. We allocate
      one context for the vmalloc region and hence the max virtual area size
      is limited by the context map size (512TB for 64K and 64TB for 4K page
      size).
      
      This patch fixes boot failures with large amounts of system RAM where
      we need large vmalloc space to handle per cpu allocations.
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
    • powerpc/ptrace: Simplify vr_get/set() to avoid GCC warning · ca6d5149
      Michael Ellerman authored
      GCC 8 warns about the logic in vr_get/set(), which with -Werror breaks
      the build:
      
        In function ‘user_regset_copyin’,
            inlined from ‘vr_set’ at arch/powerpc/kernel/ptrace.c:628:9:
        include/linux/regset.h:295:4: error: ‘memcpy’ offset [-527, -529] is
        out of the bounds [0, 16] of object ‘vrsave’ with type ‘union
        <anonymous>’ [-Werror=array-bounds]
        arch/powerpc/kernel/ptrace.c: In function ‘vr_set’:
        arch/powerpc/kernel/ptrace.c:623:5: note: ‘vrsave’ declared here
           } vrsave;
      
      This has been identified as a regression in GCC, see GCC bug 88273.
      
      However we can avoid the warning and also simplify the logic and make
      it more robust.
      
      Currently we pass -1 as end_pos to user_regset_copyout(). This says
      "copy up to the end of the regset".
      
      The definition of the regset is:
      	[REGSET_VMX] = {
      		.core_note_type = NT_PPC_VMX, .n = 34,
      		.size = sizeof(vector128), .align = sizeof(vector128),
      		.active = vr_active, .get = vr_get, .set = vr_set
      	},
      
      The end is calculated as (n * size), ie. 34 * sizeof(vector128).
      
      In vr_get/set() we pass start_pos as 33 * sizeof(vector128), meaning
      we can copy up to sizeof(vector128) into/out-of vrsave.
      
      The on-stack vrsave is defined as:
        union {
      	  elf_vrreg_t reg;
      	  u32 word;
        } vrsave;
      
      And elf_vrreg_t is:
        typedef __vector128 elf_vrreg_t;
      
      So there is no bug, but we rely on all those sizes lining up,
      otherwise we would have a kernel stack exposure/overwrite on our
      hands.
      
      Rather than relying on that, we can pass an explicit end_pos based on
      the sizeof(vrsave). The result should be exactly the same, but it's
      more obviously not over-reading/writing the stack and it avoids the
      compiler warning.
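
      A hedged sketch of the resulting call (user_regset_copyout() is the
      real regset helper; the offsets follow the description above):

        /* vrsave lives at regset offset 33 * sizeof(vector128); only allow
         * copying sizeof(vrsave) bytes at that position */
        ret = user_regset_copyout(&pos, &count, &kbuf, &ubuf, &vrsave,
                                  33 * sizeof(vector128),
                                  33 * sizeof(vector128) + sizeof(vrsave));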
      Reported-by: Meelis Roos <mroos@linux.ee>
      Reported-by: Mathieu Malaterre <malat@debian.org>
      Cc: stable@vger.kernel.org
      Tested-by: Mathieu Malaterre <malat@debian.org>
      Tested-by: Meelis Roos <mroos@linux.ee>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/powernv/npu: Remove redundant change_pte() hook · 1b58a975
      Peter Xu authored
      The change_pte() notifier was designed to be used as a quick path to
      update secondary MMU PTEs on write permission changes or PFN changes.
      For KVM, it could reduce the vm-exits when a vcpu faults on pages that
      were touched up by KSM. It's not used to do cache invalidations;
      indeed, the notifier is called before the real PTE update (see
      set_pte_at_notify(), where set_pte_at() is called after the notifier).
      
      All the necessary cache invalidation should already be done in
      invalidate_range().
      Signed-off-by: Peter Xu <peterx@redhat.com>
      Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
      Reviewed-by: Alistair Popple <alistair@popple.id.au>
      Reviewed-by: Balbir Singh <bsingharora@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • Merge branch 'topic/ppc-kvm' into next · e121ee6b
      Michael Ellerman authored
      Merge commits we're sharing with kvm-ppc tree.
    • powerpc/64s: Better printing of machine check info for guest MCEs · c0577201
      Paul Mackerras authored
      This adds an "in_guest" parameter to machine_check_print_event_info()
      so that we can avoid trying to translate guest NIP values into
      symbolic form using the host kernel's symbol table.
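
      Presumably the prototype grows as follows (assumed shape, based on
      the description):

        void machine_check_print_event_info(struct machine_check_event *evt,
                                            bool user_mode, bool in_guest);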
      Reviewed-by: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
      Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • KVM: PPC: Book3S HV: Simplify machine check handling · 884dfb72
      Paul Mackerras authored
      This makes the handling of machine check interrupts that occur inside
      a guest simpler and more robust, with less done in assembler code and
      in real mode.
      
      Now, when a machine check occurs inside a guest, we always get the
      machine check event struct and put a copy in the vcpu struct for the
      vcpu where the machine check occurred.  We no longer call
      machine_check_queue_event() from kvmppc_realmode_mc_power7(), because
      on POWER8, when a vcpu is running on an offline secondary thread and
      we call machine_check_queue_event(), that calls irq_work_queue(),
      which doesn't work because the CPU is offline, but instead triggers
      the WARN_ON(lazy_irq_pending()) in pnv_smp_cpu_kill_self() (which
      fires again and again because nothing clears the condition).
      
      All that machine_check_queue_event() actually does is to cause the
      event to be printed to the console.  For a machine check occurring in
      the guest, we now print the event in kvmppc_handle_exit_hv()
      instead.
      
      The assembly code at label machine_check_realmode now just calls C
      code and then continues exiting the guest.  We no longer either
      synthesize a machine check for the guest in assembly code or return
      to the guest without a machine check.
      
      The code in kvmppc_handle_exit_hv() is extended to handle the case
      where the guest is not FWNMI-capable.  In that case we now always
      synthesize a machine check interrupt for the guest.  Previously, if
      the host thinks it has recovered the machine check fully, it would
      return to the guest without any notification that the machine check
      had occurred.  If the machine check was caused by some action of the
      guest (such as creating duplicate SLB entries), it is much better to
      tell the guest that it has caused a problem.  Therefore we now always
      generate a machine check interrupt for guests that are not
      FWNMI-capable.
      Reviewed-by: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
      Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • Merge branch 'topic/dma' into next · d0055df0
      Michael Ellerman authored
      Merge hch's big DMA rework series. This is in a topic branch in case he
      wants to merge it to minimise conflicts.
    • KVM: PPC: Book3S HV: Context switch AMR on Power9 · d976f680
      Michael Ellerman authored
      kvmhv_p9_guest_entry() implements a fast-path guest entry for Power9
      when guest and host are both running with the Radix MMU.
      
      Currently in that path we don't save the host AMR (Authority Mask
      Register) value, and we always restore 0 on return to the host. That
      is OK at the moment because the AMR is not used for storage keys with
      the Radix MMU.
      
      However we plan to start using the AMR on Radix to prevent the kernel
      from reading/writing to userspace outside of copy_to/from_user(). In
      order to make that work we need to save/restore the AMR value.
      
      We only restore the value if it is different from the guest value,
      which is already in the register when we exit to the host. This should
      mean we rarely need to actually restore the value when running a
      modern Linux as a guest, because it will be using the same value as
      us.
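
      A hedged sketch of the pattern (mfspr/mtspr and SPRN_AMR are real;
      the guest entry/exit path is elided):

        unsigned long host_amr = mfspr(SPRN_AMR);

        /* ... switch to guest context, run the guest, exit ... */

        /* On exit the guest's AMR value is still in the SPR; skip the
         * mtspr when it already matches the host value (the common case
         * for a modern Linux guest) */
        if (mfspr(SPRN_AMR) != host_amr)
                mtspr(SPRN_AMR, host_amr);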
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Tested-by: Russell Currey <ruscur@russell.cc>
  2. 19 Feb, 2019 1 commit
  3. 18 Feb, 2019 8 commits