1. 03 Jun, 2018 15 commits
    • powerpc: allow soft-NMI watchdog to cover timer interrupts with large decrementers · a7cba02d
      Nicholas Piggin authored
      Large decrementers (e.g., POWER9) can take a very long time to wrap,
      so when the timer interrupt handler sets the decrementer to max so as
      to avoid taking another decrementer interrupt when hard enabling
      interrupts before running timers, it effectively disables the soft
      NMI coverage for timer interrupts.
      
      Fix this by using the traditional 31-bit value instead, which wraps
      after a few seconds. Masked interrupt code does the same thing, and
      in normal operation neither of these paths would ever wrap even the
      31-bit value.
      
      Note: the SMP watchdog should catch timer interrupt lockups, but it
      is preferable for the local soft-NMI to catch them, mainly to avoid
      the IPI.
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
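
As a rough illustration of the wrap times involved, the sketch below computes how long a decrementer takes to count down from its maximum value. The 512 MHz timebase frequency and the 55 usable value bits of a large decrementer are assumptions made for this example, not figures from the commit message, and `dec_wrap_seconds` is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: seconds until a decrementer loaded with the
 * largest value that fits in `value_bits` bits reaches zero, when it
 * ticks `tb_hz` times per second.
 */
double dec_wrap_seconds(unsigned int value_bits, uint64_t tb_hz)
{
    assert(value_bits >= 1 && value_bits <= 63 && tb_hz > 0);

    uint64_t max_ticks = (UINT64_C(1) << value_bits) - 1;
    return (double)max_ticks / (double)tb_hz;
}
```

With these assumed figures, a 31-bit value wraps after about 4.2 seconds, which matches the "few seconds" above, while 55 value bits would take a little over two years.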
    • powerpc: generic clockevents broadcast receiver call tick_receive_broadcast · 3f984620
      Nicholas Piggin authored
      The broadcast tick recipient can call tick_receive_broadcast rather
      than re-running the full timer interrupt.
      
      It does not have to check for the next event time, because the sender
      already determined the timer has expired. It does not have to test
      irq_work_pending, because that's a direct decrementer interrupt and
      does not go through the clock events subsystem. And it does not have
      to read PURR because that was removed with the previous patch.
      
      This results in no code size change, but both the decrementer and
      broadcast path lengths are reduced.
      
      Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/pseries: lparcfg calculate PURR on demand · 3d3a6021
      Nicholas Piggin authored
      For SPLPAR, lparcfg provides a sum of PURR registers for all CPUs.
      Currently this is done by reading PURR in context switch and timer
      interrupt, and storing that into a per-CPU variable. These are summed
      to provide the value.
      
      This does not work with all timer schemes (e.g., NO_HZ_FULL), and it
      is sub-optimal for performance because it reads the PURR register on
      every context switch, although that's been difficult to distinguish
      from noise in the context_switch microbenchmark.
      
      This patch implements the sum by calling a function on each CPU, to
      read and add PURR values of each CPU.
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
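
The on-demand approach can be modelled in user space as follows. This is only a sketch: `fake_purr`, `for_each_cpu_call`, and `NR_CPUS_SKETCH` are hypothetical stand-ins, not the kernel interfaces the patch uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS_SKETCH 4  /* hypothetical CPU count for this model */

/* Stand-in for the PURR register contents of each CPU. */
uint64_t fake_purr[NR_CPUS_SKETCH] = { 100, 250, 75, 500 };

/* Runs "on" one CPU: read that CPU's PURR and add it to the running sum. */
void add_purr(int cpu, void *arg)
{
    uint64_t *sum = arg;

    assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
    *sum += fake_purr[cpu];
}

/*
 * Hypothetical model of a cross-CPU function call. Here the calls run
 * one after another; if they ran concurrently, the addition would need
 * to be atomic.
 */
void for_each_cpu_call(void (*fn)(int cpu, void *arg), void *arg)
{
    assert(fn != NULL);
    for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
        fn(cpu, arg);
}

/* Compute the sum only when a reader asks for it. */
uint64_t sum_purr_on_demand(void)
{
    uint64_t sum = 0;

    for_each_cpu_call(add_purr, &sum);
    return sum;
}
```

Nothing is sampled on context switch or timer interrupt in this model; the cost is paid only by the reader of the summed value.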
    • powerpc/64: remove start_tb and accum_tb from thread_struct · 36d632ea
      Nicholas Piggin authored
      These fields are only written to.
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/64s: micro-optimise __hard_irq_enable() for mtmsrd L=1 support · 54071e41
      Nicholas Piggin authored
      The Book3S minimum supported ISA version now requires mtmsrd L=1. This
      instruction does not require bits other than RI and EE to be supplied,
      so __hard_irq_enable() and __hard_irq_disable() do not have to read
      the kernel_msr from paca.
      
      Interrupt entry code already relies on L=1 support.
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/pseries: put cede MSR[EE] check under IRQ_SOFT_MASK_DEBUG · 9f4b61b2
      Nicholas Piggin authored
      This check does not catch IRQ soft mask bugs, but this option is
      slightly more suitable than TRACE_IRQFLAGS.
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/64: irq_work avoid interrupt when called with hardware irqs enabled · ebb37cf3
      Nicholas Piggin authored
      irq_work_raise should not cause a decrementer exception unless it is
      called from NMI context. Doing so often just results in an immediate
      masked decrementer interrupt:
      
         <...>-550    90d...    4us : update_curr_rt <-dequeue_task_rt
         <...>-550    90d...    5us : dbs_update_util_handler <-update_curr_rt
         <...>-550    90d...    6us : arch_irq_work_raise <-irq_work_queue
         <...>-550    90d...    7us : soft_nmi_interrupt <-soft_nmi_common
         <...>-550    90d...    7us : printk_nmi_enter <-soft_nmi_interrupt
         <...>-550    90d.Z.    8us : rcu_nmi_enter <-soft_nmi_interrupt
         <...>-550    90d.Z.    9us : rcu_nmi_exit <-soft_nmi_interrupt
         <...>-550    90d...    9us : printk_nmi_exit <-soft_nmi_interrupt
         <...>-550    90d...   10us : cpuacct_charge <-update_curr_rt
      
      The soft_nmi_interrupt here is the call into the watchdog, due to the
      decrementer interrupt firing with irqs soft-disabled. This is
      harmless, but sub-optimal.
      
      When it's not called from NMI context or with interrupts enabled, mark
      the decrementer pending in the irq_happened mask directly, rather than
      having the masked decrementer interrupt handler do it. This will be
      replayed at the next local_irq_enable. See the comment for details.
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
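
The "mark pending, replay on enable" behaviour described above can be modelled with a small state machine. This is a simplified user-space sketch; the struct, the `IRQ_PENDING_DEC` bit value, and both function names are hypothetical and do not reproduce the kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define IRQ_PENDING_DEC 0x08u  /* hypothetical "decrementer pending" bit */

/* Simplified per-CPU soft-masking state (all names are illustrative). */
struct soft_mask_state {
    bool soft_enabled;      /* true when interrupts are soft-enabled */
    uint8_t irq_happened;   /* interrupts recorded for later replay */
    bool hw_dec_armed;      /* a hardware decrementer exception was requested */
    unsigned int dec_runs;  /* times the decrementer work has been run */
};

/* Queue work: only NMI or irqs-enabled callers request the hardware exception. */
void model_irq_work_raise(struct soft_mask_state *s, bool in_nmi)
{
    assert(s != NULL);
    if (in_nmi || s->soft_enabled)
        s->hw_dec_armed = true;
    else
        s->irq_happened |= IRQ_PENDING_DEC;  /* note it directly for replay */
}

/* Re-enable interrupts and replay anything recorded while they were masked. */
void model_local_irq_enable(struct soft_mask_state *s)
{
    assert(s != NULL);
    s->soft_enabled = true;
    if (s->irq_happened & IRQ_PENDING_DEC) {
        s->irq_happened &= (uint8_t)~IRQ_PENDING_DEC;
        s->dec_runs++;  /* stands in for running the decrementer handler */
    }
}
```

In this model, raising work with interrupts soft-disabled never touches `hw_dec_armed`, so no masked decrementer interrupt (and no watchdog call) is triggered just to set the pending bit.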
    • powerpc/powernv/ioda2: Remove redundant free of TCE pages · 98fd72fe
      Alexey Kardashevskiy authored
      When IODA2 creates a PE, it creates an IOMMU table with it_ops::free
      set to pnv_ioda2_table_free() which calls pnv_pci_ioda2_table_free_pages().
      
      Since iommu_tce_table_put() calls it_ops::free when the last reference
      to the table is released, an explicit call to pnv_pci_ioda2_table_free_pages()
      is not needed, so let's remove it.

      This should fix a double free in the case of PCI hotplug, as
      pnv_pci_ioda2_table_free_pages() resets neither
      iommu_table::it_base nor ::it_size.
      
      This was not exposed by SRIOV as it uses different code path via
      pnv_pcibios_sriov_disable().
      
      IODA1 does not initialize it_ops::free so it does not have this issue.
      
      Fixes: c5f7700b ("powerpc/powernv: Dynamically release PE")
      Cc: stable@vger.kernel.org # v4.8+
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
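
The double free described above follows a common pattern with refcounted objects, sketched below. The types and functions are hypothetical and only count "frees" rather than releasing memory; they are not the kernel's `struct iommu_table` or its helpers.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative refcounted table; not the kernel's struct iommu_table. */
struct tbl_sketch {
    int refs;                               /* reference count */
    int page_frees;                         /* counts frees instead of really freeing */
    void (*free_op)(struct tbl_sketch *t);  /* plays the role of it_ops::free */
};

/* "Frees" the pages but leaves no marker behind, so a repeat call goes unnoticed. */
void tbl_free_pages(struct tbl_sketch *t)
{
    assert(t != NULL);
    t->page_frees++;
}

/* Drop a reference; the last put invokes the free operation. */
void tbl_put(struct tbl_sketch *t)
{
    assert(t != NULL && t->refs > 0);
    if (--t->refs == 0 && t->free_op != NULL)
        t->free_op(t);
}
```

Calling `tbl_free_pages()` explicitly and then `tbl_put()` on the last reference runs the free twice in this model, whereas `tbl_put()` alone runs it once, which is the behaviour the patch relies on.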
    • powerpc/xmon: use match_string() helper · 0abbf2bf
      Yisheng Xie authored
      match_string() returns the array index of a matching string,
      which can be used instead of the open-coded variant.
      
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: linuxppc-dev@lists.ozlabs.org
      Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
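
For readers unfamiliar with the helper, here is a user-space sketch of the lookup it performs. The commit message does not give the signature or the failure value, so the `-EINVAL` return and the early stop at a NULL entry are assumptions, and `match_string_sketch` is a hypothetical name.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Return the index of `string` within the first `n` entries of `array`,
 * or -EINVAL if it is not present. A NULL entry ends the search early.
 */
int match_string_sketch(const char *const *array, size_t n, const char *string)
{
    assert(array != NULL && string != NULL);

    for (size_t i = 0; i < n; i++) {
        const char *item = array[i];

        if (item == NULL)
            break;
        if (strcmp(item, string) == 0)
            return (int)i;
    }
    return -EINVAL;
}
```

A caller can then replace a hand-written comparison loop with a single call and a check for a negative result.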
    • powerpc: Fix build by disabling attribute-alias warning for SYSCALL_DEFINEx · 2479bfc9
      Christophe Leroy authored
      GCC 8.1 emits warnings such as the following. As arch/powerpc code is
      built with -Werror, this breaks the build with GCC 8.1.
      
        In file included from arch/powerpc/kernel/pci_64.c:23:
        ./include/linux/syscalls.h:233:18: error: 'sys_pciconfig_iobase' alias
        between functions of incompatible types 'long int(long int, long
        unsigned int, long unsigned int)' and 'long int(long int, long int,
        long int)' [-Werror=attribute-alias]
          asmlinkage long sys##name(__MAP(x,__SC_DECL,__VA_ARGS__)) \
                          ^~~
        ./include/linux/syscalls.h:222:2: note: in expansion of macro '__SYSCALL_DEFINEx'
          __SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
      
      This patch inhibits those warnings.
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      [mpe: Trim change log]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/64: Fix strncpy() related build failures with GCC 8.1 · c9599881
      Christophe Leroy authored
      GCC 8.1 warns about possible string truncation:
      
        arch/powerpc/kernel/nvram_64.c:1042:2: error: 'strncpy' specified
        bound 12 equals destination size [-Werror=stringop-truncation]
          strncpy(new_part->header.name, name, 12);
      
        arch/powerpc/platforms/ps3/repository.c:106:2: error: 'strncpy'
        output truncated before terminating nul copying 8 bytes from a
        string of the same length [-Werror=stringop-truncation]
          strncpy((char *)&n, text, 8);
      
      Fix it by using memcpy(). To make that safe we need to ensure the
      destination is pre-zeroed. Use kzalloc() in the nvram code and
      initialise the u64 to zero in the ps3 code.
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      [mpe: Use kzalloc() in the nvram code, flesh out change log]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
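
The pattern described above, copying into a pre-zeroed fixed-width field with memcpy(), might look like this in user space. The struct and function names are hypothetical, and `memset()` stands in for the zeroing that kzalloc() provides in the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative header with a fixed-width name that need not be NUL-terminated. */
struct part_header_sketch {
    char name[12];
};

/* Zero the field, then copy at most sizeof(name) bytes of the source string. */
void set_part_name(struct part_header_sketch *hdr, const char *name)
{
    assert(hdr != NULL && name != NULL);

    size_t len = strlen(name);
    if (len > sizeof(hdr->name))
        len = sizeof(hdr->name);
    memset(hdr->name, 0, sizeof(hdr->name));
    memcpy(hdr->name, name, len);
}

/* Pack up to 8 characters of text into a zero-initialised 64-bit value. */
uint64_t pack_text(const char *text)
{
    assert(text != NULL);

    uint64_t n = 0;
    size_t len = strlen(text);
    if (len > sizeof(n))
        len = sizeof(n);
    memcpy(&n, text, len);
    return n;
}
```

Because the destination is zeroed first and the length is clamped explicitly, short names are padded with zero bytes and long names are truncated deliberately, so there is no implicit truncation for the compiler to warn about.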
    • Merge branch 'topic/pkey' into next · 2135a6ec
      Michael Ellerman authored
      This is a branch with a mixture of mm, x86 and powerpc commits all
      relating to some minor cross-arch pkeys consolidation. The x86/mm
      changes have been reviewed by Ingo & Dave Hansen and the tree has been
      in linux-next for some weeks without issue.
    • Merge branch 'fixes' into next · f1079d3a
      Michael Ellerman authored
      We ended up with an ugly conflict between fixes and next in ftrace.h
      involving multiple nested ifdefs, and the automatic resolution is
      wrong. So merge fixes into next so we can fix it up.
    • Merge branch 'topic/kbuild' into next · b5240b14
      Michael Ellerman authored
      Merge in some commits we're sharing with the kbuild tree.
    • Merge branch 'topic/ppc-kvm' into next · 481c63ac
      Michael Ellerman authored
      Merge in some commits we're sharing with the kvm-ppc tree.
  2. 01 Jun, 2018 7 commits
    • powerpc/mm: Fix kernel crash on page table free · 667416f3
      Aneesh Kumar K.V authored
      Fix the below crash on Book3E 64. pgtable_page_dtor expects struct
      page *arg.
      
      Also call the destructor on non book3s platforms correctly. This frees
      up the split PTL locks correctly if we had allocated them before.
      
      Call Trace:
        .kmem_cache_free+0x9c/0x44c (unreliable)
        .ptlock_free+0x1c/0x30
        .tlb_remove_table+0xdc/0x224
        .free_pgd_range+0x298/0x500
        .shift_arg_pages+0x10c/0x1e0
        .setup_arg_pages+0x200/0x25c
        .load_elf_binary+0x450/0x16c8
        .search_binary_handler.part.11+0x9c/0x248
        .do_execveat_common.isra.13+0x868/0xc18
        .run_init_process+0x34/0x4c
        .try_to_run_init_process+0x1c/0x68
        .kernel_init+0xdc/0x130
        .ret_from_kernel_thread+0x58/0x7c
      
      Fixes: 70234676 ("powerpc/mm/nohash: Remove pte fragment dependency from nohash")
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/prom: Fix %u/%llx usage since prom_printf() change · 8af1da40
      Mathieu Malaterre authored
      In commit eae5f709 ("powerpc: Add __printf verification to
      prom_printf") __printf attribute was added to prom_printf(), which
      means GCC started warning about type/format mismatches. As part of
      that commit we changed some "%lx" formats to "%llx" where the type is
      actually unsigned long long.
      
      Unfortunately prom_printf() doesn't know how to print "%llx"; it just
      prints a literal "lx", e.g.:
      
        reserved memory map:
          lx - lx
          lx - lx
      
      prom_printf() also doesn't know how to print "%u" (only "%lu"); it
      just prints a literal "u", e.g.:
      
        Max number of cores passed to firmware: u (NR_CPUS = 2048)
      
      Instead of:
      
        Max number of cores passed to firmware: 2048 (NR_CPUS = 2048)
      
      This commit adds support for the missing formatters.
      
      Fixes: eae5f709 ("powerpc: Add __printf verification to prom_printf")
      Reported-by: Michael Ellerman <mpe@ellerman.id.au>
      Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Mathieu Malaterre <malat@debian.org>
      Tested-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
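
To show why the length modifier matters, here is a minimal user-space formatter that counts the `l` characters before choosing how wide an argument to fetch. It is an illustrative sketch, not prom_printf(): `mini_format` and its helpers are hypothetical, and it supports only `%s`, `%u`, and `%x` with zero, one, or two `l` modifiers.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Append one character if there is room, always leaving space for the NUL. */
static void put_char(char *buf, size_t size, size_t *pos, char c)
{
    if (*pos + 1 < size)
        buf[(*pos)++] = c;
}

/* Append an unsigned value in base 10 or 16. */
static void put_number(char *buf, size_t size, size_t *pos,
                       unsigned long long val, unsigned int base)
{
    char tmp[24];
    int i = 0;

    do {
        tmp[i++] = "0123456789abcdef"[val % base];
        val /= base;
    } while (val != 0);
    while (i > 0)
        put_char(buf, size, pos, tmp[--i]);
}

/* Format into buf; returns the number of characters stored (excluding the NUL). */
size_t mini_format(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    size_t pos = 0;

    assert(fmt != NULL && (buf != NULL || size == 0));
    va_start(ap, fmt);
    for (const char *p = fmt; *p != '\0'; p++) {
        int longs = 0;
        unsigned long long val;

        if (*p != '%') {
            put_char(buf, size, &pos, *p);
            continue;
        }
        p++;
        while (*p == 'l' && longs < 2) {  /* count "l" length modifiers */
            longs++;
            p++;
        }
        if (*p == 's') {
            for (const char *s = va_arg(ap, const char *); *s != '\0'; s++)
                put_char(buf, size, &pos, *s);
        } else if (*p == 'u' || *p == 'x') {
            if (longs == 2)
                val = va_arg(ap, unsigned long long);
            else if (longs == 1)
                val = va_arg(ap, unsigned long);
            else
                val = va_arg(ap, unsigned int);
            put_number(buf, size, &pos, val, *p == 'u' ? 10 : 16);
        } else if (*p == '\0') {
            break;                           /* stray '%' at end of format */
        } else {
            put_char(buf, size, &pos, *p);   /* unknown conversion: print literally */
        }
    }
    va_end(ap);
    if (size > 0)
        buf[pos] = '\0';
    return pos;
}
```

A formatter that handled only a single `l` would treat the second `l` of `%llx` as an unknown conversion and emit the remaining characters literally, which is the kind of output shown in the commit message.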
    • cxl: Configure PSL to not use APC virtual machines · 9a6d2022
      Vaibhav Jain authored
      APC virtual machines aren't used on POWER-9 chips and are already
      disabled in on-chip CAPP. They also need to be disabled on the PSL via
      'PSL Data Send Control Register' by setting bit(47). This forces the
      PSL to send commands to CAPP with queue.id == 0.
      
      Fixes: 56328743 ("cxl: Add support for POWER9 DD2")
      Cc: stable@vger.kernel.org # v4.15+
      Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
      Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
      Reviewed-by: Alastair D'Silva <alastair@d-silva.org>
      Reviewed-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
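
Setting "bit(47)" of a 64-bit register can be sketched as below. The example assumes IBM-style MSB-0 numbering, where bit 0 is the most significant bit; the commit message does not state the convention, so treat that as an assumption. `msb0_bit` and `set_dsnctl_bit47` are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

/* Mask for bit `n` of a 64-bit value, counting from the most significant bit. */
uint64_t msb0_bit(unsigned int n)
{
    assert(n < 64);
    return UINT64_C(1) << (63 - n);
}

/* Return the register value with bit 47 (MSB-0 numbering) set. */
uint64_t set_dsnctl_bit47(uint64_t dsnctl)
{
    return dsnctl | msb0_bit(47);
}
```

Under this assumption bit 47 corresponds to the mask 0x10000. With LSB-0 numbering the mask would instead be `1 << 47`, so the convention should be checked against the hardware documentation.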
    • cxl: Disable prefault_mode in Radix mode · b6c84ba2
      Vaibhav Jain authored
      Currently we see a kernel-oops reported on Power-9 while attaching a
      context to an AFU, with radix-mode and sysfs attr 'prefault_mode' set
      to anything other than 'none'. The backtrace of the oops is of this
      form:
      
        Unable to handle kernel paging request for data at address 0x00000080
        Faulting instruction address: 0xc00800000bcf3b20
        cpu 0x1: Vector: 300 (Data Access) at [c00000037f003800]
            pc: c00800000bcf3b20: cxl_load_segment+0x178/0x290 [cxl]
            lr: c00800000bcf39f0: cxl_load_segment+0x48/0x290 [cxl]
            sp: c00000037f003a80
           msr: 9000000000009033
           dar: 80
         dsisr: 40000000
          current = 0xc00000037f280000
          paca    = 0xc0000003ffffe600   softe: 3        irq_happened: 0x01
            pid   = 3529, comm = afp_no_int
        <snip>
        cxl_prefault+0xfc/0x248 [cxl]
        process_element_entry_psl9+0xd8/0x1a0 [cxl]
        cxl_attach_dedicated_process_psl9+0x44/0x130 [cxl]
        native_attach_process+0xc0/0x130 [cxl]
        afu_ioctl+0x3f4/0x5e0 [cxl]
        do_vfs_ioctl+0xdc/0x890
        ksys_ioctl+0x68/0xf0
        sys_ioctl+0x40/0xa0
        system_call+0x58/0x6c
      
      The issue arises because on Power-8 the AFU attr 'prefault_mode' was
      used to improve initial storage fault performance by prefaulting
      process segments. However, on Power-9 with radix mode we don't have
      Storage-Segments that we can prefault. Also, prefaulting process Pages
      would be too costly and fine-grained.

      Hence, since the prefaulting mechanism doesn't make sense for
      radix mode, this patch updates prefault_mode_store() to not allow any
      value other than CXL_PREFAULT_NONE when radix mode is enabled.
      
      Fixes: f24be42a ("cxl: Add psl9 specific code")
      Cc: stable@vger.kernel.org # v4.12+
      Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
      Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
      Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/kbuild: Use flags variables rather than overriding LD/CC/AS · 1421dc6d
      Nicholas Piggin authored
      The powerpc toolchain can compile combinations of 32/64 bit and
      big/little endian, so it's convenient to consider, e.g.,
      
        `CC -m64 -mbig-endian`
      
      to be the C compiler for the purpose of invoking it to build target
      artifacts. So overriding the CC variable to include these flags works
      for this purpose.
      
      Unfortunately that is not compatible with the way the proposed new
      Kconfig macro language will work.
      
      After previous patches in this series, these flags can be carefully
      passed in using flags instead.
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/kbuild: Remove CROSS32 defines from top level powerpc Makefile · af3901cb
      Nicholas Piggin authored
      Switch VDSO32 build over to use CROSS32_COMPILE directly, and have
      it pass in -m32 after the standard c_flags. This allows endianness
      overrides to be removed and the endian and bitness flags moved into
      standard flags variables.
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/kbuild: Set default generic machine type for 32-bit compile · 4bf4f42a
      Nicholas Piggin authored
      Some 64-bit toolchains use the wrong ISA variant for compiling 32-bit
      kernels, even with -m32. Debian's powerpc64le is one such case, and
      that is because it is built with --with-cpu=power8.
      
      So when cross compiling a 32-bit kernel with a 64-bit toolchain, set
      -mcpu=powerpc initially, which is the generic 32-bit powerpc machine
      type and scheduling model. CPU and platform code can override this
      with subsequent -mcpu flags if necessary.
      
      This is not done for 32-bit toolchains, because it would override
      their defaults, which are presumably set appropriately for the
      environment (more so than a 64-bit cross compiler).

      This fixes a lot of build failures due to incompatible assembly when
      compiling a 32-bit kernel with the Debian powerpc64le 64-bit toolchain.
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  3. 29 May, 2018 1 commit
  4. 28 May, 2018 4 commits
  5. 25 May, 2018 13 commits