1. 16 Dec, 2021 8 commits
    • powerpc/64s/interrupt: Don't enable MSR[EE] in irq handlers unless perf is in use · 0faf20a1
      Nicholas Piggin authored
      Enabling MSR[EE] in interrupt handlers while interrupts are still soft
      masked allows PMIs to profile interrupt handlers to some degree, beyond
      what SIAR latching allows.
      
      When perf is not being used, this is almost useless work. It requires an
      extra mtmsrd in the irq handler, and it also opens the door to masked
      interrupts hitting and requiring replay, which is more expensive than
      just taking them directly. This effect can be noticeable in high IRQ
      workloads.
      
      Avoid enabling MSR[EE] unless perf is currently in use. This saves about
      60 cycles (or 8%) on a simple decrementer interrupt microbenchmark.
      Replayed interrupts drop from 1.4% of all interrupts taken, to 0.003%.
      
      This does prevent the soft-nmi interrupt being taken in these handlers,
      but that's not too reliable anyway. The SMP watchdog will continue to be
      the reliable way to catch lockups.
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20210922145452.352571-5-npiggin@gmail.com
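      The trade-off described in this commit can be illustrated with a small
      user-space model. This is only a sketch, not kernel code: struct
      cpu_model, irq_handler_enter() and interrupt_arrives() are hypothetical
      names invented here, and the model assumes that an interrupt arriving
      while MSR[EE] is set and interrupts are soft-masked has to be recorded
      and replayed later.

```c
#include <stdbool.h>

/* Hypothetical model of one CPU's interrupt state (illustration only). */
struct cpu_model {
    bool msr_ee;           /* MSR[EE]: hardware external-interrupt enable */
    bool soft_masked;      /* interrupts are soft-disabled */
    unsigned mtmsrd;       /* number of extra MSR writes performed */
    unsigned replayed;     /* interrupts taken while masked, to be replayed */
    unsigned held_pending; /* interrupts left pending in hardware */
};

/* Enter an irq handler: soft-masked, and EE is set only if perf is in use. */
void irq_handler_enter(struct cpu_model *cpu, bool perf_in_use)
{
    cpu->soft_masked = true;
    cpu->msr_ee = false;
    if (perf_in_use) {
        cpu->msr_ee = true; /* costs one extra mtmsrd */
        cpu->mtmsrd++;
    }
}

/* A new interrupt arrives while the handler is still running. */
void interrupt_arrives(struct cpu_model *cpu)
{
    if (cpu->msr_ee && cpu->soft_masked)
        cpu->replayed++;     /* taken, found masked, queued for replay */
    else if (!cpu->msr_ee)
        cpu->held_pending++; /* stays pending; taken directly later */
}
```

      In this model, a handler entered with perf_in_use false performs no
      extra MSR write and never queues a replay, which corresponds to the
      two costs the commit message says are avoided.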
    • powerpc/64s/perf: add power_pmu_wants_prompt_pmi to say whether perf wants PMIs to be soft-NMI · 5a7745b9
      Nicholas Piggin authored
      Interrupt code enables MSR[EE] in some irq handlers while keeping local
      irqs disabled via soft-mask, allowing PMI interrupts to be taken as
      soft-NMI to improve profiling of irq handlers.
      
      When perf is not enabled, there is no point in doing this; it is only
      additional overhead. So provide a function that can say if PMIs should
      be taken promptly if possible.
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20210922145452.352571-4-npiggin@gmail.com
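      A predicate of this kind might look roughly like the sketch below. The
      struct pmu_state type, its fields, and the function name
      wants_prompt_pmi() are assumptions made for illustration; the commit
      message does not say which state the real power_pmu_wants_prompt_pmi()
      consults. The sketch assumes "perf is in use" means a PMU driver is
      registered and at least one event is active on the CPU.

```c
#include <stdbool.h>

/* Hypothetical per-CPU perf state (illustration only, not the kernel's). */
struct pmu_state {
    bool pmu_registered; /* a PMU driver is available */
    int n_events;        /* events currently active on this CPU */
};

/* Return true only if prompt PMI delivery would benefit perf right now. */
bool wants_prompt_pmi(const struct pmu_state *s)
{
    if (!s->pmu_registered)
        return false;
    return s->n_events > 0;
}
```

      Interrupt entry code could then call such a predicate before deciding
      whether to pay for enabling MSR[EE].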
    • powerpc/64s/interrupt: handle MSR EE and RI in interrupt entry wrapper · ff0b0d6e
      Nicholas Piggin authored
      The mtmsrd to enable MSR[RI] can be combined with the mtmsrd to enable
      MSR[EE] in interrupt entry code, for those interrupts which enable EE.
      This helps performance of important synchronous interrupts (e.g., page
      faults).
      
      This is similar to what commit dd152f70 ("powerpc/64s: system call
      avoid setting MSR[RI] until we set MSR[EE]") does for system calls.
      
      Do this by enabling EE and RI together at the beginning of the entry
      wrapper if PACA_IRQ_HARD_DIS is clear, and only enabling RI if it is
      set.
      
      Asynchronous interrupts set PACA_IRQ_HARD_DIS, but synchronous ones
      leave it unchanged, so by default they always get EE=1 unless they have
      interrupted a caller that is hard disabled. When the sync interrupt
      later calls interrupt_cond_local_irq_enable(), it will not require
      another mtmsrd because MSR[EE] was already enabled here.
      
      This avoids one mtmsrd L=1 for synchronous interrupts on 64s, which
      saves about 20 cycles on POWER9. And for kernel-mode interrupts, both
      synchronous and asynchronous, this saves an additional 40 cycles due to
      the mtmsrd being moved ahead of mfspr SPRN_AMR, which prevents an SPR
      scoreboard stall.
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20210922145452.352571-3-npiggin@gmail.com
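      The entry-wrapper decision described in this commit can be sketched as
      a pure function over the MSR value. This is a user-space illustration
      only: entry_msr_update() is a hypothetical name, and the constant
      values are given as understood from the kernel headers and should be
      treated as assumptions.

```c
#include <stdint.h>

#define MSR_EE            0x8000ULL /* external interrupt enable */
#define MSR_RI            0x0002ULL /* recoverable interrupt */
#define PACA_IRQ_HARD_DIS 0x01      /* "hard disabled" flag in irq_happened */

/*
 * Compute the MSR after the single combined update at the start of the
 * entry wrapper: RI is always set; EE is set with it unless the
 * interrupted context (or an asynchronous interrupt) is hard disabled.
 */
uint64_t entry_msr_update(uint64_t msr, uint8_t irq_happened)
{
    if (irq_happened & PACA_IRQ_HARD_DIS)
        return msr | MSR_RI;
    return msr | MSR_RI | MSR_EE;
}
```

      Because both bits are set by one update, a later request to enable
      interrupts in a synchronous handler finds MSR[EE] already set and
      needs no second mtmsrd.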
    • powerpc/64/interrupt: make normal synchronous interrupts enable MSR[EE] if possible · 4423eb5a
      Nicholas Piggin authored
      Make synchronous interrupt handler entry wrappers enable MSR[EE] if
      MSR[EE] was enabled in the interrupted context. IRQs are soft-disabled
      at this point so there is no change to high level code, but it does
      mean a masked interrupt could fire.
      
      This is a performance disadvantage for interrupts which do not later
      call interrupt_cond_local_irq_enable(), because an additional mtmsrd
      or wrtee instruction is executed. However the important synchronous
      interrupts (e.g., page fault) do enable interrupts, so the performance
      disadvantage is mostly avoided.

      In the next patch, MSR[RI] enabling can be combined with MSR[EE]
      enabling, which mitigates the performance drop for the former and gives
      a performance advantage for the latter interrupts, on 64s machines. 64e
      is coming along for the ride for now to avoid divergences with 64s in
      this tricky code.
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20210922145452.352571-2-npiggin@gmail.com
    • powerpc/pseries/vas: Don't print an error when VAS is unavailable · 0a006ace
      Nicholas Piggin authored
      KVM does not support VAS, so guests always print a useless error on boot:
      
          vas: HCALL(398) error -2, query_type 0, result buffer 0x57f2000
      
      Change this to only print the message if the error is not H_FUNCTION.
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20211126052133.1664375-1-npiggin@gmail.com
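      The filtering rule can be sketched as follows. This is a user-space
      illustration, not the driver code: should_log_query_error() and
      report_query_result() are hypothetical names, and the value of
      H_FUNCTION is taken from the "error -2" shown in the message above.

```c
#include <stdbool.h>
#include <stdio.h>

#define H_SUCCESS  0L
#define H_FUNCTION (-2L) /* hcall not supported by the hypervisor */

/* Only failures other than "function not supported" deserve a message. */
bool should_log_query_error(long rc)
{
    return rc != H_SUCCESS && rc != H_FUNCTION;
}

/* Report the hcall result; returns 0 on success and -1 on any failure. */
int report_query_result(long rc, unsigned query_type, FILE *log)
{
    if (rc == H_SUCCESS)
        return 0;
    if (should_log_query_error(rc))
        fprintf(log, "vas: HCALL error %ld, query_type %u\n", rc, query_type);
    return -1;
}
```

      The caller still sees a failure when VAS is unavailable; only the log
      message is suppressed.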
    • powerpc/perf: Add data source encodings for power10 platform · 6ed05a8e
      Kajol Jain authored
      The code represents memory/cache level data using the PERF_MEM_LVL_*
      namespace, which is being deprecated in favour of the newer composite
      PERF_MEM_{LVLNUM_,REMOTE_,SNOOPX_,HOPS_} fields. Add data source
      encodings to represent cache/memory data using the newer composite
      fields.
      
      Add data source encodings to represent data coming from local
      memory/Remote memory/distant memory and remote/distant cache hits.
      
      In order to represent data coming from OpenCAPI cache/memory, use the
      LVLNUM "PMEM" field, which is used to represent persistent memory
      accesses.

      Result on a power10 system with the patch applied:
      
      localhost:# ./perf mem report --sort="mem,sym,dso" --stdio
       # Overhead       Samples  Memory access             Symbol                      Shared Object
       # ........  ............  ........................  ..........................  ................
       #
          29.46%          2331  L1 or L1 hit              [.] __random                                     libc-2.28.so
          23.11%          2121  L1 or L1 hit              [.] producer_populate_cache                      producer_consumer
          18.56%          1758  L1 or L1 hit              [.] __random_r                                   libc-2.28.so
          15.64%          1559  L2 or L2 hit              [.] __random                                     libc-2.28.so
          .....
          0.09%              5  Remote socket, same board Any cache hit             [.] __random         libc-2.28.so
          0.07%              4  Remote socket, same board Any cache hit             [.] __random         libc-2.28.so
          .....
      Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
      Reviewed-by: Madhavan Srinivasan <maddy@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20211206091749.87585-5-kjain@linux.ibm.com
    • powerpc/perf: Add encodings to represent data based on newer composite PERF_MEM_LVLNUM* fields · 4a20ee10
      Kajol Jain authored
      The code represents data coming from L1/L2/L3 cache hits using the
      PERF_MEM_LVL_* namespace, which is being deprecated in favour of the
      newer composite PERF_MEM_{LVLNUM_,REMOTE_,SNOOPX_,HOPS_} fields.

      Add data source encodings to represent L1/L2/L3 cache hits based on
      the newer composite PERF_MEM_{LVLNUM_,REMOTE_,SNOOPX_,HOPS_} fields for
      power10 and older platforms.

      Result on a power9 system without the patch:
      
      localhost:# ./perf mem report --sort="mem,sym,dso" --stdio
       # Overhead       Samples  Memory access             Symbol                             Shared Object
       # ........  ............  ........................  .................................  ................
       #
          29.51%             1  L2 hit                    [k] perf_event_exec                [kernel.vmlinux]
          27.05%             1  L1 hit                    [k] perf_ctx_unlock                [kernel.vmlinux]
          13.93%             1  L1 hit                    [k] vtime_delta                    [kernel.vmlinux]
          13.11%             1  L1 hit                    [k] prepend_path.isra.11           [kernel.vmlinux]
           8.20%             1  L1 hit                    [.] 00000038.plt_call.__GI_strlen  libc-2.28.so
           8.20%             1  L1 hit                    [k] perf_event_interrupt           [kernel.vmlinux]
      
      Result on a power9 system with the patch applied:
      
      localhost:# ./perf mem report --sort="mem,sym,dso" --stdio
       # Overhead       Samples  Memory access             Symbol                      Shared Object
       # ........  ............  ........................  ..........................  ................
       #
          36.63%             1  L2 or L2 hit              [k] perf_event_exec         [kernel.vmlinux]
          25.50%             1  L1 or L1 hit              [k] vtime_delta             [kernel.vmlinux]
          13.12%             1  L1 or L1 hit              [k] unmap_region            [kernel.vmlinux]
          12.62%             1  L1 or L1 hit              [k] perf_sample_event_took  [kernel.vmlinux]
           6.93%             1  L1 or L1 hit              [k] perf_ctx_unlock         [kernel.vmlinux]
           5.20%             1  L1 or L1 hit              [.] __memcpy_power7         libc-2.28.so
      Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
      Reviewed-by: Madhavan Srinivasan <maddy@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20211206091749.87585-4-kjain@linux.ibm.com
    • perf: Add new macros for mem_hops field · cb1c4aba
      Kajol Jain authored
      Add new macros for mem_hops field which can be used to
      represent remote-node, socket and board level details.
      
      Currently the code has a macro only for HOPS_0, which corresponds
      to data coming from another core on the same node.
      Add new macros for HOPS_1 to HOPS_3 to represent
      remote-node, socket and board level data.

      For example, the encodings of the mem_hops field with L2 cache:
      
      L2			- local L2
      L2 | REMOTE | HOPS_0	- remote core, same node L2
      L2 | REMOTE | HOPS_1	- remote node, same socket L2
      L2 | REMOTE | HOPS_2	- remote socket, same board L2
      L2 | REMOTE | HOPS_3	- remote board L2
      Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20211206091749.87585-2-kjain@linux.ibm.com
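      The L2 encoding table in this commit can be read as a lookup from a
      (remote, hops) pair to a distance. The sketch below is a user-space
      illustration only: enum mem_hops, l2_source_names and
      l2_distance_rank() are hypothetical names, and the enum's numeric
      values are illustrative rather than a statement of the uapi header
      values.

```c
/* Illustrative hop levels; the numeric values here are assumptions. */
enum mem_hops { HOPS_0 = 1, HOPS_1, HOPS_2, HOPS_3 };

/* Descriptions indexed by distance rank, taken from the commit's table. */
const char *const l2_source_names[] = {
    "local L2",
    "remote core, same node L2",
    "remote node, same socket L2",
    "remote socket, same board L2",
    "remote board L2",
};

/* Distance rank: 0 for a local hit, otherwise 1..4 for HOPS_0..HOPS_3. */
int l2_distance_rank(int remote, enum mem_hops hops)
{
    if (!remote)
        return 0; /* the hops value is ignored for local data */
    return (int)hops;
}
```

      For instance, l2_source_names[l2_distance_rank(1, HOPS_2)] yields the
      "remote socket, same board L2" entry of the table.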
  2. 15 Dec, 2021 1 commit
  3. 14 Dec, 2021 1 commit
  4. 09 Dec, 2021 30 commits