  1. 16 Sep, 2020 1 commit
    • powerpc/64s/radix: Fix mm_cpumask trimming race vs kthread_use_mm · a665eec0
      Nicholas Piggin authored
      Commit 0cef77c7 ("powerpc/64s/radix: flush remote CPUs out of
      single-threaded mm_cpumask") added a mechanism to trim the mm_cpumask of
      a process under certain conditions. One of its assumptions is that
      mm_users would not be incremented via a reference taken outside the
      process context with mmget_not_zero(), which then goes on to
      kthread_use_mm() via that reference.
      
      That invariant was broken by io_uring code (see previous sparc64 fix),
      but I'll point Fixes: to the original powerpc commit because we are
      changing that assumption going forward, so this will make backports
      match up.
      
      Fix this by no longer relying on that assumption, but by having each CPU
      check that the mm is not being used, and clear its own bit from the mask
      only if the mm hasn't been switched-to by the time the IPI is processed.
      
      This relies on commit 38cf307c ("mm: fix kthread_use_mm() vs TLB
      invalidate") and ARCH_WANT_IRQS_OFF_ACTIVATE_MM to disable irqs over mm
      switch sequences.
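      
      A minimal sketch of the shape of that per-CPU check, modelled loosely on
      the radix TLB flush IPI handler (the names and the local flush helper
      here are illustrative, not the exact kernel code):
      
        static void exit_lazy_flush_tlb_sketch(void *arg)
        {
                struct mm_struct *mm = arg;

                /* The mm has been switched-to as a user mm: keep our bit. */
                if (current->mm == mm)
                        return;

                if (current->active_mm == mm) {
                        /*
                         * A kernel thread is using the mm lazily: switch to
                         * init_mm so no new TLB entries can be created.
                         */
                        mmgrab(&init_mm);
                        current->active_mm = &init_mm;
                        switch_mm_irqs_off(mm, &init_mm, current);
                        mmdrop(mm);
                }

                /* Only now is it safe for this CPU to clear its own bit. */
                cpumask_clear_cpu(smp_processor_id(), mm_cpumask(mm));
                local_flush_pid_sketch(mm); /* hypothetical tlbiel PID flush */
        }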
      
      Fixes: 0cef77c7 ("powerpc/64s/radix: flush remote CPUs out of single-threaded mm_cpumask")
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Reviewed-by: Michael Ellerman <mpe@ellerman.id.au>
      Depends-on: 38cf307c ("mm: fix kthread_use_mm() vs TLB invalidate")
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20200914045219.3736466-5-npiggin@gmail.com
  2. 16 Jul, 2020 1 commit
  3. 20 May, 2020 1 commit
  4. 17 Mar, 2020 1 commit
  5. 25 Jan, 2020 1 commit
  6. 05 Nov, 2019 3 commits
  7. 24 Sep, 2019 2 commits
  8. 05 Sep, 2019 3 commits
  9. 03 Jul, 2019 4 commits
  10. 30 May, 2019 1 commit
  11. 15 May, 2019 2 commits
    • powerpc/mm/radix: mark __tlbie_pid() and friends as __always_inline · efc344c5
      Masahiro Yamada authored
      This prepares to move CONFIG_OPTIMIZE_INLINING from x86 to a common
      place.  We need to eliminate potential issues beforehand.
      
      If it is enabled for powerpc, the following errors are reported:
      
        arch/powerpc/mm/tlb-radix.c: In function '__tlbie_lpid':
        arch/powerpc/mm/tlb-radix.c:148:2: warning: asm operand 3 probably doesn't match constraints
          asm volatile(PPC_TLBIE_5(%0, %4, %3, %2, %1)
          ^~~
        arch/powerpc/mm/tlb-radix.c:148:2: error: impossible constraint in 'asm'
        arch/powerpc/mm/tlb-radix.c: In function '__tlbie_pid':
        arch/powerpc/mm/tlb-radix.c:118:2: warning: asm operand 3 probably doesn't match constraints
          asm volatile(PPC_TLBIE_5(%0, %4, %3, %2, %1)
          ^~~
        arch/powerpc/mm/tlb-radix.c: In function '__tlbiel_pid':
        arch/powerpc/mm/tlb-radix.c:104:2: warning: asm operand 3 probably doesn't match constraints
          asm volatile(PPC_TLBIEL(%0, %4, %3, %2, %1)
          ^~~
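      
      The root cause, sketched with a hypothetical helper rather than the real
      PPC_TLBIE_5() macro: the "i" constraints require the field arguments to
      be compile-time constants, which only holds when the function is
      actually inlined into callers that pass literals. __always_inline
      guarantees that even when CONFIG_OPTIMIZE_INLINING lets GCC uninline it:
      
        /* Illustrative sketch only; the real code builds the instruction
         * word with PPC_TLBIE_5() rather than an extended mnemonic. */
        static __always_inline void tlbie_sketch(unsigned long rb,
                                                 unsigned long rs,
                                                 int ric, int prs, int r)
        {
                /* "i": ric/prs/r must be known constants at compile time. */
                asm volatile("tlbie %0,%1,%2,%3,%4"
                             : : "r" (rb), "r" (rs),
                                 "i" (ric), "i" (prs), "i" (r)
                             : "memory");
        }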
      
      Link: http://lkml.kernel.org/r/20190423034959.13525-11-yamada.masahiro@socionext.com
      Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Boris Brezillon <bbrezillon@kernel.org>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Norris <computersforpeace@gmail.com>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Marek Vasut <marek.vasut@gmail.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Malaterre <malat@debian.org>
      Cc: Miquel Raynal <miquel.raynal@bootlin.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Russell King <rmk+kernel@arm.linux.org.uk>
      Cc: Stefan Agner <stefan@agner.ch>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • powerpc/mm/radix: mark __radix__flush_tlb_range_psize() as __always_inline · e12d6d7d
      Masahiro Yamada authored
      This prepares to move CONFIG_OPTIMIZE_INLINING from x86 to a common
      place.  We need to eliminate potential issues beforehand.
      
      If it is enabled for powerpc, the following error is reported:
      
        arch/powerpc/mm/tlb-radix.c: In function '__radix__flush_tlb_range_psize':
        arch/powerpc/mm/tlb-radix.c:104:2: error: asm operand 3 probably doesn't match constraints [-Werror]
          asm volatile(PPC_TLBIEL(%0, %4, %3, %2, %1)
          ^~~
        arch/powerpc/mm/tlb-radix.c:104:2: error: impossible constraint in 'asm'
      
      Link: http://lkml.kernel.org/r/20190423034959.13525-10-yamada.masahiro@socionext.com
      Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Boris Brezillon <bbrezillon@kernel.org>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Norris <computersforpeace@gmail.com>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Marek Vasut <marek.vasut@gmail.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Malaterre <malat@debian.org>
      Cc: Miquel Raynal <miquel.raynal@bootlin.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Russell King <rmk+kernel@arm.linux.org.uk>
      Cc: Stefan Agner <stefan@agner.ch>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  12. 02 May, 2019 1 commit
  13. 20 Oct, 2018 1 commit
  14. 09 Oct, 2018 1 commit
    • KVM: PPC: Book3S HV: Handle page fault for a nested guest · fd10be25
      Suraj Jitindar Singh authored
      Consider a normal (L1) guest running under the main hypervisor (L0),
      and then a nested guest (L2) running under the L1 guest which is acting
      as a nested hypervisor. L0 has page tables to map the address space for
      L1 providing the translation from L1 real address -> L0 real address;
      
      	L1
      	|
      	| (L1 -> L0)
      	|
      	----> L0
      
      There are also page tables in L1 used to map the address space for L2
      providing the translation from L2 real address -> L1 real address. Since
      the hardware can only walk a single level of page table, we need to
      maintain in L0 a "shadow_pgtable" for L2 which provides the translation
      from L2 real address -> L0 real address, which looks like;
      
      	L2				L2
      	|				|
      	| (L2 -> L1)			|
      	|				|
      	----> L1			| (L2 -> L0)
      	      |				|
      	      | (L1 -> L0)		|
      	      |				|
      	      ----> L0			--------> L0
      
      When a page fault occurs while running a nested (L2) guest we need to
      insert a pte into this "shadow_pgtable" for the L2 -> L0 mapping. To
      do this we need to:
      
      1. Walk the pgtable in L1 memory to find the L2 -> L1 mapping, and
         provide a page fault to L1 if this mapping doesn't exist.
      2. Use our L1 -> L0 pgtable to convert this L1 address to an L0 address,
         or try to insert a pte for that mapping if it doesn't exist.
      3. Now that we have an L2 -> L0 mapping, insert this into our shadow_pgtable.
      
      Once this mapping exists we can take rc faults when hardware is unable
      to automatically set the reference and change bits in the pte. On these
      we need to:
      
      1. Check that the rc bits on the L2 -> L1 pte match, and otherwise
         reflect the fault down to L1.
      2. Set the rc bits in the L1 -> L0 pte which corresponds to the same
         host page.
      3. Set the rc bits in the L2 -> L0 pte.
      
      As we reuse a large number of functions in book3s_64_mmu_radix.c for
      this, we also needed to refactor a number of those functions to take an
      lpid parameter so that the correct lpid is used for TLB invalidations.
      The functionality, however, has remained the same.
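      
      A hedged pseudocode outline of the page-fault path above (the first
      numbered list); every name here is a hypothetical placeholder, not an
      actual KVM HV function:
      
        struct nested_guest;    /* stand-in for the real L2 state */

        /* hypothetical helpers, assumed to exist for this sketch */
        int walk_l1_pgtable(struct nested_guest *l2, unsigned long l2_ra,
                            unsigned long *l1_ra);
        int map_l1_to_l0(unsigned long l1_ra, unsigned long *l0_ra);
        int insert_shadow_pte(struct nested_guest *l2, unsigned long l2_ra,
                              unsigned long l0_ra);
        int reflect_fault_to_l1(struct nested_guest *l2, unsigned long l2_ra);

        static int nested_page_fault_sketch(struct nested_guest *l2,
                                            unsigned long l2_ra)
        {
                unsigned long l1_ra, l0_ra;

                /* 1. Walk the L2 -> L1 table held in L1 memory; if there is
                 *    no mapping, the fault belongs to L1. */
                if (walk_l1_pgtable(l2, l2_ra, &l1_ra))
                        return reflect_fault_to_l1(l2, l2_ra);

                /* 2. Translate L1 -> L0 with our own table, inserting a pte
                 *    for that mapping if it doesn't exist yet. */
                if (map_l1_to_l0(l1_ra, &l0_ra))
                        return -1;

                /* 3. We now know L2 -> L0: install it in the shadow_pgtable
                 *    so the hardware can walk it directly. */
                return insert_shadow_pte(l2, l2_ra, l0_ra);
        }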
      Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  15. 04 Oct, 2018 1 commit
  16. 16 Jul, 2018 1 commit
  17. 19 Jun, 2018 2 commits
  18. 03 Jun, 2018 2 commits
    • powerpc/64s/radix: flush remote CPUs out of single-threaded mm_cpumask · 0cef77c7
      Nicholas Piggin authored
      When a single-threaded process has a non-local mm_cpumask, try to use
      that point to flush the TLBs out of other CPUs in the cpumask.
      
      An IPI is used for clearing remote CPUs for a few reasons:
      - An IPI can end lazy TLB use of the mm, which is required to prevent
        TLB entries being created on the remote CPU. The alternative is to
        drop lazy TLB switching completely, which costs 7.5% in a context
        switch ping-pong test between a process and a kernel idle thread.
      - An IPI can have remote CPUs flush the entire PID, but the local CPU
        can flush a specific VA. tlbie would require over-flushing of the
        local CPU (where the process is running).
      - A single-threaded process that is migrated to a different CPU is
        likely to have a relatively small mm_cpumask, so IPI is reasonable.
      
      No other thread can concurrently switch to this mm, because it must
      have been given a reference to mm_users by the current thread before it
      can use_mm. mm_users can be asynchronously incremented (by
      mm_activate or mmget_not_zero), but those users must use remote mm
      access and can't use_mm or access user address space. Existing code
      makes this assumption already; for example, sparc64 has reset
      mm_cpumask using this condition since the start of history, see
      arch/sparc/kernel/smp_64.c.
      
      This reduces tlbies for a kernel compile workload from 0.90M to 0.12M;
      tlbiels increase significantly due to the PID flushing used to clean up
      remote CPUs and the additional local flushes (a PID flush takes 128
      tlbiels vs 1 tlbie).
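      
      A rough sketch of the trimming flow (names are illustrative; the wait
      and the division of work follow the description above, not the exact
      kernel code):
      
        static void flush_and_clear_cpu_sketch(void *arg); /* IPI handler */

        static void trim_mm_cpumask_sketch(struct mm_struct *mm)
        {
                /*
                 * Only valid while mm_users == 1: no other thread can
                 * concurrently switch to this mm, so remote CPUs hold at
                 * most lazy references.
                 */
                smp_call_function_many(mm_cpumask(mm),
                                       flush_and_clear_cpu_sketch, mm, 1);

                /*
                 * Each remote CPU flushes its TLB for the whole PID and
                 * clears its own mm_cpumask bit in the IPI handler; the
                 * local CPU can then flush just the required VAs.
                 */
        }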
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/64s/radix: optimise pte_update · 85bcfaf6
      Nicholas Piggin authored
      Implementing pte_update with pte_xchg (which uses cmpxchg) is
      inefficient. A single larx/stcx. sequence works fine; there is no need
      for the less efficient cmpxchg loop.
      
      Then remove the memory barriers from the operation. TLB flushing must
      load mm_cpumask after the store that reduces pte permissions; that
      ordering requirement is moved into the TLB flush code.
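      
      A hedged sketch of that larx/stcx. loop, following the shape of
      __radix_pte_update() (operand layout illustrative):
      
        static inline unsigned long pte_update_sketch(unsigned long *ptep,
                                                      unsigned long clr,
                                                      unsigned long set)
        {
                unsigned long old_pte, new_pte;

                __asm__ __volatile__(
                "1:     ldarx   %0,0,%3         \n" /* load-reserve pte */
                "       andc    %1,%0,%4        \n" /* clear 'clr' bits */
                "       or      %1,%1,%5        \n" /* set 'set' bits */
                "       stdcx.  %1,0,%3         \n" /* store-conditional */
                "       bne-    1b              \n" /* reservation lost: retry */
                : "=&r" (old_pte), "=&r" (new_pte), "+m" (*ptep)
                : "r" (ptep), "r" (clr), "r" (set)
                : "cc");

                /* No memory barriers here: ordering against mm_cpumask is
                 * the TLB flush code's job now. */
                return old_pte;
        }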
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  19. 17 May, 2018 1 commit
  20. 12 Apr, 2018 1 commit
    • powerpc/mm/radix: Fix checkstops caused by invalid tlbiel · 2675c13b
      Michael Ellerman authored
      In tlbiel_radix_set_isa300() we use the PPC_TLBIEL() macro to
      construct tlbiel instructions. The instruction takes 5 fields, two of
      which are registers, and the others are constants. But because it's
      constructed with inline asm the compiler doesn't know that.
      
      We got the constraint wrong on the 'r' field: using "r" tells the
      compiler to put the value in a register. The value the macro then sees
      is the *register number*, not the value of the field.
      
      That means when we mask the register number with 0x1 we get 0 or 1
      depending on which register the compiler happens to put the constant
      in, eg:
      
        li      r10,1
        tlbiel  r8,r9,2,0,0
      
        li      r7,1
        tlbiel  r10,r6,0,0,1
      
      If we're unlucky we might generate an invalid instruction form, for
      example RIC=0, PRS=1 and R=0, tlbiel r8,r7,0,1,0, this has been
      observed to cause machine checks:
      
        Oops: Machine check, sig: 7 [#1]
        CPU: 24 PID: 0 Comm: swapper
        NIP:  00000000000385f4 LR: 000000000100ed00 CTR: 000000000000007f
        REGS: c00000000110bb40 TRAP: 0200
        MSR:  9000000000201003 <SF,HV,ME,RI,LE>  CR: 48002222  XER: 20040000
        CFAR: 00000000000385d0 DAR: 0000000000001c00 DSISR: 00000200 SOFTE: 1
      
      If the machine check happens early in boot while we have MSR_ME=0 it
      will escalate into a checkstop and kill the box entirely.
      
      To fix it we could change the inline asm constraint to "i" which
      tells the compiler the value is a constant. But a better fix is to just
      pass a literal 1 into the macro, which bypasses any problems with inline
      asm constraints.
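      
      To illustrate with a hypothetical simplified helper (not the real
      PPC_TLBIEL() macro):
      
        static inline void tlbiel_sketch(unsigned long rb, unsigned long rs)
        {
                /*
                 * RIC/PRS/R are fields of the instruction itself, so they
                 * must reach the asm as compile-time constants. A literal
                 * (here RIC=0, PRS=0, R=1) sidesteps constraint problems;
                 * an "i" constraint would also work, but "r" silently
                 * substitutes the register *number* instead of the value.
                 */
                asm volatile("tlbiel %0,%1,0,0,1"
                             : : "r" (rb), "r" (rs) : "memory");
        }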
      
      Fixes: d4748276 ("powerpc/64s: Improve local TLB flush for boot and MCE on POWER9")
      Cc: stable@vger.kernel.org # v4.16+
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
  21. 01 Apr, 2018 1 commit
  22. 30 Mar, 2018 1 commit
  23. 27 Mar, 2018 1 commit
  24. 23 Mar, 2018 4 commits
  25. 21 Jan, 2018 1 commit
  26. 17 Jan, 2018 1 commit
    • powerpc/64s: Improve local TLB flush for boot and MCE on POWER9 · d4748276
      Nicholas Piggin authored
      There are several cases outside the normal address space management
      where a CPU's entire local TLB is to be flushed:
      
        1. Booting the kernel, in case something has left stale entries in
           the TLB (e.g., kexec).
      
        2. Machine check, to clean corrupted TLB entries.
      
      One other place where the TLB is flushed is waking from deep idle
      states. The flush is a side-effect of calling ->cpu_restore with the
      intention of re-setting various SPRs. The flush itself is unnecessary
      because, in this case, the TLB should not acquire new corrupted entries
      as part of sleep/wake (though existing entries may be lost).
      
      This type of TLB flush is coded inflexibly, several times for each CPU
      type, and they have a number of problems with ISA v3.0B:
      
      - The current radix mode of the MMU is not taken into account; it is
        always done as a hash flush. For IS=2 (LPID-matching flush from host)
        and IS=3 with HV=0 (guest kernel flush), tlbie(l) is undefined if
        the R field does not match the current radix mode.
      
      - ISA v3.0B hash must flush the partition and process table caches as
        well.
      
      - ISA v3.0B radix must flush partition and process scoped translations,
        partition and process table caches, and also the page walk cache.
      
      So consolidate the flushing code and implement it in C and inline asm
      under the mm/ directory with the rest of the flush code. Add ISA v3.0B
      cases for radix and hash, and use the radix flush in radix environment.
      
      Provide a way for IS=2 (LPID flush) to specify the radix mode of the
      partition. Have KVM pass in the radix mode of the guest.
      
      Take out the flushes from early cputable/dt_cpu_ftrs detection hooks,
      and move them later in the boot process, after the MMU registers are
      set up and before relocation is first turned on.
      
      The TLB flush is no longer called when restoring from deep idle states.
      This was not done as a separate step because booting secondaries
      uses the same cpu_restore as idle restore, which needs the TLB flush.
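      
      A hedged sketch of the consolidated local flush in the spirit of the
      above (the kernel's radix__tlbiel_all() is the real code);
      tlbiel_radix_set() is a hypothetical per-set helper taking
      (set, ric, prs):
      
        /* hypothetical per-set tlbiel helper, assumed for this sketch */
        static void tlbiel_radix_set(unsigned int set, int ric, int prs);

        static void tlbiel_all_radix_sketch(unsigned int num_sets)
        {
                unsigned int set;

                asm volatile("ptesync" : : : "memory");

                /* Partition scope (PRS=0): flushing set 0 with RIC=2 also
                 * flushes the table caches and the page walk cache. */
                tlbiel_radix_set(0, 2, 0);
                for (set = 1; set < num_sets; set++)
                        tlbiel_radix_set(set, 0, 0); /* RIC=0: TLB only */

                /* Process scope: repeat with PRS=1. */
                tlbiel_radix_set(0, 2, 1);
                for (set = 1; set < num_sets; set++)
                        tlbiel_radix_set(set, 0, 1);

                asm volatile("ptesync" : : : "memory");
        }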
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>