1. 07 Oct, 2019 40 commits
    • MIPS: Loongson-3: Add CSR IPI support · ffe59ee3
      Huacai Chen authored
      CSR IPI and legacy MMIO IPI use the same infrastructure, but CSR IPI is
      faster than legacy MMIO IPI. This patch enables CSR IPI where possible
      (except for MailBox, because CSR IPI is too complicated for MailBox).
      Signed-off-by: Huacai Chen <chenhc@lemote.com>
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: James Hogan <jhogan@kernel.org>
      Cc: linux-mips@linux-mips.org
      Cc: linux-mips@vger.kernel.org
      Cc: Fuxin Zhang <zhangfx@lemote.com>
      Cc: Zhangjin Wu <wuzhangjin@gmail.com>
      Cc: Huacai Chen <chenhuacai@gmail.com>
    • MIPS: Loongson: Add Loongson-3A R4 basic support · 7507445b
      Huacai Chen authored
      The Loongson-3 CPU family:
      
      Code-name         Brand-name       PRId
      Loongson-3A R1    Loongson-3A1000  0x6305
      Loongson-3A R2    Loongson-3A2000  0x6308
      Loongson-3A R2.1  Loongson-3A2000  0x630c
      Loongson-3A R3    Loongson-3A3000  0x6309
      Loongson-3A R3.1  Loongson-3A3000  0x630d
      Loongson-3A R4    Loongson-3A4000  0xc000
      Loongson-3B R1    Loongson-3B1000  0x6306
      Loongson-3B R2    Loongson-3B1500  0x6307
      
      Features of R4 revision of Loongson-3A:
      
        - All R2/R3 features, including SFB, V-Cache, FTLB, RIXI, DSP, etc.
        - Support variable ASID bits.
        - Support MSA and VZ extensions.
        - Support CPUCFG (CPU config) and CSR (Control and Status Register)
            extensions.
        - 64 entries of VTLB (classic TLB), 2048 entries of FTLB (8-way
            set-associative).
      
      64-bit Loongson processors now have three types of PRID.IMP: 0x6300 is
      the classic one, so we call it PRID_IMP_LOONGSON_64C (e.g., Loongson-2E/
      2F/3A1000/3B1000/3B1500/3A2000/3A3000); 0x6100 is for some processors
      which have reduced capabilities, so we call it PRID_IMP_LOONGSON_64R
      (e.g., Loongson-2K); 0xc000 is supposed to cover all new processors in
      general (e.g., Loongson-3A4000+), so we call it PRID_IMP_LOONGSON_64G.
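
The three PRID.IMP classes described above can be illustrated with a small userspace sketch. The 0x6300, 0x6100 and 0xc000 values and the PRID_IMP_LOONGSON_* names come from the commit message; the 0xff00 mask for the implementation field and the helper `loongson64_family()` are assumptions made for illustration and are not taken from the patch.

```c
#include <stdint.h>

/* Assumed mask: the implementation field occupies bits 15:8 of PRId. */
#define PRID_IMP_MASK          0xff00u

/* Values quoted from the commit message. */
#define PRID_IMP_LOONGSON_64R  0x6100u  /* reduced, e.g. Loongson-2K */
#define PRID_IMP_LOONGSON_64C  0x6300u  /* classic, e.g. Loongson-3A1000 */
#define PRID_IMP_LOONGSON_64G  0xc000u  /* general, e.g. Loongson-3A4000+ */

/* Hypothetical helper (not a kernel function): name the PRID.IMP class. */
static const char *loongson64_family(uint32_t prid)
{
	switch (prid & PRID_IMP_MASK) {
	case PRID_IMP_LOONGSON_64R:
		return "64R";
	case PRID_IMP_LOONGSON_64C:
		return "64C";
	case PRID_IMP_LOONGSON_64G:
		return "64G";
	default:
		return "unknown";
	}
}
```

With the table above, 0x6305 (Loongson-3A R1) and 0x630d (Loongson-3A R3.1) both fall into the 64C class, while 0xc000 (Loongson-3A R4) falls into 64G.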
      Signed-off-by: Huacai Chen <chenhc@lemote.com>
      Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: James Hogan <jhogan@kernel.org>
      Cc: linux-mips@linux-mips.org
      Cc: linux-mips@vger.kernel.org
      Cc: Fuxin Zhang <zhangfx@lemote.com>
      Cc: Zhangjin Wu <wuzhangjin@gmail.com>
      Cc: Huacai Chen <chenhuacai@gmail.com>
    • MIPS: Loongson: Add CPUCFG&CSR support · 6a6f9b7d
      Huacai Chen authored
      Loongson-3A R4+ (Loongson-3A4000 and newer) has CPUCFG (CPU config) and
      CSR (Control and Status Register) extensions. This patch adds read/write
      functionality for them.
      Signed-off-by: Huacai Chen <chenhc@lemote.com>
      Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: James Hogan <jhogan@kernel.org>
      Cc: linux-mips@linux-mips.org
      Cc: linux-mips@vger.kernel.org
      Cc: Fuxin Zhang <zhangfx@lemote.com>
      Cc: Zhangjin Wu <wuzhangjin@gmail.com>
      Cc: Huacai Chen <chenhuacai@gmail.com>
    • mips: sgi-ip27: switch from DISCONTIGMEM to SPARSEMEM · 397dc00e
      Mike Rapoport authored
      The memory initialization of SGI-IP27 is already half-way to supporting
      SPARSEMEM. It only had free_bootmem_with_active_regions() left-overs
      interfering with sparse_memory_present_with_active_regions().
      
      Replace these calls with a simpler memblocks_present() call in
      prom_meminit() and adjust arch/mips/Kconfig to enable SPARSEMEM and
      SPARSEMEM_EXTREME for SGI-IP27.
      Co-developed-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
      Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: linux-mips@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
    • MIPS: Check Loongson3 LL/SC errata workaround correctness · e4acfbc1
      Paul Burton authored
      When Loongson3 LL/SC errata workarounds are enabled (i.e.
      CONFIG_CPU_LOONGSON3_WORKAROUNDS=y) run a tool to scan through the
      compiled kernel & ensure that the workaround is applied correctly. That
      is, ensure that:
      
        - Every LL or LLD instruction is preceded by a sync instruction.
      
        - Any branches from within an LL/SC loop to outside of that loop
          target a sync instruction.
      
      Reasoning for these conditions can be found by reading the comment above
      the definition of __SYNC_loongson3_war in arch/mips/include/asm/sync.h.
      
      This tool will help ensure that we don't inadvertently introduce code
      paths that miss the required workarounds.
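
The first of the two conditions above can be sketched in a few lines of C. This is only an illustration under loud assumptions: the real tool scans the compiled kernel, whereas this hypothetical `ll_preceded_by_sync()` works on an array of mnemonic strings, treats "preceded" as "immediately preceded", and does not attempt the branch-target condition at all.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical simplification: instructions are plain mnemonic strings. */
static bool is_ll(const char *mnemonic)
{
	return strcmp(mnemonic, "ll") == 0 || strcmp(mnemonic, "lld") == 0;
}

/*
 * Return true if every ll/lld in the sequence is immediately preceded
 * by a sync. An ll/lld at index 0 has no predecessor and so fails.
 */
static bool ll_preceded_by_sync(const char *const *insns, size_t count)
{
	for (size_t i = 0; i < count; i++) {
		if (!is_ll(insns[i]))
			continue;
		if (i == 0 || strcmp(insns[i - 1], "sync") != 0)
			return false;
	}
	return true;
}
```

For example, the sequence sync, ll, addiu, sc, beqz passes this check, while addiu, ll, sc does not.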
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: linux-mips@vger.kernel.org
      Cc: Huacai Chen <chenhc@lemote.com>
      Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
      Cc: linux-kernel@vger.kernel.org
    • MIPS: genex: Don't reload address unnecessarily · 4dee90d7
      Paul Burton authored
      In ejtag_debug_handler() we must reload the address of
      ejtag_debug_buffer_spinlock if an sc fails, since the address in k0 will
      have been clobbered by the result of the sc instruction. In the case
      where we simply load a non-zero value (i.e. there's contention for the
      lock) the address will not be clobbered & we can simply branch back to
      repeat the load from memory without reloading the address into k0.
      
      The primary motivation for this change is that it moves the target of
      the bnez instruction to an instruction within the LL/SC loop (the LL
      itself), which we know contains no other memory accesses & therefore
      isn't affected by Loongson3 LL/SC errata.
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: linux-mips@vger.kernel.org
      Cc: Huacai Chen <chenhc@lemote.com>
      Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
      Cc: linux-kernel@vger.kernel.org
    • MIPS: genex: Add Loongson3 LL/SC workaround to ejtag_debug_handler · 12dbb04f
      Paul Burton authored
      In ejtag_debug_handler we use LL & SC instructions to acquire & release
      an open-coded spinlock. For Loongson3 systems affected by LL/SC errata
      this requires that we insert a sync instruction prior to the LL in order
      to ensure correct behavior of the LL/SC loop.
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: linux-mips@vger.kernel.org
      Cc: Huacai Chen <chenhc@lemote.com>
      Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
      Cc: linux-kernel@vger.kernel.org
    • MIPS: barrier: Make __smp_mb__before_atomic() a no-op for Loongson3 · ae4cd0b1
      Paul Burton authored
      Loongson3 systems with CONFIG_CPU_LOONGSON3_WORKAROUNDS enabled already
      emit a full completion barrier as part of the inline assembly containing
      LL/SC loops for atomic operations. As such the barrier emitted by
      __smp_mb__before_atomic() is redundant, and we can remove it.
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: linux-mips@vger.kernel.org
      Cc: Huacai Chen <chenhc@lemote.com>
      Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
      Cc: linux-kernel@vger.kernel.org
    • MIPS: barrier: Remove loongson_llsc_mb() · 7f56b123
      Paul Burton authored
      The loongson_llsc_mb() macro is no longer used - instead barriers are
      emitted as part of inline asm using the __SYNC() macro. Remove the
      now-defunct loongson_llsc_mb() macro.
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: linux-mips@vger.kernel.org
      Cc: Huacai Chen <chenhc@lemote.com>
      Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
      Cc: linux-kernel@vger.kernel.org
    • MIPS: syscall: Emit Loongson3 sync workarounds within asm · e84957e6
      Paul Burton authored
      Generate the sync instructions required to work around Loongson3 LL/SC
      errata within inline asm blocks, which feels a little safer than doing
      it from C where strictly speaking the compiler would be well within its
      rights to insert a memory access between the separate asm statements we
      previously had, containing sync & ll instructions respectively.
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: linux-mips@vger.kernel.org
      Cc: Huacai Chen <chenhc@lemote.com>
      Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
      Cc: linux-kernel@vger.kernel.org
    • MIPS: futex: Emit Loongson3 sync workarounds within asm · 3c1d3f09
      Paul Burton authored
      Generate the sync instructions required to work around Loongson3 LL/SC
      errata within inline asm blocks, which feels a little safer than doing
      it from C where strictly speaking the compiler would be well within its
      rights to insert a memory access between the separate asm statements we
      previously had, containing sync & ll instructions respectively.
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: linux-mips@vger.kernel.org
      Cc: Huacai Chen <chenhc@lemote.com>
      Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
      Cc: linux-kernel@vger.kernel.org
    • MIPS: cmpxchg: Omit redundant barriers for Loongson3 · a91f2a1d
      Paul Burton authored
      When building a kernel configured to support Loongson3 LL/SC workarounds
      (i.e. CONFIG_CPU_LOONGSON3_WORKAROUNDS=y) the inline assembly in
      __xchg_asm() & __cmpxchg_asm() already emits completion barriers, and as
      such we don't need to emit extra barriers from the xchg() or cmpxchg()
      macros. Add compile-time constant checks causing us to omit the
      redundant memory barriers.
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: linux-mips@vger.kernel.org
      Cc: Huacai Chen <chenhc@lemote.com>
      Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
      Cc: linux-kernel@vger.kernel.org
    • MIPS: cmpxchg: Emit Loongson3 sync workarounds within asm · 6a57d2d1
      Paul Burton authored
      Generate the sync instructions required to work around Loongson3 LL/SC
      errata within inline asm blocks, which feels a little safer than doing
      it from C where strictly speaking the compiler would be well within its
      rights to insert a memory access between the separate asm statements we
      previously had, containing sync & ll instructions respectively.
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: linux-mips@vger.kernel.org
      Cc: Huacai Chen <chenhc@lemote.com>
      Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
      Cc: linux-kernel@vger.kernel.org
    • MIPS: bitops: Use smp_mb__before_atomic in test_* ops · 90267377
      Paul Burton authored
      Use smp_mb__before_atomic() rather than smp_mb__before_llsc() in
      test_and_set_bit(), test_and_clear_bit() & test_and_change_bit(). The
      _atomic() versions make semantic sense in these cases, and will allow a
      later patch to omit redundant barriers for Loongson3 systems that
      already include a barrier within __test_bit_op().
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: linux-mips@vger.kernel.org
      Cc: Huacai Chen <chenhc@lemote.com>
      Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
      Cc: linux-kernel@vger.kernel.org
    • MIPS: bitops: Emit Loongson3 sync workarounds within asm · 5bb29275
      Paul Burton authored
      Generate the sync instructions required to work around Loongson3 LL/SC
      errata within inline asm blocks, which feels a little safer than doing
      it from C where strictly speaking the compiler would be well within its
      rights to insert a memory access between the separate asm statements we
      previously had, containing sync & ll instructions respectively.
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: linux-mips@vger.kernel.org
      Cc: Huacai Chen <chenhc@lemote.com>
      Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
      Cc: linux-kernel@vger.kernel.org
    • MIPS: bitops: Use BIT_WORD() & BITS_PER_LONG · c042be02
      Paul Burton authored
      Rather than using custom SZLONG_LOG & SZLONG_MASK macros to shift & mask
      a bit index to form word & bit offsets respectively, make use of the
      standard BIT_WORD() & BITS_PER_LONG macros for the same purpose.
      
      volatile is added to the definition of pointers to the long-sized word
      we'll operate on, in order to prevent the compiler complaining that we
      cast away the volatile qualifier of the addr argument. This should have
      no effect on generated code, which in the LL/SC case is inline asm
      anyway & in the non-LLSC case access is constrained by compiler barriers
      provided by raw_local_irq_{save,restore}().
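
The word and bit offset computation that BIT_WORD() and BITS_PER_LONG provide can be sketched in userspace as follows. The macro definitions here are local stand-ins that mirror the kernel names, and the `sketch_*` helpers are hypothetical and deliberately non-atomic; they only show how a bit index is split into a word index and a mask.

```c
#include <limits.h>

/* Local stand-ins for the kernel macros of the same names. */
#define BITS_PER_LONG  (sizeof(long) * CHAR_BIT)
#define BIT_WORD(nr)   ((nr) / BITS_PER_LONG)
#define BIT_MASK(nr)   (1UL << ((nr) % BITS_PER_LONG))

/* Hypothetical, non-atomic illustration of setting bit nr in a bitmap. */
static void sketch_set_bit(unsigned long nr, unsigned long *addr)
{
	addr[BIT_WORD(nr)] |= BIT_MASK(nr);
}

/* Hypothetical helper: return 1 if bit nr is set, else 0. */
static int sketch_test_bit(unsigned long nr, const unsigned long *addr)
{
	return (addr[BIT_WORD(nr)] & BIT_MASK(nr)) != 0;
}
```

Setting bit BITS_PER_LONG + 3 in a two-word bitmap leaves word 0 untouched and sets bit 3 of word 1.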
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: linux-mips@vger.kernel.org
      Cc: Huacai Chen <chenhc@lemote.com>
      Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
      Cc: linux-kernel@vger.kernel.org
    • MIPS: bitops: Abstract LL/SC loops · cc99987c
      Paul Burton authored
      Introduce __bit_op() & __test_bit_op() macros which abstract away the
      implementation of LL/SC loops. This cuts down on a lot of duplicate
      boilerplate code, and also allows R10000_LLSC_WAR to be handled outside
      of the individual bitop functions.
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: linux-mips@vger.kernel.org
      Cc: Huacai Chen <chenhc@lemote.com>
      Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
      Cc: linux-kernel@vger.kernel.org
    • MIPS: bitops: Avoid redundant zero-comparison for non-LLSC · aad028ca
      Paul Burton authored
      The IRQ-disabling non-LLSC fallbacks for bitops on UP systems already
      return a zero or one, so there's no need to perform another comparison
      against zero. Move these comparisons into the LLSC paths to avoid the
      redundant work.
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: linux-mips@vger.kernel.org
      Cc: Huacai Chen <chenhc@lemote.com>
      Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
      Cc: linux-kernel@vger.kernel.org
    • MIPS: bitops: Use the BIT() macro · d6103510
      Paul Burton authored
      Use the BIT() macro in asm/bitops.h rather than open-coding its
      equivalent.
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: linux-mips@vger.kernel.org
      Cc: Huacai Chen <chenhc@lemote.com>
      Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
      Cc: linux-kernel@vger.kernel.org
    • MIPS: bitops: Allow immediates in test_and_{set,clear,change}_bit · a2e66b86
      Paul Burton authored
      The logical operations or & xor used in the test_and_set_bit_lock(),
      test_and_clear_bit() & test_and_change_bit() functions currently force
      the value 1<<bit to be placed in a register. If the bit is compile-time
      constant & fits within the immediate field of an or/xor instruction (i.e.
      16 bits) then we can make use of the ori/xori instruction variants &
      avoid the use of an extra register. Add the extra "i" constraints in
      order to allow use of these immediate encodings.
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: linux-mips@vger.kernel.org
      Cc: Huacai Chen <chenhc@lemote.com>
      Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
      Cc: linux-kernel@vger.kernel.org
    • MIPS: bitops: Implement test_and_set_bit() in terms of _lock variant · 6bbe043b
      Paul Burton authored
      The only difference between test_and_set_bit() & test_and_set_bit_lock()
      is memory ordering barrier semantics - the former provides a full
      barrier whilst the latter only provides acquire semantics.
      
      We can therefore implement test_and_set_bit() in terms of
      test_and_set_bit_lock() with the addition of the extra memory barrier.
      Do this in order to avoid duplicating logic.
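
The layering described above can be sketched with C11 atomics. This is a userspace approximation, not the kernel code: it uses `atomic_fetch_or_explicit()` in place of an LL/SC loop, and it maps the extra memory barrier onto a sequentially consistent fence, which is an assumption about the closest portable equivalent. The `sk_` names are hypothetical.

```c
#include <limits.h>
#include <stdatomic.h>

#define SK_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Acquire-only variant: sets the bit and returns its previous value. */
static int sk_test_and_set_bit_lock(unsigned long nr,
				    _Atomic unsigned long *addr)
{
	unsigned long mask = 1UL << (nr % SK_BITS_PER_LONG);
	unsigned long old;

	old = atomic_fetch_or_explicit(&addr[nr / SK_BITS_PER_LONG], mask,
				       memory_order_acquire);
	return (old & mask) != 0;
}

/* Fully ordered variant: one extra barrier, then reuse the _lock logic. */
static int sk_test_and_set_bit(unsigned long nr, _Atomic unsigned long *addr)
{
	atomic_thread_fence(memory_order_seq_cst);
	return sk_test_and_set_bit_lock(nr, addr);
}
```

The point of the design is that the bit manipulation lives in exactly one place; the stronger variant differs only by the added barrier.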
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: linux-mips@vger.kernel.org
      Cc: Huacai Chen <chenhc@lemote.com>
      Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
      Cc: linux-kernel@vger.kernel.org
    • MIPS: bitops: ins start position is always an immediate · 27aab272
      Paul Burton authored
      The start position for an ins instruction is always encoded as an
      immediate, so allowing registers to be used by the inline asm makes no
      sense. It should never happen anyway since a bit index should always be
      small enough to be treated as an immediate, but remove the nonsensical
      "r" for sanity.
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: linux-mips@vger.kernel.org
      Cc: Huacai Chen <chenhc@lemote.com>
      Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
      Cc: linux-kernel@vger.kernel.org
    • MIPS: bitops: Use MIPS_ISA_REV, not #ifdefs · 59361e99
      Paul Burton authored
      Rather than #ifdef on CONFIG_CPU_* to determine whether the ins
      instruction is supported we can simply check MIPS_ISA_REV to discover
      whether we're targeting MIPSr2 or higher. Do so in order to clean up the
      code.
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: linux-mips@vger.kernel.org
      Cc: Huacai Chen <chenhc@lemote.com>
      Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
      Cc: linux-kernel@vger.kernel.org
    • MIPS: bitops: Only use ins for bit 16 or higher · 3d2920cf
      Paul Burton authored
      set_bit() can set bits 0-15 using an ori instruction, rather than
      loading the value -1 into a register & then using an ins instruction.
      
      That is, rather than the following:
      
        li   t0, -1
        ll   t1, 0(t2)
        ins  t1, t0, 4, 1
        sc   t1, 0(t2)
      
      We can have the simpler:
      
        ll   t1, 0(t2)
        ori  t1, t1, 0x10
        sc   t1, 0(t2)
      
      The or path already allows immediates to be used, so simply restricting
      the ins path to bits that don't fit in immediates is sufficient to take
      advantage of this.
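
The selection rule described above can be written down as a small predicate. This is a hypothetical sketch, not the code from the patch: `sk_use_ins()` and `sk_ori_imm()` are illustrative names, and the ISA revision is passed as a plain parameter rather than read from MIPS_ISA_REV.

```c
#include <stdbool.h>

/*
 * Hypothetical predicate: take the ins path only when the instruction is
 * available (MIPSr2 or higher) and the bit does not fit in the 16-bit
 * immediate of ori.
 */
static bool sk_use_ins(unsigned int bit, int isa_rev)
{
	return isa_rev >= 2 && bit >= 16;
}

/* Immediate operand for the ori path; only meaningful for bits 0-15. */
static unsigned int sk_ori_imm(unsigned int bit)
{
	return 1u << bit;
}
```

For bit 4 this picks the ori path with immediate 0x10, which matches the shorter instruction sequence shown above.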
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: linux-mips@vger.kernel.org
      Cc: Huacai Chen <chenhc@lemote.com>
      Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
      Cc: linux-kernel@vger.kernel.org
    • MIPS: bitops: Handle !kernel_uses_llsc first · fe7cd97e
      Paul Burton authored
      Reorder conditions in our various bitops functions that check
      kernel_uses_llsc such that they handle the !kernel_uses_llsc case first.
      This allows us to avoid the need to duplicate the kernel_uses_llsc check
      in all the other cases. For functions that don't involve barriers common
      to the various implementations, we switch to returning from within each
      if block making each case easier to read in isolation.
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: linux-mips@vger.kernel.org
      Cc: Huacai Chen <chenhc@lemote.com>
      Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
      Cc: linux-kernel@vger.kernel.org
    • MIPS: atomic: Deduplicate 32b & 64b read, set, xchg, cmpxchg · 1da7bce8
      Paul Burton authored
      Remove the remaining duplication between 32b & 64b in asm/atomic.h by
      making use of an ATOMIC_OPS() macro to generate:
      
        - atomic_read()/atomic64_read()
        - atomic_set()/atomic64_set()
        - atomic_cmpxchg()/atomic64_cmpxchg()
        - atomic_xchg()/atomic64_xchg()
      
      This is consistent with the way all other functions in asm/atomic.h are
      generated, and ensures consistency between the 32b & 64b functions.
      
      Of note is that this results in the above now being static inline
      functions rather than macros.
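
The idea of generating both widths from one macro can be sketched in userspace with C11 atomics. The macro name `SK_ATOMIC_OPS` and the `sk_atomic*` prefixes are hypothetical, and the real ATOMIC_OPS() macro in asm/atomic.h takes different arguments and uses kernel primitives rather than <stdatomic.h>.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical generator: one definition produces read, set, xchg and
 * cmpxchg as static inline functions for a given prefix and value type.
 */
#define SK_ATOMIC_OPS(pfx, type)					\
static inline type pfx##_read(_Atomic type *v)				\
{									\
	return atomic_load_explicit(v, memory_order_relaxed);		\
}									\
static inline void pfx##_set(_Atomic type *v, type i)			\
{									\
	atomic_store_explicit(v, i, memory_order_relaxed);		\
}									\
static inline type pfx##_xchg(_Atomic type *v, type n)			\
{									\
	return atomic_exchange(v, n);					\
}									\
static inline type pfx##_cmpxchg(_Atomic type *v, type o, type n)	\
{									\
	atomic_compare_exchange_strong(v, &o, n);			\
	return o;							\
}

SK_ATOMIC_OPS(sk_atomic, int32_t)
SK_ATOMIC_OPS(sk_atomic64, int64_t)
```

Because both families come from the same template, a fix applied to the 32b functions cannot silently be missed in the 64b ones. Here `_cmpxchg` returns the value observed before the operation, whether or not the swap happened.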
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: linux-mips@vger.kernel.org
      Cc: Huacai Chen <chenhc@lemote.com>
      Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
      Cc: linux-kernel@vger.kernel.org
    • MIPS: atomic: Unify 32b & 64b sub_if_positive · 40e784b4
      Paul Burton authored
      Unify the definitions of atomic_sub_if_positive() &
      atomic64_sub_if_positive() using a macro like we do for most other
      atomic functions. This allows us to share the implementation ensuring
      consistency between the two. Notably this provides the appropriate
      loongson3_war barriers in the atomic64_sub_if_positive() case which were
      previously missing.
      
      The code is rearranged a little to handle the !kernel_uses_llsc case
      first in order to de-indent the LL/SC case & allow us not to go over 80
      characters per line.
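
A userspace sketch of sharing one sub_if_positive implementation between the two widths is shown below. It swaps the kernel's LL/SC loop for a C11 compare-and-exchange loop, and it assumes the usual semantics of the operation: the subtraction is stored only if the result is non-negative, and the function returns the old value minus i either way. The `SK_SUB_IF_POSITIVE` macro and `sk_` names are hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical generator: one body serves both the 32b and 64b variants. */
#define SK_SUB_IF_POSITIVE(pfx, type)					\
static inline type pfx##_sub_if_positive(type i, _Atomic type *v)	\
{									\
	type old = atomic_load_explicit(v, memory_order_relaxed);	\
	type result;							\
									\
	do {								\
		result = old - i;					\
		if (result < 0)						\
			break;						\
	} while (!atomic_compare_exchange_weak(v, &old, result));	\
									\
	return result;							\
}

SK_SUB_IF_POSITIVE(sk_atomic, int32_t)
SK_SUB_IF_POSITIVE(sk_atomic64, int64_t)
```

On a failed exchange the loop retries with the freshly observed value, so the decision to store is always made against the value actually replaced.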
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: linux-mips@vger.kernel.org
      Cc: Huacai Chen <chenhc@lemote.com>
      Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
      Cc: linux-kernel@vger.kernel.org
    • MIPS: atomic: Use _atomic barriers in atomic_sub_if_positive() · 77d281b7
      Paul Burton authored
      Use smp_mb__before_atomic() & smp_mb__after_atomic() in
      atomic_sub_if_positive() rather than the equivalent
      smp_mb__before_llsc() & smp_llsc_mb(). The former are more standard &
      this prepares us for avoiding redundant barriers on Loongson3 in
      a later patch.
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: linux-mips@vger.kernel.org
      Cc: Huacai Chen <chenhc@lemote.com>
      Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
      Cc: linux-kernel@vger.kernel.org
    • MIPS: atomic: Emit Loongson3 sync workarounds within asm · 4d1dbfe6
      Paul Burton authored
      Generate the sync instructions required to work around Loongson3 LL/SC
      errata within inline asm blocks, which feels a little safer than doing
      it from C where strictly speaking the compiler would be well within its
      rights to insert a memory access between the separate asm statements we
      previously had, containing sync & ll instructions respectively.
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: linux-mips@vger.kernel.org
      Cc: Huacai Chen <chenhc@lemote.com>
      Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
      Cc: linux-kernel@vger.kernel.org
    • MIPS: atomic: Use one macro to generate 32b & 64b functions · a38ee6bb
      Paul Burton authored
      Cut down on duplication by generalizing the ATOMIC_OP(),
      ATOMIC_OP_RETURN() & ATOMIC_FETCH_OP() macros to work for both 32b &
      64b atomics, and removing the ATOMIC64_ variants. This ensures
      consistency between our atomic_* & atomic64_* functions.
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: linux-mips@vger.kernel.org
      Cc: Huacai Chen <chenhc@lemote.com>
      Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
      Cc: linux-kernel@vger.kernel.org
    • MIPS: atomic: Handle !kernel_uses_llsc first · 9537db24
      Paul Burton authored
      Handle the !kernel_uses_llsc path first in our ATOMIC_OP(),
      ATOMIC_OP_RETURN() & ATOMIC_FETCH_OP() macros & return from within the
      block. This allows us to de-indent the kernel_uses_llsc path by one
      level which will be useful when making further changes.
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: linux-mips@vger.kernel.org
      Cc: Huacai Chen <chenhc@lemote.com>
      Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
      Cc: linux-kernel@vger.kernel.org
    • MIPS: atomic: Fix whitespace in ATOMIC_OP macros · 36d3295c
      Paul Burton authored
      We define macros in asm/atomic.h which end each line with space
      characters before a backslash to continue on the next line. Remove the
      space characters leaving tabs as the whitespace used for conformity with
      coding convention.
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: linux-mips@vger.kernel.org
      Cc: Huacai Chen <chenhc@lemote.com>
      Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
      Cc: linux-kernel@vger.kernel.org
    • MIPS: barrier: Clean up sync_ginv() · 185d7d7a
      Paul Burton authored
      Use the new __SYNC() infrastructure to implement sync_ginv(), for
      consistency with much of the rest of asm/barrier.h.
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: linux-mips@vger.kernel.org
      Cc: Huacai Chen <chenhc@lemote.com>
      Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
      Cc: linux-kernel@vger.kernel.org
    • MIPS: barrier: Clean up __sync() definition · fe0065e5
      Paul Burton authored
      Implement __sync() using the new __SYNC() infrastructure, which will
      take care of not emitting an instruction for old R3k CPUs that don't
      support it. The only behavioral difference is that __sync() will now
      provide a compiler barrier on these old CPUs, but that seems like
      reasonable behavior anyway.
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: linux-mips@vger.kernel.org
      Cc: Huacai Chen <chenhc@lemote.com>
      Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
      Cc: linux-kernel@vger.kernel.org
    • MIPS: barrier: Remove fast_mb() Octeon #ifdef'ery · 5c12a6ef
      Paul Burton authored
      The definition of fast_mb() is the same in both the Octeon & non-Octeon
      cases, so remove the duplication & define it only once.
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: linux-mips@vger.kernel.org
      Cc: Huacai Chen <chenhc@lemote.com>
      Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
      Cc: linux-kernel@vger.kernel.org
    • MIPS: barrier: Clean up __smp_mb() definition · 05e6da74
      Paul Burton authored
      We #ifdef on Cavium Octeon CPUs, but emit the same sync instruction in
      both cases. Remove the #ifdef & simply expand to the __sync() macro.
      
      Whilst here indent the strong ordering case definitions to match the
      indentation of the weak ordering ones, helping readability.
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: linux-mips@vger.kernel.org
      Cc: Huacai Chen <chenhc@lemote.com>
      Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
      Cc: linux-kernel@vger.kernel.org
    • MIPS: barrier: Clean up rmb() & wmb() definitions · 21e3134b
      Paul Burton authored
      Simplify our definitions of rmb() & wmb() using the new __SYNC()
      infrastructure.
      
      The fast_rmb() & fast_wmb() macros are removed, since they only provided
      a level of indirection that made the code less readable & weren't
      directly used anywhere in the kernel tree.
      
      The Octeon #ifdef'ery is removed, since the "syncw" instruction
      previously used is merely an alias for "sync 4" which __SYNC() will emit
      for the wmb sync type when the kernel is configured for an Octeon CPU.
      Similarly __SYNC() will emit nothing for the rmb sync type in Octeon
      configurations.
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: linux-mips@vger.kernel.org
      Cc: Huacai Chen <chenhc@lemote.com>
      Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
      Cc: linux-kernel@vger.kernel.org
    • MIPS: barrier: Add __SYNC() infrastructure · bf929272
      Paul Burton authored
      Introduce an asm/sync.h header which provides infrastructure that can be
      used to generate sync instructions of various types, and for various
      reasons. For example if we need a sync instruction that provides a full
      completion barrier but only on systems which have weak memory ordering,
      we can generate the appropriate assembly code using:
      
        __SYNC(full, weak_ordering)
      
      When the kernel is configured to run on systems with weak memory
      ordering (i.e. CONFIG_WEAK_ORDERING is selected) we'll emit a sync
      instruction. When the kernel is configured to run on systems with strong
      memory ordering (i.e. CONFIG_WEAK_ORDERING is not selected) we'll emit
      nothing. The caller doesn't need to know which happened - it simply says
      what it needs & when, with no concern for checking the kernel
      configuration.
      
      There are some scenarios in which we may want to emit code only when we
      *didn't* emit a sync instruction. For example, some Loongson3 CPUs
      suffer from a bug that requires us to emit a sync instruction prior to
      each ll instruction (enabled by CONFIG_CPU_LOONGSON3_WORKAROUNDS). In
      cases where this bug workaround is enabled, it's wasteful to then have
      more generic code emit another sync instruction to provide barriers we
      need in general. A __SYNC_ELSE() macro allows for this, providing an
      extra argument that contains code to be assembled only in cases where
      the sync instruction was not emitted. For example if we have a scenario
      in which we generally want to emit a release barrier but for affected
      Loongson3 configurations upgrade that to a full completion barrier, we
      can do that like so:
      
        __SYNC_ELSE(full, loongson3_war, __SYNC(rl, always))
      
      The assembly generated by these macros can be used either as inline
      assembly or in assembly source files.
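
The "emit a sync for this reason, otherwise emit something else" behavior can be modeled with a toy set of C macros. This is emphatically not how asm/sync.h is implemented: the `SK_` macros, the two configuration switches and the string format are all hypothetical, the result is a `const char *` expression rather than text usable in inline assembly, and the sync type is only recorded as a trailing comment instead of selecting an encoding.

```c
/* Hypothetical configuration switches (1 = enabled, 0 = disabled). */
#define SK_CONFIG_WEAK_ORDERING  1
#define SK_CONFIG_LOONGSON3_WAR  0

/* Each reason evaluates to 1 when a sync should be emitted for it. */
#define SK_REASON_always         1
#define SK_REASON_weak_ordering  SK_CONFIG_WEAK_ORDERING
#define SK_REASON_loongson3_war  SK_CONFIG_LOONGSON3_WAR

/*
 * Toy model: yield a sync line when the reason applies, otherwise yield
 * the fallback text. The type is kept only as a comment in the output.
 */
#define SK_SYNC_ELSE(type, reason, fallback) \
	((SK_REASON_##reason) ? "sync\t# " #type "\n" : (fallback))

/* No fallback: emit nothing when the reason does not apply. */
#define SK_SYNC(type, reason) SK_SYNC_ELSE(type, reason, "")
```

With the switches as set here, `SK_SYNC(full, weak_ordering)` yields a full sync, `SK_SYNC(full, loongson3_war)` yields an empty string, and `SK_SYNC_ELSE(full, loongson3_war, SK_SYNC(rl, always))` falls back to the release-type sync because the workaround is disabled. Callers state what they need and why, and never test the configuration themselves.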
      
      Differing types of sync as provided by MIPSr6 are defined, but currently
      they all generate a full completion barrier except in kernels configured
      for Cavium Octeon systems. There the wmb sync-type is used, and rmb
      syncs are omitted, as has been the case since commit 6b07d38a
      ("MIPS: Octeon: Use optimized memory barrier primitives."). Using
      __SYNC() with the wmb or rmb types will abstract away the Octeon
      specific behavior and allow us to later clean up asm/barrier.h code that
      currently includes a plethora of #ifdef's.
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: linux-mips@vger.kernel.org
      Cc: Huacai Chen <chenhc@lemote.com>
      Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
      Cc: linux-kernel@vger.kernel.org
    • MIPS: Use compact branch for LL/SC loops on MIPSr6+ · ef85d057
      Paul Burton authored
      When targeting MIPSr6 or higher make use of a compact branch in LL/SC
      loops, preventing the insertion of a delay slot nop that only serves to
      waste space.
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: linux-mips@vger.kernel.org
      Cc: Huacai Chen <chenhc@lemote.com>
      Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
      Cc: linux-kernel@vger.kernel.org
    • MIPS: Unify sc beqz definition · 878f75c7
      Paul Burton authored
      We currently duplicate the definition of __scbeqz in asm/atomic.h &
      asm/cmpxchg.h. Move it to asm/llsc.h & rename it to __SC_BEQZ to fit
      better with the existing __SC macro provided there.
      
      We include a tab in the string in order to avoid the need for users to
      indent code any further to include whitespace of their own after the
      instruction mnemonic.
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: linux-mips@vger.kernel.org
      Cc: Huacai Chen <chenhc@lemote.com>
      Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
      Cc: linux-kernel@vger.kernel.org