    MDEV-23633 MY_RELAX_CPU performs unnecessary compare-and-swap on ARM · 24f510bb
    Marko Mäkelä authored
    This follows up MDEV-14374, which was filed against MariaDB Server 10.3.
    Back then, on a 48-core Qualcomm Centriq 2400, the performance of
    delay loops for spinloops was tested both with and without the dummy
    compare-and-swap operation, and it was decided to keep the dummy
    operation.
    
    On target architectures other than x86 (IA-32, AMD64) and POWER, where
    nothing special is available, we perform a dummy compare-and-swap
    operation. This runs contrary to the purpose of the x86 PAUSE
    instruction and of __ppc_get_timebase(), both of which aim to keep the
    memory bus idle for a while, so that other cores can execute code more
    efficiently while a spinloop waits for something to change.
    
    With MariaDB Server 10.4 on a different implementation of the ARMv8 ISA,
    omitting the dummy compare-and-swap improved performance by up to 12%.
    So, let us avoid the dummy compare-and-swap on ARM.
    
    For now, we are retaining the dummy compare-and-swap on other ISAs
    (such as SPARC, MIPS, S390x, RISC-V) because we do not have any
    performance data for them.
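
The per-architecture behavior described above can be sketched roughly as follows. This is only an illustration, not the actual contents of my_cpu.h: the names relax_cpu() and spin_delay() are hypothetical, the use of a plain compiler barrier on ARM is an assumption about what "avoiding the compare-and-swap" could look like, and the GCC/Clang __atomic builtin stands in for whatever atomic primitive the real code uses.

```c
#include <stdint.h>

#if defined(__powerpc64__) && defined(__GLIBC__)
# include <sys/platform/ppc.h>   /* __ppc_get_timebase() */
#endif

/* Hypothetical stand-in for a MY_RELAX_CPU-like primitive. */
static inline void relax_cpu(void)
{
#if defined(__i386__) || defined(__x86_64__)
  /* x86: PAUSE hints that this is a spin-wait loop. */
  __asm__ __volatile__("pause" ::: "memory");
#elif defined(__powerpc64__) && defined(__GLIBC__)
  /* POWER: read the time base register; no memory traffic. */
  (void) __ppc_get_timebase();
#elif defined(__aarch64__) || defined(__arm__)
  /* ARM (assumption): only a compiler barrier, so the loop is not
     optimized away, but no compare-and-swap hits the memory bus. */
  __asm__ __volatile__("" ::: "memory");
#else
  /* Other ISAs: keep the dummy compare-and-swap on a private variable. */
  static int32_t dummy = 0;
  int32_t expected = 0;
  (void) __atomic_compare_exchange_n(&dummy, &expected, 1, 0,
                                     __ATOMIC_RELAXED, __ATOMIC_RELAXED);
#endif
}

/* Hypothetical delay loop: relax the CPU a given number of times
   and return how many rounds were executed. */
static inline unsigned spin_delay(unsigned rounds)
{
  unsigned i;
  for (i = 0; i < rounds; i++)
    relax_cpu();
  return i;
}
```

A spinloop would call something like spin_delay() between attempts to acquire a lock. The point of the change is that on ARM the delay itself no longer generates atomic memory operations.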
my_cpu.h 4.01 KB