    tick/broadcast: Reduce lock cacheline contention · 668802c2
    Waiman Long authored
    It was observed that on an Intel x86 system without the ARAT (Always
    Running APIC Timer) feature, with a fairly large number of CPUs and
    with CPUs entering and leaving intel_idle frequently, contention on
    the tick_broadcast_lock can become significant.
    
    To reduce contention, the lock is put into its own cacheline and all
    the cpumask_var_t variables are put into the __read_mostly section.
    
    Running the SP benchmark of the NAS Parallel Benchmarks on a 4-socket
    16-core 32-thread Nehalem system, performance improved from
    3353.94 Mop/s to 3469.31 Mop/s when this patch was applied on
    a 4.9.6 kernel. This is a 3.4% improvement.

    Signed-off-by: Waiman Long <longman@redhat.com>
    Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
    Cc: Andrew Morton <akpm@linux-foundation.org>
    Link: http://lkml.kernel.org/r/1485799063-20857-1-git-send-email-longman@redhat.com
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
tick-broadcast.c 26.5 KB