    smp: Run functions concurrently in smp_call_function_many_cond() · a32a4d8a
    Nadav Amit authored
    Currently, on_each_cpu() and similar functions do not exploit the
    potential of concurrency: the function is first executed remotely and
    only then is it executed locally. Functions such as TLB flush can take
    considerable time, so running the remote and local invocations in
    parallel is an opportunity for performance optimization.
    
    To do so, modify smp_call_function_many_cond() to allow callers to
    provide a function that should be executed (remotely/locally), and run
    the remote and local invocations concurrently. Keep the semantics of
    smp_call_function_many() as they are today for backward compatibility:
    in this case the called function is not executed locally.
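    
    The change in ordering can be illustrated with a minimal user-space
    sketch. This is an analogy, not the kernel code: pthreads stand in for
    remote CPUs, pthread_create() stands in for sending the IPIs, and
    pthread_join() stands in for waiting on the remote calls. All names
    here (call_everywhere(), struct remote_call, NR_REMOTE, count_call())
    are hypothetical and do not appear in the patch.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_REMOTE 3	/* hypothetical number of "remote CPUs" */

typedef void (*call_func_t)(void *info);

/* Per-target request, loosely modelled on a call_single_data entry. */
struct remote_call {
	call_func_t func;
	void *info;
	atomic_int *done;	/* counts remote calls that have finished */
};

static void *remote_entry(void *arg)
{
	struct remote_call *rc = arg;

	rc->func(rc->info);
	atomic_fetch_add(rc->done, 1);
	return NULL;
}

/*
 * Run func(info) on every "remote CPU" and on the caller.
 *
 * concurrent == false: wait for the remote calls, then run locally
 *                      (the ordering the commit message calls current).
 * concurrent == true:  kick the remote calls, run locally while they
 *                      are in flight, and only then wait.
 *
 * Returns how many remote calls had already finished when the local
 * call started, or -1 if a thread could not be created.
 */
static int call_everywhere(call_func_t func, void *info, bool concurrent)
{
	pthread_t threads[NR_REMOTE];
	struct remote_call rc[NR_REMOTE];
	atomic_int done = 0;
	int started = 0, done_at_local_start, i;

	/* "Send the IPIs": kick every remote target. */
	for (i = 0; i < NR_REMOTE; i++) {
		rc[i] = (struct remote_call){ func, info, &done };
		if (pthread_create(&threads[i], NULL, remote_entry, &rc[i]))
			break;
		started++;
	}

	if (!concurrent)
		for (i = 0; i < started; i++)
			pthread_join(threads[i], NULL);

	done_at_local_start = atomic_load(&done);
	func(info);		/* local execution */

	if (concurrent)
		for (i = 0; i < started; i++)
			pthread_join(threads[i], NULL);

	return started == NR_REMOTE ? done_at_local_start : -1;
}

/* Example callback: count invocations through an atomic counter. */
static void count_call(void *info)
{
	atomic_fetch_add((atomic_int *)info, 1);
}
```

    With concurrent == false the return value is always NR_REMOTE, since
    the local call starts only after every remote call has completed.
    With concurrent == true it can be anywhere from 0 to NR_REMOTE, which
    is the overlap this patch aims to exploit.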
    
    smp_call_function_many_cond() does not use the optimized version for a
    single remote target that smp_call_function_single() implements. For
    a synchronous function call, smp_call_function_single() keeps a
    call_single_data (which is used for synchronization) on the stack.
    Interestingly, it seems that not using this optimization provides
    greater performance improvements (greater speedup with a single remote
    target than with multiple ones). Presumably, holding data structures
    that are intended for synchronization on the stack can introduce
    overheads due to TLB misses and false sharing when the stack is used for
    other purposes.
    
    Signed-off-by: Nadav Amit <namit@vmware.com>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
    Link: https://lore.kernel.org/r/20210220231712.2475218-2-namit@vmware.com
smp.c 27.5 KB