    sched: Clear ttwu_pending after enqueue_task() · d6962c4f
    Tianchen Ding authored
    We found a long tail latency in schbench when m*t is close to nr_cpus.
    (e.g., "schbench -m 2 -t 16" on a machine with 32 cpus.)
    
    This is because when the wakee cpu is idle, rq->ttwu_pending is cleared
    too early, and idle_cpu() will return true until the wakee task is
    enqueued. This misleads the waker when selecting an idle cpu, so it wakes
    multiple worker threads on the same wakee cpu. The situation is made
    worse by commit f3dd3f67 ("sched: Remove the limitation of WF_ON_CPU on
    wakelist if wakee cpu is idle") because it tends to use the wakelist.
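
The ordering problem described above can be illustrated with a toy model. This is only a sketch and not the kernel code: `struct toy_rq`, `toy_idle_cpu()` and `toy_drain_wakelist()` are hypothetical names, and the idle check is reduced to the two conditions the text relies on (no running tasks and no pending wakeup). It shows what a concurrent waker would observe in the window between clearing the flag and enqueueing the task.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a per-CPU runqueue; not the kernel's struct rq. */
struct toy_rq {
	int nr_running;     /* tasks already enqueued on this cpu */
	bool ttwu_pending;  /* a remote wakeup is queued but not yet enqueued */
};

/* Simplified idle check: a cpu with a pending wakeup is not idle. */
static bool toy_idle_cpu(const struct toy_rq *rq)
{
	return rq->nr_running == 0 && !rq->ttwu_pending;
}

/*
 * Drain one queued wakeup. With clear_early set, the flag is dropped
 * before the enqueue (the problematic ordering); otherwise it is dropped
 * after. Returns what a concurrent waker would see from toy_idle_cpu()
 * in the window just before the enqueue.
 */
static bool toy_drain_wakelist(struct toy_rq *rq, bool clear_early)
{
	bool seen_idle;

	assert(rq->ttwu_pending);      /* only called with a wakeup queued */

	if (clear_early)
		rq->ttwu_pending = false;

	seen_idle = toy_idle_cpu(rq);  /* sample inside the race window */

	rq->nr_running++;              /* stands in for enqueue_task() */
	rq->ttwu_pending = false;      /* flag is clear after the enqueue */

	return seen_idle;
}
```

In this model, draining with `clear_early` set reports the cpu as idle inside the window, so a second waker could pick the same cpu; clearing after the enqueue keeps it reported as busy throughout.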
    
    Here is the result of "schbench -m 2 -t 16" on a VM with 32vcpu
    (Intel(R) Xeon(R) Platinum 8369B).
    
    Latency percentiles (usec):
                       base   base+revert_f3dd3f67   base+this_patch
    50.0000th:         9                     13                 9
    75.0000th:        12                     19                12
    90.0000th:        15                     22                15
    95.0000th:        18                     24                17
    *99.0000th:       27                     31                24
    99.5000th:      3364                     33                27
    99.9000th:     12560                     36                30
    
    We also tested on unixbench and hackbench, and saw no performance
    change.
    
    Signed-off-by: Tianchen Ding <dtcccc@linux.alibaba.com>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Acked-by: Mel Gorman <mgorman@suse.de>
    Link: https://lkml.kernel.org/r/20221104023601.12844-1-dtcccc@linux.alibaba.com