    sched: Improve balance_cpu() to consider other cpus in its group as target of (pinned) task · 88b8dac0
    Srivatsa Vaddagiri authored
    The current load balance scheme requires only one cpu in a
    sched_group (balance_cpu) to look at other peer sched_groups for
    imbalance and pull tasks towards itself from a busy cpu. Tasks
    thus pulled by balance_cpu could later get picked up by cpus
    that are in the same sched_group as that of balance_cpu.
    
    This scheme however fails to pull tasks that are not allowed to
    run on balance_cpu (but are allowed to run on other cpus in its
    sched_group). This can affect fairness and, in worst-case
    scenarios, cause starvation.
    
    Consider a two core (2 threads/core) system running tasks as
    below:
    
              Core0            Core1
             /     \          /     \
            C0     C1        C2     C3
            |      |         |      |
            v      v         v      v
            F0     T1        F1     [idle]
                             T2
    
     F0 = SCHED_FIFO task (pinned to C0)
     F1 = SCHED_FIFO task (pinned to C2)
     T1 = SCHED_OTHER task (pinned to C1)
     T2 = SCHED_OTHER task (pinned to C1 and C2)
    
    F1 could become a cpu hog, which will starve T2 unless C1 pulls
    it. Within Core0, however, C0 is the cpu required to look for
    imbalance between cores, and C0 cannot pull T2 because T2 is
    not allowed to run on it. T2 will starve indefinitely in this
    case. The same scenario can arise in the presence of non-rt
    tasks as well (say we replace F1 with a high irq load).
    
    We tackle this problem by having balance_cpu move pinned tasks
    to one of its sibling cpus (where they can run). We first check
    if load balance goal can be met by ignoring pinned tasks,
    failing which we retry move_tasks() with a new env->dst_cpu.
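
    The retry described above can be sketched in user-space C. This
    is a toy model under stated assumptions, not the kernel code:
    toy_task, toy_move_tasks(), toy_load_balance() and toy_example()
    are hypothetical names, affinity is a plain bitmask, and "no task
    was moved" stands in for "the load balance goal was not met".

```c
#include <assert.h>
#include <stddef.h>

enum { C0, C1, C2, C3, NR_CPUS };

/* Toy task: bit n of 'allowed' set means the task may run on Cn. */
struct toy_task {
	const char *name;
	unsigned int allowed;
	int cpu;		/* cpu whose runqueue currently holds the task */
};

/*
 * Hypothetical stand-in for move_tasks(): move every task queued on
 * src_cpu that is allowed on dst_cpu and return how many were moved.
 * If a task is skipped because of affinity but may run on some cpu in
 * dst_group, the first such cpu is recorded in *new_dst (only while
 * *new_dst is still negative).
 */
static int toy_move_tasks(struct toy_task *tasks, size_t n, int src_cpu,
			  int dst_cpu, unsigned int dst_group, int *new_dst)
{
	int moved = 0;

	assert(dst_cpu >= 0 && dst_cpu < NR_CPUS);

	for (size_t i = 0; i < n; i++) {
		struct toy_task *t = &tasks[i];

		if (t->cpu != src_cpu)
			continue;
		if (t->allowed & (1u << dst_cpu)) {
			t->cpu = dst_cpu;
			moved++;
			continue;
		}
		/* Pinned away from dst_cpu: look for a usable sibling. */
		for (int cpu = 0; cpu < NR_CPUS && *new_dst < 0; cpu++)
			if (t->allowed & dst_group & (1u << cpu))
				*new_dst = cpu;
	}
	return moved;
}

/* First try balance_cpu itself; if nothing moved, retry with a sibling. */
static int toy_load_balance(struct toy_task *tasks, size_t n, int src_cpu,
			    int balance_cpu, unsigned int group)
{
	int new_dst = -1;
	int moved = toy_move_tasks(tasks, n, src_cpu, balance_cpu, group,
				   &new_dst);

	if (moved == 0 && new_dst >= 0) {
		int unused = new_dst;	/* no further retargeting here */

		moved = toy_move_tasks(tasks, n, src_cpu, new_dst, group,
				       &unused);
	}
	return moved;
}

/* The two-core example: C0 balances for Core0 and pulls from C2. */
static int toy_example(void)
{
	struct toy_task tasks[] = {
		{ "F1", 1u << C2, C2 },
		{ "T2", (1u << C1) | (1u << C2), C2 },
	};
	unsigned int core0 = (1u << C0) | (1u << C1);

	toy_load_balance(tasks, 2, C2, C0, core0);
	return tasks[1].cpu;	/* cpu that T2 ends up on */
}
```

    In this model toy_example() returns C1: the first pass towards C0
    moves nothing, and the retry targets C1, the sibling on which T2
    is allowed to run.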
    
    This patch modifies load balance semantics on who can move load
    towards a given cpu in a given sched_domain.
    
    Before this patch, a given_cpu, or an ilb_cpu acting on behalf
    of an idle given_cpu, is responsible for moving load to
    given_cpu.
    
    With this patch applied, balance_cpu can in addition decide on
    moving some load to a given_cpu.
    
    There is a remote possibility that excess load could get moved
    as a result of this (balance_cpu and given_cpu/ilb_cpu deciding
    *independently* and at the *same* time to move some load to a
    given_cpu). However, such conflicting decisions should be rare
    in practice, and subsequent load balance cycles should correct
    any excess load moved to given_cpu.

    Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
    Signed-off-by: Prashanth Nageshappa <prashanth@linux.vnet.ibm.com>
    Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Link: http://lkml.kernel.org/r/4FE06CDB.2060605@linux.vnet.ibm.com
    [ minor edits ]
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
fair.c 133 KB