• Gregory Haskins's avatar
    sched: create "pushable_tasks" list to limit pushing to one attempt · 917b627d
    Gregory Haskins authored
    The RT scheduler employs a "push/pull" design to actively balance tasks
    within the system (on a per disjoint cpuset basis).  When a task is
    awoken, it is immediately determined if there are any lower priority
    cpus which should be preempted.  This is opposed to the way normal
    SCHED_OTHER tasks behave, which will wait for a periodic rebalancing
    operation to occur before spreading out load.
    
    When a particular RQ has more than 1 active RT task, it is said to
    be in an "overloaded" state.  Once this occurs, the system enters
    the active balancing mode, where it will try to push the task away,
    or persuade a different cpu to pull it over.  The system will stay
    in this state until the system falls back below the <= 1 queued RT
    task per RQ.
    
    However, the current implementation suffers from a limitation in the
    push logic.  Once overloaded, all tasks (other than current) on the
    RQ are analyzed on every push operation, even if it was previously
    unpushable (due to affinity, etc).  Whats more, the operation stops
    at the first task that is unpushable and will not look at items
    lower in the queue.  This causes two problems:
    
    1) We can have the same tasks analyzed over and over again during each
       push, which extends out the fast path in the scheduler for no
       gain.  Consider a RQ that has dozens of tasks that are bound to a
       core.  Each one of those tasks will be encountered and skipped
       for each push operation while they are queued.
    
    2) There may be lower-priority tasks under the unpushable task that
       could have been successfully pushed, but will never be considered
       until either the unpushable task is cleared, or a pull operation
       succeeds.  The net result is a potential latency source for mid
       priority tasks.
    
    This patch aims to rectify these two conditions by introducing a new
    priority sorted list: "pushable_tasks".  A task is added to the list
    each time a task is activated or preempted.  It is removed from the
    list any time it is deactivated, made current, or fails to push.
    
    This works because a task only needs to be attempted to push once.
    After an initial failure to push, the other cpus will eventually try to
    pull the task when the conditions are proper.  This also solves the
    problem that we don't completely analyze all tasks due to encountering
    an unpushable tasks.  Now every task will have a push attempted (when
    appropriate).
    
    This reduces latency both by shorting the critical section of the
    rq->lock for certain workloads, and by making sure the algorithm
    considers all eligible tasks in the system.
    
    [ rostedt: added a couple more BUG_ONs ]
    Signed-off-by: default avatarGregory Haskins <ghaskins@novell.com>
    Acked-by: default avatarSteven Rostedt <srostedt@redhat.com>
    917b627d
sched.c 232 KB