    sched: prioritize non-migratable tasks over migratable ones · 45c01e82
    Gregory Haskins authored
    Dmitry Adamushko pointed out a known flaw in the rt-balancing algorithm
    that could allow suboptimal balancing if a non-migratable task gets
    queued behind a running migratable one.  It is discussed in this thread:
    
    http://lkml.org/lkml/2008/4/22/296
    
    This issue has been further exacerbated by a recent checkin to
    sched-devel (git-id 5eee63a5ebc19a870ac40055c0be49457f3a89a3).
    
    From a pure priority standpoint, the run-queue is doing the "right"
    thing. Using Dmitry's nomenclature, if T0 is on cpu1 first, and T1
    wakes up at equal or lower priority (affined only to cpu1) later, it
    *should* wait for T0 to finish.  However, in reality that is likely
    suboptimal from a system perspective if there are other cores that
    could allow T0 and T1 to run concurrently.  Since T1 cannot migrate,
    the only choice for higher concurrency is to try to move T0.  This is
    not something we addressed in the recent rt-balancing re-work.
    
    This patch tries to enhance the balancing algorithm by accommodating this
    scenario.  It accomplishes this by incorporating the migratability of a
    task into its priority calculation.  Within a numerical tsk->prio, a
    non-migratable task is logically higher than a migratable one.  We
    maintain this by introducing a new per-priority queue (xqueue, or
    exclusive-queue) for holding non-migratable tasks.  The scheduler will
    draw from the xqueue over the standard shared-queue (squeue) when
    available.
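
    As a rough illustration of the queue selection described above, here is
    a minimal userspace sketch.  It is not the code in sched_rt.c: the
    struct and function names (fifo, prio_queue, enqueue_on_wakeup,
    pick_next) and the fixed-size FIFO are assumptions made for the
    example, and a task is treated as non-migratable when it is allowed on
    exactly one cpu.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TASKS 8 /* illustrative capacity, not a kernel constant */

/* Hypothetical, simplified task descriptor. */
struct task {
    int id;
    int prio;            /* numerical priority shared by the whole queue */
    int nr_cpus_allowed; /* 1 means the task cannot migrate */
};

/* Small ring-buffer FIFO standing in for a kernel list. */
struct fifo {
    struct task *items[MAX_TASKS];
    size_t head;
    size_t count;
};

/* One priority level: exclusive queue plus shared queue. */
struct prio_queue {
    struct fifo xqueue; /* non-migratable tasks */
    struct fifo squeue; /* everything else */
};

static void fifo_push(struct fifo *q, struct task *t)
{
    assert(q->count < MAX_TASKS);
    q->items[(q->head + q->count) % MAX_TASKS] = t;
    q->count++;
}

static struct task *fifo_pop(struct fifo *q)
{
    struct task *t;

    if (q->count == 0)
        return NULL;
    t = q->items[q->head];
    q->head = (q->head + 1) % MAX_TASKS;
    q->count--;
    return t;
}

static int task_is_migratable(const struct task *t)
{
    return t->nr_cpus_allowed > 1;
}

/* On wake-up, non-migratable tasks go to the xqueue. */
static void enqueue_on_wakeup(struct prio_queue *pq, struct task *t)
{
    if (task_is_migratable(t))
        fifo_push(&pq->squeue, t);
    else
        fifo_push(&pq->xqueue, t);
}

/* Draw from the xqueue first, then fall back to the squeue. */
static struct task *pick_next(struct prio_queue *pq)
{
    struct task *t = fifo_pop(&pq->xqueue);

    return t ? t : fifo_pop(&pq->squeue);
}
```

    With T0 (migratable) queued first and T1 (affined to one cpu) queued
    second at the same priority, pick_next() in this sketch returns T1
    before T0.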
    
    There are several details for utilizing this properly.
    
    1) During task-wake-up, we not only need to check if the priority
       preempts the current task, but we also need to check for this
       non-migratable condition.  Therefore, if a non-migratable task wakes
       up and sees an equal priority migratable task already running, it
       will attempt to preempt it *if* there is a likelihood that the
       current task will find an immediate home.
    
    2) Tasks only get this non-migratable "priority boost" on wake-up.  Any
       requeuing will result in the non-migratable task being queued to the
       end of the shared queue.  This is an attempt to prevent the system
       from being completely unfair to migratable tasks during things like
       SCHED_RR timeslicing.
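
    The two rules above can be sketched as a pair of predicates.  This is
    an illustration under stated assumptions, not the patch itself: the
    names wakeup_should_preempt and select_queue are hypothetical, the
    "likelihood that the current task will find an immediate home" is
    reduced to a boolean supplied by the caller, and a lower numerical
    prio is taken to mean higher priority, as with tsk->prio.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified task descriptor. */
struct task {
    int prio;            /* lower value means higher priority */
    int nr_cpus_allowed; /* 1 means the task cannot migrate */
};

enum queue_kind { QUEUE_SHARED, QUEUE_EXCLUSIVE };

static bool is_migratable(const struct task *t)
{
    assert(t->nr_cpus_allowed >= 1);
    return t->nr_cpus_allowed > 1;
}

/*
 * Rule 1: a woken task preempts on strictly higher priority, or on equal
 * priority when it is non-migratable, the current task is migratable,
 * and the current task is likely to find another cpu right away.
 */
static bool wakeup_should_preempt(const struct task *woken,
                                  const struct task *curr,
                                  bool curr_has_other_cpu)
{
    if (woken->prio < curr->prio)
        return true;
    if (woken->prio > curr->prio)
        return false;
    return !is_migratable(woken) && is_migratable(curr) &&
           curr_has_other_cpu;
}

/*
 * Rule 2: only a wake-up places a non-migratable task on the exclusive
 * queue; any requeue goes to the tail of the shared queue.
 */
static enum queue_kind select_queue(const struct task *t, bool is_wakeup)
{
    if (is_wakeup && !is_migratable(t))
        return QUEUE_EXCLUSIVE;
    return QUEUE_SHARED;
}
```

    In this sketch a pinned task waking at the same priority as a running
    migratable task preempts only when the caller reports that another
    cpu is available for the running task; otherwise it waits as before.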
    
    I am sure this patch introduces potentially "odd" behavior if you
    concoct a scenario where a bunch of non-migratable threads could starve
    migratable ones given the right pattern.  I am not yet convinced that
    this is a problem, since we are talking about tasks of equal RT priority
    and there is never much in the way of guarantees against starvation
    under that scenario (e.g. you could come up with a similar scenario
    with a specific timing environment versus an affinity environment).
    I can be convinced otherwise, but for now I think this is "ok".
    Signed-off-by: Gregory Haskins <ghaskins@novell.com>
    CC: Dmitry Adamushko <dmitry.adamushko@gmail.com>
    CC: Steven Rostedt <rostedt@goodmis.org>
    Signed-off-by: Ingo Molnar <mingo@elte.hu>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>