    sched: Fix race in cpupri introduced by cpumask_var changes · 07903af1
    Gregory Haskins authored
    Background:
    
    Several race conditions in the scheduler have cropped up
    recently, which Steven and I have tracked down using ftrace.
    The most recent one turns out to be a race in how the scheduler
    determines a suitable migration target for RT tasks, introduced
    recently with commit:
    
        commit 68e74568
        Date:   Tue Nov 25 02:35:13 2008 +1030
    
            sched: convert struct cpupri_vec cpumask_var_t.
    
    The original design of cpupri allowed lockless readers to
    quickly determine a best-estimate target.  Races between the
    pri_active bitmap and the vec->mask were handled in the
    original code because we would detect and return "0" when this
    occurred.  The design was predicated on the *effective*
    atomicity (*) of caching the result of cpus_and() between the
    cpus_allowed and the vec->mask.
    
    Commit 68e74568 changed the behavior such that vec->mask is
    accessed multiple times.  This introduces a subtle race: the
    mask can change between accesses, so we can return "1" but
    with an empty bitmap.
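
The difference between the two access patterns can be sketched in plain C. This is an illustrative model only, not the kernel code: `mask_t` is a hypothetical single-word stand-in for a cpumask, and `race_hook_t` is a hypothetical callback that models a concurrent writer running between the two reads.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical single-word stand-in for a cpumask; not the kernel type. */
typedef uint64_t mask_t;

/* Hypothetical hook that models a concurrent writer modifying the mask. */
typedef void (*race_hook_t)(volatile mask_t *vec_mask);

/* Pattern after the conversion: the shared mask is read twice. */
static bool find_two_reads(volatile mask_t *vec_mask, mask_t allowed,
                           mask_t *lowest, race_hook_t hook)
{
    if ((*vec_mask & allowed) == 0)  /* read #1: test for an intersection */
        return false;
    if (hook)
        hook(vec_mask);              /* the writer runs in this window */
    *lowest = *vec_mask & allowed;   /* read #2: compute the result */
    return true;                     /* can be true with *lowest == 0 */
}

/* Pattern of the original design: one read, result cached locally. */
static bool find_one_read(volatile mask_t *vec_mask, mask_t allowed,
                          mask_t *lowest, race_hook_t hook)
{
    mask_t snap = *vec_mask & allowed;  /* the only read of the shared mask */
    if (hook)
        hook(vec_mask);                 /* a later change cannot affect snap */
    if (snap == 0)
        return false;
    *lowest = snap;
    return true;
}

/* Models the writer removing every CPU from the vector. */
static void clear_all(volatile mask_t *vec_mask)
{
    *vec_mask = 0;
}
```

With `clear_all` as the hook, `find_two_reads` reports success while handing back an empty mask, whereas `find_one_read` returns the value it tested.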
    
    *) yes, we know cpus_and() is not a locked operator across the
       entire composite array, but it is implicitly atomic on a
       per-word basis which is all the design required to work.
    
    Implementation:
    
    Rather than forgoing the lockless design, or reverting to a
    stack-based cpumask_t, we simply check for when the race has
    been encountered and continue processing in the event that the
    race is hit.  This renders the removal race as if the priority
    bit had been atomically cleared as well, and allows the
    algorithm to execute correctly.
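
A minimal sketch of the "detect and continue" approach described above, under stated assumptions: `NR_LEVELS`, `mask_t`, `race_hook_t`, and `find_with_recheck` are hypothetical names for illustration, each priority level is modeled as a single word, and the hook stands in for a concurrent writer. It is not the code in sched_cpupri.c.

```c
#include <stddef.h>
#include <stdint.h>

#define NR_LEVELS 4  /* hypothetical number of priority levels */

/* Hypothetical single-word stand-in for a cpumask; not the kernel type. */
typedef uint64_t mask_t;

/* Hypothetical hook that models a concurrent writer at a given level. */
typedef void (*race_hook_t)(volatile mask_t *vecs, int level);

/*
 * Scan priority levels from lowest to highest. If the second read of a
 * level's mask yields no usable CPU, treat the level as if it had been
 * empty all along and keep scanning instead of returning 1.
 */
static int find_with_recheck(volatile mask_t vecs[NR_LEVELS], mask_t allowed,
                             mask_t *lowest, race_hook_t hook)
{
    for (int level = 0; level < NR_LEVELS; level++) {
        if ((vecs[level] & allowed) == 0)
            continue;                 /* nothing suitable at this level */

        if (hook)
            hook(vecs, level);        /* the writer runs in this window */

        if (lowest != NULL) {
            mask_t result = vecs[level] & allowed;
            if (result == 0)
                continue;             /* race hit: move to the next level */
            *lowest = result;
        }
        return 1;
    }
    return 0;
}

/* Models the writer emptying level 0 while the reader is inspecting it. */
static void clear_level0(volatile mask_t *vecs, int level)
{
    if (level == 0)
        vecs[0] = 0;
}
```

The recheck keeps the reader lockless: a lost race costs one extra loop iteration rather than a lock or a stack-based copy of the mask.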

    Signed-off-by: Gregory Haskins <ghaskins@novell.com>
    CC: Rusty Russell <rusty@rustcorp.com.au>
    CC: Steven Rostedt <srostedt@redhat.com>
    Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
    LKML-Reference: <20090730145728.25226.92769.stgit@dev.haskins.net>
    Signed-off-by: Ingo Molnar <mingo@elte.hu>
sched_cpupri.c 5.37 KB