    sched: Fix rq->nr_uninterruptible update race · 4ca9b72b
    Peter Zijlstra authored
    KOSAKI Motohiro noticed the following race:
    
     > CPU0                    CPU1
     > --------------------------------------------------------
     > deactivate_task()
     >                         task->state = TASK_UNINTERRUPTIBLE;
     > activate_task()
     >    rq->nr_uninterruptible--;
     >
     >                         schedule()
     >                           deactivate_task()
     >                             rq->nr_uninterruptible++;
     >
    
    Kosaki-San's scenario is possible when CPU0 runs
    __sched_setscheduler() against CPU1's current @task.
    
    __sched_setscheduler() does a dequeue/enqueue in order to move
    the task to its new queue (position) to reflect the newly provided
    scheduling parameters. However, this requeue should leave
    nr_uninterruptible accounting untouched: sched_setscheduler() doesn't
    affect readiness to run, merely the policy on when to run.
    
    So convert the inappropriate activate/deactivate_task usage to
    enqueue/dequeue_task, which avoids the nr_uninterruptible accounting.
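
    The interleaving above can be replayed in a small user-space model.
    This is only a sketch, not the kernel code: struct rq and struct task
    are reduced to the fields that matter here, the bodies of the four
    helpers are an assumed simplification (the wrappers sample the task
    state, the plain queue operations do not), and replay() and its
    parameter are hypothetical names introduced for illustration.

```c
#include <stdbool.h>

/* Simplified user-space model; NOT the kernel's real structures. */
enum task_state { TASK_RUNNING, TASK_UNINTERRUPTIBLE };

struct task { enum task_state state; bool queued; };
struct rq   { long nr_uninterruptible; };

/* Plain queue manipulation: no nr_uninterruptible accounting. */
static void enqueue_task(struct rq *rq, struct task *p)
{
    (void)rq;
    p->queued = true;
}

static void dequeue_task(struct rq *rq, struct task *p)
{
    (void)rq;
    p->queued = false;
}

/* Assumed shape of the accounting wrappers: they sample p->state. */
static void activate_task(struct rq *rq, struct task *p)
{
    if (p->state == TASK_UNINTERRUPTIBLE)
        rq->nr_uninterruptible--;
    enqueue_task(rq, p);
}

static void deactivate_task(struct rq *rq, struct task *p)
{
    if (p->state == TASK_UNINTERRUPTIBLE)
        rq->nr_uninterruptible++;
    dequeue_task(rq, p);
}

/*
 * Replay the reported interleaving in a single thread (hypothetical helper).
 * Returns nr_uninterruptible once the task has gone to sleep; with one
 * task sleeping in TASK_UNINTERRUPTIBLE the correct value is 1.
 */
static long replay(bool setscheduler_uses_activate)
{
    struct rq rq = { 0 };
    struct task p = { TASK_RUNNING, true };

    /* CPU0: __sched_setscheduler() takes the task off its queue. */
    if (setscheduler_uses_activate)
        deactivate_task(&rq, &p);   /* state is RUNNING: no increment */
    else
        dequeue_task(&rq, &p);

    /* CPU1: the task prepares to sleep. */
    p.state = TASK_UNINTERRUPTIBLE;

    /* CPU0: __sched_setscheduler() puts the task back. */
    if (setscheduler_uses_activate)
        activate_task(&rq, &p);     /* sees the new state: spurious decrement */
    else
        enqueue_task(&rq, &p);

    /* CPU1: schedule() takes the sleeping task off the runqueue. */
    deactivate_task(&rq, &p);

    return rq.nr_uninterruptible;
}
```

    In this model replay(true) returns 0, so the sleeping task is not
    counted, while replay(false) returns 1 because the requeue no longer
    touches the counter.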
    
    Also convert the two other sites: __migrate_task() and
    normalize_task() that still use activate/deactivate_task. These sites
    aren't really a problem, since __migrate_task() will only be called on
    a non-running task (and is therefore immune to the described problem)
    and normalize_task() isn't ever used on regular systems.
    
    Also remove the comments from activate/deactivate_task since they're
    misleading at best.

    Reported-by: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
    Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Link: http://lkml.kernel.org/r/1327486224.2614.45.camel@laptop
    Signed-off-by: Ingo Molnar <mingo@elte.hu>