    sched_ext: TASK_DEAD tasks must be switched out of SCX on ops_disable · 61eeb9a9
    Tejun Heo authored
    scx_ops_disable_workfn() only switches !TASK_DEAD tasks out of SCX while
    calling scx_ops_exit_task() on all tasks including dead ones. This can leave
    a dead task on SCX but with SCX_TASK_NONE state, which is inconsistent.
    
    If another task was in the process of changing the TASK_DEAD task's
    scheduling class and grabs the rq lock after scx_ops_disable_workfn() is
    done with the dead task, that other task ends up calling
    scx_ops_disable_task() on the dead task, which is in an inconsistent
    state, triggering a warning:
    
      WARNING: CPU: 6 PID: 3316 at kernel/sched/ext.c:3411 scx_ops_disable_task+0x12c/0x160
      ...
      RIP: 0010:scx_ops_disable_task+0x12c/0x160
      ...
      Call Trace:
       <TASK>
       check_class_changed+0x2c/0x70
       __sched_setscheduler+0x8a0/0xa50
       do_sched_setscheduler+0x104/0x1c0
       __x64_sys_sched_setscheduler+0x18/0x30
       do_syscall_64+0x7b/0x140
       entry_SYSCALL_64_after_hwframe+0x76/0x7e
      RIP: 0033:0x7f140d70ea5b
    
    There is no reason to leave dead tasks on SCX when unloading the BPF
    scheduler. Fix by making scx_ops_disable_workfn() eject all tasks including
    the dead ones from SCX.
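
The inconsistency described above can be illustrated with a small userspace toy model. This is only a sketch and not the kernel code: `struct toy_task`, `toy_disable()`, `toy_count_inconsistent()` and the `TOY_*` enumerators are hypothetical names invented for this illustration. It assumes the disable path performs two per-task steps, a class switch and a per-task exit, and that the old behavior skipped only the first step for dead tasks.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; none of these are kernel definitions. */
enum toy_class { TOY_CLASS_SCX, TOY_CLASS_FAIR };
enum toy_state { TOY_STATE_NONE, TOY_STATE_ENABLED };

struct toy_task {
	bool dead;            /* models a TASK_DEAD task */
	enum toy_class cls;   /* models the task's scheduling class */
	enum toy_state state; /* models the per-task SCX state */
};

/* A task still on the SCX class with state NONE is the inconsistent case. */
bool toy_task_consistent(const struct toy_task *t)
{
	return !(t->cls == TOY_CLASS_SCX && t->state == TOY_STATE_NONE);
}

/*
 * Models the disable loop. With skip_dead == true, dead tasks keep their
 * SCX class but are still exited (old behavior as described in the commit
 * message). With skip_dead == false, every task is switched out first
 * (behavior after the fix).
 */
void toy_disable(struct toy_task *tasks, size_t n, bool skip_dead)
{
	for (size_t i = 0; i < n; i++) {
		struct toy_task *t = &tasks[i];

		if (!(skip_dead && t->dead))
			t->cls = TOY_CLASS_FAIR; /* switch out of SCX */
		t->state = TOY_STATE_NONE;       /* exit step runs for all tasks */
	}
}

/* Counts tasks that violate the invariant checked by toy_task_consistent(). */
size_t toy_count_inconsistent(const struct toy_task *tasks, size_t n)
{
	size_t bad = 0;

	for (size_t i = 0; i < n; i++)
		if (!toy_task_consistent(&tasks[i]))
			bad++;
	return bad;
}
```

In this model, running `toy_disable()` with `skip_dead` set leaves each dead task on `TOY_CLASS_SCX` with `TOY_STATE_NONE`, which is the state a later class change would trip over. With `skip_dead` cleared, no task is left in that state.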

    Signed-off-by: Tejun Heo <tj@kernel.org>