    sched_ext: Synchronize bypass state changes with rq lock · 750a40d8
    Tejun Heo authored
    While the BPF scheduler is being unloaded, the following warning messages
    trigger sometimes:
    
     NOHZ tick-stop error: local softirq work is pending, handler #80!!!
    
    This is caused by the CPU entering idle while there are pending softirqs.
    The main culprit is the bypassing state assertion not being synchronized
    with rq operations. As the BPF scheduler cannot be trusted in the disable
    path, the first step is entering the bypass mode where the BPF scheduler is
    ignored and scheduling becomes global FIFO.
    
    This is implemented by turning scx_ops_bypassing() true. However, the
    transition isn't synchronized against anything and it's possible for enqueue
    and dispatch paths to have different ideas on whether bypass mode is on.
    
    Make each rq track its own bypass state with SCX_RQ_BYPASSING which is
    modified while rq is locked.
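    
    As a rough illustration only, the idea can be modelled in userspace C as
    below. This is a sketch under assumptions, not the kernel code: struct
    rq_model, the atomic_flag spinlock, the helper names and the flag's bit
    value are all hypothetical stand-ins for struct rq, the rq lock and the
    real SCX_RQ_BYPASSING definition.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit value; the real definition lives in the kernel headers. */
#define SCX_RQ_BYPASSING (1u << 0)

/* Userspace stand-in for a per-CPU runqueue (hypothetical type). */
struct rq_model {
	atomic_flag lock;	/* toy spinlock standing in for the rq lock */
	uint32_t scx_flags;	/* per-rq state, only touched with lock held */
};

void rq_model_init(struct rq_model *rq)
{
	atomic_flag_clear(&rq->lock);
	rq->scx_flags = 0;
}

void rq_model_lock(struct rq_model *rq)
{
	/* Spin until the flag was previously clear, i.e. we now own the lock. */
	while (atomic_flag_test_and_set(&rq->lock))
		;
}

void rq_model_unlock(struct rq_model *rq)
{
	atomic_flag_clear(&rq->lock);
}

/*
 * Flip the bypass state of every rq. Each rq's flag changes only while
 * that rq is locked, so code running under the same lock (enqueue and
 * dispatch in the real scheduler) sees one consistent value.
 */
void set_bypass_all(struct rq_model *rqs, int nr, bool bypass)
{
	for (int i = 0; i < nr; i++) {
		rq_model_lock(&rqs[i]);
		if (bypass)
			rqs[i].scx_flags |= SCX_RQ_BYPASSING;
		else
			rqs[i].scx_flags &= ~SCX_RQ_BYPASSING;
		rq_model_unlock(&rqs[i]);
	}
}

/* Caller must hold rq->lock, mirroring how rq operations would test it. */
bool rq_bypassing_locked(const struct rq_model *rq)
{
	return (rq->scx_flags & SCX_RQ_BYPASSING) != 0;
}
```

    The point of the per-rq flag is that a reader holding a given rq's lock
    cannot observe the state changing underneath it, which an unsynchronized
    global test cannot guarantee.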
    
    This removes most, but not all, of the NOHZ tick-stop messages. I believe
    the stragglers are from the sched core bug where pick_task_scx() can
    be called without preceding balance_scx(). Once that bug is fixed, we should
    verify that all occurrences of this error message are gone too.
    
    v2: scx_enabled() test moved inside the for_each_possible_cpu() loop so that
        the per-cpu states are always synchronized with the global state.
    
    Signed-off-by: Tejun Heo <tj@kernel.org>
    Reported-by: David Vernet <void@manifault.com>
sched.h 101 KB