    workqueue: Unbind kworkers before sending them to exit() · e02b9312
    Valentin Schneider authored
    It has been reported that isolated CPUs can suffer from interference due to
    per-CPU kworkers waking up just to die.
    
    A surge of workqueue activity during initial setup of a latency-sensitive
    application (refresh_vm_stats() being one of the culprits) can cause extra
    per-CPU kworkers to be spawned. Then, said latency-sensitive task can be
    running merrily on an isolated CPU only to be interrupted sometime later by
    a kworker marked for death (cf. IDLE_WORKER_TIMEOUT, 5 minutes after last
    kworker activity).
    
    Prevent this by affining kworkers to the wq_unbound_cpumask (which doesn't
    contain isolated CPUs, cf. HK_TYPE_WQ) before waking them up after marking
    them with WORKER_DIE.
    
    Changing the affinity requires a sleepable context, so leverage the newly
    introduced pool->idle_cull_work to get one.
    
    Remove dying workers from pool->workers and keep track of them in a
    separate list. This intentionally prevents for_each_pool_worker() from
    iterating over workers that are marked for death.
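    
    The two-list bookkeeping can be pictured with a small userspace model.
    This is only a sketch, not the kernel code: struct worker, struct pool,
    pool_add_worker(), mark_worker_dying() and count_list() are hypothetical
    simplified stand-ins, and plain singly linked lists replace the kernel's
    list_head. It shows that once a worker is moved to the dying list, a walk
    over the live list no longer sees it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct worker {
	int id;
	bool dying;
	struct worker *next;
};

struct pool {
	struct worker *workers;       /* live workers: what iteration sees */
	struct worker *dying_workers; /* marked for death, not yet exited */
};

/* Push a worker onto the pool's live list. */
static void pool_add_worker(struct pool *pool, struct worker *w)
{
	w->dying = false;
	w->next = pool->workers;
	pool->workers = w;
}

/* Unlink @w from the live list and track it on the dying list instead. */
static void mark_worker_dying(struct pool *pool, struct worker *w)
{
	struct worker **pp = &pool->workers;

	while (*pp && *pp != w)
		pp = &(*pp)->next;
	assert(*pp == w); /* caller must pass a worker on the live list */

	*pp = w->next;
	w->dying = true;
	w->next = pool->dying_workers;
	pool->dying_workers = w;
}

/* Count the entries on a list; a walk like this never meets dying workers
 * when handed pool->workers. */
static size_t count_list(const struct worker *w)
{
	size_t n = 0;

	for (; w; w = w->next)
		n++;
	return n;
}
```

    In this model, adding three workers and calling mark_worker_dying() on
    one of them leaves two entries on pool->workers and one on
    pool->dying_workers.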
    
    Rename destroy_worker() to set_worker_dying() to better reflect its
    effects and relationship with wake_dying_workers().
    
    Signed-off-by: Valentin Schneider <vschneid@redhat.com>
    Signed-off-by: Tejun Heo <tj@kernel.org>
workqueue.c 173 KB