    sched/core: Do not requeue task on CPU excluded from cpus_mask · 751d4cbc
    Mel Gorman authored
    The following warning was triggered on a large machine early in boot on
    a distribution kernel but the same problem should also affect mainline.
    
       WARNING: CPU: 439 PID: 10 at ../kernel/workqueue.c:2231 process_one_work+0x4d/0x440
       Call Trace:
        <TASK>
        rescuer_thread+0x1f6/0x360
        kthread+0x156/0x180
        ret_from_fork+0x22/0x30
        </TASK>
    
    Commit c6e7bd7a ("sched/core: Optimize ttwu() spinning on p->on_cpu")
    optimises ttwu by queueing a task that is descheduling on the wakelist,
    but does not check if the task descheduling is still allowed to run on that CPU.
    
    In this warning, the problematic task is a workqueue rescue thread which
    checks if the rescue is for a per-cpu workqueue and running on the wrong CPU.
    While this is early in boot and it should be possible to create workers,
    the rescue thread may still be used if MAYDAY_INITIAL_TIMEOUT or
    MAYDAY_INTERVAL is reached, and on a sufficiently large machine the
    rescue thread is used frequently.
    
    Tracing confirmed that the task should have migrated properly using the
    stopper thread to handle the migration. However, a parallel wakeup from udev
    running on another CPU that does not share CPU cache observes p->on_cpu and
    uses task_cpu(p), queues the task on the old CPU and triggers the warning.
    
    Check that the wakee task that is descheduling is still allowed to run
    on its current CPU and if not, wait for the descheduling to complete
    and select an allowed CPU.
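
    As a rough illustration of the check described above, the following
    userspace sketch models the decision with a toy task structure. It is
    not the kernel code: struct toy_task, the toy_* helpers, the 64-CPU
    bitmask and the "lowest allowed CPU" selection policy are all
    assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for a task; field names are illustrative, not the kernel's. */
struct toy_task {
	uint64_t cpus_mask; /* bit n set => task may run on CPU n (64 CPUs max here) */
	int cpu;            /* CPU the task last ran on, analogous to task_cpu(p) */
	bool on_cpu;        /* true while the task is still descheduling */
};

/* Return true if the task is allowed to run on the given CPU. */
static bool toy_cpu_allowed(const struct toy_task *p, int cpu)
{
	assert(cpu >= 0 && cpu < 64);
	return (p->cpus_mask >> cpu) & 1u;
}

/* Hypothetical policy: pick the lowest-numbered allowed CPU, or -1 if none. */
static int toy_select_allowed_cpu(const struct toy_task *p)
{
	for (int cpu = 0; cpu < 64; cpu++)
		if (toy_cpu_allowed(p, cpu))
			return cpu;
	return -1;
}

/*
 * Decide where a wakeup should place the task.
 *
 * Fast path: the task is still descheduling and its current CPU is still
 * allowed, so it can be queued on that CPU's wakelist.
 *
 * Slow path: otherwise the waker would wait for the descheduling to
 * complete (not modelled here) and then select an allowed CPU.
 */
static int toy_wakeup_target(const struct toy_task *p, bool *used_wakelist)
{
	if (p->on_cpu && toy_cpu_allowed(p, p->cpu)) {
		*used_wakelist = true;
		return p->cpu;
	}

	*used_wakelist = false;
	return toy_select_allowed_cpu(p);
}
```

    In this model, a task that last ran on CPU 5 but whose mask now only
    contains CPU 3 is never queued on CPU 5's wakelist, which corresponds
    to the case that triggered the warning.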
    
    Fixes: c6e7bd7a ("sched/core: Optimize ttwu() spinning on p->on_cpu")
    Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Link: https://lore.kernel.org/r/20220804092119.20137-1-mgorman@techsingularity.net
core.c 284 KB