Peter Zijlstra authored
BugLink: https://bugs.launchpad.net/bugs/1821259

Matt reported the following deadlock:

CPU0                                    CPU1

schedule(.prev=migrate/0)               <fault>
  pick_next_task()                        ...
    idle_balance()                          migrate_swap()
      active_balance()                        stop_two_cpus()
                                                spin_lock(stopper0->lock)
                                                spin_lock(stopper1->lock)
                                                ttwu(migrate/0)
                                                  smp_cond_load_acquire() -- waits for schedule()
        stop_one_cpu(1)
          spin_lock(stopper1->lock) -- waits for stopper lock

Fix this deadlock by taking the wakeups out from under stopper->lock.
This allows the active_balance() to queue the stop work and finish the
context switch, which in turn allows the wakeup from migrate_swap() to
observe the context and complete the wakeup.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reported-by: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20180420095005.GH4064@hirez.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(backported from commit 0b26351b)
[mfo: backport:
 - hunk 1:
   - refresh context lines
   - include 'linux/sched.h' instead of 'linux/sched/wake_q.h'
     which is code moved by upstream commit eb61baf6
     ("sched/headers: Move the wake-queue types and interfaces
     from sched.h into <linux/sched/wake_q.h>")
 - hunk 2:
   - refresh context lines
   - s/bool/void/ return type of cpu_stop_queue_work()
     and other 2 changes which are not relevant to this fix,
     due to lack of upstream commits:
     - 'enabled' variable:
       commit 1b034bd9 ("stop_machine: Make cpu_stop_queue_work()
       and stop_one_cpu_nowait() return bool")
     - else 'if (work->done)' condition
       commit dd2e3121 ("stop_machine: Shift the 'done != NULL'
       check from cpu_stop_signal_done() to callers")
   - s/DEFINE_WAKE_Q/WAKE_Q/ due to lack of upstream commit 194a6b5b
     ("sched/wake_q: Rename WAKE_Q to DEFINE_WAKE_Q")
 - hunk 3:
   - refresh context lines
   - s/DEFINE_WAKE_Q/WAKE_Q/ due to lack of upstream commit 194a6b5b
     ("sched/wake_q: Rename WAKE_Q to DEFINE_WAKE_Q")
 - hunk 4:
   - refresh context lines
 - hunk 5:
   - merge with hunk 4
   - refresh context lines]
Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
Acked-by: Khalid Elmously <khalid.elmously@canonical.com>
Acked-by: Marcelo Henrique Cerri <marcelo.cerri@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
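
Illustration (not part of the patch): the idea of "taking the wakeups out
from under stopper->lock" can be sketched in userspace C as below. This is
a single-threaded analogy under stated assumptions, not the kernel code.
struct stopper, struct wake_list, queue_work() and every helper here are
hypothetical names invented for the sketch; the kernel change itself uses
the wake-queue interfaces (wake_q_add() and wake_up_q()) in place of the
wake_list shown here.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define WAKE_MAX 4

/* Hypothetical stand-in for a per-CPU stopper. */
struct stopper {
	pthread_mutex_t lock;
	bool lock_held;        /* bookkeeping so the sketch can observe lock state */
	int pending;           /* queued work items, protected by lock */
	bool woken;            /* set when the "wakeup" is delivered */
	bool woken_under_lock; /* true if the wakeup ran while lock was held */
};

/* Hypothetical deferred-wakeup list, playing the role of a wake queue. */
struct wake_list {
	struct stopper *task[WAKE_MAX];
	size_t n;
};

static void stopper_init(struct stopper *s)
{
	pthread_mutex_init(&s->lock, NULL);
	s->lock_held = false;
	s->pending = 0;
	s->woken = false;
	s->woken_under_lock = false;
}

static void stopper_lock(struct stopper *s)
{
	pthread_mutex_lock(&s->lock);
	s->lock_held = true;
}

static void stopper_unlock(struct stopper *s)
{
	s->lock_held = false;
	pthread_mutex_unlock(&s->lock);
}

/* Record a wakeup to be performed later; nothing is woken here. */
static void wake_list_add(struct wake_list *wl, struct stopper *s)
{
	assert(wl->n < WAKE_MAX);
	wl->task[wl->n++] = s;
}

/* Deliver all recorded wakeups and note whether the lock was held. */
static void wake_list_flush(struct wake_list *wl)
{
	for (size_t i = 0; i < wl->n; i++) {
		struct stopper *s = wl->task[i];

		s->woken = true;
		s->woken_under_lock = s->lock_held;
	}
	wl->n = 0;
}

/* Queue work under the lock, but wake only after the lock is dropped. */
static void queue_work(struct stopper *s)
{
	struct wake_list wl = { .n = 0 };

	stopper_lock(s);
	s->pending++;
	wake_list_add(&wl, s);  /* only record the wakeup while locked */
	stopper_unlock(s);

	wake_list_flush(&wl);   /* the wakeup runs with the lock released */
}
```

Because the wakeup is only recorded while the lock is held, a waker that
has to wait for the target to finish its context switch no longer does so
while holding a lock that the target's CPU needs in order to make progress.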
ae93e478