1. 13 Jul, 2008 2 commits
      Merge branch 'sched-fixes-for-linus' of... · 3b5c6b83
      Linus Torvalds authored
      Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
      
      * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
        cpusets, hotplug, scheduler: fix scheduler domain breakage
      cpusets, hotplug, scheduler: fix scheduler domain breakage · 3e84050c
      Dmitry Adamushko authored
      Commit f18f982a ("sched: CPU hotplug events must not destroy scheduler
      domains created by the cpusets") introduced a hotplug-related problem as
      described below:
      
      Upon CPU_DOWN_PREPARE,
      
        update_sched_domains() -> detach_destroy_domains(&cpu_online_map)
      
      does the following:
      
      /*
       * Force a reinitialization of the sched domains hierarchy. The domains
       * and groups cannot be updated in place without racing with the balancing
       * code, so we temporarily attach all running cpus to the NULL domain
       * which will prevent rebalancing while the sched domains are recalculated.
       */
      
      The sched-domains should be rebuilt once a CPU_DOWN operation has
      completed, i.e. either upon CPU_DEAD{_FROZEN} (on success) or
      CPU_DOWN_FAILED{_FROZEN} (on failure, to restore things to their
      initial state). update_sched_domains() already does this, but only
      in the !CPUSETS case.
      
      With f18f982a, sched-domains' reinitialization is delegated to
      CPUSETS code:
      
      cpuset_handle_cpuhp() -> common_cpu_mem_hotplug_unplug() ->
      rebuild_sched_domains()
      
      When it is called for CPU_UP_PREPARE (and if its callback runs after
      update_sched_domains()), it negates all the work done by
      update_sched_domains() -- i.e. a soon-to-be-offline cpu is included in
      the sched-domains, which makes it visible to the load-balancer
      while the CPU_DOWN operation is in progress.
      
      __migrate_live_tasks() moves the tasks off a 'dead' cpu (it's already
      "offline" when this function is called).
      
      try_to_wake_up() is called for one of these tasks from another CPU ->
      the load-balancer (wake_idle()) picks up a "dead" CPU and places the
      task on it. Then e.g. BUG_ON(rq->nr_running) detects this a bit later
      -> oops.

      Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com>
      Tested-by: Vegard Nossum <vegard.nossum@gmail.com>
      Cc: Paul Menage <menage@google.com>
      Cc: Max Krasnyansky <maxk@qualcomm.com>
      Cc: Paul Jackson <pj@sgi.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: miaox@cn.fujitsu.com
      Cc: rostedt@goodmis.org
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  2. 12 Jul, 2008 17 commits
  3. 11 Jul, 2008 12 commits
  4. 10 Jul, 2008 9 commits