Commit 83e2ae7f authored by John Hawkes, committed by Linus Torvalds

[PATCH] sched: improved load_balance() tolerance for pinned tasks

A large number of processes that are pinned to a single CPU results in
every other CPU's load_balance() seeing this overloaded CPU as "busiest",
yet move_tasks() never finds a task to pull-migrate.  This condition occurs
during module unload, but can also occur as a denial-of-service using
sys_sched_setaffinity().  Several hundred CPUs performing this fruitless
load_balance() will livelock on the busiest CPU's runqueue lock.  A smaller
number of CPUs will livelock if the pinned task count gets high.  This
simple patch remedies the more common first problem: after move_tasks()
fails to migrate anything, balance_interval is incremented.  A simple
increment, versus the more dramatic doubling of balance_interval, is
conservative and yet also effective.
Signed-off-by: John Hawkes <hawkes@sgi.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
parent 625e1bd5
@@ -1973,11 +1973,19 @@ static int load_balance(int this_cpu, runqueue_t *this_rq,
 			 */
 			sd->nr_balance_failed = sd->cache_nice_tries;
 		}
-	} else
-		sd->nr_balance_failed = 0;
-
-	/* We were unbalanced, so reset the balancing interval */
-	sd->balance_interval = sd->min_interval;
+
+		/*
+		 * We were unbalanced, but unsuccessful in move_tasks(),
+		 * so bump the balance_interval to lessen the lock contention.
+		 */
+		if (sd->balance_interval < sd->max_interval)
+			sd->balance_interval++;
+	} else {
+		sd->nr_balance_failed = 0;
+
+		/* We were unbalanced, so reset the balancing interval */
+		sd->balance_interval = sd->min_interval;
+	}
 
 	return nr_moved;