Commit 8a0be9ef authored by Frederic Weisbecker, committed by Ingo Molnar

sched: don't rebalance if attached on NULL domain

Impact: fix function graph trace hang / drop pointless softirq on UP

While debugging a function graph trace hang on an old PII, I saw
that it spent most of its time in the timer interrupt, and that
the domain rebalancing softirq was the main contributor.

The timer interrupt calls trigger_load_balance(), which decides
whether it is worth scheduling a rebalancing softirq.

With a kernel built for UP, no problem arises because there are
no sched domains to deal with.

With a kernel built for SMP running on an SMP box, there is still
no problem: the softirq is raised each time we reach the
next_balance time.

With a kernel built for SMP running on a UP box (most distros
provide default SMP kernels, whatever box you have), the CPU is
attached to the NULL sched domain, and unexpected behaviour
follows:

trigger_load_balance() raises the rebalancing softirq. Later, in
softirq context, run_rebalance_domains() calls rebalance_domains(),
where the body of the for_each_domain(cpu, sd) loop is never entered
because of the NULL domain we are attached to. This means
rq->next_balance is never updated, so on every following timer tick
trigger_load_balance() raises the rebalancing softirq again:

if (time_after_eq(jiffies, rq->next_balance))
	raise_softirq(SCHED_SOFTIRQ);

So for each tick, we process this pointless softirq.
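
The tick-by-tick behaviour described above can be reproduced with a
small userspace model. This is only a sketch, not kernel code: every
model_* name is hypothetical, a single domain with a fixed interval
stands in for the real domain hierarchy, and a plain counter replaces
raise_softirq().

```c
#include <stddef.h>

/* Hypothetical stand-ins for struct sched_domain and struct rq. */
struct model_domain {
	unsigned long interval;		/* ticks between two balance passes */
};

struct model_rq {
	struct model_domain *sd;	/* NULL models the NULL sched domain */
	unsigned long next_balance;	/* tick at which balancing is due */
};

/* Wrap-safe "a >= b", in the spirit of the kernel's time_after_eq(). */
static int model_time_after_eq(unsigned long a, unsigned long b)
{
	return (long)(a - b) >= 0;
}

/* Models the softirq handler: without a domain, next_balance stays stale. */
static void model_rebalance_domains(struct model_rq *rq, unsigned long now)
{
	if (rq->sd)
		rq->next_balance = now + rq->sd->interval;
}

/* Runs nr_ticks ticks with the unpatched check; returns softirqs raised. */
static unsigned long model_count_softirqs(struct model_rq *rq,
					  unsigned long nr_ticks)
{
	unsigned long now, raised = 0;

	for (now = 1; now <= nr_ticks; now++) {
		if (model_time_after_eq(now, rq->next_balance)) {
			raised++;	/* stands in for raise_softirq() */
			model_rebalance_domains(rq, now);
		}
	}
	return raised;
}
```

In this model a runqueue with sd == NULL gets one softirq per tick,
while a runqueue with a domain only gets one per interval.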

This patch fixes it by checking whether we are attached to the NULL
domain before raising the softirq. Another possible fix would be
to set rq->next_balance to the maximal possible jiffies value if
we are attached to the NULL domain.
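
As a rough illustration of the chosen fix, the patched condition can
be written as a pure function. This is a userspace sketch under
assumptions: model_should_raise() and its has_domain parameter are
hypothetical names, and the boolean stands in for the result of
!on_null_domain(cpu).

```c
#include <stdbool.h>

/* Wrap-safe "a >= b", in the spirit of the kernel's time_after_eq(). */
static bool model_time_after_eq(unsigned long a, unsigned long b)
{
	return (long)(a - b) >= 0;
}

/*
 * Hypothetical model of the patched check: the softirq is raised only
 * when balancing is due AND the CPU is attached to a real domain.
 */
static bool model_should_raise(unsigned long now, unsigned long next_balance,
			       bool has_domain)
{
	return model_time_after_eq(now, next_balance) && has_domain;
}
```

With has_domain false the function never returns true, whatever the
value of next_balance, so a stale next_balance no longer matters.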

v2: build fix on UP
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <49af242d.1c07d00a.32d5.ffffc019@mx.google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
parent 49d2d266
@@ -4148,6 +4148,11 @@ static void run_rebalance_domains(struct softirq_action *h)
 #endif
 }
 
+static inline int on_null_domain(int cpu)
+{
+	return !rcu_dereference(cpu_rq(cpu)->sd);
+}
+
 /*
  * Trigger the SCHED_SOFTIRQ if it is time to do periodic load balancing.
  *
@@ -4205,7 +4210,9 @@ static inline void trigger_load_balance(struct rq *rq, int cpu)
 	    cpumask_test_cpu(cpu, nohz.cpu_mask))
 		return;
 #endif
-	if (time_after_eq(jiffies, rq->next_balance))
+	/* Don't need to rebalance while attached to NULL domain */
+	if (time_after_eq(jiffies, rq->next_balance) &&
+	    likely(!on_null_domain(cpu)))
 		raise_softirq(SCHED_SOFTIRQ);
 }
 