Commit 548796e2 authored by Cruz Zhao, committed by Peter Zijlstra

sched/core: introduce sched_core_idle_cpu()

Core scheduling introduced a new idle state, force idle, in which a
CPU runs the idle task while nr_running is greater than zero.

If a CPU is in the force idle state, idle_cpu() returns zero. This
result makes sense in some scenarios, e.g., load balancing,
showacpu when dumping, and judging whether the RCU boost kthread is
starving.

But it causes errors in other scenarios, e.g., tick_irq_exit():
When a CPU is force idle, rq->curr == rq->idle but rq->nr_running > 0,
so idle_cpu() returns 0. In tick_irq_exit(), if idle_cpu() returns 0,
tick_nohz_irq_exit() is not called, and ts->idle_active, which was
cleared to 0 in tick_nohz_irq_enter(), is not set back to 1.
update_ts_time_stats() does not update ts->idle_sleeptime while
ts->idle_active is 0, although it should be 1 here. As a result,
ts->idle_sleeptime is less than the actual value, and the idle time
reported in /proc/stat is less than the actual value as well.
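
One way to observe the symptom from user space is to sample the idle
column of /proc/stat. The following is a minimal sketch, not part of
this commit: parse_stat_idle() is a hypothetical helper name, and it
assumes the documented layout of the "cpu" lines (label, then user,
nice, system, idle, ... in USER_HZ units).

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: extract the idle field (4th numeric column) from an
 * aggregate ("cpu") or per-CPU ("cpuN") line of /proc/stat.
 * Returns 0 on success, -1 if the line is not a well-formed cpu line.
 */
static int parse_stat_idle(const char *line, unsigned long long *idle)
{
	char label[16];
	unsigned long long user, nice, system, idle_val;

	/* label, user, nice, system, idle: five conversions expected */
	if (sscanf(line, "%15s %llu %llu %llu %llu",
		   label, &user, &nice, &system, &idle_val) != 5)
		return -1;

	/* Only lines whose label starts with "cpu" carry CPU time columns */
	if (strncmp(label, "cpu", 3) != 0)
		return -1;

	*idle = idle_val;
	return 0;
}
```

Sampling this value twice on a force-idle CPU and comparing the delta
against wall-clock time would show the shortfall described above,
assuming core scheduling is enabled and the kernel lacks this fix.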

To solve this problem, we introduce sched_core_idle_cpu(), which
returns 1 when the CPU is force idle. We audited all users of
idle_cpu() and replaced it with sched_core_idle_cpu() in
tick_irq_exit().
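
The effect of the changed predicate can be illustrated with a small
user-space model. This is a sketch under stated assumptions, not kernel
code: struct model_rq, struct model_ts and all model_* functions are
hypothetical stand-ins that keep only the fields and steps named in
this log, and model_idle_cpu() omits the other checks the real
idle_cpu() performs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for the kernel's struct rq. */
struct model_rq {
	bool core_enabled;		/* models sched_core_enabled(rq) */
	bool curr_is_idle;		/* models rq->curr == rq->idle */
	unsigned int nr_running;
};

/* Hypothetical, reduced stand-in for struct tick_sched. */
struct model_ts {
	bool idle_active;
	uint64_t idle_entrytime;
	uint64_t idle_sleeptime;
};

/* Reduced idle_cpu(): idle task running and nothing runnable. */
static int model_idle_cpu(const struct model_rq *rq)
{
	return rq->curr_is_idle && rq->nr_running == 0;
}

/* Mirrors the logic added by this commit: force idle counts as idle. */
static int model_sched_core_idle_cpu(const struct model_rq *rq)
{
	if (rq->core_enabled && rq->curr_is_idle)
		return 1;
	return model_idle_cpu(rq);
}

/* Interrupt entry: account the idle period so far, then clear idle_active. */
static void model_irq_enter(struct model_ts *ts, uint64_t now)
{
	if (ts->idle_active) {
		ts->idle_sleeptime += now - ts->idle_entrytime;
		ts->idle_active = false;
	}
}

/* Interrupt exit: restart idle accounting only if the CPU is judged idle. */
static void model_irq_exit(struct model_ts *ts, uint64_t now, int cpu_is_idle)
{
	if (cpu_is_idle) {
		ts->idle_active = true;
		ts->idle_entrytime = now;
	}
}

/*
 * A force-idle CPU sits idle from t=0, takes an interrupt from t=100 to
 * t=110, then stays idle until the next interrupt at t=200.
 * Returns the idle time accounted by then.
 */
static uint64_t model_run(bool use_core_check)
{
	struct model_rq rq = { .core_enabled = true, .curr_is_idle = true,
			       .nr_running = 1 };
	struct model_ts ts = { .idle_active = true, .idle_entrytime = 0,
			       .idle_sleeptime = 0 };
	int idle;

	model_irq_enter(&ts, 100);
	idle = use_core_check ? model_sched_core_idle_cpu(&rq)
			      : model_idle_cpu(&rq);
	model_irq_exit(&ts, 110, idle);
	model_irq_enter(&ts, 200);

	return ts.idle_sleeptime;
}
```

In this model, model_run(false) loses the idle period after the first
interrupt, while model_run(true) accounts for it. The real exit path
also checks need_resched(), tick_nohz_full_cpu() and in_hardirq(),
which the model leaves out.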

v2-->v3: Only replace idle_cpu() with sched_core_idle_cpu() in
tick_irq_exit(), and update the commit log accordingly.

Signed-off-by: Cruz Zhao <CruzZhao@linux.alibaba.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Reviewed-by: Joel Fernandes <joel@joelfernandes.org>
Link: https://lore.kernel.org/r/1688011324-42406-1-git-send-email-CruzZhao@linux.alibaba.com
parent 677ea015
@@ -2433,9 +2433,11 @@ extern void sched_core_free(struct task_struct *tsk);
 extern void sched_core_fork(struct task_struct *p);
 extern int sched_core_share_pid(unsigned int cmd, pid_t pid, enum pid_type type,
 				unsigned long uaddr);
+extern int sched_core_idle_cpu(int cpu);
 #else
 static inline void sched_core_free(struct task_struct *tsk) { }
 static inline void sched_core_fork(struct task_struct *p) { }
+static inline int sched_core_idle_cpu(int cpu) { return idle_cpu(cpu); }
 #endif
 extern void sched_set_stop_task(int cpu, struct task_struct *stop);
@@ -7383,6 +7383,19 @@ struct task_struct *idle_task(int cpu)
 	return cpu_rq(cpu)->idle;
 }
+#ifdef CONFIG_SCHED_CORE
+int sched_core_idle_cpu(int cpu)
+{
+	struct rq *rq = cpu_rq(cpu);
+	if (sched_core_enabled(rq) && rq->curr == rq->idle)
+		return 1;
+	return idle_cpu(cpu);
+}
+#endif
 #ifdef CONFIG_SMP
 /*
  * This function computes an effective utilization for the given CPU, to be
@@ -612,7 +612,7 @@ static inline void tick_irq_exit(void)
 	int cpu = smp_processor_id();
 	/* Make sure that timer wheel updates are propagated */
-	if ((idle_cpu(cpu) && !need_resched()) || tick_nohz_full_cpu(cpu)) {
+	if ((sched_core_idle_cpu(cpu) && !need_resched()) || tick_nohz_full_cpu(cpu)) {
 		if (!in_hardirq())
 			tick_nohz_irq_exit();
 	}