Commit 7ff16932 authored by Tim C Chen, committed by Peter Zijlstra

sched/fair: Implement prefer sibling imbalance calculation between asymmetric groups

In the current prefer sibling load balancing code, there is an implicit
assumption that the busiest sched group and local sched group are
equivalent, hence the number of tasks to be moved is simply the difference
in number of tasks between the two groups (i.e. imbalance) divided by two.

However, we may have a different number of cores in each cluster group,
say when we take a CPU offline or we have hybrid groups.  In that case,
we should balance between the two groups such that the #tasks/#cores ratio
is the same in both groups.  Hence the imbalance computed will need to
reflect this.

Adjust the sibling imbalance computation to take the above
considerations into account.

Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/4eacbaa236e680687dae2958378a6173654113df.1688770494.git.tim.c.chen@linux.intel.com
parent d24cb0d9
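
The ratio-balancing goal stated in the commit message can be worked out by hand. The sketch below uses notation that is not in the original (an assumption for illustration): B and L are the numbers of running tasks in the busiest and local groups, n_b and n_l are their core counts, and x is the number of tasks to move. Since the message describes the number of tasks to move as the imbalance divided by two, the imbalance corresponds to roughly 2x, before any integer rounding the code applies.

```latex
% x = tasks moved from the busiest group to the local group
\frac{B - x}{n_b} = \frac{L + x}{n_l}
\quad\Longrightarrow\quad
x = \frac{n_l B - n_b L}{n_l + n_b}

% imbalance is about twice x; with n_b = n_l it reduces to B - L
\text{imbalance} \approx 2x = \frac{2\,(n_l B - n_b L)}{n_l + n_b}
```

When both groups have the same number of cores, this collapses to the plain task-count difference that the previous code used.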
@@ -9535,6 +9535,41 @@ static inline bool smt_balance(struct lb_env *env, struct sg_lb_stats *sgs,
 	return false;
 }

+static inline long sibling_imbalance(struct lb_env *env,
+				    struct sd_lb_stats *sds,
+				    struct sg_lb_stats *busiest,
+				    struct sg_lb_stats *local)
+{
+	int ncores_busiest, ncores_local;
+	long imbalance;
+
+	if (env->idle == CPU_NOT_IDLE || !busiest->sum_nr_running)
+		return 0;
+
+	ncores_busiest = sds->busiest->cores;
+	ncores_local = sds->local->cores;
+
+	if (ncores_busiest == ncores_local) {
+		imbalance = busiest->sum_nr_running;
+		lsub_positive(&imbalance, local->sum_nr_running);
+		return imbalance;
+	}
+
+	/* Balance such that nr_running/ncores ratio are same on both groups */
+	imbalance = ncores_local * busiest->sum_nr_running;
+	lsub_positive(&imbalance, ncores_busiest * local->sum_nr_running);
+	/* Normalize imbalance and do rounding on normalization */
+	imbalance = 2 * imbalance + ncores_local + ncores_busiest;
+	imbalance /= ncores_local + ncores_busiest;
+
+	/* Take advantage of resource in an empty sched group */
+	if (imbalance == 0 && local->sum_nr_running == 0 &&
+	    busiest->sum_nr_running > 1)
+		imbalance = 2;
+
+	return imbalance;
+}
+
 static inline bool
 sched_reduced_capacity(struct rq *rq, struct sched_domain *sd)
 {
@@ -10393,14 +10428,12 @@ static inline void calculate_imbalance(struct lb_env *env, struct sd_lb_stats *s
 		}

 		if (busiest->group_weight == 1 || sds->prefer_sibling) {
-			unsigned int nr_diff = busiest->sum_nr_running;
 			/*
 			 * When prefer sibling, evenly spread running tasks on
 			 * groups.
 			 */
 			env->migration_type = migrate_task;
-			lsub_positive(&nr_diff, local->sum_nr_running);
-			env->imbalance = nr_diff;
+			env->imbalance = sibling_imbalance(env, sds, busiest, local);
 		} else {

 			/*
@@ -10597,7 +10630,7 @@ static struct sched_group *find_busiest_group(struct lb_env *env)
 	 * group's child domain.
 	 */
 	if (sds.prefer_sibling && local->group_type == group_has_spare &&
-	    busiest->sum_nr_running > local->sum_nr_running + 1)
+	    sibling_imbalance(env, &sds, busiest, local) > 1)
 		goto force_balance;

 	if (busiest->group_type != group_overloaded) {
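
To try the arithmetic of sibling_imbalance() outside the kernel, a minimal userspace model might look like the following. This is a sketch under stated assumptions, not the kernel code: `model_sibling_imbalance()` and `sub_positive()` are hypothetical names, the scheduler structures are replaced by plain integer arguments, the `env->idle` check is left out, and the positive-core-count assertion is added only for the sketch.

```c
#include <assert.h>

/* Hypothetical stand-in for the kernel's lsub_positive(): subtract, clamping at zero. */
static long sub_positive(long a, long b)
{
	return a > b ? a - b : 0;
}

/*
 * Userspace model of the arithmetic in sibling_imbalance().
 * Assumption: the env->idle check is omitted and the group statistics
 * are passed in directly as core counts and running-task counts.
 */
static long model_sibling_imbalance(int ncores_busiest, long nr_busiest,
				    int ncores_local, long nr_local)
{
	long imbalance;

	/* Sketch-only sanity check; the kernel function has no such assert. */
	assert(ncores_busiest > 0 && ncores_local > 0);

	if (!nr_busiest)
		return 0;

	/* Equal core counts: plain difference in running tasks. */
	if (ncores_busiest == ncores_local)
		return sub_positive(nr_busiest, nr_local);

	/* Cross-multiply so the tasks/cores ratios are compared without division. */
	imbalance = sub_positive(ncores_local * nr_busiest,
				 ncores_busiest * nr_local);

	/* Normalize by the total core count, as in the patch. */
	imbalance = 2 * imbalance + ncores_local + ncores_busiest;
	imbalance /= ncores_local + ncores_busiest;

	/* Same empty-local-group special case as in the patch. */
	if (imbalance == 0 && nr_local == 0 && nr_busiest > 1)
		imbalance = 2;

	return imbalance;
}
```

For example, `model_sibling_imbalance(4, 6, 2, 0)` (6 tasks on a 4-core busiest group, an idle 2-core local group) returns 5. Halving that and rounding down, in line with the "divided by two" in the commit message, gives 2 tasks to move, which leaves 4 tasks on 4 cores and 2 tasks on 2 cores.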