    sched/uclamp: Fix wrong implementation of cpu.uclamp.min · 0c18f2ec
    Qais Yousef authored
    cpu.uclamp.min is a protection as described in cgroup-v2 Resource
    Distribution Model
    
    	Documentation/admin-guide/cgroup-v2.rst
    
    which means we try our best to preserve the minimum performance point of
    tasks in this group. See full description of cpu.uclamp.min in the
    cgroup-v2.rst.
    
    But the current implementation makes it a limit, which is not what was
    intended.
    
    For example:
    
    	tg->cpu.uclamp.min = 20%
    
    	p0->uclamp[UCLAMP_MIN] = 0
    	p1->uclamp[UCLAMP_MIN] = 50%
    
    	Previous Behavior (limit):
    
    		p0->effective_uclamp = 0
    		p1->effective_uclamp = 20%
    
    	New Behavior (Protection):
    
    		p0->effective_uclamp = 20%
    		p1->effective_uclamp = 50%
    
    This is in line with how protections should work.
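
    The difference between the two behaviors in the example above can be
    sketched as follows. This is a minimal illustration, not the code in
    core.c: the helper names are hypothetical, values are plain percentages
    rather than kernel capacity units, and the !user_defined handling is
    left out.

```c
/*
 * Hypothetical helpers, for illustration only.
 * task_min: the task's requested UCLAMP_MIN, in percent.
 * tg_min:   the cgroup's cpu.uclamp.min, in percent.
 */

/* Previous behavior: cpu.uclamp.min caps the task's request (a limit). */
static unsigned int uclamp_min_as_limit(unsigned int task_min,
					unsigned int tg_min)
{
	return task_min < tg_min ? task_min : tg_min;
}

/* New behavior: cpu.uclamp.min raises the task's request (a protection). */
static unsigned int uclamp_min_as_protection(unsigned int task_min,
					     unsigned int tg_min)
{
	return task_min > tg_min ? task_min : tg_min;
}
```

    With tg_min = 20, the limit variant maps p0 (0) and p1 (50) to 0 and 20,
    while the protection variant maps them to 20 and 50, matching the values
    listed in the example.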
    
    With this change the cgroup and per-task behaviors are the same, as
    expected.
    
    Additionally, we remove the confusing relationship between cgroup and
    !user_defined flag.
    
    For example, we don't want RT tasks that are boosted to max by default
    to change their boost value when they attach to a cgroup. If a cgroup
    wants to limit the max performance point of tasks attached to it, then
    cpu.uclamp.max must be set accordingly.
    
    Or, if users want a different boost value per cgroup, then
    sysctl_sched_util_clamp_min_rt_default must be used to NOT boost to max,
    and the right cpu.uclamp.min must be set for each group so that RT tasks
    obtain the desired boost value when attached to that group.
    
    As it stands, the dependency on the !user_defined flag adds an extra
    layer of complexity that is not required now that cpu.uclamp.min behaves
    properly as a protection.
    
    The propagation model of effective cpu.uclamp.min in child cgroups as
    implemented by cpu_util_update_eff() is still correct. The parent
    protection sets an upper limit on what the child cgroups will
    effectively get.
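
    The propagation rule can be sketched as below. This is a simplified
    model of what the paragraph above describes, not the body of
    cpu_util_update_eff(); the function name is hypothetical and values are
    percentages.

```c
/*
 * Hypothetical helper, for illustration only.
 * child_req:  the child cgroup's requested cpu.uclamp.min, in percent.
 * parent_eff: the parent's effective cpu.uclamp.min, in percent.
 */
static unsigned int child_effective_uclamp_min(unsigned int child_req,
					       unsigned int parent_eff)
{
	/* A child cannot effectively get more protection than its parent. */
	return child_req < parent_eff ? child_req : parent_eff;
}
```

    For instance, a child requesting 50 under a parent whose effective value
    is 20 ends up with 20, while a child requesting 10 keeps 10.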
    
    Fixes: 3eac870a (sched/uclamp: Use TG's clamps to restrict TASK's clamps)
    Signed-off-by: Qais Yousef <qais.yousef@arm.com>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Link: https://lkml.kernel.org/r/20210510145032.1934078-2-qais.yousef@arm.com