• Patrick Bellasi's avatar
    sched/fair: Add util_est on top of PELT · 7f65ea42
    Patrick Bellasi authored
    The util_avg signal computed by PELT is too variable for some use-cases.
    For example, a big task waking up after a long sleep period will have its
    utilization almost completely decayed. This introduces some latency before
    schedutil will be able to pick the best frequency to run a task.
    
    The same issue can affect task placement. Indeed, since the task
    utilization is already decayed at wakeup, when the task is enqueued in a
    CPU, this can result in a CPU running a big task as being temporarily
    represented as being almost empty. This leads to a race condition where
    other tasks can be potentially allocated on a CPU which just started to run
    a big task which slept for a relatively long period.
    
    Moreover, the PELT utilization of a task can be updated every [ms], thus
    making it a continuously changing value for certain longer running
    tasks. This means that the instantaneous PELT utilization of a RUNNING
    task is not really meaningful to properly support scheduler decisions.
    
    For all these reasons, a more stable signal can do a better job of
    representing the expected/estimated utilization of a task/cfs_rq.
    Such a signal can be easily created on top of PELT by still using it as
    an estimator which produces values to be aggregated on meaningful
    events.
    
    This patch adds a simple implementation of util_est, a new signal built on
    top of PELT's util_avg where:
    
        util_est(task) = max(task::util_avg, f(task::util_avg@dequeue))
    
    This allows to remember how big a task has been reported by PELT in its
    previous activations via f(task::util_avg@dequeue), which is the new
    _task_util_est(struct task_struct*) function added by this patch.
    
    If a task should change its behavior and it runs longer in a new
    activation, after a certain time its util_est will just track the
    original PELT signal (i.e. task::util_avg).
    
    The estimated utilization of cfs_rq is defined only for root ones.
    That's because the only sensible consumer of this signal are the
    scheduler and schedutil when looking for the overall CPU utilization
    due to FAIR tasks.
    
    For this reason, the estimated utilization of a root cfs_rq is simply
    defined as:
    
        util_est(cfs_rq) = max(cfs_rq::util_avg, cfs_rq::util_est::enqueued)
    
    where:
    
        cfs_rq::util_est::enqueued = sum(_task_util_est(task))
                                     for each RUNNABLE task on that root cfs_rq
    
    It's worth noting that the estimated utilization is tracked only for
    objects of interests, specifically:
    
     - Tasks: to better support tasks placement decisions
     - root cfs_rqs: to better support both tasks placement decisions as
                     well as frequencies selection
    Signed-off-by: default avatarPatrick Bellasi <patrick.bellasi@arm.com>
    Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
    Reviewed-by: default avatarDietmar Eggemann <dietmar.eggemann@arm.com>
    Cc: Joel Fernandes <joelaf@google.com>
    Cc: Juri Lelli <juri.lelli@redhat.com>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Morten Rasmussen <morten.rasmussen@arm.com>
    Cc: Paul Turner <pjt@google.com>
    Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com>
    Cc: Steve Muckle <smuckle@google.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Todd Kjos <tkjos@android.com>
    Cc: Vincent Guittot <vincent.guittot@linaro.org>
    Cc: Viresh Kumar <viresh.kumar@linaro.org>
    Link: http://lkml.kernel.org/r/20180309095245.11071-2-patrick.bellasi@arm.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
    7f65ea42
fair.c 272 KB