• Luben Tuikov's avatar
    drm/scheduler: Set the FIFO scheduling policy as the default · 977d97f1
    Luben Tuikov authored
    The currently default Round-Robin GPU scheduling can result in starvation
    of entities which have a large number of jobs, over entities which have
    a very small number of jobs (single digit).
    
    This can be illustrated in the following diagram, where jobs are
    alphabetized to show their chronological order of arrival, where job A is
    the oldest, B is the second oldest, and so on, to J, the most recent job to
    arrive.
    
        ---> entities
    j | H-F-----A--E--I--
    o | --G-----B-----J--
    b | --------C--------
    s\/ --------D--------
    
    WLOG, assuming all jobs are "ready", then a R-R scheduling will execute them
    in the following order (a slice off of the top of the entities' list),
    
    H, F, A, E, I, G, B, J, C, D.
    
    However, to mitigate job starvation, we'd rather execute C and D before E,
    and so on, given, of course, that they're all ready to be executed.
    
    So, if all jobs are ready at this instant, the order of execution for this
    and the next 9 instances of picking the next job to execute, should really
    be,
    
    A, B, C, D, E, F, G, H, I, J,
    
    which is their chronological order. The only reason for this order to be
    broken, is if an older job is not yet ready, but a younger job is ready, at
    an instant of picking a new job to execute. For instance if job C wasn't
    ready at time 2, but job D was ready, then we'd pick job D, like this:
    
    0 +1 +2  ...
    A, B, D, ...
    
    And from then on, C would be preferred before all other jobs, if it is ready
    at the time when a new job for execution is picked. So, if C became ready
    two steps later, the execution order would look like this:
    
    ......0 +1 +2  ...
    A, B, D, E, C, F, G, H, I, J
    
    This is what the FIFO GPU scheduling algorithm achieves. It uses a
    Red-Black tree to keep jobs sorted in chronological order, where picking
    the oldest job is O(1) (we use the "cached" structure), and balancing the
    tree is O(log n). IOW, it picks the *oldest ready* job to execute now.
    
    The implementation is already in the kernel, and this commit only changes
    the default GPU scheduling algorithm to use.
    
    This was tested and achieves about 1% faster performance over the Round
    Robin algorithm.
    
    Cc: Christian König <christian.koenig@amd.com>
    Cc: Alex Deucher <Alexander.Deucher@amd.com>
    Cc: Direct Rendering Infrastructure - Development <dri-devel@lists.freedesktop.org>
    Signed-off-by: default avatarLuben Tuikov <luben.tuikov@amd.com>
    Reviewed-by: default avatarChristian König <christian.koenig@amd.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20221024212634.27230-1-luben.tuikov@amd.comSigned-off-by: default avatarChristian König <christian.koenig@amd.com>
    977d97f1
sched_main.c 32.5 KB