• Peter Zijlstra's avatar
    sched/topology: Fix overlapping sched_group_mask · 73bb059f
    Peter Zijlstra authored
    The point of sched_group_mask is to select those CPUs from
    sched_group_cpus that can actually arrive at this balance domain.
    
    The current code gets it wrong, as can be readily demonstrated with a
    topology like:
    
      node   0   1   2   3
        0:  10  20  30  20
        1:  20  10  20  30
        2:  30  20  10  20
        3:  20  30  20  10
    
    Where (for example) domain 1 on CPU1 ends up with a mask that includes
    CPU0:
    
      [] CPU1 attaching sched-domain:
      []  domain 0: span 0-2 level NUMA
      []   groups: 1 (mask: 1), 2, 0
      []   domain 1: span 0-3 level NUMA
      []    groups: 0-2 (mask: 0-2) (cpu_capacity: 3072), 0,2-3 (cpu_capacity: 3072)
    
    This causes sched_balance_cpu() to compute the wrong CPU and
    consequently should_we_balance() will terminate early resulting in
    missed load-balance opportunities.
    
    The fixed topology looks like:
    
      [] CPU1 attaching sched-domain:
      []  domain 0: span 0-2 level NUMA
      []   groups: 1 (mask: 1), 2, 0
      []   domain 1: span 0-3 level NUMA
      []    groups: 0-2 (mask: 1) (cpu_capacity: 3072), 0,2-3 (cpu_capacity: 3072)
    
    (note: this relies on OVERLAP domains to always have children, this is
     true because the regular topology domains are still here -- this is
     before degenerate trimming)
    Debugged-by: default avatarLauro Ramos Venancio <lvenanci@redhat.com>
    Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Mike Galbraith <efault@gmx.de>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: linux-kernel@vger.kernel.org
    Cc: stable@vger.kernel.org
    Fixes: e3589f6c ("sched: Allow for overlapping sched_domain spans")
    Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
    73bb059f
topology.c 40.5 KB