• Alexey Dobriyan's avatar
    cpumask: make cpumask_next() out-of-line · f22ef333
    Alexey Dobriyan authored
    Every for_each_XXX_cpu() invocation calls cpumask_next() which is an
    inline function:
    
    	static inline unsigned int cpumask_next(int n, const struct cpumask *srcp)
    	{
    	        /* -1 is a legal arg here. */
    	        if (n != -1)
    	                cpumask_check(n);
    	        return find_next_bit(cpumask_bits(srcp), nr_cpumask_bits, n + 1);
    	}
    
    However!
    
    find_next_bit() is regular out-of-line function which means "nr_cpu_ids"
    load and increment happen at the caller resulting in a lot of bloat
    
    x86_64 defconfig:
    	add/remove: 3/0 grow/shrink: 8/373 up/down: 155/-5668 (-5513)
    x86_64 allyesconfig-ish:
    	add/remove: 3/1 grow/shrink: 57/634 up/down: 3515/-28177 (-24662) !!!
    
    Some archs redefine find_next_bit() but it is OK:
    
    	m68k		inline but SMP is not supported
    	arm		out-of-line
    	unicore32	out-of-line
    
    Function call will happen anyway, so move load and increment into callee.
    
    Link: http://lkml.kernel.org/r/20170824230010.GA1593@avx2Signed-off-by: default avatarAlexey Dobriyan <adobriyan@gmail.com>
    Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
    Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    f22ef333
cpumask.c 5.74 KB