    do_div(): generic optimization for constant divisor on 32-bit machines · 461a5e51
    Nicolas Pitre authored
    64-by-32-bit divisions are prominent in the kernel, even on 32-bit
    machines.  Luckily, many of them use a constant divisor that allows
    for a much faster multiplication by the divisor's reciprocal.
    
    The compiler already performs this optimization when compiling a 32-by-32
    division with a constant divisor. Unfortunately, on 32-bit machines, gcc
    does not optimize 64-by-32 divisions in that case, except for constant
    divisors that happen to be a power of 2.
    
    Let's avoid the slow path whenever the divisor is constant by manually
    computing the reciprocal ourselves and performing the multiplication
    inline.  In most cases, this improves performance of 64-by-32 divisions
    by about two orders of magnitude compared to the __div64_32() fallback,
    especially on architectures lacking a native div instruction.
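
    As a rough illustration of the idea (not the code added by this commit),
    the sketch below divides a 64-bit value by a 32-bit divisor using a
    multiplication by an approximate reciprocal followed by one correction
    step. The names mul64_hi() and div64_by_const32() are hypothetical, and
    the correction step is a simplification chosen for this sketch; it is
    not claimed to match the ARM-derived algorithm. When the divisor is a
    compile-time constant, the compiler can typically fold the reciprocal
    computation, leaving only multiplications at run time.

```c
#include <assert.h>
#include <stdint.h>

/* High 64 bits of the 128-bit product a * b, built from 32-bit halves
 * so that it only needs 32x32->64 multiplications. */
static uint64_t mul64_hi(uint64_t a, uint64_t b)
{
	uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
	uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;

	uint64_t lo_lo = a_lo * b_lo;
	uint64_t hi_lo = a_hi * b_lo;
	uint64_t lo_hi = a_lo * b_hi;
	uint64_t hi_hi = a_hi * b_hi;

	/* Sum of the middle partial products; cannot overflow 64 bits. */
	uint64_t cross = (lo_lo >> 32) + (uint32_t)hi_lo + lo_hi;

	return hi_hi + (hi_lo >> 32) + (cross >> 32);
}

/* Hypothetical helper: returns n / b and stores n % b in *rem. */
static uint64_t div64_by_const32(uint64_t n, uint32_t b, uint32_t *rem)
{
	assert(b != 0);

	/* Approximate reciprocal: floor((2^64 - 1) / b). With a constant b
	 * this division can be evaluated at compile time. */
	uint64_t m = ~0ULL / b;

	/* Estimated quotient; it is either exact or one too small. */
	uint64_t q = mul64_hi(n, m);
	uint64_t r = n - q * b;

	/* Single correction step for the "one too small" case. */
	if (r >= b) {
		q++;
		r -= b;
	}

	*rem = (uint32_t)r;
	return q;
}
```

    For example, div64_by_const32(n, 1000, &rem) gives the same quotient and
    remainder as n / 1000 and n % 1000 without a run-time 64-bit division.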
    
    The algorithm used here comes from the existing ARM code.
    
    The __div64_const32_is_OK macro can be predefined by architectures to
    disable this optimization in some cases. For example, some ancient gcc
    versions on ARM crash with an ICE (internal compiler error) when fed
    this code.
    
    Signed-off-by: Nicolas Pitre <nico@linaro.org>
    Acked-by: Alexey Brodkin <abrodkin@synopsys.com>
div64.h 6.49 KB