    x86/timers/apic: Fix imprecise timer interrupts by eliminating TSC clockevents... · 1a9e4c56
    Nicolai Stange authored
    x86/timers/apic: Fix imprecise timer interrupts by eliminating TSC clockevents frequency roundoff error
    
    I noticed the following bug/misbehavior on certain Intel systems: with a
    single task running on a NOHZ CPU on an Intel Haswell, I got not just
    the one expected local_timer APIC interrupt per second, but at least
    two. (!)
    
    Further tracing showed that the first one precedes the programmed deadline
    by up to ~50us and hence does nothing except reprogram the TSC deadline
    clockevent device to trigger again shortly thereafter.
    
    The reason for this is imprecise calibration: the timeout we program into
    the APIC results in 'too short' timer interrupts. The core (hr)timer code
    notices this (because it has a precise ktime source and sees the short
    interrupt) and fixes it up by programming an additional very short
    interrupt period.
    
    This is obviously suboptimal.
    
    The reason for the imprecise calibration is twofold, and this patch
    fixes the first reason:
    
    In setup_APIC_timer(), the registered clockevent device's frequency
    is calculated by first dividing tsc_khz by TSC_DIVISOR and then
    multiplying the result by 1000:
    
      (tsc_khz / TSC_DIVISOR) * 1000
    
    The multiplication by 1000 converts from kHz to Hz, and the division by
    TSC_DIVISOR is carried out in order to make sure that the final result
    fits into a u32.
    
    However, with the order given in this calculation, the roundoff error
    introduced by the division gets magnified by a factor of 1000 by the
    following multiplication.
    
    Reversing the order of the division and the multiplication, a la:

      (tsc_khz * 1000) / TSC_DIVISOR

    ... already reduces the roundoff error.
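
To illustrate the effect of the operand order, here is a minimal userspace sketch. The function names and the sample tsc_khz value of 2693827 are hypothetical and not taken from the kernel sources; the divisor passed in is meant to be the old TSC_DIVISOR value of 32.

```c
#include <stdint.h>

/* Old order: divide first, then convert kHz -> Hz.
 * The remainder dropped by the division is effectively scaled by 1000. */
static uint32_t freq_div_first(uint32_t tsc_khz, uint32_t divisor)
{
	return (tsc_khz / divisor) * 1000;
}

/* Reversed order: convert kHz -> Hz first, then divide.
 * The product is formed in 64 bits here so that it cannot overflow;
 * this widening is an assumption of the sketch. */
static uint32_t freq_mul_first(uint32_t tsc_khz, uint32_t divisor)
{
	return (uint32_t)(((uint64_t)tsc_khz * 1000) / divisor);
}
```

For the sample value and a divisor of 32, freq_div_first() yields 84182000 while freq_mul_first() yields 84182093: the remainder of 3 that the early division throws away turns into an error of 93 Hz in the registered frequency.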
    
    Furthermore, if TSC_DIVISOR divides 1000, associativity holds:
    
      (tsc_khz * 1000) / TSC_DIVISOR = tsc_khz * (1000 / TSC_DIVISOR)
    
    and thus the roundoff error vanishes entirely and the whole operation can
    be carried out within 32 bits.
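
The exact case can be sketched as follows, assuming a TSC_DIVISOR of 8 so that 1000 / TSC_DIVISOR is the integer 125. The macro and function names below are illustrative only, and the sample tsc_khz value used in the note is hypothetical.

```c
#include <stdint.h>

#define SKETCH_TSC_DIVISOR 8	/* hypothetical stand-in for TSC_DIVISOR */

/* Because 8 divides 1000, the factor 1000 / 8 = 125 is computed without
 * any remainder, and the whole calculation is a single 32-bit multiply. */
static uint32_t freq_exact(uint32_t tsc_khz)
{
	return tsc_khz * (1000 / SKETCH_TSC_DIVISOR);
}

/* 64-bit reference: (tsc_khz * 1000) / 8, for comparison only. */
static uint64_t freq_reference(uint32_t tsc_khz)
{
	return ((uint64_t)tsc_khz * 1000) / SKETCH_TSC_DIVISOR;
}
```

For a tsc_khz of 2693827, both functions return 336728375, and multiplying that result by 8 gives back exactly tsc_khz * 1000, so nothing is lost to rounding.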
    
    The powers of two that divide 1000 are 2, 4 and 8. A value of 8 for
    TSC_DIVISOR still allows for TSC frequencies up to
    2^32 / 10^9ns * 8 = 34.4GHz, which is far larger than anything to be
    expected in the coming years.
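
The frequency limit can be checked with a small sketch. It assumes the registered frequency is computed as tsc_khz * (1000 / divisor) in a u32 and that the divisor divides 1000; the helper name is hypothetical.

```c
#include <stdint.h>

/* Largest tsc_khz for which tsc_khz * (1000 / divisor) still fits
 * into 32 bits. Only meaningful for divisors that divide 1000. */
static uint32_t max_tsc_khz(uint32_t divisor)
{
	return UINT32_MAX / (1000 / divisor);
}
```

With a divisor of 8 this gives 34359738 kHz, i.e. roughly 34.4 GHz; the smaller divisors 4 and 2 would lower the limit to about 17.2 GHz and 8.6 GHz respectively.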
    
    Thus, replace the current TSC_DIVISOR value of 32 with 8 and reverse
    the order of the division and the multiplication in the calculation of
    the registered clockevent device's frequency.
    Signed-off-by: Nicolai Stange <nicstange@gmail.com>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
    Acked-by: Thomas Gleixner <tglx@linutronix.de>
    Cc: Adrian Hunter <adrian.hunter@intel.com>
    Cc: Borislav Petkov <bp@suse.de>
    Cc: Christopher S. Hall <christopher.s.hall@intel.com>
    Cc: H. Peter Anvin <hpa@zytor.com>
    Cc: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
    Cc: Len Brown <len.brown@intel.com>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Viresh Kumar <viresh.kumar@linaro.org>
    Link: http://lkml.kernel.org/r/20160714152255.18295-2-nicstange@gmail.com
    [ Improved changelog. ]
    Signed-off-by: Ingo Molnar <mingo@kernel.org>