    clocksource: Retry clock read if long delays detected · db3a34e1
    Paul E. McKenney authored
    When the clocksource watchdog marks a clock as unstable, this might be due
    to that clock being unstable or it might be due to delays that happen to
    occur between the reads of the two clocks.  Yes, interrupts are disabled
    across those two reads, but there is no shortage of things that can delay
    interrupts-disabled regions of code, ranging from SMI handlers to vCPU
    preemption.  It would be good to have some indication as to why the clock
    was marked unstable.
    
    Therefore, re-read the watchdog clock on either side of the read from the
    clock under test.  If the watchdog clock shows an excessive time delta
    between its pair of reads, the reads are retried.
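
    The bracketing read can be sketched in user-space C as follows.  This is
    only an illustration of the idea, not the code in clocksource.c: the names
    read_with_retry(), struct cswd_read_result, and struct scripted_clock are
    hypothetical, the delay threshold is passed in by the caller, and the
    scripted clocks stand in for real hardware so that the behavior can be
    followed by hand.  The real code also runs with interrupts disabled, which
    this sketch does not attempt to model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical clock abstraction: returns a timestamp in nanoseconds. */
typedef uint64_t (*clock_read_fn)(void *ctx);

/* Hypothetical test double: replays a fixed list of timestamps. */
struct scripted_clock {
	const uint64_t *values;
	size_t len;
	size_t pos;
};

static uint64_t scripted_read(void *ctx)
{
	struct scripted_clock *c = ctx;
	uint64_t v = c->values[c->pos];

	/* Stay on the last value once the script is exhausted. */
	if (c->pos + 1 < c->len)
		c->pos++;
	return v;
}

struct cswd_read_result {
	uint64_t wd_ns;   /* watchdog value read just before the clock under test */
	uint64_t cs_ns;   /* value of the clock under test */
	int retries;      /* retries used (0 means the first attempt succeeded) */
	bool ok;          /* false if every attempt saw an excessive delay */
};

/*
 * Read the watchdog, then the clock under test, then the watchdog again.
 * If the two watchdog reads are more than max_delay_ns apart, assume that
 * something delayed the sequence and try again, up to max_retries times.
 * The watchdog is assumed to be monotonic, so wd_end >= wd_start.
 */
static struct cswd_read_result
read_with_retry(clock_read_fn wd_read, void *wd_ctx,
		clock_read_fn cs_read, void *cs_ctx,
		uint64_t max_delay_ns, int max_retries)
{
	struct cswd_read_result r = { 0 };

	assert(max_retries >= 0);

	for (int attempt = 0; attempt <= max_retries; attempt++) {
		uint64_t wd_start = wd_read(wd_ctx);
		uint64_t cs_now = cs_read(cs_ctx);
		uint64_t wd_end = wd_read(wd_ctx);

		r.wd_ns = wd_start;
		r.cs_ns = cs_now;
		r.retries = attempt;

		if (wd_end - wd_start <= max_delay_ns) {
			r.ok = true;
			return r;
		}
	}

	/* All 1 + max_retries attempts were delayed. */
	r.ok = false;
	return r;
}
```

    With a scripted watchdog returning 0, 500000, 600000, 600050 and a
    threshold of 100000 ns, the first attempt is rejected and the second one
    is accepted, so the result reports one retry.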
    
    The maximum number of retries is specified by a new kernel boot parameter
    clocksource.max_cswd_read_retries, which defaults to three, that is, up to
    four reads, one initial and up to three retries.  If more than one retry
    was required, a message is printed on the console (the occasional single
    retry is expected behavior, especially in guest OSes).  If the maximum
    number of retries is exceeded, the clock under test will be marked
    unstable.  However, the probability of this happening due to various sorts
    of delays is quite small.  In addition, the reason (clock-read delays) for
    the unstable marking will be apparent.
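
    The reporting policy described above can be summarized in a short sketch.
    The enum and function names here are hypothetical and chosen only for
    illustration; the sketch assumes the caller already knows how many retries
    were used and whether any attempt completed without an excessive delay.

```c
#include <stdbool.h>

/* Hypothetical outcomes of one watchdog check. */
enum cswd_outcome {
	CSWD_QUIET,          /* zero or one retry: expected, say nothing */
	CSWD_LOG_RETRIES,    /* more than one retry: print a console message */
	CSWD_MARK_UNSTABLE,  /* retries exhausted: mark the clock unstable */
};

/* Upper bound on reads: one initial read plus the allowed retries. */
static int cswd_max_reads(int max_retries)
{
	return 1 + max_retries;
}

/* Map the result of the retry loop onto the action to take. */
static enum cswd_outcome cswd_classify(int retries_used, bool succeeded)
{
	if (!succeeded)
		return CSWD_MARK_UNSTABLE;
	if (retries_used > 1)
		return CSWD_LOG_RETRIES;
	return CSWD_QUIET;
}
```

    With the default of three retries, cswd_max_reads(3) is four, matching
    the "one initial and up to three retries" described above.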
    
    Reported-by: Chris Mason <clm@fb.com>
    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Acked-by: Feng Tang <feng.tang@intel.com>
    Link: https://lore.kernel.org/r/20210527190124.440372-1-paulmck@kernel.org
clocksource.c 34.7 KB