    tick/broadcast: Make broadcast device replacement work correctly · f9d36cf4
    Thomas Gleixner authored
    When a tick broadcast clockevent device is initialized for oneshot mode
    then tick_broadcast_setup_oneshot() OR's the periodic broadcast mode
    cpumask into the oneshot broadcast cpumask.
    
    This is required when switching from periodic broadcast mode to oneshot
    broadcast mode to ensure that CPUs which are waiting for periodic
    broadcast are woken up on the next tick.
    
    But it is subtly broken when an active broadcast device is replaced and
    the system is already in oneshot (NOHZ/HIGHRES) mode. Victor observed
    this and debugged the issue.
    
    In that case the OR of the periodic broadcast CPU mask is wrong, as a
    CPU's bit in the periodic cpumask is sticky once tick_broadcast_enable()
    has set it, unless it is explicitly cleared via tick_broadcast_disable().
    
    That means that this sets all other CPUs which have tick broadcasting
    enabled at that point unconditionally in the oneshot broadcast mask.
    
    If the affected CPUs were already idle and had their bits set in the
    oneshot broadcast mask then this does no harm. But for non idle CPUs
    which were not set this corrupts their state.
    
    On their next invocation of tick_broadcast_enable() they observe the bit
    set, which indicates that the broadcast for the CPU is already set up.
    As a consequence they fail to update the broadcast event even if their
    earliest expiring timer is before the actually programmed broadcast
    event.
    
    If the programmed broadcast event is far in the future, then this can
    cause stalls or trigger the hung task detector.
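
    The failure mode described above can be reproduced in a small userspace
    model. This is only a sketch under stated assumptions: the toy_* names,
    the 32-bit mask type and the struct layout are hypothetical and are not
    the kernel's data structures. The only behaviour it assumes is the one
    described in the text, namely that a CPU which finds its bit already set
    in the oneshot broadcast mask does not update the broadcast event.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model: one bit per CPU. Hypothetical, not the kernel's struct cpumask. */
typedef uint32_t toy_mask_t;

struct toy_broadcast {
	toy_mask_t periodic_mask;	/* sticky: CPUs with broadcasting enabled */
	toy_mask_t oneshot_mask;	/* CPUs currently relying on the broadcast */
	uint64_t next_event;		/* expiry programmed into the broadcast device */
};

/* Models the problematic setup step: the periodic mask is OR'ed in unconditionally. */
static void toy_setup_oneshot_buggy(struct toy_broadcast *bc)
{
	bc->oneshot_mask |= bc->periodic_mask;
}

/*
 * Models a CPU handing its next timer event to the broadcast device.
 * Only a 0 -> 1 transition of the CPU's bit may pull the broadcast
 * event forward. Returns true if the broadcast event was reprogrammed.
 */
static bool toy_cpu_enter_idle(struct toy_broadcast *bc, int cpu,
			       uint64_t cpu_next_event)
{
	toy_mask_t bit;
	bool was_set;

	assert(cpu >= 0 && cpu < 32);
	bit = (toy_mask_t)1 << cpu;

	was_set = (bc->oneshot_mask & bit) != 0;
	bc->oneshot_mask |= bit;

	/* Bit already set: the CPU believes its broadcast is set up. */
	if (was_set)
		return false;

	if (cpu_next_event < bc->next_event) {
		bc->next_event = cpu_next_event;
		return true;
	}
	return false;
}
```

    With CPU 1 enabled for broadcasting but not idle, running the buggy
    setup first sets its oneshot bit. A later toy_cpu_enter_idle() call for
    CPU 1 then returns false and leaves next_event untouched, even when the
    CPU's own timer expires much earlier than the programmed event.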
    
    Avoid this by telling tick_broadcast_setup_oneshot() explicitly whether
    this is the initial switch over from periodic to oneshot broadcast which
    must take the periodic broadcast mask into account. In the case of
    initialization of a replacement device this prevents the oneshot
    broadcast mask from being modified.
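
    The shape of that fix can be sketched in the same kind of userspace
    model. The sketch_* names and the from_periodic parameter name are
    assumptions made for illustration; the real function operates on
    clockevent devices and cpumasks, which are not modelled here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy state: one bit per CPU. */
struct sketch_broadcast {
	uint32_t periodic_mask;	/* CPUs with tick broadcasting enabled */
	uint32_t oneshot_mask;	/* CPUs waiting for a oneshot broadcast */
};

/*
 * Set up oneshot broadcast on a device.
 *
 * from_periodic == true:  initial switch from periodic to oneshot
 *                         broadcast; CPUs waiting for the periodic
 *                         broadcast must be carried over.
 * from_periodic == false: replacement of a device while already in
 *                         oneshot mode; the oneshot mask is left alone.
 */
static void sketch_setup_oneshot(struct sketch_broadcast *bc, bool from_periodic)
{
	if (from_periodic)
		bc->oneshot_mask |= bc->periodic_mask;
}
```

    Making the caller state the transition explicitly keeps the decision
    out of the setup function, which cannot tell the two cases apart from
    the masks alone.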
    
    There is a second problem with broadcast device replacement in this
    function. The broadcast device is only armed when the previous state of
    the device was periodic.
    
    That is correct for the switch from periodic broadcast mode to oneshot
    broadcast mode as the underlying broadcast device could operate in
    oneshot state already due to lack of periodic state in hardware. In that
    case it is already armed to expire at the next tick.
    
    For the replacement case this is wrong as the device is in shutdown
    state. That means that any already pending broadcast event will not be
    armed.
    
    This went unnoticed because any CPU which goes idle will observe that
    the broadcast device has an expiry time of KTIME_MAX and therefore any
    CPU's next timer event will be earlier and cause a reprogramming of the
    broadcast device. But that does not guarantee that the events of the
    CPUs which were already idle are delivered on time.
    
    Fix this by arming the newly installed device for an immediate event
    which will reevaluate the per CPU expiry times and reprogram the
    broadcast device accordingly. This is simpler than caching the last
    expiry time in yet another place or saving it before the device exchange
    and handing it down to the setup function. Replacement of broadcast
    devices is not a frequent operation and usually happens once somewhere
    late in the boot process.
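
    Why an immediate event is sufficient can be illustrated with a toy
    model of the broadcast handler. Everything named model_* is
    hypothetical, as are the fixed CPU count and the plain integer
    timestamps; the sketch only assumes what the text states, that the
    handler reevaluates the per CPU expiry times and reprograms the device.

```c
#include <stdint.h>

#define MODEL_NR_CPUS	4
#define MODEL_TIME_MAX	UINT64_MAX	/* stands in for "not armed" */

struct model_bc {
	uint32_t oneshot_mask;				/* CPUs waiting for broadcast */
	uint64_t cpu_next_event[MODEL_NR_CPUS];	/* per CPU expiry times */
	uint64_t programmed;				/* expiry armed in the device */
};

/* A replacement device starts out shut down; arm it to fire right away. */
static void model_install_replacement(struct model_bc *bc, uint64_t now)
{
	bc->programmed = now;
}

/*
 * Broadcast handler: wake every waiting CPU whose event has expired and
 * reprogram the device for the earliest event that is still pending.
 * Returns the mask of woken CPUs.
 */
static uint32_t model_handle_broadcast(struct model_bc *bc, uint64_t now)
{
	uint64_t earliest = MODEL_TIME_MAX;
	uint32_t woken = 0;

	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		uint32_t bit = UINT32_C(1) << cpu;

		if (!(bc->oneshot_mask & bit))
			continue;

		if (bc->cpu_next_event[cpu] <= now) {
			/* Expired: wake the CPU, it no longer waits. */
			woken |= bit;
			bc->oneshot_mask &= ~bit;
		} else if (bc->cpu_next_event[cpu] < earliest) {
			earliest = bc->cpu_next_event[cpu];
		}
	}

	bc->programmed = earliest;
	return woken;
}
```

    In this model no expiry time has to be saved across the device
    exchange: the first, immediate invocation of the handler recomputes it
    from the per CPU state, at the cost of one extra event per replacement.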
    
    Fixes: 9c336c99 ("tick/broadcast: Allow late registered device to enter oneshot mode")
    Reported-by: Victor Hassan <victor@allwinnertech.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
    Link: https://lore.kernel.org/r/87pm7d2z1i.ffs@tglx
tick-broadcast.c 32.6 KB