• Michael Ellerman's avatar
    powerpc/time: Always set decrementer in timer_interrupt() · d2b9be1f
    Michael Ellerman authored
    This is a partial revert of commit 0faf20a1 ("powerpc/64s/interrupt:
    Don't enable MSR[EE] in irq handlers unless perf is in use").
    
    Prior to that commit, we always set the decrementer in
    timer_interrupt(), to clear the timer interrupt. Otherwise we could end
    up continuously taking timer interrupts.
    
    When high res timers are enabled there is no problem seen with leaving
    the decrementer untouched in timer_interrupt(), because it will be
    programmed via hrtimer_interrupt() -> tick_program_event() ->
    clockevents_program_event() -> decrementer_set_next_event().
    
    However with CONFIG_HIGH_RES_TIMERS=n or booting with highres=off, we
    see a stall/lockup, because tick_nohz_handler() does not cause a
    reprogram of the decrementer, leading to endless timer interrupts.
    Example trace:
    
      [    1.898617][    T7] Freeing initrd memory: 2624K^M
      [   22.680919][    C1] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:^M
      [   22.682281][    C1] rcu:     0-....: (25 ticks this GP) idle=073/0/0x1 softirq=10/16 fqs=1050 ^M
      [   22.682851][    C1]  (detected by 1, t=2102 jiffies, g=-1179, q=476)^M
      [   22.683649][    C1] Sending NMI from CPU 1 to CPUs 0:^M
      [   22.685252][    C0] NMI backtrace for cpu 0^M
      [   22.685649][    C0] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.16.0-rc2-00185-g0faf20a1 #145^M
      [   22.686393][    C0] NIP:  c000000000016d64 LR: c000000000f6cca4 CTR: c00000000019c6e0^M
      [   22.686774][    C0] REGS: c000000002833590 TRAP: 0500   Not tainted  (5.16.0-rc2-00185-g0faf20a1)^M
      [   22.687222][    C0] MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 24000222  XER: 00000000^M
      [   22.688297][    C0] CFAR: c00000000000c854 IRQMASK: 0 ^M
      ...
      [   22.692637][    C0] NIP [c000000000016d64] arch_local_irq_restore+0x174/0x250^M
      [   22.694443][    C0] LR [c000000000f6cca4] __do_softirq+0xe4/0x3dc^M
      [   22.695762][    C0] Call Trace:^M
      [   22.696050][    C0] [c000000002833830] [c000000000f6cc80] __do_softirq+0xc0/0x3dc (unreliable)^M
      [   22.697377][    C0] [c000000002833920] [c000000000151508] __irq_exit_rcu+0xd8/0x130^M
      [   22.698739][    C0] [c000000002833950] [c000000000151730] irq_exit+0x20/0x40^M
      [   22.699938][    C0] [c000000002833970] [c000000000027f40] timer_interrupt+0x270/0x460^M
      [   22.701119][    C0] [c0000000028339d0] [c0000000000099a8] decrementer_common_virt+0x208/0x210^M
    
    Possibly this should be fixed in the lowres timing code, but that would
    be a generic change and could take some time and may not backport
    easily, so for now make the programming of the decrementer unconditional
    again in timer_interrupt() to avoid the stall/lockup.
    
    Fixes: 0faf20a1 ("powerpc/64s/interrupt: Don't enable MSR[EE] in irq handlers unless perf is in use")
    Reported-by: default avatarMiguel Ojeda <miguel.ojeda.sandonis@gmail.com>
    Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
    Reviewed-by: default avatarNicholas Piggin <npiggin@gmail.com>
    Link: https://lore.kernel.org/r/20220420141657.771442-1-mpe@ellerman.id.au
    d2b9be1f
time.c 28 KB