• Paul Mackerras's avatar
    powerpc/powernv: Handle irq_happened flag correctly in off-line loop · 53c656c4
    Paul Mackerras authored
    This fixes a bug where it is possible for an off-line CPU to fail to go
    into a low-power state (nap/sleep/winkle), and to become unresponsive to
    requests from the KVM subsystem to wake up and run a VCPU. What can
    happen is that a maskable interrupt of some kind (external, decrementer,
    hypervisor doorbell, or HMI) after we have called local_irq_disable() at
    the beginning of pnv_smp_cpu_kill_self() and before interrupts are
    hard-disabled inside power7_nap/sleep/winkle(). In this situation, the
    pending event is marked in the irq_happened flag in the PACA. This
    pending event prevents power7_nap/sleep/winkle from going to the
    requested low-power state; instead they return immediately. We don't
    deal with any of these pending event flags in the off-line loop in
    pnv_smp_cpu_kill_self() because power7_nap et al. return 0 in this case,
    so we will have srr1 == 0, and none of the processing to clear
    interrupts or doorbells will be done.
    
    Usually, the most obvious symptom of this is that a KVM guest will fail
    with a console message saying "KVM: couldn't grab cpu N".
    
    This fixes the problem by making sure we handle the irq_happened flags
    properly. First, we hard-disable before the off-line loop. Once we have
    hard-disabled, the irq_happened flags can't change underneath us. We
    unconditionally clear the DEC and HMI flags: there is no processing of
    timer interrupts while off-line, and the necessary HMI processing is all
    done in lower-level code. We leave the EE and DBELL flags alone for the
    first iteration of the loop, so that we won't fail to respond to a
    split-core request that came in just before hard-disabling. Within the
    loop, we handle external interrupts if the EE bit is set in irq_happened
    as well as if the low-power state was interrupted by an external
    interrupt. (We don't need to do the msgclr for a pending doorbell in
    irq_happened, because doorbells are edge-triggered and don't remain
    pending in hardware.) Then we clear both the EE and DBELL flags, and
    once clear, they cannot be set again (until this CPU comes online again,
    that is).
    
    This also fixes the debug check to not be done when we just ran a KVM
    guest or when the sleep didn't happen because of a pending event in
    irq_happened.
    Signed-off-by: default avatarPaul Mackerras <paulus@samba.org>
    Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
    53c656c4
smp.c 7.17 KB