• Paolo Bonzini's avatar
    KVM: x86: Fix split-irqchip vs interrupt injection window request · 71cc849b
    Paolo Bonzini authored
    kvm_cpu_accept_dm_intr and kvm_vcpu_ready_for_interrupt_injection are
    a hodge-podge of conditions, hacked together to get something that
    more or less works.  But what is actually needed is much simpler;
    in both cases the fundamental question is, do we have a place to stash
    an interrupt if userspace does KVM_INTERRUPT?
    
    In userspace irqchip mode, that is !vcpu->arch.interrupt.injected.
    Currently kvm_event_needs_reinjection(vcpu) covers it, but it is
    unnecessarily restrictive.
    
    In split irqchip mode it's a bit more complicated, we need to check
    kvm_apic_accept_pic_intr(vcpu) (the IRQ window exit is basically an INTACK
    cycle and thus requires ExtINTs not to be masked) as well as
    !pending_userspace_extint(vcpu).  However, there is no need to
    check kvm_event_needs_reinjection(vcpu), since split irqchip keeps
    pending ExtINT state separate from event injection state, and checking
    kvm_cpu_has_interrupt(vcpu) is wrong too since ExtINT has higher
    priority than APIC interrupts.  In fact the latter fixes a bug:
    when userspace requests an IRQ window vmexit, an interrupt in the
    local APIC can cause kvm_cpu_has_interrupt() to be true and thus
    kvm_vcpu_ready_for_interrupt_injection() to return false.  When this
    happens, vcpu_run does not exit to userspace but the interrupt window
    vmexits keep occurring.  The VM loops without any hope of making progress.
    
    Once we try to fix these with something like
    
         return kvm_arch_interrupt_allowed(vcpu) &&
    -        !kvm_cpu_has_interrupt(vcpu) &&
    -        !kvm_event_needs_reinjection(vcpu) &&
    -        kvm_cpu_accept_dm_intr(vcpu);
    +        (!lapic_in_kernel(vcpu)
    +         ? !vcpu->arch.interrupt.injected
    +         : (kvm_apic_accept_pic_intr(vcpu)
    +            && !pending_userspace_extint(v)));
    
    we realize two things.  First, thanks to the previous patch the complex
    conditional can reuse !kvm_cpu_has_extint(vcpu).  Second, the interrupt
    window request in vcpu_enter_guest()
    
            bool req_int_win =
                    dm_request_for_irq_injection(vcpu) &&
                    kvm_cpu_accept_dm_intr(vcpu);
    
    should be kept in sync with kvm_vcpu_ready_for_interrupt_injection():
    it is unnecessary to ask the processor for an interrupt window
    if we would not be able to return to userspace.  Therefore,
    kvm_cpu_accept_dm_intr(vcpu) is basically !kvm_cpu_has_extint(vcpu)
    ANDed with the existing check for masked ExtINT.  It all makes sense:
    
    - we can accept an interrupt from userspace if there is a place
      to stash it (and, for irqchip split, ExtINTs are not masked).
      Interrupts from userspace _can_ be accepted even if right now
      EFLAGS.IF=0.
    
    - in order to tell userspace we will inject its interrupt ("IRQ
      window open" i.e. kvm_vcpu_ready_for_interrupt_injection), both
      KVM and the vCPU need to be ready to accept the interrupt.
    
    ... and this is what the patch implements.
    Reported-by: default avatarDavid Woodhouse <dwmw@amazon.co.uk>
    Analyzed-by: default avatarDavid Woodhouse <dwmw@amazon.co.uk>
    Cc: stable@vger.kernel.org
    Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
    Reviewed-by: default avatarNikos Tsironis <ntsironis@arrikto.com>
    Reviewed-by: default avatarDavid Woodhouse <dwmw@amazon.co.uk>
    Tested-by: default avatarDavid Woodhouse <dwmw@amazon.co.uk>
    71cc849b
x86.c 293 KB