    KVM: x86: Morph pending exceptions to pending VM-Exits at queue time · 7709aba8
    Sean Christopherson authored
    Morph pending exceptions to pending VM-Exits (due to interception) when
    the exception is queued instead of waiting until nested events are
    checked at VM-Entry.  This fixes a longstanding bug where KVM mishandles
    the case where an exception occurs during delivery of a previous exception,
    KVM (L0) and L1 both want to intercept the exception (e.g. #PF for shadow
    paging), and KVM determines that the exception is in the guest's domain,
    i.e. queues the new exception for L2.  Deferring the interception check
    causes KVM to escalate various combinations of injected+pending exceptions
    to double fault (#DF) without consulting L1's interception desires, and
    ends up injecting a spurious #DF into L2.
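The difference between deferring the interception check and doing it at queue time can be illustrated with a toy model.  This is a minimal sketch, not KVM's code: `struct toy_vcpu`, `toy_classify()` and `toy_queue_exception()` are hypothetical names, and the #DF rule is reduced to the basic benign/contributory/page-fault classes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { DE_VECTOR = 0, DF_VECTOR = 8, GP_VECTOR = 13, PF_VECTOR = 14 };
enum toy_class { TOY_BENIGN, TOY_CONTRIBUTORY, TOY_PF };

/* Hypothetical, heavily simplified vCPU state (not KVM's layout). */
struct toy_vcpu {
	bool in_l2;              /* vCPU is currently running L2 */
	uint32_t l1_intercepts;  /* bit n set => L1 intercepts vector n */
	bool injected;           /* an exception is being delivered */
	uint8_t injected_vector;
	bool pending;            /* exception queued for the current guest */
	uint8_t pending_vector;
	bool vmexit_pending;     /* exception morphed into a nested VM-Exit */
	uint8_t vmexit_vector;
};

/* Simplified exception classes used for the #DF decision. */
static enum toy_class toy_classify(uint8_t vector)
{
	switch (vector) {
	case 14:
		return TOY_PF;
	case 0: case 10: case 11: case 12: case 13:
		return TOY_CONTRIBUTORY;
	default:
		return TOY_BENIGN;
	}
}

/* Queue an exception, checking L1's intercepts *before* any #DF merging. */
static void toy_queue_exception(struct toy_vcpu *v, uint8_t vector)
{
	assert(vector < 32);

	/* L1 wants this exception: queue a VM-Exit instead, no escalation. */
	if (v->in_l2 && (v->l1_intercepts & (1u << vector))) {
		v->vmexit_pending = true;
		v->vmexit_vector = vector;
		return;
	}

	/* Otherwise the exception belongs to the current guest; merge to #DF. */
	if (v->injected) {
		enum toy_class c1 = toy_classify(v->injected_vector);
		enum toy_class c2 = toy_classify(vector);

		if ((c1 == TOY_CONTRIBUTORY && c2 == TOY_CONTRIBUTORY) ||
		    (c1 == TOY_PF && c2 != TOY_BENIGN)) {
			vector = DF_VECTOR;
			v->injected = false;
		}
	}
	v->pending = true;
	v->pending_vector = vector;
}
```

In this model, a #PF that hits while another #PF is being delivered becomes a pending VM-Exit if L1 intercepts #PF, and is escalated to #DF only when L1 does not intercept it.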
    
    KVM has fudged around the issue for #PF by special casing emulated #PF
    injection for shadow paging, but the underlying issue is not unique to
    shadow paging in L0, e.g. if KVM is intercepting #PF because the guest
    has a smaller maxphyaddr and L1 (but not L0) is using shadow paging.
    Other exceptions are affected as well, e.g. if KVM is intercepting #GP
    for one of SVM's workarounds or for the VMware backdoor emulation stuff.
    The other cases have gone unnoticed because the #DF is spurious if and
    only if L1 resolves the exception, e.g. KVM's goofs go unnoticed if L1
    would have injected #DF anyways.
    
    The hack-a-fix has also led to ugly code, e.g. bailing from the emulator
    if #PF injection forced a nested VM-Exit and the emulator finds itself
    back in L1.  Allowing for direct-to-VM-Exit queueing also neatly solves
    the async #PF in L2 mess; no need to set a magic flag and token, simply
    queue a #PF nested VM-Exit.
    
    Deal with event migration by flagging that a pending exception was queued
    by userspace and check for interception at the next KVM_RUN, e.g. so that
    KVM does the right thing regardless of the order in which userspace
    restores nested state vs. event state.
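The ordering problem can be sketched with a small model.  This is an illustration under stated assumptions, not KVM's code: `struct mig_vcpu` and the `mig_*()` helpers are hypothetical names standing in for the userspace "set events" and "set nested state" paths and for the start of KVM_RUN.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified vCPU state on the migration target. */
struct mig_vcpu {
	bool in_l2;
	uint32_t l1_intercepts;        /* bit n set => L1 intercepts vector n */
	bool pending;
	uint8_t pending_vector;
	bool vmexit_pending;
	uint8_t vmexit_vector;
	bool exception_from_userspace; /* interception check still owed */
};

/* Userspace restores a pending exception; nested state may not exist yet. */
static void mig_set_events(struct mig_vcpu *v, uint8_t vector)
{
	v->pending = true;
	v->pending_vector = vector;
	v->exception_from_userspace = true;
}

/* Userspace restores nested state (L2 active, L1's intercepts). */
static void mig_set_nested_state(struct mig_vcpu *v, bool in_l2,
				 uint32_t l1_intercepts)
{
	v->in_l2 = in_l2;
	v->l1_intercepts = l1_intercepts;
}

/* Start of the next run: both pieces of state are in place, check now. */
static void mig_run_prologue(struct mig_vcpu *v)
{
	if (!v->exception_from_userspace)
		return;
	v->exception_from_userspace = false;

	if (v->pending && v->in_l2 && v->pending_vector < 32 &&
	    (v->l1_intercepts & (1u << v->pending_vector))) {
		v->pending = false;
		v->vmexit_pending = true;
		v->vmexit_vector = v->pending_vector;
	}
}
```

Because the check runs only once both restores have completed, the model reaches the same state whether events or nested state are restored first.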
    
    When "getting" events from userspace, simply drop any pending exception
    that is destined to be intercepted if there is also an injected exception
    to be migrated.  Ideally, KVM would migrate both events, but that would
    require new ABI, and practically speaking losing the event is unlikely to
    be noticed, let alone fatal.  The injected exception is captured, RIP
    still points at the original faulting instruction, etc...  So either the
    injection on the target will trigger the same intercepted exception, or
    the source of the intercepted exception was transient and/or
    non-deterministic, thus dropping it is ok-ish.
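The save-side policy can be sketched as follows.  This is a simplified model rather than the actual ioctl handler: `struct save_vcpu`, `struct save_events` and `save_get_events()` are hypothetical names, and the single-exception limit of the existing ABI is modeled by `struct save_events` having room for only one vector.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical source-side state: a regular slot plus a VM-Exit slot. */
struct save_vcpu {
	bool injected;       /* regular slot: exception being delivered */
	bool pending;        /* regular slot: exception queued, not delivered */
	uint8_t vector;      /* vector of the regular slot */
	bool vmexit_pending; /* exception destined to be intercepted by L1 */
	uint8_t vmexit_vector;
};

/* What userspace can see: one exception only. */
struct save_events {
	bool injected;
	bool pending;
	uint8_t vector;
};

static struct save_events save_get_events(const struct save_vcpu *v)
{
	struct save_events out = { false, false, 0 };

	/* The to-be-intercepted exception is reported only if it is alone. */
	if (v->vmexit_pending && !v->injected && !v->pending) {
		out.pending = true;
		out.vector = v->vmexit_vector;
		return out;
	}

	/* Otherwise the regular slot wins and the intercepted one is dropped. */
	out.injected = v->injected;
	out.pending = v->pending;
	out.vector = v->vector;
	return out;
}
```

With an injected #PF and a to-be-intercepted #GP both present, this model reports only the injected #PF; re-executing the delivery on the target is expected to regenerate the dropped exception if its cause was not transient.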
    
    Fixes: a04aead1 ("KVM: nSVM: fix running nested guests when npt=0")
    Fixes: feaf0c7d ("KVM: nVMX: Do not generate #DF if #PF happens during exception delivery into L2")
    Cc: Jim Mattson <jmattson@google.com>
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
    Link: https://lore.kernel.org/r/20220830231614.3580124-22-seanjc@google.com
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>