    fm10k: ensure completer aborts are marked as non-fatal after a resume · e330af78
    Jacob Keller authored
    VF drivers can trigger PCIe completer aborts any time they read a queue
    that they don't own. Even in nominal circumstances, it is not possible
    to prevent the VF driver from reading queues it doesn't own. A VF driver
    may attempt to read queues it previously owned but no longer owns due
    to a PF reset.
    
    Normally these completer aborts aren't an issue. However, on some
    platforms these trigger machine check errors. This is true even if we
    lower their severity from fatal to non-fatal. Indeed, we already have
    code for lowering the severity.
    
    We could attempt to mask these errors conditionally around resets, which
    is the most common time they would occur. However, this would essentially
    be a race between the PF and VF drivers, and we may still occasionally
    see machine check exceptions on these strictly configured platforms.
    
    Instead, mask the errors entirely any time we resume VFs. By doing so,
    we prevent the completer aborts from being sent to the parent PCIe
    device, and thus these strict platforms will not upgrade them into
    machine check errors.
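    
    A minimal userspace sketch of the masking step described above, assuming
    the mask is applied by setting the Completer Abort bit in the AER
    Uncorrectable Error Mask register. The two constants match the values in
    Linux's pci_regs.h; struct fake_aer_cap and the fake_*/mask_comp_abort
    helpers are hypothetical stand-ins for real PCI config space accessors and
    are not the driver's actual code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Values as defined in Linux's include/uapi/linux/pci_regs.h */
#define PCI_ERR_UNCOR_MASK     0x08       /* Uncorrectable Error Mask offset */
#define PCI_ERR_UNC_COMP_ABORT 0x00008000 /* Completer Abort bit */

/* Hypothetical stand-in for a device's AER extended capability registers */
struct fake_aer_cap {
	uint8_t regs[0x40];
};

/* Hypothetical accessor: read a 32-bit register at byte offset off */
static uint32_t fake_read_dword(const struct fake_aer_cap *cap, int off)
{
	uint32_t val;

	assert(cap && off >= 0 && off + 4 <= (int)sizeof(cap->regs));
	memcpy(&val, &cap->regs[off], sizeof(val));
	return val;
}

/* Hypothetical accessor: write a 32-bit register at byte offset off */
static void fake_write_dword(struct fake_aer_cap *cap, int off, uint32_t val)
{
	assert(cap && off >= 0 && off + 4 <= (int)sizeof(cap->regs));
	memcpy(&cap->regs[off], &val, sizeof(val));
}

/* Set the Completer Abort bit in the mask register with a
 * read-modify-write, leaving every other mask bit unchanged.
 */
static void mask_comp_abort(struct fake_aer_cap *cap)
{
	uint32_t err_mask = fake_read_dword(cap, PCI_ERR_UNCOR_MASK);

	err_mask |= PCI_ERR_UNC_COMP_ABORT;
	fake_write_dword(cap, PCI_ERR_UNCOR_MASK, err_mask);
}
```

    The read-modify-write keeps any mask bits that were already set, and
    calling the helper again on every VF resume is harmless because the
    operation is idempotent.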
    
    Additionally, we don't lose any information by masking these errors,
    because we still report VFs which attempt to access queues they do not
    own via the FUM_BAD_VF_QACCESS errors.
    
    Without this change, on platforms where completer aborts cause machine
    check exceptions, the VF reading queues it doesn't own could crash the
    host system. Masking the completer abort prevents this, so we should
    mask it for good, and not just around a PCIe reset. Otherwise malicious
    or misconfigured VFs could cause the host system to crash.
    
    Because we are masking the error entirely, there is little reason to
    also keep setting the severity bit, so that code is also removed.
    
    Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
    Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
fm10k_iov.c 17.4 KB