• Jian-Hong Pan's avatar
    PCI/MSI: Fix incorrect MSI-X masking on resume · e045fa29
    Jian-Hong Pan authored
    When a driver enables MSI-X, msix_program_entries() reads the MSI-X Vector
    Control register for each vector and saves it in desc->masked.  Each
    register is 32 bits and bit 0 is the actual Mask bit.
    
    When we restored these registers during resume, we previously set the Mask
    bit if *any* bit in desc->masked was set instead of when the Mask bit
    itself was set:
    
      pci_restore_state
        pci_restore_msi_state
          __pci_restore_msix_state
            for_each_pci_msi_entry
              msix_mask_irq(entry, entry->masked)   <-- entire u32 word
                __pci_msix_desc_mask_irq(desc, flag)
                  mask_bits = desc->masked & ~PCI_MSIX_ENTRY_CTRL_MASKBIT
                  if (flag)       <-- testing entire u32, not just bit 0
                    mask_bits |= PCI_MSIX_ENTRY_CTRL_MASKBIT
                  writel(mask_bits, desc_addr + PCI_MSIX_ENTRY_VECTOR_CTRL)
    
    This means that after resume, MSI-X vectors were masked when they shouldn't
    be, which leads to timeouts like this:
    
      nvme nvme0: I/O 978 QID 3 timeout, completion polled
    
    On resume, set the Mask bit only when the saved Mask bit from suspend was
    set.
    
    This should remove the need for 19ea025e ("nvme: Add quirk for Kingston
    NVME SSD running FW E8FK11.T").
    
    [bhelgaas: commit log, move fix to __pci_msix_desc_mask_irq()]
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=204887
    Link: https://lore.kernel.org/r/20191008034238.2503-1-jian-hong@endlessm.com
    Fixes: f2440d9a ("PCI MSI: Refactor interrupt masking code")
    Signed-off-by: default avatarJian-Hong Pan <jian-hong@endlessm.com>
    Signed-off-by: default avatarBjorn Helgaas <bhelgaas@google.com>
    Cc: stable@vger.kernel.org
    e045fa29
msi.c 39.1 KB