• Lennert Buytenhek's avatar
    ahci: add 43-bit DMA address quirk for ASMedia ASM1061 controllers · 20730e9b
    Lennert Buytenhek authored
    With one of the on-board ASM1061 AHCI controllers (1b21:0612) on an
    ASUSTeK Pro WS WRX80E-SAGE SE WIFI mainboard, a controller hang was
    observed that was immediately preceded by the following kernel
    messages:
    
    ahci 0000:28:00.0: Using 64-bit DMA addresses
    ahci 0000:28:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0035 address=0x7fffff00000 flags=0x0000]
    ahci 0000:28:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0035 address=0x7fffff00300 flags=0x0000]
    ahci 0000:28:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0035 address=0x7fffff00380 flags=0x0000]
    ahci 0000:28:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0035 address=0x7fffff00400 flags=0x0000]
    ahci 0000:28:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0035 address=0x7fffff00680 flags=0x0000]
    ahci 0000:28:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0035 address=0x7fffff00700 flags=0x0000]
    
    The first message is produced by code in drivers/iommu/dma-iommu.c
    which is accompanied by the following comment that seems to apply:
    
            /*
             * Try to use all the 32-bit PCI addresses first. The original SAC vs.
             * DAC reasoning loses relevance with PCIe, but enough hardware and
             * firmware bugs are still lurking out there that it's safest not to
             * venture into the 64-bit space until necessary.
             *
             * If your device goes wrong after seeing the notice then likely either
             * its driver is not setting DMA masks accurately, the hardware has
             * some inherent bug in handling >32-bit addresses, or not all the
             * expected address bits are wired up between the device and the IOMMU.
             */
    
    Asking the ASM1061 on a discrete PCIe card to DMA from I/O virtual
    address 0xffffffff00000000 produces the following I/O page faults:
    
    vfio-pci 0000:07:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0021 address=0x7ff00000000 flags=0x0010]
    vfio-pci 0000:07:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0021 address=0x7ff00000500 flags=0x0010]
    
    Note that the upper 21 bits of the logged DMA address are zero.  (When
    asking a different PCIe device in the same PCIe slot to DMA to the
    same I/O virtual address, we do see all the upper 32 bits of the DMA
    address as 1, so this is not an issue with the chipset or IOMMU
    configuration on the test system.)
    
    Also, hacking libahci to always set the upper 21 bits of all DMA
    addresses to 1 produces no discernible effect on the behavior of the
    ASM1061, and mkfs/mount/scrub/etc work as without this hack.
    
    This all strongly suggests that the ASM1061 has a 43 bit DMA address
    limit, and this commit therefore adds a quirk to deal with this limit.
    
    This issue probably applies to (some of) the other supported ASMedia
    parts as well, but we limit it to the PCI IDs known to refer to
    ASM1061 parts, as that's the only part we know for sure to be affected
    by this issue at this point.
    
    Link: https://lore.kernel.org/linux-ide/ZaZ2PIpEId-rl6jv@wantstofly.org/Signed-off-by: default avatarLennert Buytenhek <kernel@wantstofly.org>
    [cassel: drop date from error messages in commit log]
    Signed-off-by: default avatarNiklas Cassel <cassel@kernel.org>
    20730e9b
ahci.c 64.8 KB