• Johannes Thumshirn's avatar
    PCI: Set Read Completion Boundary to 128 iff Root Port supports it (_HPX) · e42010d8
    Johannes Thumshirn authored
    Per PCIe spec r3.0, sec 2.3.1.1, the Read Completion Boundary (RCB)
    determines the naturally aligned address boundaries on which a Read Request
    may be serviced with multiple Completions:
    
      - For a Root Complex, RCB is 64 bytes or 128 bytes
        This value is reported in the Link Control Register
    
        Note: Bridges and Endpoints may implement a corresponding command bit
        which may be set by system software to indicate the RCB value for the
        Root Complex, allowing the Bridge/Endpoint to optimize its behavior
        when the Root Complex’s RCB is 128 bytes.
    
      - For all other system elements, RCB is 128 bytes
    
    Per sec 7.8.7, if a Root Port only supports a 64-byte RCB, the RCB of all
    downstream devices must be clear, indicating an RCB of 64 bytes.  If the
    Root Port supports a 128-byte RCB, we may optionally set the RCB of
    downstream devices so they know they can generate larger Completions.
    
    Some BIOSes supply an _HPX that tells us to set RCB, even though the Root
    Port doesn't have RCB set, which may lead to Malformed TLP errors if the
    Endpoint generates completions larger than the Root Port can handle.
    
    The IBM x3850 X6 with BIOS version -[A8E120CUS-1.30]- 08/22/2016 supplies
    such an _HPX and a Mellanox MT27500 ConnectX-3 device fails to initialize:
    
      mlx4_core 0000:41:00.0: command 0xfff timed out (go bit not cleared)
      mlx4_core 0000:41:00.0: device is going to be reset
      mlx4_core 0000:41:00.0: Failed to obtain HW semaphore, aborting
      mlx4_core 0000:41:00.0: Fail to reset HCA
      ------------[ cut here ]------------
      kernel BUG at drivers/net/ethernet/mellanox/mlx4/catas.c:193!
    
    After 6cd33649 ("PCI: Add pci_configure_device() during enumeration")
    and 7a1562d4 ("PCI: Apply _HPX Link Control settings to all devices
    with a link"), we apply _HPX settings to *all* devices, not just those
    hot-added after boot.
    
    Before 7a1562d4, we didn't touch the Mellanox RCB, and the device
    worked.  After 7a1562d4, we set its RCB to 128, and it failed.
    
    Set the RCB to 128 iff the Root Port supports a 128-byte RCB.  Otherwise,
    set RCB to 64 bytes.  This effectively ignores what _HPX tells us about
    RCB.
    
    Note that this change only affects _HPX handling.  If we have no _HPX, this
    does nothing with RCB.
    
    [bhelgaas: changelog, clear RCB if not set for Root Port]
    Fixes: 6cd33649 ("PCI: Add pci_configure_device() during enumeration")
    Fixes: 7a1562d4 ("PCI: Apply _HPX Link Control settings to all devices with a link")
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=187781Tested-by: default avatarFrank Danapfel <fdanapfe@redhat.com>
    Signed-off-by: default avatarJohannes Thumshirn <jthumshirn@suse.de>
    Signed-off-by: default avatarBjorn Helgaas <bhelgaas@google.com>
    Acked-by: default avatarMyron Stowe <myron.stowe@redhat.com>
    CC: stable@vger.kernel.org	# v3.18+
    e42010d8
probe.c 63.7 KB