    PCI: Extend ACS configurability · 47c8846a
    Vidya Sagar authored
    PCIe ACS settings control the level of isolation and the possible P2P paths
    between devices. With greater isolation the kernel will create smaller
    iommu_groups and with less isolation there is more HW that can achieve P2P
    transfers. From a virtualization perspective all devices in the same
    iommu_group must be assigned to the same VM as they lack security
    isolation.
    
    There is no way for the kernel to automatically know the correct ACS
    settings for any given system and workload. Existing command line options
    (e.g., disable_acs_redir) allow only large-scale changes that disable all
    isolation, which is not sufficient for more complex cases.
    
    Add a kernel command-line option 'config_acs' to directly control all the
    ACS bits for specific devices, which allows the operator to set up the right
    level of isolation to achieve the desired P2P configuration. The
    definition is future-proof; when new ACS bits are added to the spec the
    open syntax can be extended.
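
As a rough illustration of how such a per-bit flag string might be read, the sketch below decodes one. It assumes the syntax documented with this option in kernel-parameters.txt: a string of '0' (force disable), '1' (force enable) and 'x' (leave unchanged), with the rightmost character being bit 0, optionally followed by '@' and a device. The helper names `decode_acs_flags` and `split_entry` are hypothetical and are not part of the kernel.

```python
# Bit order follows the ACS Control register layout (bit 0 first).
ACS_BITS = [
    "Source Validation",        # bit 0
    "Translation Blocking",     # bit 1
    "P2P Request Redirect",     # bit 2
    "P2P Completion Redirect",  # bit 3
    "Upstream Forwarding",      # bit 4
    "P2P Egress Control",       # bit 5
    "Direct Translated P2P",    # bit 6
]

ACTIONS = {"1": "enable", "0": "disable", "x": "unchanged"}


def decode_acs_flags(flags):
    """Return {ACS bit name: action} for a flag string such as '10x'.

    Assumption: the rightmost character is bit 0, and bits that are
    not mentioned at all are left unchanged.
    """
    if len(flags) > len(ACS_BITS):
        raise ValueError("more flag characters than known ACS bits")
    result = {name: "unchanged" for name in ACS_BITS}
    for bit, ch in enumerate(reversed(flags)):
        if ch not in ACTIONS:
            raise ValueError(f"invalid flag character: {ch!r}")
        result[ACS_BITS[bit]] = ACTIONS[ch]
    return result


def split_entry(entry):
    """Split '<flags>@<pci_dev>' into (flags, device).

    An entry without '@' yields an empty device string, taken here to
    mean "every device that supports ACS".
    """
    flags, _, device = entry.partition("@")
    return flags, device
```

Under these assumptions, `decode_acs_flags("10x")` reports P2P Request Redirect as enabled, Translation Blocking as disabled, and Source Validation as unchanged.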
    
    ACS needs to be set up early in the kernel boot as the ACS settings affect
    how iommu_groups are formed. iommu_group formation is a one-time event
    during initial device discovery, so changing ACS bits after kernel boot can
    result in an inaccurate view of the iommu_groups compared to the current
    isolation configuration.
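
To check which groups the kernel actually formed at boot, the sysfs hierarchy under /sys/kernel/iommu_groups can be inspected. The following is a minimal sketch, not part of this change; the function name `list_iommu_groups` is hypothetical, and the `root` parameter exists only so the function can be pointed at a different directory.

```python
import os


def list_iommu_groups(root="/sys/kernel/iommu_groups"):
    """Return {group number: sorted device names} read from sysfs.

    Each group is a numbered directory containing a 'devices'
    subdirectory with one entry per member device. An absent root
    (for example, no IOMMU enabled) yields an empty dict.
    """
    groups = {}
    if not os.path.isdir(root):
        return groups
    for name in os.listdir(root):
        dev_dir = os.path.join(root, name, "devices")
        if name.isdigit() and os.path.isdir(dev_dir):
            groups[int(name)] = sorted(os.listdir(dev_dir))
    return groups
```

Comparing the output before and after booting with a different ACS configuration shows how the chosen isolation level changes group membership.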
    
    ACS applies to PCIe Downstream Ports and multi-function devices.  The
    default ACS settings are strict and deny any direct traffic between two
    functions. This results in the smallest iommu_group the HW can support.
    Frequently these values result in slow or non-working P2PDMA.
    
    ACS offers a range of security choices controlling how traffic is
    allowed to go directly between two devices. Some popular choices:
    
      - Full prevention
    
      - Translated requests can be direct, with various options
    
      - Asymmetric direct traffic, A can reach B but not the reverse
    
      - All traffic can be direct
    
    Along with some other less common ones for special topologies.
    
    The intention is that this option would be used with expert knowledge of
    the HW capability and workload to achieve the desired configuration.
    
    Link: https://lore.kernel.org/r/20240625153150.159310-1-vidyas@nvidia.com
    Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
    [bhelgaas: add example, tidy printk formats]
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
kernel-parameters.txt 270 KB