- 19 Jul, 2024 9 commits
-
-
Bjorn Helgaas authored
- Add "apb", "sys", "pmc", "msg", "err" for Endpoint descriptions as well as for Root Complexes (Niklas Cassel) - Add "tx_inta", "tx_intb", "tx_intc", "tx_intd" for interrupt signals triggered in response to PCIe Assert_INTx messages (Niklas Cassel) - Refactor rockchip-dw-pcie binding to move generic properties to a new rockchip-dw-pcie-common binding that can be shared by both RC and EP mode (Niklas Cassel) - Fix rockchip-dw-pcie description of INTx signals (Niklas Cassel) - Add rockchip-dw-pcie description of Endpoint controller (Niklas Cassel) - Avoid xilinx-versal-cpm overlapping of bridge registers and 32-bit BAR addresses (Thippeswamy Havalige) - Add qcom Operating Performance Points (OPP) table (Krishna chaitanya chundru) - Add a picture of mediatek,mt7621-pcie topology (Sergio Paracuellos) - Add a generic "ats-supported" property so the OS can discover whether a Root Complex supports ATS (Jean-Philippe Brucker) - Make the qcom,pcie-x1e80100 MHI register region mandatory (Abel Vesa) * pci/dt-bindings: dt-bindings: PCI: qcom: x1e80100: Make the MHI reg region mandatory dt-bindings: PCI: generic: Add ats-supported property dt-bindings: PCI: mediatek,mt7621-pcie: Add PCIe host topology ASCII graph dt-bindings: PCI: qcom: Add OPP table dt-bindings: PCI: xilinx-cpm: Fix overlapping of bridge register and 32-bit BAR addresses dt-bindings: PCI: rockchip: Add DesignWare based PCIe Endpoint controller dt-bindings: PCI: rockchip-dw-pcie: Fix description of legacy IRQ dt-bindings: PCI: rockchip-dw-pcie: Prepare for Endpoint mode support dt-bindings: PCI: snps,dw-pcie-ep: Add tx_int{a,b,c,d} legacy IRQs dt-bindings: PCI: snps,dw-pcie-ep: Add vendor specific interrupt-names dt-bindings: PCI: snps,dw-pcie-ep: Add vendor specific reg-name
-
Bjorn Helgaas authored
- Rename find_resource() to find_resource_space() to make it more descriptive for exporting outside resource.c (Ilpo Järvinen) - Document find_resource_space() and the resource_constraint struct it uses (Ilpo Järvinen) - Add typedef resource_alignf to make it simpler to declare allocation constraint alignf callbacks (Ilpo Järvinen) - Open-code the no-constraint simple alignment case to make the simple_align_resource() default callback unnecessary (Ilpo Järvinen) - Export find_resource_space() because PCI bridge window allocation needs to learn whether there's space for a window (Ilpo Järvinen) - Fix a double-counting problem in PCI calculate_memsize() that led to allocating larger windows each time a bus was removed and rescanned (Ilpo Järvinen) - When we don't have space to allocate larger bridge windows, allocate windows only large enough for the downstream devices to prevent cases where a device worked originally, but not after being removed and re-added (Ilpo Järvinen) * pci/resource: PCI: Relax bridge window tail sizing rules PCI: Make minimum bridge window alignment reference more obvious PCI: Fix resource double counting on remove & rescan resource: Export find_resource_space() resource: Handle simple alignment inside __find_resource_space() resource: Use typedef for alignf callback resource: Document find_resource_space() and resource_constraint resource: Rename find_resource() to find_resource_space()
-
Bjorn Helgaas authored
- Warn about doing a Secondary Bus Reset without holding the device lock (Dan Williams) - Lock bridge in addition to downstream hierarchy before doing a Secondary Bus Reset (Dan Williams) * pci/reset: PCI: Add missing bridge lock to pci_bus_lock() PCI: Warn on missing cfg_access_lock during secondary bus reset
-
Bjorn Helgaas authored
- Detect if a device was removed or replaced during system sleep so we don't assume a new device is the one that used to be there. This uses Vendor/Device/Subsystem/Class/Revision and Device Serial Number (if implemented), so it's not fool-proof and drivers may know how to detect more cases (Lukas Wunner) - Add missing MODULE_DESCRIPTION() macro (Jeff Johnson) * pci/hotplug: PCI: acpiphp: Add missing MODULE_DESCRIPTION() macro PCI: pciehp: Detect device replacement during system sleep
-
Bjorn Helgaas authored
- Disable AER and DPC during suspend so that if they share an interrupt with PME and errors occur during suspend, the AER or DPC interrupt doesn't cause spurious wakeups (Kai-Heng Feng) * pci/err: PCI/DPC: Disable DPC service on suspend PCI/AER: Disable AER service on suspend
-
Bjorn Helgaas authored
- Move the PRESERVE_BOOT_CONFIG ACPI _DSM evaluation from drivers/acpi to drivers/pci so we can unify with similar DT functionality (Vidya Sagar) - Add of_pci_preserve_config() to check for a DT "linux,pci-probe-only" property on a per-host bridge basis in addition to a global basis (Vidya Sagar) - Unify ACPI PRESERVE_BOOT_CONFIG _DSM and DT "linux,pci-probe-only" in a generic pci_preserve_config() path (Vidya Sagar) * pci/enumeration: PCI: Use preserve_config in place of pci_flags PCI: Unify ACPI and DT 'preserve config' support PCI: of: Add of_pci_preserve_config() for per-host bridge support PCI: Move PRESERVE_BOOT_CONFIG _DSM evaluation to pci_register_host_bridge()
-
Bjorn Helgaas authored
- If there's a device below a bridge, prevent a use-after-free by holding a reference to the device while waiting for the secondary bus to be ready in case the device is concurrently removed, e.g., by DPC (Lukas Wunner) * pci/dpc: PCI/DPC: Fix use-after-free on concurrent DPC and hot-removal
-
Bjorn Helgaas authored
- Add pcim_add_mapping_to_legacy_table() and pcim_remove_mapping_from_legacy_table() helper functions to simplify devres iomap table (Philipp Stanner) - Reimplement devres that take a bit mask of BARs in a way that can be used to map partial BARs as well as entire BARs (Philipp Stanner) - Deprecate pcim_iomap_table() and pcim_iomap_regions_request_all() in favor of pcim_* request plus pcim_* mapping (Philipp Stanner) - Add pcim_request_region(), a managed interface to request a single BAR (Philipp Stanner) - Use the existing pci_is_enabled() interface to replace the struct devres.enabled bit (Philipp Stanner) - Move the struct pci_devres.pinned bit to struct pci_dev (Philipp Stanner) - Reimplement pcim_set_mwi() so it uses its own devres cleanup callback instead of a special-purpose bit in struct pci_devres (Philipp Stanner) - Add pcim_intx(), which is unambiguously managed, unlike pci_intx(), which is managed if pcim_enable_device() has been called but unmanaged otherwise (Philipp Stanner) - Remove pcim_release(), which is no longer needed after previous cleanups of pcim_set_mwi() and pci_intx() (Philipp Stanner) - Add pcim_iomap_range(), a managed interface to map part of a BAR (Philipp Stanner) - Fix vboxvideo leak by using the new pcim_iomap_range() instead of the unmanaged pci_iomap_range() (Philipp Stanner) * pci/devres: drm/vboxvideo: fix mapping leaks PCI: Add managed pcim_iomap_range() PCI: Remove legacy pcim_release() PCI: Add managed pcim_intx() PCI: Give pcim_set_mwi() its own devres cleanup callback PCI: Move struct pci_devres.pinned bit to struct pci_dev PCI: Remove struct pci_devres.enabled status bit PCI: Document hybrid devres hazards PCI: Add managed pcim_request_region() PCI: Deprecate pcim_iomap_table(), pcim_iomap_regions_request_all() PCI: Add managed partial-BAR request and map infrastructure PCI: Add devres helpers for iomap table PCI: Add and use devres helper for bit masks
-
Bjorn Helgaas authored
- Add ACS quirk for Broadcom BCM5760X NIC, which doesn't allow peer-to-peer transactions between functions, but doesn't advertise ACS support (Ajit Khaparde) - Add "pci=config_acs=" kernel command-line parameter to relax default ACS settings to enable peer-to-peer configurations. Requires expert knowledge of topology and ACS operation (Vidya Sagar) * pci/acs: PCI: Extend ACS configurability PCI: Add ACS quirk for Broadcom BCM5760X NIC
-
- 12 Jul, 2024 2 commits
-
-
Vidya Sagar authored
PCIe ACS settings control the level of isolation and the possible P2P paths between devices. With greater isolation the kernel will create smaller iommu_groups and with less isolation there is more HW that can achieve P2P transfers. From a virtualization perspective all devices in the same iommu_group must be assigned to the same VM as they lack security isolation. There is no way for the kernel to automatically know the correct ACS settings for any given system and workload. Existing command line options (e.g., disable_acs_redir) allow only for large scale change, disabling all isolation, but this is not sufficient for more complex cases. Add a kernel command-line option 'config_acs' to directly control all the ACS bits for specific devices, which allows the operator to setup the right level of isolation to achieve the desired P2P configuration. The definition is future proof; when new ACS bits are added to the spec the open syntax can be extended. ACS needs to be setup early in the kernel boot as the ACS settings affect how iommu_groups are formed. iommu_group formation is a one time event during initial device discovery, so changing ACS bits after kernel boot can result in an inaccurate view of the iommu_groups compared to the current isolation configuration. ACS applies to PCIe Downstream Ports and multi-function devices. The default ACS settings are strict and deny any direct traffic between two functions. This results in the smallest iommu_group the HW can support. Frequently these values result in slow or non-working P2PDMA. ACS offers a range of security choices controlling how traffic is allowed to go directly between two devices. Some popular choices: - Full prevention - Translated requests can be direct, with various options - Asymmetric direct traffic, A can reach B but not the reverse - All traffic can be direct Along with some other less common ones for special topologies. The intention is that this option would be used with expert knowledge of the HW capability and workload to achieve the desired configuration. Link: https://lore.kernel.org/r/20240625153150.159310-1-vidyas@nvidia.comSigned-off-by: Vidya Sagar <vidyas@nvidia.com> [bhelgaas: add example, tidy printk formats] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
-
Dan Williams authored
One of the true positives that the cfg_access_lock lockdep effort identified is this sequence: WARNING: CPU: 14 PID: 1 at drivers/pci/pci.c:4886 pci_bridge_secondary_bus_reset+0x5d/0x70 RIP: 0010:pci_bridge_secondary_bus_reset+0x5d/0x70 Call Trace: <TASK> ? __warn+0x8c/0x190 ? pci_bridge_secondary_bus_reset+0x5d/0x70 ? report_bug+0x1f8/0x200 ? handle_bug+0x3c/0x70 ? exc_invalid_op+0x18/0x70 ? asm_exc_invalid_op+0x1a/0x20 ? pci_bridge_secondary_bus_reset+0x5d/0x70 pci_reset_bus+0x1d8/0x270 vmd_probe+0x778/0xa10 pci_device_probe+0x95/0x120 Where pci_reset_bus() users are triggering unlocked secondary bus resets. Ironically pci_bus_reset(), several calls down from pci_reset_bus(), uses pci_bus_lock() before issuing the reset which locks everything *but* the bridge itself. For the same motivation as adding: bridge = pci_upstream_bridge(dev); if (bridge) pci_dev_lock(bridge); to pci_reset_function() for the "bus" and "cxl_bus" reset cases, add pci_dev_lock() for @bus->self to pci_bus_lock(). Link: https://lore.kernel.org/r/171711747501.1628941.15217746952476635316.stgit@dwillia2-xfh.jf.intel.comReported-by: Imre Deak <imre.deak@intel.com> Closes: http://lore.kernel.org/r/6657833b3b5ae_14984b29437@dwillia2-xfh.jf.intel.com.notmuchSigned-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Keith Busch <kbusch@kernel.org> [bhelgaas: squash in recursive locking deadlock fix from Keith Busch: https://lore.kernel.org/r/20240711193650.701834-1-kbusch@meta.com] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Hans de Goede <hdegoede@redhat.com> Tested-by: Kalle Valo <kvalo@kernel.org> Reviewed-by: Dave Jiang <dave.jiang@intel.com>
-
- 11 Jul, 2024 4 commits
-
-
Philipp Stanner authored
When the PCI devres API was introduced to this driver, it was wrongly assumed that initializing the device with pcim_enable_device() instead of pci_enable_device() will make all PCI functions managed. This is wrong and was caused by the quite confusing PCI devres API in which some, but not all, functions become managed that way. The function pci_iomap_range() is never managed. Replace pci_iomap_range() with the managed function pcim_iomap_range(). Fixes: 8558de40 ("drm/vboxvideo: use managed pci functions") Link: https://lore.kernel.org/r/20240613115032.29098-14-pstanner@redhat.comSigned-off-by: Philipp Stanner <pstanner@redhat.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com>
-
Philipp Stanner authored
The only managed mapping function currently is pcim_iomap() which doesn't allow for mapping an area starting at a certain offset, which many drivers want. Add pcim_iomap_range() as an exported function. Link: https://lore.kernel.org/r/20240613115032.29098-13-pstanner@redhat.comSigned-off-by: Philipp Stanner <pstanner@redhat.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
-
Philipp Stanner authored
Thanks to preceding cleanup steps, pcim_release() is now not needed anymore and can be replaced by pcim_disable_device(), which is the exact counterpart to pcim_enable_device(). This permits removing further parts of the old PCI devres implementation. Replace pcim_release() with pcim_disable_device(). Remove the now unused function get_pci_dr(). Remove the struct pci_devres from pci.h. Link: https://lore.kernel.org/r/20240613115032.29098-12-pstanner@redhat.comSigned-off-by: Philipp Stanner <pstanner@redhat.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
-
Philipp Stanner authored
pci_intx() is a "hybrid" function, i.e., it is managed if pcim_enable_device() has been called, but unmanaged otherwise. Add pcim_intx(), which is always managed, and implement pci_intx() using it. Remove the now-unused struct pci_devres.orig_intx and .restore_intx and find_pci_dr(). Link: https://lore.kernel.org/r/20240613115032.29098-11-pstanner@redhat.comSigned-off-by: Philipp Stanner <pstanner@redhat.com> [kwilczynski: squashed in https://lore.kernel.org/r/426645d40776198e0fcc942f4a6cac4433c7a9aa.camel@redhat.com to fix problem reported and tested by Ashish Kalra <Ashish.Kalra@amd.com>: https://lore.kernel.org/r/20240708214656.4721-1-Ashish.Kalra@amd.com https://lore.kernel.org/r/8c4634e9-4f02-4c54-9c89-d75e2f4bf026@amd.com/] Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> [bhelgaas: commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
-
- 10 Jul, 2024 9 commits
-
-
Philipp Stanner authored
Managing pci_set_mwi() with devres can easily be done with its own callback, without the necessity to store any state about it in a device-related struct. Remove the MWI state from struct pci_devres. Give pcim_set_mwi() a separate devres cleanup callback. Link: https://lore.kernel.org/r/20240613115032.29098-10-pstanner@redhat.comSigned-off-by: Philipp Stanner <pstanner@redhat.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
-
Philipp Stanner authored
The bit describing whether the PCI device is currently pinned is stored in struct pci_devres. To clean up and simplify the PCI devres API, it's better if this information is stored in struct pci_dev. This will later permit simplifying pcim_enable_device(). Move the 'pinned' boolean bit to struct pci_dev. Restructure bits in struct pci_dev so the pm / pme fields are next to each other. Link: https://lore.kernel.org/r/20240613115032.29098-9-pstanner@redhat.comSigned-off-by: Philipp Stanner <pstanner@redhat.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
-
Philipp Stanner authored
The struct pci_devres has a separate boolean to track whether a device is enabled. That, however, can easily be tracked in an agnostic manner through the function pci_is_enabled(). Using it allows for simplifying the PCI devres implementation. Replace the separate 'enabled' status bit from struct pci_devres with calls to pci_is_enabled() at the appropriate places. Link: https://lore.kernel.org/r/20240613115032.29098-8-pstanner@redhat.comSigned-off-by: Philipp Stanner <pstanner@redhat.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
-
Philipp Stanner authored
These functions: pci_request_region() pci_request_regions() pci_request_regions_exclusive() pci_request_selected_regions() pci_request_selected_regions_exclusive() pci_intx() are "hybrid" functions that are managed if pcim_enable_device() has been called, but unmanaged otherwise. This is confusing and has already caused a bug (in 8558de40 ("drm/vboxvideo: use managed pci functions")) because users believe all PCI functions, such as pci_iomap_range(), can become managed that way, which is not the case. Add comments to the relevant functions' docstrings that warn users about this behavior. Link: https://lore.kernel.org/r/20240613115032.29098-7-pstanner@redhat.comSigned-off-by: Philipp Stanner <pstanner@redhat.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> [bhelgaas: commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
-
Philipp Stanner authored
These existing functions: pci_request_region() pci_request_selected_regions() pci_request_selected_regions_exclusive() are "hybrid" functions built on __pci_request_region() and are managed if pcim_enable_device() has been called, but unmanaged otherwise. Add these new functions: pcim_request_region() pcim_request_region_exclusive() These are *always* managed and use the new pcim_addr_devres tracking infrastructure instead of find_pci_dr() and struct pci_devres.region_mask. Implement the hybrid functions using the new "pure" functions and remove struct pci_devres.region_mask, which is no longer needed. Link: https://lore.kernel.org/r/20240613115032.29098-6-pstanner@redhat.comSigned-off-by: Philipp Stanner <pstanner@redhat.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> [bhelgaas: commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
-
Philipp Stanner authored
Deprecate pcim_iomap_table(). It returns a pointer to a table of ioremapped BARs, or NULL if it fails. This makes uses like this: addr = pcim_iomap_table(pdev)[0]; problematic because it causes a NULL pointer dereference on failure. Callers should use pcim_iomap() instead. Deprecate pcim_iomap_regions_request_all() because it is built on __pci_request_region() and is managed if pcim_enable_device() has been called, but unmanaged otherwise, which is prone to errors. Callers should either use pcim_iomap_regions() to request and map BARs, or use pcim_request_region() followed by pcim_iomap(). Link: https://lore.kernel.org/r/20240613115032.29098-5-pstanner@redhat.comSigned-off-by: Philipp Stanner <pstanner@redhat.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> [bhelgaas: commit log, sphinx markup] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
-
Philipp Stanner authored
The pcim_iomap_devres table tracks entire-BAR mappings, so we can't use it to build a managed version of pci_iomap_range(), which maps partial BARs. Add struct pcim_addr_devres, which can track request and mapping of both entire BARs and partial BARs. Add the following internal devres functions based on struct pcim_addr_devres: pcim_iomap_region() # request & map entire BAR pcim_iounmap_region() # unmap & release entire BAR pcim_request_region() # request entire BAR pcim_release_region() # release entire BAR pcim_request_all_regions() # request all entire BARs pcim_release_all_regions() # release all entire BARs Rework the following public interfaces using the new infrastructure listed above: pcim_iomap() # map partial BAR pcim_iounmap() # unmap partial BAR pcim_iomap_regions() # request & map specified BARs pcim_iomap_regions_request_all() # request all BARs, map specified BARs pcim_iounmap_regions() # unmap & release specified BARs Link: https://lore.kernel.org/r/20240613115032.29098-4-pstanner@redhat.comSigned-off-by: Philipp Stanner <pstanner@redhat.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> [bhelgaas: commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
-
Philipp Stanner authored
The pcim_iomap_devres.table administrated by pcim_iomap_table() has its entries set and unset at several places throughout devres.c using manual iterations which are effectively code duplications. Add pcim_add_mapping_to_legacy_table() and pcim_remove_mapping_from_legacy_table() helper functions and use them where possible. Link: https://lore.kernel.org/r/20240613115032.29098-3-pstanner@redhat.comSigned-off-by: Philipp Stanner <pstanner@redhat.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
-
Philipp Stanner authored
The current devres implementation uses manual shift operations to check whether a bit in a mask is set. The code can be made more readable by writing a small helper function for that. Implement mask_contains_bar() and use it where applicable. Link: https://lore.kernel.org/r/20240613115032.29098-2-pstanner@redhat.comSigned-off-by: Philipp Stanner <pstanner@redhat.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
-
- 04 Jul, 2024 6 commits
-
-
Abel Vesa authored
All PCIe controllers found on X1E80100 have MHI register region. So change the schema to reflect that. Fixes: 692eadd5 ("dt-bindings: PCI: qcom: Document the X1E80100 PCIe Controller") Link: https://lore.kernel.org/linux-pci/20240605-x1e80100-pci-bindings-fix-v2-1-c465e87966fc@linaro.orgSigned-off-by: Abel Vesa <abel.vesa@linaro.org> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
-
Jean-Philippe Brucker authored
Add a way for firmware to tell the OS that ATS is supported by the PCI root complex. An endpoint with ATS enabled may send Translation Requests and Translated Memory Requests, which look just like Normal Memory Requests with a non-zero AT field. So a root controller that ignores the AT field may simply forward the request to the IOMMU as a Normal Memory Request, which could end badly. In any case, the endpoint will be unusable. The ats-supported property allows the OS to only enable ATS in endpoints if the root controller can handle ATS requests. Only add the property to pcie-host-ecam-generic for the moment. For non-generic root controllers, availability of ATS can be inferred from the compatible string. Link: https://lore.kernel.org/linux-pci/20240607105415.2501934-3-jean-philippe@linaro.orgSigned-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> Reviewed-by: Rob Herring <robh@kernel.org> Reviewed-by: Robin Murphy <robin.murphy@arm.com>
-
Sergio Paracuellos authored
MediaTek MT7621 PCIe sub-system supports a single Root Complex (RC) with 3 Root Ports. Add PCIe host topology ASCII graph to the binding for completeness. Suggested-by: Krzysztof Kozlowski <krzk@kernel.org> Link: https://lore.kernel.org/linux-pci/20240522044321.3205160-1-sergio.paracuellos@gmail.comSigned-off-by: Sergio Paracuellos <sergio.paracuellos@gmail.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
-
Krishna chaitanya chundru authored
PCIe needs to choose the appropriate performance state of RPMh power domain based on the PCIe gen speed. Adding the Operating Performance Points table allows to adjust power domain performance state and ICC peak bw, depending on the PCIe data rate and link width. Link: https://lore.kernel.org/linux-pci/20240619-opp_support-v15-2-aa769a2173a3@quicinc.comSigned-off-by: Krishna chaitanya chundru <quic_krichai@quicinc.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
-
Thippeswamy Havalige authored
The current configuration had non-prefetchable memory overlapping with bridge registers by 64KB from base address. This patch fixes the 'ranges' property in the device tree by adjusting the non-prefetchable memory addresses beyond the 64KB mark to prevent conflicts. [kwilczynski: commit log] Link: https://lore.kernel.org/linux-pci/20240624111022.133780-1-thippesw@amd.comSigned-off-by: Thippeswamy Havalige <thippesw@amd.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
-
Niklas Cassel authored
Document DT bindings for PCIe Endpoint controller found in Rockchip SoCs. Link: https://lore.kernel.org/linux-pci/20240607-rockchip-pcie-ep-v1-v5-6-0a042d6b0049@kernel.orgSigned-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
-
- 01 Jul, 2024 1 commit
-
-
Lukas Wunner authored
Keith reports a use-after-free when a DPC event occurs concurrently to hot-removal of the same portion of the hierarchy: The dpc_handler() awaits readiness of the secondary bus below the Downstream Port where the DPC event occurred. To do so, it polls the config space of the first child device on the secondary bus. If that child device is concurrently removed, accesses to its struct pci_dev cause the kernel to oops. That's because pci_bridge_wait_for_secondary_bus() neglects to hold a reference on the child device. Before v6.3, the function was only called on resume from system sleep or on runtime resume. Holding a reference wasn't necessary back then because the pciehp IRQ thread could never run concurrently. (On resume from system sleep, IRQs are not enabled until after the resume_noirq phase. And runtime resume is always awaited before a PCI device is removed.) However starting with v6.3, pci_bridge_wait_for_secondary_bus() is also called on a DPC event. Commit 53b54ad0 ("PCI/DPC: Await readiness of secondary bus after reset"), which introduced that, failed to appreciate that pci_bridge_wait_for_secondary_bus() now needs to hold a reference on the child device because dpc_handler() and pciehp may indeed run concurrently. The commit was backported to v5.10+ stable kernels, so that's the oldest one affected. Add the missing reference acquisition. Abridged stack trace: BUG: unable to handle page fault for address: 00000000091400c0 CPU: 15 PID: 2464 Comm: irq/53-pcie-dpc 6.9.0 RIP: pci_bus_read_config_dword+0x17/0x50 pci_dev_wait() pci_bridge_wait_for_secondary_bus() dpc_reset_link() pcie_do_recovery() dpc_handler() Fixes: 53b54ad0 ("PCI/DPC: Await readiness of secondary bus after reset") Closes: https://lore.kernel.org/r/20240612181625.3604512-3-kbusch@meta.com/ Link: https://lore.kernel.org/linux-pci/8e4bcd4116fd94f592f2bf2749f168099c480ddf.1718707743.git.lukas@wunner.deReported-by: Keith Busch <kbusch@kernel.org> Tested-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: stable@vger.kernel.org # v5.10+
-
- 21 Jun, 2024 5 commits
-
-
Niklas Cassel authored
The descriptions of the combined interrupt signals (level1) mention all the lower interrupt signals (level2) for each combined interrupt, regardless if the lower (level2) signal is RC or EP specific. E.g. the description of "Combined system interrupt" includes rbar_update, which is EP specific, and the description of "Combined message interrupt" includes obff_idle, obff_obff, obff_cpu_active, which are all EP specific. The only exception is the "Combined legacy interrupt", which for some reason does not provide an exhaustive list of the lower (level2) signals. Add the missing lower interrupt signals: tx_inta, tx_intb, tx_intc, and tx_intd for the "Combined legacy interrupt", as per the rk3568 and rk3588 Technical Reference Manuals, such that the descriptions of the combined interrupt signals are consistent. Link: https://lore.kernel.org/linux-pci/20240607-rockchip-pcie-ep-v1-v5-5-0a042d6b0049@kernel.orgSigned-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
-
Niklas Cassel authored
Refactor the rockchip-dw-pcie binding to move generic properties to a new rockchip-dw-pcie-common binding that can be shared by both RC and EP mode. No functional change intended. Link: https://lore.kernel.org/linux-pci/20240607-rockchip-pcie-ep-v1-v5-4-0a042d6b0049@kernel.orgSigned-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
-
Niklas Cassel authored
The DWC core has four interrupt signals: tx_inta, tx_intb, tx_intc, tx_intd that are triggered when the PCIe controller (when running in Endpoint mode) has sent an Assert_INTA Message to the upstream device. Some DWC controllers have these interrupt in a combined interrupt signal. Add the description of these interrupts to the device tree binding. Link: https://lore.kernel.org/linux-pci/20240607-rockchip-pcie-ep-v1-v5-3-0a042d6b0049@kernel.orgSigned-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
-
Niklas Cassel authored
Considering that some drivers (e.g. pcie-dw-rockchip.c) already use the interrupt-names "sys", "pmc", "msg", "err" for the device tree binding in Root Complex mode (snps,dw-pcie.yaml), it doesn't make sense that those drivers should use different interrupt-names when running in Endpoint mode (snps,dw-pcie-ep.yaml). Therefore, since "sys", "pmc", "msg", "err" are already defined in snps,dw-pcie.yaml, add them also for snps,dw-pcie-ep.yaml. Link: https://lore.kernel.org/linux-pci/20240607-rockchip-pcie-ep-v1-v5-2-0a042d6b0049@kernel.orgSigned-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
-
Niklas Cassel authored
Considering that some drivers (e.g. pcie-dw-rockchip.c) already use the reg-name "apb" for the device tree binding in Root Complex mode (snps,dw-pcie.yaml), it doesn't make sense that those drivers should use a different reg-name when running in Endpoint mode (snps,dw-pcie-ep.yaml). Therefore, since "apb" is already defined in snps,dw-pcie.yaml, add it also for snps,dw-pcie-ep.yaml. Link: https://lore.kernel.org/linux-pci/20240607-rockchip-pcie-ep-v1-v5-1-0a042d6b0049@kernel.orgSigned-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
-
- 18 Jun, 2024 3 commits
-
-
Kai-Heng Feng authored
If the link is powered off during suspend, electrical noise may cause errors that trigger DPC. If the DPC interrupt is enabled and shares an IRQ with PME, that causes a spurious wakeup during suspend. Disable DPC triggering and the DPC interrupt during suspend to prevent this. Clear DPC interrupt status before re-enabling DPC interrupts during resume so we don't get an interrupt for errors that occurred during the suspend/resume process. Link: https://bugzilla.kernel.org/show_bug.cgi?id=209149 Link: https://bugzilla.kernel.org/show_bug.cgi?id=216295 Link: https://bugzilla.kernel.org/show_bug.cgi?id=218090 Link: https://lore.kernel.org/r/20240416043225.1462548-3-kai.heng.feng@canonical.comSigned-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> [bhelgaas: clear status on resume, add comments, commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
-
Kai-Heng Feng authored
If the link is powered off during suspend, electrical noise may cause errors that are logged via AER. If the AER interrupt is enabled and shares an IRQ with PME, that causes a spurious wakeup during suspend. Disable the AER interrupt during suspend to prevent this. Clear error status before re-enabling IRQ interrupts during resume so we don't get an interrupt for errors that occurred during the suspend/resume process. Link: https://bugzilla.kernel.org/show_bug.cgi?id=209149 Link: https://bugzilla.kernel.org/show_bug.cgi?id=216295 Link: https://bugzilla.kernel.org/show_bug.cgi?id=218090 Link: https://lore.kernel.org/r/20240416043225.1462548-2-kai.heng.feng@canonical.comSigned-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> [bhelgaas: drop pci_ancestor_pr3_present() etc, commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
-
Jeff Johnson authored
With ARCH=arm64, make allmodconfig && make W=1 C=1 reports: WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/pci/hotplug/acpiphp_ampere_altra.o Add the missing MODULE_DESCRIPTION(). Link: https://lore.kernel.org/r/20240612-md-drivers-pci-hotplug-v1-1-2b30d14d783d@quicinc.comSigned-off-by: Jeff Johnson <quic_jjohnson@quicinc.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
-
- 12 Jun, 2024 1 commit
-
-
Ilpo Järvinen authored
During remove & rescan cycle, PCI subsystem will recalculate and adjust the bridge window sizing that was initially done by "BIOS". The size calculation is based on the required alignment of the largest resource among the downstream resources as per pbus_size_mem() (unimportant or zero parameters marked with "..."): min_align = calculate_mem_align(aligns, max_order); size0 = calculate_memsize(size, ..., min_align); inside calculate_memsize(), for the largest alignment: min_align = align1 >> 1; ... return min_align; and then in calculate_memsize(): return ALIGN(max(size, ...), align); If the original bridge window sizing tried to conserve space, this will lead to massive increase of the required bridge window size when the downstream has a large disparity in BAR sizes. E.g., with 16MiB and 16GiB BARs this results in 24GiB bridge window size even if 16MiB BAR does not require gigabytes of space to fit. When doing remove & rescan for a bus that contains such a PCI device, a larger bridge window is suddenly required on rescan but when there is a bridge window upstream that is already assigned based on the original size, it cannot be enlarged to the new requirement. This causes the allocation of the bridge window to fail (0x600000000 > 0x400ffffff): pci 0000:02:01.0: PCI bridge to [bus 03] pci 0000:02:01.0: bridge window [mem 0x40400000-0x405fffff] pci 0000:02:01.0: bridge window [mem 0x6000000000-0x6400ffffff 64bit pref] pci 0000:01:00.0: PCI bridge to [bus 02-04] pci 0000:01:00.0: bridge window [mem 0x40400000-0x406fffff] pci 0000:01:00.0: bridge window [mem 0x6000000000-0x6400ffffff 64bit pref] pci 0000:03:00.0: device released pci 0000:02:01.0: device released pcieport 0000:01:00.0: scanning [bus 02-04] behind bridge, pass 0 pci 0000:02:01.0: PCI bridge to [bus 03] pci 0000:02:01.0: bridge window [mem 0x40400000-0x405fffff] pci 0000:02:01.0: bridge window [mem 0x6000000000-0x6400ffffff 64bit pref] pci 0000:02:01.0: scanning [bus 03-03] behind bridge, pass 0 pci 0000:03:00.0: BAR 0 [mem 0x6400000000-0x6400ffffff 64bit pref] pci 0000:03:00.0: BAR 2 [mem 0x6000000000-0x63ffffffff 64bit pref] pci 0000:03:00.0: ROM [mem 0x40400000-0x405fffff pref] pci 0000:02:01.0: PCI bridge to [bus 03] pci 0000:02:01.0: scanning [bus 03-03] behind bridge, pass 1 pcieport 0000:01:00.0: scanning [bus 02-04] behind bridge, pass 1 pci 0000:02:01.0: bridge window [mem size 0x600000000 64bit pref]: can't assign; no space pci 0000:02:01.0: bridge window [mem size 0x600000000 64bit pref]: failed to assign pci 0000:02:01.0: bridge window [mem 0x40400000-0x405fffff]: assigned pci 0000:03:00.0: BAR 2 [mem size 0x400000000 64bit pref]: can't assign; no space pci 0000:03:00.0: BAR 2 [mem size 0x400000000 64bit pref]: failed to assign pci 0000:03:00.0: BAR 0 [mem size 0x01000000 64bit pref]: can't assign; no space pci 0000:03:00.0: BAR 0 [mem size 0x01000000 64bit pref]: failed to assign pci 0000:03:00.0: ROM [mem 0x40400000-0x405fffff pref]: assigned pci 0000:02:01.0: PCI bridge to [bus 03] pci 0000:02:01.0: bridge window [mem 0x40400000-0x405fffff] This is a major surprise for users who are suddenly left with a device that was working fine with the original bridge window sizing. Even if the already assigned bridge window could be enlarged by reallocation in some cases (something the current code does not attempt to do), it is not possible in general case and the large amount of wasted space at the tail of the bridge window may lead to other resource exhaustion problems on Root Complex level (think of multiple PCIe cards with VFs and BAR size disparity in a single system). PCI BARs only need natural alignment (PCIe r6.1, sec 7.5.1.2.1) and bridge memory windows need 1MiB (sec 7.5.1.3). The current bridge window tail alignment rule was introduced in the commit 5d0a8965 ("[PATCH] 2.5.14: New PCI allocation code (alpha, arm, parisc) [2/2]") that only states: "pbus_size_mem: core stuff; tested with randomly generated sets of resources". It does not explain the motivation for the extra tail space allocated that is not truly needed by the downstream resources. As such, it is far from clear if it ever has been required by any HW. To prevent devices with BAR size disparity from becoming unusable after remove & rescan cycle, attempt to do a truly minimal allocation for memory resources if needed. First check if the normally calculated bridge window will not fit into an already assigned upstream resource. In such case, try with relaxed bridge window tail sizing rules instead where no extra tail space is requested beyond what the downstream resources require. Only enforce the alignment requirement of the bridge window itself (normally 1MiB). With this patch, the resources are successfully allocated: pci 0000:02:01.0: PCI bridge to [bus 03] pci 0000:02:01.0: scanning [bus 03-03] behind bridge, pass 1 pcieport 0000:01:00.0: scanning [bus 02-04] behind bridge, pass 1 pcieport 0000:01:00.0: Assigned bridge window [mem 0x6000000000-0x6400ffffff 64bit pref] to [bus 02-04] cannot fit 0x600000000 required for 0000:02:01.0 bridging to [bus 03] pci 0000:02:01.0: bridge window [mem 0x6000000000-0x6400ffffff 64bit pref] to [bus 03] requires relaxed alignment rules pcieport 0000:01:00.0: Assigned bridge window [mem 0x40400000-0x406fffff] to [bus 02-04] free space at [mem 0x40400000-0x405fffff] pci 0000:02:01.0: bridge window [mem 0x6000000000-0x6400ffffff 64bit pref]: assigned pci 0000:02:01.0: bridge window [mem 0x40400000-0x405fffff]: assigned pci 0000:03:00.0: BAR 2 [mem 0x6000000000-0x63ffffffff 64bit pref]: assigned pci 0000:03:00.0: BAR 0 [mem 0x6400000000-0x6400ffffff 64bit pref]: assigned pci 0000:03:00.0: ROM [mem 0x40400000-0x405fffff pref]: assigned pci 0000:02:01.0: PCI bridge to [bus 03] pci 0000:02:01.0: bridge window [mem 0x40400000-0x405fffff] pci 0000:02:01.0: bridge window [mem 0x6000000000-0x6400ffffff 64bit pref] This patch draws inspiration from the initial investigations and work by Mika Westerberg. Closes: https://bugzilla.kernel.org/show_bug.cgi?id=216795 Link: https://lore.kernel.org/linux-pci/20190812144144.2646-1-mika.westerberg@linux.intel.com/ Fixes: 5d0a8965 ("[PATCH] 2.5.14: New PCI allocation code (alpha, arm, parisc) [2/2]") Link: https://lore.kernel.org/r/20240507102523.57320-9-ilpo.jarvinen@linux.intel.comTested-by: Lidong Wang <lidong.wang@intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
-