1. 01 Nov, 2019 1 commit
    • gpu: host1x: Unconditionally select IOMMU_IOVA · c8a20364
      Thierry Reding authored
      Currently configurations can be generated where IOMMU_SUPPORT is
      disabled but IOMMU_IOVA is built as a module and HOST1X as built-in. In
      such a case, the symbols guarded by IOMMU_IOVA will not be available
      when linking the host1x driver and cause a linking failure.
      
      Simplify this by unconditionally selecting IOMMU_IOVA, which makes sure
      that it will be forced to =y if HOST1X=y. Technically we can now get
      IOMMU_IOVA code built-in even if we don't use it (host1x only uses it
      when IOMMU_SUPPORT is also enabled), but such configurations are of a
      mostly academic nature. In all practical configurations we want IOMMU
      support anyway.
      Signed-off-by: Thierry Reding <treding@nvidia.com>
  2. 29 Oct, 2019 12 commits
    • drm/tegra: Optionally attach clients to the IOMMU · fa6661b7
      Thierry Reding authored
      If a client is already attached to an IOMMU domain that is not the
      shared domain, don't try to attach it again. This allows using the
      IOMMU-backed DMA API.
      
      Since the IOMMU-backed DMA API is now supported and there's no way
      to detach from it on 64-bit ARM, don't bother to detach from it on
      32-bit ARM either.
      Signed-off-by: Thierry Reding <treding@nvidia.com>
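      A minimal sketch of the attach decision described above, using the
      generic IOMMU API; tegra_client_iommu_attach() and the shared-domain
      pointer are illustrative names rather than the driver's actual symbols:

      #include <linux/iommu.h>

      /* Only attach the client to the shared domain if it is not already
       * attached to some other domain (typically the DMA API's default
       * domain), so the IOMMU-backed DMA API keeps working for it. */
      static int tegra_client_iommu_attach(struct device *dev,
                                           struct iommu_domain *shared)
      {
              struct iommu_domain *domain = iommu_get_domain_for_dev(dev);

              if (domain && domain != shared)
                      return 0;

              return iommu_attach_device(shared, dev);
      }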
    • drm/tegra: Support DMA API for display controllers · 2e8d8749
      Thierry Reding authored
      If a display controller is not attached to an explicit IOMMU domain,
      which usually means that it's connected to an IOMMU domain controlled by
      the DMA API, make sure to map the framebuffer to the display controller
      address space. This allows us to transparently handle setups where the
      display controller is attached to an IOMMU or setups where it isn't. It
      also allows the driver to work with a DMA API that is backed by an
      IOMMU.
      Signed-off-by: Thierry Reding <treding@nvidia.com>
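      A sketch of what mapping the framebuffer through the DMA API amounts
      to, assuming the framebuffer pages are described by an sg_table and
      that the resulting mapping is I/O-virtually contiguous; fb_sgt and
      tegra_fb_dma_map() are illustrative names:

      #include <linux/dma-mapping.h>
      #include <linux/scatterlist.h>

      /* With an IOMMU-backed DMA API this returns an IOVA in the display
       * controller's address space, otherwise a bus/physical address. */
      static dma_addr_t tegra_fb_dma_map(struct device *dc,
                                         struct sg_table *fb_sgt)
      {
              if (!dma_map_sg(dc, fb_sgt->sgl, fb_sgt->nents, DMA_TO_DEVICE))
                      return DMA_MAPPING_ERROR;

              return sg_dma_address(fb_sgt->sgl);
      }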
    • drm/tegra: falcon: Clarify address usage · d972d624
      Thierry Reding authored
      Rename paddr -> iova and vaddr -> virt to make it clearer how these
      addresses are used. This is important for a subsequent patch that makes
      a distinction between the physical address (physical address of the
      system memory from the CPU's point of view) and the IOVA (physical
      address of the system memory from the device's point of view).
      Signed-off-by: Thierry Reding <treding@nvidia.com>
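      For illustration only, the three address spaces the rename separates,
      with hypothetical field names rather than the driver's structures:

      #include <linux/types.h>

      struct falcon_buffer_example {
              void *virt;        /* CPU virtual address (kernel mapping) */
              phys_addr_t phys;  /* system memory as the CPU sees it */
              dma_addr_t iova;   /* the same memory as the device sees it */
      };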
    • drm/tegra: Remove memory allocation from Falcon library · 20e7dce2
      Thierry Reding authored
      Having to provide allocator hooks to the Falcon library is somewhat
      cumbersome and it doesn't give the users of the library a lot of
      flexibility to deal with allocations. Instead, remove the notion of
      Falcon "operations" and let drivers deal with the memory allocations
      themselves.
      Signed-off-by: Thierry Reding <treding@nvidia.com>
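      A sketch of the driver-side allocation that replaces the allocator
      hooks; the helper name and the choice of dma_alloc_coherent() are
      assumptions for illustration:

      #include <linux/dma-mapping.h>
      #include <linux/errno.h>
      #include <linux/gfp.h>

      /* The client driver now owns the firmware buffer and simply hands
       * the resulting virt/iova pair to the Falcon helpers. */
      static int vic_alloc_firmware_example(struct device *dev, size_t size,
                                            void **virt, dma_addr_t *iova)
      {
              *virt = dma_alloc_coherent(dev, size, iova, GFP_KERNEL);
              if (!*virt)
                      return -ENOMEM;

              return 0;
      }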
    • gpu: host1x: Set DMA mask based on IOMMU setup · 06867a36
      Thierry Reding authored
      If the Tegra DRM clients are backed by an IOMMU, push buffers are likely
      to be allocated beyond the 32-bit boundary if sufficient system memory
      is available. This is problematic on earlier generations of Tegra where
      host1x supports a maximum of 32 address bits for the GATHER opcode. More
      recent versions of Tegra (Tegra186 and later) have a wide variant of the
      GATHER opcode, which allows addressing up to 64 bits of memory.
      
      If host1x itself is behind an IOMMU as well this doesn't matter because
      the IOMMU's input address space is restricted to 32 bits on generations
      without support for wide GATHER opcodes.
      
      However, if host1x is not behind an IOMMU, it won't be able to process
      push buffers beyond the 32-bit boundary on Tegra generations that don't
      support wide GATHER opcodes. Restricting the DMA mask to 32 bits on
      these generations prevents buffers from being allocated beyond the
      32-bit boundary.
      Signed-off-by: Thierry Reding <treding@nvidia.com>
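      One way to express that rule, as a sketch; has_wide_gather stands in
      for the per-SoC capability information and the helper name is
      illustrative:

      #include <linux/dma-mapping.h>
      #include <linux/iommu.h>

      static int host1x_set_dma_mask_example(struct device *dev,
                                             bool has_wide_gather)
      {
              u64 mask = DMA_BIT_MASK(has_wide_gather ? 64 : 32);

              /* Behind an IOMMU the input address space is already
               * limited to 32 bits on these generations, so the mask
               * does not need to be narrowed. */
              if (iommu_get_domain_for_dev(dev))
                      mask = DMA_BIT_MASK(64);

              return dma_set_mask_and_coherent(dev, mask);
      }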
    • gpu: host1x: Support DMA mapping of buffers · af1cbfb9
      Thierry Reding authored
      If host1x_bo_pin() returns an SG table, create a DMA mapping for the
      buffer. For buffers that the host1x client has already mapped itself,
      host1x_bo_pin() returns NULL and the existing DMA address is used.
      Signed-off-by: Thierry Reding <treding@nvidia.com>
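      A sketch of that decision, using the host1x_bo_pin() shape described
      in the host1x_bo_{pin,unpin}() overhaul entry further down; the
      wrapper name is illustrative:

      #include <linux/dma-mapping.h>
      #include <linux/err.h>
      #include <linux/host1x.h>
      #include <linux/scatterlist.h>

      static int host1x_pin_and_map_example(struct device *dev,
                                            struct host1x_bo *bo,
                                            dma_addr_t *phys)
      {
              struct sg_table *sgt = host1x_bo_pin(dev, bo, phys);

              if (IS_ERR(sgt))
                      return PTR_ERR(sgt);

              /* NULL means the client already mapped the buffer and *phys
               * holds a usable DMA address. */
              if (!sgt)
                      return 0;

              if (!dma_map_sg(dev, sgt->sgl, sgt->nents, DMA_BIDIRECTIONAL))
                      return -ENOMEM;

              *phys = sg_dma_address(sgt->sgl);
              return 0;
      }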
    • gpu: host1x: Allocate gather copy for host1x · b78e70c0
      Thierry Reding authored
      Currently when the gather buffers are copied, they are copied to a
      buffer that is allocated for the host1x client that wants to execute the
      command streams in the buffers. However, the gather buffers will be read
      by the host1x device, which causes SMMU faults if the DMA API is backed
      by an IOMMU.
      
      Fix this by allocating the gather buffer copy for the host1x device,
      which makes sure that it will be mapped into the host1x's IOVA space if
      the DMA API is backed by an IOMMU.
      Signed-off-by: Thierry Reding <treding@nvidia.com>
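      A minimal sketch of the change in ownership: the copy is allocated
      against the host1x device rather than the client, so an IOMMU-backed
      DMA API maps it into host1x's IOVA space. The helper name and the use
      of dma_alloc_wc() are illustrative:

      #include <linux/dma-mapping.h>
      #include <linux/gfp.h>

      static void *host1x_alloc_gather_copy_example(struct device *host_dev,
                                                    size_t size,
                                                    dma_addr_t *iova)
      {
              /* host_dev is the host1x device whose CDMA engine will read
               * the buffer, not the client that submitted the job. */
              return dma_alloc_wc(host_dev, size, iova, GFP_KERNEL);
      }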
    • gpu: host1x: Add direction flags to relocations · ab4f81bf
      Thierry Reding authored
      Add direction flags to host1x relocations performed during job pinning.
      These flags indicate the kinds of accesses that hardware is allowed to
      perform on the relocated buffers.
      Signed-off-by: Thierry Reding <treding@nvidia.com>
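      A sketch of how such access flags typically translate into a DMA
      direction for the mapping; the flag names here are illustrative and
      not necessarily the ones the patch introduces:

      #include <linux/bits.h>
      #include <linux/dma-mapping.h>

      #define RELOC_READ_EXAMPLE  BIT(0)  /* hardware may read the buffer */
      #define RELOC_WRITE_EXAMPLE BIT(1)  /* hardware may write the buffer */

      static enum dma_data_direction reloc_direction_example(unsigned long flags)
      {
              if ((flags & RELOC_READ_EXAMPLE) && (flags & RELOC_WRITE_EXAMPLE))
                      return DMA_BIDIRECTIONAL;

              if (flags & RELOC_WRITE_EXAMPLE)
                      return DMA_FROM_DEVICE;

              return DMA_TO_DEVICE;
      }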
    • gpu: host1x: Clean up debugfs on removal · 44156eee
      Thierry Reding authored
      The debugfs files created for host1x are never removed, causing these
      files to be left dangling in debugfs. This results in a crash when any
      of these files are accessed after the host1x driver has been removed,
      as well as a failure to create the debugfs entries when they are added
      again on driver probe.
      Signed-off-by: Thierry Reding <treding@nvidia.com>
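      The general pattern of such a fix, sketched with illustrative names
      (the directory name and functions in the actual driver may differ):

      #include <linux/debugfs.h>

      static struct dentry *host1x_debugfs_example;

      static void host1x_debug_init_example(void)
      {
              host1x_debugfs_example = debugfs_create_dir("tegra-host1x", NULL);
      }

      /* Called from the driver's remove path so no dangling files are
       * left behind and a later probe can recreate the directory. */
      static void host1x_debug_deinit_example(void)
      {
              debugfs_remove_recursive(host1x_debugfs_example);
      }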
    • gpu: host1x: Overhaul host1x_bo_{pin,unpin}() API · 80327ce3
      Thierry Reding authored
      The host1x_bo_pin() and host1x_bo_unpin() APIs are used to pin and unpin
      buffers during host1x job submission. Pinning currently returns the SG
      table and the DMA address (an IOVA if an IOMMU is used or a physical
      address if no IOMMU is used) of the buffer. The DMA address is only used
      for buffers that are relocated, whereas the host1x driver will map
      gather buffers into its own IOVA space so that they can be processed by
      the CDMA engine.
      
      This approach has a couple of issues. On one hand it's not very useful
      to return a DMA address for the buffer if host1x doesn't need it. On the
      other hand, returning the SG table of the buffer is suboptimal because a
      single SG table cannot be shared across multiple mappings: the DMA
      address is stored within the SG table, and it may differ from one
      device to another.
      
      Subsequent patches will move the host1x driver over to the DMA API which
      doesn't work with a single shared SG table. Fix this by returning a new
      SG table each time a buffer is pinned. This allows the buffer to be
      referenced by multiple jobs for different engines.
      
      Change the prototypes of host1x_bo_pin() and host1x_bo_unpin() to take a
      struct device *, specifying the device for which the buffer should be
      pinned. This is required in order to be able to properly construct the
      SG table. While at it, make host1x_bo_pin() return the SG table because
      that allows us to return an ERR_PTR()-encoded error code if we need to,
      or return NULL to signal that we don't need the SG table to be remapped
      and can simply use the DMA address as-is. At the same time, returning
      the DMA address is made optional because in the example of command
      buffers, host1x doesn't need to know the DMA address since it will have
      to create its own mapping anyway.
      Signed-off-by: Thierry Reding <treding@nvidia.com>
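      The resulting shape of the API as described above, sketched here;
      see include/linux/host1x.h for the authoritative prototypes:

      #include <linux/dma-mapping.h>
      #include <linux/scatterlist.h>

      struct host1x_bo;

      /* Returns a freshly created SG table for the buffer as seen by @dev,
       * an ERR_PTR() on failure, or NULL if the client already mapped the
       * buffer itself, in which case *phys carries the DMA address.
       * Passing phys == NULL indicates that the caller (e.g. host1x for
       * gather buffers) does not need the DMA address. */
      struct sg_table *host1x_bo_pin(struct device *dev, struct host1x_bo *bo,
                                     dma_addr_t *phys);

      void host1x_bo_unpin(struct device *dev, struct host1x_bo *bo,
                           struct sg_table *sgt);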
    • drm/tegra: Simplify IOMMU group selection · 7edd7961
      Thierry Reding authored
      All the devices that make up the DRM device are now part of the same
      IOMMU group. This simplifies the handling of the IOMMU attachment and
      also avoids exhausting the number of IOMMUs available on early Tegra
      SoC generations.
      Signed-off-by: Thierry Reding <treding@nvidia.com>
    • drm/tegra: Do not use ->load() and ->unload() callbacks · a7303f77
      Thierry Reding authored
      The ->load() and ->unload() callbacks are midlayers and should be avoided
      in modern drivers. Fix this by moving the code into the driver ->probe()
      and ->remove() implementations, respectively.
      
      v2: kick out conflicting framebuffers before initializing fbdev
      v3: rebase onto drm/tegra/for-next
      Tested-by: Dmitry Osipenko <digetx@gmail.com>
      Signed-off-by: Thierry Reding <treding@nvidia.com>
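      A sketch of the pattern the driver moves to: perform the setup
      directly in the probe path and register the DRM device explicitly,
      instead of relying on the deprecated ->load()/->unload() hooks.
      tegra_drm_driver stands for the driver's drm_driver instance and is
      assumed to be visible here:

      #include <drm/drm_drv.h>
      #include <linux/err.h>

      extern struct drm_driver tegra_drm_driver;

      static int tegra_drm_probe_example(struct device *parent)
      {
              struct drm_device *drm;
              int err;

              drm = drm_dev_alloc(&tegra_drm_driver, parent);
              if (IS_ERR(drm))
                      return PTR_ERR(drm);

              /* ... initialization formerly done in ->load() ... */

              err = drm_dev_register(drm, 0);
              if (err < 0)
                      drm_dev_put(drm);

              return err;
      }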
  3. 28 Oct, 2019 27 commits