1. 24 Jun, 2021 10 commits
    • Daniel Vetter's avatar
      dma-buf: Document dma-buf implicit fencing/resv fencing rules · 05459351
      Daniel Vetter authored
      Docs for struct dma_resv are fairly clear:
      
      "A reservation object can have attached one exclusive fence (normally
      associated with write operations) or N shared fences (read
      operations)."
      
      https://dri.freedesktop.org/docs/drm/driver-api/dma-buf.html#reservation-objects
      
      Furthermore a review across all of upstream.
      
      First of render drivers and how they set implicit fences:
      
      - nouveau follows this contract, see in validate_fini_no_ticket()
      
      			nouveau_bo_fence(nvbo, fence, !!b->write_domains);
      
        and that last boolean controls whether the exclusive or shared fence
        slot is used.
      
      - radeon follows this contract by setting
      
      		p->relocs[i].tv.num_shared = !r->write_domain;
      
        in radeon_cs_parser_relocs(), which ensures that the call to
        ttm_eu_fence_buffer_objects() in radeon_cs_parser_fini() will do the
        right thing.
      
      - vmwgfx seems to follow this contract with the shotgun approach of
        always setting ttm_val_buf->num_shared = 0, which means
        ttm_eu_fence_buffer_objects() will only use the exclusive slot.
      
      - etnaviv follows this contract, as can be trivially seen by looking
        at submit_attach_object_fences()
      
      - i915 is a bit a convoluted maze with multiple paths leading to
        i915_vma_move_to_active(). Which sets the exclusive flag if
        EXEC_OBJECT_WRITE is set. This can either come as a buffer flag for
        softpin mode, or through the write_domain when using relocations. It
        follows this contract.
      
      - lima follows this contract, see lima_gem_submit() which sets the
        exclusive fence when the LIMA_SUBMIT_BO_WRITE flag is set for that
        bo
      
      - msm follows this contract, see msm_gpu_submit() which sets the
        exclusive flag when the MSM_SUBMIT_BO_WRITE is set for that buffer
      
      - panfrost follows this contract with the shotgun approach of just
        always setting the exclusive fence, see
        panfrost_attach_object_fences(). Benefits of a single engine I guess
      
      - v3d follows this contract with the same shotgun approach in
        v3d_attach_fences_and_unlock_reservation(), but it has at least an
        XXX comment that maybe this should be improved
      
      - v4c uses the same shotgun approach of always setting an exclusive
        fence, see vc4_update_bo_seqnos()
      
      - vgem also follows this contract, see vgem_fence_attach_ioctl() and
        the VGEM_FENCE_WRITE. This is used in some igts to validate prime
        sharing with i915.ko without the need of a 2nd gpu
      
      - vritio follows this contract again with the shotgun approach of
        always setting an exclusive fence, see virtio_gpu_array_add_fence()
      
      This covers the setting of the exclusive fences when writing.
      
      Synchronizing against the exclusive fence is a lot more tricky, and I
      only spot checked a few:
      
      - i915 does it, with the optional EXEC_OBJECT_ASYNC to skip all
        implicit dependencies (which is used by vulkan)
      
      - etnaviv does this. Implicit dependencies are collected in
        submit_fence_sync(), again with an opt-out flag
        ETNA_SUBMIT_NO_IMPLICIT. These are then picked up in
        etnaviv_sched_dependency which is the
        drm_sched_backend_ops->dependency callback.
      
      - v4c seems to not do much here, maybe gets away with it by not having
        a scheduler and only a single engine. Since all newer broadcom chips than
        the OG vc4 use v3d for rendering, which follows this contract, the
        impact of this issue is fairly small.
      
      - v3d does this using the drm_gem_fence_array_add_implicit() helper,
        which then it's drm_sched_backend_ops->dependency callback
        v3d_job_dependency() picks up.
      
      - panfrost is nice here and tracks the implicit fences in
        panfrost_job->implicit_fences, which again the
        drm_sched_backend_ops->dependency callback panfrost_job_dependency()
        picks up. It is mildly questionable though since it only picks up
        exclusive fences in panfrost_acquire_object_fences(), but not buggy
        in practice because it also always sets the exclusive fence. It
        should pick up both sets of fences, just in case there's ever going
        to be a 2nd gpu in a SoC with a mali gpu. Or maybe a mali SoC with a
        pcie port and a real gpu, which might actually happen eventually. A
        bug, but easy to fix. Should probably use the
        drm_gem_fence_array_add_implicit() helper.
      
      - lima is nice an easy, uses drm_gem_fence_array_add_implicit() and
        the same schema as v3d.
      
      - msm is mildly entertaining. It also supports MSM_SUBMIT_NO_IMPLICIT,
        but because it doesn't use the drm/scheduler it handles fences from
        the wrong context with a synchronous dma_fence_wait. See
        submit_fence_sync() leading to msm_gem_sync_object(). Investing into
        a scheduler might be a good idea.
      
      - all the remaining drivers are ttm based, where I hope they do
        appropriately obey implicit fences already. I didn't do the full
        audit there because a) not follow the contract would confuse ttm
        quite well and b) reading non-standard scheduler and submit code
        which isn't based on drm/scheduler is a pain.
      
      Onwards to the display side.
      
      - Any driver using the drm_gem_plane_helper_prepare_fb() helper will
        correctly. Overwhelmingly most drivers get this right, except a few
        totally dont. I'll follow up with a patch to make this the default
        and avoid a bunch of bugs.
      
      - I didn't audit the ttm drivers, but given that dma_resv started
        there I hope they get this right.
      
      In conclusion this IS the contract, both as documented and
      overwhelmingly implemented, specically as implemented by all render
      drivers except amdgpu.
      
      Amdgpu tried to fix this already in
      
      commit 049aca43
      Author: Christian König <christian.koenig@amd.com>
      Date:   Wed Sep 19 16:54:35 2018 +0200
      
          drm/amdgpu: fix using shared fence for exported BOs v2
      
      but this fix falls short on a number of areas:
      
      - It's racy, by the time the buffer is shared it might be too late. To
        make sure there's definitely never a problem we need to set the
        fences correctly for any buffer that's potentially exportable.
      
      - It's breaking uapi, dma-buf fds support poll() and differentitiate
        between, which was introduced in
      
      	commit 9b495a58
      	Author: Maarten Lankhorst <maarten.lankhorst@canonical.com>
      	Date:   Tue Jul 1 12:57:43 2014 +0200
      
      	    dma-buf: add poll support, v3
      
      - Christian König wants to nack new uapi building further on this
        dma_resv contract because it breaks amdgpu, quoting
      
        "Yeah, and that is exactly the reason why I will NAK this uAPI change.
      
        "This doesn't works for amdgpu at all for the reasons outlined above."
      
        https://lore.kernel.org/dri-devel/f2eb6751-2f82-9b23-f57e-548de5b729de@gmail.com/
      
        Rejecting new development because your own driver is broken and
        violates established cross driver contracts and uapi is really not
        how upstream works.
      
      Now this patch will have a severe performance impact on anything that
      runs on multiple engines. So we can't just merge it outright, but need
      a bit a plan:
      
      - amdgpu needs a proper uapi for handling implicit fencing. The funny
        thing is that to do it correctly, implicit fencing must be treated
        as a very strange IPC mechanism for transporting fences, where both
        setting the fence and dependency intercepts must be handled
        explicitly. Current best practices is a per-bo flag to indicate
        writes, and a per-bo flag to to skip implicit fencing in the CS
        ioctl as a new chunk.
      
      - Since amdgpu has been shipping with broken behaviour we need an
        opt-out flag from the butchered implicit fencing model to enable the
        proper explicit implicit fencing model.
      
      - for kernel memory fences due to bo moves at least the i915 idea is
        to use ttm_bo->moving. amdgpu probably needs the same.
      
      - since the current p2p dma-buf interface assumes the kernel memory
        fence is in the exclusive dma_resv fence slot we need to add a new
        fence slot for kernel fences, which must never be ignored. Since
        currently only amdgpu supports this there's no real problem here
        yet, until amdgpu gains a NO_IMPLICIT CS flag.
      
      - New userspace needs to ship in enough desktop distros so that users
        wont notice the perf impact. I think we can ignore LTS distros who
        upgrade their kernels but not their mesa3d snapshot.
      
      - Then when this is all in place we can merge this patch here.
      
      What is not a solution to this problem here is trying to make the
      dma_resv rules in the kernel more clever. The fundamental issue here
      is that the amdgpu CS uapi is the least expressive one across all
      drivers (only equalled by panfrost, which has an actual excuse) by not
      allowing any userspace control over how implicit sync is conducted.
      
      Until this is fixed it's completely pointless to make the kernel more
      clever to improve amdgpu, because all we're doing is papering over
      this uapi design issue. amdgpu needs to attain the status quo
      established by other drivers first, once that's achieved we can tackle
      the remaining issues in a consistent way across drivers.
      
      v2: Bas pointed me at AMDGPU_GEM_CREATE_EXPLICIT_SYNC, which I
      entirely missed.
      
      This is great because it means the amdgpu specific piece for proper
      implicit fence handling exists already, and that since a while. The
      only thing that's now missing is
      - fishing the implicit fences out of a shared object at the right time
      - setting the exclusive implicit fence slot at the right time.
      
      Jason has a patch series to fill that gap with a bunch of generic
      ioctl on the dma-buf fd:
      
      https://lore.kernel.org/dri-devel/20210520190007.534046-1-jason@jlekstrand.net/
      
      v3: Since Christian has fixed amdgpu now in
      
      commit 8c505bdc (drm-misc/drm-misc-next)
      Author: Christian König <christian.koenig@amd.com>
      Date:   Wed Jun 9 13:51:36 2021 +0200
      
          drm/amdgpu: rework dma_resv handling v3
      
      Use the audit covered in this commit message as the excuse to update
      the dma-buf docs around dma_buf.resv usage across drivers.
      
      Since dynamic importers have different rules also hammer these in
      again while we're at it.
      
      v4:
      - Add the missing "through the device" in the dynamic section that I
        overlooked.
      - Fix a kerneldoc markup mistake, the link didn't connect
      
      v5:
      - A few s/should/must/ to make clear what must be done (if the driver
        does implicit sync) and what's more a maybe (Daniel Stone)
      - drop all the example api discussion, that needs to be expanded,
        clarified and put into a new chapter in drm-uapi.rst (Daniel Stone)
      
      Cc: Daniel Stone <daniel@fooishbar.org>
      Acked-by: default avatarDaniel Stone <daniel@fooishbar.org>
      Reviewed-by: Dave Airlie <airlied@redhat.com> (v4)
      Reviewed-by: Christian König <christian.koenig@amd.com> (v3)
      Cc: mesa-dev@lists.freedesktop.org
      Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
      Cc: Dave Airlie <airlied@gmail.com>
      Cc: Rob Clark <robdclark@chromium.org>
      Cc: Kristian H. Kristensen <hoegsberg@google.com>
      Cc: Michel Dänzer <michel@daenzer.net>
      Cc: Daniel Stone <daniels@collabora.com>
      Cc: Sumit Semwal <sumit.semwal@linaro.org>
      Cc: "Christian König" <christian.koenig@amd.com>
      Cc: Alex Deucher <alexander.deucher@amd.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Deepak R Varma <mh12gx2825@gmail.com>
      Cc: Chen Li <chenli@uniontech.com>
      Cc: Kevin Wang <kevin1.wang@amd.com>
      Cc: Dennis Li <Dennis.Li@amd.com>
      Cc: Luben Tuikov <luben.tuikov@amd.com>
      Cc: linaro-mm-sig@lists.linaro.org
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20210624125246.166721-1-daniel.vetter@ffwll.ch
      05459351
    • Daniel Vetter's avatar
      dma-buf: Switch to inline kerneldoc · d6abed2a
      Daniel Vetter authored
      Also review & update everything while we're at it.
      
      This is prep work to smash a ton of stuff into the kerneldoc for
      @resv.
      
      v2: Move the doc for sysfs_entry.attachment_uid to the right place too
      (Sam)
      Reviewed-by: default avatarSam Ravnborg <sam@ravnborg.org>
      Acked-by: default avatarChristian König <christian.koenig@amd.com>
      Cc: Sam Ravnborg <sam@ravnborg.org>
      Reviewed-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@intel.com>
      Cc: Sumit Semwal <sumit.semwal@linaro.org>
      Cc: "Christian König" <christian.koenig@amd.com>
      Cc: Alex Deucher <alexander.deucher@amd.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Dave Airlie <airlied@redhat.com>
      Cc: Nirmoy Das <nirmoy.das@amd.com>
      Cc: Deepak R Varma <mh12gx2825@gmail.com>
      Cc: Chen Li <chenli@uniontech.com>
      Cc: Kevin Wang <kevin1.wang@amd.com>
      Cc: linux-media@vger.kernel.org
      Cc: linaro-mm-sig@lists.linaro.org
      Link: https://patchwork.freedesktop.org/patch/msgid/20210623161712.3370885-1-daniel.vetter@ffwll.ch
      d6abed2a
    • Daniel Vetter's avatar
      drm/gem: Tiny kernel clarification for drm_gem_fence_array_add · d7fdae59
      Daniel Vetter authored
      Spotted while trying to convert panfrost to these.
      Reviewed-by: default avatarChristian König <christian.koenig@amd.com>
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@intel.com>
      Cc: "Christian König" <christian.koenig@amd.com>
      Cc: Lucas Stach <l.stach@pengutronix.de>
      Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Cc: Maxime Ripard <mripard@kernel.org>
      Cc: Thomas Zimmermann <tzimmermann@suse.de>
      Cc: David Airlie <airlied@linux.ie>
      Cc: Daniel Vetter <daniel@ffwll.ch>
      Link: https://patchwork.freedesktop.org/patch/msgid/20210622165511.3169559-15-daniel.vetter@ffwll.ch
      d7fdae59
    • Laurent Pinchart's avatar
      drm/bridge: ti-sn65dsi86: Split connector creation to a function · 379d3426
      Laurent Pinchart authored
      To prepare for making connector creation option, move connector creation
      out of ti_sn_bridge_attach to a separate function.
      
      No functional change intended.
      Signed-off-by: default avatarLaurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
      Reviewed-by: default avatarStephen Boyd <swboyd@chromium.org>
      Reviewed-by: default avatarDouglas Anderson <dianders@chromium.org>
      Signed-off-by: default avatarRobert Foss <robert.foss@linaro.org>
      Link: https://patchwork.freedesktop.org/patch/msgid/20210624000304.16281-7-laurent.pinchart+renesas@ideasonboard.com
      379d3426
    • Laurent Pinchart's avatar
      drm/bridge: ti-sn65dsi86: Group code in sections · 77674e72
      Laurent Pinchart authored
      Reorganize the functions in sections, related to connector operations,
      bridge operations, AUX adapter, GPIO controller and probe & remove.
      
      This prepares for proper support of DRM_BRIDGE_ATTACH_NO_CONNECTOR that
      will add more functions, to ensure that the code will stay readable.
      
      No functional change intended.
      Signed-off-by: default avatarLaurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
      Reviewed-by: default avatarStephen Boyd <swboyd@chromium.org>
      Reviewed-by: default avatarDouglas Anderson <dianders@chromium.org>
      Signed-off-by: default avatarRobert Foss <robert.foss@linaro.org>
      Link: https://patchwork.freedesktop.org/patch/msgid/20210624000304.16281-6-laurent.pinchart+renesas@ideasonboard.com
      77674e72
    • Laurent Pinchart's avatar
      drm/bridge: ti-sn65dsi86: Wrap panel with panel-bridge · 4e5763f0
      Laurent Pinchart authored
      To simplify interfacing with the panel, wrap it in a panel-bridge and
      let the DRM bridge helpers handle chaining of operations.
      
      This also prepares for support of DRM_BRIDGE_ATTACH_NO_CONNECTOR, which
      requires all components in the display pipeline to be represented by
      bridges.
      Signed-off-by: default avatarLaurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
      Reviewed-by: default avatarJagan Teki <jagan@amarulasolutions.com>
      Signed-off-by: default avatarRobert Foss <robert.foss@linaro.org>
      Link: https://patchwork.freedesktop.org/patch/msgid/20210624000304.16281-5-laurent.pinchart+renesas@ideasonboard.com
      4e5763f0
    • Laurent Pinchart's avatar
    • Laurent Pinchart's avatar
    • Laurent Pinchart's avatar
      dt-bindings: drm/bridge: ti-sn65dsi8: Make enable GPIO optional · 07157867
      Laurent Pinchart authored
      The SN65DSI86 EN pin can be hardwired to a high level, or connected to a
      global reset signal, not controllable by the kernel. Make it optional in
      those cases.
      Signed-off-by: default avatarLaurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
      Reviewed-by: default avatarJagan Teki <jagan@amarulasolutions.com>
      Reviewed-by: default avatarStephen Boyd <swboyd@chromium.org>
      Reviewed-by: default avatarDouglas Anderson <dianders@chromium.org>
      Acked-by: default avatarRob Herring <robh@kernel.org>
      Signed-off-by: default avatarRobert Foss <robert.foss@linaro.org>
      Link: https://patchwork.freedesktop.org/patch/msgid/20210624000304.16281-2-laurent.pinchart+renesas@ideasonboard.com
      07157867
    • Boris Brezillon's avatar
      drm/panfrost: Make sure MMU context lifetime is not bound to panfrost_priv · 7fdc48cc
      Boris Brezillon authored
      Jobs can be in-flight when the file descriptor is closed (either because
      the process did not terminate properly, or because it didn't wait for
      all GPU jobs to be finished), and apparently panfrost_job_close() does
      not cancel already running jobs. Let's refcount the MMU context object
      so it's lifetime is no longer bound to the FD lifetime and running jobs
      can finish properly without generating spurious page faults.
      Reported-by: default avatarIcecream95 <ixn@keemail.me>
      Fixes: 7282f764 ("drm/panfrost: Implement per FD address spaces")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarBoris Brezillon <boris.brezillon@collabora.com>
      Reviewed-by: default avatarSteven Price <steven.price@arm.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20210621133907.1683899-2-boris.brezillon@collabora.com
      7fdc48cc
  2. 23 Jun, 2021 23 commits
  3. 22 Jun, 2021 7 commits