Commits · a0d3fdb628b83e3a24acbf6915ede9359a1ecc2b · Kirill Smelkov / linux

21 Dec, 2020 1 commit

drm/i915/gt: Split logical ring contexts from execlist submission · a0d3fdb6

Chris Wilson authored Dec 19, 2020

Split the definition, construction and updating of the Logical Ring
Context from the execlist submission interface. The LRC is used by the
HW, irrespective of our different submission backends.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201219020343.22681-1-chris@chris-wilson.co.uk

a0d3fdb6

20 Dec, 2020 1 commit

drm/i915/gt: Another tweak for flushing the tasklets · 5ec17c76

Chris Wilson authored Dec 20, 2020

tasklet_kill() ensures that we _yield_ the processor until a remote
tasklet is completed. However, this leads to a starvation condition as
being at the bottom of the scheduler's runqueue means that anything else
is able to run, including all hogs keeping the tasklet occupied.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201220134858.10510-1-chris@chris-wilson.co.uk

5ec17c76

18 Dec, 2020 3 commits

drm/i915: Check for rq->hwsp validity after acquiring RCU lock · 9bb36cf6

Chris Wilson authored Dec 18, 2020

Since we allow removing the timeline map at runtime, there is a risk
that rq->hwsp points into a stale page. To control that risk, we hold
the RCU read lock while reading *rq->hwsp, but we missed a couple of
important barriers. First, the unpinning / removal of the timeline map
must be after all RCU readers into that map are complete, i.e. after an
rcu barrier (in this case courtesy of call_rcu()). Secondly, we must
make sure that the rq->hwsp we are about to dereference under the RCU
lock is valid. In this case, we make the rq->hwsp pointer safe during
i915_request_retire() and so we know that rq->hwsp may become invalid
only after the request has been signaled. Therefore is the request is
not yet signaled when we acquire rq->hwsp under the RCU, we know that
rq->hwsp will remain valid for the duration of the RCU read lock.

This is a very small window that may lead to either considering the
request not completed (causing a delay until the request is checked
again, any wait for the request is not affected) or dereferencing an
invalid pointer.

Fixes: 3adac468 ("drm/i915: Introduce concept of per-timeline (context) HWSP")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: <stable@vger.kernel.org> # v5.1+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201218122421.18344-1-chris@chris-wilson.co.uk

9bb36cf6

drm/i915/tgl: Add bound checks and simplify TGL REVID macros · 0a982c15

Aditya Swarup authored Dec 02, 2020

Add bound checks for TGL REV ID array. Since, there might
be a possibility of using older kernels on latest platform
revisions, resulting in out of bounds access for rev ID array.
In this scenario, use the latest rev ID available and apply
those WAs.

Also, modify GT macros for TGL rev ID to reuse tgl_revids_get().

Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Aditya Swarup <aditya.swarup@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201203072359.156682-2-aditya.swarup@intel.com

0a982c15

drm/i915/tgl: Fix REVID macros for TGL to fetch correct stepping · 83dbd74f

Aditya Swarup authored Dec 02, 2020

Fix TGL REVID macros to fetch correct display/gt stepping based
on SOC rev id from INTEL_REVID() macro. Previously, we were just
returning the first element of the revid array instead of using
the correct index based on SOC rev id.

Fixes: c33298cb ("drm/i915/tgl: Fix stepping WA matching")
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Aditya Swarup <aditya.swarup@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201203072359.156682-1-aditya.swarup@intel.com

83dbd74f

17 Dec, 2020 2 commits

drm/i915/gt: Track the overall awake/busy time · 8c3b1ba0

Chris Wilson authored Dec 15, 2020

Since we wake the GT up before executing a request, and go to sleep as
soon as it is retired, the GT wake time not only represents how long the
device is powered up, but also provides a summary, albeit an overestimate,
of the device runtime (i.e. the rc0 time to compare against rc6 time).

v2: s/busy/awake/
v3: software-gt-awake-time and I915_PMU_SOFTWARE_GT_AWAKE_TIME
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reported-by: kernel test robot <oliver.sang@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201215154456.13954-1-chris@chris-wilson.co.uk

8c3b1ba0

drm/i915/gt: Drain the breadcrumbs just once · e3ed90b8

Chris Wilson authored Dec 17, 2020

Matthew Brost pointed out that the while-loop on a shared breadcrumb was
inherently fraught with danger as it competed with the other users of
the breadcrumbs. However, in order to completely drain the re-arming irq
worker, the while-loop is a necessity, despite my optimism that we could
force cancellation with a couple of irq_work invocations.

Given that we can't merely drop the while-loop, use an activity counter on
the breadcrumbs to detect when we are parking the breadcrumbs for the
last time.

Based on a patch by Matthew Brost.
Reported-by: Matthew Brost <matthew.brost@intel.com>
Suggested-by: Matthew Brost <matthew.brost@intel.com>
Fixes: 9d5612ca ("drm/i915/gt: Defer enabling the breadcrumb interrupt to after submission")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201217091524.10258-1-chris@chris-wilson.co.uk

e3ed90b8

16 Dec, 2020 4 commits

drm/i915: Encode fence specific waitqueue behaviour into the wait.flags · 460d02ba

Chris Wilson authored Dec 16, 2020

Use the wait_queue_entry.flags to denote the special fence behaviour
(flattening continuations along fence chains, and for propagating
errors) rather than trying to detect ordinary waiters by their
functions.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201216165850.25030-1-chris@chris-wilson.co.uk

460d02ba

drm/i915/gt: Move gen8 CS emitters into gen8_engine_cs.h · 45233ab2

Chris Wilson authored Dec 16, 2020

Reduce the pollution of intel_engine.h by moving gen8_emit_pipe_control
and friends to gen8_engine_cs.h
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201216135452.6063-1-chris@chris-wilson.co.uk

45233ab2

drm/i915/gem: Drop free_work for GEM contexts · f8246cf4

Chris Wilson authored Dec 15, 2020

The free_list and worker was introduced in commit 5f09a9c8 ("drm/i915:
Allow contexts to be unreferenced locklessly"), but subsequently made
redundant by the removal of the last sleeping lock in commit 2935ed53
("drm/i915: Remove logical HW ID"). As we can now free the GEM context
immediately from any context, remove the deferral of the free_list

v2: Lift removing the context from the global list into close().
Suggested-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201215152138.8158-1-chris@chris-wilson.co.uk

f8246cf4

drm/i915: Fix mismatch between misplaced vma check and vma insert · 5f22cc0b

Chris Wilson authored Dec 16, 2020

When inserting a VMA, we restrict the placement to the low 4G unless the
caller opts into using the full range. This was done to allow usersapce
the opportunity to transition slowly from a 32b address space, and to
avoid breaking inherent 32b assumptions of some commands.

However, for insert we limited ourselves to 4G-4K, but on verification
we allowed the full 4G. This causes some attempts to bind a new buffer
to sporadically fail with -ENOSPC, but at other times be bound
successfully.

commit 48ea1e32 ("drm/i915/gen9: Set PIN_ZONE_4G end to 4GB - 1
page") suggests that there is a genuine problem with stateless addressing
that cannot utilize the last page in 4G and so we purposefully excluded
it. This means that the quick pin pass may cause us to utilize a buggy
placement.
Reported-by: CQ Tang <cq.tang@intel.com>
Testcase: igt/gem_exec_params/larger-than-life-batch
Fixes: 48ea1e32 ("drm/i915/gen9: Set PIN_ZONE_4G end to 4GB - 1 page")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: CQ Tang <cq.tang@intel.com>
Reviewed-by: CQ Tang <cq.tang@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Cc: <stable@vger.kernel.org> # v4.5+
Link: https://patchwork.freedesktop.org/patch/msgid/20201216092951.7124-1-chris@chris-wilson.co.uk

5f22cc0b

15 Dec, 2020 1 commit

doc: Fix build of documentation after i915 file rename · 3b7bc18b

José Roberto de Souza authored Dec 14, 2020

Commit 70a2b431 ("drm/i915/gt: Rename lrc.c to
execlists_submission.c") renamed intel_lrc.c to
intel_execlists_submission.c but forgot to update i915.rst.

Fixes: 70a2b431 ("drm/i915/gt: Rename lrc.c to execlists_submission.c")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201214185440.243537-1-jose.souza@intel.com

3b7bc18b

14 Dec, 2020 3 commits

drm/i915/pmu: Remove !CONFIG_PM code · c41ce819

Tvrtko Ursulin authored Dec 14, 2020

Chris spotted that since 16ffe73c ("drm/i915/pmu: Use GT parked for
estimating RC6 while asleep") we don't rely on runtime pm internals when
estimating RC6 while asleep. We can remove the ifdef code to simplify and
at the same time wake up the device less when querying RC6 if CONFIG_PM is
not compiled in.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
References: 16ffe73c ("drm/i915/pmu: Use GT parked for estimating RC6 while asleep")
Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201214094349.3563876-3-tvrtko.ursulin@linux.intel.com

c41ce819

drm/i915/pmu: Use raw clock for rc6 estimation · c51c29fb

Tvrtko Ursulin authored Dec 14, 2020

RC6 is a hardware counter and as such estimating it using the raw clock
during runtime suspend is more appropriate.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
References: 34f43927 ("perf: Add per event clockid support")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201214094349.3563876-2-tvrtko.ursulin@linux.intel.com

c51c29fb

drm/i915/pmu: Don't grab wakeref when enabling events · dbe13ae1

Tvrtko Ursulin authored Dec 14, 2020

Chris found a CI report which points out calling intel_runtime_pm_get from
inside i915_pmu_enable hook is not allowed since it can be invoked from
hard irq context. This is something we knew but forgot, so lets fix it
once again.

We do this by syncing the internal book keeping with hardware rc6 counter
on driver load.

v2:
 * Always sync on parking and fully sync on init.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: f4e9894b ("drm/i915/pmu: Correct the rc6 offset upon enabling")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201214094349.3563876-1-tvrtko.ursulin@linux.intel.com

dbe13ae1

10 Dec, 2020 3 commits

drm/i915/gt: Wean workaround selftests off GEM context · 04adaba8

Chris Wilson authored Dec 10, 2020

The workarounds are tied to the GT and we should derive the tests local
to the GT.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201210080240.24529-2-chris@chris-wilson.co.uk

04adaba8

drm/i915/gt: Mark legacy ring context as lost · 20a6774e

Chris Wilson authored Dec 10, 2020

When we reset the legacy ring context, due to potential corruption over
suspend/resume, remove the valid bit so that we avoid loading garbage.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201210080240.24529-1-chris@chris-wilson.co.uk

20a6774e

drm/i915: Correct location of Wa_1408615072 · c97ffd08

John Harrison authored Dec 10, 2020

The above workaround was added as an engine workaround not a GT
workaround. Moved it to the correct location.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201210170615.3107266-1-lucas.demarchi@intel.com

c97ffd08

09 Dec, 2020 9 commits

drm/i915: split gen8+ flush and bb_start emission functions · d0d829e5

Daniele Ceraolo Spurio authored Dec 09, 2020

These functions are independent from the backend used and can therefore
be split out of the exelists submission file, so they can be re-used by
the upcoming GuC submission backend.

Based on a patch by Chris Wilson.
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201209233618.4287-3-chris@chris-wilson.co.ukSigned-off-by: Chris Wilson <chris@chris-wilson.co.uk>

d0d829e5

drm/i915/gt: Rename lrc.c to execlists_submission.c · 70a2b431

Chris Wilson authored Dec 09, 2020

We want to separate the utility functions for controlling the logical
ring context from the execlists submission mechanism (which is an
overgrown scheduler).

This is similar to Daniele's work to split up the files, but being
selfish I wanted to base it after my own changes to intel_lrc.c petered
out.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201209233618.4287-2-chris@chris-wilson.co.uk

70a2b431

drm/i915/gt: Move move context layout registers and offsets to lrc_reg.h · 9fd96c06

Chris Wilson authored Dec 09, 2020

Cleanup intel_lrc.h by moving some of the residual common register
definitions into intel_lrc_reg.h, prior to rebranding and splitting off
the submission backends.

v2: keep the SCHEDULE enum in the old file, since it is specific to the
gvt usage of the execlists submission backend (John)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> #v2
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201209233618.4287-1-chris@chris-wilson.co.uk

9fd96c06

drm/i915/gt: Remove uninterruptible parameter from intel_gt_wait_for_idle · 51c87fa6

Chris Wilson authored Dec 09, 2020

Now that the only user of the uninterruptible wait was eliminated,
remove the support.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201209164008.5487-3-chris@chris-wilson.co.uk

51c87fa6

drm/i915: Sleep around performing iommu unmaps on Tigerlake · 84361529

Chris Wilson authored Dec 09, 2020

Tigerlake is plagued by spontaneous DMAR faults [reason 7, next page
table ptr is invalid] which lead to GPU hangs. These faults occur when
an iommu map is immediately reused. Adding further clflushes and
barriers around either the GTT PTE or iommu PTE updates do not prevent
the faults. So far the only effect has been from inducing a delay
between reuse of the iommu on the GPU, and applying the delay at the
iommu map allows for the smallest stable delay.

Note that such a delay is hideous and clearly does not fix the root cause,
and so should only be a bandaid until a complete solution is found. The
delay was determined by running igt/gem_exec_fence/parallel in a loop for
a few hours (unpatched MTBF is about 10s).

We have also seen such DMAR fault [reason 7] errors on other platforms,
notably gen9-gen11, but so far it has only been trivially and
consistently reproduced on Tigerlake.

v2: Leave a tell-tale to know when we apply the vt'd quirk, and as a
reminder to remove it again. Hopefully.

Testcase: igt/gem_exec_fence/parallel
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201209164008.5487-2-chris@chris-wilson.co.uk

84361529

drm/i915: Remove livelock from "do_idle_maps" vtd w/a · 63de1da1

Chris Wilson authored Dec 09, 2020

A call to wait for the GT to idle from inside the put_pages fallback is
prone to cause an uninterruptible livelock. As it does not provide
adequate serialisation with new requests, simply fallback to a trivial
sleep.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201209164008.5487-1-chris@chris-wilson.co.uk

63de1da1

drm/i915/gt: document masked registers · 338d58cf

Lucas De Marchi authored Dec 08, 2020

Document what a masked register is according to bspec so we avoid
developers using the wrong functions to implement WAs.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201209045246.2905675-3-lucas.demarchi@intel.com

338d58cf

drm/i915/gt: rename wa_write_masked_or() · 305b3bb5

Lucas De Marchi authored Dec 08, 2020

The use of "masked" in this function is due to its history. Once upon a
time it received a mask and a value as parameter. Since
commit eeec73f8 ("drm/i915/gt: Skip rmw for masked registers")
that is not true anymore and now there is a clear and a set parameter.
Depending on the case, that can still be thought as a mask and value,
but there are some subtle differences: what we clear doesn't need to be
the same bits we are setting, particularly when we are using masked
registers.

The fact that we also have "masked registers", i.e. registers whose mask
is stored in the upper 16 bits of the register, makes it even more
confusing, because "masked" in wa_write_masked_or() has little to do
with masked registers, but rather refers to the old mask parameter the
function received (that can also, but not exclusively, be used to write
to masked register).

Avoid the ambiguity and misnomer by renaming it to something else,
hopefully less confusing: wa_write_clr_set(), to designate that we are
doing both clr and set operations in the register.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201209045246.2905675-2-lucas.demarchi@intel.com

305b3bb5

drm/i915/gt: stop ignoring read with wa_masked_field_set · 61b3b0d1

Lucas De Marchi authored Dec 08, 2020

When using masked registers, there is nothing to clear since a masked
register has the mask in the upper 16b: we can just write to the
location we want and use the mask to control what bits we are writing
to.

However that doesn't mean we don't want to read back the register and
check the value actually matched what we wanted to write, i.e. that
the WA stick. That should be an explicit opt-out for registers that are
either write-only or that are affected by hardware misbehavior.

Moreover both wa_masked_en() and wa_masked_dis() check the WA stick, so
skipping the check just because the field is more than 1 bit is
surprising and error-prone.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201209045246.2905675-1-lucas.demarchi@intel.com

61b3b0d1

08 Dec, 2020 1 commit

drm/i915/gem: Drop false !i915_vma_is_closed assertion · e9f4829f

Chris Wilson authored Dec 07, 2020

Closed vma are protected by the GT wakeref held as we lookup the vma, so
we know that the vma will not be freed as we process it for the execbuf.
Instead we expect to catch the closed status of the context, and simply
allow the close-race on an individual vma to be washed away.

Longer term, the GT wakeref protection will be removed by explicit
vma.kref tracking.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2245Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201207193824.18114-1-chris@chris-wilson.co.uk

e9f4829f

07 Dec, 2020 2 commits

drm/i915/selftests: Improve error reporting for igt_mock_max_segment · 4f963d36

Chris Wilson authored Dec 07, 2020

When we fail to find a single block large enough to require splitting,
report the largest block we did find.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Ramalingam C <ramalingam.c@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201207130346.11849-1-chris@chris-wilson.co.uk

4f963d36

drm/i915: fix size_t greater or equal to zero comparison · e70956a2

Colin Ian King authored Oct 02, 2020

Currently the check that the unsigned size_t variable i is >= 0
is always true because the unsigned variable will never be negative,
causing the loop to run forever. Fix this by changing the
pre-decrement check to a zero check on i followed by a decrement of i.

Addresses-Coverity: ("Unsigned compared against 0")
Fixes: bfed6708 ("drm/i915: use vmap in shmem_pin_map")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201002170354.94627-1-colin.king@canonical.com

e70956a2

05 Dec, 2020 4 commits

drm/i915: remove WA_SET_FIELD_MASKED() · 6ca07255

Lucas De Marchi authored Dec 05, 2020

Remove the last macro and implement it as a function like the rest of
the operations that don't assume there is a `wal` list, but rather
receive it as argument.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201205092542.2325477-4-lucas.demarchi@intel.com

6ca07255

drm/i915: remove WA_CLR_BIT_MASKED() · 66901614

Lucas De Marchi authored Dec 05, 2020

Just ommitting the list it's operating on doesn't save much typing
and adds another way to do the same thing. Just replace it with
wa_masked_dis().
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201205092542.2325477-3-lucas.demarchi@intel.com

66901614

drm/i915: remove WA_SET_BIT_MASKED() · b9bdccd5

Lucas De Marchi authored Dec 05, 2020

Just ommitting the list it's operating on doesn't save much typing and
adds another way to do the same thing. Just replace it with
wa_masked_en().
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201205092542.2325477-2-lucas.demarchi@intel.com

b9bdccd5

drm/i915/dg1: Implement WA_16011163337 · 1efa473e

Swathi Dhanavanthri authored Dec 05, 2020

Set GS Timer to 224 to prevent a HS/DS hang.

Bspec: 53508

v2: reword commit message and add comment explaining why read
verification is ignored (Chris)

Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Swathi Dhanavanthri <swathi.dhanavanthri@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201205092542.2325477-1-lucas.demarchi@intel.com

1efa473e

04 Dec, 2020 5 commits

drm/i915/gt: Clear the execlists timers upon reset · f867b66e

Chris Wilson authored Dec 04, 2020

Across a reset, we stop the engine but not the timers. This leaves a
window where the timers have inconsistent state with the engine, but
should only result in a spurious timeout. As we cancel the outstanding
events, also cancel their timers.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201204151234.19729-4-chris@chris-wilson.co.uk

f867b66e

drm/i915/gt: Include reset failures in the trace · cb56a07d

Chris Wilson authored Dec 04, 2020

The GT and engine reset failures are completely invisible when looking at
a trace for a bug, but are vital to understanding the incomplete flow.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201204151234.19729-3-chris@chris-wilson.co.uk

cb56a07d

drm/i915/gt: Cancel the preemption timeout on responding to it · d997e240

Chris Wilson authored Dec 04, 2020

We currently presume that the engine reset is successful, cancelling the
expired preemption timer in the process. However, engine resets can
fail, leaving the timeout still pending and we will then respond to the
timeout again next time the tasklet fires. What we want is for the
failed engine reset to be promoted to a full device reset, which is
kicked by the heartbeat once the engine stops processing events.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1168
Fixes: 3a7a92ab ("drm/i915/execlists: Force preemption")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: <stable@vger.kernel.org> # v5.5+
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201204151234.19729-2-chris@chris-wilson.co.uk

d997e240

drm/i915/gt: Ignore repeated attempts to suspend request flow across reset · b9695405

Chris Wilson authored Dec 04, 2020

Before reseting the engine, we suspend the execution of the guilty
request, so that we can continue execution with a new context while we
slowly compress the captured error state for the guilty context. However,
if the reset fails, we will promptly attempt to reset the same request
again, and discover the ongoing capture. Ignore the second attempt to
suspend and capture the same request.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1168
Fixes: 32ff621f ("drm/i915/gt: Allow temporary suspension of inflight requests")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: <stable@vger.kernel.org> # v5.7+
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201204151234.19729-1-chris@chris-wilson.co.uk

b9695405

drm/i915/gem: Propagate error from cancelled submit due to context closure · ba38b79e

Chris Wilson authored Dec 03, 2020

In the course of discovering and closing many races with context closure
and execbuf submission, since commit 61231f6b ("drm/i915/gem: Check
that the context wasn't closed during setup") we started checking that
the context was not closed by another userspace thread during the execbuf
ioctl. In doing so we cancelled the inflight request (by telling it to be
skipped), but kept reporting success since we do submit a request, albeit
one that doesn't execute. As the error is known before we return from the
ioctl, we can report the error we detect immediately, rather than leave
it on the fence status. With the immediate propagation of the error, it
is easier for userspace to handle.

Fixes: 61231f6b ("drm/i915/gem: Check that the context wasn't closed during setup")
Testcase: igt/gem_ctx_exec/basic-close-race
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: <stable@vger.kernel.org> # v5.7+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201203103432.31526-1-chris@chris-wilson.co.uk

ba38b79e

03 Dec, 2020 1 commit

drm/i915/gem: Check the correct variable in selftest · 14f2d760

Dan Carpenter authored Dec 03, 2020

There is a copy and paste bug in this code.  It's supposed to check
"obj2" instead of checking "obj" a second time.

Fixes: 80f0b679 ("drm/i915: Add an implementation for i915_gem_ww_ctx locking, v2.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/8ilneOcJAjwqU4t@mwand

14f2d760