Commits · 460d02ba50763e73da033448d0d57f0aea23d039 · Kirill Smelkov / linux

16 Dec, 2020 4 commits

drm/i915: Encode fence specific waitqueue behaviour into the wait.flags · 460d02ba

Chris Wilson authored Dec 16, 2020

Use the wait_queue_entry.flags to denote the special fence behaviour
(flattening continuations along fence chains, and for propagating
errors) rather than trying to detect ordinary waiters by their
functions.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201216165850.25030-1-chris@chris-wilson.co.uk

460d02ba

drm/i915/gt: Move gen8 CS emitters into gen8_engine_cs.h · 45233ab2

Chris Wilson authored Dec 16, 2020

Reduce the pollution of intel_engine.h by moving gen8_emit_pipe_control
and friends to gen8_engine_cs.h
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201216135452.6063-1-chris@chris-wilson.co.uk

45233ab2

drm/i915/gem: Drop free_work for GEM contexts · f8246cf4

Chris Wilson authored Dec 15, 2020

The free_list and worker was introduced in commit 5f09a9c8 ("drm/i915:
Allow contexts to be unreferenced locklessly"), but subsequently made
redundant by the removal of the last sleeping lock in commit 2935ed53
("drm/i915: Remove logical HW ID"). As we can now free the GEM context
immediately from any context, remove the deferral of the free_list

v2: Lift removing the context from the global list into close().
Suggested-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201215152138.8158-1-chris@chris-wilson.co.uk

f8246cf4

drm/i915: Fix mismatch between misplaced vma check and vma insert · 5f22cc0b

Chris Wilson authored Dec 16, 2020

When inserting a VMA, we restrict the placement to the low 4G unless the
caller opts into using the full range. This was done to allow usersapce
the opportunity to transition slowly from a 32b address space, and to
avoid breaking inherent 32b assumptions of some commands.

However, for insert we limited ourselves to 4G-4K, but on verification
we allowed the full 4G. This causes some attempts to bind a new buffer
to sporadically fail with -ENOSPC, but at other times be bound
successfully.

commit 48ea1e32 ("drm/i915/gen9: Set PIN_ZONE_4G end to 4GB - 1
page") suggests that there is a genuine problem with stateless addressing
that cannot utilize the last page in 4G and so we purposefully excluded
it. This means that the quick pin pass may cause us to utilize a buggy
placement.
Reported-by: CQ Tang <cq.tang@intel.com>
Testcase: igt/gem_exec_params/larger-than-life-batch
Fixes: 48ea1e32 ("drm/i915/gen9: Set PIN_ZONE_4G end to 4GB - 1 page")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: CQ Tang <cq.tang@intel.com>
Reviewed-by: CQ Tang <cq.tang@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Cc: <stable@vger.kernel.org> # v4.5+
Link: https://patchwork.freedesktop.org/patch/msgid/20201216092951.7124-1-chris@chris-wilson.co.uk

5f22cc0b

15 Dec, 2020 1 commit

doc: Fix build of documentation after i915 file rename · 3b7bc18b

José Roberto de Souza authored Dec 14, 2020

Commit 70a2b431 ("drm/i915/gt: Rename lrc.c to
execlists_submission.c") renamed intel_lrc.c to
intel_execlists_submission.c but forgot to update i915.rst.

Fixes: 70a2b431 ("drm/i915/gt: Rename lrc.c to execlists_submission.c")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201214185440.243537-1-jose.souza@intel.com

3b7bc18b

14 Dec, 2020 3 commits

drm/i915/pmu: Remove !CONFIG_PM code · c41ce819

Tvrtko Ursulin authored Dec 14, 2020

Chris spotted that since 16ffe73c ("drm/i915/pmu: Use GT parked for
estimating RC6 while asleep") we don't rely on runtime pm internals when
estimating RC6 while asleep. We can remove the ifdef code to simplify and
at the same time wake up the device less when querying RC6 if CONFIG_PM is
not compiled in.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
References: 16ffe73c ("drm/i915/pmu: Use GT parked for estimating RC6 while asleep")
Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201214094349.3563876-3-tvrtko.ursulin@linux.intel.com

c41ce819

drm/i915/pmu: Use raw clock for rc6 estimation · c51c29fb

Tvrtko Ursulin authored Dec 14, 2020

RC6 is a hardware counter and as such estimating it using the raw clock
during runtime suspend is more appropriate.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
References: 34f43927 ("perf: Add per event clockid support")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201214094349.3563876-2-tvrtko.ursulin@linux.intel.com

c51c29fb

drm/i915/pmu: Don't grab wakeref when enabling events · dbe13ae1

Tvrtko Ursulin authored Dec 14, 2020

Chris found a CI report which points out calling intel_runtime_pm_get from
inside i915_pmu_enable hook is not allowed since it can be invoked from
hard irq context. This is something we knew but forgot, so lets fix it
once again.

We do this by syncing the internal book keeping with hardware rc6 counter
on driver load.

v2:
 * Always sync on parking and fully sync on init.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: f4e9894b ("drm/i915/pmu: Correct the rc6 offset upon enabling")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201214094349.3563876-1-tvrtko.ursulin@linux.intel.com

dbe13ae1

10 Dec, 2020 3 commits

drm/i915/gt: Wean workaround selftests off GEM context · 04adaba8

Chris Wilson authored Dec 10, 2020

The workarounds are tied to the GT and we should derive the tests local
to the GT.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201210080240.24529-2-chris@chris-wilson.co.uk

04adaba8

drm/i915/gt: Mark legacy ring context as lost · 20a6774e

Chris Wilson authored Dec 10, 2020

When we reset the legacy ring context, due to potential corruption over
suspend/resume, remove the valid bit so that we avoid loading garbage.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201210080240.24529-1-chris@chris-wilson.co.uk

20a6774e

drm/i915: Correct location of Wa_1408615072 · c97ffd08

John Harrison authored Dec 10, 2020

The above workaround was added as an engine workaround not a GT
workaround. Moved it to the correct location.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201210170615.3107266-1-lucas.demarchi@intel.com

c97ffd08

09 Dec, 2020 9 commits

drm/i915: split gen8+ flush and bb_start emission functions · d0d829e5

Daniele Ceraolo Spurio authored Dec 09, 2020

These functions are independent from the backend used and can therefore
be split out of the exelists submission file, so they can be re-used by
the upcoming GuC submission backend.

Based on a patch by Chris Wilson.
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201209233618.4287-3-chris@chris-wilson.co.ukSigned-off-by: Chris Wilson <chris@chris-wilson.co.uk>

d0d829e5

drm/i915/gt: Rename lrc.c to execlists_submission.c · 70a2b431

Chris Wilson authored Dec 09, 2020

We want to separate the utility functions for controlling the logical
ring context from the execlists submission mechanism (which is an
overgrown scheduler).

This is similar to Daniele's work to split up the files, but being
selfish I wanted to base it after my own changes to intel_lrc.c petered
out.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201209233618.4287-2-chris@chris-wilson.co.uk

70a2b431

drm/i915/gt: Move move context layout registers and offsets to lrc_reg.h · 9fd96c06

Chris Wilson authored Dec 09, 2020

Cleanup intel_lrc.h by moving some of the residual common register
definitions into intel_lrc_reg.h, prior to rebranding and splitting off
the submission backends.

v2: keep the SCHEDULE enum in the old file, since it is specific to the
gvt usage of the execlists submission backend (John)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> #v2
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201209233618.4287-1-chris@chris-wilson.co.uk

9fd96c06

drm/i915/gt: Remove uninterruptible parameter from intel_gt_wait_for_idle · 51c87fa6

Chris Wilson authored Dec 09, 2020

Now that the only user of the uninterruptible wait was eliminated,
remove the support.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201209164008.5487-3-chris@chris-wilson.co.uk

51c87fa6

drm/i915: Sleep around performing iommu unmaps on Tigerlake · 84361529

Chris Wilson authored Dec 09, 2020

Tigerlake is plagued by spontaneous DMAR faults [reason 7, next page
table ptr is invalid] which lead to GPU hangs. These faults occur when
an iommu map is immediately reused. Adding further clflushes and
barriers around either the GTT PTE or iommu PTE updates do not prevent
the faults. So far the only effect has been from inducing a delay
between reuse of the iommu on the GPU, and applying the delay at the
iommu map allows for the smallest stable delay.

Note that such a delay is hideous and clearly does not fix the root cause,
and so should only be a bandaid until a complete solution is found. The
delay was determined by running igt/gem_exec_fence/parallel in a loop for
a few hours (unpatched MTBF is about 10s).

We have also seen such DMAR fault [reason 7] errors on other platforms,
notably gen9-gen11, but so far it has only been trivially and
consistently reproduced on Tigerlake.

v2: Leave a tell-tale to know when we apply the vt'd quirk, and as a
reminder to remove it again. Hopefully.

Testcase: igt/gem_exec_fence/parallel
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201209164008.5487-2-chris@chris-wilson.co.uk

84361529

drm/i915: Remove livelock from "do_idle_maps" vtd w/a · 63de1da1

Chris Wilson authored Dec 09, 2020

A call to wait for the GT to idle from inside the put_pages fallback is
prone to cause an uninterruptible livelock. As it does not provide
adequate serialisation with new requests, simply fallback to a trivial
sleep.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201209164008.5487-1-chris@chris-wilson.co.uk

63de1da1

drm/i915/gt: document masked registers · 338d58cf

Lucas De Marchi authored Dec 08, 2020

Document what a masked register is according to bspec so we avoid
developers using the wrong functions to implement WAs.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201209045246.2905675-3-lucas.demarchi@intel.com

338d58cf

drm/i915/gt: rename wa_write_masked_or() · 305b3bb5

Lucas De Marchi authored Dec 08, 2020

The use of "masked" in this function is due to its history. Once upon a
time it received a mask and a value as parameter. Since
commit eeec73f8 ("drm/i915/gt: Skip rmw for masked registers")
that is not true anymore and now there is a clear and a set parameter.
Depending on the case, that can still be thought as a mask and value,
but there are some subtle differences: what we clear doesn't need to be
the same bits we are setting, particularly when we are using masked
registers.

The fact that we also have "masked registers", i.e. registers whose mask
is stored in the upper 16 bits of the register, makes it even more
confusing, because "masked" in wa_write_masked_or() has little to do
with masked registers, but rather refers to the old mask parameter the
function received (that can also, but not exclusively, be used to write
to masked register).

Avoid the ambiguity and misnomer by renaming it to something else,
hopefully less confusing: wa_write_clr_set(), to designate that we are
doing both clr and set operations in the register.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201209045246.2905675-2-lucas.demarchi@intel.com

305b3bb5

drm/i915/gt: stop ignoring read with wa_masked_field_set · 61b3b0d1

Lucas De Marchi authored Dec 08, 2020

When using masked registers, there is nothing to clear since a masked
register has the mask in the upper 16b: we can just write to the
location we want and use the mask to control what bits we are writing
to.

However that doesn't mean we don't want to read back the register and
check the value actually matched what we wanted to write, i.e. that
the WA stick. That should be an explicit opt-out for registers that are
either write-only or that are affected by hardware misbehavior.

Moreover both wa_masked_en() and wa_masked_dis() check the WA stick, so
skipping the check just because the field is more than 1 bit is
surprising and error-prone.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201209045246.2905675-1-lucas.demarchi@intel.com

61b3b0d1

08 Dec, 2020 1 commit

drm/i915/gem: Drop false !i915_vma_is_closed assertion · e9f4829f

Chris Wilson authored Dec 07, 2020

Closed vma are protected by the GT wakeref held as we lookup the vma, so
we know that the vma will not be freed as we process it for the execbuf.
Instead we expect to catch the closed status of the context, and simply
allow the close-race on an individual vma to be washed away.

Longer term, the GT wakeref protection will be removed by explicit
vma.kref tracking.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2245Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201207193824.18114-1-chris@chris-wilson.co.uk

e9f4829f

07 Dec, 2020 2 commits

drm/i915/selftests: Improve error reporting for igt_mock_max_segment · 4f963d36

Chris Wilson authored Dec 07, 2020

When we fail to find a single block large enough to require splitting,
report the largest block we did find.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Ramalingam C <ramalingam.c@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201207130346.11849-1-chris@chris-wilson.co.uk

4f963d36

drm/i915: fix size_t greater or equal to zero comparison · e70956a2

Colin Ian King authored Oct 02, 2020

Currently the check that the unsigned size_t variable i is >= 0
is always true because the unsigned variable will never be negative,
causing the loop to run forever. Fix this by changing the
pre-decrement check to a zero check on i followed by a decrement of i.

Addresses-Coverity: ("Unsigned compared against 0")
Fixes: bfed6708 ("drm/i915: use vmap in shmem_pin_map")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201002170354.94627-1-colin.king@canonical.com

e70956a2

05 Dec, 2020 4 commits

drm/i915: remove WA_SET_FIELD_MASKED() · 6ca07255

Lucas De Marchi authored Dec 05, 2020

Remove the last macro and implement it as a function like the rest of
the operations that don't assume there is a `wal` list, but rather
receive it as argument.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201205092542.2325477-4-lucas.demarchi@intel.com

6ca07255

drm/i915: remove WA_CLR_BIT_MASKED() · 66901614

Lucas De Marchi authored Dec 05, 2020

Just ommitting the list it's operating on doesn't save much typing
and adds another way to do the same thing. Just replace it with
wa_masked_dis().
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201205092542.2325477-3-lucas.demarchi@intel.com

66901614

drm/i915: remove WA_SET_BIT_MASKED() · b9bdccd5

Lucas De Marchi authored Dec 05, 2020

Just ommitting the list it's operating on doesn't save much typing and
adds another way to do the same thing. Just replace it with
wa_masked_en().
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201205092542.2325477-2-lucas.demarchi@intel.com

b9bdccd5

drm/i915/dg1: Implement WA_16011163337 · 1efa473e

Swathi Dhanavanthri authored Dec 05, 2020

Set GS Timer to 224 to prevent a HS/DS hang.

Bspec: 53508

v2: reword commit message and add comment explaining why read
verification is ignored (Chris)

Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Swathi Dhanavanthri <swathi.dhanavanthri@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201205092542.2325477-1-lucas.demarchi@intel.com

1efa473e

04 Dec, 2020 5 commits

drm/i915/gt: Clear the execlists timers upon reset · f867b66e

Chris Wilson authored Dec 04, 2020

Across a reset, we stop the engine but not the timers. This leaves a
window where the timers have inconsistent state with the engine, but
should only result in a spurious timeout. As we cancel the outstanding
events, also cancel their timers.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201204151234.19729-4-chris@chris-wilson.co.uk

f867b66e

drm/i915/gt: Include reset failures in the trace · cb56a07d

Chris Wilson authored Dec 04, 2020

The GT and engine reset failures are completely invisible when looking at
a trace for a bug, but are vital to understanding the incomplete flow.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201204151234.19729-3-chris@chris-wilson.co.uk

cb56a07d

drm/i915/gt: Cancel the preemption timeout on responding to it · d997e240

Chris Wilson authored Dec 04, 2020

We currently presume that the engine reset is successful, cancelling the
expired preemption timer in the process. However, engine resets can
fail, leaving the timeout still pending and we will then respond to the
timeout again next time the tasklet fires. What we want is for the
failed engine reset to be promoted to a full device reset, which is
kicked by the heartbeat once the engine stops processing events.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1168
Fixes: 3a7a92ab ("drm/i915/execlists: Force preemption")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: <stable@vger.kernel.org> # v5.5+
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201204151234.19729-2-chris@chris-wilson.co.uk

d997e240

drm/i915/gt: Ignore repeated attempts to suspend request flow across reset · b9695405

Chris Wilson authored Dec 04, 2020

Before reseting the engine, we suspend the execution of the guilty
request, so that we can continue execution with a new context while we
slowly compress the captured error state for the guilty context. However,
if the reset fails, we will promptly attempt to reset the same request
again, and discover the ongoing capture. Ignore the second attempt to
suspend and capture the same request.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1168
Fixes: 32ff621f ("drm/i915/gt: Allow temporary suspension of inflight requests")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: <stable@vger.kernel.org> # v5.7+
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201204151234.19729-1-chris@chris-wilson.co.uk

b9695405

drm/i915/gem: Propagate error from cancelled submit due to context closure · ba38b79e

Chris Wilson authored Dec 03, 2020

In the course of discovering and closing many races with context closure
and execbuf submission, since commit 61231f6b ("drm/i915/gem: Check
that the context wasn't closed during setup") we started checking that
the context was not closed by another userspace thread during the execbuf
ioctl. In doing so we cancelled the inflight request (by telling it to be
skipped), but kept reporting success since we do submit a request, albeit
one that doesn't execute. As the error is known before we return from the
ioctl, we can report the error we detect immediately, rather than leave
it on the fence status. With the immediate propagation of the error, it
is easier for userspace to handle.

Fixes: 61231f6b ("drm/i915/gem: Check that the context wasn't closed during setup")
Testcase: igt/gem_ctx_exec/basic-close-race
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: <stable@vger.kernel.org> # v5.7+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201203103432.31526-1-chris@chris-wilson.co.uk

ba38b79e

03 Dec, 2020 1 commit

drm/i915/gem: Check the correct variable in selftest · 14f2d760

Dan Carpenter authored Dec 03, 2020

There is a copy and paste bug in this code.  It's supposed to check
"obj2" instead of checking "obj" a second time.

Fixes: 80f0b679 ("drm/i915: Add an implementation for i915_gem_ww_ctx locking, v2.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/8ilneOcJAjwqU4t@mwand

14f2d760

02 Dec, 2020 6 commits

Revert "drm/i915/lmem: Limit block size to 4G" · 7d1a31e1

Chris Wilson authored Dec 02, 2020

Mixing I915_ALLOC_CONTIGUOUS and I915_ALLOC_MAX_SEGMENT_SIZE fared
badly. The two directives conflict, with the contiguous request setting
the min_order to the full size of the object, and the max-segment-size
setting the max_order to the limit of the DMA mapper. This results in a
situation where max_order < min_order, causing our sanity checks to
fail.

Instead of limiting the buddy block size, in the previous patch we split
the oversized buddy into multiple scatterlist elements.

Fixes: d2cf0125 ("drm/i915/lmem: Limit block size to 4G")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201202173444.14903-2-chris@chris-wilson.co.uk

7d1a31e1

drm/i915/gem: Limit lmem scatterlist elements to UINT_MAX · a2843b3b

Chris Wilson authored Dec 02, 2020

Adhere to the i915_sg_max_segment() limit on the lengths of individual
scatterlist elements, and in doing so split up very large chunks of lmem
into manageable pieces for the dma-mapping backend.
Reported-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Suggested-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201202173444.14903-1-chris@chris-wilson.co.uk

a2843b3b

drm/i915/selftests: Tidy prng constructor for client blits · 840291a7

Chris Wilson authored Dec 02, 2020

Since we only initialise the prng once within the scope of the selftest,
we can use the default initialiser.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201202130406.18461-1-chris@chris-wilson.co.uk

840291a7

drm/i915/pmu: Deprecate I915_PMU_LAST and optimize state tracking · 348fb0cb

Tvrtko Ursulin authored Dec 01, 2020

Adding any kinds of "last" abi markers is usually a mistake which I
repeated when implementing the PMU because it felt convenient at the time.

This patch marks I915_PMU_LAST as deprecated and stops the internal
implementation using it for sizing the event status bitmask and array.

New way of sizing the fields is a bit less elegant, but it omits reserving
slots for tracking events we are not interested in, and as such saves some
runtime space. Adding sampling events is likely to be a special event and
the new plumbing needed will be easily detected in testing. Existing
asserts against the bitfield and array sizes are keeping the code safe.

First event which gets the new treatment in this new scheme are the
interrupts - which neither needs any tracking in i915 pmu nor needs
waking up the GPU to read it.

v2:
 * Streamline helper names. (Chris)

v3:
 * Comment which events need tracking. (Chris)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201201131757.206367-1-tvrtko.ursulin@linux.intel.com

348fb0cb

drm/i915/gem: Report error for vmap() failure · 37df0edf

Chris Wilson authored Dec 01, 2020

Convert the NULL pointer from a failed vmap() to ERR_PTR(-ENOMEM) for
propagation.

<1> [269.830447] BUG: kernel NULL pointer dereference, address: 0000000000000000
<1> [269.830455] #PF: supervisor write access in kernel mode
<1> [269.830457] #PF: error_code(0x0002) - not-present page
<6> [269.830459] PGD 0 P4D 0
<4> [269.830465] Oops: 0002 [#1] PREEMPT SMP PTI
<4> [269.830469] CPU: 3 PID: 5789 Comm: i915_selftest Tainted: G     U            5.10.0-rc6-CI-CI_DRM_9412+ #1
<4> [269.830472] Hardware name: Intel Corp. Geminilake/GLK RVP2 LP4SD (07), BIOS GELKRVPA.X64.0062.B30.1708222146 08/22/2017
<4> [269.830636] RIP: 0010:igt_client_fill+0x1b9/0x5f0 [i915]
<4> [269.830640] Code: e8 0c 32 02 00 48 89 c5 48 3d 00 f0 ff ff 0f 87 e9 02 00 00 48 8b 8b 78 06 00 00 44 89 f0 48 89 ef 35 af be ad de 48 c1 e9 02 <f3> ab 0f b6 83 80 03 00 00 89 c2 c0 ea 03 83 e2 02 75 09 83 c8 20
<4> [269.830642] RSP: 0018:ffffc900007a79e8 EFLAGS: 00010206
<4> [269.830645] RAX: 00000000df0bf37b RBX: ffff88811d8af3c0 RCX: 00000000010afc00
<4> [269.830647] RDX: 0000000000000000 RSI: ffffffff822f2b17 RDI: 0000000000000000
<4> [269.830648] RBP: 0000000000000000 R08: ffff888111c80930 R09: 00000000fffffffe
<4> [269.830650] R10: 0000000000000000 R11: 00000000ffbc70e4 R12: ffff88811090f700
<4> [269.830652] R13: ffff88810df60180 R14: 0000000001a64dd4 R15: 0000000000000000
<4> [269.830655] FS:  00007f137b07de40(0000) GS:ffff88817b980000(0000) knlGS:0000000000000000
<4> [269.830657] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [269.830659] CR2: 0000000000000000 CR3: 0000000115984000 CR4: 0000000000350ee0
<4> [269.830661] Call Trace:
<4> [269.830780]  __i915_subtests.cold.7+0x42/0x92 [i915]
<4> [269.830886]  ? __i915_nop_teardown+0x10/0x10 [i915]
<4> [269.830989]  ? __i915_live_setup+0x30/0x30 [i915]
<4> [269.831104]  __run_selftests.part.3+0xf7/0x14c [i915]
<4> [269.831939]  i915_live_selftests.cold.5+0x1f/0x47 [i915]
<4> [269.832027]  i915_pci_probe+0x93/0x1d0 [i915]
<4> [269.832037]  ? _raw_spin_unlock_irqrestore+0x2f/0x50
<4> [269.832043]  pci_device_probe+0x9e/0x110
<4> [269.832049]  really_probe+0x1c4/0x430
<4> [269.832053]  driver_probe_device+0xd9/0x140
<4> [269.832056]  device_driver_attach+0x4a/0x50
<4> [269.832059]  __driver_attach+0x83/0x140
<4> [269.832062]  ? device_driver_attach+0x50/0x50
<4> [269.832064]  ? device_driver_attach+0x50/0x50
<4> [269.832067]  bus_for_each_dev+0x75/0xc0
<4> [269.832070]  bus_add_driver+0x14b/0x1f0
<4> [269.832073]  driver_register+0x66/0xb0
<4> [269.832160]  i915_init+0x70/0x87 [i915]
<4> [269.832164]  ? 0xffffffffa05e3000
<4> [269.832168]  do_one_initcall+0x56/0x2e0
<4> [269.832174]  ? kmem_cache_alloc_trace+0x6a4/0x770
<4> [269.832180]  do_init_module+0x55/0x200
<4> [269.832184]  load_module+0x22a2/0x2480
<4> [269.832191]  ? __do_sys_finit_module+0xad/0x110
<4> [269.832194]  __do_sys_finit_module+0xad/0x110
<4> [269.832199]  do_syscall_64+0x33/0x80
<4> [269.832202]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
<4> [269.832204] RIP: 0033:0x7f137a718839
<4> [269.832208] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 1f f6 2c 00 f7 d8 64 89 01 48
<4> [269.832210] RSP: 002b:00007ffc4267d308 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
<4> [269.832214] RAX: ffffffffffffffda RBX: 000056288b88f0d0 RCX: 00007f137a718839
<4> [269.832216] RDX: 0000000000000000 RSI: 000056288b895850 RDI: 0000000000000007
<4> [269.832218] RBP: 000056288b895850 R08: 312d3d7374736574 R09: 000056288b88c020
<4> [269.832220] R10: 00007ffc4267d450 R11: 0000000000000246 R12: 0000000000000000
<4> [269.832222] R13: 000056288b8877a0 R14: 0000000000000020 R15: 0000000000000045
<4> [269.832226] Modules linked in: i915(+) vgem mei_hdcp snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel cdc_ether usbnet snd_intel_dspcfg mii snd_hda_codec snd_hwdep snd_hda_core r8169 snd_pcm realtek mei_me mei prime_numbers intel_lpss_pci i2c_hid pinctrl_geminilake [last unloaded: i915]
<4> [269.832264] CR2: 0000000000000000

Fixes: cb2ce93e ("drm/i915/gem: Differentiate oom failures from invalid map types")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201201215441.31900-1-chris@chris-wilson.co.uk

37df0edf

drm/i915/tgl, rkl, dg1: Apply WA_1406941453 to TGL, RKL and DG1 · 5ac84806

Swathi Dhanavanthri authored Nov 02, 2020

This workaround is applicable only for tgl,rkl and dg1.

Bspec: 52890, 53273, 53508.
Signed-off-by: Swathi Dhanavanthri <swathi.dhanavanthri@intel.com>
Reviewed-by: Clint Taylor <Clinton.A.Taylor@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201201175735.1377372-1-lucas.demarchi@intel.com

5ac84806

30 Nov, 2020 1 commit

drm/i915/gem: Differentiate oom failures from invalid map types · cb2ce93e

Chris Wilson authored Nov 27, 2020

After a cursory check on the parameters to i915_gem_object_pin_map(),
where we return a precise error, if the backend rejects the mapping we
always return PTR_ERR(-ENOMEM). Let us also return a more precise error
here so we can differentiate between running out of memory and
programming errors (or situations where we may be trying different paths
and looking for an error from an unsupported map).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201127195334.13134-1-chris@chris-wilson.co.uk

cb2ce93e