- 08 May, 2020 3 commits
-
-
Chris Wilson authored
The kernel_context does not use initial-breadcrumbs, so when we ask if its requests have started we do so by comparing against the completion seqno of the previous request. This is very imprecise, not precise enough for the defer_request assertion. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1847 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200508104220.9872-1-chris@chris-wilson.co.uk
-
Chris Wilson authored
As a means for a small code consolidation, but primarily to start thinking more carefully about internal-vs-external linkage, pull the pair of i915_sw_fence_await_dma_fence() calls into a common routine. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200508092933.738-2-chris@chris-wilson.co.uk
-
Chris Wilson authored
While we ordinarily do not skip submit-fences due to the accompanying hook that we want to callback on execution, a submit-fence on the same timeline is meaningless. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200508092933.738-1-chris@chris-wilson.co.uk
-
- 07 May, 2020 7 commits
-
-
Mika Kuoppala authored
All engines, the exception being the blitter as it does not care about the form, can access compressed surfaces. So we need to add forced aux table invalidates for those engines. v2: virtual instance masking (Chris) v3: bug on if not found (Chris) References: d248b371 ("drm/i915/gen12: Invalidate aux table entries forcibly") References: bspec#43904, hsdes#1809175790 Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Chuansheng Liu <chuansheng.liu@intel.com> Cc: Rafael Antognolli <rafael.antognolli@intel.com> Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200507142045.8668-1-mika.kuoppala@linux.intel.com
-
Chris Wilson authored
Upon waiting a request (when asked), we gave that request a small priority boost, not enough for it to cause preemption, but enough for it to be scheduled next before all equals. We also used that bit to give new clients a small priority boost, similar to FQ_CODEL, such that we favoured short interactive tasks ahead of long running streams. However, this is causing lots of complications with timeslicing where we both want to honour the boost and yet ignore it. Those complications cause unexpected user behaviour (tasks not being timesliced and run concurrently as expected), and the easiest way to resolve that is to remove the boost. Hopefully, we can find a compromise again if we need to, but in theory timeslicing itself and future more advanced schedulers should give us the interactivity boost we seek. Testcase: igt/gem_exec_schedule/lateslice Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200507152338.7452-3-chris@chris-wilson.co.uk
-
Chris Wilson authored
We recorded the dependencies for WAIT_FOR_SUBMIT in order that we could correctly perform priority inheritance from the parallel branches to the common trunk. However, for the purpose of timeslicing and reset handling, the dependency is weak -- as the pair of requests are allowed to run in parallel and not in strict succession. The real significance though is that this allows us to rearrange groups of WAIT_FOR_SUBMIT linked requests along the single engine, and so can resolve user level inter-batch scheduling dependencies from user semaphores. Fixes: c81471f5 ("drm/i915: Copy across scheduler behaviour flags across submit fences") Testcase: igt/gem_exec_fence/submit Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: <stable@vger.kernel.org> # v5.6+ Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200507155109.8892-1-chris@chris-wilson.co.uk
-
Mika Kuoppala authored
Aux table invalidation can fail on update, so a subsequent access may hit a stale entry. The proposed workaround is to invalidate entries between all batchbuffers. v2: correct register address (Yang) v3: respect the order (Chris) References: bspec#43904, hsdes#1809175790 Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Chuansheng Liu <chuansheng.liu@intel.com> Cc: Rafael Antognolli <rafael.antognolli@intel.com> Cc: Yang A Shi <yang.a.shi@intel.com> Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200506165310.1239-1-mika.kuoppala@linux.intel.com
-
Mika Kuoppala authored
Flush TDL, L3 and EUs. Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200506144734.29297-3-mika.kuoppala@linux.intel.com
-
Mika Kuoppala authored
HDC pipeline flush is a bit on the first dword of the PIPE_CONTROL, not the second. Make it so. v2: function naming (Chris) Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200506144734.29297-2-mika.kuoppala@linux.intel.com
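For context, a hedged sketch of what the corrected emission looks like; the helper name and the six-dword layout are assumptions drawn from the description above, not the exact i915 code:

```c
/* Sketch only: the HDC pipeline flush bit lives in dword 0 of the
 * PIPE_CONTROL packet (the command dword), so it must be OR'ed into the
 * opcode dword rather than the flags dword that follows it.
 */
static u32 *sketch_emit_pipe_control(u32 *cs, u32 flags0, u32 flags1, u32 offset)
{
	*cs++ = GFX_OP_PIPE_CONTROL(6) | flags0; /* dword 0: e.g. HDC pipeline flush */
	*cs++ = flags1;                          /* dword 1: the usual flush/invalidate flags */
	*cs++ = offset;                          /* post-sync write address */
	*cs++ = 0;
	*cs++ = 0;
	*cs++ = 0;

	return cs;
}
```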
-
Mika Kuoppala authored
This reverts commit 62037fff. L3 RO cache invalidation is part of dword0 of the pipe control. It is also not relevant to this gen. Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200506144734.29297-1-mika.kuoppala@linux.intel.com
-
- 06 May, 2020 1 commit
-
-
Chris Wilson authored
We need to preserve fatal errors from fences that are being terminated as we hook them up. Fixes: ef468849 ("drm/i915: Propagate fence errors") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200506162136.3325-1-chris@chris-wilson.co.uk
-
- 05 May, 2020 6 commits
-
-
Matt Roper authored
We need to toggle an SDE chicken bit on and then off as the final step when disabling interrupts in preparation for runtime suspend. Bspec: 33450 Bspec: 8402 Cc: Bob Paauwe <bob.j.paauwe@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200501213701.371443-1-matthew.d.roper@intel.com Reviewed-by: Bob Paauwe <bob.j.paauwe@intel.com>
-
Chris Wilson authored
As we only restore the default context state upon banning a context, we only need enough of the state to run the ring and nothing more. That is, we only need our bare protocontext. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Andi Shyti <andi.shyti@intel.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200504180745.15645-1-chris@chris-wilson.co.uk
-
Chris Wilson authored
If we cannot trust the reset will flush out the CS event queue such that process_csb() reports an accurate view of HW, we will need to search the active and pending contexts to determine which was actually running at the time we issued the reset. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200505084629.31365-1-chris@chris-wilson.co.uk
-
Stanislav Lisovskiy authored
We need new PCode request commands and reply codes to be added as a preparation patch for the QGV point restriction in the new SAGV support. v2: - Extracted those changes into a separate patch (Ville Syrjälä) v3: - Moved new PCode masks to another place from PCode commands (Ville) v4: - Moved new PCode masks to the correspondent PCode command, with indentation (Ville) - Changed naming to ICL_ instead of GEN11_ to fit more nicely into the existing definition style. Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200505102247.32452-5-stanislav.lisovskiy@intel.com
-
Imre Deak authored
Unmask/enable AUX interrupts on all ports on TGL+. So far the interrupts worked only on port A, which meant each transaction on other ports took 10ms. Cc: <stable@vger.kernel.org> # v5.4+ Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200504075828.20348-1-imre.deak@intel.com
-
Chris Wilson authored
Use a local to shrink a line under 80 columns, and refactor the common emit_xcs_breadcrumb() wrapper of ggtt-write. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200504180507.6017-1-chris@chris-wilson.co.uk
-
- 04 May, 2020 15 commits
-
-
Chris Wilson authored
Repeat the measurement of the clock frequency a few times and use the median to try and reduce the systematic measurement error. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200504044903.7626-6-chris@chris-wilson.co.uk
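A minimal sketch of the median approach, assuming a hypothetical read_clock_once() helper for a single measurement; only the sorting and median selection are shown:

```c
#include <linux/sort.h>

#define CLOCK_SAMPLES 5

static int cmp_u32(const void *a, const void *b)
{
	const u32 *x = a, *y = b;

	if (*x < *y)
		return -1;
	return *x > *y;
}

static u32 measure_clock_median(struct intel_engine_cs *engine)
{
	u32 samples[CLOCK_SAMPLES];
	int i;

	for (i = 0; i < CLOCK_SAMPLES; i++)
		samples[i] = read_clock_once(engine); /* assumed helper */

	sort(samples, ARRAY_SIZE(samples), sizeof(*samples), cmp_u32, NULL);

	/* the median damps one-off outliers far better than a single sample */
	return samples[ARRAY_SIZE(samples) / 2];
}
```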
-
Chris Wilson authored
If the FBC is still writing into stolen, it will overwrite any future users of that stolen region. Check before release, just to ease any concerns -- we can remove it again later if it is barking up the wrong tree. References: https://gitlab.freedesktop.org/drm/intel/-/issues/1635 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200503180034.20010-1-chris@chris-wilson.co.uk
-
Sultan Alsawaf authored
In commit 5a7d202b, a logical AND was erroneously changed to an OR, causing WaIncreaseLatencyIPCEnabled to be enabled unconditionally for kabylake and coffeelake, even when IPC is disabled. Fix the logic so that WaIncreaseLatencyIPCEnabled is only used when IPC is enabled. Fixes: 5a7d202b ("drm/i915: Drop WaIncreaseLatencyIPCEnabled/1140 for cnl") Cc: stable@vger.kernel.org # 5.3.x+ Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200430214654.51314-1-sultan@kerneltoast.com
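An illustrative contrast of the logic error, with the surrounding condition approximated rather than copied from the i915 watermark code:

```c
/* Sketch only: the w/a must be gated on IPC being enabled. */
static bool use_increase_latency_wa(struct drm_i915_private *i915, bool ipc_enabled)
{
	/* Broken variant -- the trailing || made the w/a unconditional on KBL/CFL:
	 *   return IS_KABYLAKE(i915) || IS_COFFEELAKE(i915) || ipc_enabled;
	 */
	return (IS_KABYLAKE(i915) || IS_COFFEELAKE(i915)) && ipc_enabled;
}
```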
-
Ville Syrjälä authored
All these ROUNDING_FACTORs and whatnot are making this thing hard to read. Get rid of them. And let's massage some of the fractions to give us less questionable intermediate results and perhaps fewer divisions. It also looks like a good helping of 64-bit math is needed to avoid some of the overflows present in the current code. There might still be a few overflows, namely when calculating link_clks_available/samples_room (would require a huge hblank though), and potentially when calculating hblank_rise (not sure how large link_clks_active can get). It looks like we're still not calculating exactly what the spec says since we truncate tu_data and tu_line early. But I'm too lazy to figure out if we could avoid that. v2: Fix typo in commit msg (Uma) Remove ROUNDING_FACTOR define (Uma) s/5*link_clk+5*cdclk/5*(link_clk+cdclk)/ (Chris) Cc: Anshuman Gupta <anshuman.gupta@intel.com> Cc: Uma Shankar <uma.shankar@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200429185457.26235-3-ville.syrjala@linux.intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Uma Shankar <uma.shankar@intel.com>
-
Ville Syrjälä authored
Since the code seems insistent on using the variable names from the bspec formula, let's be consistent and use those names for all the things. For some reason 'link_clk' and 'lanes' were left out in the code until now. Cc: Anshuman Gupta <anshuman.gupta@intel.com> Cc: Uma Shankar <uma.shankar@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200429185457.26235-2-ville.syrjala@linux.intel.com Reviewed-by: Uma Shankar <uma.shankar@intel.com>
-
Ville Syrjälä authored
mode.vrefresh is rounded to the nearest integer. You don't want to use it anywhere that requires precision. Also I want to nuke it. vtotal*vrefresh == 1000*clock/htotal, so let's use the latter. Cc: Anshuman Gupta <anshuman.gupta@intel.com> Cc: Uma Shankar <uma.shankar@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200429185457.26235-1-ville.syrjala@linux.intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Uma Shankar <uma.shankar@intel.com>
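A small sketch of the substitution, assuming the standard drm_display_mode fields (clock in kHz); it only illustrates the identity quoted above:

```c
/* vtotal * vrefresh loses precision because vrefresh is rounded to an
 * integer; 1000 * clock / htotal is the same quantity computed from the
 * pixel clock without that rounding.
 */
static unsigned int lines_per_second(const struct drm_display_mode *mode)
{
	return DIV_ROUND_CLOSEST(mode->clock * 1000, mode->htotal);
}
```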
-
Ville Syrjälä authored
Remove all the stepping dependent cnl workarounds. Bspec lists more steppings than this so presumably these are classed as pre-production. And this is cnl after all so no one should really care anyway. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200430125822.21985-2-ville.syrjala@linux.intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-
Ville Syrjälä authored
Display WA #1105 says that FBC requires PLANE_STRIDE to be a multiple of 512 bytes on gen9 and glk. This is definitely true for glk as certain tests (such as igt/kms_big_fb/linear-16bpp-rotate-0) are now failing when the display resolution results in a plane stride which is not a multiple of 512 bytes. Curiously I was not able to reproduce this on a KBL. First I suspected that our use of the FBC override stride would explain this, but after trying to use the override stride on glk the test still failed. I did try both the old CHICKEN_MISC_4 way and the new FBC_STRIDE way, neither had any effect on the result. Anyways, we need this at least on glk. But let's trust the spec and apply the w/a for all gen9 as well, despite being unable to reproduce the problem. v2: s/FBC_CHICKEN/FBC_STRIDE/ in commit msg Cc: José Roberto de Souza <jose.souza@intel.com> Fixes: 691f7ba5 ("drm/i915/display/fbc: Make fences a nice-to-have for GEN9+") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200429101034.8208-2-ville.syrjala@linux.intel.com Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
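A minimal sketch of the w/a check itself (just the arithmetic, not the surrounding FBC plumbing):

```c
/* Display WA #1105: on gen9/glk, FBC requires the plane stride to be a
 * multiple of 512 bytes; reject FBC activation otherwise.
 */
static bool fbc_stride_ok(unsigned int stride_bytes)
{
	return (stride_bytes & 511) == 0;
}
```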
-
Stanislav Lisovskiy authored
This is a preparation patch before the next one, where we introduce old_bw_state and a bunch of other changes as well. In a review comment it was suggested to split out at least the renaming into a separate patch, which is done here. v2: Removed spurious space Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200423075902.21892-8-stanislav.lisovskiy@intel.com
-
Stanislav Lisovskiy authored
We need to calculate the SAGV mask also in a non-modeset commit; however, active_pipes are currently only calculated for modesets in the global atomic state, so we now track them in bw_state as well in order to be able to properly access the global data. v2: - Removed pre/post plane SAGV updates from modeset (Ville) - Now tracking active pipes in intel_can_enable_sagv (Ville) v3: - lock global state if active_pipes change as well (Ville) Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200430195634.7666-1-stanislav.lisovskiy@intel.com
-
Stanislav Lisovskiy authored
Future platforms require per-crtc SAGV evaluation and serializing global state when those are changed from different commits. v2: - Add has_sagv check to intel_crtc_can_enable_sagv so that it sets bit in reject mask. - Use bw_state in intel_pre/post_plane_enable_sagv instead of atomic state v3: - Fixed rebase conflict, now using intel_atomic_crtc_state_for_each_plane_state in order to call it from atomic check v4: - Use fb modifier from plane state v5: - Make intel_has_sagv static again (Ville) - Removed unnecessary NULL assignments (Ville) - Removed unnecessary SAGV debug (Ville) - Call intel_compute_sagv_mask only for modesets (Ville) - Serialize global state only if sagv results change, but not mask itself (Ville) v6: - use lock global state instead of serialize (Ville) v7: - use both global state lock and serialize depending on if we need to change only global state or access hw (Ville) Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Cc: Ville Syrjälä <ville.syrjala@intel.com> Cc: James Ausmus <james.ausmus@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200430191757.18206-1-stanislav.lisovskiy@intel.com
-
Chris Wilson authored
The older arches did not convert MI_STORE_DATA_IMM to using the GTT, but left them writing to a physical address. The notes suggest that the primary reason would be so that the writes were cache coherent, as the CPU cache uses physical tagging. As such we did not implement the legacy variant of MI_STORE_DATA_IMM and so left all the relocations synchronous -- but with a small function to convert from the vma address into the physical address, we can implement asynchronous relocs on these older arches, fixing up a few tests that require them. In order to be able to test the legacy paths, refactor the gpu relocations so that we can hook them up to a selftest. v2: Use an array of offsets not enum labels for the selftest v3: Refactor the common igt_hexdump() Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/757 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200504140629.28240-1-chris@chris-wilson.co.uk
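A hedged sketch of the 'small function' mentioned above, using the generic GEM object helpers; the real i915 version may differ in name and detail:

```c
/* On the old gens MI_STORE_DWORD_IMM writes to a physical address, so
 * translate a relocation target from its object offset to the physical
 * address of the backing page.
 */
static u64 reloc_phys_address(struct i915_vma *vma, u64 offset)
{
	struct page *page =
		i915_gem_object_get_page(vma->obj, offset >> PAGE_SHIFT);

	return page_to_phys(page) + offset_in_page(offset);
}
```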
-
Chris Wilson authored
It is required that a chained batch be in the same address domain as its parent, and that domain must also be specified in the command on earlier gens, as it is not inferred from the chaining until gen6. Fixes: 964a9b0f ("drm/i915/gem: Use chained reloc batches") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200504125149.4396-1-chris@chris-wilson.co.uk
-
Chris Wilson authored
Extend the timeout for pcode reads to 20ms as they should not be performed along critical paths, and succeeding after a short delay is better than failing entirely. References: https://gitlab.freedesktop.org/drm/intel/-/issues/1800Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200504044903.7626-1-chris@chris-wilson.co.uk
-
Chris Wilson authored
We only need the device wakeref on freeing the objects if we have to unbind the object from the global GTT, or otherwise update device information. If the objects are clean, we never need the wakeref, so avoid taking it until required. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com> Reviewed-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200503171513.18704-1-chris@chris-wilson.co.uk
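A sketch of the intent, with needs_device_for_free() and release_object() as assumed stand-ins for the real checks and teardown:

```c
static void free_object_example(struct drm_i915_private *i915,
				struct drm_i915_gem_object *obj)
{
	intel_wakeref_t wakeref = 0;

	/* Only wake the device if freeing requires touching the GGTT or
	 * other device state; clean objects skip the wakeref entirely.
	 */
	if (needs_device_for_free(obj))
		wakeref = intel_runtime_pm_get(&i915->runtime_pm);

	release_object(obj);

	if (wakeref)
		intel_runtime_pm_put(&i915->runtime_pm, wakeref);
}
```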
-
- 03 May, 2020 1 commit
-
-
Chris Wilson authored
Currently we clear and disable the RPS pm interrupts on module load, and presume that they remain disabled forevermore. However, the mask is cleared on suspend and so after resume they may start showing up again unexpectedly. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1811 Fixes: 8e99299a ("drm/i915/gt: Track use of RPS interrupts in flags") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi@etezian.org> Reviewed-by: Andi Shyti <andi@etezian.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200502173512.32353-1-chris@chris-wilson.co.uk
-
- 01 May, 2020 6 commits
-
-
Chris Wilson authored
If at first we don't succeed, try, try again. Not all engines may support the MI ops we need to perform asynchronous relocation patching, and so we end up falling back to a synchronous operation that has a liability of blocking. However, Tvrtko pointed out we don't need to use the same engine to perform the relocations as we are planning to execute the execbuf on, and so if we switch over to a working engine, we can perform the relocation asynchronously. The user execbuf will be queued after the relocations by virtue of fencing. This patch creates a new context per execbuf requiring asynchronous relocations on an unusable engine. This is perhaps a bit excessive and can be ameliorated by a small context cache, but for the moment we only need it for working around a little used engine on Sandybridge, and only if relocations are actually required to an active batch buffer. Now we just need to teach the relocation code to handle physical addressing for gen2/3, and we should then have universal support! Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Testcase: igt/gem_exec_reloc/basic-spin # snb Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200501192945.22215-3-chris@chris-wilson.co.uk
-
Chris Wilson authored
As we can now keep chaining together a relocation batch to process any number of relocations, we can keep building that relocation batch for all of the target vma. This avoids emitting a new request into the ring for each target, consuming precious ring space and risking a stall. v2: Propagate the failure from submitting the relocation batch. Testcase: igt/gem_exec_reloc/basic-wide-active Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200501192945.22215-2-chris@chris-wilson.co.uk
-
Chris Wilson authored
The ring is a precious resource: we anticipate to only use a few hundred bytes for a request, and only try to reserve that before we start. If we go beyond our guess in building the request, then instead of waiting at the start of execbuf before we hold any locks or other resources, we may trigger a wait inside a critical region. One example is in using gpu relocations, where currently we emit a new MI_BB_START from the ring every time we overflow a page of relocation entries. However, instead of inserting the command into the precious ring, we can chain the next page of relocation entries as MI_BB_START from the end of the previous. v2: Delay the emit_bb_start until after all the chained vma synchronisation is complete. Since the buffer pool batches are idle, this _should_ be a no-op, but one day we may have some fancy async GPU bindings for new vma! v3: Use pool/batch consistently, once we start thinking in terms of the batch vma, use batch->obj. v4: Explain the magic number 4. Tvrtko spotted that we lose propagation of the error for failing to submit the relocation request; that's easier to fix up in the next patch. Testcase: igt/gem_exec_reloc/basic-many-active Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200501192945.22215-1-chris@chris-wilson.co.uk
-
Chris Wilson authored
gdb uses ptrace() to peek and poke bytes of the target's address space. The driver must implement a vm_ops->access() handler or else gdb will be unable to inspect the pointer and will report it as out-of-bounds. Worse than useless as it causes immediate suspicion of the valid GTT pointer, distracting the poor programmer trying to find his bug. v2: Write-protect readonly objects (Matthew). Testcase: igt/gem_mmap_gtt/ptrace Testcase: igt/gem_mmap_offset/ptrace Suggested-by: Kristian H. Kristensen <hoegsberg@google.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Maciej Patelczyk <maciej.patelczyk@intel.com> Cc: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200501145120.18830-1-chris@chris-wilson.co.uk
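A hedged sketch of an access() hook; object_rw() is an assumed helper and the readonly check mirrors the v2 note rather than the exact i915 implementation:

```c
static int example_vm_access(struct vm_area_struct *vma, unsigned long addr,
			     void *buf, int len, int write)
{
	struct drm_i915_gem_object *obj = vma->vm_private_data;
	unsigned long offset = addr - vma->vm_start;

	/* v2: refuse ptrace pokes into readonly objects */
	if (write && i915_gem_object_is_readonly(obj))
		return -EACCES;

	/* copy len bytes between buf and the object's backing store */
	return object_rw(obj, offset, buf, len, write);
}

static const struct vm_operations_struct example_vm_ops = {
	.access = example_vm_access,
};
```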
-
Chris Wilson authored
In order to allow userspace to rely on timeslicing to reorder their batches, we must support preemption of those user batches. Declare timeslicing as an explicit property that is a combination of having the kernel support and HW support. Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Fixes: 8ee36e04 ("drm/i915/execlists: Minimalistic timeslicing") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200501122249.12417-1-chris@chris-wilson.co.uk
-
Chris Wilson authored
While a perf event is open, keep a reference to the module so we don't remove the driver internals mid-sampling. Testcase: igt/perf_pmu/module-unload Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: stable@vger.kernel.org Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200430183324.23984-1-chris@chris-wilson.co.uk
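A minimal sketch of pinning the module across an open perf event, using the standard try_module_get()/module_put() pair; the i915 hook names are omitted:

```c
static void example_event_destroy(struct perf_event *event)
{
	module_put(THIS_MODULE);		/* released when the event is closed */
}

static int example_event_init(struct perf_event *event)
{
	if (!try_module_get(THIS_MODULE))	/* pin the driver while sampling */
		return -ENODEV;

	event->destroy = example_event_destroy;
	return 0;
}
```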
-
- 30 Apr, 2020 1 commit
-
-
Chris Wilson authored
Since the introduction of 'soft-rc6', we aim to park the device quickly and that results in frequent idling of the whole device. Currently upon idling we free the batch buffer pool, and so this renders the cache ineffective for many workloads. If we want to have an effective cache of recently allocated buffers available for reuse, we need to decouple that cache from the engine power management and make it timer-based. As there is no reason then to keep it within the engine (where it once made retirement order easier to track), we can move it up the hierarchy to the owner of the memory allocations. v2: Hook up to debugfs/drop_caches to clear the cache on demand. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200430111819.10262-2-chris@chris-wilson.co.uk
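A rough sketch of a timer-based reaper decoupled from engine parking; the pool structure and the pool_free_stale() helper are assumptions, not the i915 code:

```c
struct example_pool {
	struct delayed_work work;
	struct list_head cache;
};

static void pool_reap(struct work_struct *wrk)
{
	struct example_pool *pool = container_of(wrk, typeof(*pool), work.work);

	pool_free_stale(pool);		/* assumed helper: drop long-idle buffers */

	/* keep ticking while anything remains cached, independent of
	 * whether the engines have parked in the meantime
	 */
	if (!list_empty(&pool->cache))
		schedule_delayed_work(&pool->work, round_jiffies_up_relative(HZ));
}
```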
-