- 02 Dec, 2019 13 commits
-
-
Ville Syrjälä authored
Let's make the display info more useful by dumping both the uapi and hw states for each crtc/plane. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191129185434.25549-9-ville.syrjala@linux.intel.comReviewed-by: Ramalingam C <ramalingam.c@intel.com>
-
Ville Syrjälä authored
Use the canonical "[CRTC:%d:%s]" format for the obj id/name in the debugfs display_info dump. Everyone should already be familiar with the format since it's used in the debug logs extensively. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191129185434.25549-8-ville.syrjala@linux.intel.comReviewed-by: Ramalingam C <ramalingam.c@intel.com>
-
Ville Syrjälä authored
Make out life easier by just grabbing all modeset locks around the display_info dump. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191129185434.25549-7-ville.syrjala@linux.intel.comReviewed-by: Ramalingam C <ramalingam.c@intel.com>
-
Ville Syrjälä authored
No point in repeating the crtc mode for each cloned encoder. Just print it once, and avoid using multiple lines for it. And while at let's polish the fixed mode print to fit on one line as well. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191129185434.25549-6-ville.syrjala@linux.intel.comReviewed-by: Ramalingam C <ramalingam.c@intel.com>
-
Ville Syrjälä authored
Pull the crtc dumping stuff into a nice function so the loop over the crtcs doesn't look like crap. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191129185434.25549-5-ville.syrjala@linux.intel.comReviewed-by: Ramalingam C <ramalingam.c@intel.com>
-
Ville Syrjälä authored
Eliminate the special cases for the primary and cursor planes and just dump all the information consistently for all the planes. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191129185434.25549-4-ville.syrjala@linux.intel.comReviewed-by: Ramalingam C <ramalingam.c@intel.com>
-
Ville Syrjälä authored
Switch to using intel_ types in the debugfs display_info code. Should make it easier to handle bigjoiner etc. in the future. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191129185434.25549-3-ville.syrjala@linux.intel.comReviewed-by: Ramalingam C <ramalingam.c@intel.com>
-
Ville Syrjälä authored
Use DRM_RECT_FMT & co. to simpify the code. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191129185434.25549-2-ville.syrjala@linux.intel.comReviewed-by: Ramalingam C <ramalingam.c@intel.com>
-
Ville Syrjälä authored
It's hard to see what is going on when the function mixes drm_ and intel_ types. Switch to intel_ types. v2: Deal with another use of 'intel_crtc' being introduced Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191105171447.22111-2-ville.syrjala@linux.intel.comReviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
-
Matt Roper authored
The bspec tells us 'Program SHPD_FILTER_CNT with the "500 microseconds adjusted" value before enabling hotplug detection' on CNP+. We haven't been touching this register at all thus far, but we should probably follow the bspec's guidance. The register also exists on LPT and SPT, but there isn't any specific guidance I can find on how we should be programming it there so let's leave it be for now. Bspec: 4342 Bspec: 31297 Bspec: 8407 Bspec: 49305 Bspec: 50473 Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191127221314.575575-3-matthew.d.roper@intel.com
-
Matt Roper authored
When looking at SDEISR to determine the connection status of combo outputs, we should use the phy index rather than the port index. Although they're usually the same thing, EHL's DDI-D (port D) is attached to PHY-A and SDEISR doesn't even have bits for a "D" output. It's also possible that future platforms may map DDIs (the internal display engine programming units) to PHYs (the output handling on the IO side) in ways where port!=phy, so let's look at the PHY index by default. v2: Rename to intel_combo_phy_connected. (Lucas) Fixes: 719d2400 ("drm/i915/ehl: Enable DDI-D") Cc: José Roberto de Souza <jose.souza@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191127221314.575575-2-matthew.d.roper@intel.com
-
Matt Roper authored
The South Display is part of the PCH so we should technically be basing our port detection logic off the PCH in use rather than the platform generation. Cc: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191127221314.575575-1-matthew.d.roper@intel.com
-
Ville Syrjälä authored
LPT/WPT only have PCH transcoder A. Make sure we poke at its chicken register instead of some non-existent register when FDI is being driven by pipe B or C. Cc: Uma Shankar <uma.shankar@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191128182358.14477-1-ville.syrjala@linux.intel.comReviewed-by: Uma Shankar <uma.shankar@intel.com>
-
- 01 Dec, 2019 1 commit
-
-
Chris Wilson authored
As the gen6 page directory is written on binding and after every update, the code ended up duplicated. Refactor the code into a single routine to share the locking and serialisation. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191201140916.2128905-1-chris@chris-wilson.co.uk
-
- 30 Nov, 2019 4 commits
-
-
Chris Wilson authored
Now that many threads may try to use the same mmio to flush the global buffers after updating the PTE, serialise access to the mmio to prevent concurrent access on gen7. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191130162320.1683424-1-chris@chris-wilson.co.uk
-
Chris Wilson authored
Move our "wait for the PD load to complete" paranoia before the MI_SET_CONTEXT just in case the context restore tries to access local addresses. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191130120503.1609483-1-chris@chris-wilson.co.uk
-
Chris Wilson authored
After much hair pulling, resort to preallocating the ppGTT entries on init to circumvent the apparent lack of PD invalidate following the write to PP_DCLV upon switching mm between contexts (and here the same context after binding new objects). However, the details of that PP_DCLV invalidate are still unknown, and it appears we need to reload the mm twice to cover over a timing issue. Worrying. Fixes: 3dc007fe ("drm/i915/gtt: Downgrade gen7 (ivb, byt, hsw) back to aliasing-ppgtt") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191129201328.1398583-1-chris@chris-wilson.co.uk
-
Chris Wilson authored
Keep the engine awake and so avoid frequent cycling in and out of powersaving mode to eliminate the unnecessary overhead and speed up the testing. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191129222702.1456292-1-chris@chris-wilson.co.uk
-
- 29 Nov, 2019 8 commits
-
-
Chris Wilson authored
As we only cancel the timers asynchronously, they may still be running on another CPU as we shutdown, raising one last softirq. So be safe and make sure the tasklet is flushed before destroying the engine's memory. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191129172542.1222810-1-chris@chris-wilson.co.uk
-
Chris Wilson authored
Though the context is closed and so no more requests can be added to the timeline, retirement can still be removing requests. It can even be removing the very request we are inspecting and so cause us to wander into dead links. Serialise with the retirement by taking the timeline->mutex used for guarding the timeline->requests list. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112404 Fixes: 4a317415 ("drm/i915/gem: Refine occupancy test in kill_context()") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191129151845.1092933-1-chris@chris-wilson.co.uk
-
Ville Syrjälä authored
skl_commit_modeset_enables() straight up compares dirty_pipes with a bitmask of already committed pipes. If we set bits in dirty_pipes for non-existent pipes that comparison will never work right. So let's limit ourselves to bits that exist. And we'll do the same for the active_pipes_changed bitmask. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191011200949.7839-5-ville.syrjala@linux.intel.comReviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
-
Chris Wilson authored
Since commit c45e788d ("drm/i915/tgl: Suspend pre-parser across GTT invalidations"), we now disable the advanced preparser on Tigerlake for the invalidation phase at the start of the batch, we no longer need to emit the GPU relocations from a second context as they are now flushed inlined. References: 8a9a9827 ("drm/i915: use a separate context for gpu relocs") References: c45e788d ("drm/i915/tgl: Suspend pre-parser across GTT invalidations") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191129124846.949100-1-chris@chris-wilson.co.uk
-
Chris Wilson authored
Wait on only the last request on the kernel_context after emitting a barrier so that we do not wait for everything in general and by doing so cause an accidental emission of the barrier! Bugzilla; https://bugs.freedesktop.org/show_bug.cgi?id=112405Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191129103455.744389-1-chris@chris-wilson.co.uk
-
Chris Wilson authored
Be paranoid and make sure the drm_mm is locked whenever we insert/remove our own nodes. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191129095659.665381-1-chris@chris-wilson.co.uk
-
Chris Wilson authored
Use the normal sgt_iter to walk the pages scatterlist on free so that we handle the error path correctly. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112225Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191128232946.546831-1-chris@chris-wilson.co.uk
-
Michel Thierry authored
Implement Wa_1604555607 (set the DS pairing timer to 128 cycles). FF_MODE2 is part of the register state context, that's why it is implemented here. At TGL A0 stepping, FF_MODE2 register read back is broken, hence disabling the WA verification. v2: Rebased on top of the WA refactoring (Oscar) v3: Correctly add to ctx_workarounds_init (Michel) v4: uncore read is used [Tvrtko] Macros as used for MASK definition [Chris] v5: Skip the Wa_1604555607 verification [Ram] i915 ptr retrieved from engine. [Tvrtko] v6: Added wa_add as a wrapper for __wa_add [Chris] wa_add is directly called instead of new wrapper [tvrtko] BSpec: 19363 HSDES: 1604555607 Signed-off-by: Michel Thierry <michel.thierry@intel.com> Signed-off-by: Ramalingam C <ramlingam.c@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> [v5] Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191128021005.3350-1-ramalingam.c@intel.com
-
- 28 Nov, 2019 4 commits
-
-
Chris Wilson authored
After obtaining a local reference to the vm from the context, remember to drop it before it goes out of scope! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191128185402.110678-1-chris@chris-wilson.co.uk
-
Chris Wilson authored
Don't rely on the RUNTIME_INFO() when we loop over a particular context and only run on a filtered set of engines. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191127223252.3777141-1-chris@chris-wilson.co.uk
-
Chris Wilson authored
We have a case of a mysteriously absent pulse, so dump the engine details to see if we can find out what happened to it. References: https://bugs.freedesktop.org/show_bug.cgi?id=112405Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191128102546.3857140-1-chris@chris-wilson.co.uk
-
Chris Wilson authored
One does not lightly add a new hidden struct_mutex dependency deep within the execbuf bowels! The immediate suspicion in seeing the whitelist cached on the context, is that it is intended to be preserved between batches, as the kernel is quite adept at caching small allocations itself. But no, it's sole purpose is to serialise command submission in order to save a kmalloc on a slow, slow path! By removing the whitelist dependency from the context, our freedom to chop the big struct_mutex is greatly augmented. v2: s/set_bit/__set_bit/ as the whitelist shall never be accessed concurrently. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191128113424.3885958-1-chris@chris-wilson.co.uk
-
- 27 Nov, 2019 5 commits
-
-
Chris Wilson authored
The design of our interrupt handlers is that we ack the receipt of the interrupt first, inside the critical section where the master interrupt control is off and other cpus cannot start processing the next interrupt; and then process the interrupt events afterwards. However, Icelake introduced a whole new set of banked GT_IIR that are inherently serialised and slow to retrieve the IIR and must be processed within the critical section. We can still push our breadcrumbs out of this critical section by using our irq_worker. On bdw+, this should not make too much of a difference as we only slightly defer the breadcrumbs, but on icl+ this should make a big difference to our throughput of interrupts from concurrently executing engines. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191127115813.3345823-1-chris@chris-wilson.co.uk
-
Chris Wilson authored
The expected downside to commit 58b4c1a0 ("drm/i915: Reduce nested prepare_remote_context() to a trylock") was that it would need to return -EAGAIN to userspace in order to resolve potential mutex inversion. Such an unsightly round trip is unnecessary if we could atomically insert a barrier into the i915_active_fence, so make it happen. Currently, we use the timeline->mutex (or some other named outer lock) to order insertion into the i915_active_fence (and so individual nodes of i915_active). Inside __i915_active_fence_set, we only need then serialise with the interrupt handler in order to claim the timeline for ourselves. However, if we remove the outer lock, we need to ensure the order is intact between not only multiple threads trying to insert themselves into the timeline, but also with the interrupt handler completing the previous occupant. We use xchg() on insert so that we have an ordered sequence of insertions (and each caller knows the previous fence on which to wait, preserving the chain of all fences in the timeline), but we then have to cmpxchg() in the interrupt handler to avoid overwriting the new occupant. The only nasty side-effect is having to temporarily strip off the RCU-annotations to apply the atomic operations, otherwise the rules are much more conventional! Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112402 Fixes: 58b4c1a0 ("drm/i915: Reduce nested prepare_remote_context() to a trylock") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191127134527.3438410-1-chris@chris-wilson.co.uk
-
Chris Wilson authored
Now that we rapidly park the GT when the GPU idles, we often find ourselves idling faster than the RC6 promotion timer. Thus if we tell the GPU to enter RC6 manually as we park, we can do so quicker (by around 50ms, half an EI on average) and marginally increase our powersaving across all execlists platforms. v2: Now with a selftest to check we can enter RC6 manually Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi.shyti@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Imre Deak <imre.deak@intel.com> Acked-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191127095657.3209854-1-chris@chris-wilson.co.uk
-
Clint Taylor authored
During the Display Interrupt Service routine the Display Interrupt Enable bit must be disabled, The interrupts handled, then the Display Interrupt Enable bit must be set to prevent possible missed interrupts. Bspec: 49212 V2: Change Title to remove SDE reference. V3: Fix TAB spacing. Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Aditya Swarup <aditya.swarup@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Clint Taylor <clinton.a.taylor@intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191121201455.2558-1-clinton.a.taylor@intel.com
-
Kai Vehmanen authored
Starting with gen12, PORT_A can be connected to a transcoder with audio support. Modify the existing logic that disabled audio on PORT_A unconditionally. Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191125125313.17584-1-kai.vehmanen@linux.intel.com
-
- 26 Nov, 2019 3 commits
-
-
Stanislav Lisovskiy authored
According to BSpec 53998, there is a mask of max 8 SAGV/QGV points we need to support. Bumping this up to keep the CI happy(currently preventing tests to run), until all SAGV changes land. v2: Fix second plane where QGV points were hardcoded as well. v3: Change the naming of I915_NUM_SAGV_POINTS to be I915_NUM_QGV_POINTS, as more meaningful (Ville Syrjälä) Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112189Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191125160800.14740-1-stanislav.lisovskiy@intel.com [vsyrjala: Add missing braces around else (checkpatch), fix Bugzilla tag] Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
-
Chris Wilson authored
On context retiring, we may invoke the kernel_context to unpin this context. Elsewhere, we may use the kernel_context to modify this context. This currently leads to an AB-BA lock inversion, so we need to back-off from the contended lock, and repeat. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111732Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Fixes: a9877da2 ("drm/i915/oa: Reconfigure contexts on the fly") Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191126065521.2331017-1-chris@chris-wilson.co.uk
-
Chris Wilson authored
Based on a sampling of a number of benchmarks across platforms, by default opt for a much more lenient timeout so that we should not adversely affect existing "good" clients. 640ms ought to be enough for anyone. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112169 Fixes: 3a7a92ab ("drm/i915/execlists: Force preemption") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Eero Tamminen <eero.t.tamminen@intel.com> Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191125162737.2161069-1-chris@chris-wilson.co.uk
-
- 25 Nov, 2019 2 commits
-
-
Chris Wilson authored
An i915_vma struct on the stack may push the frame over the limit, if set conservatively, so move it to the heap. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191125124856.1761176-1-chris@chris-wilson.co.uk
-
Chris Wilson authored
The major drawback of commit 7e34f4e4 ("drm/i915/gen8+: Add RC6 CTX corruption WA") is that it disables RC6 while Skylake (and friends) is active, and we do not consider the GPU idle until all outstanding requests have been retired and the engine switched over to the kernel context. If userspace is idle, this task falls onto our background idle worker, which only runs roughly once a second, meaning that userspace has to have been idle for a couple of seconds before we enable RC6 again. Naturally, this causes us to consume considerably more energy than before as powersaving is effectively disabled while a display server (here's looking at you Xorg) is running. As execlists will get a completion event as each context is completed, we can use this interrupt to queue a retire worker bound to this engine to cleanup idle timelines. We will then immediately notice the idle engine (without userspace intervention or the aid of the background retire worker) and start parking the GPU. Thus during light workloads, we will do much more work to idle the GPU faster... Hopefully with commensurate power saving! v2: Watch context completions and only look at those local to the engine when retiring to reduce the amount of excess work we perform. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112315 References: 7e34f4e4 ("drm/i915/gen8+: Add RC6 CTX corruption WA") References: 2248a283 ("drm/i915/gen8+: Add RC6 CTX corruption WA") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191125105858.1718307-3-chris@chris-wilson.co.uk
-