- 31 Jan, 2023 2 commits
-
-
Michal Wajdeczko authored
Use new macros to have common prefix that also include GT#. v2: drop now redundant "GuC" word from the message Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230128195907.1837-3-michal.wajdeczko@intel.com
-
Michal Wajdeczko authored
While we do have GT oriented print macros, add few more GuC specific to have common look and feel across all messages related to the GuC and to avoid chasing the gt pointer. We will use these macros shortly in upcoming patches. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230128195907.1837-2-michal.wajdeczko@intel.com
-
- 30 Jan, 2023 1 commit
-
-
Rob Clark authored
A userspace with multiple threads racing I915_GEM_SET_TILING to set the tiling to I915_TILING_NONE could trigger a double free of the bit_17 bitmask. (Or conversely leak memory on the transition to tiled.) Move allocation/free'ing of the bitmask within the section protected by the obj lock. Signed-off-by: Rob Clark <robdclark@chromium.org> Fixes: 2850748e ("drm/i915: Pull i915_vma_pin under the vm->mutex") Cc: <stable@vger.kernel.org> # v5.5+ [tursulin: Correct fixes tag and added cc stable.] Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230127200550.3531984-1-robdclark@gmail.com
-
- 27 Jan, 2023 9 commits
-
-
John Harrison authored
The GuC specific register state entry in the error capture object was just called 'capture'. Although the companion 'node' entry was called 'guc_capture_node'. Rename the base entry to be 'guc_capture' instead so that it is a) more consistent and b) more obvious what it is. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-9-John.C.Harrison@Intel.com
-
John Harrison authored
For understanding bug reports, it can be useful to have an explicit dmesg print when a reset notification is received from GuC. As opposed to simply inferring that this happened from other messages. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-8-John.C.Harrison@Intel.com
-
John Harrison authored
Engine resets are supposed to never fail. But in the case when one does (due to unknown reasons that normally come down to a missing w/a), it is useful to get as much information out of the system as possible. Given that the GuC intentionally dies on such a situation, it is not possible to get a guilty context notification back. So do a manual search instead. Given that GuC is dead, this is safe because GuC won't be changing the engine state asynchronously. v2: Change comment to be less alarming (Tvrtko) Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-7-John.C.Harrison@Intel.com
-
John Harrison authored
A hang situation has been observed where the only requests on the context were either completed or not yet started according to the breaadcrumbs. However, the register state claimed a batch was (maybe) in progress. So, allow capture of the pending request on the grounds that this might be better than nothing. v2: Reword 'not started' warning message (Tvrtko) Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-6-John.C.Harrison@Intel.com
-
John Harrison authored
There was a report of error captures occurring without any hung context being indicated despite the capture being initiated by a 'hung context notification' from GuC. The problem was not reproducible. However, it is possible to happen if the context in question has no active requests. For example, if the hang was in the context switch itself then the breadcrumb write would have occurred and the KMD would see an idle context. In the interests of attempting to provide as much information as possible about a hang, it seems wise to include the engine info regardless of whether a request was found or not. As opposed to just prentending there was no hang at all. So update the error capture code to always record engine information if a context is given. Which means updating record_context() to take a context instead of a request (which it only ever used to find the context anyway). And split the request agnostic parts of intel_engine_coredump_add_request() out into a seaprate function. v2: Remove a duplicate 'if' statement (Umesh) and fix a put of a null pointer. v3: Tidy up request locking code flow (Tvrtko) v4: Pull in improved info message from next patch and fix up potential leak of GuC register state (Daniele) Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> (v2) Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-5-John.C.Harrison@Intel.com
-
John Harrison authored
The debugfs dump of requests was confused about what state requires the execlist lock versus the GuC lock. There was also a bunch of duplicated messy code between it and the error capture code. So refactor the hung request search into a re-usable function. And reduce the span of the execlist state lock to only the execlist specific code paths. In order to do that, also move the report of hold count (which is an execlist only concept) from the top level dump function to the lower level execlist specific function. Also, move the execlist specific code into the execlist source file. v2: Rename some functions and move to more appropriate files (Daniele). v3: Rename new execlist dump function (Daniele) Fixes: dc0dad36 ("drm/i915/guc: Fix for error capture after full GPU reset with GuC") Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Cc: Michael Cheng <michael.cheng@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Bruce Chang <yu.bruce.chang@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-4-John.C.Harrison@Intel.com
-
John Harrison authored
When GuC support was added to error capture, the reference counting around the request object was broken. Fix it up. The context based search manages the spinlocking around the search internally. So it needs to grab the reference count internally as well. The execlist only request based search relies on external locking, so it needs an external reference count but within the spinlock not outside it. The only other caller of the context based search is the code for dumping engine state to debugfs. That code wasn't previously getting an explicit reference at all as it does everything while holding the execlist specific spinlock. So, that needs updaing as well as that spinlock doesn't help when using GuC submission. Rather than trying to conditionally get/put depending on submission model, just change it to always do the get/put. v2: Explicitly document adding an extra blank line in some dense code (Andy Shevchenko). Fix multiple potential null pointer derefs in case of no request found (some spotted by Tvrtko, but there was more!). Also fix a leaked request in case of !started and another in __guc_reset_context now that intel_context_find_active_request is actually reference counting the returned request. v3: Add a _get suffix to intel_context_find_active_request now that it grabs a reference (Daniele). v4: Split the intel_guc_find_hung_context change to a separate patch and rename intel_context_find_active_request_get to intel_context_get_active_request (Tvrtko). v5: s/locking/reference counting/ in commit message (Tvrtko) Fixes: dc0dad36 ("drm/i915/guc: Fix for error capture after full GPU reset with GuC") Fixes: 573ba126 ("drm/i915/guc: Capture error state on context reset") Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Andrzej Hajda <andrzej.hajda@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Cc: Michael Cheng <michael.cheng@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Aravind Iddamsetty <aravind.iddamsetty@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: Bruce Chang <yu.bruce.chang@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-3-John.C.Harrison@Intel.com
-
John Harrison authored
intel_guc_find_hung_context() was not acquiring the correct spinlock before searching the request list. So fix that up. While at it, add some extra whitespace padding for readability. Fixes: dc0dad36 ("drm/i915/guc: Fix for error capture after full GPU reset with GuC") Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Cc: Michael Cheng <michael.cheng@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com> Cc: Chris Wilson <chris.p.wilson@intel.com> Cc: Bruce Chang <yu.bruce.chang@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-2-John.C.Harrison@Intel.com
-
Rob Clark authored
Adding the vm to the vm_xa table makes it visible to userspace, which could try to race with us to close the vm. So we need to take our extra reference before putting it in the table. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Fixes: 9ec8795e ("drm/i915: Drop __rcu from gem_context->vm") Cc: <stable@vger.kernel.org> # v5.16+ Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230119173321.2825472-1-robdclark@gmail.com
-
- 26 Jan, 2023 4 commits
-
-
Matt Roper authored
GAMSTLB_CTRL and GAMCNTRL_CTRL became multicast/replicated registers on Xe_HP. They should be defined accordingly and use MCR-aware operations. These registers have only been used for some dg2/xehpsdv workarounds, so this fix is mostly just for consistency/future-proofing; even lacking the MCR annotation, workarounds will always be properly applied in a multicast manner on these platforms. Cc: Gustavo Sousa <gustavo.sousa@intel.com> Fixes: 58bc2453 ("drm/i915: Define multicast registers as a new type") Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230125234159.3015385-3-matthew.d.roper@intel.com
-
Matt Roper authored
Workaround Wa_18018781329 has applied to several recent Xe_HP-based platforms. However there are some extra gotchas to implementing this properly for MTL that we need to take into account: * Due to the separation of media and render/compute into separate GTs, this workaround needs to be implemented on each GT, not just the primary GT. Since each class of register only exists on one of the two GTs, we should program the appropriate registers on each GT. * As with past Xe_HP platforms, the registers on the primary GT (Xe_LPG IP) are multicast/replicated registers and should be handled with the MCR-aware functions. However the registers on the media GT (Xe_LPM+ IP) are regular singleton registers and should _not_ use MCR handling. We need to create separate register definitions for the Xe_HP multicast form and the Xe_LPM+ singleton form and use each in the appropriate place. * Starting with MTL, workarounds documented by the hardware teams are technically associated with IP versions/steppings rather than top-level platforms. That means we should take care to check the media IP version rather than the graphics IP version when deciding whether the workaround is needed on the Xe_LPM+ media GT (in this case the workaround applies to both IPs and the stepping bounds are identical, but we should still write the code appropriately to set a proper precedent for future workaround implementations). * It's worth noting that the GSC register and the CCS register are defined with the same MMIO offset (0xCF30). Since the CCS is only relevant to the primary GT and the GSC is only relevant to the media GT there isn't actually a clash here (the media GT automatically adds the additional 0x380000 GSI offset). However there's currently a glitch in the bspec where the CCS register doesn't show up at all and the GSC register is listed as existing on both GTs. That's a known documentation problem for several registers with shared GSC/CCS offsets; rest assured that the CCS register really does still exist. Cc: Gustavo Sousa <gustavo.sousa@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Fixes: 41bb543f ("drm/i915/mtl: Add initial gt workarounds") Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230125234159.3015385-2-matthew.d.roper@intel.com
-
Matt Roper authored
Register reset characteristics (i.e., whether the register maintains or loses its value on engine reset) is an important factor that determines which wa_list we want to add workarounds to. We recently found out that the bspec documentation for the Xe_HP's "GAM" registers in the 0xC800 - 0xCFFF range was misleading; these registers do not actually lose their value on engine resets as the documentation implied. This means there's no need to re-apply workarounds touching these registers after a reset, and the corresponding workarounds should be moved from the 'engine' lists back to the 'gt' list. v2: - Don't add Wa_18018781329 to xehpsdv; the original condition didn't include that platform. (Gustavo) - Move the MTL code to the GT function as-is for now; we'll take care of the additional fixes needed in a follow-up patch. Cc: Gustavo Sousa <gustavo.sousa@intel.com> Fixes: edf176f4 ("drm/i915/dg2: Move misplaced 'ctx' & 'gt' wa's to engine wa list") Fixes: b2006061 ("drm/i915/xehpsdv: Move render/compute engine reset domains related workarounds") Fixes: 41bb543f ("drm/i915/mtl: Add initial gt workarounds") Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230125234159.3015385-1-matthew.d.roper@intel.com
-
Tvrtko Ursulin authored
We want to idle all tiles when exiting selftests. Cc: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230125100003.18243-1-nirmoy.das@intel.com
-
- 24 Jan, 2023 4 commits
-
-
Tvrtko Ursulin authored
Lets eradicate a silent conflict between drm-intel-next and drm-intel-gt-next which added a duplicate function (try_firmware_load). Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
-
Daniel Vetter authored
Merge tag 'drm-intel-gt-next-2023-01-18' of git://anongit.freedesktop.org/drm/drm-intel into drm-next Driver Changes: Fixes/improvements/new stuff: - Fix workarounds on Gen2-3 (Tvrtko Ursulin) - Fix HuC delayed load memory leaks (Daniele Ceraolo Spurio) - Fix a BUG caused by impendance mismatch in dma_fence_wait_timeout and GuC (Janusz Krzysztofik) - Add DG2 workarounds Wa_18018764978 and Wa_18019271663 (Matt Atwood) - Apply recommended L3 hashing mask tuning parameters (Gen12+) (Matt Roper) - Improve suspend / resume times with VT-d scanout workaround active (Andi Shyti, Chris Wilson) - Silence misleading "mailbox access failed" warning in snb_pcode_read (Ashutosh Dixit) - Fix null pointer dereference on HSW perf/OA (Umesh Nerlige Ramappa) - Avoid trampling the ring during buffer migration (and selftests) (Chris Wilson, Matthew Auld) - Fix DG2 visual corruption on small BAR systems by not forgetting to copy CCS aux state (Matthew Auld) - More fixing of DG2 visual corruption by not forgetting to copy CCS aux state of backup objects (Matthew Auld) - Fix TLB invalidation for Gen12.50 video and compute engines (Andrzej Hajda) - Limit Wa_22012654132 to just specific steppings (Matt Roper) - Fix userspace crashes due eviction not working under lock contention after the object locking conversion (Matthew Auld) - Avoid double free is user deploys a corrupt GuC firmware (John Harrison) - Fix 32-bit builds by using "%zu" to format size_t (Nirmoy Das) - Fix a possible BUG in TTM async unbind due not reserving enough fence slots (Nirmoy Das) - Fix potential use after free by not exposing the GEM context id to userspace too early (Rob Clark) - Show clamped PL1 limit to the user (hwmon) (Ashutosh Dixit) - Workaround unreliable reset on Jasperlake (Chris Wilson) - Cover rest of SVG unit MCR registers (Gustavo Sousa) - Avoid PXP log spam on platforms which do not support the feature (Alan Previn) - Re-disable RC6p on Sandy Bridge to avoid GPU hangs and visual glitches (Sasa Dragic) Future platform enablement: - Manage uncore->lock while waiting on MCR register (Matt Roper) - Enable Idle Messaging for GSC CS (Vinay Belgaumkar) - Only initialize GSC in tile 0 (José Roberto de Souza) - Media GT and Render GT share common GGTT (Aravind Iddamsetty) - Add dedicated MCR lock (Matt Roper) - Implement recommended caching policy (PVC) (Wayne Boyer) - Add hardware-level lock for steering (Matt Roper) - Check full IP version when applying hw steering semaphore (Matt Roper) - Enable GuC GGTT invalidation from the start (Daniele Ceraolo Spurio) - MTL GSC firmware support (Daniele Ceraolo Spurio, Jonathan Cavitt) - MTL OA support (Umesh Nerlige Ramappa) - MTL initial gt workarounds (Matt Roper) Driver refactors: - Hold forcewake and MCR lock over PPAT setup (Matt Roper) - Acquire fw before loop in intel_uncore_read64_2x32 (Umesh Nerlige Ramappa) - GuC filename cleanups and use submission API version number (John Harrison) - Promote pxp subsystem to top-level of i915 (Alan Previn) - Finish proofing the code agains object size overflows (Chris Wilson, Gwan-gyeong Mun) - Start adding module oriented dmesg output (John Harrison) Miscellaneous: - Correct kerneldoc for intel_gt_mcr_wait_for_reg() (Matt Roper) - Bump up sample period for busy stats selftest (Umesh Nerlige Ramappa) - Make GuC default_lists const data (Jani Nikula) - Fix table order verification to check all FW types (John Harrison) - Remove some limited use register access wrappers (Jani Nikula) - Remove struct_member macro (Andrzej Hajda) - Remove hardcoded value with a macro (Nirmoy Das) - Use helper func to find out map type (Nirmoy Das) - Fix a static analysis warning (John Harrison) - Consolidate VMA active tracking helpers (Andrzej Hajda) - Do not cover all future platforms in TLB invalidation (Tvrtko Ursulin) - Replace zero-length arrays with flexible-array members (Gustavo A. R. Silva) - Unwind hugepages to drop wakeref on error (Chris Wilson) - Remove a couple of superfluous i915_drm.h includes (Jani Nikula) Merges: - Merge drm/drm-next into drm-intel-gt-next (Rodrigo Vivi) danvet: Fix up merge conflict in intel_uc_fw.c, we ended up with 2 copies of try_firmware_load() somehow. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/Y8fW2Ny1B1hZ5ZmF@tursulin-desk
-
Tvrtko Ursulin authored
Default engine map is exactly about uabi engines so no excuse not to use the appropriate iterator to populate it. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> [tursulin: Fixed up r-b tag spelling.] Link: https://patchwork.freedesktop.org/patch/msgid/20230123185629.1593320-1-jonathan.cavitt@intel.com
-
Gustavo Sousa authored
That register became a multicast register as of Xe_HP and it is currently used only for DG2. Use a proper prefix since there could be usage of the same register for previous platforms in the future, which would require a different definition (i.e. using _MMIO). Note that, in its current state, the code does not cause functional problems, since the actual application of the workaround would implicitly use multicast mode. This fix is more toward consistency and being future-proof uses of this register outside of workarounds. v2: - Add paragraph noting that this change is for consistency and making the code future-proof. (Matt) Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Matthew Atwood <matthew.s.atwood@intel.com> Fixes: 468a4e63 ("drm/i915/dg2: Introduce Wa_18018764978") Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230120181423.90507-1-gustavo.sousa@intel.com
-
- 20 Jan, 2023 1 commit
-
-
Arnd Bergmann authored
The definition of intel_selftest_modify_policy() does not match the declaration, as gcc-13 points out: drivers/gpu/drm/i915/selftests/intel_scheduler_helpers.c:29:5: error: conflicting types for 'intel_selftest_modify_policy' due to enum/integer mismatch; have 'int(struct intel_engine_cs *, struct intel_selftest_saved_policy *, u32)' {aka 'int(struct intel_engine_cs *, struct intel_selftest_saved_policy *, unsigned int)'} [-Werror=enum-int-mismatch] 29 | int intel_selftest_modify_policy(struct intel_engine_cs *engine, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~ In file included from drivers/gpu/drm/i915/selftests/intel_scheduler_helpers.c:11: drivers/gpu/drm/i915/selftests/intel_scheduler_helpers.h:28:5: note: previous declaration of 'intel_selftest_modify_policy' with type 'int(struct intel_engine_cs *, struct intel_selftest_saved_policy *, enum selftest_scheduler_modify)' 28 | int intel_selftest_modify_policy(struct intel_engine_cs *engine, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~ Change the type in the definition to match. Fixes: 617e87c0 ("drm/i915/selftest: Fix hangcheck self test for GuC submission") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230117163743.1003219-1-arnd@kernel.org
-
- 19 Jan, 2023 3 commits
-
-
Gustavo Sousa authored
That register doesn't belong to a specific engine, so the proper placement for workarounds programming it should be general_render_compute_wa_init(). Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230118155249.41551-3-gustavo.sousa@intel.com
-
Gustavo Sousa authored
Extend the existing documentation in gt/intel_workarounds.c to make it clear which functions register workarounds should be implemented in according to their types. Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230118155249.41551-2-gustavo.sousa@intel.com
-
Lucas De Marchi authored
Commit 0d0e7d1e ("drm/i915/mtl: Define engine context layouts") added the engine context for Meteor Lake. In a second revision of the patch it was believed the xcs offsets were wrong due to a tagging issue in the spec. The first version was actually correct, as shown by the intel_lrc_live_selftests/live_lrc_layout test: i915: Running gt_lrc i915: Running intel_lrc_live_selftests/live_lrc_layout bcs0: LRI command mismatch at dword 1, expected 1108101d found 11081019 [drm:drm_helper_probe_single_connector_modes [drm_kms_helper]] [CONNECTOR:236:DP-1] disconnected bcs0: HW register image: [0000] 00000000 1108101d 00022244 ffff0008 00022034 00000088 00022030 00000088 ... bcs0: SW register image: [0000] 00000000 11081019 00022244 00090009 00022034 00000000 00022030 00000000 The difference in the 2 additional dwords (0x1d vs 0x19) are the offsets 0x120 / 0x124 that are indeed part of the context image. Bspec: 45585 Fixes: 0d0e7d1e ("drm/i915/mtl: Define engine context layouts") Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230111235531.3353815-2-radhakrishna.sripada@intel.com
-
- 18 Jan, 2023 3 commits
-
-
Matt Roper authored
The implementation of Wa_22011450934 introduced three new register definitions in i915_reg.h that didn't get moved to the GT/engine register headers when all the other registers moved; let's move them to the appropriate headers and tidy up their definitions now for consistency: - STATE_ACK_DEBUG is moved to the engine register header and converted to a parameterized definition; the workaround only needs the RCS instance to be programmed, but there are instances on other engines that could be used by other workarounds in the future. - The two CULLBIT registers move to the GT register header. Since they belong to MMIO ranges that became MCR starting with Xe_HP, their definitions should be defined as MCR_REG() and use an Xe_HP prefix to keep the register semantics clear. Note that the MCR definition is just for consistency and to prevent accidental misuse if other workarounds related to these registers show up in the future. There's no functional change to today's driver since the workaround that references these registers only accesses them via MI_LRR engine instructions. Engine-initiated register accesses do not utilize the same steering controls as CPU-initiated accesses; they use a different steering control register (0x20CC) which is initialized to a non-terminated DSS target by pre-OS firmware and never changed thereafter (i915 does not touch it and userspace does not have permission to change that register). Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230117202627.4134579-1-matthew.d.roper@intel.com
-
Jani Nikula authored
Remove a couple of unnecessary includes. Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230117123856.2271720-1-jani.nikula@intel.com
-
Chris Wilson authored
Make sure that upon error after we have acquired the wakeref we do release it again. v2: add another missing "goto out_wf"(Andi). Fixes: 027c38b4 ("drm/i915/selftests: Grab the runtime pm in shrink_thp") Cc: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Signed-off-by: Chris Wilson <chris.p.wilson@linux.intel.com> Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230117123234.26487-1-nirmoy.das@intel.com
-
- 17 Jan, 2023 2 commits
-
-
John Harrison authored
When trying to analyse bug reports from CI, customers, etc. it can be difficult to work out exactly what is happening on which GT in a multi-GT system. So add GT oriented debug/error message wrappers. If used instead of the drm_ equivalents, you get the same output but with a GT# prefix on it. v2: Go back to using lower case names (combined review feedback). Convert intel_gt.c as a first step. v3: Add gt_err_ratelimited() as well, undo one conversation that might not have a GT pointer in some scenarios (review feedback from Michal W). Split definitions into separate header (review feedback from Jani). Convert all intel_gt*.c files. v4: Re-order some macro definitions (Andi S), update (c) date (Tvrtko) Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230111200429.2139084-2-John.C.Harrison@Intel.com
-
Sasa Dragic authored
RC6p on Sandy Bridge got re-enabled over time, causing visual glitches and GPU hangs. Disabled originally in commit 1c8ecf80 ("drm/i915: do not enable RC6p on Sandy Bridge"). Signed-off-by: Sasa Dragic <sasa.dragic@gmail.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221219172927.9603-2-sasa.dragic@gmail.com Fixes: fb6db0f5 ("drm/i915: Remove unsafe i915.enable_rc6") Fixes: 13c5a577 ("drm/i915/gt: Select the deepest available parking mode for rc6") Cc: stable@vger.kernel.org Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
- 16 Jan, 2023 4 commits
-
-
git://anongit.freedesktop.org/drm/drm-intelDave Airlie authored
drm/i915 feature pull #1 for v6.3: Features and functionality: - Meteorlake display enabling (Animesh, Luca, Stan, Jouni, Anusha) - DP MST DSC support (Stan) - Gamma/degamma readout support for the state checker (Ville) - Enable SDP split support for DP 2.0 (Vinod) - Add probe blocking support to i915.force_probe parameter (Rodrigo) - Enable Xe HP 4tile support (Jonathan) Refactoring and cleanups: - Color refactoring, especially related to DSB usage (Ville) - DSB refactoring (Ville) - DVO refactoring (Ville) - Backlight register and logging cleanups (Jani) - Avoid display direct calls to uncore (Maarten, Jani) - Add new "soc" sub-directory (Jani) - Refactor DSC platform support checks (Swati) Fixes: - Interlace modes are no longer supported starting at display version 12 (Ankit) - Use polling read for aux control (Arun) - DMC firmware no longer requires specific versions (Gustavo) - Fix PSR flickering and freeze issues (Jouni) - Fix ICL+ DSI GPIO handling (Jani) - Ratelimit errors in display engine irqs (Lucas) - Fix DP MST DSC bpp and timeslot calculations (Stan) - Fix CDCLK squash and crawl sequences (Ville, Anusha) - Fix bigjoiner checks for fused pipes (Ville) - Fix ADP+ degamma LUT size (Ville) - Fix DVO ch7xxx and sil164 suspend/resume (Ville) - Fix memory leak in VBT parsing (Xia Fukun) - Fix VBT packet port selection for dual link DSI (Mikko Kovanen) - Fix SDP infoframe product string for discrete graphics (Clint) - Fix VLV/CHV HDMI/DP audio enable (Ville) - Fix VRR delays and calculations (Ville) - No longer disable transcoder for PHY test pattern change (Khaled) - Fix dual PPS handling (Ville) - Fix timeout and wait for DDI BUF CTL active after enabling (Ankit) Merges: - Backmerge drm-next to sync up with v6.2-rc1 (Jani) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/87tu0wez34.fsf@intel.com
-
git://anongit.freedesktop.org/drm/drm-miscDave Airlie authored
drm-misc-next for v6.3: UAPI Changes: * fourcc: Document Open Source user waiver Cross-subsystem Changes: * firmware: fix color-format selection for system framebuffers Core Changes: * format-helper: Add conversion from XRGB8888 to various sysfb formats; Make XRGB8888 the only driver-emulated legacy format * fb-helper: Avoid blank consoles from selecting an incorrect color format * probe-helper: Enable/disable HPD on connectors plus driver updates * Use drm_dbg_ helpers in several places * docs: Document defaults for CRTC backgrounds; Document use of drm_minor Driver Changes: * arm/hdlcd: Use new debugfs helpers * gud: Use new debugfs helpers * panel: Support Visionox VTDR6130 AMOLED DSI; Support Himax HX8394; Convert many drivers to common generic DSI write-sequence helper * v3d: Do not opencode drm_gem_object_lookup() * vc4: Various HVS an CRTC fixes * vkms: Fix SEGFAULT from incorrect GEM-buffer mapping * Convert various drivers to i2c probe_new() Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/Y8ADeSzZDj+tpibF@linux-uq9g
-
https://gitlab.freedesktop.org/agd5f/linuxDave Airlie authored
amd-drm-next-6.3-2023-01-13: amdgpu: - Fix possible segfault in failure case - Rework FW requests to happen in early_init for all IPs so that we don't lose the sbios console if FW is missing - PSR fixes - Misc cleanups - Unload fix - SMU13 fixes amdkfd: - Fix for cleared VRAM BOs - Fix cleanup if GPUVM creation fails - Memory accounting fix - Use resource_size rather than open codeing it - GC11 mGPU fix radeon: - Fix memory leak on shutdown Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230113225911.7776-1-alexander.deucher@amd.com
-
https://gitlab.freedesktop.org/agd5f/linuxDave Airlie authored
amd-drm-next-6.3-2023-01-06: amdgpu: - secure display support for multiple displays - DML optimizations - DCN 3.2 updates - PSR updates - DP 2.1 updates - SR-IOV RAS updates - VCN RAS support - SMU 13.x updates - Switch 1 element arrays to flexible arrays - Add RAS support for DF 4.3 - Stack size improvements - S0ix rework - Soft reset fix - Allow 0 as a vram limit on APUs - Display fixes - Misc code cleanups - Documentation fixes - Handle profiling modes for SMU13.x amdkfd: - Error handling fixes - PASID fixes radeon: - Switch 1 element arrays to flexible arrays drm: - Add DP adaptive sync DPCD definitions UAPI: - Add new INFO queries for peak and min sclk/mclk for profile modes on newer chips Proposed mesa patch: https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/278Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230106222037.7870-1-alexander.deucher@amd.com
-
- 13 Jan, 2023 1 commit
-
-
Alan Previn authored
If PXP arb-session is being attempted on older hardware SKUs or on hardware with older, unsupported, firmware versions, then don't report the failure with a drm_error. Instead, look specifically for the API-version error reply and drm_dbg that reply. In this case, the user-space will eventually get a -ENODEV for the protected context creation which is the correct behavior and we don't create unnecessary drm_error's in our dmesg (for what is unsupported platforms). Changes from prio revs: v2 : - remove unnecessary newline. (Jani) v1 : - print incorrect version from input packet, not output. Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com> Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221221174901.2703954-1-alan.previn.teres.alexis@intel.com
-
- 12 Jan, 2023 3 commits
-
-
Maíra Canal authored
With commit 359c6649 ("drm/gem: Implement shadow-plane {begin, end}_fb_access with vmap"), the behavior of the shadow-plane helpers changed and the vunmap is now performed at the end of the current pageflip, instead of the end of the following pageflip. By performing the vunmap at the end of the current pageflip, invalid memory is accessed by the vkms during the plane composition, as the data is being unmapped before being used, as reported by the following warning: [ 275.866047] BUG: unable to handle page fault for address: ffffb382814e8002 [ 275.866055] #PF: supervisor read access in kernel mode [ 275.866058] #PF: error_code(0x0000) - not-present page [ 275.866061] PGD 1000067 P4D 1000067 PUD 110a067 PMD 46e3067 PTE 0 [ 275.866066] Oops: 0000 [#1] PREEMPT SMP PTI [ 275.866070] CPU: 2 PID: 49 Comm: kworker/u8:2 Not tainted 6.1.0-rc6-00018-gb357e7ac-dirty #54 [ 275.866074] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.1-2.fc37 04/01/2014 [ 275.866076] Workqueue: vkms_composer vkms_composer_worker [vkms] [ 275.866084] RIP: 0010:XRGB8888_to_argb_u16+0x5c/0xa0 [vkms] [ 275.866092] Code: bf 56 0a 0f af 56 70 48 8b 76 28 01 ca 49 83 f8 02 41 b9 01 00 00 00 4d 0f 43 c8 48 01 f2 48 83 c2 02 31 f6 66 c7 04 f0 ff ff <0f> b6 0c b2 89 cf c1 e7 08 09 cf 66 89 7c f0 02 0f b6 4c b2 ff 89 [ 275.866095] RSP: 0018:ffffb382801b7db0 EFLAGS: 00010246 [ 275.866098] RAX: ffff896336ace000 RBX: ffff896310e293c0 RCX: 0000000000000000 [ 275.866101] RDX: ffffb382814e8002 RSI: 0000000000000000 RDI: ffffb382801b7de8 [ 275.866103] RBP: 0000000000001400 R08: 0000000000000280 R09: 0000000000000280 [ 275.866105] R10: 0000000000000010 R11: ffffffffc011d990 R12: ffff896302a1ece0 [ 275.866107] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000080008001 [ 275.866109] FS: 0000000000000000(0000) GS:ffff89637dd00000(0000) knlGS:0000000000000000 [ 275.866112] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 275.866114] CR2: ffffb382814e8002 CR3: 0000000003bb4000 CR4: 00000000000006e0 [ 275.866120] Call Trace: [ 275.866123] <TASK> [ 275.866124] compose_active_planes+0x1c4/0x380 [vkms] [ 275.866132] vkms_composer_worker+0x9f/0x130 [vkms] [ 275.866139] process_one_work+0x1c0/0x370 [ 275.866160] worker_thread+0x221/0x410 [ 275.866164] ? worker_clr_flags+0x50/0x50 [ 275.866167] kthread+0xe1/0x100 [ 275.866172] ? kthread_blkcg+0x30/0x30 [ 275.866176] ret_from_fork+0x22/0x30 [ 275.866181] </TASK> [ 275.866182] Modules linked in: vkms [ 275.866186] CR2: ffffb382814e8002 [ 275.866191] ---[ end trace 0000000000000000 ]--- Therefore, introduce again prepare_fb and cleanup_fb functions to the vkms, which were previously removed on commit b43e2ec0 ("drm/vkms: Let shadow-plane helpers prepare the plane's FB"). Fixes: 359c6649 ("drm/gem: Implement shadow-plane {begin, end}_fb_access with vmap") Acked-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Maíra Canal <mcanal@igalia.com> Signed-off-by: Melissa Wen <melissa.srw@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230111131304.106039-1-mcanal@igalia.com
-
Ankit Nautiyal authored
Defeature Display Interlace support. Support for interlace modes is removed from Gen 12 onwards. Pruning the interlace modes for HDMI for Display >=12. Bspec: 50490 v2: Add check for both DP and HDMI. (Ville) Get rid of redundant check for interlace mode in modevalid. (Ville) v3: Simplify the condition to avoid interlace modes. (Jani) Signed-off-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Uma Shankar <uma.shankar@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230105124125.1129653-1-ankit.k.nautiyal@intel.com
-
Ankit Nautiyal authored
Since the DP/HDMI connector do not set connector->doublescan_allowed, the doublescan modes will get automatically filtered during drm_helper_probe_single_connector_modes(). Therefore check for double scan modes is not required and is dropped from modevalid functions for both DP and HDMI. Signed-off-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Uma Shankar <uma.shankar@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221017143038.1748319-2-ankit.k.nautiyal@intel.com
-
- 11 Jan, 2023 3 commits
-
-
Daniel Vetter authored
The documentation for struct drm_minor already states this, but that's not always that easy to find. Also due to historical reasons we still have the minor-centric interfaces (like drm_debugfs_create_files), but since this is now getting fixed we can put a few more pointers in place as to how this should be done ideally. Note that debugfs isn't there yet for all cases (debugfs files on kms objects like crtc/connector aren't supported, neither debugfs files with full fops), so the debugfs side of this is still rather aspirational and more for new users than converting everything existing. todo.rst covers the additional work needed already. Motivated by some discussion with Rodrigo on irc about how drm/xe should lay out its sysfs interfaces. v2: Make the debugfs situation clearer in the commit message, but don't elaborate more in the actual kerneldoc to avoid distracting from the main message around sysfs (Jani) Also fix some typos. Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Melissa Wen <mwen@igalia.com> Reviewed-by: Maíra Canal <mcanal@igalia.com> Acked-by: Maxime Ripard <maxime@cerno.tech> Acked-by: Jani Nikula <jani.nikula@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Wambui Karuga <wambui.karugax@gmail.com> Cc: Maíra Canal <mcanal@igalia.com> Cc: Maxime Ripard <maxime@cerno.tech> Cc: Melissa Wen <mwen@igalia.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230109164604.3860862-1-daniel.vetter@ffwll.ch
-
Philip Yang authored
Use page aligned size to reserve memory usage because page aligned TTM BO size is used to unreserve memory usage, otherwise no page aligned size causes memory usage accounting unbalanced. Change vram_used definition type to int64_t to be able to trigger WARN_ONCE(adev && adev->kfd.vram_used < 0, "..."), to help debug the accounting issue with warning and backtrace. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Philip Yang authored
If acquire_vm failed when initializing KFD vm, set vm->process_info to NULL and free process info, otherwise, the future acquire_vm will always fail as vm->process_info is not NULL. Pass avm as parameter to remove the duplicate code. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-