- 19 Dec, 2023 40 commits
-
-
Matthew Auld authored
No need to allocate extra pages for this if we know flat-ccs AUX state is not even possible, like for normal system memory objects. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Thomas Hellström authored
Modify the xe_migrate_copy() function somewhat to explicitly allow copying of data between two buffer objects including system memory buffer objects. Update the migrate test accordingly. v2: - Check that buffer object sizes match when copying (Matthew Auld) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Thomas Hellström authored
The TTM resource cursor was set up incorrectly. Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Lucas De Marchi authored
Port Wa_14014475959 to xe_wa fixing its condition. The workaround should only be applied on the primary GT, not on media. So just checking by MTL platform is not enough: checking GT is of the right type is also needed. Since the GRAPHICS_STEP() does checks the GT type, we could leave the first check as a platform one: it'd would be easier to understand and not go out of sync with the graphics_ip_map[] in drivers/gpu/drm/xe/xe_pci.c. However it also means that new platforms using the same IP wouldn't match. Prefer using the IP version. Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230526164358.86393-22-lucas.demarchi@intel.comSigned-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Lucas De Marchi authored
When running rules on MTL and beyond that have media as a standalone GT, the rule should only match if the gt passed as parameter match the version/range/stepping that the rule is checking. This allows workarounds affecting only the media GT to be applied only on that GT and vice-versa. For platforms before MTL, the GT will not be of media type, even if it includes media engines. Make sure to cover that case by checking if the platforma has standalone media. Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230526164358.86393-21-lucas.demarchi@intel.comSigned-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Lucas De Marchi authored
Port Wa_1509372804 to xe_wa so it's reported as active. v2: Match workaround database, starting from A0 stepping (Matt Roper) Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230526164358.86393-20-lucas.demarchi@intel.comSigned-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Lucas De Marchi authored
Wa_16015675438 and Wa_18020744125 apply to DG2 using the same action and conditions. Add both to the oob rules so they are both reported as active. Note that previously they were not checking by platform or IP version, hence making them not future-proof. Those workarounds should only be active in PVC and DG2, besides the check for "no render engine". v2: From current WA database, Wa_16015675438 applies to all DG2 subplatforms except G11. Migrate condition to use subplatform and remove G11 from the match (Matt Roper) Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230526164358.86393-19-lucas.demarchi@intel.comSigned-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Lucas De Marchi authored
Wa_22012727170 and Wa_22012727685 apply to DG2 using the same action and conditions. Add both to the oob rules so they are both reported as active. Do not Wa_22012727170 to PVC and MTL since only early A* steppings are affected. v2: Remove DG2_G10 from Wa_22012727685 to match current WA database (Matt Roper) v3: GRAPHICS_STEP(A0, FOREVER) can be left alone for DG2 as this means all steppings Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230526164358.86393-18-lucas.demarchi@intel.comSigned-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Lucas De Marchi authored
Port Wa_16011777198 to xe_wa so it's reported as active. Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230526164358.86393-17-lucas.demarchi@intel.comSigned-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Lucas De Marchi authored
Wa_14012197797 and Wa_22011391025 apply to DG2 using the same action. They apply to slightly different conditions. Add both to the oob rules so they are both reported as active. Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230526164358.86393-16-lucas.demarchi@intel.comSigned-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Lucas De Marchi authored
Port Wa_16011759253 to oob. Wa_22011383443, that has the same action, doesn't need to be ported as it targets early PVC steppings. Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230526164358.86393-15-lucas.demarchi@intel.comSigned-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Lucas De Marchi authored
Let xe_guc.c start using XE_WA() for workarounds, starting from a simple one: Wa_22012773006. It's also changed to start with graphics version 12, since that is the first supported by xe. Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230526164358.86393-14-lucas.demarchi@intel.comSigned-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Lucas De Marchi authored
There are WAs that, due to their nature, cannot be applied from a central place like xe_wa.c. Those are peppered around the rest of the code, as needed. Now they have a new name: "out-of-band workarounds". These workarounds have their names and rules still grouped in xe_wa.c, inside the xe_wa_oob array, which is generated at compile time by xe_wa_oob.rules and the hostprog xe_gen_wa_oob. The code generation guarantees that the header xe_wa_oob.h contains the IDs for the workarounds that match the index in the table. This way the runtime checks that are spread throughout the code are simple tests against the bitmap saved during initialization. v2: Fix prev_name tracking not working when it's empty, i.e. when there is more than 1 continuation rule. Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230526164358.86393-13-lucas.demarchi@intel.comSigned-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Lucas De Marchi authored
When doing out-of-tree builds with O= or KBUILD_OUTPUT=, it's important to also add the directory where the target is saved. Otherwise any file generated by the build system may not be available for other targets depending on it. The $(obj) is added automatically when building the entire kernel, but it's not added when M=drivers/gpu/drm/xe is added. Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230526164358.86393-12-lucas.demarchi@intel.comSigned-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Lucas De Marchi authored
Add a separate struct to hold entries in a table that has no action associated with each of them. The goal is that the caller in future can set a per-context callback, or just use the active entry marking feature. Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230526164358.86393-11-lucas.demarchi@intel.comSigned-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Lucas De Marchi authored
Start differentiating the media and graphics stepping as it will be important for MTL. Note that RTP is still not prepared to handle the different types of GT, i.e. checking for graphics version/range/stepping on a media gt or vice versa still matches regardless of the gt being passed as parameter. Changing it to accommodate MTL is left for a future patch. Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230526164358.86393-10-lucas.demarchi@intel.comSigned-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Lucas De Marchi authored
Rename the RTP match in order to prepare the code base to check for the media version. Up until MTL, the graphics vs media distinction wrt to stepping was not ver relevant as they were the same GT. However, with MTL this is no longer true. Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230526164358.86393-9-lucas.demarchi@intel.comSigned-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Lucas De Marchi authored
Add a "workarounds" node in debugfs that can dump all the active workarounds using the information recorded by rtp infra when those workarounds were processed. v2: move workarounds to be reported per-GT Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230526164358.86393-8-lucas.demarchi@intel.comSigned-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Lucas De Marchi authored
Allocate the data to track workarounds on each gt of the device, and pass that to RTP so the active workarounds are tracked. Even if the workarounds available until now are mostly device or platform centric, with the different IP versions for media and graphics starting with MTL, it's possible that some workarounds need to be applied only on select GTs. Also, given the workaround database is per IP block, for tracking purposes there is no need to differentiate the workarounds per engine class. Hence the bitmask to track active workarounds can be tracked per GT. v2: Move the tracking from per-device to per-GT basis (Matt Roper) Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230526164358.86393-7-lucas.demarchi@intel.comSigned-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Lucas De Marchi authored
Add the metadata in struct xe_rtp_process_ctx, to be set by xe_rtp_process_ctx_enable_active_tracking(), so rtp knows how to mark the active entries while processing the table. This can be used by the WA infra to record what are the active workarounds. Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230526164358.86393-6-lucas.demarchi@intel.comSigned-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Lucas De Marchi authored
The xe_rtp_process() function and xe_rtp_entry depend on the save-restore struct. In future it will be desired to process rtp rules, regardless of adding them to a save-restore. Rename the struct and function so the intent is clear and the name is freed for future uses. Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230526164358.86393-5-lucas.demarchi@intel.comSigned-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Lucas De Marchi authored
Now that rule_matches() always has an xe pointer, replace the XE_WARN_ON with the more appropriate drm_warn(). Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230526164358.86393-4-lucas.demarchi@intel.comSigned-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Lucas De Marchi authored
The selection between hwe and gt is exposed to the outside of rtp, by the xe_rtp_process() function. However it doesn't make seense from the caller point of view to pass a hwe and a gt as argument since the gt should always be the one containing the hwe. This clarifies the interface by separating the context creation into an initializer. The initializer then passes the correct value and there should never be a case with hwe and gt set: when hwe is passed, the gt is the one containing it. Internally the functions continue receiving the argument separately. v2: Leave the device-only context to a separate patch if they are indeed needed later Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230526164358.86393-3-lucas.demarchi@intel.comSigned-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Lucas De Marchi authored
It was missing one digit, so not showing up as a proper WA number. Add the missing number and annotate it with a FIXME as there are more to be implemented to consider this WA done: ensure CS is stop before doing a reset, wait for pending. Also, this WA applies to platforms up to graphics version 1270 (with the exception of MTL A*, that are not supported in xe). Fix platform check. Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/284Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230526164358.86393-2-lucas.demarchi@intel.comSigned-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Matt Roper authored
Generally !has_flatccs implies that a platform has AuxCCS compression and thus needs to invalidate the AuxCCS TLB. However PVC is a special case because it has no compression of either type (FlatCCS or AuxCCS) so we should avoid writing to non-existent AuxCCS registers. Reviewed-by: Haridhar Kalvala <haridhar.kalvala@intel.com> Link: https://lore.kernel.org/r/20230524192635.673293-1-matthew.d.roper@intel.comSigned-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Christopher Snowhill authored
Padding and reserved fields are declared such that they must be zeroed, so verify that they're all zero in the respective ioctl functions. Derived from original patch by mlankhorst. v2: Removed extensions checks where there were none originally. (José) Moved extraneous parentheses to the correct places. (Lucas) Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Christopher Snowhill <kode54@gmail.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Christopher Snowhill authored
Pad the uAPI definition so that it would align identically between 64-bit and 32-bit uarch, so consumers using this header will work correctly from 32-bit compat userspace on a 64-bit kernel. Do it in a minimally invasive way, so that 64-bit userspace will still work with the previous header, and so that no fields suddenly change sizes. Originally inspired by mlankhorst. Signed-off-by: Christopher Snowhill <kode54@gmail.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Niranjana Vishwanathapura authored
The iommu_dma_map_sg() function ensures iova allocation doesn't cross dma segment boundary. It does so by padding some sg elements. This can cause overflow, ending up with sg->length being set to 0. Avoid this by halving the maximum segment size (rounded down to PAGE_SIZE). Specify maximum segment size for sg elements by using sg_alloc_table_from_pages_segment() to allocate sg_table. v2: Use correct max segment size in dma_set_max_seg_size() call Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Bruce Chang <yu.bruce.chang@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Matt Roper authored
For platforms with GMD_ID registers, the IP stepping should be determined from the 'revid' field of those registers rather than from the PCI revid. The hardware teams have indicated that they plan to keep the revid => stepping mapping consistent across all GMD_ID platforms, with major steppings (A0, B0, C0, etc.) having revids that are multiples of 4, and minor steppings (A1, A2, A3, etc.) taking the intermediate values. For now we'll trust that hardware follows through on this plan; if they have to change direction in the future (e.g., they wind up needing something like an "A4" that doesn't fit this scheme), we can add a GMD_ID-based lookup table when the time comes. v2: - Set xe->info.platform before finding stepping; the pre-GMD_ID code relies on this value to pick a lookup table. v3: - Also set xe->info.subplatform before picking the stepping for pre-GMD_ID lookup. Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Link: https://lore.kernel.org/r/20230524185952.666158-1-matthew.d.roper@intel.comSigned-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Gustavo Sousa authored
Let's make sure we give the driver a valid workqueue. While at it, also make sure to call destroy_workqueue() only if the workqueue is a valid one. That is necessary because xe_device_destroy() is indirectly called as part of the cleanup process of a failed xe_device_create(). Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20230518215651.502159-3-gustavo.sousa@intel.comSigned-off-by: Gustavo Sousa <gustavo.sousa@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Gustavo Sousa authored
Otherwise no cleanup is actually done if we branch to err_put. This works for now: currently we do know that, once inside xe_device_destroy(), ttm_device_init() was successful so we can safely call ttm_device_fini(); and, for xe->ordered_wq, there is an upcoming commit to check its value before calling destroy_workqueue(). However, we might need change this in the future if we have more initializers called that can fail in a way that we can not know which one was it once inside xe_device_destroy(). Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20230518215651.502159-2-gustavo.sousa@intel.comSigned-off-by: Gustavo Sousa <gustavo.sousa@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Gustavo Sousa authored
The function drm_dev_put() should also be called if xe_device_probe() fails. v2: - Improve commit message. (Lucas) Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20230519194802.578182-1-gustavo.sousa@intel.comSigned-off-by: Gustavo Sousa <gustavo.sousa@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Matthew Auld authored
drivers/gpu/drm/xe/xe_guc_submit_types.h:47: warning: cannot understand function prototype: 'struct guc_submit_parallel_scratch ' drivers/gpu/drm/xe/xe_devcoredump_types.h:38: warning: Function parameter or member 'ct' not described in 'xe_devcoredump_snapshot' CI doesn't appear to be running BAT anymore, assuming this is caused by the CI.Hooks now failing due to above warnings. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Michal Wajdeczko authored
Both GUC_HOST_INTERRUPT and MED_GUC_HOST_INTERRUPT can pass additional payload data to the GuC but this capability is not used by the firmware yet. Stop using value mandated by legacy GuC interrupt register and use default notify value (zero) instead. Bspec: 49813, 63363 Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Michal Wajdeczko authored
This GuC register can be moved together with the rest of the GuC register definitions and be named in a similar way. v2: fix placement Bspec: 63363 Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com> #v1 Cc: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Rodrigo Vivi authored
There are multiple kind of config prints and with the upcoming devcoredump there will be another layer. Let's limit the config to the top level functions and leave the clean-up work for the compilers so we don't create a spider-web of configs. No functional change. Just a preparation for the devcoredump. Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com>
-
Rodrigo Vivi authored
Let's continue to add our existent simple logs to devcoredump one by one. Any format change should come on follow-up work. v2: remove unnecessary, and now duplicated, dma_fence annotation. (Matthew) v3: avoid for_each with faulty_engine since that can be already freed at the time of the read/free. Instead, iterate in the full array of hw_engines. (Kasan) Cc: Francois Dugast <francois.dugast@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Francois Dugast <francois.dugast@intel.com>
-
Rodrigo Vivi authored
The goal is to allow for a snapshot capture to be taken at the time of the crash, while the print out can happen at a later time through the exposed devcoredump virtual device. v2: Addressing these Matthew comments: - Handle memory allocation failures. - Do not use GFP_ATOMIC on cases like debugfs prints. - placement of @reg doc. - identation issues. v3: checkpatch v4: Rebase and get back to GFP_ATOMIC only. Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com>
-
Rodrigo Vivi authored
Let's start to move our existent logs to devcoredump one by one. Any format change should come on follow-up work. Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com>
-
Rodrigo Vivi authored
The goal is to allow for a snapshot capture to be taken at the time of the crash, while the print out can happen at a later time through the exposed devcoredump virtual device. v2: Handle memory allocation failures. (Matthew) Do not use GFP_ATOMIC on cases like debugfs prints. (Matthew) v3: checkpatch v4: pending_list allocation needs to be atomic because of the spin_lock. (Matthew) get back to GFP_ATOMIC only. (lockdep). Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com>
-