- 21 Dec, 2023 40 commits
-
-
Thomas Hellström authored
Currently we're using "compute mode" for long running VMs using preempt-fences for memory management, and "fault mode" for long running VMs using page faults. Change this to use the terminology "long-running" abbreviated as LR for long-running VMs. These VMs can then either be in preempt-fence mode or fault mode. The user can force fault mode at creation time, but otherwise the driver can choose to use fault- or preempt-fence mode for long-running vms depending on the device capabilities. Initially unless fault-mode is specified, the driver uses preempt-fence mode. v2: - Fix commit message wording and the documentation around CREATE_FLAG_LR_MODE and CREATE_FLAG_FAULT_MODE Cc: Matthew Brost <matthew.brost@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Francois Dugast <francois.dugast@intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Rodrigo Vivi authored
Although the exec ioctl is a very important one, it makes no sense to explain xe_exec before explaining the exec_queue. So, let's move this down to help bring a better flow on the documentation and code readability. It is important to highlight that this patch is changing all the ioctl numbers in a non-backward compatible way. However, we are doing this final uapi clean-up before we submit our first pull-request to be part of the upstream Kernel. Once we get there, no other change like this will ever happen and all the backward compatibility will be respected. Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
-
Rodrigo Vivi authored
Let's respect Documentation/process/botching-up-ioctls.rst and add the proper padding for a 64b alignment with all as well as all the required checks and settings for the pads and the reserved entries. v2: Fix remaining holes and double check with pahole (Jose) Ensure with pahole that both 32b and 64b have exact same layout (Thomas) Do not set query's pad and reserved bits to zero since it is redundant and already done by kzalloc (Matt) v3: Fix alignment after rebase (José Roberto de Souza) v4: Fix pad check (Francois Dugast) Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Francois Dugast <francois.dugast@intel.com> Cc: José Roberto de Souza <jose.souza@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
-
Rodrigo Vivi authored
As an information only. So Userspace can use this information and be able to correlate different GTs. Make API symmetric between Engine and GT info. There's no need right now to include a tile_query entry since there's no other information that we need from tile that is not already exposed through different queries. However, this could be added later if we have different Tile information that could matter to userspace. But let's keep the API ready for a direct reference to Tile ID based on the GT entry. Signed-off-by: Francois Dugast <francois.dugast@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
-
Rodrigo Vivi authored
First of all, let's remove the duplication. But also, let's rename it to remove the word 'frequency' out of it. In general, the first thing people think of frequency is the frequency in which the GTs are operating to execute the GPU instructions. While this frequency here is a crystal reference clock frequency which is the base of everything else, and in this case of this uAPI it is used to calculate a better and precise timestamp. v2: (Suggested by Jose) Remove the engine_cs and keep the GT info one since it might be useful for other SRIOV cases where the engine_cs will be zeroed. So, grabbing from the GT_LIST should be cleaner. v3: Keep comment on put_user() call (José Roberto de Souza) Cc: Matt Roper <matthew.d.roper@intel.com> Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Cc: Jose Souza <jose.souza@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
-
Rodrigo Vivi authored
It is currently unused, so by the rules it cannot go upstream. Also there was the desire to convert that to align with the engine_class_instance selection, but the consensus on that one is to remain with the global gt_id. So we are keeping the gt_id there, not converting to a generic sched_group and also killing this tile_mask and only using the default behavior of 0 that is to create a mapping / page_table entry on every tile, similar to what i915. Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
-
Rodrigo Vivi authored
Let's continue on the uapi clean-up with more splits with stuff into their own exclusive fields instead of reusing stuff. Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
-
Francois Dugast authored
The uAPI provides queries which return arrays of elements. As of now the format used in the struct is different depending on which element is queried. Fix this for engines by applying the pattern below: struct drm_xe_query_Xs { __u32 num_Xs; struct drm_xe_X Xs[]; ... } Instead of directly returning an array of struct drm_xe_query_engine_info, a new struct drm_xe_query_engines is introduced. It contains itself an array of struct drm_xe_engine which holds the information about each engine. v2: Use plural for struct drm_xe_query_engines as multiple engines are returned (José Roberto de Souza) Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Francois Dugast authored
The uAPI provides queries which return arrays of elements. As of now the format used in the struct is different depending on which element is queried. However, aligning on the new common pattern: struct drm_xe_query_Xs { __u32 num_Xs; struct drm_xe_X Xs[]; ... } ... would mean bringing back the name "gts" which is avoided per commit fca54ba12470 ("drm/xe/uapi: Rename gts to gt_list") so make an exception for gt and leave gt_list. Also, this change removes "query" in the name of struct drm_xe_query_gt as it is not returned from the query IOCTL. There is no functional change. v2: Leave gt_list (Matt Roper) Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Francois Dugast authored
The uAPI provides queries which return arrays of elements. As of now the format used in the struct is different depending on which element is queried. Fix this for memory regions by applying the pattern below: struct drm_xe_query_Xs { __u32 num_Xs; struct drm_xe_X Xs[]; ... } This removes "query" in the name of struct drm_xe_query_mem_region as it is not returned from the query IOCTL. There is no functional change. v2: Only rename drm_xe_query_mem_region to drm_xe_mem_region (José Roberto de Souza) v3: Rename usage to mem_regions in xe_query.c (José Roberto de Souza) Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Mauro Carvalho Chehab authored
For xe bo creation we request passing size which matches system or vram minimum page alignment. This way we want to ensure userspace is aware of region constraints and not aligned allocations will be rejected returning EINVAL. v2: - Rebase, Update uAPI documentation. (Thomas) v3: - Adjust the dma-buf kunit test accordingly. (Thomas) v4: - Fixed rebase conflicts and updated commit message. (Francois) Signed-off-by: Mauro Carvalho Chehab <mauro.chehab@linux.intel.com> Signed-off-by: Zbigniew Kempczyński <zbigniew.kempczynski@intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
José Roberto de Souza authored
We have at least 2 future features(OA and future media engines capabilities) that will require Xe to provide more information about engines to UMDs. But this information should not just be added to drm_xe_engine_class_instance for a couple of reasons: - drm_xe_engine_class_instance is used as input to other structs/uAPIs and those uAPIs don't care about any of these future new engine fields - those new fields are useless information after initialization for some UMDs, so it should not need to carry that around So here my proposal is to make DRM_XE_DEVICE_QUERY_ENGINES return an array of drm_xe_query_engine_info that contain drm_xe_engine_class_instance and 3 u64s to be used for future features. Reference OA: https://patchwork.freedesktop.org/patch/558362/?series=121084&rev=6 v2: Reduce reserved[] to 3 u64 (Matthew Brost) Cc: Francois Dugast <francois.dugast@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> [Rodrigo Rebased] Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
-
Rodrigo Vivi authored
Although the flags are about the creation, the memory placement of the BO deserves a proper dedicated field in the uapi. Besides getting more clear, it also allows to remove the 'magic' shifts from the flags that was a concern during the uapi reviews. Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
-
Mika Kuoppala authored
The bind api is extensible but for a single bind op, there is not a mechanism to extend. Add extensions field to struct drm_xe_vm_bind_op. Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Francois Dugast <francois.dugast@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Dominik Grzegorzek <dominik.grzegorzek@intel.com> Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Matthew Auld authored
From the CI logs we want to easily know if the machine is capable and allowed to enter d3cold, and can therefore potentially trigger the d3cold RPM suspend and resume path. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Anshuman Gupta <anshuman.gupta@intel.com> Cc: Riana Tauro <riana.tauro@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Michał Winiarski authored
Move params that are not used for initial "hwconfig" load to "post-hwconfig" phase. Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Michał Winiarski authored
Earlier GuC load will require more fine-grained control over reset. Extract it outside of xe_uc_init_hw. Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Michał Winiarski authored
The firmware loading for GuC is about to be moved, and will happen much earlier in the probe process, when local-memory is not yet available. While this has the potential to make the firmware loading process slower, this is only happening during probe and full device reset. Since both are not hot-paths - store all UC-like firmware in system memory. Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Michał Winiarski authored
The function does a driver specific "request firmware" step that includes validating the input, followed by wrapping the firmware binary into a buffer object. Split it into smaller parts. Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Michał Winiarski authored
A helper for managed BO allocations makes it possible to remove specific "fini" actions and will simplify the following patches adding ability to execute a release action for specific BO directly. Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Michał Winiarski authored
GuC will need to be loaded earlier during probe. Having functional GGTT is one of the prerequisites. Also rename xe_ggtt_init_noalloc to xe_ggtt_init_early to match the new call site. Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Michał Winiarski authored
GuC will need to be loaded earlier during probe. And in order to load GuC, being able to take the forcewake is going to be needed. Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Michał Winiarski authored
GuC will need to be loaded earlier during probe. And in order to load GuC, we will need the ability to create system memory allocations. Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Michał Winiarski authored
Now that MMIO init got moved to device early, we can use regular xe_mmio_read helpers to get to GMD_ID register. Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Michał Winiarski authored
SR-IOV VF doesn't have access to MMIO registers used to determine graphics/media ID. It can however communicate with GuC. Introduce xe_device_probe_early, which initializes enough HW to use MMIO GuC communication. This will allow both VF and PF/native driver to have unified probe ordering. Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Michał Winiarski authored
Both MMIO registers and GGTT for root tile will need to be used earlier during probe. Don't rely on tile count to compute the mapping size. Further more, there's no need to remap after figuring out the real resource size. Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Michał Winiarski authored
It also merges the GT (which is part of tile) initialization happening at xe_info_init with allocating other per-tile data structures into a common helper function. Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Michał Winiarski authored
Parts of xe_info_init are only dealing with processing driver_data. Extract it into xe_info_init_early to be able to use it earlier during probe. Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Animesh Manna authored
Add xe specific DSB buffer handling methods. v1: Initial version. v2: Add null check after dynamic memory allocation of vma. [Uma] Reviewed-by: Uma Shankar <uma.shankar@intel.com> Signed-off-by: Animesh Manna <animesh.manna@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Tejas Upadhyay authored
Workaround applies to Graphics 20.04 as part of ring submission V4(MattR): - Rule for engine in oob WA not supported, add explicitly V3(MattR): - Pass hwe and rename API name to hint end of ring work - Use existing RING_NOPID API V2: - Marking this WA for 20.04 instead of 20.00 Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Brian Welty authored
SW is not expected to handle TRTT faults and should report these as unsuccessful page fault in the reply, such that HW can respond by raising a CAT error. Signed-off-by: Brian Welty <brian.welty@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Brian Welty authored
Update xe_migrate_prepare_vm() to use the usm batch buffer even for servicing device page faults on integrated platforms. And as we have no VRAM on integrated platforms, device pagefault handler should not attempt to migrate into VRAM. LNL is first integrated platform to support device pagefaults. Signed-off-by: Brian Welty <brian.welty@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Michał Winiarski authored
MMIO is going to be setup earlier during probe. Move xe_mmio_probe_tiles outside of MMIO setup. Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20231129214509.1174116-6-michal.winiarski@intel.comSigned-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Michał Winiarski authored
MMIO is going to be setup earlier during probe. Move xe_set_dma_info outside of MMIO setup. Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20231129214509.1174116-5-michal.winiarski@intel.comSigned-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Michał Winiarski authored
For devres managed devices, pci_alloc_irq_vectors is also managed (see pci_setup_msi_context for reference). PCI device used by Xe is devres managed (it was enabled with pcim_enable_device), which means that calls to pci_free_irq_vectors are redundant and can be safely removed. Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20231129214509.1174116-4-michal.winiarski@intel.comSigned-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Michał Winiarski authored
Xe uses devres for most of its driver-lifetime resources, use it for pci device as well. Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20231129214509.1174116-3-michal.winiarski@intel.comSigned-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Michał Winiarski authored
DRM device used by Xe is managed, which means that final ref will be dropped on driver detach. Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20231129214509.1174116-2-michal.winiarski@intel.comSigned-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Michał Winiarski authored
Additional underscore in the header guard causes the build to fail with: drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.h:6:9: error: '_XE_ENGINE_CLASS_SYSFS_H_' is used as a header guard here, followed by #define of a different macro [-Werror,-Wheader-guard] Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20231129214509.1174116-1-michal.winiarski@intel.comSigned-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Koby Elbaz authored
Per device, set this flag to access the MTCFG register or to skip it. This is done to standardise Xe driver naming if an access to any HW should be avoided. Signed-off-by: Koby Elbaz <kelbaz@habana.ai> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Koby Elbaz authored
Per device, set this flag to enable access to the PCODE uC or to skip it. Signed-off-by: Koby Elbaz <kelbaz@habana.ai> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-