Commits · cf667aec0abeda839937cbd92884799b19df1ab7 · Kirill Smelkov / linux

19 Dec, 2023 40 commits

drm/xe: Decrement fault mode counts in xe_vm_close_and_put · cf667aec

Matthew Brost authored Mar 24, 2023

Rather waiting for the VM to be destroyed (all refs to VM go to zero),
drop the fault mode counts when the VM is closed in xe_vm_close_and_put.
This avoids a window where user space can create a faulting VM, close
it, and a subsequent creation of a non-faulting VM fails.

v2 (Lucas): Drop VLK reference in commit message
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Suggested-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

cf667aec

drm/xe: Add max engine priority to xe query · ef5e3c2f

José Roberto de Souza authored Mar 23, 2023

Intel Vulkan driver needs to know what is the maximum priority to fill
a device info struct for applications.

Right now we getting this information by creating a engine and setting
priorities from min to high to know what is the maximum priority for
running process but this leads to info messages to be printed to
dmesg:

xe 0000:03:00.0: [drm] Ioctl argument check failed at drivers/gpu/drm/xe/xe_engine.c:178: value == DRM_SCHED_PRIORITY_HIGH && !capable(CAP_SYS_NICE)

It does not cause any harm but when executing a test suite like
crucible it causes thousands of those messages to be printed.

So here adding one more property to drm_xe_query_config to fetch the
max engine priority.

Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

ef5e3c2f

drm/xe: Load HuC on Alderlake S · 9bddebf1

Anusha Srivatsa authored Mar 23, 2023

Alderlake S uses TGL HuC.
Signed-off-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230323224651.1187366-3-lucas.demarchi@intel.comSigned-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

9bddebf1

drm/xe/huc: Support for loading unversiond HuC · 85ea2bd2

Anusha Srivatsa authored Mar 23, 2023

Follow the new direction of firmware and add macro
support for loading unversioned HuC. Keep HuC
versioned loading support as well for platforms
that fall under force_probe support

Add check to ensure driver does not do any version check
for HuC if going through unversioned load.

v2: unversioned firmware to be the default for platforms
not under force_probe. Maintain versioned firmware macro support
for platforms under force-probe protection.
v3: Minor style and naming adjustments (Lucas)

Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230323224651.1187366-2-lucas.demarchi@intel.comSigned-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

85ea2bd2

drm/xe: Fix potential deadlock handling page faults · b4eecedc

Matthew Brost authored Mar 20, 2023

Within a class the GuC will hault scheduling if the head of the queue
can't be scheduled the queue will block. This can lead to deadlock if
BCS0-7 all have faults and another engine on BCS0-7 is at head of the
GuC scheduling queue as the migration engine used to fix tthe fault will
be blocked. To work around this set the migration engine to the highest
priority when servicing page faults.

v2 (Maarten): Set priority to kernel once at creation
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Brian Welty <brian.welty@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

b4eecedc

drm/xe: Use BO's GT to determine dma_offset when programming PTEs · 59ea53ee

Matthew Brost authored Mar 23, 2023

Rather than using the passed in GT, use the BO's GT determine dma_offset
when programming PTEs as these two GT's could differ (i.e. mapping a BO
from a remote GT). The BO's GT is correct GT to use as this where BO
resides, while the passed in GT is where the mapping is created.

v2:
  (Thomas)      - Kernel doc, extra new line
  (CI)          - Rebase to tip
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

59ea53ee

drm/xe/gt: some error handling fixes · 99c5952f

Matthew Auld authored Mar 22, 2023

Make sure we pass along the correct errors.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

99c5952f

drm/xe: Reinstate render / compute cache invalidation in ring ops · 4f1411e2

Matthew Brost authored Mar 21, 2023

Render / compute engines have additional caches (not just TLBs) that
need to be invalidated each batch, reinstate these invalidations in ring
ops.
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Suggested-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

4f1411e2

drm/xe: Use atomic instead of mutex for xe_device_mem_access_ongoing · 38c04b47

Maarten Lankhorst authored Feb 28, 2023

xe_guc_ct_fast_path() is called from an irq context, and cannot lock
the mutex used by xe_device_mem_access_ongoing().

Fortunately it is easy to fix, and the atomic guarantees are good enough
to ensure xe->mem_access.hold_rpm is set before last ref is dropped.

As far as I can tell, the runtime ref in device access should be
killable, but don't dare to do it yet.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

38c04b47

drm/xe: Drop zero length arrays · 044f0cfb

Matthew Brost authored Mar 20, 2023

Zero-length arrays as fake flexible arrays are deprecated and we are
moving towards adopting C99 flexible-array members instead.
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

044f0cfb

drm/xe/buddy: add compatible and intersects hooks · ce79c6c4

Matthew Auld authored Mar 21, 2023

Copy this from i915. We need .compatible for vram -> vram transfers, so
they don't just get nooped by ttm, if need to move something from
mappable to non-mappble or vice versa. The .intersects is needed for
eviction, to determine if a victim resource is worth eviction. e.g if we
need mappable space there is no point in evicting a resource that has
zero mappable pages.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

ce79c6c4

drm/xe/buddy: add visible tracking · 793e6612

Matthew Auld authored Mar 21, 2023

Replace the allocation code with the i915 version. This simplifies the
code a little, and importantly we get the accounting at the mgr level,
which is useful for debug (and maybe userspace), plus per resource
tracking so we can easily check if a resource is using one or pages in
the mappable part of vram (useful for eviction), or if the resource is
completely within the mappable portion (useful for checking if the
resource can be safely CPU mapped).

v2: Fix missing PAGE_SHIFT
v3: (Gwan-gyeong Mun)
  - Fix incorrect usage of ilog2(mm.chunk_size).
  - Fix calculation when checking for impossible allocation sizes, also
    check much earlier.
v4: (Gwan-gyeong Mun)
  - Fix calculation when extending the [fpfn, lpfn] range due to the
    roundup_pow_of_two().
v5: (Gwan-gyeong Mun)
  - Move the check for running out of mappable VRAM to before doing any of
    the roundup_pow_of_two().
v6: (Jani)
  - Stop abusing BUG_ON(). We can easily just use WARN_ON() here and
    return a proper error to the caller, which is much nicer if we ever
    trigger these.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

793e6612

drm/xe: Stop accepting value in xe_migrate_clear · 11a2407e

Balasubramani Vivekanandan authored Mar 17, 2023

Although xe_migrate_clear() has a value argument, currently the driver
is only passing 0 at all the places this function is invoked with the
exception the kunit tests are using the parameter to validate this
function with different values.
xe_migrate_clear() is failing on platforms with link copy engines
because xe_migrate_clear() via emit_clear() is using the blitter
instruction XY_FAST_COLOR_BLT to clear the memory. But this instruction
is not supported by link copy engine.
So the solution is to use the alternate instruction MEM_SET when
platform contains link copy engine. But MEM_SET instruction accepts only
8-bit value for setting whereas the value agrument of xe_migrate_clear()
is 32-bit.
So instead of spreading this limitation around all invocations of
xe_migrate_clear() and causing more confusion, it was decided to not
accept any value itself as driver does not really need this currently.

All the kunit tests are adapted as per the new function prototype.

This will be followed by a patch to add support for link copy engines.
Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

11a2407e

drm/xe: Use max wopcm size when validating the preset GuC wopcm size · eb230dc4

Balasubramani Vivekanandan authored Mar 17, 2023

When the GuC wopcm base and size registers are populated by BIOS/IFWI,
validate the parameters against the maximum allowed wopcm size.

Bpsec: 44982
Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

eb230dc4

drm/xe/buddy: remove the virtualized start · 1a653b87

Matthew Auld authored Mar 14, 2023

Hopefully not needed anymore. We can add a .compatible() hook once we
need to differentiate between mappable and non-mappable vram. If the
allocation is not contiguous then the start value is kind of
meaningless, so rather just mark as invalid.

In upstream, TTM wants to eventually remove the ttm_resource.start
usage.

References: 54443270 ("drm/ttm: Add new callbacks to ttm res mgr")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

1a653b87

drm/xe/mcr: Separate version from engine type selection · 436dbd6b

Lucas De Marchi authored Mar 17, 2023

In order to improve readability and make it more future proof,
split the engine type from the graphics/platform checks.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230317223441.3891073-1-lucas.demarchi@intel.comSigned-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

436dbd6b

drm/xe/vram: start tracking the io_size · 2492f454

Matthew Auld authored Mar 14, 2023

First step towards supporting small-bar is to track the io_size for
vram. We can longer assume that the io_size == vram size. This way we
know how much is CPU accessible via the BAR, and how much is not.
Effectively giving us a two tiered vram, where in some later patches we
can support different allocation strategies depending on if the memory
needs to be CPU accessible or not.

Note as this stage we still clamp the vram size to the usable vram size.
Only in the final patch do we turn this on for real, and allow distinct
io_size and vram_size.

v2: (Lucas):
  - Improve the commit message, plus improve the kernel-doc for the
    io_size to give a better sense of what it actually is.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

2492f454

drm/xe/vm: Defer vm rebind until next exec if nothing to execute · 8e41443e

Thomas Hellström authored Mar 14, 2023

If all compute engines of a vm in compute mode are idle,
defer a rebind to the next exec to avoid the VM unnecessarily trying
to make memory resident and compete with other VMs for available
memory space.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

8e41443e

drm/xe/tests: Test both CPU- and GPU page-table updates with the migrate test · 7cba3396

Thomas Hellström authored Mar 13, 2023

Add a test parameter to force GPU page-table updates with the migrate
test and test both CPU- and GPU updates. Also provide some timing
results.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

7cba3396

drm/xe: Use a small negative initial seqno · a5dfb471

Thomas Hellström authored Mar 10, 2023

Causes an early 32-bit wrap and may thus help CI catch wrapping errors
that may otherwise not show early enough.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

a5dfb471

drm/xe: Introduce xe_engine_is_idle() · 155c9165

Thomas Hellström authored Mar 10, 2023

Introduce xe_engine_is_idle, and replace the static function in
xe_migrate.c.
The latter had two flaws. First the seqno == 1 test might return a false
true value each time the seqno counter wrapped, Second, the
cur_seqno == next_seqno test would never return true.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

155c9165

drm/xe/tests: Support CPU page-table updates in the migrate test · 17a28ea2

Thomas Hellström authored Mar 10, 2023

The migrate test currently supports only GPU pagetable updates and
will thus break if we fix the CPU pagetable update selection.

Fix the migrate test first.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

17a28ea2

drm/xe/migrate: Update cpu page-table updates · fc1cc680

Thomas Hellström authored Mar 10, 2023

Don't wait for GPU to be able to update page-tables using CPU. Putting
ourselves to sleep may be more of a problem than using GPU for
page-table updates. Also allow the vm to be NULL since the migrate
kunit test uses NULL for vm.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

fc1cc680

drm/xe: Use a define to set initial seqno for fences · 7c51050b

Thomas Hellström authored Mar 09, 2023

Also for HW fences, write the initial seqno - 1 to the HW completed
seqno to initialize.

v2:
- Use __dma_fence_is_later() to compare hw fence seqnos. (Matthew Auld)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

7c51050b

drm/xe/xe_uc_fw: Use firmware files from standard locations · 8eb7ad99

Mauro Carvalho Chehab authored Mar 10, 2023

The GuC/HuC firmware files used by Xe drivers are the same as
used by i915. Use the already-known location to find those
firmware files, for a couple of reasons:

1. Avoid having the same firmware placed on two different
   places on MODULE_FIRMWARE(), if both 915 and xe drivers
   are compiled;

2. Having firmware files located on different locations may end
   creating bigger initramfs, as the same files will be copied
   twice my mkinitrd/dracut/...;

3. this is the place where those firmware files are located at
   https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git
   Upstream doesn't expect them to have on other places;

4. When built with display support, DMC firmware will be
   loaded from i915/ directory. It is very confusing to have
   some firmware files on a different place for the same driver.

Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Lucas de Marchi <lucas.demarchi@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Thomas Hellstrom <thomas.hellstrom@linux.intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: David Airlie <airlied@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
[ Mostly agree with the direction of "use the firmware blobs from
  upstream at their current location for these platforms". Previous
  directory was not wrong as the plan was to have it handled in the
  upstream firmware repo. For future platforms the location can be
  changed if the support is only in xe ]
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230310081338.3275583-1-mauro.chehab@linux.intel.comSigned-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

8eb7ad99

drm/xe/irq: the irq handler local variable need not be static · 5fd92bdd

Jani Nikula authored Mar 09, 2023

It's just a local variable.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

5fd92bdd

drm/xe: Replace i915 with xe in uapi · ccbb6ad5

Lucas De Marchi authored Mar 13, 2023

All structs and defines had already been renamed to "xe", but some
comments with "i915" were left over. Rename them.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://lore.kernel.org/r/20230313211628.2492587-1-lucas.demarchi@intel.comSigned-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

ccbb6ad5

drm/xe: Add missing LRC workarounds for graphics 1200 · fd93946d

Lucas De Marchi authored Mar 13, 2023

Synchronize LRC workarounds for graphics version 1200 with i915 up to
commit 7cdae9e9 ("drm/i915: Move DG2 tuning to the right function").
These were probably missed for TGL/RKL before because in i915 it uses a
!IS_DG1() condition.  Avoid a similar issue by just checking the
graphics version 1200 since DG1 is 1210.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230314003012.2600353-14-lucas.demarchi@intel.comSigned-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

fd93946d

drm/xe: Add missing ADL-P engine workaround · 95ff48c2

Lucas De Marchi authored Mar 13, 2023

Add the one missing workaround for ADL-P when comparing to i915 up to
commit 7cdae9e9 ("drm/i915: Move DG2 tuning to the right function").
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230314003012.2600353-13-lucas.demarchi@intel.comSigned-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

95ff48c2

drm/xe: Add missing DG2 lrc workarounds · 8cd7e975

Lucas De Marchi authored Mar 13, 2023

Synchronize with i915 the DG2 lrc workarounds as of
commit 4d14d771 ("drm/i915/selftest: Fix ktime_get() and h/w access
order").

A few simplifications were done when the WA should be applied to some
steps of a subplatform and all the steppings of the other subplatforms.
In this case, it was simply applied to all the steppings, which only
means applying it to a few more A* steppings.

The implementation of the workaround 16011186671 triggers a bug in the
RTP infra: it's not possible to set the flag the usual way when having
multiple actions in the entry. This may be fixed later, but for now it's
sufficient to just set the flag directly without the helper macro.

v2: Fix 14014947963 to use FIELD_SET (Matt Roper)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230314003012.2600353-12-lucas.demarchi@intel.comSigned-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

8cd7e975

drm/xe: Add missing DG2 lrc tunings · 11f78b13

Lucas De Marchi authored Mar 13, 2023

Synchronize with i915 the DG2 tunings as of
commit 4d14d771 ("drm/i915/selftest: Fix ktime_get() and h/w access
order").

Contrary to the tuning "gang timer" for TGL, there is no quick
justification for why the read back is disabled in i915. Keep it
with that flag for now. That can be tentatively removed later when the
read values are checked.

v2: Use XEHP_FF_MODE2 instead of GEN12_FF_MODE2 (Matt Roper)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230314003012.2600353-11-lucas.demarchi@intel.comSigned-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

11f78b13

drm/xe: Add missing DG2 engine workarounds · 4d5ab121

Lucas De Marchi authored Mar 13, 2023

Synchronize with i915 the DG2 gt workarounds as of
commit 4d14d771 ("drm/i915/selftest: Fix ktime_get() and h/w access
order").

A few simplifications were done when the WA should be applied to
some steps of a subplatform and all the steppings of the other
subplatforms. This happened with Wa_1509727124, Wa_22012856258 and a few
others. In figure the pre-production steppings will be removed, so this
can be already simplified a little bit.

v2:
  - Make 1308578152 conditional on first gslice fused off
  - Add the missing Wa_1608949956/Wa_14010198302 (Matt Roper)
v3:
  - Do not duplicate the implementation of 18019627453 since it's
    already covered by other WA numbers in graphics versions 1200 and
    1210
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230314003012.2600353-10-lucas.demarchi@intel.comSigned-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

4d5ab121

drm/xe: Add missing DG2 gt workarounds and tunings · 911aeb0f

Lucas De Marchi authored Mar 13, 2023

Synchronize with i915 the DG2 gt workarounds as of
commit 4d14d771 ("drm/i915/selftest: Fix ktime_get() and h/w access
order").
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230314003012.2600353-9-lucas.demarchi@intel.comSigned-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

911aeb0f

drm/xe: Add PVC engine workarounds · 4688d9ce

Lucas De Marchi authored Mar 13, 2023

Sync PVC engine workarounds with i915.

v2: Remove 16016694945. It was added by mistake. It's a GT workaround,
already present in the GT table (Matt Roper)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230314003012.2600353-8-lucas.demarchi@intel.comSigned-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

4688d9ce

drm/xe: Add PVC gt workarounds · a19220fa

Lucas De Marchi authored Mar 13, 2023

Synchronize with i915 the PVC gt workarounds as of committ
commit 4d14d771 ("drm/i915/selftest: Fix ktime_get() and h/w
access order").

v2: Add masked flag to XEHPC_LNCFMISCCFGREG0 (Matt Roper)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230314003012.2600353-7-lucas.demarchi@intel.comSigned-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

a19220fa

drm/xe: Reorder WAs to consider the platform · 6b5ccd63

Lucas De Marchi authored Mar 13, 2023

Now that number of platforms is growing, it's getting hard to know the
workarounds for each platform. Split the entries inside the same table
so the workarounds checking IP version are listed first, followed by
each platform. Next step when it grows too much is to split in smaller
tables.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230314003012.2600353-6-lucas.demarchi@intel.comSigned-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

6b5ccd63

drm/xe/debugfs: Dump register save-restore tables · 6647e2fe

Lucas De Marchi authored Mar 13, 2023

Add debugfs entry to dump the final tables with register save-restore
information.

For the workarounds, this has a format a little bit different than when the
values are applied because we don't want to read the values from the HW
when dumping via debugfs. For whitelist it just re-uses the print
function added for when the whitelist is being built.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230314003012.2600353-5-lucas.demarchi@intel.comSigned-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

6647e2fe

drm/xe: Print whitelist while applying · d855d224

Lucas De Marchi authored Mar 13, 2023

Besides printing the various register save-restore, it's also useful to
know the register being allowed/denied access from unprivileged batch
buffers. Print them during device probe.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230314003012.2600353-4-lucas.demarchi@intel.comSigned-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

d855d224

drm/xe/reg_sr: Tweak verbosity for register printing · 5be84050

Lucas De Marchi authored Mar 13, 2023

If there is no register to save-restore or whitelist, just return. This
drops some noise from the log, particurlarly for platforms with several
engines like PVC:

	[drm:xe_reg_sr_apply_mmio [xe]] Applying bcs0 save-restore MMIOs
	[drm:xe_reg_sr_apply_whitelist [xe]] Whitelisting bcs0 registers
	[drm:xe_reg_sr_apply_mmio [xe]] Applying bcs1 save-restore MMIOs
	[drm:xe_reg_sr_apply_whitelist [xe]] Whitelisting bcs1 registers
	[drm:xe_reg_sr_apply_mmio [xe]] Applying bcs2 save-restore MMIOs
	[drm:xe_reg_sr_apply_whitelist [xe]] Whitelisting bcs2 registers
	[drm:xe_reg_sr_apply_mmio [xe]] Applying bcs5 save-restore MMIOs
	[drm:xe_reg_sr_apply_whitelist [xe]] Whitelisting bcs5 registers
	[drm:xe_reg_sr_apply_mmio [xe]] Applying bcs6 save-restore MMIOs
	[drm:xe_reg_sr_apply_whitelist [xe]] Whitelisting bcs6 registers
	[drm:xe_reg_sr_apply_mmio [xe]] Applying bcs7 save-restore MMIOs
	[drm:xe_reg_sr_apply_whitelist [xe]] Whitelisting bcs7 registers
	[drm:xe_reg_sr_apply_mmio [xe]] Applying bcs8 save-restore MMIOs
	[drm:xe_reg_sr_apply_whitelist [xe]] Whitelisting bcs8 registers
	[drm:xe_reg_sr_apply_mmio [xe]] Applying ccs0 save-restore MMIOs
	[drm:xe_reg_sr_apply_mmio [xe]] REG[0x20e4] = 0x00008000
	[drm:xe_reg_sr_apply_mmio [xe]] REG[0xb01c] = 0x00000001
	[drm:xe_reg_sr_apply_mmio [xe]] REG[0xe48c] = 0x00000800
	[drm:xe_reg_sr_apply_mmio [xe]] REG[0xe7c8] = 0x40000000
	...

On a PVC system it should show something like below. Whitelist calls
are still there since they aren't actually empty - driver just doesn't
print each individual entry. This will be fixed in future.

	[drm:xe_reg_sr_apply_whitelist [xe]] Whitelisting bcs0 registers
	[drm:xe_reg_sr_apply_whitelist [xe]] Whitelisting bcs1 registers
	[drm:xe_reg_sr_apply_whitelist [xe]] Whitelisting bcs2 registers
	[drm:xe_reg_sr_apply_whitelist [xe]] Whitelisting bcs5 registers
	[drm:xe_reg_sr_apply_whitelist [xe]] Whitelisting bcs6 registers
	[drm:xe_reg_sr_apply_whitelist [xe]] Whitelisting bcs7 registers
	[drm:xe_reg_sr_apply_whitelist [xe]] Whitelisting bcs8 registers
	[drm:xe_reg_sr_apply_mmio [xe]] Applying ccs0 save-restore MMIOs
	[drm:xe_reg_sr_apply_mmio [xe]] REG[0x20e4] = 0x00008000
	[drm:xe_reg_sr_apply_mmio [xe]] REG[0xb01c] = 0x00000001
	[drm:xe_reg_sr_apply_mmio [xe]] REG[0xe48c] = 0x00000800
	[drm:xe_reg_sr_apply_mmio [xe]] REG[0xe7c8] = 0x40000000

v2: Only tweak log verbosity, leave the whitelist printout for later
    since decoding the whitelist is more complex.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230314003012.2600353-3-lucas.demarchi@intel.comSigned-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

5be84050

drm/xe/rtp: Add match helper for gslice fused off · 14380054

Lucas De Marchi authored Mar 13, 2023

Add match helper to detect when the first gslice is fused off, as needed
by future workarounds.

v2:
  - Add warning if called on a platform without geometry pipeline
    (Matt Roper)
  - Hardcode 4 as the number of gslices, which matches all the currently
    supported platforms. PVC doesn't have geometry pipeline and
    shouldn't use this function (Matt Roper)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://lore.kernel.org/r/20230314003012.2600353-2-lucas.demarchi@intel.comSigned-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

14380054