Commits · fa313ede703115c9ebb31ce36a946aae23c14d61 · Kirill Smelkov / linux

12 Sep, 2022 10 commits

mei: mkhi: add memory ready command · fa313ede

Tomas Winkler authored Sep 08, 2022

Add GSC memory ready command.
The command indicates to the firmware that extend operation
memory was setup and the firmware may enter PXP mode.

CC: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220907215113.1596567-11-tomas.winkler@intel.comAcked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

fa313ede

mei: bus: export common mkhi definitions into a separate header · 7d88a258

Vitaly Lubart authored Sep 08, 2022

Exported common mkhi definitions from bus-fixup.c into a separate
header file mkhi.h for other driver usage.
Signed-off-by: Vitaly Lubart <vitaly.lubart@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220907215113.1596567-10-tomas.winkler@intel.comAcked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

7d88a258

mei: extend timeouts on slow devices · 95953618

Alexander Usyskin authored Sep 08, 2022

Parametrize operational timeouts in order
to support slow firmware on some graphics devices.
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220907215113.1596567-9-tomas.winkler@intel.comAcked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

95953618

mei: gsc: wait for reset thread on stop · 9b2e03e2

Alexander Usyskin authored Sep 08, 2022

Wait for reset work to complete before initiating
stop reset flow sequence.
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220907215113.1596567-8-tomas.winkler@intel.comAcked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

9b2e03e2

mei: gsc: use polling instead of interrupts · 5b063995

Tomas Winkler authored Sep 08, 2022

A work-around for a HW issue in XEHPSDV that manifests itself when SW reads
a gsc register when gsc is sending an interrupt. The work-around is
to disable interrupts and to use polling instead.

Cc: James Ausmus <james.ausmus@intel.com>
Signed-off-by: Vitaly Lubart <vitaly.lubart@intel.com>
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220907215113.1596567-7-tomas.winkler@intel.comAcked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

5b063995

drm/i915/gsc: add GSC XeHP SDV platform definition · 2d427248

Alexander Usyskin authored Sep 08, 2022

Define GSC on XeHP SDV (Intel(R) dGPU without display)

XeHP SDV uses the same hardware settings as DG1, but uses polling
instead of interrupts and runs the firmware in slow pace due to
hardware limitations.
Signed-off-by: Vitaly Lubart <vitaly.lubart@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220907215113.1596567-6-tomas.winkler@intel.comSigned-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

2d427248

drm/i915/gsc: add slow_firmware flag to the gsc device definition · d6728776

Alexander Usyskin authored Sep 08, 2022

Add slow_firmware flag to the gsc device definition
and pass it to mei auxiliary device, this instructs
the driver to use longer operation timeouts.
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220907215113.1596567-5-tomas.winkler@intel.comSigned-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

d6728776

mei: add slow_firmware flag to the mei auxiliary device · ed57967a

Tomas Winkler authored Sep 08, 2022

Add slow_firmware flag to the mei auxiliary device info
to inform the mei driver about slow underlying firmware.
Such firmware will require to use larger operation timeouts.
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220907215113.1596567-4-tomas.winkler@intel.comAcked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

ed57967a

mei: add kdoc for struct mei_aux_device · fd72cb1b

Tomas Winkler authored Sep 08, 2022

struct mei_aux_device is an interface structure
requires proper documenation.
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220907215113.1596567-3-tomas.winkler@intel.comAcked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

fd72cb1b

drm/i915/gsc: skip irq initialization if using polling · 5c4d2536

Vitaly Lubart authored Sep 08, 2022

Some platforms require the host to poll on the
GSC registers instead of relaying on the interrupts.
For those platforms, irq initialization should be skipped
Signed-off-by: Vitaly Lubart <vitaly.lubart@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220907215113.1596567-2-tomas.winkler@intel.comSigned-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

5c4d2536

08 Sep, 2022 2 commits

drm/i915: Set correct domains values at _i915_vma_move_to_active · 04f7eb3d

Nirmoy Das authored Sep 07, 2022

Fix regression introduced by commit:
"drm/i915: Individualize fences before adding to dma_resv obj"
which sets obj->read_domains to 0 for both read and write paths.
Also set obj->write_domain to 0 on read path which was removed by
the commit.

References: https://gitlab.freedesktop.org/drm/intel/-/issues/6639
Fixes: 420a07b8 ("drm/i915: Individualize fences before adding to dma_resv obj")
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Cc: <stable@vger.kernel.org> # v5.16+
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220907172641.12555-1-nirmoy.das@intel.com

04f7eb3d

drm/i915: Rename ggtt_view as gtt_view · 3bb6a442

Niranjana Vishwanathapura authored Sep 01, 2022

So far, different views (normal, partial, rotated and remapped)
into the same object are only supported for GGTT mappings.
But with the upcoming VM_BIND feature, PPGTT will also use the
partial view mapping. Hence rename ggtt_view to more generic
gtt_view.
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220901183854.3446-1-niranjana.vishwanathapura@intel.com

3bb6a442

07 Sep, 2022 2 commits

drm/i915/uc: Add patch level version number support · 65332a5b

John Harrison authored Sep 06, 2022

With the move to un-versioned filenames, it becomes more difficult to
know exactly what version of a given firmware is being used. So add
the patch level version number to the debugfs output.

Also, support matching by patch level when selecting code paths for
firmware compatibility. While a patch level change cannot be backwards
breaking, it is potentially possible that a new feature only works
from a given patch level onwards (even though it was theoretically
added in an earlier version that bumped the major or minor version).
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220906230147.479945-2-daniele.ceraolospurio@intel.com

65332a5b

drm/i915/uc: Support for version reduced and multiple firmware files · 665ae9c9

John Harrison authored Sep 06, 2022

There was a misunderstanding in how firmware file compatibility should
be managed within i915. This has been clarified as:
i915 must support all existing firmware releases forever
new minor firmware releases should replace prior versions
only backwards compatibility breaking releases should be a new file

This patch cleans up the single fallback file support that was added
as a quick fix emergency effort. That is now removed in preference to
supporting arbitrary numbers of firmware files per platform.

The patch also adds support for having GuC firmware files that are
named by major version only (because the major version indicates
backwards breaking changes that affect the KMD) and for having HuC
firmware files with no version number at all (because the KMD has no
interface requirements with the HuC).

For GuC, the driver will report via dmesg if the found file is older than
expected. For HuC, the KMD will no longer require updating for any new
HuC release so will not be able to report what the latest expected
version is.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220906230147.479945-1-daniele.ceraolospurio@intel.com

665ae9c9

06 Sep, 2022 2 commits

drm/i915: Don't try to disable host RPS when this was never enabled. · 68eb42b3

Rodrigo Vivi authored Sep 02, 2022

Specially in GT reset case this could be triggered and try
to disable things that had never been enabled. Let's add
some protection here.

Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Acked-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220902095126.373036-1-rodrigo.vivi@intel.com

68eb42b3

drm/i915: consider HAS_FLAT_CCS() in needs_ccs_pages · 873fef88

Matthew Auld authored Sep 05, 2022

Just move the HAS_FLAT_CCS() check into needs_ccs_pages. This also then
fixes i915_ttm_memcpy_allowed() which was incorrectly reporting true on
DG1, even though it doesn't have small-BAR or flat-CCS.

References: https://gitlab.freedesktop.org/drm/intel/-/issues/6605
Fixes: efeb3caf ("drm/i915/ttm: disallow CPU fallback mode for ccs pages")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220905105329.41455-1-matthew.auld@intel.com

873fef88

05 Sep, 2022 1 commit

drm/i915/ttm: Abort suspend on i915_ttm_backup failure · c2a6502f

Nirmoy Das authored Sep 01, 2022

On system suspend when system memory is low then i915_gem_obj_copy_ttm()
could fail trying to backup a lmem obj. GEM_WARN_ON() is not enough,
suspend shouldn't continue if i915_ttm_backup() throws an error.

v2: Keep the fdo issue till we have a igt test(Matt).
v3: Use %pe(Andrzej)

References: https://gitlab.freedesktop.org/drm/intel/-/issues/6529Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Suggested-by: Chris P Wilson <chris.p.wilson@intel.com>
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220901172217.18392-1-nirmoy.das@intel.com

c2a6502f

01 Sep, 2022 1 commit

drm/i915/slpc: Let's fix the PCODE min freq table setup for SLPC · 018a7bdb

Rodrigo Vivi authored Aug 31, 2022

We need to inform PCODE of a desired ring frequencies so PCODE update
the memory frequencies to us. rps->min_freq and rps->max_freq are the
frequencies used in that request. However they were unset when SLPC was
enabled and PCODE never updated the memory freq.

v2 (as Suggested by Ashutosh): if SLPC is in use, let's pick the right
   frequencies from the get_ia_constants instead of the fake init of
   rps' min and max.

v3: don't forget the max <= min return

v4: Move all the freq conversion to intel_rps.c. And the max <= min
    check to where it belongs.

v5: (Ashutosh) Fix old comment s/50 HZ/50 MHz and add a doc explaining
    the "raw format"

Fixes: 7ba79a67 ("drm/i915/guc/slpc: Gate Host RPS when SLPC is enabled")
Cc: <stable@vger.kernel.org> # v5.15+
Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
Tested-by: Sushma Venkatesh Reddy <sushma.venkatesh.reddy@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220831214538.143950-1-rodrigo.vivi@intel.com

018a7bdb

31 Aug, 2022 1 commit

drm/i915/slpc: Fix inconsistent locked return · ff4e0caf

Rodrigo Vivi authored Aug 30, 2022

Fix for intel_guc_slpc_set_min_freq() warn:
inconsistent returns '&slpc->lock'.

v2: Avoid with_intel_runtime_pm with the
    internal goto/return. (Ashutosh)
    Also standardize the 'ret' if this came from
    the efficient setup. And avoid the 'unlikely'.

Fixes: 95ccf312 ("drm/i915/guc/slpc: Allow SLPC to use efficient frequency")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220830193537.52201-1-rodrigo.vivi@intel.com

ff4e0caf

30 Aug, 2022 3 commits

drm/i915/ats-m: Add thread execution tuning setting · 73c7a8a8

Matt Roper authored Aug 26, 2022

On client DG2 platforms, optimal performance is achieved with the
hardware's default "age based" thread execution setting.  However on
ATS-M, switching this to "round robin after dependencies" provides
better performance.  We'll add a new "tuning" feature flag to the ATS-M
device info to enable/disable this setting.

Bspec: 68331
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220826212718.409948-1-matthew.d.roper@intel.com

73c7a8a8

Revert "drm/i915/dg2: Add preemption changes for Wa_14015141709" · d9927abb

Matt Roper authored Aug 26, 2022

This reverts commit ca692081.

The intent of Wa_14015141709 was to inform us that userspace can no
longer control object-level preemption as it has on past platforms
(i.e., by twiddling register bit CS_CHICKEN1[0]).  The description of
the workaround in the spec wasn't terribly well-written, and when we
requested clarification from the hardware teams we were told that on the
kernel side we should also probably stop setting
FF_SLICE_CS_CHICKEN1[14], which is the register bit that directs the
hardware to honor the settings in per-context register CS_CHICKEN1.  It
turns out that this guidance about FF_SLICE_CS_CHICKEN1[14] was a
mistake; even though CS_CHICKEN1[0] is non-operational and useless to
userspace, there are other bits in the register that do still work and
might need to be adjusted by userspace in the future (e.g., to implement
other workarounds that show up).  If we don't set
FF_SLICE_CS_CHICKEN1[14] in i915, then those future workarounds would
not take effect.

This miscommunication came to light because another workaround
(Wa_16013994831) has now shown up that requires userspace to adjust the
value of CS_CHICKEN[10] in certain circumstances.  To ensure userspace's
updates to this chicken bit are handled properly by the hardware, we
need to make sure that FF_SLICE_CS_CHICKEN1[14] is once again set by the
kernel.
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220826210233.406482-1-matthew.d.roper@intel.com

d9927abb

drm/i915/selftests: allow misaligned_pin test work with unmappable memory · 13cc5123

Andrzej Hajda authored Aug 25, 2022

In case of Small BAR configurations stolen local memory can be unmappable.
Since the test does not touch the memory, passing I915_BO_ALLOC_GPU_ONLY
flag to i915_gem_object_create_region, will prevent -ENOSPC error from
_i915_gem_object_stolen_init.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/6565Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220825154239.52343-1-andrzej.hajda@intel.com

13cc5123

29 Aug, 2022 1 commit

drm/i915/guc: Remove log size module parameters · f54e515c

Joonas Lahtinen authored Aug 26, 2022

Remove the module parameters for configuring GuC log size.

We should instead rely on tuning the defaults to be usable for
reporting bugs.

v2:
- Use correct 1M unit

Fixes: 8ad0152a ("drm/i915/guc: Make GuC log sizes runtime configurable")
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220826092343.184568-1-joonas.lahtinen@linux.intel.com

f54e515c

26 Aug, 2022 1 commit

drm/i915/dg2: Incorporate Wa_16014892111 into DRAW_WATERMARK tuning · 25bcc828

Matt Roper authored Aug 23, 2022

Although register tuning settings are generally implemented via the
workaround infrastructure, it turns out that the DRAW_WATERMARK register
is not properly saved/restored by hardware around power events (i.e.,
RC6 entry) so updates to the value cannot be applied in the usual
manner. New workaround Wa_16014892111 informs us that any tuning
updates to this register must instead be applied via an INDIRECT_CTX
batch buffer. This will ensure that the necessary value is re-applied
when a context begins running, even if an RC6 entry had wiped the
register back to hardware defaults since the last context ran.

Fixes: 6dc85721 ("drm/i915/dg2: Add additional tuning settings")
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/6642Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220823202449.83727-1-matthew.d.roper@intel.com

25bcc828

25 Aug, 2022 4 commits

drm/i915/pxp: don't start pxp without mei_pxp bind · 6127b3bc

Juston Li authored Aug 18, 2022

pxp will not start correctly until after mei_pxp bind completes and
intel_pxp_init_hw() is called.
Wait for the bind to complete before proceeding with startup.

This fixes a race condition during bootup where we observed a small
window for pxp commands to be sent, starting pxp before mei_pxp bind
completed.

Changes since v2:
- wait for pxp_component to bind instead of returning -EAGAIN (Daniele)

Changes since v1:
- check pxp_component instead of pxp_component_added (Daniele)
- pxp_component needs tee_mutex (Daniele)
- return -EAGAIN so caller knows to retry (Daniele)
Signed-off-by: Juston Li <justonli@chromium.org>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220818174205.2412730-1-justonli@chromium.org

6127b3bc

drm/i915/mtl: Don't mask off CCS according to DSS fusing · 068a0f5c

Matt Roper authored Aug 18, 2022

Unlike the Xe_HP platforms, MTL only has a single CCS engine; the
quad-based engine masking logic does not apply to this platform (or
presumably any future platforms that only have 0 or 1 CCS).
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220818234202.451742-5-radhakrishna.sripada@intel.com

068a0f5c

drm/i915/mtl: MMIO range is now 4MB · da30390b

Matt Roper authored Aug 18, 2022

Previously only dgfx platforms had a 4MB MMIO range, but starting with
MTL we now use the larger range for all platforms.

Bspec: 63834, 63830
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220818234202.451742-4-radhakrishna.sripada@intel.com

da30390b

drm/i915: Skip Bit12 fw domain reset for gen12+ · 6509dd11

Radhakrishna Sripada authored Aug 17, 2022

Bit12 of the Forcewake request register should not be cleared post
gen12. Do not touch this bit while clearing during fw domain reset.

v2: Tweak the comment to drop older platforms(MattR)

Bspec: 52542
Signed-off-by: Sushma Venkatesh Reddy <sushma.venkatesh.reddy@intel.com>
Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220817224304.255767-1-radhakrishna.sripada@intel.com

6509dd11

24 Aug, 2022 2 commits

drm/i915/guc/slpc: Allow SLPC to use efficient frequency · 95ccf312

Vinay Belgaumkar authored Aug 19, 2022

Host Turbo operates at efficient frequency when GT is not idle unless
the user or workload has forced it to a higher level. Replicate the same
behavior in SLPC by allowing the algorithm to use efficient frequency.
We had disabled it during boot due to concerns that it might break
kernel ABI for min frequency. However, this is not the case since
SLPC will still abide by the (min,max) range limits.

With this change, min freq will be at efficient frequency level at init
instead of fused min (RPn). If user chooses to reduce min freq below the
efficient freq, we will turn off usage of efficient frequency and honor
the user request. When a higher value is written, it will get toggled
back again.

The patch also corrects the register which needs to be read for obtaining
the correct efficient frequency for Gen9+.

We see much better perf numbers with benchmarks like glmark2 with
efficient frequency usage enabled as expected.

v2: Address review comments (Rodrigo)
v3: with efficient frequency being dynamic, it is possible that the req
frequency may go beyond max freq. This will cause SLPC selftests to fail.
Add a FIXME there to start the test with [RPn, RP0] instead and restore
it afterwards.

BugLink: https://gitlab.freedesktop.org/drm/intel/-/issues/5468

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220820010832.15350-1-vinay.belgaumkar@intel.com

95ccf312

drm/i915/guc: remove runtime info printing from time stamp logging · f0c70d41

Jani Nikula authored Aug 19, 2022

Commit 368d179a ("drm/i915/guc: Add GuC <-> kernel time stamp
translation information") added intel_device_info_print_runtime() in the
time info dump for no obvious reason or explanation in the commit
message. It only logs the rawclk freq. Remove it.

Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/b395ac4c909042f5daabf29959d8733993545aa2.1660910433.git.jani.nikula@intel.com

f0c70d41

20 Aug, 2022 1 commit

Revert "drm/i915/guc: Add delay to disable scheduling after pin count goes to zero" · 54c204c5

Matthew Auld authored Aug 19, 2022

This reverts commit 6a079903.

Everything in CI using GuC is now timing out[1], and killing the machine
with this change (perhaps a deadlock?). CI was recently on fire due to
some changes coming in from -rc1, so likely the pre-merge CI results for
this series were invalid? For now just revert, unless GuC experts
already have a fix in mind.

[1] https://intel-gfx-ci.01.org/tree/drm-tip/index.html?
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220819123904.913750-1-matthew.auld@intel.com

54c204c5

18 Aug, 2022 2 commits

drm/i915/guc: Add delay to disable scheduling after pin count goes to zero · 6a079903

Matthew Brost authored Aug 16, 2022

Add a delay, configurable via debugfs (default 34ms), to disable
scheduling of a context after the pin count goes to zero. Disable
scheduling is a costly operation as it requires synchronizing with
the GuC. So the idea is that a delay allows the user to resubmit
something before doing this operation. This delay is only done if
the context isn't closed and less than a given threshold
(default is 3/4) of the guc_ids are in use.

As temporary WA disable this feature for the selftests. Selftests are
very timing sensitive and any change in timing can cause failure. A
follow up patch will fixup the selftests to understand this delay.

Alan Previn: Matt Brost first introduced this series back in Oct 2021.
However no real world workload with measured performance impact was
available to prove the intended results. Today, this series is being
republished in response to a real world workload that benefited greatly
from it along with measured performance improvement.

Workload description: 36 containers were created on a DG2 device where
each container was performing a combination of 720p 3d game rendering
and 30fps video encoding. The workload density was configured in a way
that guaranteed each container to ALWAYS be able to render and
encode no less than 30fps with a predefined maximum render + encode
latency time. That means the totality of all 36 containers and their
workloads were not saturating the engines to their max (in order to
maintain just enough headrooom to meet the min fps and max latencies
of incoming container submissions).

Problem statement: It was observed that the CPU core processing the i915
soft IRQ work was experiencing severe load. Using tracelogs and an
instrumentation patch to count specific i915 IRQ events, it was confirmed
that the majority of the CPU cycles were caused by the
gen11_other_irq_handler() -> guc_irq_handler() code path. The vast
majority of the cycles was determined to be processing a specific G2H
IRQ: i.e. INTEL_GUC_ACTION_SCHED_CONTEXT_MODE_DONE. These IRQs are sent
by GuC in response to i915 KMD sending H2G requests:
INTEL_GUC_ACTION_SCHED_CONTEXT_MODE_SET. Those H2G requests are sent
whenever a context goes idle so that we can unpin the context from GuC.
The high CPU utilization % symptom was limiting density scaling.

Root Cause Analysis: Because the incoming execution buffers were spread
across 36 different containers (each with multiple contexts) but the
system in totality was NOT saturated to the max, it was assumed that each
context was constantly idling between submissions. This was causing
a thrashing of unpinning contexts from GuC at one moment, followed quickly
by repinning them due to incoming workload the very next moment. These
event-pairs were being triggered across multiple contexts per container,
across all containers at the rate of > 30 times per sec per context.

Metrics: When running this workload without this patch, we measured an
average of ~69K INTEL_GUC_ACTION_SCHED_CONTEXT_MODE_DONE events every 10
seconds or ~10 million times over ~25+ mins. With this patch, the count
reduced to ~480 every 10 seconds or about ~28K over ~10 mins. The
improvement observed is ~99% for the average counts per 10 seconds.
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220817020511.2180747-3-alan.previn.teres.alexis@intel.com

6a079903

drm/i915/selftests: Use correct selfest calls for live tests · 61faec5f

Matthew Brost authored Aug 16, 2022

This will help in an upcoming patch where the live selftest wrappers
are extended to do more.
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: John Harrison <john.c.harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220817020511.2180747-2-alan.previn.teres.alexis@intel.com

61faec5f

17 Aug, 2022 7 commits

drm/i915/guc: clear stalled request after a reset · f922fbb0

Daniele Ceraolo Spurio authored Aug 11, 2022

If the GuC CTs are full and we need to stall the request submission
while waiting for space, we save the stalled request and where the stall
occurred; when the CTs have space again we pick up the request submission
from where we left off.

If a full GT reset occurs, the state of all contexts is cleared and all
non-guilty requests are unsubmitted, therefore we need to restart the
stalled request submission from scratch. To make sure that we do so,
clear the saved request after a reset.

Fixes note: the patch that introduced the bug is in 5.15, but no
officially supported platform had GuC submission enabled by default
in that kernel, so the backport to that particular version (and only
that one) can potentially be skipped.

Fixes: 925dc1cf ("drm/i915/guc: Implement GuC submission tasklet")
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: John Harrison <john.c.harrison@intel.com>
Cc: <stable@vger.kernel.org> # v5.15+
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220811210812.3239621-1-daniele.ceraolospurio@intel.com

f922fbb0

drm/i915/guc: skip scrub_ctbs selftest if reset is disabled · b0f2eb94

Daniele Ceraolo Spurio authored Jul 08, 2022

The test needs GT reset to trigger the scrubbing logic, so we can only
run it when reset is enabled.
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <john.c.harrison@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220708224158.929327-1-daniele.ceraolospurio@intel.com

b0f2eb94

drm/i915/guc: Reduce spam from error capture · b092e4a9

John Harrison authored Jul 27, 2022

Some debug code got left in when the GuC based register save for error
capture was added. Remove that.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220728022028.2190627-8-John.C.Harrison@Intel.com

b092e4a9

drm/i915/guc: Make GuC log sizes runtime configurable · 8ad0152a

John Harrison authored Jul 27, 2022

The GuC log buffer sizes had to be configured statically at compile
time. This can be quite troublesome when needing to get larger logs
out of a released driver. So re-organise the code to allow a boot time
module parameter override.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220728022028.2190627-7-John.C.Harrison@Intel.com

8ad0152a

drm/i915/guc: Use streaming loads to speed up dumping the guc log · 5ece208a

Chris Wilson authored Jul 27, 2022

Use a temporary page and mempy_from_wc to reduce the time it takes to
dump the guc log to debugfs.
Signed-off-by: Chris Wilson <chris.p.wilson@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220728022028.2190627-6-John.C.Harrison@Intel.com

5ece208a

drm/i915/guc: Record CTB info in error logs · c5de70f6

John Harrison authored Jul 27, 2022

When debugging GuC communication issues, it is useful to have the CTB
info available. So add the state and buffer contents to the error
capture log.

Also, add a sub-structure for the GuC specific error capture info as
it is now becoming numerous.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220728022028.2190627-5-John.C.Harrison@Intel.com

c5de70f6

drm/i915/guc: Add GuC <-> kernel time stamp translation information · 368d179a

John Harrison authored Jul 27, 2022

It is useful to be able to match GuC events to kernel events when
looking at the GuC log. That requires being able to convert GuC
timestamps to kernel time. So, when dumping error captures and/or GuC
logs, include a stamp in both time zones plus the clock frequency.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220728022028.2190627-4-John.C.Harrison@Intel.com

368d179a