Commits · 4673402ebf9fc1712fb41cce6a808923481495f3 · Kirill Smelkov / linux

04 Dec, 2019 6 commits

ia64: agp: Replace empty define with do while · 4673402e

Corentin Labbe authored Nov 21, 2019

It's dangerous to use empty code define.
Furthermore it lead to the following warning:
drivers/char/agp/generic.c: In function « agp_generic_destroy_page »:
drivers/char/agp/generic.c:1266:28: attention : suggest braces around empty body in an « if » statement [-Wempty-body]

So let's replace emptyness by "do {} while(0)"
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1574324085-4338-6-git-send-email-clabbe@baylibre.com

4673402e

agp: Add bridge parameter documentation · 5f448266

Corentin Labbe authored Nov 21, 2019

This patch add documentation about the bridge parameter in several
function.

This will fix the following build warning:
drivers/char/agp/generic.c:220: warning: No description found for parameter 'bridge'
drivers/char/agp/generic.c:364: warning: No description found for parameter 'bridge'
drivers/char/agp/generic.c:1283: warning: No description found for parameter 'bridge'
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1574324085-4338-5-git-send-email-clabbe@baylibre.com

5f448266

agp: remove unused variable num_segments · 5f1b24a6

Corentin Labbe authored Nov 21, 2019

This patch fix the following build warning:
warning: variable 'num_segments' set but not used [-Wunused-but-set-variable]
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1574324085-4338-4-git-send-email-clabbe@baylibre.com

5f1b24a6

agp: move AGPGART_MINOR to include/linux/miscdevice.h · 0f109f0e

Corentin Labbe authored Nov 21, 2019

This patch move the define for AGPGART_MINOR to include/linux/miscdevice.h.
It is better that all minor number definitions are in the same place.
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1574324085-4338-3-git-send-email-clabbe@baylibre.com

0f109f0e

agp: remove unused variable size in agp_generic_create_gatt_table · 9fc785f1

Corentin Labbe authored Nov 21, 2019

This patch fix the following warning:
drivers/char/agp/generic.c:853:6: attention : variable ‘size’ set but not used [-Wunused-but-set-variable]
by removing the unused variable size in agp_generic_create_gatt_table
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1574324085-4338-2-git-send-email-clabbe@baylibre.com

9fc785f1

Merge tag 'drm-next-5.5-2019-12-03' of git://people.freedesktop.org/~agd5f/linux into drm-next · 909a6065

Dave Airlie authored Dec 04, 2019

drm-next-5.5-2019-12-03:

amdgpu:
- Fix vram lost handling with BACO on VI/CI asics
- DC fixes for Navi14
- Misc gfx10 fixes
- SR-IOV fixes
- Fix driver unload
- Fix XGMI limits on Arcturus

amdkfd:
- Enable KFD on PPC
- Optimize KFD page table reservations

radeon:
- Fix register checker for r1xx/r2xx
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191203204135.5437-1-alexander.deucher@amd.com

909a6065

03 Dec, 2019 14 commits

drm/radeon: fix r1xx/r2xx register checker for POT textures · 008037d4

Alex Deucher authored Nov 26, 2019

Shift and mask were reversed.  Noticed by chance.
Tested-by: Meelis Roos <mroos@linux.ee>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

008037d4

drm/amdgpu: fix GFX10 missing CSIB set(v3) · 4905880b

Monk Liu authored Nov 26, 2019

still need to init csb even for SRIOV

v2:
drop init_pg() for gfx10 at all since
PG and GFX off feature will be fully controled
by RLC and SMU fw for gfx10

v3:
drop the flush_gpu_tlb lines since we consider
it is only usefull in emulation
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

4905880b

drm/amdgpu: should stop GFX ring in hw_fini · cd05b51a

Monk Liu authored Nov 29, 2019

To align with the scheme from gfx9

disabling GFX ring after VM shutdown could avoid
garbage data be fetched to GFX RB which may lead
to unnecessary screw up on GFX
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

cd05b51a

drm/amdgpu: do autoload right after MEC loaded for SRIOV VF · dacf56e4

Monk Liu authored Nov 26, 2019

since we don't have RLCG ucode loading and no SRlist as well
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

dacf56e4

drm/amdgpu: skip rlc ucode loading for SRIOV gfx10 · 6294017f

Monk Liu authored Nov 26, 2019

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

6294017f

drm/amdgpu: fix calltrace during kmd unload(v3) · 747d4f71

Monk Liu authored Nov 26, 2019

issue:
kernel would report a warning from a double unpin
during the driver unloading on the CSB bo

why:
we unpin it during hw_fini, and there will be another
unpin in sw_fini on CSB bo.

fix:
actually we don't need to pin/unpin it during
hw_init/fini since it is created with kernel pinned,
we only need to fullfill the CSB again during hw_init
to prevent CSB/VRAM lost after S3

v2:
get_csb in init_rlc so hw_init() will make CSIB content
back even after reset or s3

v3:
use bo_create_kernel instead of bo_create_reserved for CSB
otherwise the bo_free_kernel() on CSB is not aligned and
would lead to its internal reserve pending there forever

take care of gfx7/8 as well
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

747d4f71

drm/amdgpu/gfx10: unlock srbm_mutex after queue programming finish · fa2b93e3

Xiaojie Yuan authored Nov 06, 2019

srbm_mutex is to guarantee atomicity for r/w of gfx indexed registers
Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

fa2b93e3

drm/amdgpu: Added ASIC specific checks in gfxhub V1.1 get XGMI info · f0312f45

John Clements authored Dec 02, 2019

Added max hive/node info checks per supported ASIC
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

f0312f45

drm/amdgpu/powerplay: unify smu send message function · 76d8f83b

Likun Gao authored Dec 02, 2019

Drop smu_send_smc_msg function from ASIC specify structure.
Reuse smu_send_smc_msg_with_param function for smu_send_smc_msg.
Set paramer to 0 for smu_send_msg function, otherwise it will send
with previous paramer value (Not a certain value).
Materialize msg type for smu send message function definition.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

76d8f83b

drm/amd/display: re-enable wait in pipelock, but add timeout · 627f75d1

Alex Deucher authored Nov 15, 2019

Removing this causes hangs in some games, so re-add it, but add
a timeout so we don't hang while switching flip types.

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=205169
Bug: https://bugs.freedesktop.org/show_bug.cgi?id=112266Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

627f75d1

drm/amd/display: Get NV14 specific ip params as needed · 30c51773

Zhan liu authored Dec 02, 2019

[Why]
NV14 is using its own ip params that's different from other
DCN2.0 ASICs.

[How]
Add ASIC revision check to make sure NV14 gets correct
ip params.
Signed-off-by: Zhan Liu <zhan.liu@amd.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

30c51773

drm/amd/display: Adding NV14 IP Parameters · 516fb68d

Zhan liu authored Dec 02, 2019

[Why]
NV14 IP Parameters are missing.

[How]
Add IP Parameters in.
Signed-off-by: Zhan liu <zhan.liu@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

516fb68d

drm/amd/display: Include num_vmid and num_dsc within NV14's resource caps · c3d03c5a

Zhan Liu authored Nov 28, 2019

[Why]
"num_vmid" and "num_dsc" are missing within NV14's resource caps structure.

[How]
Add the missing parts.
Signed-off-by: Zhan Liu <zhan.liu@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

c3d03c5a

drm/amdgpu: use CPU to flush vmhub if sched stopped · e2195f7d

Monk Liu authored Nov 26, 2019

otherwse the flush_gpu_tlb will hang if we unload the
KMD becuase the schedulers already stopped
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

e2195f7d

02 Dec, 2019 2 commits

Merge tag 'drm-intel-next-fixes-2019-11-28' of... · 3e25dbca

Dave Airlie authored Dec 02, 2019

Merge tag 'drm-intel-next-fixes-2019-11-28' of git://anongit.freedesktop.org/drm/drm-intel into drm-next

- Important fix to uAPI alignment on query IOCTL
- Fixes for the power regression introduced by the previous security patches
- Avoid regressing super heavy benchmarks by increasing the default request pre-emption timeout from 100 ms to 640 ms to
- Resulting set of smaller fixes done while problem was inspected
- Display fixes for EHL voltage level programming and TGL DKL PHY vswing for HDMI
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191128141524.GA11992@jlahtine-desk.ger.corp.intel.com

3e25dbca

Merge tag 'drm-msm-next-2019-11-05' of https://gitlab.freedesktop.org/drm/msm into drm-next · 36a170b1

Dave Airlie authored Dec 02, 2019

+ OCMEM support to enable the couple generations that had shared OCMEM
  rather than GMEM exclusively for the GPU (late a3xx and I think basically
  all of a4xx).  Bjorn and Brian decided to land this through the drm
  tree to avoid having to coordinate merge requests.
+ a510 support, and various associated display support
+ the usual misc cleanups and fixes
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rob Clark <robdclark@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ <CAF6AEGv-JWswEJRxe5AmnGQO1SZnpxK05kO1E29K6UUzC9GMMw@mail.gmail.com

36a170b1

27 Nov, 2019 2 commits

drm/i915: Reduce nested prepare_remote_context() to a trylock · 3cc44feb

Chris Wilson authored Nov 26, 2019

On context retiring, we may invoke the kernel_context to unpin this
context. Elsewhere, we may use the kernel_context to modify this
context. This currently leads to an AB-BA lock inversion, so we need to
back-off from the contended lock, and repeat.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111732Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Fixes: a9877da2 ("drm/i915/oa: Reconfigure contexts on the fly")
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191126065521.2331017-1-chris@chris-wilson.co.uk
(cherry picked from commit 58b4c1a0)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

3cc44feb

drm/i915: Default to a more lenient forced preemption timeout · 8d15ede5

Chris Wilson authored Nov 25, 2019

Based on a sampling of a number of benchmarks across platforms, by
default opt for a much more lenient timeout so that we should not
adversely affect existing "good" clients.

640ms ought to be enough for anyone.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112169
Fixes: 3a7a92ab ("drm/i915/execlists: Force preemption")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Eero Tamminen <eero.t.tamminen@intel.com>
Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191125162737.2161069-1-chris@chris-wilson.co.uk
(cherry picked from commit 5766a5ff)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

8d15ede5

26 Nov, 2019 7 commits

amdgpu: Enable KFD on POWER systems · c38402fe

Timothy Pearson authored Nov 24, 2019

KFD has been verified to function on POWER systems (Talos II / Vega 64).
It should be available as a kernel configuration option on these systems.
Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

c38402fe

drm/amdgpu: Optimize KFD page table reservation · 29a39c90

Felix Kuehling authored Jul 15, 2019

Be less pessimistic about estimated page table use for KFD. Most
allocations use 2MB pages and therefore need less VRAM for page
tables. This allows more VRAM to be used for applications especially
on large systems with many GPUs and hundreds of GB of system memory.

Example: 8 GPUs with 32GB VRAM each + 256GB system memory = 512GB
Old page table reservation per GPU:  1GB
New page table reservation per GPU: 32MB
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: xinhui pan <xinhui.pan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

29a39c90

drm/amdgpu: flag vram lost on baco reset for VI/CIK · dea8b900

Alex Deucher authored Nov 25, 2019

VI/CIK BACO was inflight when this fix landed for SOC15/NV.
Add the fix to VI/CIK as well.
Acked-by: Evan Quan <evan.quan@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

dea8b900

MAINTAINERS: Drop Rex Zhu for amdgpu powerplay · a0c2a84d

Alex Deucher authored Nov 22, 2019

No longer works on the driver.
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

a0c2a84d

drm/amdgpu: Resolved offchip EEPROM I/O issue · 5985ebbe

John Clements authored Nov 25, 2019

Updated target I2C address
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

5985ebbe

drm/amd/display: add default clocks if not able to fetch them · 94662169

Alex Deucher authored Nov 19, 2019

dm_pp_get_clock_levels_by_type needs to add the default clocks
to the powerplay case as well. This was accidently dropped.

Fixes: b3ea88fe ("drm/amd/powerplay: add get_clock_by_type interface for display")
Bug: https://gitlab.freedesktop.org/drm/amd/issues/906Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

94662169

drm/i915/gt: Make intel_ring_unpin() safe for concurrent pint · 0725d9a3

Chris Wilson authored Nov 18, 2019

In order to avoid some nasty mutex inversions, commit 09c5ab38
("drm/i915: Keep rings pinned while the context is active") allowed the
intel_ring unpinning to be run concurrently with the next context
pinning it. Thus each step in intel_ring_unpin() needed to be atomic and
ordered in a nice onion with intel_ring_pin() so that the lifetimes
overlapped and were always safe.

Sadly, a few steps in intel_ring_unpin() were overlooked, such as
closing the read/write pointers of the ring and discarding the
intel_ring.vaddr, as these steps were not serialised with
intel_ring_pin() and so could leave the ring in disarray.

Fixes: 09c5ab38 ("drm/i915: Keep rings pinned while the context is active")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191118230254.2615942-6-chris@chris-wilson.co.uk
(cherry picked from commit a266bf42)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

0725d9a3

25 Nov, 2019 9 commits

Merge tag 'drm-next-5.5-2019-11-22' of git://people.freedesktop.org/~agd5f/linux into drm-next · acc61b89

Dave Airlie authored Nov 26, 2019

drm-next-5.5-2019-11-22:

amdgpu:
- Fix bad DMA on some PPC platforms
- MMHUB fix for powergating
- BACO fix for Navi
- Misc raven fixes
- Enable vbios fetch directly from rom on navi
- debugfs fix for DC
- SR-IOV fixes for arcturus
- Misc power fixes

radeon:
- Fix bad DMA on some PPC platforms
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191122203025.3787-1-alexander.deucher@amd.com

acc61b89

Merge tag 'drm-intel-next-fixes-2019-11-22' of... · e639ea0f

Dave Airlie authored Nov 26, 2019

Merge tag 'drm-intel-next-fixes-2019-11-22' of git://anongit.freedesktop.org/drm/drm-intel into drm-next

- Reverts a patch to avoid spinning forever when context's timeline
  is active but has no requests
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191122155523.GA20167@jlahtine-desk.ger.corp.intel.com

e639ea0f

drm/i915/gt: Schedule request retirement when timeline idles · 31177017

Chris Wilson authored Nov 25, 2019

The major drawback of commit 7e34f4e4 ("drm/i915/gen8+: Add RC6 CTX
corruption WA") is that it disables RC6 while Skylake (and friends) is
active, and we do not consider the GPU idle until all outstanding
requests have been retired and the engine switched over to the kernel
context. If userspace is idle, this task falls onto our background idle
worker, which only runs roughly once a second, meaning that userspace has
to have been idle for a couple of seconds before we enable RC6 again.
Naturally, this causes us to consume considerably more energy than
before as powersaving is effectively disabled while a display server
(here's looking at you Xorg) is running.

As execlists will get a completion event as each context is completed,
we can use this interrupt to queue a retire worker bound to this engine
to cleanup idle timelines. We will then immediately notice the idle
engine (without userspace intervention or the aid of the background
retire worker) and start parking the GPU. Thus during light workloads,
we will do much more work to idle the GPU faster...  Hopefully with
commensurate power saving!

v2: Watch context completions and only look at those local to the engine
when retiring to reduce the amount of excess work we perform.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112315
References: 7e34f4e4 ("drm/i915/gen8+: Add RC6 CTX corruption WA")
References: 2248a283 ("drm/i915/gen8+: Add RC6 CTX corruption WA")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191125105858.1718307-3-chris@chris-wilson.co.uk
(cherry picked from commit 4f88f874)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

31177017

drm/i915/gt: Adapt engine_park synchronisation rules for engine_retire · a09c2860

Chris Wilson authored Nov 25, 2019

In the next patch, we will introduce a new asynchronous retirement
worker, fed by execlists CS events. Here we may queue a retirement as
soon as a request is submitted to HW (and completes instantly), and we
also want to process that retirement as early as possible and cannot
afford to postpone (as there may not be another opportunity to retire it
for a few seconds). To allow the new async retirer to run in parallel
with our submission, pull the __i915_request_queue (that passes the
request to HW) inside the timelines spinlock so that the retirement
cannot release the timeline before we have completed the submission.

v2: Actually to play nicely with engine_retire, we have to raise the
timeline.active_lock before releasing the HW. intel_gt_retire_requsts()
is still serialised by the outer lock so they cannot see this
intermediate state, and engine_retire is serialised by HW submission.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191125105858.1718307-2-chris@chris-wilson.co.uk
(cherry picked from commit 88a4655e)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

a09c2860

drm/i915/execlists: Fixup cancel_port_requests() · 4ec5cc78

Chris Wilson authored Nov 25, 2019

I rushed a last minute correction to cancel_port_requests() to prevent
the snooping of *execlists->active as the inflight array was being
updated, without noticing we iterated the inflight array starting from
active! Oops.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112387
Fixes: 97f9af78 ("drm/i915/gt: Mark the execlists->active as the primary volatile access")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191125112520.1760492-1-chris@chris-wilson.co.uk
(cherry picked from commit da0ef77e)
[Joonas: Fixed Fixes: tag to match drm-intel-next-fixes]
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

4ec5cc78

drm/i915/gt: Mark the execlists->active as the primary volatile access · 97f9af78

Chris Wilson authored Nov 25, 2019

Since we want to do a lockless read of the current active request, and
that request is written to by process_csb also without serialisation, we
need to instruct gcc to take care in reading the pointer itself.

Otherwise, we have observed execlists_active() to report 0x40.

[ 2400.760381] igt/para-4098    1..s. 2376479300us : process_csb: rcs0 cs-irq head=3, tail=4
[ 2400.760826] igt/para-4098    1..s. 2376479303us : process_csb: rcs0 csb[4]: status=0x00000001:0x00000000
[ 2400.761271] igt/para-4098    1..s. 2376479306us : trace_ports: rcs0: promote { b9c59:2622, b9c55:2624 }
[ 2400.761726] igt/para-4097    0d... 2376479311us : __i915_schedule: rcs0: -2147483648->3, inflight:0000000000000040, rq:ffff888208c1e940

which is impossible!

The answer is that as we keep the existing execlists->active pointing
into the array as we copy over that array, the unserialised read may see
a partial pointer value.

Fixes: df403069 ("drm/i915/execlists: Lift process_csb() out of the irq-off spinlock")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191125094318.1630806-1-chris@chris-wilson.co.uk
(cherry picked from commit 331bf905)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

97f9af78

drm/i915/gt: Unlock engine-pm after queuing the kernel context switch · bf201f5e

Chris Wilson authored Nov 20, 2019

In commit a79ca656 ("drm/i915: Push the wakeref->count deferral to
the backend"), I erroneously concluded that we last modify the engine
inside __i915_request_commit() meaning that we could enable concurrent
submission for userspace as we enqueued this request. However, this
falls into a trap with other users of the engine->kernel_context waking
up and submitting their request before the idle-switch is queued, with
the result that the kernel_context is executed out-of-sequence most
likely upsetting the GPU and certainly ourselves when we try to retire
the out-of-sequence requests.

As such we need to hold onto the effective engine->kernel_context mutex
lock (via the engine pm mutex proxy) until we have finish queuing the
request to the engine.

v2: Serialise against concurrent intel_gt_retire_requests()
v3: Describe the hairy locking scheme with intel_gt_retire_requests()
for future reference.
v4: Combine timeline->lock and engine pm release; it's hairy.

Fixes: a79ca656 ("drm/i915: Push the wakeref->count deferral to the backend")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191120165514.3955081-2-chris@chris-wilson.co.uk
(cherry picked from commit 5cba2884)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

bf201f5e

drm/i915/gt: Close race between engine_park and intel_gt_retire_requests · ca1711d1

Chris Wilson authored Nov 20, 2019

The general concept was that intel_timeline.active_count was locked by
the intel_timeline.mutex. The exception was for power management, where
the engine->kernel_context->timeline could be manipulated under the
global wakeref.mutex.

This was quite solid, as we always manipulated the timeline only while
we held an engine wakeref.

And then we started retiring requests outside of struct_mutex, only
using the timelines.active_list and the timeline->mutex. There we
started manipulating intel_timeline.active_count outside of an engine
wakeref, and so introduced a race between __engine_park() and
intel_gt_retire_requests(), a race that could result in the
engine->kernel_context not being added to the active timelines and so
losing requests, which caused us to keep the system permanently powered
up [and unloadable].

The race would be easy to close if we could take the engine wakeref for
the timeline before we retire -- except timelines are not bound to any
engine and so we would need to keep all active engines awake. The
alternative is to guard intel_timeline_enter/intel_timeline_exit for use
outside of the timeline->mutex.

Fixes: e5dadff4 ("drm/i915: Protect request retirement with timeline->mutex")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191120165514.3955081-1-chris@chris-wilson.co.uk
(cherry picked from commit a6edbca7)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

ca1711d1

drm/i915: Mark up the calling context for intel_wakeref_put() · ee33baa8

Chris Wilson authored Nov 20, 2019

Previously, we assumed we could use mutex_trylock() within an atomic
context, falling back to a worker if contended. However, such trickery
is illegal inside interrupt context, and so we need to always use a
worker under such circumstances. As we normally are in process context,
we can typically use a plain mutex, and only defer to a work when we
know we are being called from an interrupt path.

Fixes: 51fbd8de ("drm/i915/pmu: Atomically acquire the gt_pm wakeref")
References: a0855d24 ("locking/mutex: Complain upon mutex API misuse in IRQ contexts")
References: https://bugs.freedesktop.org/show_bug.cgi?id=111626Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191120125433.3767149-1-chris@chris-wilson.co.uk
(cherry picked from commit 07779a76)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

ee33baa8