Commits · e8f6b4952ec54a9d7e43f908d39dc168b0310599 · Kirill Smelkov / linux

28 Aug, 2019 3 commits

drm/i915/execlists: Flush the post-sync breadcrumb write harder · e8f6b495

Chris Wilson authored Aug 27, 2019

Quite rarely we see that the CS completion event fires before the
breadcrumb is coherent, which presumably is a result of the CS_STALL not
waiting for the post-sync operation. Try throwing in a DC_FLUSH into
the following pipecontrol to see if that makes any difference.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190827120615.31390-1-chris@chris-wilson.co.uk

e8f6b495

drm/i915/selftests: Try to recycle context allocations · c4e64881

Chris Wilson authored Aug 27, 2019

igt_ctx_exec allocates a new context for each iteration, keeping them
all allocated until the end. Instead, release the local ctx reference at
the end of each iteration, allowing ourselves to reap those if under
mempressure.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190827161726.3640-2-chris@chris-wilson.co.uk

c4e64881

drm/i915/selftests: Remove accidental serialization between gpu_fill · f2085c8e

Chris Wilson authored Aug 27, 2019

Upon object creation for live_gem_contexts, we fill the object with
known scratch and flush it out of the CPU cache. Before performing the
GPU fill, we don't need to flush it again and so avoid serialising with
previous fills.

However, we do need some throttling on the internal interfaces if we do
not want to run out of memory!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190827161726.3640-1-chris@chris-wilson.co.uk

f2085c8e

27 Aug, 2019 14 commits

drm/i915: use a separate context for gpu relocs · 8a9a9827

Daniele Ceraolo Spurio authored Aug 27, 2019

The CS pre-parser can pre-fetch commands across memory sync points and
starting from gen12 it is able to pre-fetch across BB_START and BB_END
boundaries as well, so when we emit gpu relocs the pre-parser might
fetch the target location of the reloc before the memory write lands.

The parser can't pre-fetch across the ctx switch, so we use a separate
context to guarantee that the memory is synchronized before the parser
can get to it.

Note that there is no risk of the CS doing a lite restore from the reloc
context to the user context, even if the two have the same hw_id,
because since gen11 the CS also checks the LRCA when deciding if it can
lite-restore.

v2: limit new context to gen12+, release in eb_destroy, add a comment
    in emit_fini_breadcrumb (Chris).
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190827185805.21799-1-daniele.ceraolospurio@intel.com

8a9a9827

drm/i915/tgl/perf: use the same oa ctx_id format as icl · 45e9c829

Michel Thierry authored Aug 23, 2019

Compared to Icelake, Tigerlake's MAX_CONTEXT_HW_ID is smaller by one, but
since we just use the upper 32 bits of the lrc_desc, it's guaranteed OA
will use the correct one.

Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823082055.5992-19-lucas.demarchi@intel.com

45e9c829

drm/i915/tgl: Do not apply WaIncreaseDefaultTLBEntries from GEN12 onwards · a8ff5d40

Michel Thierry authored Aug 23, 2019

Workaround no longer needed (plus L3_LRA_1_GPGPU doesn't exist).

Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823082055.5992-17-lucas.demarchi@intel.com

a8ff5d40

drm/i915/tgl: Implement TGL DisplayPort training sequence · 99389390

José Roberto de Souza authored Aug 23, 2019

On TGL some registers moved from DDI to transcoder and the
DisplayPort training sequence has a separate BSpec page.

I started adding 'ifs' to the original intel_ddi_pre_enable_dp() but
it was becoming really hard to follow, so a new and cleaner function
for TGL was added with comments of all steps. It's similar to ICL,
but different enough to deserve a new function.

The rest of DisplayPort enable and the whole disable sequences
remained the same.

v2: FEC and DSC should be enabled on sink side before start link
training(Maarten reported and Manasi confirmed the DSC part)

v3: Add call to enable FEC on step 7.l(Manasi)

BSpec: 49190
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Manasi Navare <manasi.d.navare@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823082055.5992-16-lucas.demarchi@intel.com

99389390

drm/i915: Disable pipes in reverse order · 9c722e17

José Roberto de Souza authored Aug 23, 2019

Disable CRTC/pipes in reverse order because some features (MST in
TGL+) requires master and slave relationship between pipes, so it
should always pick the lowest pipe as master as it will be enabled
first and disable in the reverse order so the master will be the last
one to be disabled.

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Mika Kahola <mika.kahola@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823082055.5992-13-lucas.demarchi@intel.com

9c722e17

drm: Add for_each_oldnew_intel_crtc_in_state_reverse() · 0456417e

José Roberto de Souza authored Aug 23, 2019

Same as for_each_oldnew_intel_crtc_in_state() but iterates in reverse
order.

v2: Fix additional blank line
v3: Rebase

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Mika Kahola <mika.kahola@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823082055.5992-12-lucas.demarchi@intel.com

0456417e

drm/i915/tgl: Add maximum resolution supported by PSR2 HW · f7b3c226

José Roberto de Souza authored Aug 23, 2019

TGL PSR2 HW supports a bigger resolution, so lets add it

BSpec: 50422, 49199
Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Anshuman Gupta <anshuman.gupta@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823082055.5992-10-lucas.demarchi@intel.com

f7b3c226

drm/i915: Do not read PSR2 register in transcoders without PSR2 · 0f81e645

José Roberto de Souza authored Aug 17, 2019

This fix unclaimed access warnings:

[  245.525788] ------------[ cut here ]------------
[  245.525884] Unclaimed read from register 0x62900
[  245.526154] WARNING: CPU: 0 PID: 1234 at drivers/gpu/drm/i915/intel_uncore.c:1100 __unclaimed_reg_debug+0x40/0x50 [i915]
[  245.526160] Modules linked in: i915 x86_pkg_temp_thermal ax88179_178a coretemp usbnet crct10dif_pclmul mii crc32_pclmul ghash_clmulni_intel e1000e [last unloaded: i915]
[  245.526191] CPU: 0 PID: 1234 Comm: kms_fullmodeset Not tainted 5.1.0-rc6+ #915
[  245.526197] Hardware name: Intel Corporation Tiger Lake Client Platform/TigerLake U DDR4 SODIMM RVP, BIOS TGLSFWR1.D00.2081.A10.1904182155 04/18/2019
[  245.526273] RIP: 0010:__unclaimed_reg_debug+0x40/0x50 [i915]
[  245.526281] Code: 74 05 5b 5d 41 5c c3 45 84 e4 48 c7 c0 76 97 21 a0 48 c7 c6 6c 97 21 a0 89 ea 48 0f 44 f0 48 c7 c7 7f 97 21 a0 e8 4f 1e fe e0 <0f> 0b 83 2d 6f d9 1c 00 01 5b 5d 41 5c c3 66 90 41 57 41 56 41 55
[  245.526288] RSP: 0018:ffffc900006bf7d8 EFLAGS: 00010086
[  245.526297] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[  245.526304] RDX: 0000000000000007 RSI: 0000000000000000 RDI: 00000000ffffffff
[  245.526310] RBP: 0000000000061900 R08: 0000000000000000 R09: 0000000000000001
[  245.526317] R10: 0000000000000006 R11: 0000000000000000 R12: 0000000000000001
[  245.526324] R13: 0000000000000000 R14: ffff8882914f0d58 R15: 0000000000000206
[  245.526332] FS:  00007fed2a3c39c0(0000) GS:ffff8882a8600000(0000) knlGS:0000000000000000
[  245.526340] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  245.526347] CR2: 00007fed28dff000 CR3: 00000002a086c006 CR4: 0000000000760ef0
[  245.526354] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  245.526361] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  245.526367] PKRU: 55555554
[  245.526373] Call Trace:
[  245.526454]  gen11_fwtable_read32+0x219/0x250 [i915]
[  245.526576]  intel_psr_activate+0x57/0x400 [i915]
[  245.526697]  intel_psr_enable_locked+0x367/0x4b0 [i915]
[  245.526828]  intel_psr_enable+0xa4/0xd0 [i915]
[  245.526946]  intel_enable_ddi+0x127/0x2f0 [i915]
[  245.527075]  intel_encoders_enable.isra.79+0x62/0x90 [i915]
[  245.527202]  haswell_crtc_enable+0x2a2/0x850 [i915]
[  245.527337]  intel_update_crtc+0x51/0x360 [i915]
[  245.527466]  skl_update_crtcs+0x26c/0x300 [i915]
[  245.527603]  intel_atomic_commit_tail+0x3e5/0x13c0 [i915]
[  245.527757]  intel_atomic_commit+0x24d/0x2d0 [i915]
[  245.527782]  drm_atomic_helper_set_config+0x7b/0x90
[  245.527799]  drm_mode_setcrtc+0x1b4/0x6f0
[  245.527856]  ? drm_mode_getcrtc+0x180/0x180
[  245.527867]  drm_ioctl_kernel+0xad/0xf0
[  245.527886]  drm_ioctl+0x2f4/0x3b0
[  245.527902]  ? drm_mode_getcrtc+0x180/0x180
[  245.527935]  ? rcu_read_lock_sched_held+0x6f/0x80
[  245.527956]  do_vfs_ioctl+0xa0/0x6d0
[  245.527970]  ? __task_pid_nr_ns+0xb6/0x200
[  245.527991]  ksys_ioctl+0x35/0x70
[  245.528009]  __x64_sys_ioctl+0x11/0x20
[  245.528020]  do_syscall_64+0x55/0x180
[  245.528034]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  245.528042] RIP: 0033:0x7fed2cc7c3c7
[  245.528050] Code: 00 00 90 48 8b 05 c9 3a 0d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 99 3a 0d 00 f7 d8 64 89 01 48
[  245.528057] RSP: 002b:00007ffe36944378 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  245.528067] RAX: ffffffffffffffda RBX: 00007ffe369443b0 RCX: 00007fed2cc7c3c7
[  245.528074] RDX: 00007ffe369443b0 RSI: 00000000c06864a2 RDI: 0000000000000003
[  245.528081] RBP: 00007ffe369443b0 R08: 0000000000000000 R09: 0000564c0173ae98
[  245.528088] R10: 0000564c0173aeb8 R11: 0000000000000246 R12: 00000000c06864a2
[  245.528095] R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000000000
[  245.528128] irq event stamp: 140866
[  245.528138] hardirqs last  enabled at (140865): [<ffffffff819a63dc>] _raw_spin_unlock_irqrestore+0x4c/0x60
[  245.528148] hardirqs last disabled at (140866): [<ffffffff819a624d>] _raw_spin_lock_irqsave+0xd/0x50
[  245.528158] softirqs last  enabled at (140860): [<ffffffff81c0038c>] __do_softirq+0x38c/0x499
[  245.528170] softirqs last disabled at (140853): [<ffffffff810b4a09>] irq_exit+0xa9/0xc0
[  245.528247] WARNING: CPU: 0 PID: 1234 at drivers/gpu/drm/i915/intel_uncore.c:1100 __unclaimed_reg_debug+0x40/0x50 [i915]
[  245.528254] ---[ end trace 366069676e98a410 ]---
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823082055.5992-7-lucas.demarchi@intel.com

0f81e645

drm/i915/tgl: Guard and warn if more than one eDP panel is present · 6056517a

José Roberto de Souza authored Aug 23, 2019

On TGL+ it's possible to have PSR1 enabled in other ports besides DDIA.
PSR2 is still limited to DDIA. However currently we handle only one
instance of PSR struct. Lets guard intel_psr_init_dpcd() against
multiple eDP panels and warn about it.

v2: Reword commit message to be TGL+ only and with the info where
PSR1/PSR2 are supported (Lucas)

Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Anshuman Gupta <anshuman.gupta@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823082055.5992-6-lucas.demarchi@intel.com

6056517a

drm/i915: Make engine's batch pool safe for use with virtual engines · cccdce1d

Chris Wilson authored Aug 27, 2019

A virtual engine itself does not have a batch pool, but we can gleefully
use any of its siblings instead.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190827135935.3831-1-chris@chris-wilson.co.uk

cccdce1d

drm/i915: Only activate i915_active debugobject once · f52c6d0d

Chris Wilson authored Aug 27, 2019

The point of debug_object_activate is to mark the first, and only the
first, acquisition. The object then remains active until the last
release. However, we marked up all successful first acquires even though
we allowed concurrent parties to try and acquire the i915_active
simultaneously (serialised by the i915_active.mutex).

Testcase: igt/gem_mmap_gtt/fault-concurrent
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190827132631.18627-1-chris@chris-wilson.co.uk

f52c6d0d

drm/i915/selftests: Markup impossible error pointers · 21b0c32b

Chris Wilson authored Aug 27, 2019

If we create a new live_context() we should have a mapping for each
engine. Document that assumption with an assertion.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190827094933.13778-1-chris@chris-wilson.co.uk

21b0c32b

drm/i915: Use NOEVICT for first pass on attemping to pin a GGTT mmap · ebfdf5cd

Chris Wilson authored Aug 26, 2019

The intention is that we first try to pin the current vma into the
mappable aperture only if it is already in use or it fits in the free
space and will not cause contention. The first attempt was meant to be
using PIN_NOEVICT to reuse the current vma if possible, following up
with different eviction strategies.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111485
Fixes: 6846895f ("drm/i915: Replace PIN_NONFAULT with calls to PIN_NOEVICT")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190826130750.17272-1-chris@chris-wilson.co.uk

ebfdf5cd

drm/i915/selftests: Add the usual batch vma managements to st_workarounds · 1d5b7773

Chris Wilson authored Aug 26, 2019

To properly handle asynchronous migration of batch objects, we need to
couple the fences on the incoming batch into the request and should not
assume that they always start idle.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190826072149.9447-1-chris@chris-wilson.co.uk

1d5b7773

26 Aug, 2019 1 commit

drm/i915: Call dma_set_max_seg_size() in i915_driver_hw_probe() · acd674af

Lyude Paul authored Aug 23, 2019

Currently, we don't call dma_set_max_seg_size() for i915 because we
intentionally do not limit the segment length that the device supports.
However, this results in a warning being emitted if we try to map
anything larger than SZ_64K on a kernel with CONFIG_DMA_API_DEBUG_SG
enabled:

[    7.751926] DMA-API: i915 0000:00:02.0: mapping sg segment longer
than device claims to support [len=98304] [max=65536]
[    7.751934] WARNING: CPU: 5 PID: 474 at kernel/dma/debug.c:1220
debug_dma_map_sg+0x20f/0x340

This was originally brought up on
https://bugs.freedesktop.org/show_bug.cgi?id=108517 , and the consensus
there was it wasn't really useful to set a limit (and that dma-debug
isn't really all that useful for i915 in the first place). Unfortunately
though, CONFIG_DMA_API_DEBUG_SG is enabled in the debug configs for
various distro kernels. Since a WARN_ON() will disable automatic problem
reporting (and cause any CI with said option enabled to start
complaining), we really should just fix the problem.

Note that as me and Chris Wilson discussed, the other solution for this
would be to make DMA-API not make such assumptions when a driver hasn't
explicitly set a maximum segment size. But, taking a look at the commit
which originally introduced this behavior, commit 78c47830
("dma-debug: check scatterlist segments"), there is an explicit mention
of this assumption and how it applies to devices with no segment size:

	Conversely, devices which are less limited than the rather
	conservative defaults, or indeed have no limitations at all
	(e.g. GPUs with their own internal MMU), should be encouraged to
	set appropriate dma_parms, as they may get more efficient DMA
	mapping performance out of it.

So unless there's any concerns (I'm open to discussion!), let's just
follow suite and call dma_set_max_seg_size() with UINT_MAX as our limit
to silence any warnings.

Changes since v3:
* Drop patch for enabling CONFIG_DMA_API_DEBUG_SG in CI. It looks like
  just turning it on causes the kernel to spit out bogus WARN_ONs()
  during some igt tests which would otherwise require teaching igt to
  disable the various DMA-API debugging options causing this. This is
  too much work to be worth it, since DMA-API debugging is useless for
  us. So, we'll just settle with this single patch to squelch WARN_ONs()
  during driver load for users that have CONFIG_DMA_API_DEBUG_SG turned
  on for some reason.
* Move dma_set_max_seg_size() call into i915_driver_hw_probe() - Chris
  Wilson
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: <stable@vger.kernel.org> # v4.18+
Link: https://patchwork.freedesktop.org/patch/msgid/20190823205251.14298-1-lyude@redhat.com

acd674af

24 Aug, 2019 2 commits

drm/i915: to make vgpu ppgtt notificaiton as atomic operation · 52988009

Xiaolin Zhang authored Aug 23, 2019

vgpu ppgtt notification was split into 2 steps, the first step is to
update PVINFO's pdp register and then write PVINFO's g2v_notify register
with action code to tirgger ppgtt notification to GVT side.

currently these steps were not atomic operations due to no any protection,
so it is easy to enter race condition state during the MTBF, stress and
IGT test to cause GPU hang.

the solution is to add a lock to make vgpu ppgtt notication as atomic
operation.

Cc: stable@vger.kernel.org
Signed-off-by: Xiaolin Zhang <xiaolin.zhang@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/1566543451-13955-1-git-send-email-xiaolin.zhang@intel.com

52988009

drm/i915/selftests: Teach igt_gpu_fill_dw() to take intel_context · 75b974a8

Chris Wilson authored Aug 24, 2019

Avoid having to pass around (ctx, engine) everywhere by passing the
actual intel_context we intend to use. Today we preach this lesson to
igt_gpu_fill_dw and its callers' callers.

The immediate benefit for the GEM selftests is that we aim to use the
GEM context as the control, the source of the engines on which to test
the GEM context.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823235141.31799-1-chris@chris-wilson.co.uk

75b974a8

23 Aug, 2019 20 commits

drm/i915: Keep drm_i915_file_private around under RCU · 77715906

Chris Wilson authored Aug 23, 2019

Ensure that the drm_i915_file_private continues to exist as we attempt
to remove a request from its list, which may race with the destruction
of the file.

<6> [38.380714] [IGT] gem_ctx_create: starting subtest basic-files
<0> [42.201329] BUG: spinlock bad magic on CPU#0, kworker/u16:0/7
<4> [42.201356] general protection fault: 0000 [#1] PREEMPT SMP PTI
<4> [42.201371] CPU: 0 PID: 7 Comm: kworker/u16:0 Tainted: G     U            5.3.0-rc5-CI-Patchwork_14169+ #1
<4> [42.201391] Hardware name: Dell Inc.                 OptiPlex 745                 /0GW726, BIOS 2.3.1  05/21/2007
<4> [42.201594] Workqueue: i915 retire_work_handler [i915]
<4> [42.201614] RIP: 0010:spin_dump+0x5a/0x90
<4> [42.201625] Code: 00 48 8d 88 c0 06 00 00 48 c7 c7 00 71 09 82 e8 35 ef 00 00 48 85 db 44 8b 4d 08 41 b8 ff ff ff ff 48 c7 c1 0b cd 0f 82 74 0e <44> 8b 83 e0 04 00 00 48 8d 8b c0 06 00 00 8b 55 04 48 89 ee 48 c7
<4> [42.201660] RSP: 0018:ffffc9000004bd80 EFLAGS: 00010202
<4> [42.201673] RAX: 0000000000000031 RBX: 6b6b6b6b6b6b6b6b RCX: ffffffff820fcd0b
<4> [42.201688] RDX: 0000000000000000 RSI: ffff88803de266f8 RDI: 00000000ffffffff
<4> [42.201703] RBP: ffff888038381ff8 R08: 00000000ffffffff R09: 000000006b6b6b6b
<4> [42.201718] R10: 0000000041cb0b89 R11: 646162206b636f6c R12: ffff88802a618500
<4> [42.201733] R13: ffff88802b32c288 R14: ffff888038381ff8 R15: ffff88802b32c250
<4> [42.201748] FS:  0000000000000000(0000) GS:ffff88803de00000(0000) knlGS:0000000000000000
<4> [42.201765] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [42.201778] CR2: 00007f2cefc6d180 CR3: 00000000381ee000 CR4: 00000000000006f0
<4> [42.201793] Call Trace:
<4> [42.201805]  do_raw_spin_lock+0x66/0xb0
<4> [42.201898]  i915_request_retire+0x548/0x7c0 [i915]
<4> [42.201989]  retire_requests+0x4d/0x60 [i915]
<4> [42.202078]  i915_retire_requests+0x144/0x2e0 [i915]
<4> [42.202169]  retire_work_handler+0x10/0x40 [i915]

Recently, in commit 44c22f3f ("drm/i915: Serialize insertion into the
file->mm.request_list"), we fixed a race on insertion. Now, it appears
we also have a race with destruction!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823181455.31910-1-chris@chris-wilson.co.uk

77715906

drm/i915/uc: define GuC and HuC FWs for EHL · 936ad29d

Daniele Ceraolo Spurio authored Aug 19, 2019

First uc firmware release for EHL.
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Anusha Srivatsa <anusha.srivatsa@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Tested-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190820012327.36443-1-daniele.ceraolospurio@intel.com

936ad29d

drm/i915: Flush the existing fence before GGTT read/write · 636e83f2

Chris Wilson authored Aug 23, 2019

Our fence management is lazy, very lazy. If the user marks an object as
untiled, we do not immediately flush the fence but merely mark it as
dirty. On the next use we have to remember to check and remove the fence,
by which time we hope it is idle and we do not have to wait.

v2: Throw away the old fence on the next ggtt_pin.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111468
Fixes: 1f7fd484 ("drm/i915: Replace i915_vma_put_fence()")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823153944.20630-1-chris@chris-wilson.co.uk

636e83f2

drm/i915/gtt: Preallocate Braswell top-level page directory · 191797a8

Chris Wilson authored Aug 23, 2019

In order for the Braswell top-level PD to remain the same from the time
of request construction to its submission onto HW, as we may be
asynchronously rewriting the page tables (thus changing the expected
register state after having already stored the old addresses in the
request), the top level PD must be preallocated.

So wave goodbye to our lazy allocation of those 4x2 pages.

v2: A little bit of write-flushing required (presumably it always has
been required, but now we are more susceptible and it is showing up!)

v3: Put back the forced-PD-reload on every batch, we can't survive
without it and explicitly marking the context for PD reload makes
Braswell turn nasty.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823141421.2398-1-chris@chris-wilson.co.uk

191797a8

drm/i915: Hold irq-off for the entire fake lock period · 6dcb85a0

Chris Wilson authored Aug 23, 2019

Sadly lockdep records when the irqs are re-enabled and then marks up the
fake lock as being irq-unsafe. Our hand is forced and so we must mark up
the entire fake lock critical section as irq-off.

Hopefully this is the last tweak required!

v2: Not quite, we need to mark the timeline spinlock as irqsafe. That
was a genuine bug being hidden by the earlier lockdep splat.

Fixes: d6773926 ("drm/i915/gt: Mark up the nested engine-pm timeline lock as irqsafe")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823132700.25286-2-chris@chris-wilson.co.uk

6dcb85a0

drm/i915: Use hweight8() for 8bit masks · 0b14d968

Ville Syrjälä authored Aug 21, 2019

Use hweight8() instead of hweight32() for 8bit masks. Doesn't actually
matter for us since the arch code will go for hweight32() anyway, but
maybe we stil want to do this for documentation purposes?
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190821173033.24123-5-ville.syrjala@linux.intel.comReviewed-by: Jani Nikula <jani.nikula@intel.com>

0b14d968

drm/i915: s/num_active_crtcs/num_active_pipes/ · c08e9132

Ville Syrjälä authored Aug 21, 2019

Set a good example and talk about pipes rather than crtcs.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190821173033.24123-4-ville.syrjala@linux.intel.comReviewed-by: Jani Nikula <jani.nikula@intel.com>

c08e9132

drm/i915: Use enum pipe consistently · d048a268

Ville Syrjälä authored Aug 21, 2019

Replace all "int pipe"s with "enum pipe pipe"s to make it clear
what we're dealing with.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190821173033.24123-3-ville.syrjala@linux.intel.comReviewed-by: Jani Nikula <jani.nikula@intel.com>

d048a268

drm/i915: Unconfuse pipe vs. crtc->index in i915_get_crtc_scanoutpos() · e8edae54

Ville Syrjälä authored Aug 21, 2019

The "pipe" argument passed in by the vblank code is in fact the crtc
index. Don't assume that is the same as the pipe.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190821173033.24123-2-ville.syrjala@linux.intel.comReviewed-by: Jani Nikula <jani.nikula@intel.com>

e8edae54

drm/i915: Use enum pipe instead of crtc index to track active pipes · d06a79d3

Ville Syrjälä authored Aug 21, 2019

We may need to eliminate the crtc->index == pipe assumptions from
the code to support arbitrary pipes being fused off. Start that by
switching some bitmasks over to using pipe instead of the crtc index.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190821173033.24123-1-ville.syrjala@linux.intel.comReviewed-by: Jani Nikula <jani.nikula@intel.com>

d06a79d3

drm/i915: Expand subslice mask · 100f5f7f

Stuart Summers authored Aug 23, 2019

Currently, the subslice_mask runtime parameter is stored as an
array of subslices per slice. Expand the subslice mask array to
better match what is presented to userspace through the
I915_QUERY_TOPOLOGY_INFO ioctl. The index into this array is
then calculated:
  slice * subslice stride + subslice index / 8

v2: Fix 32-bit build
v3: Use new helper function in SSEU workaround warning message
v4: Use GEM_BUG_ON to force developers to use valid SSEU configurations
    per platform (Chris)
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823160307.180813-12-stuart.summers@intel.com

100f5f7f

drm/i915: Add new function to copy subslices for a slice · 668df17f

Stuart Summers authored Aug 23, 2019

Add a new function to copy subslices for a specified slice
between intel_sseu structures for the purpose of determining
power-gate status. Note that currently ss_stride has a max
of 1.
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823160307.180813-11-stuart.summers@intel.com

668df17f

drm/i915: Refactor instdone loops on new subslice functions · eaef5b3c

Stuart Summers authored Aug 23, 2019

Refactor instdone loops to use the new intel_sseu_has_subslice
function.
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823160307.180813-10-stuart.summers@intel.com

eaef5b3c

drm/i915: Add function to determine if a slice has a subslice · e1210bbf

Stuart Summers authored Aug 23, 2019

Add a new function to determine whether a particular slice
has a given subslice.
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823160307.180813-9-stuart.summers@intel.com

e1210bbf

drm/i915: Use subslice stride to set subslices for a given slice · 6db40ec8

Stuart Summers authored Aug 23, 2019

Add a subslice stride calculation when setting subslices. This
aligns more closely with the userspace expectation of the subslice
mask structure.

v2: Use local variable for subslice_mask on HSW and
    clean up a few other subslice_mask local variable
    changes
v3: Add GEM_BUG_ON for ss_stride to prevent array overflow (Chris)
    Split main set function and refactors in intel_device_info.c
    into separate patches (Chris)
v4: Reduce ss_stride size check when setting subslices per slice
    based on actual expected max stride (Chris)
    Move that GEM_BUG_ON check for the ss_stride out to the patch
    which adds the ss_stride
v5: Use memcpy instead of looping through each stride index
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823160307.180813-8-stuart.summers@intel.com

6db40ec8

drm/i915: Add function to set subslices · 9e8a135e

Stuart Summers authored Aug 23, 2019

Add a new function to set a set of subslices for a given
slice.

v2: Fix typo in subslice_mask assignment
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823160307.180813-7-stuart.summers@intel.com

9e8a135e

drm/i915: Use local variables for subslice_mask for device info · 33ee9e86

Stuart Summers authored Aug 23, 2019

When setting up subslice_mask, instead of operating on the slice
array directly, use a local variable to start bits per slice, then
use this to set the per slice array in one step.
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823160307.180813-6-stuart.summers@intel.com

33ee9e86

drm/i915: Add EU stride runtime parameter · 49610c37

Stuart Summers authored Aug 23, 2019

Add a new SSEU runtime parameter, eu_stride, which is
used to mirror the userspace concept of a range of EUs
per subslice.

This patch simply adds the parameter and updates usage
in the QUERY_TOPOLOGY_INFO handler.

v2: Add GEM_BUG_ON to make sure eu_stride is valid
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823160307.180813-5-stuart.summers@intel.com

49610c37

drm/i915: Add subslice stride runtime parameter · 7a200aad

Stuart Summers authored Aug 23, 2019

Add a new parameter, ss_stride, to the runtime info
structure. This is used to mirror the userspace concept
of subslice stride, which is a range of subslices per slice.

This patch simply adds the definition and updates usage
in the QUERY_TOPOLOGY_INFO handler.

v2: Add GEM_BUG_ON to make sure ss_stride is valid
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823160307.180813-4-stuart.summers@intel.com

7a200aad

drm/i915: Add function to set SSEU info per platform · 8b355db9

Stuart Summers authored Aug 23, 2019

Add a new function to allow each platform to set maximum
slice, subslice, and EU information to reduce code duplication.
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823160307.180813-3-stuart.summers@intel.com

8b355db9