Commits · a58c01aa6e4c9b8a8e883c496b33b1c50b38b3e4 · Kirill Smelkov / linux

29 Apr, 2016 14 commits

drm/i915: Apply strongly ordered RCS breadcrumb to gen8/legacy · a58c01aa

Chris Wilson authored Apr 29, 2016

For legacy ringbuffer mode, we need the new ordered breadcrumb emission
tried and tested on execlists in order to avoid the dreaded "missed
interrupt" syndrome. A secondary advantage of the execlists method is
that it writes to an arbitrary address, useful if one wants to write a
breadcrumb elsewhere.

This fix is taken from commit 7c17d377 (drm/i915: Use ordered seqno
write interrupt generation on gen8+ execlists).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461932305-14637-1-git-send-email-chris@chris-wilson.co.uk

a58c01aa

drm/i915: Trim the flush for the execlists request emission · 0e93cdd4

Chris Wilson authored Apr 29, 2016

At the start of request emission, we flush some space for the request,
estimating the typical size for the request body. The common tail is now
much larger than the typical body, so we can shrink the flush
substantially.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461917226-9132-3-git-send-email-chris@chris-wilson.co.uk

0e93cdd4

drm/i915: Trim the flush for the legacy request emission · a0442461

Chris Wilson authored Apr 29, 2016

At the start of request emission, we flush some space for the request,
estimating the typical size for the request body. The tail is now much
larger than the typical body, so we can shrink the flush slightly.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461917226-9132-2-git-send-email-chris@chris-wilson.co.uk

a0442461

drm/i915: Bump reserved size for legacy gen8 semaphore emission · 596e5efc

Chris Wilson authored Apr 29, 2016

With 5 rings and a flush, we need 192 bytes of space to emit the
breadcrumb and semaphores. However, we need some spare room the size of
the single largest packet (36 dwords, 144 bytes) to accommodate
wraparound giving a grand total of 336 bytes
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461917226-9132-1-git-send-email-chris@chris-wilson.co.uk

596e5efc

drm/i915: Move VLV HDMI lane reset work around logic to intel_dpio_phy.c · 0f572ebe

Ander Conselvan de Oliveira authored Apr 27, 2016

This moves the last phy specific code from the encoders to the phy
specific file.
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Jim Bride <jim.bride@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461761065-21195-11-git-send-email-ander.conselvan.de.oliveira@intel.com

0f572ebe

drm/i915: Unduplicate pre encoder enabling phy code · 5f68c275

Ander Conselvan de Oliveira authored Apr 27, 2016

The phy code in vlv_pre_enable_dp() and vlv_hdmi_pre_enable() is
exectly the same, so extract it to intel_dpio_phy.c.
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Jim Bride <jim.bride@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461761065-21195-10-git-send-email-ander.conselvan.de.oliveira@intel.com

5f68c275

drm/i915: Unduplicate VLV phy pre pll enabling code · 6da2e616

Ander Conselvan de Oliveira authored Apr 27, 2016

The code used by the DP and HDMI paths was very similar, so make them
share it. Note that this removes the write to signal level registers
from the HDMI pre pll enable path, but that's OK since those are set
in vlv_hdmi_pre_enable() function.
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Jim Bride <jim.bride@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461761065-21195-9-git-send-email-ander.conselvan.de.oliveira@intel.com

6da2e616

drm/i915: Unduplicate VLV signal level code · 53d98725

Ander Conselvan de Oliveira authored Apr 27, 2016

The logic for setting signal levels is used for both HDMI and DP with
small variations. But it is similar enough to put behind a function
called from the encoders.

v2: Remove unrelated MST changes due to rebase fumble. (Jim Bride)
    Fix typo in the commit message. (Jim Bride)

v3: Really fix the typo. (Jim)

Cc: Jim Bride <jim.bride@linux.intel.com>
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Jim Bride <jim.bride@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461761065-21195-8-git-send-email-ander.conselvan.de.oliveira@intel.com

53d98725

drm/i915: Unduplicate CHV encoders' post pll disable code · 204970b5

Ander Conselvan de Oliveira authored Apr 27, 2016

The exact same code was used by HDMI and DP encoders, so move it to
intel_dpio_phy.c.

v2: Fix typo in the commit message. (Jim Bride)
v3: Call the new function chv_phy_post_pll_disable() instead of
    chv_phy_post_disable(), as it should be called after the pll
    is disabled. (Ville)

Cc: Jim Bride <jim.bride@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Jim Bride <jim.bride@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461761065-21195-7-git-send-email-ander.conselvan.de.oliveira@intel.com

204970b5

drm/i915: Unduplicate CHV pre-encoder enabling phy logic · e7d2a717

Ander Conselvan de Oliveira authored Apr 27, 2016

The only difference between the DP and HDMI versions was the lane count.
Since lane_count is now set appropriately for HDMI too, get rid of the
duplication and move this to intel_dpio_phy.c

v2: Don't move comments about 2nd common lane staying alive. (Ville)
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Jim Bride <jim.bride@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461761065-21195-6-git-send-email-ander.conselvan.de.oliveira@intel.com

e7d2a717

drm/i915: Unduplicate CHV phy-releated pre pll enabling code · 419b1b7a

Ander Conselvan de Oliveira authored Apr 27, 2016

The same logic is used for DP and HDMI so move it to intel_dpio_phy.c.

v2: Rebase
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Jim Bride <jim.bride@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461761065-21195-5-git-send-email-ander.conselvan.de.oliveira@intel.com

419b1b7a

drm/i915: Unduplicate chv_data_lane_soft_reset() · 844b2f9a

Ander Conselvan de Oliveira authored Apr 27, 2016

The function chv_data_lane_soft_reset() was duplicated in DP and HDMI
code. Move it to intel_dpio_phy.c.
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Jim Bride <jim.bride@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461761065-21195-4-git-send-email-ander.conselvan.de.oliveira@intel.com

844b2f9a

drm/i915: Unduplicate CHV signal level code · b7fa22d8

Ander Conselvan de Oliveira authored Apr 27, 2016

The code for programming voltage swing and emphasis was duplicated
between DP and HDMI code. Move that to a new file, intel_dpio_phy.c.

v2: Keep the "Use 800mV-0dB" comment in the HDMI code. (Ville)
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Jim Bride <jim.bride@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461761065-21195-3-git-send-email-ander.conselvan.de.oliveira@intel.com

b7fa22d8

drm/i915: Set crtc_state->lane_count for HDMI · d4d6279a

Ander Conselvan de Oliveira authored Apr 27, 2016

Set the lane count for HDMI to 4. This will make it easier to
unduplicate CHV phy code.

This also fixes the the soft reset programming for HDMI with CHV. After
commit a8f327fb ("drm/i915: Clean up CHV lane soft reset
programming"), it wouldn't set the right bits for PCS23 since it relied
on a lane count that was never set.

v2: Set lane_count in *_get_config() to please state checker. (0day)
v3: Set lane_count for DDI in DVI mode too. (CI)
v4: Add note about CHV soft lane reset. (Ander)

Fixes: a8f327fb ("drm/i915: Clean up CHV lane soft reset programming")
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Jim Bride <jim.bride@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461761065-21195-2-git-send-email-ander.conselvan.de.oliveira@intel.com

d4d6279a

28 Apr, 2016 26 commits

drm/i915: Fix comments about GMBUSFREQ register · b5d99ff9

Ville Syrjälä authored Apr 26, 2016

The comment about GMBUSFREQ is confused. The spec actually explains
the 4MHz thing perfectly by noting that the 4MHz divider values is
actually just bits [9:2] not [9:0], hence the divide by 1000 correct.
Replace the confused note with a quote from the spec, and eliminate
the duplicated comment that snuck in.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461689194-6079-4-git-send-email-ville.syrjala@linux.intel.comReviewed-by: Mika Kahola <mika.kahola@intel.com>

b5d99ff9

drm/i915: Use cached cdclk value in i915_audio_component_get_cdclk_freq() · 1033f92e

Ville Syrjälä authored Apr 26, 2016

No point in reading the cdclk out from the hardware every single time
since we have it cached already. Just return the cached value to the
audio driver.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461689194-6079-3-git-send-email-ville.syrjala@linux.intel.comReviewed-by: Mika Kahola <mika.kahola@intel.com>

1033f92e

drm/i915: Update CDCLK_FREQ register on BDW after changing cdclk frequency · 7f1052a8

Ville Syrjälä authored Apr 26, 2016

Update CDCLK_FREQ on BDW after changing the cdclk frequency. Not sure
if this is a late addition to the spec, or if I simply overlooked this
step when writing the original code.

This is what Bspec has to say about CDCLK_FREQ:
"Program this field to the CD clock frequency minus one. This is used to
 generate a divided down clock for miscellaneous timers in display."

And the "Broadwell Sequences for Changing CD Clock Frequency" section
clarifies this further:
"For CD clock 337.5 MHz, program 337 decimal.
 For CD clock 450 MHz, program 449 decimal.
 For CD clock 540 MHz, program 539 decimal.
 For CD clock 675 MHz, program 674 decimal."

Cc: stable@vger.kernel.org
Cc: Mika Kahola <mika.kahola@intel.com>
Fixes: b432e5cf ("drm/i915: BDW clock change support")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461689194-6079-2-git-send-email-ville.syrjala@linux.intel.comReviewed-by: Mika Kahola <mika.kahola@intel.com>

7f1052a8

drm/i915/bxt: Adjusting the error in horizontal timings retrieval · 042ab0c3

Ramalingam C authored Apr 19, 2016

In BXT DSI there is no regs programmed with few horizontal timings
in Pixels but txbyteclkhs.. So retrieval process adds some
ROUND_UP ERRORS in the process of PIXELS<==>txbyteclkhs.

Actually here for the given adjusted_mode, we are calculating the
value programmed to the port and then back to the horizontal timing
param in pixels. This is the expected value at the end of get_config,
including roundup errors. And if that is same as retrieved value
from port, then retrieved (HW state) adjusted_mode's horizontal
timings are corrected to match with SW state to nullify the errors.
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461053894-5058-2-git-send-email-ramalingam.c@intel.com

042ab0c3

drm/i915/BXT: Retrieving the horizontal timing for DSI · cefc4e18

Ramalingam C authored Apr 19, 2016

Retriving the horizontal timings from the port registers as part of
get_config().

This fixes a division by zero:

[   56.916557] divide error: 0000 [#1] PREEMPT SMP
[   56.921741] Modules linked in: i915(+) drm_kms_helper syscopyarea
sysfillrect sysimgblt fb_sys_fops drm intel_gtt agpgart cf
g80211 rfkill binfmt_misc ax88179_178a kvm_intel kvm irqbypass crc32c_intel
efivars tpm_tis tpm fuse
[   56.944106] CPU: 3 PID: 1097 Comm: modprobe Not tainted 4.6.0-rc4+ #433
[   56.951501] Hardware name: Intel Corp. Broxton M/RVP, BIOS
BXT1RVPA.X64.0131.B30.1604142217 04/14/2016
[   56.961908] task: ffff88007a854d00 ti: ffff88007aea0000 task.ti:
ffff88007aea0000
[   56.970273] RIP: 0010:[<ffffffffa01235b2>]  [<ffffffffa01235b2>]
drm_mode_hsync+0x22/0x40 [drm]
[   56.980043] RSP: 0018:ffff88007aea3788  EFLAGS: 00010206
[   56.985982] RAX: 000000000788b600 RBX: ffff880073c22108 RCX:
0000000000000000
[   56.993957] RDX: 0000000000000000 RSI: ffff88007ab06800 RDI:
ffff880073c22108
[   57.001935] RBP: ffff88007aea3788 R08: 0000000000000001 R09:
ffff880073c221e8
[   57.009903] R10: ffff880073c22108 R11: 0000000000000001 R12:
ffff88007a300000
[   57.017872] R13: ffff880073c22000 R14: ffff880175f78000 R15:
ffff880175f78798
[   57.025849] FS:  00007f105d3e6700(0000) GS:ffff88017fd80000(0000)
knlGS:0000000000000000
[   57.034894] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   57.041317] CR2: 00007f4d485101d0 CR3: 000000007a820000 CR4:
00000000003406e0
[   57.049292] Stack:
[   57.051539]  ffff88007aea37a0 ffffffffa043b632 ffff880175f787c8
ffff88007aea3810
[   57.059825]  ffffffffa043d59e ffff880175f787b0 ffff88007ab68c00
ffff88007aea37f0
[   57.068128]  ffff880073c221e8 ffff880073c22108 ffff880175f78780
ffff880100000000
[   57.076430] Call Trace:
[   57.079254]  [<ffffffffa043b632>] intel_mode_from_pipe_config+0x82/0xb0
[i915]
[   57.087405]  [<ffffffffa043d59e>] intel_modeset_setup_hw_state+0x55e/0xd60
[i915]
[   57.095847]  [<ffffffffa043ff94>] intel_modeset_init+0x8e4/0x1630 [i915]
[   57.103415]  [<ffffffffa047bcf0>] i915_driver_load+0xbe0/0x1980 [i915]
[   57.110745]  [<ffffffffa0116c19>] drm_dev_register+0xa9/0xc0 [drm]
[   57.117681]  [<ffffffffa011921d>] drm_get_pci_dev+0x8d/0x1e0 [drm]
[   57.124600]  [<ffffffff8195f942>] ? _raw_spin_unlock_irqrestore+0x42/0x70
[   57.132253]  [<ffffffffa03b0384>] i915_pci_probe+0x34/0x50 [i915]
[   57.139070]  [<ffffffff8149c375>] local_pci_probe+0x45/0xa0
[   57.145303]  [<ffffffff8149d300>] ? pci_match_device+0xe0/0x110
[   57.151924]  [<ffffffff8149d6cb>] pci_device_probe+0xdb/0x130
[   57.158355]  [<ffffffff81579b93>] driver_probe_device+0x223/0x440
[   57.165169]  [<ffffffff81579e85>] __driver_attach+0xd5/0x100
[   57.171500]  [<ffffffff81579db0>] ? driver_probe_device+0x440/0x440
[   57.178510]  [<ffffffff81577736>] bus_for_each_dev+0x66/0xa0
[   57.184841]  [<ffffffff815793de>] driver_attach+0x1e/0x20
[   57.190881]  [<ffffffff81578d6e>] bus_add_driver+0x1ee/0x280
[   57.197212]  [<ffffffff8157abc0>] driver_register+0x60/0xe0
[   57.203447]  [<ffffffff8149bc50>] __pci_register_driver+0x60/0x70
[   57.210285]  [<ffffffffa0119450>] drm_pci_init+0xe0/0x110 [drm]
[   57.216911]  [<ffffffff810dcd8d>] ? trace_hardirqs_on+0xd/0x10
[   57.223434]  [<ffffffffa023a000>] ? 0xffffffffa023a000
[   57.229237]  [<ffffffffa023a092>] i915_init+0x92/0x99 [i915]
[   57.235570]  [<ffffffff810003db>] do_one_initcall+0xab/0x1d0
[   57.241900]  [<ffffffff810f9eef>] ? rcu_read_lock_sched_held+0x7f/0x90
[   57.249205]  [<ffffffff81204f18>] ? kmem_cache_alloc_trace+0x248/0x2b0
[   57.256509]  [<ffffffff811a5eee>] ? do_init_module+0x27/0x1d9
[   57.262934]  [<ffffffff811a5f26>] do_init_module+0x5f/0x1d9
[   57.269167]  [<ffffffff8112392f>] load_module+0x20ef/0x27b0
[   57.275401]  [<ffffffff8111f8e0>] ? store_uevent+0x40/0x40
[   57.281541]  [<ffffffff81124243>] SYSC_finit_module+0xc3/0xf0
[   57.287969]  [<ffffffff8112428e>] SyS_finit_module+0xe/0x10
[   57.294203]  [<ffffffff81960069>] entry_SYSCALL_64_fastpath+0x1c/0xac
[   57.301406] Code: ff 5d c3 66 0f 1f 44 00 00 0f 1f 44 00 00 8b 87 d8 00 00
00 55 48 89 e5 85 c0 75 22 8b 4f 68 85 c9 78 1b 69 47 58 e8 03 00 00 99 <f7> f9
b9 d3 4d 62 10 05 f4 01 00 00 f7 e1 89 d0 c1 e8 06 5d c3
[   57.322964] RIP  [<ffffffffa01235b2>] drm_mode_hsync+0x22/0x40 [drm]
[   57.330103]  RSP <ffff88007aea3788>
[   57.334276] ---[ end trace d414224cb2e2a4cf ]---
[   57.339861] modprobe (1097) used greatest stack depth: 12048 bytes left

Fixes: 6f0e7535 ("drm/i915/BXT: Get pipe conf from the port registers")
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Acked-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461053894-5058-1-git-send-email-ramalingam.c@intel.com

cefc4e18

drm/i915: Unify GPU resets upon shutdown · e7ae86ba

Chris Wilson authored Apr 28, 2016

Both execlists and legacy need to reset the context (and mode) of the
GPU before we lose control of the system. By resetting the GPU, we
revert back to default settings. This simplifies the life of any
subsequent driver (in particular for virtualized setups) as it does not
then have to try and recover from an unknown condition. As both paths
need to reset for the same reason, move the reset to a common point.

This unifies the resets added in a647828a (drm/i915: Also perform gpu
reset under execlist mode) and 8e96d9c4 (drm/i915: reset the GPU on
context fini).

v2: Restrict the reset to "modern" gen (where we enable HW contexts) to
try and avoid leaving the machine in an unusable state with a risky
reset on older GPU. This should keep the status quo as to who performs
resets (i.e. currently only GPUs with HW contexts perform a reset on
shutdown).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
CC: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: "Niu, Bing" <bing.niu@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-25-git-send-email-chris@chris-wilson.co.uk

e7ae86ba

drm/i915: Stop tracking execlists retired requests · e39d42fa

Tvrtko Ursulin authored Apr 28, 2016

With the previous patch having extended the pinned lifetime of
contexts by referencing the previous context from the current
request until the latter is retired (completed by the GPU),
we can now remove usage of execlist retired queue entirely.

This is because the above now guarantees that all execlist
object access requirements are satisfied by this new tracking,
and we can stop taking additional references and stop keeping
request on the execlists retired queue.

The latter was a source of significant scalability issues in
the driver causing performance hits on some tests. Most
dramatical of which was igt/gem_close_race which had run time
in tens of minutes which is now reduced to tens of seconds.
Signed-off-by: Tvrtko Ursulin <tvrtko@ursulin.net>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-24-git-send-email-chris@chris-wilson.co.uk

e39d42fa

drm/i915: Store LRC hardware id in the request · a3d12761

Tvrtko Ursulin authored Apr 28, 2016

This way in the following patch we can disconnect requests
from contexts.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-23-git-send-email-chris@chris-wilson.co.uk

a3d12761

drm/i915: Track the previous pinned context inside the request · a16a4052

Chris Wilson authored Apr 28, 2016

As the contexts are accessed by the hardware until the switch is completed
to a new context, the hardware may still be writing to the context object
after the breadcrumb is visible. We must not unpin/unbind/prune that
object whilst still active and so we keep the previous context pinned until
the following request. We can generalise the tracking we already do via
the engine->last_context and move it to the request so that it works
equally for execlists and GuC.

v2: Drop the execlists double pin as that exposes a race inside the lrc
irq handler as it tries to access the context after it may be retired.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-22-git-send-email-chris@chris-wilson.co.uk

a16a4052

drm/i915: Move releasing of the GEM request from free to retire/cancel · 73db04cf

Chris Wilson authored Apr 28, 2016

If we move the release of the GEM request (i.e. decoupling it from the
various lists used for client and context tracking) after it is complete
(either by the GPU retiring the request, or by the caller cancelling the
request), we can remove the requirement that the final unreference of
the GEM request need to be under the struct_mutex.

The careful reader may notice that one or two impossible NULL pointer
tests are dropped for readability. These pointers cannot be NULL since
they are assigned during request construction and never unset.

v2,v3: Rebalance execlists by moving the context unpinning.
v4: Rebase onto -nightly
v5: Avoid trying to rebalance execlist/GuC context pinning, leave that
to the next step
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-21-git-send-email-chris@chris-wilson.co.uk

73db04cf

drm/i915: Move the magical deferred context allocation into the request · 978f1e09

Chris Wilson authored Apr 28, 2016

We can hide more details of execlists from higher level code by removing
the explicit call to create an execlist context from execbuffer and
into its first use by execlists.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-20-git-send-email-chris@chris-wilson.co.uk

978f1e09

drm/i915: Refactor execlists default context pinning · 24f1d3cc

Chris Wilson authored Apr 28, 2016

Refactor pinning and unpinning of contexts, such that the default
context for an engine is pinned during initialisation and unpinned
during teardown (pinning of the context handles the reference counting).
Thus we can eliminate the special case handling of the default context
that was required to mask that it was not being pinned normally.

v2: Rebalance context_queue after rebasing.
v3: Rebase to -nightly (not 40 patches in)
v4: Rebase onto request_alloc unwinding
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-19-git-send-email-chris@chris-wilson.co.uk

24f1d3cc

drm/i915: Replace the pinned context address with its unique ID · 7069b144

Chris Wilson authored Apr 28, 2016

Rather than reuse the current location of the context in the global GTT
for its hardware identifier, use the context's unique ID assigned to it
for its whole lifetime.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-18-git-send-email-chris@chris-wilson.co.uk

7069b144

drm/i915: Assign every HW context a unique ID · 5d1808ec

Chris Wilson authored Apr 28, 2016

The hardware tracks contexts and expects all live contexts (those active
on the hardware) to have a unique identifier. This is used by the
hardware to assign pagefaults and the like to a particular context.

v2: Reorder to make sure ctx->link is not left dangling if the
assignment of a hw_id fails (Mika).

v3: We have 21bits of context space, not 20.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-17-git-send-email-chris@chris-wilson.co.uk

5d1808ec

drm/i915: Update execlists context descriptor format commentary · ef87bba8

Chris Wilson authored Apr 28, 2016

The comments describing the Context Descriptor Format are off by a bit
for the size of the context ID.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dave Gordon <david.s.gordon@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-16-git-send-email-chris@chris-wilson.co.uk

ef87bba8

drm/i915: Preallocate enough space for the average request · 6310346e

Chris Wilson authored Apr 28, 2016

Rather than being interrupted when we run out of space halfway through
the request, and having to restart from the beginning (and returning to
userspace), flush a little more free space when we prepare the request.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-15-git-send-email-chris@chris-wilson.co.uk

6310346e

drm/i915: Manually unwind after a failed request allocation · bfa01200

Chris Wilson authored Apr 28, 2016

In the next patches, we want to move the work out of freeing the request
and into its retirement (so that we can free the request without
requiring the struct_mutex). This means that we cannot rely on
unreferencing the request to completely teardown the request any more
and so we need to manually unwind the failed allocation. In doing so, we
reorder the allocation in order to make the unwind simple (and ensure
that we don't try to unwind a partial request that may have modified
global state) and so we end up pushing the initial preallocation down
into the engine request initialisation functions where we have the
requisite control over the state of the request.

Moving the initial preallocation into the engine is less than ideal: it
moves logic to handle a specific problem with request handling out of
the common code. On the other hand, it does allow those backends
significantly more flexibility in performing its allocations.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-14-git-send-email-chris@chris-wilson.co.uk

bfa01200

drm/i915: Remove the identical implementations of request space reservation · 0251a963

Chris Wilson authored Apr 28, 2016

Now that we share intel_ring_begin(), reserving space for the tail of
the request is identical between legacy/execlists and so the tautology
can be removed. In the process, we move the reserved space tracking
from the ringbuffer on to the request. This is to enable us to reorder
the reserved space allocation in the next patch.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-13-git-send-email-chris@chris-wilson.co.uk

0251a963

drm/i915: Unify intel_ring_begin() · 987046ad

Chris Wilson authored Apr 28, 2016

Combine the near identical implementations of intel_logical_ring_begin()
and intel_ring_begin() - the only difference is that the logical wait
has to check for a matching ring (which is assumed by legacy).

In the process some debug messages are culled as there were following a
WARN if we hit an actual error.

v2: Updated commentary
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-12-git-send-email-chris@chris-wilson.co.uk

987046ad

drm/i915: Rearrange switch_context to load the aliasing ppgtt on first use · f9326be5

Chris Wilson authored Apr 28, 2016

The code to switch_mm() is already handled by i915_switch_context(), the
only difference required to setup the aliasing ppgtt is that we need to
emit te switch_mm() on the first context, i.e. when transitioning from
engine->last_context == NULL. This allows us to defer the
initialisation of the GPU from early device initialisation to first use,
which should marginally speed up both. The caveat is that we then defer
the context initialisation until first use - i.e. we cannot assume that
the GPU engines are initialised. For example, this means that power
contexts for rc6 (Ironlake) need to explicitly loaded, as they are.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-11-git-send-email-chris@chris-wilson.co.uk

f9326be5

drm/i915: Remove early l3-remap · d200cda6

Chris Wilson authored Apr 28, 2016

Since we do the l3-remap on context switch, we can remove the redundant
early call to set the mapping prior to performing the first context
switch.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-10-git-send-email-chris@chris-wilson.co.uk

d200cda6

drm/i915: Consolidate L3 remapping LRI · ff55b5e8

Chris Wilson authored Apr 28, 2016

We can use a single MI_LOAD_REGISTER_IMM command packet to write all the
L3 remapping registers, shrinking the number of bytes required to emit
the context switch.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-9-git-send-email-chris@chris-wilson.co.uk

ff55b5e8

drm/i915: L3 cache remapping is part of context switching · b0ebde39

Chris Wilson authored Apr 28, 2016

Move the i915_gem_l3_remap function such that it next to the context
switching, which is where we perform the L3 remap.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-8-git-send-email-chris@chris-wilson.co.uk

b0ebde39

drm/i915: Mark the current context as lost on suspend · b2e862d0

Chris Wilson authored Apr 28, 2016

In order to force a reload of the context image upon resume, we first
need to mark its absence on suspend. Currently we are failing to restore
the golden context state and any context w/a to the default context
after resume.

One oversight corrected, is that we had forgotten to reapply the L3
remapping when restoring the lost default context.

v2: Remove deprecated WARN.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-7-git-send-email-chris@chris-wilson.co.uk

b2e862d0

drm/i915: Use i915_vma_pin_iomap on the ringbuffer object · 3d77e9be

Chris Wilson authored Apr 28, 2016

Similarly to i915_gem_object_pin_map on LLC platforms, we can
use the new VMA based io mapping on !LLC to amoritize the cost
of ringbuffer pinning and unpinning.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-6-git-send-email-chris@chris-wilson.co.uk

3d77e9be

drm/i915: Move ioremap_wc tracking onto VMA · 8ef8561f

Chris Wilson authored Apr 28, 2016

By tracking the iomapping on the VMA itself, we can share that area
between multiple users. Also by only revoking the iomapping upon
unbinding from the mappable portion of the GGTT, we can keep that iomap
across multiple invocations (e.g. execlists context pinning).

Note that by moving the iounnmap tracking to the VMA, we actually end up
fixing a leak of the iomapping in intel_fbdev.

v1.5: Rebase prompted by Tvrtko
v2: Drop dev_priv parameter, we can recover the i915_ggtt from the vma.
v3: Move handling of ioremap space exhaustion to vmap_purge and also
allow vmallocs to recover old iomaps. Add Tvrtko's kerneldoc.
v4: Fix a use-after-free in shrinker and rearrange i915_vma_iomap
v5: Back to i915_vm_to_ggtt
v6: Use i915_vma_pin_iomap and i915_vma_unpin_iomap to mark critical
sections and ensure the VMA cannot be reaped whilst mapped.
v7: Move i915_vma_iounmap so that consumers of the API are not tempted,
and add iomem annotations
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-5-git-send-email-chris@chris-wilson.co.uk

8ef8561f