Commits · 88595ac9ad199fc2e09270ec92e1ff0806033b83 · nexedi / linux

26 Mar, 2015 5 commits

drm/i915: always preserve bios swizzling · 88595ac9

Daniel Vetter authored Mar 26, 2015

Currently we only set preserve_bios_swizzling when the initial fb is
shared and totally miss the single-screen case. Fix this by
consolidating all the logic for both cases.

This seems to go back to when swizzle preservation was originally
merged in

commit d9ceb816
Author: Jesse Barnes <jbarnes@virtuousgeek.org>
Date:   Thu Oct 9 12:57:43 2014 -0700

    drm/i915: preserve swizzle settings if necessary v4

Cc: Kristian Høgsberg <hoegsberg@gmail.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>

88595ac9

drm/i915: Add initial_ prefix to bios fb takeover code · f6936e29

Daniel Vetter authored Mar 26, 2015

In spirit with

commit 5724dbd1
Author: Damien Lespiau <damien.lespiau@intel.com>
Date:   Tue Jan 20 12:51:52 2015 +0000

    drm/i915: Rename plane_config to initial_plane_config

to make it clear that this code is all special-purpose for the initial
plane takeover.

Cc: Damien Lespiau <damien.lespiau@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>

f6936e29

drm/i915: Remove duplicated psr.active unset · 3cd919fc

Rodrigo Vivi authored Mar 25, 2015

psr.active is being unset out of the if so this here is useless and
duplicated.
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

3cd919fc

drm/i915: Fixup legacy plane->crtc link for initial fb config · 5097f8c7

Daniel Vetter authored Mar 25, 2015

This is a very similar bug in the load detect code fixed in

commit 9128b040
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Tue Mar 3 17:31:21 2015 +0100

    drm/i915: Fix modeset state confusion in the load detect code

But this time around it was the initial fb code that forgot to update
the plane->crtc pointer. Otherwise it's the exact same bug, with the
exact same restrains (any set_config call/ioctl that doesn't disable
the pipe papers over the bug for free, so fairly hard to hit in normal
testing). So if you want the full explanation just go read that one
over there - it's rather long ...

Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Josh Boyer <jwboyer@fedoraproject.org>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Reported-and-tested-by: Josh Boyer <jwboyer@fedoraproject.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

5097f8c7

drm/i915: kill i915.powersave · ab585dea

Rodrigo Vivi authored Mar 24, 2015

This flag was being mostly used as a meta flag in some
cases and not covering other cases.

One of the risks is that it was masking some frontbuffer
trackings without disabling PSR.

So, better to kill this at once and avoid umbrella parameters.
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Drop unused out: label to appease gcc.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

ab585dea

25 Mar, 2015 3 commits

drm/i915: Add fault address to error state for gen8 and gen9 · 6c826f34

Mika Kuoppala authored Mar 24, 2015

The faulting virtual address is >32bits and has been moved
to different registers. Add to error state and output upper
register first, in the same line for easy reconstruction of
the fault address.

v2: correct gen masking (Michel)
v3: s/TBL/TLB (Ville)
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

6c826f34

drm/i915/skl: Fix up positive error code · 1d00dad5

Tvrtko Ursulin authored Mar 25, 2015

It should have been negative since it is returned with ERR_PTR().

Introduced in new code commit:

   commit 50470bb0
   Author: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
   Date:   Mon Mar 23 11:10:36 2015 +0000

    drm/i915/skl: Support secondary (rotated) frame buffer mapping
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

1d00dad5

drm/i915: make unsupported fb modifier message DRM_DEBUG · c0f40428

Jesse Barnes authored Mar 23, 2015

Or users can just spam the log all they want.

Issue introduced in

commit 9a8f0a12
Author: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Date:   Fri Feb 27 11:15:24 2015 +0000

    drm/i915/skl: Allow Y (and Yf) frame buffer creation

References: https://bugs.freedesktop.org/show_bug.cgi?id=89628Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

c0f40428

24 Mar, 2015 3 commits

drm/i915: Removing the drrs capability enum initialization · c2d885c6

Ramalingam C authored Mar 23, 2015

As part of allocation of the drm_i915_private variable, drrs capability
enum is initialized to DRRS_NOT_SUPPORTED. Hence need not initialize at
each connector init.

Moreover initializing this enum at connector init will reset
the successful DRRS initialization of previous connector, as we have
the DRRS support for only one panel at a time.
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

c2d885c6

drm/i915: move clearing of RPS interrupt bits from disable to reset time · 096fad9e

Imre Deak authored Mar 23, 2015

The logical place for clearing the RPS latched interrupt bits is when
resetting the RPS interrupts, so move the corresponding part from the RPS
disable function to the reset function. During resetting we already
cleared the IIR bits, so the only thing missing there was clearing pm_iir.

Note that we call gen6_disable_rps_interrupts() also during driver load
and resume time via intel_uncore_sanitize() when i915 interrupts are
still not installed. If there are any pending RPS bits at this point
(which after this patch wouldn't be cleared) they will be cleared by the
reset code via the interrupt preinstall hooks.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

096fad9e

drm/i915: fix race when clearing RPS IIR bits · 58072ccb

Imre Deak authored Mar 23, 2015

When disabling RPS interrupts there is a race where we disable RPS
inerrupts while the interrupt handler is running and the handler has
already latched the pending RPS interrupt from the master IIR register.
Afterwards the disabling path clears the PM IIR bits, making the state
of pending interrupts inconsistent from the interrupt handler's point of
view. This triggers the following warning: "The master control interrupt
lied (PM)!".

To fix this make sure that any running interrupt handler (which may
have already latched the master IIR) finishes before clearing the IIR
bits.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87347Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

58072ccb

23 Mar, 2015 8 commits

drm/i915/skl: Take 90/270 rotation into account in watermark calculations · 1fc0a8f7

Tvrtko Ursulin authored Mar 23, 2015

v2: Pass in rotation info to sprite plane updates as well.

v3: Use helper to determine 90/270 rotation. (Michel Thierry)

v4: Rebased for fb modifiers and atomic changes.

For: VIZ-4546
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com> (v3)
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

1fc0a8f7

drm/i915/skl: Query display address through a wrapper · 121920fa

Tvrtko Ursulin authored Mar 23, 2015

Need to do this in order to support 90/270 rotated display.

v2: Pass in drm_plane instead of plane index to intel_obj_display_address.

v3:
    * Renamed intel_obj_display_address to intel_plane_obj_offset.
      (Chris Wilson)
    * Simplified rotation check to bitwise AND. (Chris Wilson)

v4:
    * Extracted 90/270 rotation check into a helper function. (Michel Thierry)

v5:
    * Rebased for ggtt view changes.

For: VIZ-4545
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

121920fa

drm/i915/skl: Support secondary (rotated) frame buffer mapping · 50470bb0

Tvrtko Ursulin authored Mar 23, 2015

90/270 rotated scanout needs a rotated GTT view of the framebuffer.

This is put in a separate VMA with a dedicated ggtt view and wired such that
it is created when a framebuffer is pinned to a 90/270 rotated plane.

Rotation is only possible with Yb/Yf buffers and error is propagated to
user space in case of a mismatch.

Special rotated page view is constructed at the VMA creation time by
borrowing the DMA addresses from obj->pages.

v2:
    * Do not bother with pages for rotated sg list, just populate the DMA
      addresses. (Daniel Vetter)
    * Checkpatch cleanup.

v3:
    * Rebased on top of new plane handling (create rotated mapping when
      setting the rotation property).
    * Unpin rotated VMA on unpinning from display plane.
    * Simplify rotation check using bitwise AND. (Chris Wilson)

v4:
    * Fix unpinning of optional rotated mapping so it is really considered
      to be optional.

v5:
   * Rebased for fb modifier changes.
   * Rebased for atomic commit.
   * Only pin needed view for display. (Ville Syrjälä, Daniel Vetter)

v6:
   * Rebased after preparatory work has been extracted out. (Daniel Vetter)

v7:
   * Slightly simplified tiling geometry calculation.
   * Moved rotated GGTT view implementation into i915_gem_gtt.c (Daniel Vetter)

v8:
   * Do not use i915_gem_obj_size to get object size since that actually
     returns the size of an VMA which may not exist.
   * Rebased for ggtt view changes.

v9:
   * Rebased after code review changes on the preceding patches.
   * Tidy function definitions. (Joonas Lahtinen)

For: VIZ-4726
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com> (v4)
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

50470bb0

drm/i915: Helper function to determine GGTT view from plane state · f64b98cd

Tvrtko Ursulin authored Mar 23, 2015

For now only default implementation defaulting to normal view.

v2: Some code review cleanups. (Joonas Lahtinen)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (v2)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

f64b98cd

drm/i915: Pass in plane state when (un)pinning frame buffers · 82bc3b2d

Tvrtko Ursulin authored Mar 23, 2015

Plane state carries the rotation information which is needed for determining
the appropriate GGTT view type.

This just adds the parameter with the actual usage coming in future patches.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

82bc3b2d

drm/i915: Use GGTT view when (un)pinning objects to planes · e6617330

Tvrtko Ursulin authored Mar 23, 2015

To support frame buffer rotation we need to be able to pass on the information
on what kind of GGTT view is required for display.

This patch just adds the parameter and makes all the callers default to the
normal view.

v2: Rebased for ggtt view changes.
v3: Don't limit PIN_MAPPABLE to normal views just yet. (Joonas Lahtinen)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (v3)
[danvet: s/BUG/WARN/ in the patch hunk because. At least where the
BUG_ON isn't fatal right away.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

e6617330

drm/i915/skl: Extract tile height code into a helper function · 6761dd31

Tvrtko Ursulin authored Mar 23, 2015

It will be used in a later patch and also convert all height parameters
from int to unsigned int.

v2: Rebased for fb modifiers.
v3: Fixed v2 rebase.
v4:
   * Height should be unsigned int.
   * Make it take pixel_format for consistency and simplicity.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com> (v1)
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (v4)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

6761dd31

drm/i915: Use usleep_range() in wait_for() · 9848de08

Ville Syrjälä authored Mar 20, 2015

msleep() can sleep for way too long, so switch wait_for() to use
usleep_range() instead. Following a totally unscientific method
I just picked the range as W-2W.

This cuts the i915 init time on my BSW to almost half:
- initcall i915_init+0x0/0xa8 [i915] returned 0 after 419977 usecs
+ initcall i915_init+0x0/0xa8 [i915] returned 0 after 238419 usecs

Note that I didn't perform any other benchmarks on this so far.

Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

9848de08

20 Mar, 2015 21 commits

drm/i915: Do not leak objects after capturing error state · 0b37a9a9

Michel Thierry authored Mar 20, 2015

While running kmemleak chasing a different memleak, I saw that the
capture_error_state function was leaking some objects, for example:

unreferenced object 0xffff8800a9b72148 (size 8192):
  comm "kworker/u16:0", pid 1499, jiffies 4295201243 (age 990.096s)
  hex dump (first 32 bytes):
    00 00 04 00 00 00 00 00 5d f4 ff ff 00 00 00 00  ........].......
    00 30 b0 01 00 00 00 00 37 00 00 00 00 00 00 00  .0......7.......
  backtrace:
    [<ffffffff811e5ae4>] create_object+0x104/0x2c0
    [<ffffffff8178f50a>] kmemleak_alloc+0x7a/0xc0
    [<ffffffff811cde4b>] __kmalloc+0xeb/0x220
    [<ffffffffa038f1d9>] kcalloc.constprop.12+0x2d/0x2f [i915]
    [<ffffffffa0316064>] i915_capture_error_state+0x3f4/0x1660 [i915]
    [<ffffffffa03207df>] i915_handle_error+0x7f/0x660 [i915]
    [<ffffffffa03210f7>] i915_hangcheck_elapsed+0x2e7/0x470 [i915]
    [<ffffffff8108d574>] process_one_work+0x144/0x490
    [<ffffffff8108dfbd>] worker_thread+0x11d/0x530
    [<ffffffff81094079>] kthread+0xc9/0xe0
    [<ffffffff817a2398>] ret_from_fork+0x58/0x90
    [<ffffffffffffffff>] 0xffffffffffffffff

The following objects are allocated in i915_gem_capture_buffers, but not
released in i915_error_state_free:
  - error->active_bo_count
  - error->pinned_bo
  - error->pinned_bo_count
  - error->active_bo[vm_count] (allocated in i915_gem_capture_vm).

The leaks were introduced by
commit 95f5301d
Author: Ben Widawsky <ben@bwidawsk.net>
Date:   Wed Jul 31 17:00:15 2013 -0700

    drm/i915: Update error capture for VMs

v2: Reuse iterator and add culprit commit details (Chris)

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

0b37a9a9

drm/i915: Fix SKL sprite disable double buffer register update · 2ddc1dad

Ville Syrjälä authored Mar 19, 2015

Write the PLANE_SURF register instead of PLANE_CTL to arm the double
buffer regisrter update.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

2ddc1dad

drm/i915: Eliminate plane control register RMW from sprite code · 48fe4691

Ville Syrjälä authored Mar 19, 2015

Replace the RMW access with explicit initialization of the entire plane
control register, as was done for primary planes in:

 commit f45651ba
 Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
 Date:   Fri Aug 8 21:51:10 2014 +0300

    drm/i915: Eliminate rmw from .update_primary_plane()

The automagic primary plane disable is still doing RMWs, but that will
require more work to untangle, so leave it alone for now.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

48fe4691

drm/i915: Eliminate the RMW sprite colorkey management · 47ecbb20

Ville Syrjälä authored Mar 19, 2015

Store the colorkey in intel_plane and kill off all the RMW stuff
handling it.

This is just an intermediate step and eventually the colorkey needs to
be converted into some properties.

v2: Actually update the hardware state in the set_colorkey ioctl (Daniel)
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

47ecbb20

drm/i915: Move vblank wait determination to 'check' phase · 08fd59fc

Matt Roper authored Mar 18, 2015

Determining whether we'll need to wait for vblanks is something we
should determine during the atomic 'check' phase, not the 'commit'
phase.  Note that we only set these bits in the branch of 'check' where
intel_crtc->active is true so that we don't try to wait on a disabled
CRTC.

The whole 'wait for vblank after update' flag should go away in the
future, once we start handling watermarks in a proper atomic manner.

This regression has been introduced in

commit 2fdd7def16dd7580f297827930126c16b152ec11
Author: Matt Roper <matthew.d.roper@intel.com>
Date:   Wed Mar 4 10:49:04 2015 -0800
    drm/i915: Don't clobber plane state on internal disables

Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Root-cause-analysis-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89550
Testcase: igt/pm_rpm/legacy-planes
Testcase: igt/pm_rpm/legacy-planes-dpms
Testcase: igt/pm_rpm/universal-planes
Testcase: igt/pm_rpm/universal-planes-dpms
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Tested-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

08fd59fc

drm/i915/chv: use vlv_PLL_is_optimal in chv_find_best_dpll · 9ca3ba01

Imre Deak authored Mar 17, 2015

Prepare chv_find_best_dpll to be used for BXT too, where we want to
consider the error between target and calculated frequency too when
choosing a better PLL configuration.

No functional change.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

9ca3ba01

drm/i915: check for div-by-zero in vlv_PLL_is_optimal · 24be4e46

Imre Deak authored Mar 17, 2015

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

24be4e46

drm/i915: factor out vlv_PLL_is_optimal · d5dd62bd

Imre Deak authored Mar 17, 2015

Factor out the logic to decide whether the newly calculated dividers are
better than the best found so far. Do this for clarity and to prepare
for the upcoming BXT helper needing the same.

No functional change.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

d5dd62bd

drm/i915: Kill intel_plane->obj · bdd7554d

Ville Syrjälä authored Mar 19, 2015

intel_plane->obj is not used anymore so kill it. Also don't pass both
the fb and obj to the sprite .update_plane() hook, as just passing the fb
is enough.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

bdd7554d

drm/i915: Initialize all contexts · 6702cf16

Ben Widawsky authored Mar 16, 2015

The problem is we're going to switch to a new context, which could be
the default context. The plan was to use restore inhibit, which would be
fine, except if we are using dynamic page tables (which we will). If we
use dynamic page tables and we don't load new page tables, the previous
page tables might go away, and future operations will fault.

CTXA runs.
switch to default, restore inhibit
CTXA dies and has its address space taken away.
Run CTXB, tries to save using the context A's address space - this
fails.

The general solution is to make sure every context has it's own state,
and its own address space. For cases when we must restore inhibit, first
thing we do is load a valid address space. I thought this would be
enough, but apparently there are references within the context itself
which will refer to the old address space - therefore, we also must
reinitialize.

v2: to->ppgtt is only valid in full ppgtt.
v3: Rebased.
v4: Make post PDP update clearer.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

6702cf16

drm/i915: Track page table reload need · 563222a7

Ben Widawsky authored Mar 19, 2015

This patch was formerly known as, "Force pd restore when PDEs change,
gen6-7." I had to change the name because it is needed for GEN8 too.

The real issue this is trying to solve is when a new object is mapped
into the current address space. The GPU does not snoop the new mapping
so we must do the gen specific action to reload the page tables.

GEN8 and GEN7 do differ in the way they load page tables for the RCS.
GEN8 does so with the context restore, while GEN7 requires the proper
load commands in the command streamer. Non-render is similar for both.

Caveat for GEN7
The docs say you cannot change the PDEs of a currently running context.
We never map new PDEs of a running context, and expect them to be
present - so I think this is okay. (We can unmap, but this should also
be okay since we only unmap unreferenced objects that the GPU shouldn't
be tryingto va->pa xlate.) The MI_SET_CONTEXT command does have a flag
to signal that even if the context is the same, force a reload. It's
unclear exactly what this does, but I have a hunch it's the right thing
to do.

The logic assumes that we always emit a context switch after mapping new
PDEs, and before we submit a batch. This is the case today, and has been
the case since the inception of hardware contexts. A note in the comment
let's the user know.

It's not just for gen8. If the current context has mappings change, we
need a context reload to switch

v2: Rebased after ppgtt clean up patches. Split the warning for aliasing
and true ppgtt options. And do not break aliasing ppgtt, where to->ppgtt
is always null.

v3: Invalidate PPGTT TLBs inside alloc_va_range.

v4: Rename ppgtt_invalidate_tlbs to mark_tlbs_dirty and move
pd_dirty_rings from i915_address_space to i915_hw_ppgtt. Fixes when
neither ctx->ppgtt and aliasing_ppgtt exist.

v5: Removed references to teardown_va_range.

v6: Updated needs_pd_load_pre/post.

v7: Fix pd_dirty_rings check in needs_pd_load_post, and update/move
comment about updated PDEs to object_pin/bind (Mika).

Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

563222a7

drm/i915: Track GEN6 page table usage · 678d96fb

Ben Widawsky authored Mar 16, 2015

Instead of implementing the full tracking + dynamic allocation, this
patch does a bit less than half of the work, by tracking and warning on
unexpected conditions. The tracking itself follows which PTEs within a
page table are currently being used for objects. The next patch will
modify this to actually allocate the page tables only when necessary.

With the current patch there isn't much in the way of making a gen
agnostic range allocation function. However, in the next patch we'll add
more specificity which makes having separate functions a bit easier to
manage.

One important change introduced here is that DMA mappings are
created/destroyed at the same page directories/tables are
allocated/deallocated.

Notice that aliasing PPGTT is not managed here. The patch which actually
begins dynamic allocation/teardown explains the reasoning for this.

v2: s/pdp.page_directory/pdp.page_directories
Make a scratch page allocation helper

v3: Rebase and expand commit message.

v4: Allocate required pagetables only when it is needed, _bind_to_vm
instead of bind_vma (Daniel).

v5: Rebased to remove the unnecessary noise in the diff, also:
 - PDE mask is GEN agnostic, renamed GEN6_PDE_MASK to I915_PDE_MASK.
 - Removed unnecessary checks in gen6_alloc_va_range.
 - Changed map/unmap_px_single macros to use dma functions directly and
   be part of a static inline function instead.
 - Moved drm_device plumbing through page tables operation to its own
   patch.
 - Moved allocate/teardown_va_range calls until they are fully
   implemented (in subsequent patch).
 - Merged pt and scratch_pt unmap_and_free path.
 - Moved scratch page allocator helper to the patch that will use it.

v6: Reduce complexity by not tearing down pagetables dynamically, the
same can be achieved while freeing empty vms. (Daniel)

v7: s/i915_dma_map_px_single/i915_dma_map_single
s/gen6_write_pdes/gen6_write_pde
Prevent a NULL case when only GGTT is available. (Mika)

v8: Rebased after s/page_tables/page_table/.

v9: Reworked i915_pte_index and i915_pte_count.
Also exercise bitmap allocation here (gen6_alloc_va_range) and fix
incorrect write_page_range in i915_gem_restore_gtt_mappings (Mika).

Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v3+)
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

678d96fb

drm/i915: Extract context switch skip and add pd load logic · 317b4e90

Ben Widawsky authored Mar 16, 2015

In Gen8, PDPs are saved and restored with legacy contexts (legacy contexts
only exist on the render ring). So change the ordering of LRI vs MI_SET_CONTEXT
for the initialization of the context. Also the only cases in which we
need to manually update the PDPs are when MI_RESTORE_INHIBIT has been
set in MI_SET_CONTEXT (i.e. when the context is not yet initialized or
it is the default context).

Legacy submission is not available post GEN8, so it isn't necessary to
add extra checks for newer generations.

v2: Use new functions to replace the logic right away (Daniel)
v3: Add missing pd load logic.
v4: Add warning in case pd_load_pre & pd_load_post are true, and add
missing trace_switch_mm. Cleaned up pd_load conditions. Add more
information about when is pd_load_post needed. (Mika)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

317b4e90

drm/i915: page table generalizations · 07749ef3

Michel Thierry authored Mar 16, 2015

No functional changes, but will improve code clarity and removed some
duplicated defines.
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

07749ef3

drm/i915: Send out the full AUX address · d2d9cbbd

Ville Syrjälä authored Mar 19, 2015

AUX addresses are 20 bits long. Send out the entire address instead of
just the low 16 bits.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

d2d9cbbd

drm/i915: kerneldoc for i915_gem_shrinker.c · eb0b44ad

Daniel Vetter authored Mar 18, 2015

And remove one bogus * from i915_gem_gtt.c since that's not a
kerneldoc there.

v2: Review from Chris:
- Clarify memory space to better distinguish from address space.
- Add note that shrink doesn't guarantee the freed memory and that
  users must fall back to shrink_all.
- Explain how pinning ties in with eviction/shrinker.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

eb0b44ad

drm/i915: Extract i915_gem_shrinker.c · be6a0376

Daniel Vetter authored Mar 18, 2015

Two code changes:
- Extract i915_gem_shrinker_init.
- Inline i915_gem_object_is_purgeable since we open-code it everywhere
  else too.

This already has the benefit of pulling all the shrinker code
together, next patch adds a bit of kerneldoc.
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

be6a0376

drm/i915: Use down ei for manual Baytrail RPS calculations · 6f4b12f8

Chris Wilson authored Mar 18, 2015

Use both up/down manual ei calcuations for symmetry and greater
flexibility for reclocking, instead of faking the down interrupt based
on a fixed integer number of up interrupts.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Deepak S<deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

6f4b12f8

drm/i915: Improved w/a for rps on Baytrail · 43cf3bf0

Chris Wilson authored Mar 18, 2015

Rewrite commit 31685c25
Author: Deepak S <deepak.s@linux.intel.com>
Date:   Thu Jul 3 17:33:01 2014 -0400

    drm/i915/vlv: WA for Turbo and RC6 to work together.

Other than code clarity, the major improvement is to disable the extra
interrupts generated when idle.  However, the reclocking remains rather
slow under the new manual regime, in particular it fails to downclock as
quickly as desired. The second major improvement is that for certain
workloads, like games, we need to combine render+media activity counters
as the work of displaying the frame is split across the engines and both
need to be taken into account when deciding the global GPU frequency as
memory cycles are shared.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Deepak S <deepak.s@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Deepak S<deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

43cf3bf0

drm/i915: Relax RPS contraints to allows setting minfreq on idle · aed242ff

Chris Wilson authored Mar 18, 2015

When we idle, we set the GPU frequency to the hardware minimum (not user
minimum). We introduce a new variable to distinguish between the
different roles, and to allow easy tuning of the idle frequency without
impacting over aspects of RPS. Setting the minimum frequency should be a
safety blanket as the pcu on the GPU should be power gating itself
anyway. However, in order for us to do set the absolute minimum
frequency, we need to relax a few of our assertions that we do not
exceed the user limits.

v2: Add idle_freq
v3: Init idle_freq for vlv and add a bunch of WARNs
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Deepak S <deepak.s@linux.intel.com>
Reviewed-by: Deepak S<deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

aed242ff

drm/i915: Fallback to using CPU relocations for large batch buffers · edf4427b

Chris Wilson authored Jan 14, 2015

If the batch buffer is too large to fit into the aperture and we need a
GTT mapping for relocations, we currently fail. This only applies to a
subset of machines for a subset of environments, quite undesirable. We
can simply check after failing to insert the batch into the GTT as to
whether we only need a mappable binding for relocation and, if so, we can
revert to using a non-mappable binding and an alternate relocation
method. However, using relocate_entry_cpu() is excruciatingly slow for
large buffers on non-LLC as the entire buffer requires clflushing before
and after the relocation handling. Alternatively, we can implement a
third relocation method that only clflushes around the relocation entry.
This is still slower than updating through the GTT, so we prefer using
the GTT where possible, but is orders of magnitude faster as we
typically do not have to then clflush the entire buffer.

An alternative idea of using a temporary WC mapping of the backing store
is promising (it should be faster than using the GTT itself), but
requires fairly extensive arch/x86 support - along the lines of
kmap_atomic_prof_pfn() (which is not universally implemented even for
x86).

Testcase: igt/gem_exec_big #pnv,byt
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88392Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Add a WARN_ONCE for the impossible reloc case and explain in
a short comment why we want to avoid ping-pong.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

edf4427b