Commits · 5ba001783ba6451fd3db0259d30549ca1fe91870 · Kirill Smelkov / linux

03 Mar, 2016 9 commits

drm/i915: Do not wait atomically for display clocks · 5ba00178

Tvrtko Ursulin authored Mar 03, 2016

Looks like this code does not need to wait atomically since it
otherwise takes the mutex.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1457015805-23742-1-git-send-email-tvrtko.ursulin@linux.intel.com

5ba00178

drm/i915: Do not lie about atomic timeout granularity · 0351b939

Tvrtko Ursulin authored Mar 03, 2016

Currently the wait_for_atomic_us only allows for a jiffie
timeout granularity which is not nice towards callers
requesting small micro-second timeouts.

Re-implement it so micro-second timeout granularity is really
supported and not just in the name of the macro.

This has another beneficial side effect that it improves
"gem_latency -n 100" results by approximately 2.5% (throughput
and latencies) and 3% (CPU usage). (Note this improvement is
relative to not yet merged execlist lock uncontention patch
which moves the CSB MMIO outside this lock.)

It also shrinks some hot functions like fw_domains_get by a
tiny 3%.

v2:
  * Warn when used from non-atomic context (if possible).
  * Warn on too long atomic waits.

v3:
  * Added comment explaining CONFIG_PREEMPT_COUNT.
  * Fixed pre-processor indentation.
  (Chris Wilson)

v4:
 * Commit msg update (gem_latency) and rebase.

v5:
 * Commit message re-wording.
 * Added comment about no need for double cond check. (Chris Wilson)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

0351b939

drm/i915: Kconfig for extra driver debugging · 643a24b6

Tvrtko Ursulin authored Mar 03, 2016

v2: Added a submenu based on an idea by Chris Wilson.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

643a24b6

drm/i915/lrc: Do not wait atomically when stopping engines · 8de1b23e

Tvrtko Ursulin authored Mar 03, 2016

I do not see that this needs to be done atomically and up to
one second is quite a long time to busy loop.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

8de1b23e

drm/i915: Add wait_for_us · 3f177625

Tvrtko Ursulin authored Mar 03, 2016

This is for callers who want micro-second precision but are not
waiting from the atomic context.

v2:
  * Fix atomic waits. (Dave Gordon)
  * Use USEC_PER_SEC and USEC_PER_MSEC. (Chris Wilson)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Dave Gordon <david.s.gordon@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

3f177625

drm/i915/bxt: Additional MIPI clock divider form B0 stepping onwards · 782d25ca

Deepak M authored Feb 15, 2016

The MIPI clock calculations for the addtional clock
are revised from B0 stepping onwards, the bit definitions
have changed compared to old stepping.

v2: Fixing compilation warning.
v3: Retained the old Macros (Jani)
Signed-off-by: Deepak M <m.deepak@intel.com>
Tested-by: Ramalingam C <ramalingam.c@intel.com> # BXT-T with Tianma panel
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1455556437-29267-1-git-send-email-m.deepak@intel.com

782d25ca

drm/i915: Only recalculate wm's for planes part of the state, v2. · e3bddded

Maarten Lankhorst authored Mar 01, 2016

Only planes that are part of the state should be used for recalculating
watermarks. For planes not part of the state the previous patch allows
us to re-use the old values since they're calculated even for levels
that are not actively used.

Changes since v1:
- Remove big if from intel_crtc_atomic_check.
- Remove extra newline.
- Remove memset in ilk_compute_pipe_wm.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1456826842-32553-2-git-send-email-maarten.lankhorst@linux.intel.comReviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>

e3bddded

drm/i915: Allow preservation of watermarks, v2. · d81f04c5

Maarten Lankhorst authored Mar 02, 2016

As Paulo has noted we can help bisectability by separating computing
watermarks on a noop in 2 separate commits.

This patch no longer clears the crtc watermark state, but recalculates
it completely. Regardless whether a level is used the full values for
each level are calculated. If a level is invalid wm[level].enable is
unset.

Changes since v1:
- Only call ilk_validate_wm_level when level <= usable_level. (Ville)
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/56D6D09E.5040007@linux.intel.comReviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>

d81f04c5

drm/i915: Handle invalid ilk pipe watermarks correctly. · 1a426d61

Maarten Lankhorst authored Mar 02, 2016

This function returns an int, but when ilk_validate_pipe_wm fails it
returns false, which is 0 (success). As a result invalid watermarks
are applied, while they should have been rejected.

Fix this by returning -EINVAL.

Fixes: ed4a6a7c ("drm/i915: Add two-stage ILK-style watermark programming (v11)")
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1456918563-28696-1-git-send-email-maarten.lankhorst@linux.intel.comReviewed-by: Matt Roper <matthew.d.roper@intel.com>

1a426d61

02 Mar, 2016 3 commits

drm/i915: Hold RPM reference while setting freq limits through sysfs · 933bfb44

Sagar Arun Kamble authored Feb 08, 2016

This changes ensures device is active when frequency limits are changed.
This is needed as we are writing to register RPNSWREQ in intel_set_rps.
If not done, might lead to undesired errors like:
[ 1965.189137] [drm:fw_domains_get] *ERROR* blitter: timed out waiting for forcewake ack to clear.

v2: Added elaborate commit message. (Jani)
    Fixing RPM reference drop in early exit paths. (Ville)
Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1454951831-11778-1-git-send-email-sagar.a.kamble@intel.comSigned-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>

933bfb44

drm/i915: Avoid snooping with userptr where not supported · ca377809

Tvrtko Ursulin authored Mar 02, 2016

   commit e5756c10
   Author: Imre Deak <imre.deak@intel.com>
   Date:   Fri Aug 14 18:43:30 2015 +0300

       drm/i915/bxt: don't allow cached GEM mappings on A stepping

Added an exception of disallowing snooping for Broxton A
stepping hardware but userptr was still enabling it regardless.

Move the check to HAS_SNOOP now that it is used from multiple
call sites and use it.

v2: Userptr cannot be supported when it cannot be coherent and
    generalize the code better. (Chris Wilson)

v3: Make has_snoop true only when !has_llc. (Chris Wilson)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1456920631-34302-1-git-send-email-tvrtko.ursulin@linux.intel.com

ca377809

drm/i915: Do not return unknown status when load detection is tested. · 32fff610

Maarten Lankhorst authored Mar 01, 2016

This fixes the IGT test, which interprets unknown status as failed to
acquire load detect pipe.

Cc: Gabriel Feceoru <gabriel.feceoru@intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1456848241-6431-1-git-send-email-maarten.lankhorst@linux.intel.comReviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>

32fff610

01 Mar, 2016 23 commits

drm/i915/gen9: Remove state asserts when disabling DC states · 5b773eb4

Imre Deak authored Feb 29, 2016

Disabling the DC states when it's already disabled is a valid scenario,
for example during HW state sanitization during driver loading and
resuming or when DC states are disabled via the i915.enable_dc or
disable_power_well option.

CC: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1456778945-5411-4-git-send-email-imre.deak@intel.com

5b773eb4

drm/i915/gen9: Disable DC states if power well support is disabled · 66e2c4c3

Imre Deak authored Feb 29, 2016

If power well support is disabled via the i915.disable_power_well module
option we should never enable DC states. Currently we would enable DC
states even in this case during system suspend, where we need to disable
all power wells regardless of the disable_power_well option.

66e2c4c3

drm/i915/gen9: Sanitize handling of allowed DC states · a37baf3b

Imre Deak authored Feb 29, 2016

We can simplify the conditions selecting the target DC state during
runtime by calculating the allowed DC states in advance during driver
loading. This also makes it easier to disable DC states depending on the
i915.disable_power_well module option, added in the next patch.

v2:
- Print a debug message if the requested max DC value was adjusted due
  to a platform limit. Also debug print the calculated mask value. (Patrik)

CC: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1456778945-5411-2-git-send-email-imre.deak@intel.com

a37baf3b

drm/i915/skl: Fix power domain suspend sequence · 2622d79b

Imre Deak authored Feb 29, 2016

During system suspend we need to first disable power wells then
unitialize the display core. In case power well support is disabled we
did this in the wrong order, so fix this up.

Fixes: d314cd43 ("drm/i915: fix handling of the disable_power_well module option")
CC: stable@vger.kernel.org
CC: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1456778945-5411-1-git-send-email-imre.deak@intel.com

2622d79b

drm/i915/error: Capture WA ctx batch in error state · f85db059

arun.siluvery@linux.intel.com authored Mar 01, 2016

execute during context save/restore, good to have them in error state.

v2: use wa_ctx->size and print only size values (Mika)
v3: simplify conditions when recording and freeing object (Chris)
v4: resolve checkpatch errors (Tvrtko)

Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1456831476-10782-1-git-send-email-arun.siluvery@linux.intel.com

f85db059

drm/i915: Try to fix CRT port clock limits · debded84

Ville Syrjälä authored Feb 17, 2016

LPT/WPT-H are limited to max 180 MHz CRT dotclock. Most other platforms
have a limit of 350 MHz. Supposedly gen3 and gen4 go up to 400 MHz.

VLV is a bit special since the docs are poor. Supposedly the DAC
would be good up to 355 MHz, but currently we limit the DPLL to
270 MHz, so we'll have to limit the port clock to the same unless
we change the DPLL limits.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1455738073-14502-7-git-send-email-ville.syrjala@linux.intel.comReviewed-by: Imre Deak <imre.deak@intel.com>

debded84

drm/i915: Read out VGA dotclock properly on LPT · 8802e5b6

Ville Syrjälä authored Feb 17, 2016

Rather than assume the VGA dotclock is really the FDI based thing,
let's read out the real thing via iclkip, and after readout it'll
get to compare it with the FDI based number to make sure they're
in sync.

Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1455738073-14502-6-git-send-email-ville.syrjala@linux.intel.comReviewed-by: Imre Deak <imre.deak@intel.com>

8802e5b6

drm/i915: Make the LPT iclkip 20MHz case more generic · 64b46a06

Ville Syrjälä authored Feb 17, 2016

The reason for spcial casing 20MHz in the iclkip calculations is that
it would overflow the 7 bit divisor value. Let's rewrite the special
case to check for just that, and bump up auxdiv when needed. This makes
the code work for freqeuencies close to but not exactly 20MHz. The real
lower limit for auxdiv=0 is actually:
172800000/(0x7f+2)*64)=~20930 kHz, and below that we must resort to
auxdiv=1.

Actually this is all very theoretical since we limit the dotclock to
min 25MHz with CRT on all platforms. 25Mhz is actually the documented
limit in Bspec, so it seems we ought to never need to worry about the
auxdiv=1 case. But no harm in having it.

Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1455738073-14502-5-git-send-email-ville.syrjala@linux.intel.comReviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>

64b46a06

drm/i915: Remove the SPLL==270Mhz assumption from intel_fdi_link_freq() · 21a727b3

Ville Syrjälä authored Feb 17, 2016

Instead of assuming we've correctly set up SPLL to run at 270Mhz for
FDI, let's use the port_clock from pipe_config which should be what
we want. This would catch problems if someone misconfigures SPLL for
whatever reason.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1455738073-14502-4-git-send-email-ville.syrjala@linux.intel.comReviewed-by: Imre Deak <imre.deak@intel.com>

21a727b3

drm/i915: Move the encoder vs. FDI dotclock check out from encoder .get_config() · e3b247da

Ville Syrjälä authored Feb 17, 2016

Currently we check if the encoder's idea of dotclock agrees with what
we calculated based on the FDI parameters. We do this in the encoder
.get_config() hooks, which isn't so nice in case the BIOS (or some other
outside party) made a mess of the state and we're just trying to take
over.

So as a prep step to being able sanitize such a bogus state, move the
the sanity check to just after we've read out the entire state. If
we then need to sanitize a bad state, it should be easier to move the
sanity check to occur after sanitation instead of before it.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1455738073-14502-3-git-send-email-ville.syrjala@linux.intel.comReviewed-by: Imre Deak <imre.deak@intel.com>

e3b247da

drm/i915: Dump ddi_pll_sel in hex instead of decimal on HSW/BDW · 1260f07e

Ville Syrjälä authored Feb 17, 2016

On HSW/BDW ddi_pll_sel is the actual register value. Let's dump
it in hex so that people migth actually understand what it says.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1455738073-14502-2-git-send-email-ville.syrjala@linux.intel.comReviewed-by: Imre Deak <imre.deak@intel.com>

1260f07e

drm/i915: Embed rotation_info under intel_framebuffer · 2d7a215f

Ville Syrjälä authored Feb 15, 2016

Instead of repopulatin the rotation_info struct for the fb every time
we try to use the fb, we can just populate it once when creating the fb,
and later we can just copy the pre-populate struct into the gtt_view.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1455569699-27905-10-git-send-email-ville.syrjala@linux.intel.com

2d7a215f

drm/i915: Move the NULL sg handling out from rotate_pages() · 11f20322

Ville Syrjälä authored Feb 15, 2016

rotate_pages() checks to see if it got called with a NULL sg, and then
goes to extract it from sg->sgl. It always gets called with a NULL sg
for the first plane, so moving the initial 'sg=st->sgl' assignment out
into intel_rotate_fb_obj_pages() seems less special-casey.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1455569699-27905-9-git-send-email-ville.syrjala@linux.intel.com

11f20322

drm/i915: Reorganize intel_rotation_info · 1663b9d6

Ville Syrjälä authored Feb 15, 2016

Throw out a bunch of unnecessary stuff from struct intel_rotation_info,
and pull most of the remaining stuff to live under an array of
per-color plane sub-structures.

What still remains outside the sub-structure will be reorgranized later
as well, but that requires more work elsewhere so leave it be for now.

v2: Split the vma size == luma+chroma size fix to prep patch (Daniel)
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> (v1)
Link: http://patchwork.freedesktop.org/patch/msgid/1455569699-27905-8-git-send-email-ville.syrjala@linux.intel.com

1663b9d6

drm/i915: Pass drm_frambuffer to intel_compute_page_offset() · 4f2d9934

Ville Syrjälä authored Feb 15, 2016

intel_compute_page_offsets() gets passed a bunch of the framebuffer
metadate sepearately. Just pass the framebuffer itself to make life
simpler for the caller, and make it less likely they would make a
mistake in the order of the arguments (as most as just unsigned ints and
such).

We still pass the pitch explicitly since for 90/270 degree rotation
the caller has to pass in the right thing.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1455569699-27905-7-git-send-email-ville.syrjala@linux.intel.com

4f2d9934

drm/i915: Don't pass plane+plane_state to intel_pin_and_fence_fb_obj() · 3465c580

Ville Syrjälä authored Feb 15, 2016

intel_pin_and_fence_fb_obj() only needs the framebuffer, and the desird
rotation (to find the right GTT view for it), so no need to pass all
kinds of plane stuff.

The main motivation is to get rid of the uggy NULL plane_state handling
due to fbdev.

v2: Add a note why I really want this
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Grumpily-Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1455569699-27905-6-git-send-email-ville.syrjala@linux.intel.com

3465c580

drm/i915: Support for extra alignment for tiled surfaces · 29cf9491

Ville Syrjälä authored Feb 15, 2016

SKL+ needs >4K alignment for tiled surfaces, so make
intel_compute_page_offset() handle it.

The way we do it is first we compute the closest tile boundary
as before, and then figure out how many tiles we need to go
to reach the desired alignment. The difference in the offset
is then added into the x/y offsets.

v2: Be less confusing wrt. units (pixels vs. bytes) (Daniel)
v3: Use u32 for offsets
    Have intel_adjust_tile_offset() return the new offset (will be
    useful later)
    Add an offset_aligned variable (Daniel)
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1455569699-27905-5-git-send-email-ville.syrjala@linux.intel.comReviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

29cf9491

drm/i915: Pass 90/270 vs. 0/180 rotation info for intel_gen4_compute_page_offset() · 8d0deca8

Ville Syrjälä authored Feb 15, 2016

The page aligned surface address calculation needs to know which way
things are rotated. The contract now says that the caller must pass the
rotate x/y coordinates, as well as the tile_height aligned stride in
the tile_height direction. This will make it fairly simple to deal with
90/270 degree rotation on SKL+ where we have to deal with the rotated
view into the GTT.

v2: Pass rotation instead of bool even thoughwe only care about 0/180 vs. 90/270
v3: Introduce intel_tile_dims(), and don't mix up different units so much
v4: Unconfuse bytes vs. pixels even more
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1455569699-27905-4-git-send-email-ville.syrjala@linux.intel.comReviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

8d0deca8

drm/i915: s/tile_width/tile_width_bytes/ · 27ba3910

Ville Syrjälä authored Feb 15, 2016

Make if clear whether we're talking tile widths in bytes or in pixels.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1455569699-27905-3-git-send-email-ville.syrjala@linux.intel.comReviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

27ba3910

drm/i915: Account for the size of the chroma plane for the rotated gtt view · 9106cf17

Ville Syrjälä authored Feb 15, 2016

The size of the rotated ggtt mapping ought to include the size of the
chroma plane as well. Not a huge deal since we don't expose NV12 (or any
pother planar format for that matter) yet.

Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Fixes: 89e3e142 ("drm/i915: Support NV12 in rotated GGTT mapping")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1455569699-27905-2-git-send-email-ville.syrjala@linux.intel.comReviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

9106cf17

drm/i915: Add missing NULL check before calling initial_watermarks · 1d5bf5d9

Imre Deak authored Feb 29, 2016

Not all platforms set this callback, so NULL check it before calling it.

v2:
- Call intel_update_watermarks() on HSW+ where the callback is not set.
  (Matt)

CC: Matt Roper <matthew.d.roper@intel.com>
Fixes: commit ed4a6a7c ("drm/i915: Add two-stage ILK-style watermark programming (v11)")
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1456776633-3401-1-git-send-email-imre.deak@intel.comReviewed-by: Matt Roper <matthew.d.roper@intel.com>

1d5bf5d9

drm/i915: Execlists small cleanups and micro-optimisations · c6a2ac71

Tvrtko Ursulin authored Feb 26, 2016

Assorted changes in the areas of code cleanup, reduction of
invariant conditional in the interrupt handler and lock
contention and MMIO access optimisation.

 * Remove needless initialization.
 * Improve cache locality by reorganizing code and/or using
   branch hints to keep unexpected or error conditions out
   of line.
 * Favor busy submit path vs. empty queue.
 * Less branching in hot-paths.

v2:

 * Avoid mmio reads when possible. (Chris Wilson)
 * Use natural integer size for csb indices.
 * Remove useless return value from execlists_update_context.
 * Extract 32-bit ppgtt PDPs update so it is out of line and
   shared with two callers.
 * Grab forcewake across all mmio operations to ease the
   load on uncore lock and use chepear mmio ops.

v3:

 * Removed some more pointless u8 data types.
 * Removed unused return from execlists_context_queue.
 * Commit message updates.

v4:
 * Unclumsify the unqueue if statement. (Chris Wilson)
 * Hide forcewake from the queuing function. (Chris Wilson)

Version 3 now makes the irq handling code path ~20% smaller on
48-bit PPGTT hardware, and a little bit less elsewhere. Hot
paths are mostly in-line now and hammering on the uncore
spinlock is greatly reduced together with mmio traffic to an
extent.

Benchmarking with "gem_latency -n 100" (keep submitting
batches with 100 nop instruction) shows approximately 4% higher
throughput, 2% less CPU time and 22% smaller latencies. This was
on a big-core while small-cores could benefit even more.

Most likely reason for the improvements are the MMIO
optimization and uncore lock traffic reduction.

One odd result is with "gem_latency -n 0" (dispatching empty
batches) which shows 5% more throughput, 8% less CPU time,
25% better producer and consumer latencies, but 15% higher
dispatch latency which is yet unexplained.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1456505912-22286-1-git-send-email-tvrtko.ursulin@linux.intel.com

c6a2ac71

drm/i915: Handle -EDEADLK in drm_atomic_commit from load-detect. · 3ba86073

Maarten Lankhorst authored Feb 29, 2016

CI runs with DEBUG_WW_MUTEX_SLOWPATH, so -EDEADLK occurs a lot more.
Handle the case where drm_atomic_commit fails with -EDEADLK correctly.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/56D3FEF1.6070306@linux.intel.comReviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>

3ba86073

29 Feb, 2016 4 commits

drm/i915: remove dead code · d9f8e52b

Eric Engestrom authored Feb 29, 2016

79e53945 ("DRM: i915: add mode setting
support") added those variables but never used them.
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1456763047-28828-2-git-send-email-eric.engestrom@imgtec.com

d9f8e52b

drm/i915: remove left over dead code · c3454d57

Eric Engestrom authored Feb 29, 2016

ae80152d ("drm/i915: Rewrite VLV/CHV
watermark code") removed everything that would have used those vars.
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1456763047-28828-1-git-send-email-eric.engestrom@imgtec.com

c3454d57

drm/i915: Add two-stage ILK-style watermark programming (v11) · ed4a6a7c

Matt Roper authored Feb 23, 2016

In addition to calculating final watermarks, let's also pre-calculate a
set of intermediate watermark values at atomic check time.  These
intermediate watermarks are a combination of the watermarks for the old
state and the new state; they should satisfy the requirements of both
states which means they can be programmed immediately when we commit the
atomic state (without waiting for a vblank).  Once the vblank does
happen, we can then re-program watermarks to the more optimal final
value.

v2: Significant rebasing/rewriting.

v3:
 - Move 'need_postvbl_update' flag to CRTC state (Daniel)
 - Don't forget to check intermediate watermark values for validity
   (Maarten)
 - Don't due async watermark optimization; just do it at the end of the
   atomic transaction, after waiting for vblanks.  We do want it to be
   async eventually, but adding that now will cause more trouble for
   Maarten's in-progress work.  (Maarten)
 - Don't allocate space in crtc_state for intermediate watermarks on
   platforms that don't need it (gen9+).
 - Move WaCxSRDisabledForSpriteScaling:ivb into intel_begin_crtc_commit
   now that ilk_update_wm is gone.

v4:
 - Add a wm_mutex to cover updates to intel_crtc->active and the
   need_postvbl_update flag.  Since we don't have async yet it isn't
   terribly important yet, but might as well add it now.
 - Change interface to program watermarks.  Platforms will now expose
   .initial_watermarks() and .optimize_watermarks() functions to do
   watermark programming.  These should lock wm_mutex, copy the
   appropriate state values into intel_crtc->active, and then call
   the internal program watermarks function.

v5:
 - Skip intermediate watermark calculation/check during initial hardware
   readout since we don't trust the existing HW values (and don't have
   valid values of our own yet).
 - Don't try to call .optimize_watermarks() on platforms that don't have
   atomic watermarks yet.  (Maarten)

v6:
 - Rebase

v7:
 - Further rebase

v8:
 - A few minor indentation and line length fixes

v9:
 - Yet another rebase since Maarten's patches reworked a bunch of the
   code (wm_pre, wm_post, etc.) that this was previously based on.

v10:
 - Move wm_mutex to dev_priv to protect against racing commits against
   disjoint CRTC sets. (Maarten)
 - Drop unnecessary clearing of cstate->wm.need_postvbl_update (Maarten)

v11:
 - Now that we've moved to atomic watermark updates, make sure we call
   the proper function to program watermarks in
   {ironlake,haswell}_crtc_enable(); the failure to do so on the
   previous patch iteration led to us not actually programming the
   watermarks before turning on the CRTC, which was the cause of the
   underruns that the CI system was seeing.
 - Fix inverted logic for determining when to optimize watermarks.  We
   were needlessly optimizing when the intermediate/optimal values were
   the same (harmless), but not actually optimizing when they differed
   (also harmless, but wasteful from a power/bandwidth perspective).

Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1456276813-5689-1-git-send-email-matthew.d.roper@intel.com

ed4a6a7c

drm/i915: Update DRIVER_DATE to 20160229 · 5790ff74
Daniel Vetter authored Feb 29, 2016
```
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
```
5790ff74

26 Feb, 2016 1 commit

drm/i915: Execlists cannot pin a context without the object · 2743179d

Chris Wilson authored Feb 26, 2016

Given that the intel_lr_context_pin cannot succeed without the object,
we cannot reach intel_lr_context_unpin() without first allocating that
object - so we can remove the redundant test.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1456485751-15213-1-git-send-email-tvrtko.ursulin@linux.intel.com

2743179d