1. 15 Feb, 2012 1 commit
    • Chris Wilson's avatar
      drm/i915: Record the tail at each request and use it to estimate the head · a71d8d94
      Chris Wilson authored
      By recording the location of every request in the ringbuffer, we know
      that in order to retire the request the GPU must have finished reading
      it and so the GPU head is now beyond the tail of the request. We can
      therefore provide a conservative estimate of where the GPU is reading
      from in order to avoid having to read back the ring buffer registers
      when polling for space upon starting a new write into the ringbuffer.
      
      A secondary effect is that this allows us to convert
      intel_ring_buffer_wait() to use i915_wait_request() and so consolidate
      upon the single function to handle the complicated task of waiting upon
      the GPU. A necessary precaution is that we need to make that wait
      uninterruptible to match the existing conditions as all the callers of
      intel_ring_begin() have not been audited to handle ERESTARTSYS
      correctly.
      
      By using a conservative estimate for the head, and always processing all
      outstanding requests first, we prevent a race condition between using
      the estimate and direct reads of I915_RING_HEAD which could result in
      the value of the head going backwards, and the tail overflowing once
      again. We are also careful to mark any request that we skip over in
      order to free space in ring as consumed which provides a
      self-consistency check.
      
      Given sufficient abuse, such as a set of unthrottled GPU bound
      cairo-traces, avoiding the use of I915_RING_HEAD gives a 10-20% boost on
      Sandy Bridge (i5-2520m):
        firefox-paintball  18927ms -> 15646ms: 1.21x speedup
        firefox-fishtank   12563ms -> 11278ms: 1.11x speedup
      which is a mild consolation for the performance those traces achieved from
      exploiting the buggy autoreported head.
      
      v2: Add a few more comments and make request->tail a conservative
      estimate as suggested by Daniel Vetter.
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      [danvet: resolve conflicts with retirement defering and the lack of
      the autoreport head removal (that will go in through -fixes).]
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      a71d8d94
  2. 14 Feb, 2012 3 commits
  3. 13 Feb, 2012 5 commits
    • Sean Paul's avatar
      drm/i915: Don't lock panel registers when downclocking · 8ac5a6d5
      Sean Paul authored
      This patch replaces the locking from the downclock routines with an assert
      to ensure the registers are indeed unlocked. Without this patch, pre-SNB
      devices would lock the registers when downclocking which would cause a
      WARNING on suspend/resume with downclocking enabled.
      
      Note: To hit this bug, you need to have lvds downclocking enabled.
      Signed-off-by: default avatarSean Paul <seanpaul@chromium.org>
      Acked-by: default avatarJesse Barnes <jbarnes@virtuousgeek.org>
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      8ac5a6d5
    • Daniel Vetter's avatar
      drm/i915: fix up locking inconsistency around gem_do_init · d3ae0810
      Daniel Vetter authored
      The locking in our setup and teardown paths is rather arbitrary, but
      generally we try to protect gem stuff with dev->struct_mutex. Further,
      the ums/gem ioctl to setup gem _does_ take the look. So fix up this
      benign inconsistency.
      
      Notice while reading through code.
      
      v2: Rebased on top of the ppgtt code.
      Reviewed-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Signed-Off-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      d3ae0810
    • Daniel Vetter's avatar
      drm/i915: enable forcewake voodoo also for gen6 · 99ffa162
      Daniel Vetter authored
      We still have reports of missed irqs even on Sandybridge with the
      HWSTAM workaround in place. Testing by the bug reporter gets rid of
      them with the forcewake voodoo and no HWSTAM writes.
      
      Because I've slightly botched the rebasing I've left out the ACTHD
      readback which is also required to get IVB working. Seems to still
      work on the tester's machine, so I think we should go with the more
      minmal approach on SNB. Especially since I've only found weak evidence
      for holding forcewake while waiting for an interrupt to arrive, but
      none for the ACTHD readback.
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45181
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45332
      Tested-by: Nicolas Kalkhof nkalkhof()at()web.de
      Acked-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Signed-Off-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      99ffa162
    • Daniel Vetter's avatar
      drm/i915: fixup seqno allocation logic for lazy_request · 53d227f2
      Daniel Vetter authored
      Currently we reserve seqnos only when we emit the request to the ring
      (by bumping dev_priv->next_seqno), but start using it much earlier for
      ring->oustanding_lazy_request. When 2 threads compete for the gpu and
      run on two different rings (e.g. ddx on blitter vs. compositor)
      hilarity ensued, especially when we get constantly interrupted while
      reserving buffers.
      
      Breakage seems to have been introduced in
      
      commit 6f392d54
      Author: Chris Wilson <chris@chris-wilson.co.uk>
      Date:   Sat Aug 7 11:01:22 2010 +0100
      
          drm/i915: Use a common seqno for all rings.
      
      This patch fixes up the seqno reservation logic by moving it into
      i915_gem_next_request_seqno. The ring->add_request functions now
      superflously still return the new seqno through a pointer, that will
      be refactored in the next patch.
      
      Note that with this change we now unconditionally allocate a seqno,
      even when ->add_request might fail because the rings are full and the
      gpu died. But this does not open up a new can of worms because we can
      already leave behind an outstanding_request_seqno if e.g. the caller
      gets interrupted with a signal while stalling for the gpu in the
      eviciton paths. And with the bugfix we only ever have one seqno
      allocated per ring (and only that ring), so there are no ordering
      issues with multiple outstanding seqnos on the same ring.
      
      v2: Keep i915_gem_get_seqno (but move it to i915_gem.c) to make it
      clear that we only have one seqno counter for all rings. Suggested by
      Chris Wilson.
      
      v3: As suggested by Chris Wilson use i915_gem_next_request_seqno
      instead of ring->oustanding_lazy_request to make the follow-up
      refactoring more clearly correct. Also improve the commit message
      with issues discussed on irc.
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45181
      Tested-by: Nicolas Kalkhof nkalkhof()at()web.de
      Reviewed-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Signed-Off-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      53d227f2
    • Daniel Vetter's avatar
      drm/i915: outstanding_lazy_request is a u32 · 5391d0cf
      Daniel Vetter authored
      So don't assign it false, that's just confusing ... No functional
      change here.
      Signed-Off-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      5391d0cf
  4. 11 Feb, 2012 3 commits
  5. 10 Feb, 2012 12 commits
  6. 09 Feb, 2012 9 commits
  7. 08 Feb, 2012 5 commits
  8. 06 Feb, 2012 1 commit
  9. 03 Feb, 2012 1 commit