1. 26 Mar, 2015 5 commits
  2. 25 Mar, 2015 3 commits
  3. 24 Mar, 2015 3 commits
  4. 23 Mar, 2015 8 commits
  5. 20 Mar, 2015 21 commits
    • drm/i915: Do not leak objects after capturing error state · 0b37a9a9
      Michel Thierry authored
      While running kmemleak chasing a different memleak, I saw that the
      capture_error_state function was leaking some objects, for example:
      
      unreferenced object 0xffff8800a9b72148 (size 8192):
        comm "kworker/u16:0", pid 1499, jiffies 4295201243 (age 990.096s)
        hex dump (first 32 bytes):
          00 00 04 00 00 00 00 00 5d f4 ff ff 00 00 00 00  ........].......
          00 30 b0 01 00 00 00 00 37 00 00 00 00 00 00 00  .0......7.......
        backtrace:
          [<ffffffff811e5ae4>] create_object+0x104/0x2c0
          [<ffffffff8178f50a>] kmemleak_alloc+0x7a/0xc0
          [<ffffffff811cde4b>] __kmalloc+0xeb/0x220
          [<ffffffffa038f1d9>] kcalloc.constprop.12+0x2d/0x2f [i915]
          [<ffffffffa0316064>] i915_capture_error_state+0x3f4/0x1660 [i915]
          [<ffffffffa03207df>] i915_handle_error+0x7f/0x660 [i915]
          [<ffffffffa03210f7>] i915_hangcheck_elapsed+0x2e7/0x470 [i915]
          [<ffffffff8108d574>] process_one_work+0x144/0x490
          [<ffffffff8108dfbd>] worker_thread+0x11d/0x530
          [<ffffffff81094079>] kthread+0xc9/0xe0
          [<ffffffff817a2398>] ret_from_fork+0x58/0x90
          [<ffffffffffffffff>] 0xffffffffffffffff
      
      The following objects are allocated in i915_gem_capture_buffers, but not
      released in i915_error_state_free:
        - error->active_bo_count
        - error->pinned_bo
        - error->pinned_bo_count
        - error->active_bo[vm_count] (allocated in i915_gem_capture_vm).
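
The leak pattern listed above can be illustrated with a minimal userspace sketch. The struct and the functions `error_state_alloc` and `error_state_free` are hypothetical stand-ins, not the i915 definitions; the sketch only shows that every per-VM array allocated at capture time needs a matching free.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the captured error state. */
struct error_state {
    unsigned int vm_count;
    int **active_bo;               /* one buffer list per VM */
    int **pinned_bo;               /* per-VM slots, left NULL in this sketch */
    unsigned int *active_bo_count; /* one counter per VM */
    unsigned int *pinned_bo_count; /* one counter per VM */
};

/* Release everything error_state_alloc() may have allocated. */
void error_state_free(struct error_state *e)
{
    unsigned int i;

    assert(e != NULL);
    if (e->active_bo) {
        for (i = 0; i < e->vm_count; i++)
            free(e->active_bo[i]);  /* the per-VM lists */
    }
    free(e->active_bo);
    free(e->pinned_bo);
    free(e->active_bo_count);
    free(e->pinned_bo_count);
    memset(e, 0, sizeof(*e));
}

/* Returns 0 on success, -1 on failure; nothing stays allocated on failure. */
int error_state_alloc(struct error_state *e, unsigned int vm_count,
                      size_t bo_per_vm)
{
    unsigned int i;

    memset(e, 0, sizeof(*e));
    if (vm_count == 0 || bo_per_vm == 0)
        return -1;

    e->vm_count = vm_count;
    e->active_bo = calloc(vm_count, sizeof(*e->active_bo));
    e->pinned_bo = calloc(vm_count, sizeof(*e->pinned_bo));
    e->active_bo_count = calloc(vm_count, sizeof(*e->active_bo_count));
    e->pinned_bo_count = calloc(vm_count, sizeof(*e->pinned_bo_count));
    if (!e->active_bo || !e->pinned_bo ||
        !e->active_bo_count || !e->pinned_bo_count)
        goto fail;

    for (i = 0; i < vm_count; i++) {
        e->active_bo[i] = calloc(bo_per_vm, sizeof(**e->active_bo));
        if (!e->active_bo[i])
            goto fail;
        e->active_bo_count[i] = (unsigned int)bo_per_vm;
    }
    return 0;

fail:
    error_state_free(e);
    return -1;
}
```

Keeping the free routine symmetric with the allocator, and reusing it on the failure path, makes a forgotten array easier to spot in review.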
      
      The leaks were introduced by
      commit 95f5301d
      Author: Ben Widawsky <ben@bwidawsk.net>
      Date:   Wed Jul 31 17:00:15 2013 -0700
      
          drm/i915: Update error capture for VMs
      
      v2: Reuse iterator and add culprit commit details (Chris)
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Michel Thierry <michel.thierry@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • drm/i915: Fix SKL sprite disable double buffer register update · 2ddc1dad
      Ville Syrjälä authored
      Write the PLANE_SURF register instead of PLANE_CTL to arm the double
      buffer register update.
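
A toy model may help show why the arming write matters. Everything here is an assumption made for illustration: the struct, the function names, and the simplification that staged values latch at the next vblank once the surface register has been written. It is not the driver code.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of a double-buffered plane; all names are illustrative. */
struct plane_model {
    uint32_t ctl_staged;   /* last value written to the control register */
    uint32_t surf_staged;  /* last value written to the surface register */
    uint32_t ctl_live;     /* value the "hardware" is currently using */
    uint32_t surf_live;
    int armed;             /* set only by a surface register write */
};

/* A control register write is staged but does not arm the update. */
void write_plane_ctl(struct plane_model *p, uint32_t val)
{
    assert(p != 0);
    p->ctl_staged = val;
}

/* A surface register write stages the value and arms the update. */
void write_plane_surf(struct plane_model *p, uint32_t val)
{
    assert(p != 0);
    p->surf_staged = val;
    p->armed = 1;
}

/* At vblank, staged values take effect only if the update was armed. */
void vblank(struct plane_model *p)
{
    if (!p->armed)
        return;
    p->ctl_live = p->ctl_staged;
    p->surf_live = p->surf_staged;
    p->armed = 0;
}

/* Disable sequence in this model: clear control, then arm via surface. */
void disable_plane(struct plane_model *p)
{
    write_plane_ctl(p, 0);
    write_plane_surf(p, 0);
}
```

In this model, a disable that writes only the control register never reaches the live state, which is the kind of failure the fix addresses.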
      Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • drm/i915: Eliminate plane control register RMW from sprite code · 48fe4691
      Ville Syrjälä authored
      Replace the RMW access with explicit initialization of the entire plane
      control register, as was done for primary planes in:
      
       commit f45651ba
       Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
       Date:   Fri Aug 8 21:51:10 2014 +0300
      
          drm/i915: Eliminate rmw from .update_primary_plane()
      
      The automagic primary plane disable is still doing RMWs, but that will
      require more work to untangle, so leave it alone for now.
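
The difference between the two styles can be sketched as follows. The bit layout and the names `plane_ctl_rmw` and `plane_ctl_explicit` are hypothetical and do not match the real plane control register; the sketch only shows how a read-modify-write carries over bits it does not know about, while explicit initialization starts from zero.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit layout; NOT the real plane control register. */
#define CTL_ENABLE      (1u << 31)
#define CTL_FORMAT_MASK (0xfu << 24)
#define CTL_FORMAT(x)   (((uint32_t)(x) << 24) & CTL_FORMAT_MASK)
#define CTL_ROTATE_180  (1u << 15)
#define CTL_TILED       (1u << 10)

struct plane_params {
    unsigned int format;
    int tiled;
    int rotate_180;
};

/* RMW style: patch the fields we know about on top of the old value. */
uint32_t plane_ctl_rmw(uint32_t old, const struct plane_params *p)
{
    uint32_t val = old;

    assert(p->format <= 0xf);
    val &= ~(CTL_FORMAT_MASK | CTL_TILED);
    val |= CTL_ENABLE | CTL_FORMAT(p->format);
    if (p->tiled)
        val |= CTL_TILED;
    return val; /* every other bit of 'old' is carried over unchanged */
}

/* Explicit style: build the whole register value from the parameters. */
uint32_t plane_ctl_explicit(const struct plane_params *p)
{
    uint32_t val = CTL_ENABLE | CTL_FORMAT(p->format);

    assert(p->format <= 0xf);
    if (p->tiled)
        val |= CTL_TILED;
    if (p->rotate_180)
        val |= CTL_ROTATE_180;
    return val;
}
```

With a stale rotation bit in the old register value, the RMW variant keeps it set even though the requested state has no rotation; the explicit variant cannot inherit it.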
      Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • drm/i915: Eliminate the RMW sprite colorkey management · 47ecbb20
      Ville Syrjälä authored
      Store the colorkey in intel_plane and kill off all the RMW stuff
      handling it.
      
      This is just an intermediate step and eventually the colorkey needs to
      be converted into some properties.
      
      v2: Actually update the hardware state in the set_colorkey ioctl (Daniel)
      Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • drm/i915: Move vblank wait determination to 'check' phase · 08fd59fc
      Matt Roper authored
      Whether we'll need to wait for vblanks should be determined during the
      atomic 'check' phase, not the 'commit' phase.  Note that we only set
      these bits in the branch of 'check' where intel_crtc->active is true so
      that we don't try to wait on a disabled CRTC.
      
      The whole 'wait for vblank after update' flag should go away in the
      future, once we start handling watermarks in a proper atomic manner.
      
      This regression was introduced in
      
      commit 2fdd7def16dd7580f297827930126c16b152ec11
      Author: Matt Roper <matthew.d.roper@intel.com>
      Date:   Wed Mar 4 10:49:04 2015 -0800
          drm/i915: Don't clobber plane state on internal disables
      
      Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
      Root-cause-analysis-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89550
      Testcase: igt/pm_rpm/legacy-planes
      Testcase: igt/pm_rpm/legacy-planes-dpms
      Testcase: igt/pm_rpm/universal-planes
      Testcase: igt/pm_rpm/universal-planes-dpms
      Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
      Tested-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • drm/i915/chv: use vlv_PLL_is_optimal in chv_find_best_dpll · 9ca3ba01
      Imre Deak authored
      Prepare chv_find_best_dpll to be used for BXT too, where we also want to
      consider the error between the target and calculated frequency when
      choosing a better PLL configuration.
      
      No functional change.
      Signed-off-by: Imre Deak <imre.deak@intel.com>
      Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • drm/i915: factor out vlv_PLL_is_optimal · d5dd62bd
      Imre Deak authored
      Factor out the logic to decide whether the newly calculated dividers are
      better than the best found so far. Do this for clarity and to prepare
      for the upcoming BXT helper needing the same.
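
A predicate of this kind might look like the sketch below. It is an assumption-laden illustration, not the factored-out helper itself: the name `pll_is_better`, the reduced `struct pll_clock`, and the 100 ppm and 10 ppm thresholds are all chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Reduced, hypothetical clock description: resulting dot clock and P divider. */
struct pll_clock {
    int dot;
    int p;
};

/*
 * Decide whether 'calc' should replace 'best'. The frequency error is
 * expressed in parts per million of the target. Thresholds are illustrative.
 */
bool pll_is_better(int target_freq, const struct pll_clock *calc,
                   const struct pll_clock *best,
                   unsigned int best_error_ppm, unsigned int *error_ppm)
{
    uint64_t diff;

    assert(target_freq > 0);
    diff = (uint64_t)llabs((long long)target_freq - calc->dot);
    *error_ppm = (unsigned int)(1000000ULL * diff / (uint64_t)target_freq);

    /* When the error is already small, prefer the larger P divider. */
    if (*error_ppm < 100 && calc->p > best->p) {
        *error_ppm = 0;
        return true;
    }

    /* Otherwise require a clear improvement in error. */
    return *error_ppm + 10 < best_error_ppm;
}
```

Pulling the comparison into one function lets several divider search loops share the same notion of "better" without duplicating the thresholds.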
      
      No functional change.
      Signed-off-by: Imre Deak <imre.deak@intel.com>
      Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • drm/i915: Kill intel_plane->obj · bdd7554d
      Ville Syrjälä authored
      intel_plane->obj is not used anymore so kill it. Also don't pass both
      the fb and obj to the sprite .update_plane() hook, as just passing the fb
      is enough.
      Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • drm/i915: Initialize all contexts · 6702cf16
      Ben Widawsky authored
      The problem is we're going to switch to a new context, which could be
      the default context. The plan was to use restore inhibit, which would be
      fine, except if we are using dynamic page tables (which we will). If we
      use dynamic page tables and we don't load new page tables, the previous
      page tables might go away, and future operations will fault.
      
      CTXA runs.
      switch to default, restore inhibit
      CTXA dies and has its address space taken away.
      Run CTXB, tries to save using the context A's address space - this
      fails.
      
      The general solution is to make sure every context has its own state,
      and its own address space. For cases when we must restore inhibit, first
      thing we do is load a valid address space. I thought this would be
      enough, but apparently there are references within the context itself
      which will refer to the old address space - therefore, we also must
      reinitialize.
      
      v2: to->ppgtt is only valid in full ppgtt.
      v3: Rebased.
      v4: Make post PDP update clearer.
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
      Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
      Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • drm/i915: Track page table reload need · 563222a7
      Ben Widawsky authored
      This patch was formerly known as, "Force pd restore when PDEs change,
      gen6-7." I had to change the name because it is needed for GEN8 too.
      
      The real issue this is trying to solve is when a new object is mapped
      into the current address space. The GPU does not snoop the new mapping
      so we must do the gen specific action to reload the page tables.
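
The bookkeeping behind this can be sketched as a per-ring dirty mask. The names `mark_tlbs_dirty` and `pd_dirty_rings` appear in the version notes of this commit; the struct, the ring count, and the helpers `needs_pd_load` and `pd_loaded` are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_RINGS 5 /* assumed ring count for this sketch */

/* Hypothetical, reduced PPGTT: one dirty bit per ring. */
struct ppgtt_model {
    unsigned long pd_dirty_rings;
};

/* A mapping changed: every ring must reload page tables before its next use. */
void mark_tlbs_dirty(struct ppgtt_model *ppgtt)
{
    ppgtt->pd_dirty_rings = (1ul << NUM_RINGS) - 1;
}

/* Does this ring still need a page directory reload? */
bool needs_pd_load(const struct ppgtt_model *ppgtt, unsigned int ring_id)
{
    assert(ring_id < NUM_RINGS);
    return (ppgtt->pd_dirty_rings & (1ul << ring_id)) != 0;
}

/* The ring has performed its gen specific reload; clear its bit only. */
void pd_loaded(struct ppgtt_model *ppgtt, unsigned int ring_id)
{
    assert(ring_id < NUM_RINGS);
    ppgtt->pd_dirty_rings &= ~(1ul << ring_id);
}
```

Tracking the state per ring means one ring reloading its page tables does not hide the fact that the others still hold stale translations.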
      
      GEN8 and GEN7 do differ in the way they load page tables for the RCS.
      GEN8 does so with the context restore, while GEN7 requires the proper
      load commands in the command streamer. Non-render is similar for both.
      
      Caveat for GEN7
      The docs say you cannot change the PDEs of a currently running context.
      We never map new PDEs of a running context, and expect them to be
      present - so I think this is okay. (We can unmap, but this should also
      be okay since we only unmap unreferenced objects that the GPU shouldn't
      be trying to va->pa xlate.) The MI_SET_CONTEXT command does have a flag
      to signal that even if the context is the same, force a reload. It's
      unclear exactly what this does, but I have a hunch it's the right thing
      to do.
      
      The logic assumes that we always emit a context switch after mapping new
      PDEs, and before we submit a batch. This is the case today, and has been
      the case since the inception of hardware contexts. A note in the comment
      lets the user know.
      
      It's not just for gen8. If the current context's mappings change, we
      need a context reload to switch.
      
      v2: Rebased after ppgtt clean up patches. Split the warning for aliasing
      and true ppgtt options. And do not break aliasing ppgtt, where to->ppgtt
      is always null.
      
      v3: Invalidate PPGTT TLBs inside alloc_va_range.
      
      v4: Rename ppgtt_invalidate_tlbs to mark_tlbs_dirty and move
      pd_dirty_rings from i915_address_space to i915_hw_ppgtt. Fixes the case
      where neither ctx->ppgtt nor aliasing_ppgtt exists.
      
      v5: Removed references to teardown_va_range.
      
      v6: Updated needs_pd_load_pre/post.
      
      v7: Fix pd_dirty_rings check in needs_pd_load_post, and update/move
      comment about updated PDEs to object_pin/bind (Mika).
      
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
      Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
      Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • drm/i915: Track GEN6 page table usage · 678d96fb
      Ben Widawsky authored
      Instead of implementing the full tracking + dynamic allocation, this
      patch does a bit less than half of the work, by tracking and warning on
      unexpected conditions. The tracking itself follows which PTEs within a
      page table are currently being used for objects. The next patch will
      modify this to actually allocate the page tables only when necessary.
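
Tracking of this kind can be sketched with a bitmap holding one bit per PTE. The struct and function names are hypothetical, and the table size of 1024 entries assumes a 4 KiB page table of 4-byte entries; the unexpected-condition count stands in for the warnings mentioned above.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define PTES_PER_TABLE 1024u /* assumed: 4 KiB table of 4-byte PTEs */
#define BITS_PER_WORD  (sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_WORDS   ((PTES_PER_TABLE + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Hypothetical page table bookkeeping: one bit per PTE in use. */
struct page_table_model {
    unsigned long used_ptes[BITMAP_WORDS];
};

/* Record that PTEs [first, first + count) now back an object. */
void pt_mark_used(struct page_table_model *pt, unsigned int first,
                  unsigned int count)
{
    unsigned int i;

    assert(first + count <= PTES_PER_TABLE);
    for (i = first; i < first + count; i++)
        pt->used_ptes[i / BITS_PER_WORD] |= 1ul << (i % BITS_PER_WORD);
}

/*
 * Clear PTEs [first, first + count). Returns how many of them were not
 * marked as used, which a caller could treat as a warning condition.
 */
unsigned int pt_mark_unused(struct page_table_model *pt, unsigned int first,
                            unsigned int count)
{
    unsigned int i, unexpected = 0;

    assert(first + count <= PTES_PER_TABLE);
    for (i = first; i < first + count; i++) {
        unsigned long bit = 1ul << (i % BITS_PER_WORD);
        unsigned long *word = &pt->used_ptes[i / BITS_PER_WORD];

        if (!(*word & bit))
            unexpected++;
        *word &= ~bit;
    }
    return unexpected;
}

/* A table with no used PTEs is a candidate for being freed. */
bool pt_is_empty(const struct page_table_model *pt)
{
    size_t i;

    for (i = 0; i < BITMAP_WORDS; i++)
        if (pt->used_ptes[i])
            return false;
    return true;
}
```

Once usage is tracked this way, allocating a page table only when its first PTE is needed becomes a small step on top of the same bitmap.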
      
      With the current patch there isn't much in the way of making a gen
      agnostic range allocation function. However, in the next patch we'll add
      more specificity which makes having separate functions a bit easier to
      manage.
      
      One important change introduced here is that DMA mappings are
      created/destroyed at the same time as page directories/tables are
      allocated/deallocated.
      
      Notice that aliasing PPGTT is not managed here. The patch which actually
      begins dynamic allocation/teardown explains the reasoning for this.
      
      v2: s/pdp.page_directory/pdp.page_directories
      Make a scratch page allocation helper
      
      v3: Rebase and expand commit message.
      
      v4: Allocate required pagetables only when they are needed, _bind_to_vm
      instead of bind_vma (Daniel).
      
      v5: Rebased to remove the unnecessary noise in the diff, also:
       - PDE mask is GEN agnostic, renamed GEN6_PDE_MASK to I915_PDE_MASK.
       - Removed unnecessary checks in gen6_alloc_va_range.
       - Changed map/unmap_px_single macros to use dma functions directly and
         be part of a static inline function instead.
       - Moved drm_device plumbing through page tables operation to its own
         patch.
       - Moved allocate/teardown_va_range calls until they are fully
         implemented (in subsequent patch).
       - Merged pt and scratch_pt unmap_and_free path.
       - Moved scratch page allocator helper to the patch that will use it.
      
      v6: Reduce complexity by not tearing down pagetables dynamically, the
      same can be achieved while freeing empty vms. (Daniel)
      
      v7: s/i915_dma_map_px_single/i915_dma_map_single
      s/gen6_write_pdes/gen6_write_pde
      Prevent a NULL case when only GGTT is available. (Mika)
      
      v8: Rebased after s/page_tables/page_table/.
      
      v9: Reworked i915_pte_index and i915_pte_count.
      Also exercise bitmap allocation here (gen6_alloc_va_range) and fix
      incorrect write_page_range in i915_gem_restore_gtt_mappings (Mika).
      
      Cc: Daniel Vetter <daniel@ffwll.ch>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
      Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v3+)
      Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • drm/i915: Extract context switch skip and add pd load logic · 317b4e90
      Ben Widawsky authored
      In Gen8, PDPs are saved and restored with legacy contexts (legacy contexts
      only exist on the render ring). So change the ordering of LRI vs MI_SET_CONTEXT
      for the initialization of the context. Also the only cases in which we
      need to manually update the PDPs are when MI_RESTORE_INHIBIT has been
      set in MI_SET_CONTEXT (i.e. when the context is not yet initialized or
      it is the default context).
      
      Legacy submission is not available post GEN8, so it isn't necessary to
      add extra checks for newer generations.
      
      v2: Use new functions to replace the logic right away (Daniel)
      v3: Add missing pd load logic.
      v4: Add warning in case pd_load_pre & pd_load_post are true, and add
      missing trace_switch_mm. Cleaned up pd_load conditions. Add more
      information about when pd_load_post is needed. (Mika)
      Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
      Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
      Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • drm/i915: page table generalizations · 07749ef3
      Michel Thierry authored
      No functional changes, but this improves code clarity and removes some
      duplicated defines.
      Signed-off-by: Michel Thierry <michel.thierry@intel.com>
      Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • drm/i915: Send out the full AUX address · d2d9cbbd
      Ville Syrjälä authored
      AUX addresses are 20 bits long. Send out the entire address instead of
      just the low 16 bits.
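
Packing a 20-bit address into an AUX request header can be sketched as below. The function name `aux_pack_header` and the 4-byte buffer layout are assumptions for the example: a 4-bit request code shares the first byte with the top four address bits, followed by the two low address bytes and the length minus one.

```c
#include <assert.h>
#include <stdint.h>

#define AUX_HEADER_SIZE 4

/*
 * Hypothetical helper: build a 4-byte AUX header from a 4-bit request
 * code, a 20-bit address and a transfer size in bytes (at least 1).
 */
void aux_pack_header(uint8_t request, uint32_t address, uint8_t size,
                     uint8_t hdr[AUX_HEADER_SIZE])
{
    assert(request <= 0xf);
    assert(address < (1u << 20));
    assert(size >= 1);

    /* Request in the high nibble, address bits 19:16 in the low nibble. */
    hdr[0] = (uint8_t)((request << 4) | ((address >> 16) & 0xf));
    hdr[1] = (uint8_t)((address >> 8) & 0xff); /* address bits 15:8 */
    hdr[2] = (uint8_t)(address & 0xff);        /* address bits 7:0 */
    hdr[3] = (uint8_t)(size - 1);              /* length is encoded minus one */
}
```

If only the low 16 bits were sent, the low nibble of the first byte would always be zero and any address at or above 0x10000 would alias onto a lower one.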
      Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Reviewed-by: Jani Nikula <jani.nikula@intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • drm/i915: kerneldoc for i915_gem_shrinker.c · eb0b44ad
      Daniel Vetter authored
      And remove one bogus * from i915_gem_gtt.c since that's not a
      kerneldoc there.
      
      v2: Review from Chris:
      - Clarify memory space to better distinguish from address space.
      - Add note that shrink doesn't guarantee the freed memory and that
        users must fall back to shrink_all.
      - Explain how pinning ties in with eviction/shrinker.
      
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • drm/i915: Extract i915_gem_shrinker.c · be6a0376
      Daniel Vetter authored
      Two code changes:
      - Extract i915_gem_shrinker_init.
      - Inline i915_gem_object_is_purgeable since we open-code it everywhere
        else too.
      
      This already has the benefit of pulling all the shrinker code
      together; the next patch adds a bit of kerneldoc.
      Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
      Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • drm/i915: Use down ei for manual Baytrail RPS calculations · 6f4b12f8
      Chris Wilson authored
      Use both up/down manual ei calculations for symmetry and greater
      flexibility for reclocking, instead of faking the down interrupt based
      on a fixed integer number of up interrupts.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Deepak S <deepak.s@linux.intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • drm/i915: Improved w/a for rps on Baytrail · 43cf3bf0
      Chris Wilson authored
      Rewrite commit 31685c25
      Author: Deepak S <deepak.s@linux.intel.com>
      Date:   Thu Jul 3 17:33:01 2014 -0400
      
          drm/i915/vlv: WA for Turbo and RC6 to work together.
      
      Other than code clarity, the major improvement is to disable the extra
      interrupts generated when idle.  However, the reclocking remains rather
      slow under the new manual regime, in particular it fails to downclock as
      quickly as desired. The second major improvement is that for certain
      workloads, like games, we need to combine render+media activity counters
      as the work of displaying the frame is split across the engines and both
      need to be taken into account when deciding the global GPU frequency as
      memory cycles are shared.
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Deepak S <deepak.s@linux.intel.com>
      Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Reviewed-by: Deepak S <deepak.s@linux.intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • drm/i915: Relax RPS contraints to allows setting minfreq on idle · aed242ff
      Chris Wilson authored
      When we idle, we set the GPU frequency to the hardware minimum (not user
      minimum). We introduce a new variable to distinguish between the
      different roles, and to allow easy tuning of the idle frequency without
      impacting other aspects of RPS. Setting the minimum frequency should be a
      safety blanket as the PCU on the GPU should be power gating itself
      anyway. However, in order for us to set the absolute minimum
      frequency, we need to relax a few of our assertions that we do not
      exceed the user limits.
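
The split between the user limits and the idle frequency can be sketched as follows. Only `idle_freq` is named in this commit's version notes; the struct, the other field names, and the function `rps_pick_freq` are assumptions for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced RPS state; frequencies are in arbitrary units. */
struct rps_model {
    unsigned int hw_min;    /* lowest frequency the hardware supports */
    unsigned int user_min;  /* user-configured lower limit */
    unsigned int user_max;  /* user-configured upper limit */
    unsigned int idle_freq; /* frequency used while idle, may be below user_min */
};

/*
 * Pick the frequency to program. While busy, the request is clamped to
 * the user limits; while idle, idle_freq is used even if it is lower
 * than user_min.
 */
unsigned int rps_pick_freq(const struct rps_model *r, unsigned int requested,
                           bool idle)
{
    assert(r->hw_min <= r->user_min && r->user_min <= r->user_max);
    assert(r->idle_freq >= r->hw_min);

    if (idle)
        return r->idle_freq;
    if (requested < r->user_min)
        return r->user_min;
    if (requested > r->user_max)
        return r->user_max;
    return requested;
}
```

Keeping the idle value in its own field is what allows it to be tuned without touching the limits applied to active workloads.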
      
      v2: Add idle_freq
      v3: Init idle_freq for vlv and add a bunch of WARNs
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Deepak S <deepak.s@linux.intel.com>
      Reviewed-by: Deepak S <deepak.s@linux.intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    • drm/i915: Fallback to using CPU relocations for large batch buffers · edf4427b
      Chris Wilson authored
      If the batch buffer is too large to fit into the aperture and we need a
      GTT mapping for relocations, we currently fail. This only affects a
      subset of machines in a subset of environments, but is quite undesirable. We
      can simply check after failing to insert the batch into the GTT as to
      whether we only need a mappable binding for relocation and, if so, we can
      revert to using a non-mappable binding and an alternate relocation
      method. However, using relocate_entry_cpu() is excruciatingly slow for
      large buffers on non-LLC as the entire buffer requires clflushing before
      and after the relocation handling. Alternatively, we can implement a
      third relocation method that only clflushes around the relocation entry.
      This is still slower than updating through the GTT, so we prefer using
      the GTT where possible, but is orders of magnitude faster as we
      typically do not have to then clflush the entire buffer.
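
The choice between the three relocation methods can be sketched as a small decision function. The enum, the function name `pick_reloc_method`, its boolean inputs, and the exact ordering are assumptions for illustration; the source only states that the GTT path is preferred over the clflush path where a mappable binding exists.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the relocation strategies discussed above. */
enum reloc_method {
    RELOC_CPU,     /* write through a CPU mapping, flushing the whole buffer */
    RELOC_GTT,     /* write through a mappable GTT binding */
    RELOC_CLFLUSH, /* CPU write, clflushing only around the relocation entry */
    RELOC_NONE     /* no usable method; should not happen in practice */
};

/*
 * cpu_coherent: CPU writes are cheap for this buffer (assumed to mean
 *               no whole-buffer flush is required).
 * gtt_mappable: the buffer has a mappable GTT binding.
 * has_clflush:  the CPU supports clflush.
 */
enum reloc_method pick_reloc_method(bool cpu_coherent, bool gtt_mappable,
                                    bool has_clflush)
{
    if (cpu_coherent)
        return RELOC_CPU;
    if (gtt_mappable)
        return RELOC_GTT;     /* preferred over clflush when available */
    if (has_clflush)
        return RELOC_CLFLUSH; /* fallback for buffers too large to map */
    return RELOC_NONE;
}
```

A large batch that cannot get a mappable binding on a non-coherent machine lands on the clflush path instead of failing outright.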
      
      An alternative idea of using a temporary WC mapping of the backing store
      is promising (it should be faster than using the GTT itself), but
      requires fairly extensive arch/x86 support - along the lines of
      kmap_atomic_prot_pfn() (which is not universally implemented even for
      x86).
      
      Testcase: igt/gem_exec_big #pnv,byt
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88392
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      [danvet: Add a WARN_ONCE for the impossible reloc case and explain in
      a short comment why we want to avoid ping-pong.]
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>