1. 07 Jan, 2015 3 commits
  2. 06 Jan, 2015 5 commits
    • Akash Goel's avatar
      drm/i915: Support creation of unbound wc user mappings for objects · 1816f923
      Akash Goel authored
      This patch provides support to create write-combining virtual mappings of
      GEM object. It intends to provide the same funtionality of 'mmap_gtt'
      interface without the constraints and contention of a limited aperture
      space, but requires clients handles the linear to tile conversion on their
      own. This is for improving the CPU write operation performance, as with such
      mapping, writes and reads are almost 50% faster than with mmap_gtt. Similar
      to the GTT mmapping, unlike the regular CPU mmapping, it avoids the cache
      flush after update from CPU side, when object is passed onto GPU.  This
      type of mapping is specially useful in case of sub-region update,
      i.e. when only a portion of the object is to be updated. Using a CPU mmap
      in such cases would normally incur a clflush of the whole object, and
      using a GTT mmapping would likely require eviction of an active object or
      fence and thus stall. The write-combining CPU mmap avoids both.
      
      To ensure the cache coherency, before using this mapping, the GTT domain
      has been reused here. This provides the required cache flush if the object
      is in CPU domain or synchronization against the concurrent rendering.
      Although the access through an uncached mmap should automatically
      invalidate the cache lines, this may not be true for non-temporal write
      instructions and also not all pages of the object may be updated at any
      given point of time through this mapping.  Having a call to get_pages in
      set_to_gtt_domain function, as added in the earlier patch 'drm/i915:
      Broaden application of set-domain(GTT)', would guarantee the clflush and
      so there will be no cachelines holding the data for the object before it
      is accessed through this map.
      
      The drm_i915_gem_mmap structure (for the DRM_I915_GEM_MMAP_IOCTL) has been
      extended with a new flags field (defaulting to 0 for existent users). In
      order for userspace to detect the extended ioctl, a new parameter
      I915_PARAM_MMAP_VERSION has been added for versioning the ioctl interface.
      
      v2: Fix error handling, invalid flag detection, renaming (ickle)
      
      v3: Rebase to latest drm-intel-nightly codebase
      
      The new mmapping is exercised by igt/gem_mmap_wc,
      igt/gem_concurrent_blit and igt/gem_gtt_speed.
      
      Change-Id: Ie883942f9e689525f72fe9a8d3780c3a9faa769a
      Signed-off-by: default avatarAkash Goel <akash.goel@intel.com>
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Reviewed-by: default avatarTvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      1816f923
    • Chris Wilson's avatar
      drm/i915: Broaden application of set-domain(GTT) · 43566ded
      Chris Wilson authored
      Previously, this was restricted to only operate on bound objects - to
      make pointer access through the GTT to the object coherent with writes
      to and from the GPU. A second usecase is drm_intel_bo_wait_rendering()
      which at present does not function unless the object also happens to
      be bound into the GGTT (on current systems that is becoming increasingly
      rare, especially for the typical requests from mesa). A third usecase is
      a future patch wishing to extend the coverage of the GTT domain to
      include objects not bound into the GGTT but still in its coherent cache
      domain. For the latter pair of requests, we need to operate on the
      object regardless of its bind state.
      
      v2: After discussion with Akash, we came to the conclusion that the
      get-pages was required in order for accurate domain tracking in the
      corner cases (like the shrinker) and also useful for ensuring memory
      coherency with earlier cached CPU mmaps in case userspace uses exotic
      cache bypass (non-temporal) instructions.
      
      v3: Fix the inactive object check.
      
      v4: Rebase to latest drm-intel-nightly codebase
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: default avatarAkash Goel <akash.goel@intel.com>
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      43566ded
    • Ben Widawsky's avatar
      drm/i915: Add some extra guards in evict_vm · b9b5dce5
      Ben Widawsky authored
      v2: Use WARN_ONs (Daniel)
      
      Cc: Daniel Vetter <daniel@ffwll.ch>
      Signed-off-by: default avatarBen Widawsky <ben@bwidawsk.net>
      Signed-off-by: default avatarMichel Thierry <michel.thierry@intel.com>
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      b9b5dce5
    • Daniel Vetter's avatar
      drm/i915: Include i915_gem_evict.c kerneldoc into the drm docbook · 7838a63a
      Daniel Vetter authored
      I've written these long before we've had a reasonable docbook
      structure, and naturally they've gone stale. Fix this up asap.
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@intel.com>
      7838a63a
    • Kenneth Graunke's avatar
      drm/i915: Make sample_c messages go faster on Haswell. · 94411593
      Kenneth Graunke authored
      Haswell significantly improved the performance of sampler_c messages,
      but the optimization appears to be off by default.  Later platforms
      remove this bit, and apparently always enable the optimization.
      
      Improves performance in "Counter Strike: Global Offensive" by 18%
      at default settings on Iris Pro.
      
      This may break sampling of paletted formats (P8/A8P8/P8A8).  It's
      unclear whether it affects sampling of paletted formats in general,
      or just the sample_c message (which is never used).
      
      While libva does have support for using paletted formats (primarily
      for OSDs), that support appears to have been broken for at least a
      year, so I couldn't observe a regression from this:
      
      I tried to get libva-intel to use paletted formats, and observe a
      regression...but the only thing I found that used it was mplayer's OSD
      (on screen display).  Even without my patch, the colors were totally
      wrong with that, and it's according to a few distro wikis, that's been
      the case for over a year.
      
      If libva's code for paletted formats /is/ broken, they could always
      add code to disable this bit using the command validator when fixing
      it.
      
      Further investigation from Haihao shows that libva mplayer OSD seems
      to work at least on his setup (still unclear what's wron with Ken's),
      and that it's not affected by this patch. Quoting the discussion
      between Haihao and Ken:
      
      > > > If you use "-vo gl" or "-vo xv", the OSD is solid white text with a black
      > > > border around it.  I presume that it's supposed to be white with vaapi as
      > > > well, but I guess I'm not entirely sure.
      > > >
      > > > It's possible that the optimization doesn't affect the palette as long as
      > > > you never use sample_c with the paletted textures.
      > >
      > > I verified the palette takes effect in the following way:
      > >
      > > 1. Only support P8A8 format in the driver
      > >
      > > 2. ran the above command and I saw white OSD text
      > >
      > > 3. Only support P4A4 format in the driver and don't use
      > > 3DSTATE_SAMPLER_PALETTE_LOAD0 to load the value to the texture palette,
      > > so the palette keeps unchanged.
      > >
      > > 4. ran the above command and I saw black OSD text.
      > >
      > > 5. Load the right value to the texture palette and ran the above command
      > > again, I saw white OSD text.
      > >
      > > Hence I think sample_c with the paletted textures is used in the driver.
      >
      > That sounds like the palette is actually working, then.  Great :)
      >
      > I doubt that libva would use sample_c - sampling with a shadow comparison?
      > It looks like it just uses sample and sample+killpix.
      
      You are right, libva driver doesn't use sample_c message.
      
      > I'm pretty sure the sample_c optimization just uses the palette memory as
      > storage for some stuff, so it's quite possible it just works if you're
      > only using sample and sample+killpix.
      
      Thanks for the explanation, it makes sense to me.
      Signed-off-by: default avatarKenneth Graunke <kenneth@whitecape.org>
      [danvet: Add wa name from Ville's review to the comment and copypaste
      the explanation why we don't care about libva (already broken) from
      Ken. Also add conclusion from libva devs that&why this is all fine.]
      Reviewed-by: default avatarVille Syrjälä <ville.syrjala@linux.intel.com>
      Cc: "Xiang, Haihao" <haihao.xiang@intel.com>
      Cc: libva@lists.freedesktop.org
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      94411593
  3. 19 Dec, 2014 1 commit
  4. 17 Dec, 2014 6 commits
  5. 16 Dec, 2014 12 commits
  6. 15 Dec, 2014 12 commits
    • Damien Lespiau's avatar
      drm/i915/skl: Skylake also supports DP MST · c86ea3d0
      Damien Lespiau authored
      I've checked that TRANS_DDI_MODE, DP_TP_CTL MST bits are identical to
      HSW/BDW on SKL, as well as the long vs short HPD bits. So we have a good
      chance to be working as well as prevous platforms.
      Signed-off-by: default avatarDamien Lespiau <damien.lespiau@intel.com>
      Reviewed-by: default avatarDave Airlie <airlied@redhat.com>
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      c86ea3d0
    • Damien Lespiau's avatar
      drm/i915: Consolidate DDI clock reading out in a single function · 22606a18
      Damien Lespiau authored
      2 pieces of code need to read out the DDI clock: the DDI encoder and the
      MST encoder .get_config() vfuncs.
      
      Until now the SKL read out code was only in the former, so let's move
      the pre and post SKL logic in intel_ddi_clock_get() and this this one
      everywhere.
      Signed-off-by: default avatarDamien Lespiau <damien.lespiau@intel.com>
      Reviewed-by: default avatarDave Airlie <airlied@redhat.com>
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      22606a18
    • Deepak M's avatar
      drm/i915: Parsing LFP brightness control from VBT · 371abae8
      Deepak M authored
      LFP brighness control from the VBT block 43 indicates which
      controller is used for brightness.
      LFP1 brightness control method:
      Bit 7-4 = This field controller number of the brightnes controller.
      0 = Controller 0
      1 = Controller 1
      2 = Controller 2
      3 = Controller 3
      Others = Reserved
      Bits 3-0 = This field specifies the brightness control pin to be used on the
      platform.
      0 = PMIC pin is used for brightness control
      1 = LPSS PWM is used for brightness control
      2 = Display DDI is used for brightness control
      3 = CABC method to control brightness
      Others = Reserved
      
      Adding the above fields in dev_priv->vbt and corresponding changes in
      parse_backlight()
      
      v2: Jani's review comments addressed
      	- Move PWM definitions to intel_bios.h
      	- Moving vbt_version to intel_vbt_data
      	- Rename brightness to bl_ctrl_data
      	- Logging just control_pin instead of string
      	- Avoid adding vbt_version in dev_priv
      	- Since only DDI option is available as of now, let control pin DDI
      	affect dev_priv->vbt.backlight.present
      
      v3: Jani's review comments addressed
      	- Drop control_pin
      	- Use bdb->version
      	- set controller to 0 instead of using control pin define
      	- check controller bounds
      	- remove superfluous changes in intel_parse_bios
      Signed-off-by: default avatarDeepak M <m.deepak@intel.com>
      Signed-off-by: default avatarVandana Kannan <vandana.kannan@intel.com>
      Reviewed-by: default avatarJani Nikula <jani.nikula@intel.com>
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      371abae8
    • Sonika Jindal's avatar
      drm/i915/skl: Correcting the flushing of pipe · d9d8e6b3
      Sonika Jindal authored
      We were incorreectly bypassing the flush everytime which led to fifo
      underrun when more than one plane is enabled.
      Signed-off-by: default avatarSonika Jindal <sonika.jindal@intel.com>
      Reviewed-by: default avatarTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: Satheeshakrishna M<satheeshakrishna.m@intel.com>
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      d9d8e6b3
    • Thomas Daniel's avatar
      drm/i915/bdw: Enable execlists by default where supported · 27401d12
      Thomas Daniel authored
      Execlist support in the i915 driver is now considered good enough for the
      feature to be enabled by default on Gen8 and later and routinely tested.
      Adjusted i915 parameters structure initialization to reflect this and updated
      the comment in intel_sanitize_enable_execlists().
      
      There's still work to do before we can let the wider massive onto it,
      but there's still time left before the 3.20 cutoff.
      
      v2: Update the MODULE_PARM_DESC too.
      
      Issue: VIZ-2020
      Signed-off-by: default avatarThomas Daniel <thomas.daniel@intel.com>
      [danvet: Add note that there's still some work left to do.]
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      27401d12
    • Sonika Jindal's avatar
      drm/i915/skl: Correctly updating sprite wm parameter · a712f8eb
      Sonika Jindal authored
      The pipe wm parameters is not correctly updated with sprite parameters
      because it copies them for each plane from plane_list to the sprite
      offset in pipe wm parameters. Since plane_list also contains primary and
      cursor planes, we end up updating wrong params for sprites.
      Signed-off-by: default avatarSonika Jindal <sonika.jindal@intel.com>
      Reviewed-by: default avatarTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      a712f8eb
    • Tvrtko Ursulin's avatar
      drm/i915: Documentation for multiple GGTT views · 45f8f69a
      Tvrtko Ursulin authored
      A short section describing background, implementation and intended usage.
      
      v2:
          * Align section name between template and DOC comment. (Michel Thierry)
      
      For: VIZ-4544
      Signed-off-by: default avatarTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: default avatarMichel Thierry <michel.thierry@intel.com>
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      45f8f69a
    • Tvrtko Ursulin's avatar
      drm/i915: Infrastructure for supporting different GGTT views per object · fe14d5f4
      Tvrtko Ursulin authored
      Things like reliable GGTT mappings and mirrored 2d-on-3d display will need
      to map objects into the same address space multiple times.
      
      Added a GGTT view concept and linked it with the VMA to distinguish between
      multiple instances per address space.
      
      New objects and GEM functions which do not take this new view as a parameter
      assume the default of zero (I915_GGTT_VIEW_NORMAL) which preserves the
      previous behaviour.
      
      This now means that objects can have multiple VMA entries so the code which
      assumed there will only be one also had to be modified.
      
      Alternative GGTT views are supposed to borrow DMA addresses from obj->pages
      which is DMA mapped on first VMA instantiation and unmapped on the last one
      going away.
      
      v2:
          * Removed per view special casing in i915_gem_ggtt_prepare /
            finish_object in favour of creating and destroying DMA mappings
            on first VMA instantiation and last VMA destruction. (Daniel Vetter)
          * Simplified i915_vma_unbind which does not need to count the GGTT views.
            (Daniel Vetter)
          * Also moved obj->map_and_fenceable reset under the same check.
          * Checkpatch cleanups.
      
      v3:
          * Only retire objects once the last VMA is unbound.
      
      v4:
          * Keep scatter-gather table for alternative views persistent for the
            lifetime of the VMA.
          * Propagate binding errors to callers and handle appropriately.
      
      v5:
          * Explicitly look for normal GGTT view in i915_gem_obj_bound to align
            usage in i915_gem_object_ggtt_unpin. (Michel Thierry)
          * Change to single if statement in i915_gem_obj_to_ggtt. (Michel Thierry)
          * Removed stray semi-colon in i915_gem_object_set_cache_level.
      
      For: VIZ-4544
      Signed-off-by: default avatarTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Reviewed-by: default avatarMichel Thierry <michel.thierry@intel.com>
      [danvet: Drop hunk from i915_gem_shrink since it's just prettification
      but upsets a __must_check warning.]
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      fe14d5f4
    • Deepak S's avatar
      drm/i915: Forcewake Register Range changes for CHV · db5ff4ac
      Deepak S authored
      According to updated BSpec, Render/Common/media Wells register range changed.
      Updating the same to match the spec and avoid extra forcewake for none
      forcewake range.
      
      v2: Update media forcewake range (Ville)
      Signed-off-by: default avatarDeepak S <deepak.s@linux.intel.com>
      Reviewed-by: default avatarVille Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      db5ff4ac
    • Gaurav K Singh's avatar
      drm/i915: Changes related to the sequence port no for · f915084e
      Gaurav K Singh authored
      From now on for both DSI Ports A & C, the seq_port value has been
      set to 0. seq_port value is parsed from Sequence block#53 of VBT.
      So, for packets that needs to be read/write for DSI single link on
      Port A and Port C will now be based on the DVO port from VBT block 2,
      instead of seq_port.
      Signed-off-by: default avatarGaurav K Singh <gaurav.k.singh@intel.com>
      Reviewed-by: default avatarJani Nikula <jani.nikula@intel.com>
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      f915084e
    • Daniel Vetter's avatar
      drm/i915: Use BUILD_BUG if possible in the i915 WARN_ON · 5f77eeb0
      Daniel Vetter authored
      Faster feedback to errors is always better. This is inspired by the
      addition to WARN_ONs to mask/enable helpers for registers to make sure
      callers have the arguments ordered correctly: Pretty much always the
      arguments are static.
      
      We use WARN_ON(1) a lot in default switch statements though where we
      should always handle all cases. So add a new macro specifically for
      that.
      
      The idea to use __builtin_constant_p is from Chris Wilson.
      
      v2: Use the ({}) gcc-ism to avoid the static inline, suggested by
      Dave. My first attempt used __cond as the temp var, which is the same
      used by BUILD_BUG_ON, but with inverted sense. Hilarity ensued, so
      sprinkle i915 into the name.
      
      Also use a temporary variable to only evaluate the condition once,
      suggested by Damien.
      
      v3: It's crazy but apparently 32bit gcc can't compile out the
      BUILD_BUG_ON in a lot of cases and just falls over. I have no idea
      why, but until clue grows just disable this nifty idea on 32bit
      builds. Reported by 0-day builder.
      
      v4: Got it all wrong, apparently its the gcc version. We need 4.9+.
      Now reported by Imre.
      
      v5: Chris suggested to add the case to MISSING_CASE for speedier
      debug.
      
      v6: Even some gcc 4.9 versions don't see through the maze, so give up
      for now. Keep the skeleton and MISSING_CASE stuff though.
      
      Cc: Imre Deak <imre.deak@intel.com>
      Cc: Damien Lespiau <damien.lespiau@intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Jani Nikula <jani.nikula@linux.intel.com>
      Cc: Dave Gordon <david.s.gordon@intel.com>
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@intel.com>
      5f77eeb0
    • Daniel Vetter's avatar
      drm/i915: Name the lrc irq handler correctly · 3f7531c3
      Daniel Vetter authored
      We consistently use the _irq_handler postfix for functions called in
      hardirq context. Especially when it's a non-static function hardirq is
      a crazy enough calling context to warrant this level of ocd. So rename
      it.
      
      Cc: Thomas Daniel <thomas.daniel@intel.com>
      Reviewed-by: default avatarThomas Daniel <thomas.daniel@intel.com>
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@intel.com>
      3f7531c3
  7. 10 Dec, 2014 1 commit