1. 12 Jan, 2017 13 commits
  2. 11 Jan, 2017 4 commits
    • Chris Wilson's avatar
      drm/i915: Add a sanity check that no request is submitted in the middle · c781c978
      Chris Wilson authored
      It is an error to start a new request on the same timeline (ringbuffer)
      as the current one before the current is submitted. If there are two
      requests emitting to the ringbuffer at the same time, the operation is
      undefined. We can catch this by checking for the timeline having a later
      seqno than ours when we come to submit our request.
      
      Currently we have this check at the end of __i915_add_request, but
      having an early check as well isolates a failure in the caller versus a
      failure in sealing the request (i.e. from inside __i915_add_request
      itself). For example, CI is currently tripping over this late assertion
      on ctg/ilk:
      
      [  100.329399] [IGT] gem_cs_tlb: starting subtest basic-default
      [  100.336333] ------------[ cut here ]------------
      [  100.336341] kernel BUG at drivers/gpu/drm/i915/i915_gem_request.c:908!
      [  100.336347] invalid opcode: 0000 [#1] PREEMPT SMP
      [  100.336351] Modules linked in: snd_hda_intel i915 snd_hda_codec_generic snd_hda_codec snd_hwdep snd_hda_core snd_pcm coretemp mei_me lpc_ich mei e1000e ptp pps_core [last unloaded: i915]
      [  100.336373] CPU: 0 PID: 6308 Comm: gem_cs_tlb Tainted: G     U          4.10.0-rc3-CI-CI_DRM_2045+ #1
      [  100.336380] Hardware name: LENOVO 7465CTO/7465CTO, BIOS 6DET44WW (2.08 ) 04/22/2009
      [  100.336386] task: ffff88012b738040 task.stack: ffffc90000560000
      [  100.336441] RIP: 0010:__i915_add_request+0x4aa/0x510 [i915]
      [  100.336445] RSP: 0018:ffffc90000563ac0 EFLAGS: 00010212
      [  100.336451] RAX: 0000000000005d52 RBX: ffff880133bb84c0 RCX: 0000000000000001
      [  100.336456] RDX: 0000000080000001 RSI: ffff88012b738860 RDI: 00000000ffffffff
      [  100.336461] RBP: ffffc90000563b00 R08: ffff880133bb8780 R09: 0000000000000000
      [  100.336466] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88012f53d950
      [  100.336472] R13: ffff88012a2b0af8 R14: ffff88012a5b0008 R15: ffff88012f53d960
      [  100.336477] FS:  00007f0d19da38c0(0000) GS:ffff88013bc00000(0000) knlGS:0000000000000000
      [  100.336483] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  100.336488] CR2: 00007f0d17706000 CR3: 000000012aa3e000 CR4: 00000000000406f0
      [  100.336496] Call Trace:
      [  100.336527]  i915_gem_switch_to_kernel_context+0x131/0x1b0 [i915]
      [  100.336559]  i915_gem_evict_vm+0x202/0x2b0 [i915]
      [  100.336590]  i915_gem_execbuffer_reserve.isra.9+0x3ae/0x440 [i915]
      [  100.336623]  i915_gem_do_execbuffer.isra.15+0x6d9/0x1b20 [i915]
      [  100.336656]  i915_gem_execbuffer2+0xc0/0x250 [i915]
      [  100.336666]  drm_ioctl+0x200/0x450
      [  100.336697]  ? i915_gem_execbuffer+0x330/0x330 [i915]
      [  100.336708]  do_vfs_ioctl+0x90/0x6e0
      [  100.336716]  ? up_read+0x1a/0x40
      [  100.336723]  ? trace_hardirqs_on_caller+0x122/0x1b0
      [  100.336730]  SyS_ioctl+0x3c/0x70
      [  100.336738]  entry_SYSCALL_64_fastpath+0x1c/0xb1
      [  100.336745] RIP: 0033:0x7f0d187cb357
      [  100.336750] RSP: 002b:00007ffe0b2f7c28 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
      [  100.336761] RAX: ffffffffffffffda RBX: 00007ffe0b2f7d60 RCX: 00007f0d187cb357
      [  100.336768] RDX: 00007ffe0b2f7d00 RSI: 0000000040406469 RDI: 0000000000000003
      [  100.336775] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000022
      [  100.336782] R10: 0000000000000007 R11: 0000000000000246 R12: 0000000000000002
      [  100.336789] R13: 0000000000419101 R14: 00007ffe0b2f7d60 R15: 00007ffe0b2f7d50
      [  100.336797] Code: 5f 74 1e e9 d4 fb ff ff e8 bc 1e 9c e0 e9 ae fb ff ff 4c 89 e7 e8 77 22 fd ff e9 88 fd ff ff 0f 0b e8 a3 1e 9c e0 e9 b1 fb ff ff <0f> 0b 0f 0b e8 fd af ab e0 85 c0 75 c2 48 c7 c2 80 2c 71 a0 be
      [  100.336877] RIP: __i915_add_request+0x4aa/0x510 [i915] RSP: ffffc90000563ac0
      [  100.336886] ---[ end trace 22b36545479e5eb7 ]---
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20170111140858.1922-1-chris@chris-wilson.co.ukReviewed-by: default avatarJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      c781c978
    • Chris Wilson's avatar
      drm/i915: Prefer random replacement before eviction search · 606fec95
      Chris Wilson authored
      Performing an eviction search can be very, very slow especially for a
      range restricted replacement. For example, a workload like
      gem_concurrent_blit will populate the entire GTT and then cause aperture
      thrashing. Since the GTT is a mix of active and inactive tiny objects,
      we have to search through almost 400k objects before finding anything
      inside the mappable region, and as this search is required before every
      operation performance falls off a cliff.
      
      Instead of performing the full search, we do a trial replacement of the
      node at a random location fitting the specified restrictions. We lose
      the strict LRU property of the GTT in exchange for avoiding the slow
      search (several orders of runtime improvement for gem_concurrent_blit
      4KiB-global-gtt, e.g. from 5000s to 20s). The loss of LRU replacement is
      (later) mitigated firstly by only doing replacement if we find no
      freespace and secondly by execbuf doing a PIN_NONBLOCK search first before
      it starts thrashing (i.e. the random replacement will only occur from the
      already inactive set of objects).
      
      v2: Ascii-art, and check preconditionst
      v3: Rephrase final sentence in comment to explain why we don't bother
      with if (i915_is_ggtt(vm)) for preferring random replacement.
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Reviewed-by: default avatarJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20170111112312.31493-3-chris@chris-wilson.co.uk
      606fec95
    • Chris Wilson's avatar
      drm/i915: Extract reserving space in the GTT to a helper · 625d988a
      Chris Wilson authored
      Extract drm_mm_reserve_node + calling i915_gem_evict_for_node into its
      own routine so that it can be shared rather than duplicated.
      
      v2: Kerneldoc
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: igvt-g-dev@lists.01.org
      Reviewed-by: default avatarJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20170111112312.31493-2-chris@chris-wilson.co.uk
      625d988a
    • Chris Wilson's avatar
      drm/i915: Use the MRU stack search after evicting · e007b19d
      Chris Wilson authored
      When we evict from the GTT to make room for an object, the hole we
      create is put onto the MRU stack inside the drm_mm range manager. On the
      next search pass, we can speed up a PIN_HIGH allocation by referencing
      that stack for the new hole.
      
      v2: Pull together the 3 identical implements (ahem, a couple were
      outdated) into a common routine for allocating a node and evicting as
      necessary.
      v3: Detect invalid calls to i915_gem_gtt_insert()
      v4: kerneldoc
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: default avatarJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/20170111112312.31493-1-chris@chris-wilson.co.uk
      e007b19d
  3. 10 Jan, 2017 23 commits