1. 11 Nov, 2020 1 commit
  2. 09 Nov, 2020 3 commits
  3. 06 Nov, 2020 1 commit
  4. 05 Nov, 2020 2 commits
  5. 03 Nov, 2020 4 commits
  6. 29 Oct, 2020 3 commits
  7. 28 Oct, 2020 1 commit
  8. 23 Oct, 2020 3 commits
  9. 22 Oct, 2020 3 commits
  10. 21 Oct, 2020 2 commits
  11. 20 Oct, 2020 2 commits
  12. 19 Oct, 2020 1 commit
  13. 16 Oct, 2020 5 commits
    • Chris Wilson's avatar
      drm/i915: Use the active reference on the vma while capturing · 178536b8
      Chris Wilson authored
      During error capture, we need to take a reference to the vma from before
      the reset in order to catpure the contents of the vma later. Currently
      we are using both an active reference and a kref, but due to nature of
      the i915_vma reference handling, that kref is on the vma->obj and not
      the vma itself. This means the vma may be destroyed as soon as it is
      idle, that is in between the i915_active_release(&vma->active) and the
      i915_vma_put(vma):
      
      <3> [197.866181] BUG: KASAN: use-after-free in intel_engine_coredump_add_vma+0x36c/0x4a0 [i915]
      <3> [197.866339] Read of size 8 at addr ffff8881258cb800 by task gem_exec_captur/1041
      <3> [197.866467]
      <4> [197.866512] CPU: 2 PID: 1041 Comm: gem_exec_captur Not tainted 5.9.0-g5e4234f97efba-kasan_200+ #1
      <4> [197.866521] Hardware name: Intel Corp. Broxton P/Apollolake RVP1A, BIOS APLKRVPA.X64.0150.B11.1608081044 08/08/2016
      <4> [197.866530] Call Trace:
      <4> [197.866549]  dump_stack+0x99/0xd0
      <4> [197.866760]  ? intel_engine_coredump_add_vma+0x36c/0x4a0 [i915]
      <4> [197.866783]  print_address_description.constprop.8+0x3e/0x60
      <4> [197.866797]  ? kmsg_dump_rewind_nolock+0xd4/0xd4
      <4> [197.866819]  ? lockdep_hardirqs_off+0xd4/0x120
      <4> [197.867037]  ? intel_engine_coredump_add_vma+0x36c/0x4a0 [i915]
      <4> [197.867249]  ? intel_engine_coredump_add_vma+0x36c/0x4a0 [i915]
      <4> [197.867270]  kasan_report.cold.10+0x1f/0x37
      <4> [197.867492]  ? intel_engine_coredump_add_vma+0x36c/0x4a0 [i915]
      <4> [197.867710]  intel_engine_coredump_add_vma+0x36c/0x4a0 [i915]
      <4> [197.867949]  i915_gpu_coredump.part.29+0x150/0x7b0 [i915]
      <4> [197.868186]  i915_capture_error_state+0x5e/0xc0 [i915]
      <4> [197.868396]  intel_gt_handle_error+0x6eb/0xa20 [i915]
      <4> [197.868624]  ? intel_gt_reset_global+0x370/0x370 [i915]
      <4> [197.868644]  ? check_flags+0x50/0x50
      <4> [197.868662]  ? __lock_acquire+0xd59/0x6b00
      <4> [197.868678]  ? register_lock_class+0x1ad0/0x1ad0
      <4> [197.868944]  i915_wedged_set+0xcf/0x1b0 [i915]
      <4> [197.869147]  ? i915_wedged_get+0x90/0x90 [i915]
      <4> [197.869371]  ? i915_wedged_get+0x90/0x90 [i915]
      <4> [197.869398]  simple_attr_write+0x153/0x1c0
      <4> [197.869428]  full_proxy_write+0xee/0x180
      <4> [197.869442]  ? __sb_start_write+0x1f3/0x310
      <4> [197.869465]  vfs_write+0x1a3/0x640
      <4> [197.869492]  ksys_write+0xec/0x1c0
      <4> [197.869507]  ? __ia32_sys_read+0xa0/0xa0
      <4> [197.869525]  ? lockdep_hardirqs_on_prepare+0x32b/0x4e0
      <4> [197.869541]  ? syscall_enter_from_user_mode+0x1c/0x50
      <4> [197.869566]  do_syscall_64+0x33/0x80
      <4> [197.869579]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      <4> [197.869590] RIP: 0033:0x7fd8b7aee281
      <4> [197.869604] Code: c3 0f 1f 84 00 00 00 00 00 48 8b 05 59 8d 20 00 c3 0f 1f 84 00 00 00 00 00 8b 05 8a d1 20 00 85 c0 75 16 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 57 f3 c3 0f 1f 44 00 00 41 54 55 49 89 d4 53
      <4> [197.869613] RSP: 002b:00007ffea3b72008 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
      <4> [197.869625] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fd8b7aee281
      <4> [197.869633] RDX: 0000000000000002 RSI: 00007fd8b81a82e7 RDI: 000000000000000d
      <4> [197.869641] RBP: 0000000000000002 R08: 0000000000000000 R09: 0000000000000034
      <4> [197.869650] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fd8b81a82e7
      <4> [197.869658] R13: 000000000000000d R14: 0000000000000000 R15: 0000000000000000
      <3> [197.869707]
      <3> [197.869757] Allocated by task 1041:
      <4> [197.869833]  kasan_save_stack+0x19/0x40
      <4> [197.869843]  __kasan_kmalloc.constprop.5+0xc1/0xd0
      <4> [197.869853]  kmem_cache_alloc+0x106/0x8e0
      <4> [197.870059]  i915_vma_instance+0x212/0x1930 [i915]
      <4> [197.870270]  eb_lookup_vmas+0xe06/0x1d10 [i915]
      <4> [197.870475]  i915_gem_do_execbuffer+0x131d/0x4080 [i915]
      <4> [197.870682]  i915_gem_execbuffer2_ioctl+0x103/0x5d0 [i915]
      <4> [197.870701]  drm_ioctl_kernel+0x1d2/0x270
      <4> [197.870710]  drm_ioctl+0x40d/0x85c
      <4> [197.870721]  __x64_sys_ioctl+0x10d/0x170
      <4> [197.870731]  do_syscall_64+0x33/0x80
      <4> [197.870740]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      <3> [197.870748]
      <3> [197.870798] Freed by task 22:
      <4> [197.870865]  kasan_save_stack+0x19/0x40
      <4> [197.870875]  kasan_set_track+0x1c/0x30
      <4> [197.870884]  kasan_set_free_info+0x1b/0x30
      <4> [197.870894]  __kasan_slab_free+0x111/0x160
      <4> [197.870903]  kmem_cache_free+0xcd/0x710
      <4> [197.871109]  i915_vma_parked+0x618/0x800 [i915]
      <4> [197.871307]  __gt_park+0xdb/0x1e0 [i915]
      <4> [197.871501]  ____intel_wakeref_put_last+0xb1/0x190 [i915]
      <4> [197.871516]  process_one_work+0x8dc/0x15d0
      <4> [197.871525]  worker_thread+0x82/0xb30
      <4> [197.871535]  kthread+0x36d/0x440
      <4> [197.871545]  ret_from_fork+0x22/0x30
      <3> [197.871553]
      <3> [197.871602] The buggy address belongs to the object at ffff8881258cb740
       which belongs to the cache i915_vma of size 968
      
      Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2553
      Fixes: 2850748e ("drm/i915: Pull i915_vma_pin under the vm->mutex")
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: <stable@vger.kernel.org> # v5.5+
      Reviewed-by: default avatarMatthew Auld <matthew.auld@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20201016092527.29039-1-chris@chris-wilson.co.uk
      178536b8
    • Chris Wilson's avatar
      drm/i915/gt: Confirm the context survives execution · 89db9537
      Chris Wilson authored
      Repeat our sanitychecks from before execution to after execution. One
      expects that if we were to see these, the gpu would already be on fire,
      but the timing may be informative.
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: default avatarTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: default avatarMika Kuoppala <mika.kuoppala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20201015190816.31763-1-chris@chris-wilson.co.uk
      89db9537
    • Chris Wilson's avatar
      drm/i915/gt: Cleanup kasan warning for on-stack (unsigned long) casting · 6971e07b
      Chris Wilson authored
      Kasan is gving a warning for passing a u32 parameter into find_first_bit
      (casting to a unsigned long *, with appropriate length restrictions):
      
      [   44.678262] BUG: KASAN: stack-out-of-bounds in find_first_bit+0x2e/0x50
      [   44.678295] Read of size 8 at addr ffff888233f4fc30 by task core_hotunplug/474
      [   44.678326]
      [   44.678358] CPU: 0 PID: 474 Comm: core_hotunplug Not tainted 5.9.0+ #608
      [   44.678465] Hardware name: BESSTAR (HK) LIMITED GN41/Default string, BIOS BLT-BI-MINIPC-F4G-EX3R110-GA65A-101-D 10/12/2018
      [   44.678500] Call Trace:
      [   44.678534]  dump_stack+0x84/0xba
      [   44.678569]  print_address_description.constprop.0+0x21/0x220
      [   44.678605]  ? kmsg_dump_rewind_nolock+0x5f/0x5f
      [   44.678638]  ? _raw_spin_lock_irqsave+0x6d/0xb0
      [   44.678669]  ? _raw_write_lock_irqsave+0xb0/0xb0
      [   44.678702]  ? set_task_cpu+0x1e0/0x1e0
      [   44.678733]  ? find_first_bit+0x2e/0x50
      [   44.678763]  kasan_report.cold+0x20/0x42
      [   44.678794]  ? find_first_bit+0x2e/0x50
      [   44.678825]  __asan_load8+0x69/0x90
      [   44.678856]  find_first_bit+0x2e/0x50
      [   44.679027]  __caps_show.isra.0+0x9e/0x1f0 [i915]
      
      Since we are only using the shorter type for our own convenience,
      accommodate kasan and use unsigned long.
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: default avatarTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20201013110845.16127-1-chris@chris-wilson.co.uk
      6971e07b
    • Chris Wilson's avatar
      drm/i915/gt: Undo forced context restores after trivial preemptions · bb65548e
      Chris Wilson authored
      We may try to preempt the currently executing request, only to find that
      after unravelling all the dependencies that the original executing
      context is still the earliest in the topological sort and re-submitted
      back to HW (if we do detect some change in the ELSP that requires
      re-submission). However, due to the way we check for wrap-around during
      the unravelling, we mark any context that has been submitted just once
      (i.e. with the rq->wa_tail set, but the ring->tail earlier) as
      potentially wrapping and requiring a forced restore on resubmission.
      This was expected to be not a problem, as it was anticipated that most
      unwinding for preemption would result in a context switch and the few
      that did not would be lost in the noise. It did not take long for
      someone to find one particular workload where the cost of those extra
      context restores was measurable.
      
      However, since we know the wa_tail is of fixed size, and we know that a
      request must be larger than the wa_tail itself, we can safely maintain
      the check for request wrapping and check against a slightly future point
      in the ring that includes an expected wa_tail. (That is if the
      ring->tail is already set to rq->wa_tail, including another 8 bytes in
      the check does not invalidate the incremental wrap detection.)
      
      Fixes: 8ab3a381 ("drm/i915/gt: Incrementally check for rewinding")
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Cc: Bruce Chang <yu.bruce.chang@intel.com>
      Cc: Ramalingam C <ramalingam.c@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: <stable@vger.kernel.org> # v5.4+
      Reviewed-by: default avatarMika Kuoppala <mika.kuoppala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20201002083425.4605-1-chris@chris-wilson.co.uk
      bb65548e
    • Chris Wilson's avatar
      drm/i915/gt: Delay execlist processing for tgl · 6ca7217d
      Chris Wilson authored
      When running gem_exec_nop, it floods the system with many requests (with
      the goal of userspace submitting faster than the HW can process a single
      empty batch). This causes the driver to continually resubmit new
      requests onto the end of an active context, a flood of lite-restore
      preemptions. If we time this just right, Tigerlake hangs.
      
      Inserting a small delay between the processing of CS events and
      submitting the next context, prevents the hang. Naturally it does not
      occur with debugging enabled. The suspicion then is that this is related
      to the issues with the CS event buffer, and inserting an mmio read of
      the CS pointer status appears to be very successful in preventing the
      hang. Other registers, or uncached reads, or plain mb, do not prevent
      the hang, suggesting that register is key -- but that the hang can be
      prevented by a simple udelay, suggests it is just a timing issue like
      that encountered by commit 233c1ae3 ("drm/i915/gt: Wait for CSB
      entries on Tigerlake"). Also note that the hang is not prevented by
      applying CTX_DESC_FORCE_RESTORE, or by inserting a delay on the GPU
      between requests.
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Cc: Bruce Chang <yu.bruce.chang@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: stable@vger.kernel.org
      Acked-by: default avatarMika Kuoppala <mika.kuoppala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20201015195023.32346-1-chris@chris-wilson.co.uk
      6ca7217d
  14. 15 Oct, 2020 5 commits
  15. 13 Oct, 2020 1 commit
  16. 07 Oct, 2020 1 commit
  17. 06 Oct, 2020 2 commits
    • Tvrtko Ursulin's avatar
      drm/i915: Fix DMA mapped scatterlist lookup · 934941ed
      Tvrtko Ursulin authored
      As the previous patch fixed the places where we walk the whole scatterlist
      for DMA addresses, this patch fixes the random lookup functionality.
      
      To achieve this we have to add a second lookup iterator and add a
      i915_gem_object_get_sg_dma helper, to be used analoguous to existing
      i915_gem_object_get_sg_dma. Therefore two lookup caches are maintained per
      object and they are flushed at the same point for simplicity. (Strictly
      speaking the DMA cache should be flushed from i915_gem_gtt_finish_pages,
      but today this conincides with unsetting of the pages in general.)
      
      Partial VMA view is then fixed to use the new DMA lookup and properly
      query sg length.
      
      v2:
       * Checkpatch.
      Signed-off-by: default avatarTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Matthew Auld <matthew.auld@intel.com>
      Cc: Lu Baolu <baolu.lu@linux.intel.com>
      Cc: Tom Murphy <murphyt7@tcd.ie>
      Cc: Logan Gunthorpe <logang@deltatee.com>
      Reviewed-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Link: https://patchwork.freedesktop.org/patch/msgid/20201006092508.1064287-2-tvrtko.ursulin@linux.intel.com
      934941ed
    • Tvrtko Ursulin's avatar
      drm/i915: Fix DMA mapped scatterlist walks · 8a473dba
      Tvrtko Ursulin authored
      When walking DMA mapped scatterlists sg_dma_len has to be used since it
      can be different (coalesced) from the backing store entry.
      
      This also means we have to end the walk when encountering a zero length
      DMA entry and cannot rely on the normal sg list end marker.
      
      Both issues were there in theory for some time but were hidden by the fact
      Intel IOMMU driver was never coalescing entries. As there are ongoing
      efforts to change this we need to start handling it.
      
      v2:
       * Use unsigned int for local storing sg_dma_len. (Logan)
      Signed-off-by: default avatarTvrtko Ursulin <tvrtko.ursulin@intel.com>
      References: 85d1225e ("drm/i915: Introduce & use new lightweight SGL iterators")
      References: b31144c0 ("drm/i915: Micro-optimise gen6_ppgtt_insert_entries()")
      Reported-by: default avatarTom Murphy <murphyt7@tcd.ie>
      Suggested-by: Tom Murphy <murphyt7@tcd.ie> # __sgt_iter
      Suggested-by: Logan Gunthorpe <logang@deltatee.com> # __sgt_iter
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Matthew Auld <matthew.auld@intel.com>
      Cc: Lu Baolu <baolu.lu@linux.intel.com>
      Reviewed-by: default avatarLogan Gunthorpe <logang@deltatee.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20201006092508.1064287-1-tvrtko.ursulin@linux.intel.com
      8a473dba