1. 17 Dec, 2017 1 commit
  2. 16 Dec, 2017 2 commits
  3. 15 Dec, 2017 3 commits
  4. 14 Dec, 2017 12 commits
  5. 13 Dec, 2017 6 commits
    • Chris Wilson's avatar
      drm/i915: Unwind i915_gem_init() failure · 6ca9a2be
      Chris Wilson authored
      Since Michal introduced new user controllable errors other than -EIO
      during i915_gem_init(), we need to actually unwind on the error path as
      we have to abort the module load (and we expect to do so cleanly!).
      
      As we now teardown key state and then mark the driver as wedged (on
      EIO), we have to be careful to not allow ourselves to resume and
      unwedge, thus attempting to use the uninitialised driver.
      
      v2: Try not to free driver state for the suppressed EIO
      v3: Use load-fault-injection to test both error/recovery paths.
      
      References: 8620eb1d ("drm/i915/uc: Don't use -EIO to report missing firmware")
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
      Reviewed-by: default avatarMichał Winiarski <michal.winiarski@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171213134347.4608-1-chris@chris-wilson.co.uk
      6ca9a2be
    • Chris Wilson's avatar
      drm/i915: Ratelimit request allocation under oom · 31c70f97
      Chris Wilson authored
      If we fail to allocate a request, we can reap the outstanding requests
      and push them to the request's slab's freelist before trying again. This
      forces us to ratelimit malicious clients that tie up all of the system
      resources in requests, instead of causing a system-wide oom.
      
      Testcase: igt/gem_shrink/execbuf1
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Reviewed-by: default avatarJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171212180652.22061-3-chris@chris-wilson.co.uk
      31c70f97
    • Chris Wilson's avatar
      drm/i915: Allow fence allocations to fail · 2abe2f84
      Chris Wilson authored
      If a fence allocation fails in a blocking context, we will sleep on the
      fence as a last resort. We can therefore allow ourselves to fail and
      sleep on the fence instead of triggering a system-wide oom. This allows
      us to throttle malicious clients that are consuming lots of system
      resources by capping the amount of memory used by fences.
      
      Testcase: igt/gem_shrink/execbufX
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Reviewed-by: default avatarJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171212180652.22061-2-chris@chris-wilson.co.uk
      2abe2f84
    • Chris Wilson's avatar
      drm/i915: Mark up potential allocation paths within i915_sw_fence as might_sleep · e30a7581
      Chris Wilson authored
      As kmalloc is allowed to block (if given the right flags), mark up the
      two i915_sw_fence routines that may call kmalloc as potential sleeping
      routines.
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Reviewed-by: default avatarJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171212180652.22061-1-chris@chris-wilson.co.uk
      e30a7581
    • Chris Wilson's avatar
      drm/i915: Don't check #active_requests from i915_gem_wait_for_idle() · d7dc4131
      Chris Wilson authored
      i915_gem_wait_for_idle() is called from inside the shrinker, to ensure
      that we drain the last resources from the GPU in dire circumstances (OOM).
      As we may allocate whilst building a request, it is then possible to hit
      the shrinker with a request under construction, and so we must account
      for the incomplete request whilst waiting. In particular, we
      preincrement (in reserve_engine) the i915->gt.active_requests counter
      and mark the GPU as busy, therefore we can not use that counter for
      shortcircuiting the wait-for-idle.
      
      [  950.859024] GEM_BUG_ON(i915->gt.active_requests)
      [  950.859041] WARNING: CPU: 2 PID: 2178 at drivers/gpu/drm/i915/i915_gem.c:3615 i915_gem_wait_for_idle.part.56+0x166/0x4e0
      [  950.859041] Modules linked in: ccm tun fuse nf_conntrack_netbios_ns nf_conntrack_broadcast ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_mangle iptable_security iptable_raw arc4 iwldvm mac80211 snd_hda_codec_hdmi snd_hda_codec_idt snd_hda_codec_generic snd_hda_intel snd_hda_codec btusb snd_hda_core btrtl btbcm iwlwifi snd_hwdep btintel bluetooth snd_seq snd_seq_device snd_pcm ecdh_generic x86_pkg_temp_thermal tpm_infineon coretemp tpm_tis crc32_pclmul wmi_bmof crc32c_intel iTCO_wdt hp_wmi snd_timer iTCO_vendor_support sparse_keymap tpm_tis_core mei_me cfg80211
      [  950.859082]  snd joydev tpm mei rfkill pcspkr wmi soundcore lpc_ich hp_accel lis3lv02d input_polldev binfmt_misc e1000e ptp serio_raw pps_core
      [  950.859094] CPU: 2 PID: 2178 Comm: gem_exec_nop Tainted: G     U           4.15.0-rc2+ #900
      [  950.859102] Hardware name: Hewlett-Packard HP ProBook 6360b/1620, BIOS 68SCF Ver. B.42 12/29/2010
      [  950.859107] task: c5119cb4 task.stack: f3ccb8d8
      [  950.859112] EIP: i915_gem_wait_for_idle.part.56+0x166/0x4e0
      [  950.859113] EFLAGS: 00010296 CPU: 2
      [  950.859114] EAX: 00000024 EBX: f36c1888 ECX: f777a044 EDX: 00000007
      [  950.859115] ESI: f36c1888 EDI: edd53958 EBP: edd53970 ESP: edd53938
      [  950.859116]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
      [  950.859117] CR0: 80050033 CR2: b7f39000 CR3: 2f2b3000 CR4: 000406d0
      [  950.859118] Call Trace:
      [  950.859125]  ? drm_printk+0x70/0x70
      [  950.859129]  i915_gem_wait_for_idle+0x18/0x30
      [  950.859133]  i915_gem_shrink+0x360/0x410
      [  950.859138]  ? vmpressure+0xa8/0xf0
      [  950.859142]  ? ktime_get+0x4a/0x100
      [  950.859147]  i915_gem_shrink_all+0x21/0x40
      [  950.859151]  i915_gem_shrinker_oom+0x23/0x130
      [  950.859156]  notifier_call_chain+0x4e/0x70
      [  950.859160]  __blocking_notifier_call_chain+0x2f/0x60
      [  950.859164]  blocking_notifier_call_chain+0x11/0x20
      [  950.859169]  out_of_memory+0x207/0x280
      [  950.859174]  __alloc_pages_nodemask+0xd47/0xe60
      [  950.859179]  new_slab+0x32d/0x450
      [  950.859183]  ___slab_alloc.constprop.81+0x358/0x4e0
      [  950.859189]  ? i915_sw_fence_await_dma_fence+0x53/0x160
      [  950.859193]  ? __slab_free+0x1fe/0x310
      [  950.859197]  ? native_sched_clock+0x1e/0xc0
      [  950.859201]  ? i915_gem_request_alloc+0xcf/0x510
      [  950.859205]  ? sched_clock+0x9/0x10
      [  950.859209]  __slab_alloc.constprop.80+0x29/0x40
      [  950.859212]  ? __slab_alloc.constprop.80+0x29/0x40
      [  950.859216]  kmem_cache_alloc_trace+0x160/0x1a0
      [  950.859220]  ? i915_sw_fence_await_dma_fence+0x53/0x160
      [  950.859224]  i915_sw_fence_await_dma_fence+0x53/0x160
      [  950.859229]  i915_gem_request_await_dma_fence+0x1eb/0x390
      [  950.859233]  i915_gem_request_await_object+0xee/0x230
      [  950.859239]  i915_gem_do_execbuffer+0xc16/0x1200
      [  950.859246]  ? irqtime_account_irq+0x3e/0xc0
      [  950.859251]  ? irq_exit+0x4f/0xb0
      [  950.859257]  ? smp_apic_timer_interrupt+0x5f/0x110
      [  950.859261]  ? apic_timer_interrupt+0x35/0x3c
      [  950.859266]  i915_gem_execbuffer2_ioctl+0x212/0x440
      [  950.859270]  ? apic_timer_interrupt+0x35/0x3c
      [  950.859274]  ? i915_gem_do_execbuffer+0x1200/0x1200
      [  950.859279]  ? insn_get_seg_base+0x1b/0x50
      [  950.859283]  ? i915_gem_do_execbuffer+0x1200/0x1200
      [  950.859287]  drm_ioctl_kernel+0x51/0xa0
      [  950.859291]  drm_ioctl+0x2a3/0x350
      [  950.859294]  ? i915_gem_do_execbuffer+0x1200/0x1200
      [  950.859300]  ? sched_clock+0x9/0x10
      [  950.859303]  ? drm_getunique+0x70/0x70
      [  950.859308]  do_vfs_ioctl+0x7d/0x640
      [  950.859311]  ? native_sched_clock+0x1e/0xc0
      [  950.859315]  ? sched_clock+0x9/0x10
      [  950.859319]  ? sched_clock_cpu+0x13/0x120
      [  950.859323]  SyS_ioctl+0x4e/0x80
      [  950.859326]  do_fast_syscall_32+0x75/0x250
      [  950.859331]  ? irq_exit+0x4f/0xb0
      [  950.859334]  entry_SYSENTER_32+0x47/0x71
      [  950.859338] EIP: 0xb7f81d11
      [  950.859339] EFLAGS: 00000296 CPU: 2
      [  950.859340] EAX: ffffffda EBX: 00000003 ECX: 40406469 EDX: bfde4c20
      [  950.859340] ESI: 00000003 EDI: 40406469 EBP: 00000003 ESP: bfde4b38
      [  950.859341]  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b
      [  950.859343] Code: e8 30 60 01 00 83 c4 10 83 c3 04 39 f3 75 e0 8b 45 d8 8b 80 14 37 00 00 85 c0 74 13 68 dd 33 e4 c0 68 49 6f e3 c0 e8 4a 55 be ff <0f> ff 5e 5f b8 fe ff ff 3f bb 0a 00 00 00 e8 b7 14 c4 ff 8b 15
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171212132148.8124-1-chris@chris-wilson.co.ukReviewed-by: default avatarJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      d7dc4131
    • Chris Wilson's avatar
      drm/i915/fence: Use rcu to defer freeing of irq_work · 7d622351
      Chris Wilson authored
      It is illegal to perform an immediate free of the struct irq_work from
      inside the irq_work callback (as irq_work_run_list modifies work->flags
      after execution of the work->func()). As we use the irq_work to
      coordinate the freeing of the callback from two different softirq paths,
      we need to defer the kfree from inside our irq_work callback, for which
      we can use kfree_rcu.
      
      Fixes: 81c0ed21 ("drm/i915/fence: Avoid del_timer_sync() from inside a timer")
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Reviewed-by: default avatarTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171213094802.28243-1-chris@chris-wilson.co.uk
      7d622351
  6. 12 Dec, 2017 14 commits
  7. 11 Dec, 2017 2 commits
    • Chris Wilson's avatar
      drm/i915: Only report a wakeup if the waiter was truly asleep · b92326a0
      Chris Wilson authored
      If we attempt to wake up a waiter, who is currently checking the seqno
      it will be in the TASK_INTERRUPTIBLE state and ttwu will report success.
      However, it is actually awake and functioning -- so delay reporting the
      actual wake up until it sleeps. This fixes some spurious claims of
      missed_breadcrumbs when running under heavy load; i.e. sufficient load to
      preempt away the newly woken waiter before they complete their checks.
      However, it does so at the cost of a rare false negative; where the
      waiter changes between the check and ttwu -- the only way to fix that
      would be to extend the reporting from ttwu where the check could be done
      atomically.
      
      v2: Defend against !CONFIG_SMP
      v3: Don't filter out calls to wake_up_process
      v4: Drop risky microoptimisation to skip wakeups
      
      Testcase: igt/drv_missed_irq # sanity check we do detect missed_breadcrumb()
      Testcase: igt/gem_concurrent_blit # for generating false positives
      References: https://bugs.freedesktop.org/show_bug.cgi?id=100007Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171209124710.1606-1-chris@chris-wilson.co.ukReviewed-by: default avatarTvrtko Ursulin <tvrtko.ursulin@intel.com>
      b92326a0
    • Chris Wilson's avatar
      drm/i915: Stop listening to request resubmission from the signaler kthread · 776bc27f
      Chris Wilson authored
      The intent here was that we would be listening to
      i915_gem_request_unsubmit in order to cancel the signaler quickly and
      release the reference on the request. Cancelling the signaler is done
      directly via intel_engine_cancel_signaling (called from unsubmit), but
      that does not directly wake up the signaling thread, and neither does
      setting the request->global_seqno back to zero wake up listeners to the
      request->execute waitqueue. So the only time that listening to the
      request->execute waitqueue would wake up the signaling kthread would be
      on the request resubmission, during which time we would already receive
      wake ups from rejoining the global breadcrumbs wait rbtree.
      
      Trying to wake up to release the request remains an issue. If the
      signaling was cancelled and no other request required signaling, then it
      is possible for us to shutdown with the reference on the request still
      held. To ensure that we do not try to shutdown, leaking that request, we
      kick the signaling threads whenever we disarm the breadcrumbs, i.e. on
      parking the engine when idle.
      
      v2: We do need to be sure to release the last reference on stopping the
      kthread; asserting that it has been dropped already is insufficient.
      
      Fixes: d6a2289d ("drm/i915: Remove the preempted request from the execution queue")
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
      Cc: Michał Winiarski <michal.winiarski@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171208121033.5236-1-chris@chris-wilson.co.ukAcked-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      Reviewed-by: default avatarTvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
      776bc27f