1. 22 Dec, 2017 3 commits
  2. 18 Dec, 2017 5 commits
    • Weinan Li's avatar
      drm/i915/gvt: load host render mocs once in mocs switch · b05b3397
      Weinan Li authored
      Load host render mocs registers once for delta update of mocs switch, it
      reduces mmio read times obviously, then brings performance improvement
      during multi-vms switch.
      Signed-off-by: default avatarWeinan Li <weinan.z.li@intel.com>
      Signed-off-by: default avatarZhenyu Wang <zhenyuw@linux.intel.com>
      b05b3397
    • Weinan Li's avatar
      drm/i915/gvt: refine mocs save restore policy · f402f2d6
      Weinan Li authored
      Save and restore the mocs regs of one VM in GVT-g burning too much CPU
      utilization. Add LRI command scan to monitor the change of mocs registers,
      save the state in vreg, and use delta update policy to restore them.
      It can obviously reduce the MMIO r/w count, and improve the performance
      of context switch.
      Signed-off-by: default avatarWeinan Li <weinan.z.li@intel.com>
      Signed-off-by: default avatarZhenyu Wang <zhenyuw@linux.intel.com>
      f402f2d6
    • Weinan Li's avatar
      drm/i915/gvt: optimize for vGPU mmio switch · e47107ad
      Weinan Li authored
      Now mmio switch between vGPUs need to switch to host first then to expected
      vGPU, it waste one time mmio save/restore. r/w mmio usually is
      time-consuming, and there are so many mocs registers need to save/restore
      during vGPU switch. Combine the switch_to_host and switch_to_vgpu can
      reduce 1 time mmio save/restore, it will reduce the CPU utilization and
      performance while there is multi VMs with heavy work load.
      Signed-off-by: default avatarWeinan Li <weinan.z.li@intel.com>
      Signed-off-by: default avatarZhenyu Wang <zhenyuw@linux.intel.com>
      e47107ad
    • Weinan Li's avatar
      drm/i915/gvt: refine trace_render_mmio · dc5718f4
      Weinan Li authored
      Refine trace_render_mmio to show the vm id before and after vgpu switch,
      tag host id as '0', this patch will be used in the future patch for refine
      mocs switch policy.
      Signed-off-by: default avatarWeinan Li <weinan.z.li@intel.com>
      Signed-off-by: default avatarZhenyu Wang <zhenyuw@linux.intel.com>
      dc5718f4
    • Zhenyu Wang's avatar
      Merge tag 'drm-intel-next-2017-12-14' into gvt-next · 2ef6765c
      Zhenyu Wang authored
      - Fix documentation build issues (Randy, Markus)
      - Fix timestamp frequency calculation for perf on CNL (Lionel)
      - New DMC firmware for Skylake (Anusha)
      - GTT flush fixes and other GGTT write track and refactors (Chris)
      - Taint kernel when GPU reset fails (Chris)
      - Display workarounds organization (Lucas)
      - GuC and HuC initialization clean-up and fixes (Michal)
      - Other fixes around GuC submission (Michal)
      - Execlist clean-ups like caching ELSP reg offset and improving log readability (Chri\
      s)
      - Many other improvements on our logs and dumps (Chris)
      - Restore GT performance in headless mode with DMC loaded (Tvrtko)
      - Stop updating legacy fb parameters since FBC is not using anymore (Daniel)
      - More selftest improvements (Chris)
      - Preemption fixes and improvements (Chris)
      - x86/early-quirks improvements for Intel graphics stolen memory. (Joonas, Matthew)
      - Other improvements on Stolen Memory code to be resource centric. (Matthew)
      - Improvements and fixes on fence allocation/release (Chris).
      
      GVT:
      
      - fixes for two coverity scan errors (Colin)
      - mmio switch code refine (Changbin)
      - more virtual display dmabuf fixes (Tina/Gustavo)
      - misc cleanups (Pei)
      - VFIO mdev display dmabuf interface and gvt support (Tina)
      - VFIO mdev opregion support/fixes (Tina/Xiong/Chris)
      - workload scheduling optimization (Changbin)
      - preemption fix and temporal workaround (Zhenyu)
      - and misc fixes after refactor (Chris)
      2ef6765c
  3. 14 Dec, 2017 12 commits
  4. 13 Dec, 2017 6 commits
    • Chris Wilson's avatar
      drm/i915: Unwind i915_gem_init() failure · 6ca9a2be
      Chris Wilson authored
      Since Michal introduced new user controllable errors other than -EIO
      during i915_gem_init(), we need to actually unwind on the error path as
      we have to abort the module load (and we expect to do so cleanly!).
      
      As we now teardown key state and then mark the driver as wedged (on
      EIO), we have to be careful to not allow ourselves to resume and
      unwedge, thus attempting to use the uninitialised driver.
      
      v2: Try not to free driver state for the suppressed EIO
      v3: Use load-fault-injection to test both error/recovery paths.
      
      References: 8620eb1d ("drm/i915/uc: Don't use -EIO to report missing firmware")
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
      Reviewed-by: default avatarMichał Winiarski <michal.winiarski@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171213134347.4608-1-chris@chris-wilson.co.uk
      6ca9a2be
    • Chris Wilson's avatar
      drm/i915: Ratelimit request allocation under oom · 31c70f97
      Chris Wilson authored
      If we fail to allocate a request, we can reap the outstanding requests
      and push them to the request's slab's freelist before trying again. This
      forces us to ratelimit malicious clients that tie up all of the system
      resources in requests, instead of causing a system-wide oom.
      
      Testcase: igt/gem_shrink/execbuf1
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Reviewed-by: default avatarJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171212180652.22061-3-chris@chris-wilson.co.uk
      31c70f97
    • Chris Wilson's avatar
      drm/i915: Allow fence allocations to fail · 2abe2f84
      Chris Wilson authored
      If a fence allocation fails in a blocking context, we will sleep on the
      fence as a last resort. We can therefore allow ourselves to fail and
      sleep on the fence instead of triggering a system-wide oom. This allows
      us to throttle malicious clients that are consuming lots of system
      resources by capping the amount of memory used by fences.
      
      Testcase: igt/gem_shrink/execbufX
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Reviewed-by: default avatarJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171212180652.22061-2-chris@chris-wilson.co.uk
      2abe2f84
    • Chris Wilson's avatar
      drm/i915: Mark up potential allocation paths within i915_sw_fence as might_sleep · e30a7581
      Chris Wilson authored
      As kmalloc is allowed to block (if given the right flags), mark up the
      two i915_sw_fence routines that may call kmalloc as potential sleeping
      routines.
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Reviewed-by: default avatarJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171212180652.22061-1-chris@chris-wilson.co.uk
      e30a7581
    • Chris Wilson's avatar
      drm/i915: Don't check #active_requests from i915_gem_wait_for_idle() · d7dc4131
      Chris Wilson authored
      i915_gem_wait_for_idle() is called from inside the shrinker, to ensure
      that we drain the last resources from the GPU in dire circumstances (OOM).
      As we may allocate whilst building a request, it is then possible to hit
      the shrinker with a request under construction, and so we must account
      for the incomplete request whilst waiting. In particular, we
      preincrement (in reserve_engine) the i915->gt.active_requests counter
      and mark the GPU as busy, therefore we can not use that counter for
      shortcircuiting the wait-for-idle.
      
      [  950.859024] GEM_BUG_ON(i915->gt.active_requests)
      [  950.859041] WARNING: CPU: 2 PID: 2178 at drivers/gpu/drm/i915/i915_gem.c:3615 i915_gem_wait_for_idle.part.56+0x166/0x4e0
      [  950.859041] Modules linked in: ccm tun fuse nf_conntrack_netbios_ns nf_conntrack_broadcast ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_mangle iptable_security iptable_raw arc4 iwldvm mac80211 snd_hda_codec_hdmi snd_hda_codec_idt snd_hda_codec_generic snd_hda_intel snd_hda_codec btusb snd_hda_core btrtl btbcm iwlwifi snd_hwdep btintel bluetooth snd_seq snd_seq_device snd_pcm ecdh_generic x86_pkg_temp_thermal tpm_infineon coretemp tpm_tis crc32_pclmul wmi_bmof crc32c_intel iTCO_wdt hp_wmi snd_timer iTCO_vendor_support sparse_keymap tpm_tis_core mei_me cfg80211
      [  950.859082]  snd joydev tpm mei rfkill pcspkr wmi soundcore lpc_ich hp_accel lis3lv02d input_polldev binfmt_misc e1000e ptp serio_raw pps_core
      [  950.859094] CPU: 2 PID: 2178 Comm: gem_exec_nop Tainted: G     U           4.15.0-rc2+ #900
      [  950.859102] Hardware name: Hewlett-Packard HP ProBook 6360b/1620, BIOS 68SCF Ver. B.42 12/29/2010
      [  950.859107] task: c5119cb4 task.stack: f3ccb8d8
      [  950.859112] EIP: i915_gem_wait_for_idle.part.56+0x166/0x4e0
      [  950.859113] EFLAGS: 00010296 CPU: 2
      [  950.859114] EAX: 00000024 EBX: f36c1888 ECX: f777a044 EDX: 00000007
      [  950.859115] ESI: f36c1888 EDI: edd53958 EBP: edd53970 ESP: edd53938
      [  950.859116]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
      [  950.859117] CR0: 80050033 CR2: b7f39000 CR3: 2f2b3000 CR4: 000406d0
      [  950.859118] Call Trace:
      [  950.859125]  ? drm_printk+0x70/0x70
      [  950.859129]  i915_gem_wait_for_idle+0x18/0x30
      [  950.859133]  i915_gem_shrink+0x360/0x410
      [  950.859138]  ? vmpressure+0xa8/0xf0
      [  950.859142]  ? ktime_get+0x4a/0x100
      [  950.859147]  i915_gem_shrink_all+0x21/0x40
      [  950.859151]  i915_gem_shrinker_oom+0x23/0x130
      [  950.859156]  notifier_call_chain+0x4e/0x70
      [  950.859160]  __blocking_notifier_call_chain+0x2f/0x60
      [  950.859164]  blocking_notifier_call_chain+0x11/0x20
      [  950.859169]  out_of_memory+0x207/0x280
      [  950.859174]  __alloc_pages_nodemask+0xd47/0xe60
      [  950.859179]  new_slab+0x32d/0x450
      [  950.859183]  ___slab_alloc.constprop.81+0x358/0x4e0
      [  950.859189]  ? i915_sw_fence_await_dma_fence+0x53/0x160
      [  950.859193]  ? __slab_free+0x1fe/0x310
      [  950.859197]  ? native_sched_clock+0x1e/0xc0
      [  950.859201]  ? i915_gem_request_alloc+0xcf/0x510
      [  950.859205]  ? sched_clock+0x9/0x10
      [  950.859209]  __slab_alloc.constprop.80+0x29/0x40
      [  950.859212]  ? __slab_alloc.constprop.80+0x29/0x40
      [  950.859216]  kmem_cache_alloc_trace+0x160/0x1a0
      [  950.859220]  ? i915_sw_fence_await_dma_fence+0x53/0x160
      [  950.859224]  i915_sw_fence_await_dma_fence+0x53/0x160
      [  950.859229]  i915_gem_request_await_dma_fence+0x1eb/0x390
      [  950.859233]  i915_gem_request_await_object+0xee/0x230
      [  950.859239]  i915_gem_do_execbuffer+0xc16/0x1200
      [  950.859246]  ? irqtime_account_irq+0x3e/0xc0
      [  950.859251]  ? irq_exit+0x4f/0xb0
      [  950.859257]  ? smp_apic_timer_interrupt+0x5f/0x110
      [  950.859261]  ? apic_timer_interrupt+0x35/0x3c
      [  950.859266]  i915_gem_execbuffer2_ioctl+0x212/0x440
      [  950.859270]  ? apic_timer_interrupt+0x35/0x3c
      [  950.859274]  ? i915_gem_do_execbuffer+0x1200/0x1200
      [  950.859279]  ? insn_get_seg_base+0x1b/0x50
      [  950.859283]  ? i915_gem_do_execbuffer+0x1200/0x1200
      [  950.859287]  drm_ioctl_kernel+0x51/0xa0
      [  950.859291]  drm_ioctl+0x2a3/0x350
      [  950.859294]  ? i915_gem_do_execbuffer+0x1200/0x1200
      [  950.859300]  ? sched_clock+0x9/0x10
      [  950.859303]  ? drm_getunique+0x70/0x70
      [  950.859308]  do_vfs_ioctl+0x7d/0x640
      [  950.859311]  ? native_sched_clock+0x1e/0xc0
      [  950.859315]  ? sched_clock+0x9/0x10
      [  950.859319]  ? sched_clock_cpu+0x13/0x120
      [  950.859323]  SyS_ioctl+0x4e/0x80
      [  950.859326]  do_fast_syscall_32+0x75/0x250
      [  950.859331]  ? irq_exit+0x4f/0xb0
      [  950.859334]  entry_SYSENTER_32+0x47/0x71
      [  950.859338] EIP: 0xb7f81d11
      [  950.859339] EFLAGS: 00000296 CPU: 2
      [  950.859340] EAX: ffffffda EBX: 00000003 ECX: 40406469 EDX: bfde4c20
      [  950.859340] ESI: 00000003 EDI: 40406469 EBP: 00000003 ESP: bfde4b38
      [  950.859341]  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b
      [  950.859343] Code: e8 30 60 01 00 83 c4 10 83 c3 04 39 f3 75 e0 8b 45 d8 8b 80 14 37 00 00 85 c0 74 13 68 dd 33 e4 c0 68 49 6f e3 c0 e8 4a 55 be ff <0f> ff 5e 5f b8 fe ff ff 3f bb 0a 00 00 00 e8 b7 14 c4 ff 8b 15
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171212132148.8124-1-chris@chris-wilson.co.ukReviewed-by: default avatarJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      d7dc4131
    • Chris Wilson's avatar
      drm/i915/fence: Use rcu to defer freeing of irq_work · 7d622351
      Chris Wilson authored
      It is illegal to perform an immediate free of the struct irq_work from
      inside the irq_work callback (as irq_work_run_list modifies work->flags
      after execution of the work->func()). As we use the irq_work to
      coordinate the freeing of the callback from two different softirq paths,
      we need to defer the kfree from inside our irq_work callback, for which
      we can use kfree_rcu.
      
      Fixes: 81c0ed21 ("drm/i915/fence: Avoid del_timer_sync() from inside a timer")
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Reviewed-by: default avatarTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171213094802.28243-1-chris@chris-wilson.co.uk
      7d622351
  5. 12 Dec, 2017 14 commits