1. 29 Sep, 2022 6 commits
  2. 28 Sep, 2022 2 commits
  3. 27 Sep, 2022 4 commits
  4. 26 Sep, 2022 2 commits
  5. 23 Sep, 2022 2 commits
  6. 22 Sep, 2022 4 commits
  7. 21 Sep, 2022 2 commits
  8. 20 Sep, 2022 1 commit
  9. 19 Sep, 2022 3 commits
    • drm/i915/gem: Really move i915_gem_context.link under ref protection · ad3aa7c3
      Chris Wilson authored
      i915_perf assumes that it can use the i915_gem_context reference to
      protect its i915->gem.contexts.list iteration. However, this requires
      that we do not remove the context from the list until after we drop the
      final reference and release the struct. If, as currently, we remove the
      context from the list during context_close(), the link.next pointer may
      be poisoned while we are holding the context reference and cause a GPF:
      
      [ 4070.573157] i915 0000:00:02.0: [drm:i915_perf_open_ioctl [i915]] filtering on ctx_id=0x1fffff ctx_id_mask=0x1fffff
      [ 4070.574881] general protection fault, probably for non-canonical address 0xdead000000000100: 0000 [#1] PREEMPT SMP
      [ 4070.574897] CPU: 1 PID: 284392 Comm: amd_performance Tainted: G            E     5.17.9 #180
      [ 4070.574903] Hardware name: Intel Corporation NUC7i5BNK/NUC7i5BNB, BIOS BNKBL357.86A.0052.2017.0918.1346 09/18/2017
      [ 4070.574907] RIP: 0010:oa_configure_all_contexts.isra.0+0x222/0x350 [i915]
      [ 4070.574982] Code: 08 e8 32 6e 10 e1 4d 8b 6d 50 b8 ff ff ff ff 49 83 ed 50 f0 41 0f c1 04 24 83 f8 01 0f 84 e3 00 00 00 85 c0 0f 8e fa 00 00 00 <49> 8b 45 50 48 8d 70 b0 49 8d 45 50 48 39 44 24 10 0f 85 34 fe ff
      [ 4070.574990] RSP: 0018:ffffc90002077b78 EFLAGS: 00010202
      [ 4070.574995] RAX: 0000000000000002 RBX: 0000000000000002 RCX: 0000000000000000
      [ 4070.575000] RDX: 0000000000000001 RSI: ffffc90002077b20 RDI: ffff88810ddc7c68
      [ 4070.575004] RBP: 0000000000000001 R08: ffff888103242648 R09: fffffffffffffffc
      [ 4070.575008] R10: ffffffff82c50bc0 R11: 0000000000025c80 R12: ffff888101bf1860
      [ 4070.575012] R13: dead0000000000b0 R14: ffffc90002077c04 R15: ffff88810be5cabc
      [ 4070.575016] FS:  00007f1ed50c0780(0000) GS:ffff88885ec80000(0000) knlGS:0000000000000000
      [ 4070.575021] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 4070.575025] CR2: 00007f1ed5590280 CR3: 000000010ef6f005 CR4: 00000000003706e0
      [ 4070.575029] Call Trace:
      [ 4070.575033]  <TASK>
      [ 4070.575037]  lrc_configure_all_contexts+0x13e/0x150 [i915]
      [ 4070.575103]  gen8_enable_metric_set+0x4d/0x90 [i915]
      [ 4070.575164]  i915_perf_open_ioctl+0xbc0/0x1500 [i915]
      [ 4070.575224]  ? asm_common_interrupt+0x1e/0x40
      [ 4070.575232]  ? i915_oa_init_reg_state+0x110/0x110 [i915]
      [ 4070.575290]  drm_ioctl_kernel+0x85/0x110
      [ 4070.575296]  ? update_load_avg+0x5f/0x5e0
      [ 4070.575302]  drm_ioctl+0x1d3/0x370
      [ 4070.575307]  ? i915_oa_init_reg_state+0x110/0x110 [i915]
      [ 4070.575382]  ? gen8_gt_irq_handler+0x46/0x130 [i915]
      [ 4070.575445]  __x64_sys_ioctl+0x3c4/0x8d0
      [ 4070.575451]  ? __do_softirq+0xaa/0x1d2
      [ 4070.575456]  do_syscall_64+0x35/0x80
      [ 4070.575461]  entry_SYSCALL_64_after_hwframe+0x44/0xae
      [ 4070.575467] RIP: 0033:0x7f1ed5c10397
      [ 4070.575471] Code: 3c 1c e8 1c ff ff ff 85 c0 79 87 49 c7 c4 ff ff ff ff 5b 5d 4c 89 e0 41 5c c3 66 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a9 da 0d 00 f7 d8 64 89 01 48
      [ 4070.575478] RSP: 002b:00007ffd65c8d7a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
      [ 4070.575484] RAX: ffffffffffffffda RBX: 0000000000000006 RCX: 00007f1ed5c10397
      [ 4070.575488] RDX: 00007ffd65c8d7c0 RSI: 0000000040106476 RDI: 0000000000000006
      [ 4070.575492] RBP: 00005620972f9c60 R08: 000000000000000a R09: 0000000000000005
      [ 4070.575496] R10: 000000000000000d R11: 0000000000000246 R12: 000000000000000a
      [ 4070.575500] R13: 000000000000000d R14: 0000000000000000 R15: 00007ffd65c8d7c0
      [ 4070.575505]  </TASK>
      [ 4070.575507] Modules linked in: nls_ascii(E) nls_cp437(E) vfat(E) fat(E) i915(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) crct10dif_pclmul(E) crc32_pclmul(E) crc32c_intel(E) aesni_intel(E) crypto_simd(E) intel_gtt(E) cryptd(E) ttm(E) rapl(E) intel_cstate(E) drm_kms_helper(E) cfbfillrect(E) syscopyarea(E) cfbimgblt(E) intel_uncore(E) sysfillrect(E) mei_me(E) sysimgblt(E) i2c_i801(E) fb_sys_fops(E) mei(E) intel_pch_thermal(E) i2c_smbus(E) cfbcopyarea(E) video(E) button(E) efivarfs(E) autofs4(E)
      [ 4070.575549] ---[ end trace 0000000000000000 ]---
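The faulting address 0xdead000000000100 in the trace is the value a poisoning list delete leaves in link.next on x86-64. The following is a minimal userspace sketch of that failure mode and of deferring the unlink to the final reference drop. It is not the driver code: struct ctx, ctx_close_buggy(), ctx_close_fixed() and ctx_put() are hypothetical names, and the second poison value is an assumption about the kernel version.

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's doubly linked list node. */
struct list_head { struct list_head *next, *prev; };

/* POISON1 matches the fault address in the trace above (64-bit only).
 * POISON2 is an assumed second, distinct poison value. */
#define POISON1 ((struct list_head *)(uintptr_t)0xdead000000000100ULL)
#define POISON2 ((struct list_head *)(uintptr_t)0xdead000000000122ULL)

static void list_init(struct list_head *h) { h->next = h->prev = h; }

static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

/* Unlink the node and poison its pointers, as a poisoning list delete does. */
static void list_del_poison(struct list_head *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->next = POISON1;
	n->prev = POISON2;
}

/* Hypothetical context: a reference count plus its link on a global list. */
struct ctx { int refs; int closed; struct list_head link; };

/* Buggy ordering: unlink at close time, while a walker may still hold a
 * reference and later follow link.next. */
static void ctx_close_buggy(struct ctx *c)
{
	c->closed = 1;
	list_del_poison(&c->link);
}

/* Deferred ordering: close only marks the context. */
static void ctx_close_fixed(struct ctx *c) { c->closed = 1; }

/* The unlink happens only when the final reference is dropped. */
static void ctx_put(struct ctx *c)
{
	if (--c->refs == 0)
		list_del_poison(&c->link);
}

static int next_is_poisoned(const struct ctx *c)
{
	return c->link.next == POISON1;
}
```

With the deferred ordering, a walker that holds a reference can still read a valid link.next after the context has been closed; the pointer is poisoned only once the last reference is gone.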
      
      v3: fix incorrect syntax of spin_lock() replacing spin_lock_irqsave()
      
      v2: irqsave not required in a worker, neither conversion to irq safe
          elsewhere (Tvrtko),
        - perf: it's safe to call gen8_configure_context() even if context has
          been closed, no need to check,
        - drop unrelated cleanup (Andi, Tvrtko)
      Reported-by: Mark Janes <mark.janes@intel.com>
      Closes: https://gitlab.freedesktop.org/drm/intel/issues/6222
      References: a4e7ccda ("drm/i915: Move context management under GEM")
      Fixes: f8246cf4 ("drm/i915/gem: Drop free_work for GEM contexts")
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
      Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: <stable@vger.kernel.org> # v5.12+
      Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20220916092403.201355-3-janusz.krzysztofik@linux.intel.com
    • drm/i915/gem: Flush contexts on driver release · 1cec3444
      Janusz Krzysztofik authored
      Because i915_perf assumes that it can use the i915_gem_context reference
      to protect its i915->gem.contexts.list iteration, we need to defer removal
      of the context from the list until the last reference to the context is put.
      However, if we delegate that task to a worker for
      i915_gem_context_release_work(), there is a risk of triggering a kernel
      warning about a non-empty contexts list at driver release time, unless
      that work is flushed first. Unfortunately, it is not flushed on driver
      release.  Fix it.
      
      Instead of additionally calling flush_workqueue(), either directly or via
      a new dedicated wrapper around it, replace the last call to
      i915_gem_drain_freed_objects() with the existing i915_gem_drain_workqueue(),
      which performs both tasks.
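
The ordering problem described above can be illustrated with a small userspace model. This is only a sketch under simplifying assumptions: the queue is a fixed array run synchronously, and struct workqueue, queue_work_sketch(), flush_pending() and driver_release() are hypothetical stand-ins, not kernel or i915 functions.

```c
#include <stddef.h>

#define MAX_WORK 8

/* Toy deferred-work queue: items are stored and only run when flushed. */
struct workqueue {
	void (*fn[MAX_WORK])(void *);
	void *arg[MAX_WORK];
	size_t n;
};

/* Hypothetical driver state: a work queue and a count of listed contexts. */
struct driver {
	struct workqueue wq;
	int contexts_on_list;
};

static int queue_work_sketch(struct workqueue *wq, void (*fn)(void *), void *arg)
{
	if (wq->n == MAX_WORK)
		return -1;
	wq->fn[wq->n] = fn;
	wq->arg[wq->n] = arg;
	wq->n++;
	return 0;
}

/* Run every pending item, then mark the queue empty. */
static void flush_pending(struct workqueue *wq)
{
	for (size_t i = 0; i < wq->n; i++)
		wq->fn[i](wq->arg[i]);
	wq->n = 0;
}

/* Deferred release: only here does the context leave the list. */
static void release_work(void *arg)
{
	struct driver *d = arg;

	d->contexts_on_list--;
}

/* Dropping the last reference merely schedules the release. */
static int context_put_last(struct driver *d)
{
	return queue_work_sketch(&d->wq, release_work, d);
}

/* Returns nonzero if the "contexts list not empty" warning would fire. */
static int driver_release(struct driver *d, int flush_first)
{
	if (flush_first)
		flush_pending(&d->wq);
	return d->contexts_on_list != 0;
}
```

Calling driver_release() without flushing reports a non-empty list while the release is still queued; flushing first lets the queued release run, so the check passes.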
      
      Fixes: 75eefd82 ("drm/i915: Release i915_gem_context from a worker")
      Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
      Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
      Cc: stable@kernel.org # v5.16+
      Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20220916092403.201355-2-janusz.krzysztofik@linux.intel.com
    • drm/i915/mtl: Add MTL forcewake support · 14f2f9bf
      Matt Roper authored
      MTL has separate forcewake tables for the primary/render GT and the
      media GT; each GT's intel_uncore will use a separate forcewake table and
      should only initialize the domains that are relevant to that GT.  The GT
      ack register also moves to a new location of (GSI base + 0xDFC) on this
      platform.
      
      Note that although our uncore handlers take care of transparently
      redirecting all register accesses in the media GT's GSI range to their
      new offset at 0x380000, the forcewake ranges listed in the table should
      use the final, post-translation offsets.
      
      NOTE:  There are two ranges in the media IP that have multicast
      registers where the two register instances reside in different power
      wells (either VD0 or VD2).  We don't have an easy way to deal with this
      today (and in fact we don't even access these register ranges in the
      driver today), so for now we just mark those ranges as FORCEWAKE_ALL
      which will cause all of the media power wells to be grabbed, ensuring
      proper operation.  If we start reading/writing in those ranges in the
      future, we can re-visit whether it's worth adding extra steering
      complexity into our forcewake support.
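
As a rough illustration of a per-GT forcewake table keyed by post-translation offsets, here is a minimal sketch. Only the 0x380000 media GSI offset and the 0xDFC ack offset come from the text above. The size of the GSI range, the domain bits, and every range in the table are assumptions made up for the example, and media_translate() and media_fw_domains() are hypothetical helpers.

```c
#include <stddef.h>
#include <stdint.h>

#define MEDIA_GSI_BASE 0x380000u /* from the commit message */
#define GT_ACK_OFFSET  0xDFCu    /* relative to the GSI base, from the commit message */
#define GSI_RANGE_SIZE 0x40000u  /* assumption: size of the redirected GSI range */

/* Illustrative domain bits for the media GT only. */
enum {
	FW_GT  = 1u << 0,
	FW_VD0 = 1u << 1,
	FW_VD2 = 1u << 2,
	FW_ALL = FW_GT | FW_VD0 | FW_VD2,
};

struct fw_range {
	uint32_t start, end; /* inclusive, post-translation offsets */
	unsigned int domains;
};

/* Hypothetical table; the real ranges are defined by the hardware spec. */
static const struct fw_range media_table[] = {
	{ 0x1c0000, 0x1c3fff, FW_VD0 },
	{ 0x1c8000, 0x1cbfff, FW_ALL }, /* multicast range: grab every well */
	{ 0x380000, 0x380fff, FW_GT },
};

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Redirect an access in the GSI range to the media GT's offset. */
static uint32_t media_translate(uint32_t offset)
{
	return offset < GSI_RANGE_SIZE ? offset + MEDIA_GSI_BASE : offset;
}

/* Look up the domains for an already translated offset; 0 means none. */
static unsigned int media_fw_domains(uint32_t offset)
{
	for (size_t i = 0; i < ARRAY_LEN(media_table); i++) {
		if (offset >= media_table[i].start && offset <= media_table[i].end)
			return media_table[i].domains;
	}
	return 0;
}
```

Because the table holds final offsets, a lookup with an untranslated GSI offset finds nothing, while the same offset passed through media_translate() first lands in the 0x380000 range.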
      
      Bspec: 67788, 67789, 52077
      Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
      Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
      Reviewed-by: Harish Chegondi <harish.chegondi@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20220910001631.1986601-1-matthew.d.roper@intel.com
  10. 16 Sep, 2022 9 commits
  11. 15 Sep, 2022 3 commits
  12. 14 Sep, 2022 2 commits
    • drm/i915/uc: Fix issues with overriding firmware files · 5d53f4c2
      John Harrison authored
      The earlier update to support reduced versioning of firmware files
      introduced an issue with the firmware override module parameter. A
      self test would specify an invalid file name (invalid meaning not in
      the table) both with and without setting the override flag. The
      *non-override* case would cause an infinite loop, i.e. a situation
      that is impossible to hit outside of the selftest because either the
      file name has come from the table in the first place or it came from
      an override. However, the override case was also broken in that it
      would bypass some of the later processing.
      
      The first fix is to update the scanning loop code so that if an
      invalid file is passed in, it will exit rather than loop forever. So
      if the impossible situation did somehow occur in the future, it
      wouldn't be such a big problem.
      
      The second flips the logic on the override early exit to be negative
      rather than positive. That way if an explicit override has been set,
      then it won't try to scan for backup options (because there is no
      point anyway - the user wanted X and if X is not available, that's
      their problem). It also means that it won't skip code that still needs
      to be run once a valid firmware file has been selected.
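
The two fixes can be sketched roughly as follows. This is an illustrative model, not the driver's code: the file names, fw_table, next_fallback(), select_firmware() and the only_v69() availability predicate are all hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table of known firmware files, newest first. */
static const char *const fw_table[] = {
	"fw_v70.bin",
	"fw_v69.bin",
	"fw_v62.bin",
};

#define FW_COUNT (sizeof(fw_table) / sizeof(fw_table[0]))

/* Next older table entry after 'current'. Returns NULL when 'current' is
 * the last entry or is not in the table, so a caller can never spin on an
 * unknown name. */
static const char *next_fallback(const char *current)
{
	for (size_t i = 0; i < FW_COUNT; i++) {
		if (strcmp(fw_table[i], current) == 0)
			return i + 1 < FW_COUNT ? fw_table[i + 1] : NULL;
	}
	return NULL;
}

struct fw_choice {
	const char *path; /* NULL if nothing usable was found */
	int finalized;    /* set once the common post-selection step has run */
};

/* Example predicate: pretend only fw_v69.bin exists on disk. */
static int only_v69(const char *path)
{
	return strcmp(path, "fw_v69.bin") == 0;
}

static struct fw_choice select_firmware(const char *wanted, int is_override,
					int (*available)(const char *))
{
	struct fw_choice choice = { NULL, 0 };
	const char *cur = wanted;

	/* Fall back only when the user did not ask for a specific file. */
	while (cur && !available(cur))
		cur = is_override ? NULL : next_fallback(cur);

	/* The post-selection step runs for override and table files alike. */
	if (cur) {
		choice.path = cur;
		choice.finalized = 1;
	}
	return choice;
}
```

The loop ends because each step either moves strictly forward in the table or yields NULL, and an override that is present on disk still reaches the finalization step.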
      
      v2: Also remove ANSI colour codes that accidentally got left in an
      error message in the original patch.
      
      Fixes: 665ae9c9 ("drm/i915/uc: Support for version reduced and multiple firmware files")
      Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
      Cc: Jani Nikula <jani.nikula@linux.intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
      Cc: Matthew Brost <matthew.brost@intel.com>
      Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
      Cc: Matthew Auld <matthew.auld@intel.com>
      Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
      Cc: Matt Roper <matthew.d.roper@intel.com>
      Cc: Lucas De Marchi <lucas.demarchi@intel.com>
      Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
      Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
      Cc: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
      Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
      Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20220914005821.3702446-2-John.C.Harrison@Intel.com
    • drm/i915/dgfx: Release mmap on rpm suspend · ad74457a
      Anshuman Gupta authored
      Release all mmap mappings of lmem objects that are associated with a
      userfault, so that any access to those memory mappings while the PCIe
      function is in D3hot raises a userfault.

      Runtime resume the dGPU (when the GEM object lies in lmem).
      This transitions the dGPU graphics function to the D0 state
      if it was in D3, so that the mmap memory mappings can be
      accessed.
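
The suspend and fault interplay described above can be modelled in a few lines. This is a simplified userspace sketch, not the i915 implementation: struct dev, struct obj, handle_fault() and runtime_suspend() are hypothetical, locking is omitted, and the fixed-size array stands in for lmem_userfault_list.

```c
#include <stddef.h>

#define MAX_OBJS 4

/* Hypothetical lmem object: whether it is CPU-mapped and whether it is
 * already on the userfault list (used as a 0/1 flag). */
struct obj {
	int mapped;
	int userfault_count;
};

/* Hypothetical device: power state plus the list of faulted-in objects. */
struct dev {
	int awake; /* 1 = D0, 0 = suspended */
	struct obj *userfault_list[MAX_OBJS];
	size_t n;
};

/* A CPU access to an unmapped object lands here: wake the device, map the
 * object, and remember it so suspend can revoke the mapping later. */
static int handle_fault(struct dev *d, struct obj *o)
{
	if (!o->userfault_count && d->n == MAX_OBJS)
		return -1; /* no room to track another object */

	d->awake = 1; /* runtime resume */
	o->mapped = 1;
	if (!o->userfault_count) {
		d->userfault_list[d->n++] = o;
		o->userfault_count = 1; /* avoid adding the object twice */
	}
	return 0;
}

/* On runtime suspend, revoke every tracked mapping so that the next access
 * faults again instead of touching a powered-down device. */
static void runtime_suspend(struct dev *d)
{
	for (size_t i = 0; i < d->n; i++) {
		d->userfault_list[i]->mapped = 0;
		d->userfault_list[i]->userfault_count = 0;
	}
	d->n = 0;
	d->awake = 0;
}
```

After runtime_suspend() no object remains mapped, so the first access afterwards goes back through handle_fault() and wakes the device before the mapping is restored.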
      
      v2:
      - Squashes the patches. [Matt Auld]
      - Add adequate locking for lmem_userfault_list addition. [Matt Auld]
      - Reused obj->userfault_count to avoid double addition. [Matt Auld]
      - Added i915_gem_object_lock to check
        i915_gem_object_is_lmem. [Matt Auld]
      
      v3:
      - Use i915_ttm_cpu_maps_iomem. [Matt Auld]
      - Fix 'ret == 0 to ret == VM_FAULT_NOPAGE'. [Matt Auld]
      - Reuse obj->userfault_count as a bool 0 or 1. [Matt Auld]
      - Delete the mmaped obj from lmem_userfault_list in obj
        destruction path. [Matt Auld]
      - Get a wakeref for object destruction path. [Matt Auld]
      - Use intel_wakeref_auto to delay runtime PM. [Matt Auld]
      
      v4:
      - Avoid using mmo offset to get the vma_node. [Matt Auld]
      - Added comment to use the lmem_userfault_lock. [Matt Auld]
      - Get lmem_userfault_lock in i915_gem_object_release_mmap_offset.
        [Matt Auld]
      - Fixed kernel test robot generated warning.
      
      v5:
      - Addressed the cosmetics comments. [Andi]
      - Changed i915_gem_runtime_pm_object_release_mmap_offset() name to
        i915_gem_object_runtime_pm_release_mmap_offset() to be rhythmic.
      
      PCIe Specs 5.3.1.4.1
      
      Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/6331
      Cc: Matthew Auld <matthew.auld@intel.com>
      Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
      Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
      Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20220913152714.16541-3-anshuman.gupta@intel.com