1. 18 Jul, 2024 5 commits
  2. 17 Jul, 2024 3 commits
    • drm/xe: Don't suspend device upon wedge · 452bca0e
      Matthew Brost authored
      When wedging a device, we shouldn't suspend it, as the state needed
      for debugging would be lost.
      
      Suspending also appears not to work: the stack trace below pops up when
      trying to resume a wedged device:
      
      [  304.245044] INFO: task cat:12115 blocked for more than 151 seconds.
      [  304.251333]       Tainted: G        W          6.10.0-rc7-xe+ #3518
      [  304.257617] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [  304.265459] task:cat             state:D stack:13384 pid:12115 tgid:12115 ppid:3986   flags:0x00000006
      [  304.265465] Call Trace:
      [  304.265467]  <TASK>
      [  304.265469]  __schedule+0x3c4/0xdf0
      [  304.265478]  schedule+0x3c/0x140
      [  304.265481]  rpm_resume+0x1cc/0x740
      [  304.265484]  ? __pfx_autoremove_wake_function+0x10/0x10
      [  304.265489]  __pm_runtime_resume+0x49/0x80
      [  304.265494]  guc_info+0x6b/0xb0 [xe]
      [  304.265538]  ? __pfx___drm_printfn_seq_file+0x10/0x10
      [  304.265541]  ? __pfx___drm_puts_seq_file+0x10/0x10
      [  304.265545]  seq_read_iter+0x111/0x4c0
      [  304.265551]  seq_read+0xfc/0x140
      [  304.265556]  full_proxy_read+0x58/0x80
      [  304.265560]  vfs_read+0xa7/0x360
      [  304.265563]  ? find_held_lock+0x2b/0x80
      [  304.265568]  ksys_read+0x64/0xe0
      [  304.265571]  do_syscall_64+0x68/0x140
      [  304.265575]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
      [  304.265578] RIP: 0033:0x7f4254d14992
      [  304.265580] RSP: 002b:00007ffc558666f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
      [  304.265583] RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007f4254d14992
      [  304.265584] RDX: 0000000000020000 RSI: 00007f4254ebb000 RDI: 0000000000000003
      [  304.265586] RBP: 00007f4254ebb000 R08: 00007f4254eba010 R09: 00007f4254eba010
      [  304.265587] R10: 0000000000000022 R11: 0000000000000246 R12: 0000000000022000
      [  304.265588] R13: 0000000000000003 R14: 0000000000020000 R15: 0000000000020000
      [  304.265593]  </TASK>
      [  304.265594]
                     Showing all locks held in the system:
      [  304.265598] 1 lock held by khungtaskd/57:
      [  304.265599]  #0: ffffffff8273b860 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x36/0x1c0
      [  304.265607] 3 locks held by kworker/6:1/90:
      [  304.265610] 1 lock held by in:imklog/547:
      [  304.265611]  #0: ffff88810498cd88 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0x76/0xc0
      [  304.265620] 1 lock held by dmesg/1310:
      
      v2: Drop local 'err' variable (Jonathan)
      
      Fixes: 8ed9aaae ("drm/xe: Force wedged state and block GT reset upon any GPU hang")
      Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
      Signed-off-by: Matthew Brost <matthew.brost@intel.com>
      Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20240716063902.1390130-2-matthew.brost@intel.com
    • drm/xe: Wedge the entire device · 7dbe8af1
      Matthew Brost authored
      Wedge the entire device, not just the GT that may have triggered the
      wedge. To implement this, clean up the layering so that
      xe_device_declare_wedged() calls into the lower layers (GT) to ensure
      the entire device is wedged.
      
      While we are here, also signal any pending GT TLB invalidations upon
      wedging the device.
      
      Lastly, short-circuit the reset wait if the device is wedged.
      
      v2:
       - Short circuit reset wait if device is wedged (Local testing)
      
      Fixes: 8ed9aaae ("drm/xe: Force wedged state and block GT reset upon any GPU hang")
      Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
      Signed-off-by: Matthew Brost <matthew.brost@intel.com>
      Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20240716063902.1390130-1-matthew.brost@intel.com
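The layering described in the "Wedge the entire device" commit can be illustrated with a small user-space model. This is only a sketch under stated assumptions: every `toy_*` name, the fixed GT count, and the integer standing in for pending TLB invalidation waiters are hypothetical and are not the real xe structures or functions. It shows a device-level wedge fanning out to each GT, pending invalidations being released, and the reset wait being skipped once the device is wedged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_GTS 2 /* hypothetical: number of GTs in this toy device */

/* Hypothetical stand-in for a GT; not the real xe structure. */
struct toy_gt {
    bool wedged;
    int pending_tlb_invals; /* models waiters that must be signalled */
};

/* Hypothetical stand-in for the device; not the real xe structure. */
struct toy_device {
    bool wedged;
    struct toy_gt gts[TOY_MAX_GTS];
};

/* Lower layer: wedge one GT and release anything waiting on it. */
static void toy_gt_declare_wedged(struct toy_gt *gt)
{
    gt->wedged = true;
    gt->pending_tlb_invals = 0; /* models signalling pending invalidations */
}

/* Top layer: wedging the device fans out to every GT, not just one. */
static void toy_device_declare_wedged(struct toy_device *dev)
{
    assert(dev != NULL);
    if (dev->wedged)
        return; /* already wedged, nothing more to do */
    dev->wedged = true;
    for (size_t i = 0; i < TOY_MAX_GTS; i++)
        toy_gt_declare_wedged(&dev->gts[i]);
}

/* Returns true only if the caller should wait for an in-flight reset. */
static bool toy_should_wait_for_reset(const struct toy_device *dev,
                                      bool reset_in_flight)
{
    if (dev->wedged)
        return false; /* short-circuit: a wedged device will not reset */
    return reset_in_flight;
}
```

The point of routing everything through the device-level entry point is that a caller never has to know which GT triggered the wedge; the toy model makes that explicit by giving the GT-level helper no way to be reached except through the loop.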
    • drm/xe/gsc: add Battlemage support · e02cea83
      Alexander Usyskin authored
      Add the heci_cscfi support bit for the new CSC engine type.
      It has the same MMIO offsets as the DG2 GSC but a separate interrupt flow.
      Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
      Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
      Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20240708084906.2827024-1-alexander.usyskin@intel.com
  3. 15 Jul, 2024 1 commit
  4. 12 Jul, 2024 9 commits
  5. 11 Jul, 2024 2 commits
  6. 10 Jul, 2024 2 commits
  7. 09 Jul, 2024 2 commits
  8. 08 Jul, 2024 2 commits
  9. 06 Jul, 2024 1 commit
  10. 05 Jul, 2024 2 commits
  11. 04 Jul, 2024 10 commits
  12. 03 Jul, 2024 1 commit