    drm/amdkfd: svm range restore work deadlock when process exit · 6225bb3a
    Philip Yang authored
    kfd_process_notifier_release flushes svm_range_restore_work, which
    calls svm_range_list_lock_and_flush_work to flush the deferred_list
    work. If the deferred_list work's mmput releases the last user of the
    mm, it calls exit_mmap -> notifier_release from inside that same work
    item, so the flush waits on the work that is issuing it. This
    deadlocks with the backtrace below.
    
    Move the flush of svm_range_restore_work to kfd_process_wq_release to
    avoid the deadlock. svm_range_restore_work then takes a reference on
    task->mm so that the mm cannot go away while ranges are being
    validated and mapped to the GPU.
    
    Workqueue: events svm_range_deferred_list_work [amdgpu]
    Call Trace:
     wait_for_completion+0x94/0x100
     __flush_work+0x12a/0x1e0
     __cancel_work_timer+0x10e/0x190
     cancel_delayed_work_sync+0x13/0x20
     kfd_process_notifier_release+0x98/0x2a0 [amdgpu]
     __mmu_notifier_release+0x74/0x1f0
     exit_mmap+0x170/0x200
     mmput+0x5d/0x130
     svm_range_deferred_list_work+0x104/0x230 [amdgpu]
     process_one_work+0x220/0x3c0
    Signed-off-by: Philip Yang <Philip.Yang@amd.com>
    Reported-by: Ruili Ji <ruili.ji@amd.com>
    Tested-by: Ruili Ji <ruili.ji@amd.com>
    Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
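
The call chain in the trace can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the driver code: the struct, its fields, and every function below are hypothetical stand-ins named after the kernel functions they imitate, and the intermediate step through svm_range_restore_work is collapsed into a direct flush of the deferred_list work. It assumes the deferred_list worker holds the last mm user when it runs.

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model {
    int  mm_users;          /* stands in for the mm user count */
    bool in_deferred_work;  /* true while the deferred_list worker runs */
    bool flush_in_notifier; /* true: pre-patch flush location */
    bool deadlock;          /* set when a worker would wait on itself */
};

/* Flushing a work item from inside that same item never completes. */
static void flush_deferred_work(struct model *m)
{
    if (m->in_deferred_work)
        m->deadlock = true;
}

/* Imitates kfd_process_notifier_release(). */
static void notifier_release(struct model *m)
{
    if (m->flush_in_notifier)
        flush_deferred_work(m);
}

/* Imitates mmput(): dropping the last user runs exit_mmap -> notifier release. */
static void model_mmput(struct model *m)
{
    if (--m->mm_users == 0)
        notifier_release(m);
}

/* Imitates svm_range_deferred_list_work() dropping its mm reference. */
static void deferred_list_work(struct model *m)
{
    m->in_deferred_work = true;
    model_mmput(m);
    m->in_deferred_work = false;
}

/* Imitates kfd_process_wq_release(): runs later, outside the worker. */
static void wq_release(struct model *m)
{
    flush_deferred_work(m);
}

/* Returns true if the process-exit sequence would deadlock. */
static bool run_exit_scenario(bool flush_in_notifier)
{
    struct model m = { .mm_users = 1, .flush_in_notifier = flush_in_notifier };

    deferred_list_work(&m); /* worker drops the last mm user */
    wq_release(&m);         /* release work runs afterwards */
    return m.deadlock;
}
```

In this model, run_exit_scenario(true) reports the self-wait that the trace shows, while run_exit_scenario(false) does not, because the flush only happens from wq_release after the worker has returned.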