• Leo Li's avatar
    drm: Get ref on CRTC commit object when waiting for flip_done · 4364bcb2
    Leo Li authored
    This fixes a general protection fault, caused by accessing the contents
    of a flip_done completion object that has already been freed. It occurs
    due to the preemption of a non-blocking commit worker thread W by
    another commit thread X. X continues to clear its atomic state at the
    end, destroying the CRTC commit object that W still needs. Switching
    back to W and accessing the commit objects then leads to bad results.
    
    Worker W becomes preemptable when waiting for flip_done to complete. At
    this point, a frequently occurring commit thread X can take over. Here's
    an example where W is a worker thread that flips on both CRTCs, and X
    does a legacy cursor update on both CRTCs:
    
            ...
         1. W does flip work
         2. W runs commit_hw_done()
         3. W waits for flip_done on CRTC 1
         4. > flip_done for CRTC 1 completes
         5. W finishes waiting for CRTC 1
         6. W waits for flip_done on CRTC 2
    
         7. > Preempted by X
         8. > flip_done for CRTC 2 completes
         9. X atomic_check: hw_done and flip_done are complete on all CRTCs
        10. X updates cursor on both CRTCs
        11. X destroys atomic state
        12. X done
    
        13. > Switch back to W
        14. W waits for flip_done on CRTC 2
        15. W raises general protection fault
    
    The error looks like so:
    
        general protection fault: 0000 [#1] PREEMPT SMP PTI
        **snip**
        Call Trace:
         lock_acquire+0xa2/0x1b0
         _raw_spin_lock_irq+0x39/0x70
         wait_for_completion_timeout+0x31/0x130
         drm_atomic_helper_wait_for_flip_done+0x64/0x90 [drm_kms_helper]
         amdgpu_dm_atomic_commit_tail+0xcae/0xdd0 [amdgpu]
         commit_tail+0x3d/0x70 [drm_kms_helper]
         process_one_work+0x212/0x650
         worker_thread+0x49/0x420
         kthread+0xfb/0x130
         ret_from_fork+0x3a/0x50
        Modules linked in: x86_pkg_temp_thermal amdgpu(O) chash(O)
        gpu_sched(O) drm_kms_helper(O) syscopyarea sysfillrect sysimgblt
        fb_sys_fops ttm(O) drm(O)
    
    Note that i915 has this issue masked, since hw_done is signaled after
    waiting for flip_done. Doing so will block the cursor update from
    happening until hw_done is signaled, preventing the cursor commit from
    destroying the state.
    
    v2: The reference on the commit object needs to be obtained before
        hw_done() is signaled, since that's the point where another commit
        is allowed to modify the state. Assuming that the
        new_crtc_state->commit object still exists within flip_done() is
        incorrect.
    
        Fix by getting a reference in setup_commit(), and releasing it
        during default_clear().
    Signed-off-by: default avatarLeo Li <sunpeng.li@amd.com>
    Reviewed-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
    Signed-off-by: default avatarHarry Wentland <harry.wentland@amd.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/1539611200-6184-1-git-send-email-sunpeng.li@amd.com
    4364bcb2
drm_atomic.h 33 KB