    drm/amdkfd: Fix a concurrency issue during kfd recovery · 4f942aae
    Oak Zeng authored
    start_cpsch and stop_cpsch can be called during kfd device
    initialization or during GPU reset/recovery, so they can
    run concurrently. Currently in start_cpsch and stop_cpsch,
    pm_init and pm_uninit are not protected by the dpm lock.
    Consider a user who calls a packet manager function to
    submit a PM4 packet that hangs the HWS (e.g. through the command
    cat /sys/class/kfd/kfd/topology/nodes/1/gpu_id | sudo tee
    /sys/kernel/debug/kfd/hang_hws) while the kfd device is under
    reset/recovery, so the packet manager may not be initialized.
    This can cause an unpredictable protection fault.
    
    This patch moves pm_init/pm_uninit inside the dpm lock and checks
    that the packet manager is initialized before using packet
    manager functions.
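
The locking pattern described above can be sketched in user-space C. This is a minimal illustration, not the kernel code: every name below (sketch_pm, sketch_dqm, sketch_start, sketch_stop, sketch_submit) is hypothetical, a pthread mutex stands in for the lock the message calls the dpm lock, and returning -EAGAIN for an uninitialized packet manager is an assumption rather than something this commit message states.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the packet manager; not the kernel struct. */
struct sketch_pm {
    bool initialized;   /* set on init, cleared on uninit */
    int packets_sent;   /* counts successful submissions */
};

/* Hypothetical stand-in for the queue manager that owns the lock. */
struct sketch_dqm {
    pthread_mutex_t lock;   /* plays the role of the lock in the message */
    struct sketch_pm pm;
};

/* One-time setup of the lock; the packet manager starts uninitialized. */
int sketch_dqm_init(struct sketch_dqm *dqm)
{
    dqm->pm.initialized = false;
    dqm->pm.packets_sent = 0;
    return pthread_mutex_init(&dqm->lock, NULL);
}

/* Start path: the packet manager is initialized while the lock is held. */
int sketch_start(struct sketch_dqm *dqm)
{
    pthread_mutex_lock(&dqm->lock);
    dqm->pm.packets_sent = 0;
    dqm->pm.initialized = true;
    pthread_mutex_unlock(&dqm->lock);
    return 0;
}

/* Stop path: the packet manager is torn down under the same lock. */
void sketch_stop(struct sketch_dqm *dqm)
{
    pthread_mutex_lock(&dqm->lock);
    dqm->pm.initialized = false;
    pthread_mutex_unlock(&dqm->lock);
}

/*
 * Use path: check that the packet manager is initialized before
 * touching it. The -EAGAIN return value is an assumption of this sketch.
 */
int sketch_submit(struct sketch_dqm *dqm)
{
    int ret = 0;

    pthread_mutex_lock(&dqm->lock);
    if (!dqm->pm.initialized)
        ret = -EAGAIN;
    else
        dqm->pm.packets_sent++;
    pthread_mutex_unlock(&dqm->lock);
    return ret;
}
```

Because the start, stop, and submit paths all take the same lock, a submission that races with a reset sees the packet manager either fully initialized or not initialized, never a half-torn-down state.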
    Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
    Acked-by: Christian Konig <christian.koenig@amd.com>
    Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
kfd_packet_manager.c 11.6 KB