    drm/amdgpu: Move reset domain init before calling RREG32 · 436afdfa
    Philip Yang authored
    amdgpu_detect_virtualization reads a register, and amdgpu_device_rreg
    accesses adev->reset_domain->sem if the kernel is built with
    CONFIG_LOCKDEP. At that point the reset domain has not been created
    yet, which causes a random boot hang on Vega10 (backtrace below). A
    random NULL pointer dereference backtrace may also occur if
    amdgpu_sriov_runtime is true.
    
    Move the amdgpu_reset_create_reset_domain call ahead of the first
    RREG32 so the reset domain exists before any register read.
    
     BUG: kernel NULL pointer dereference, address:
     #PF: supervisor read access in kernel mode
     #PF: error_code(0x0000) - not-present page
     PGD 0 P4D 0
     Oops: 0000 [#1] PREEMPT SMP NOPTI
     Workqueue: events work_for_cpu_fn
     RIP: 0010:down_read_trylock+0x13/0xf0
     Call Trace:
      <TASK>
      amdgpu_device_skip_hw_access+0x38/0x80 [amdgpu]
      amdgpu_device_rreg+0x1b/0x170 [amdgpu]
      amdgpu_detect_virtualization+0x73/0x100 [amdgpu]
      amdgpu_device_init.cold.60+0xbe/0x16b1 [amdgpu]
      ? pci_bus_read_config_word+0x43/0x70
      amdgpu_driver_load_kms+0x15/0x120 [amdgpu]
      amdgpu_pci_probe+0x1a1/0x3a0 [amdgpu]
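
The ordering problem can be illustrated with a small user-space sketch. This is not the driver code: `struct sketch_device`, `sketch_rreg`, and the two init functions are hypothetical stand-ins, and the read path returns an error where, according to the trace above, the kernel dereferences the missing reset domain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are NOT the real amdgpu structures. */
struct sketch_reset_domain {
    int sem;                                  /* placeholder for the semaphore */
};

struct sketch_device {
    struct sketch_reset_domain *reset_domain; /* NULL until created */
    unsigned int regs[4];
};

/* Plays the role of amdgpu_reset_create_reset_domain: allocate the domain. */
static int sketch_create_reset_domain(struct sketch_device *dev)
{
    dev->reset_domain = calloc(1, sizeof(*dev->reset_domain));
    return dev->reset_domain ? 0 : -1;
}

/*
 * Plays the role of the register read path. The sketch reports an error
 * when the reset domain is missing so that it can run to completion; the
 * trace in this commit shows the kernel faulting at that point instead.
 */
static int sketch_rreg(const struct sketch_device *dev, size_t idx,
                       unsigned int *val)
{
    assert(val != NULL);
    if (dev->reset_domain == NULL)
        return -1;                            /* read before domain exists */
    if (idx >= sizeof(dev->regs) / sizeof(dev->regs[0]))
        return -1;
    *val = dev->regs[idx];
    return 0;
}

/* Old order: register read first, reset domain created afterwards. */
static int sketch_init_old_order(struct sketch_device *dev)
{
    unsigned int val;
    int ret = sketch_rreg(dev, 0, &val);

    if (ret != 0)
        return ret;
    return sketch_create_reset_domain(dev);
}

/* New order: reset domain created before the first register read. */
static int sketch_init_new_order(struct sketch_device *dev)
{
    unsigned int val;

    if (sketch_create_reset_domain(dev) != 0)
        return -1;
    return sketch_rreg(dev, 0, &val);
}
```

With a zero-initialized `struct sketch_device`, `sketch_init_old_order` fails on the first read, while `sketch_init_new_order` succeeds because the domain is in place before any register access.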
    
    Fixes: d0fb18b5 ("drm/amdgpu: Move reset sem into reset_domain")
    Signed-off-by: Philip Yang <Philip.Yang@amd.com>
    Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
amdgpu_device.c 156 KB