1. 06 Oct, 2017 8 commits
    • drm/amdgpu: Fixed a potential circular lock · a50ecc54
      ozeng authored
      The circular locking scenario that lockdep captured is shown below.
      The idea of the fix is to move read_user_wptr outside of the
      acquire_queue...release_queue critical section.
      
      [   63.477482] WARNING: possible circular locking dependency detected
      [   63.484091] 4.12.0-kfd-ozeng #3 Not tainted
      [   63.488531] ------------------------------------------------------
      [   63.495146] HelloWorldLoop/2526 is trying to acquire lock:
      [   63.501011]  (&mm->mmap_sem){++++++}, at: [<ffffffff911898ce>] __might_fault+0x3e/0x90
      [   63.509472]
                     but task is already holding lock:
      [   63.515716]  (&adev->srbm_mutex){+.+...}, at: [<ffffffffc0484feb>] lock_srbm+0x2b/0x50 [amdgpu]
      [   63.525099]
                     which lock already depends on the new lock.
      
      [   63.533841]
                     the existing dependency chain (in reverse order) is:
      [   63.541839]
                     -> #2 (&adev->srbm_mutex){+.+...}:
      [   63.548178]        lock_acquire+0x6d/0x90
      [   63.552461]        __mutex_lock+0x70/0x8c0
      [   63.556826]        mutex_lock_nested+0x16/0x20
      [   63.561603]        gfx_v8_0_kiq_resume+0x1039/0x14a0 [amdgpu]
      [   63.567817]        gfx_v8_0_hw_init+0x204d/0x2210 [amdgpu]
      [   63.573675]        amdgpu_device_init+0xdea/0x1790 [amdgpu]
      [   63.579640]        amdgpu_driver_load_kms+0x63/0x220 [amdgpu]
      [   63.585743]        drm_dev_register+0x145/0x1e0
      [   63.590605]        amdgpu_pci_probe+0x11e/0x160 [amdgpu]
      [   63.596266]        local_pci_probe+0x40/0xa0
      [   63.600803]        pci_device_probe+0x134/0x150
      [   63.605650]        driver_probe_device+0x2a1/0x460
      [   63.610785]        __driver_attach+0xdc/0xe0
      [   63.615321]        bus_for_each_dev+0x5f/0x90
      [   63.619984]        driver_attach+0x19/0x20
      [   63.624337]        bus_add_driver+0x40/0x270
      [   63.628908]        driver_register+0x5b/0xe0
      [   63.633446]        __pci_register_driver+0x5b/0x60
      [   63.638586]        rtsx_pci_switch_output_voltage+0x1d/0x20 [rtsx_pci]
      [   63.645564]        do_one_initcall+0x4c/0x1b0
      [   63.650205]        do_init_module+0x56/0x1ea
      [   63.654767]        load_module+0x208c/0x27d0
      [   63.659335]        SYSC_finit_module+0x96/0xd0
      [   63.664058]        SyS_finit_module+0x9/0x10
      [   63.668629]        entry_SYSCALL_64_fastpath+0x1f/0xbe
      [   63.674088]
                     -> #1 (reservation_ww_class_mutex){+.+.+.}:
      [   63.681257]        lock_acquire+0x6d/0x90
      [   63.685551]        __ww_mutex_lock.constprop.11+0x8c/0xed0
      [   63.691426]        ww_mutex_lock+0x67/0x70
      [   63.695802]        amdgpu_verify_access+0x6d/0x100 [amdgpu]
      [   63.701743]        ttm_bo_mmap+0x8e/0x100 [ttm]
      [   63.706615]        amdgpu_bo_mmap+0xd/0x60 [amdgpu]
      [   63.711814]        amdgpu_mmap+0x35/0x40 [amdgpu]
      [   63.716904]        mmap_region+0x3b5/0x5a0
      [   63.721255]        do_mmap+0x400/0x4d0
      [   63.725260]        vm_mmap_pgoff+0xb0/0xf0
      [   63.729625]        SyS_mmap_pgoff+0x19e/0x260
      [   63.734292]        SyS_mmap+0x1d/0x20
      [   63.738199]        entry_SYSCALL_64_fastpath+0x1f/0xbe
      [   63.743681]
                     -> #0 (&mm->mmap_sem){++++++}:
      [   63.749641]        __lock_acquire+0x1401/0x1420
      [   63.754491]        lock_acquire+0x6d/0x90
      [   63.758750]        __might_fault+0x6b/0x90
      [   63.763176]        kgd_hqd_load+0x24f/0x270 [amdgpu]
      [   63.768432]        load_mqd+0x4b/0x50 [amdkfd]
      [   63.773192]        create_queue_nocpsch+0x535/0x620 [amdkfd]
      [   63.779237]        pqm_create_queue+0x34d/0x4f0 [amdkfd]
      [   63.784835]        kfd_ioctl_create_queue+0x282/0x670 [amdkfd]
      [   63.790973]        kfd_ioctl+0x310/0x4d0 [amdkfd]
      [   63.795944]        do_vfs_ioctl+0x90/0x6e0
      [   63.800268]        SyS_ioctl+0x74/0x80
      [   63.804207]        entry_SYSCALL_64_fastpath+0x1f/0xbe
      [   63.809607]
                     other info that might help us debug this:
      
      [   63.818026] Chain exists of:
                       &mm->mmap_sem --> reservation_ww_class_mutex --> &adev->srbm_mutex
      
      [   63.830382]  Possible unsafe locking scenario:
      
      [   63.836605]        CPU0                    CPU1
      [   63.841364]        ----                    ----
      [   63.846123]   lock(&adev->srbm_mutex);
      [   63.850061]                                lock(reservation_ww_class_mutex);
      [   63.857475]                                lock(&adev->srbm_mutex);
      [   63.864084]   lock(&mm->mmap_sem);
      [   63.867657]
                      *** DEADLOCK ***
      
      [   63.873884] 3 locks held by HelloWorldLoop/2526:
      [   63.878739]  #0:  (&process->mutex){+.+.+.}, at: [<ffffffffc06e1a9a>] kfd_ioctl_create_queue+0x24a/0x670 [amdkfd]
      [   63.889543]  #1:  (&dqm->lock){+.+...}, at: [<ffffffffc06eedeb>] create_queue_nocpsch+0x3b/0x620 [amdkfd]
      [   63.899684]  #2:  (&adev->srbm_mutex){+.+...}, at: [<ffffffffc0484feb>] lock_srbm+0x2b/0x50 [amdgpu]
      [   63.909500]
                     stack backtrace:
      [   63.914187] CPU: 3 PID: 2526 Comm: HelloWorldLoop Not tainted 4.12.0-kfd-ozeng #3
      [   63.922184] Hardware name: AMD Carrizo/Gardenia, BIOS WGA5819N_Weekly_15_08_1 08/19/2015
      [   63.930865] Call Trace:
      [   63.933464]  dump_stack+0x85/0xc9
      [   63.936999]  print_circular_bug+0x1f9/0x207
      [   63.941442]  __lock_acquire+0x1401/0x1420
      [   63.945745]  ? lock_srbm+0x2b/0x50 [amdgpu]
      [   63.950185]  lock_acquire+0x6d/0x90
      [   63.953885]  ? __might_fault+0x3e/0x90
      [   63.957899]  __might_fault+0x6b/0x90
      [   63.961699]  ? __might_fault+0x3e/0x90
      [   63.965755]  kgd_hqd_load+0x24f/0x270 [amdgpu]
      [   63.970577]  load_mqd+0x4b/0x50 [amdkfd]
      [   63.974745]  create_queue_nocpsch+0x535/0x620 [amdkfd]
      [   63.980242]  pqm_create_queue+0x34d/0x4f0 [amdkfd]
      [   63.985320]  kfd_ioctl_create_queue+0x282/0x670 [amdkfd]
      [   63.991021]  kfd_ioctl+0x310/0x4d0 [amdkfd]
      [   63.995499]  ? kfd_ioctl_destroy_queue+0x70/0x70 [amdkfd]
      [   64.001234]  do_vfs_ioctl+0x90/0x6e0
      [   64.005065]  ? up_read+0x1a/0x40
      [   64.008496]  SyS_ioctl+0x74/0x80
      [   64.011955]  entry_SYSCALL_64_fastpath+0x1f/0xbe
      [   64.016863] RIP: 0033:0x7f4b3bd35f07
      [   64.020696] RSP: 002b:00007ffe7689ec38 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
      [   64.028786] RAX: ffffffffffffffda RBX: 00000000002a2000 RCX: 00007f4b3bd35f07
      [   64.036414] RDX: 00007ffe7689ecb0 RSI: 00000000c0584b02 RDI: 0000000000000005
      [   64.044045] RBP: 00007f4a3212d000 R08: 00007f4b3c919000 R09: 0000000000080000
      [   64.051674] R10: 00007f4b376b64b8 R11: 0000000000000246 R12: 00007f4a3212d000
      [   64.059324] R13: 0000000000000015 R14: 0000000000000064 R15: 00007ffe7689ef50
      Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
      Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
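The fix described in a50ecc54 above (do the user-space read before entering the critical section) can be illustrated with a minimal user-space sketch. Everything here is an assumption made for illustration: `demo_queue`, `demo_read_user_wptr` and `demo_load_queue` are hypothetical names, a pthread mutex stands in for srbm_mutex, and a plain pointer dereference stands in for a user-memory read that may fault. It is not the driver code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these names come from amdgpu/amdkfd. */
struct demo_queue {
    pthread_mutex_t hw_lock; /* plays the role of the hardware (SRBM) lock */
    bool hw_lock_held;       /* bookkeeping so the sketch can check itself;
                                only meaningful in this single-threaded demo */
    uint32_t hw_wptr;        /* "register" that is programmed under the lock */
};

static void demo_queue_init(struct demo_queue *q)
{
    pthread_mutex_init(&q->hw_lock, NULL);
    q->hw_lock_held = false;
    q->hw_wptr = 0;
}

/* Stand-in for a user-memory read that may fault and take another lock. */
static bool demo_read_user_wptr(const struct demo_queue *q,
                                const uint32_t *user_wptr, uint32_t *out)
{
    /* Must never run inside the hw_lock critical section. */
    assert(!q->hw_lock_held);
    if (user_wptr == NULL)
        return false;
    *out = *user_wptr;
    return true;
}

/* Returns true if a write pointer was read and programmed. */
static bool demo_load_queue(struct demo_queue *q, const uint32_t *user_wptr)
{
    uint32_t wptr = 0;

    /* 1. Do the potentially faulting read first, with no lock held. */
    bool valid = demo_read_user_wptr(q, user_wptr, &wptr);

    /* 2. Only then enter the critical section and use the cached value. */
    pthread_mutex_lock(&q->hw_lock);
    q->hw_lock_held = true;
    if (valid)
        q->hw_wptr = wptr;
    q->hw_lock_held = false;
    pthread_mutex_unlock(&q->hw_lock);

    return valid;
}
```

Because the read happens before the lock is taken, the lock that a fault would need is never requested while the hardware lock is held, which removes the edge that closes the cycle in the report.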
    • drm/radeon: make functions alloc_pasid and free_pasid static · c3fa35a4
      Colin Ian King authored
      The functions alloc_pasid and free_pasid are local to the
      source and do not need to be in global scope, so make them static.
      
      Cleans up sparse warnings:
      warning: symbol 'alloc_pasid' was not declared. Should it be static?
      warning: symbol 'free_pasid' was not declared. Should it be static?
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    • drm/amdgpu: add FENCE_TO_HANDLE ioctl that returns syncobj or sync_file · 7ca24cf2
      Marek Olšák authored
      This makes it possible to convert an amdgpu fence into either kind
      of handle. Mesa will use this.
      Reviewed-by: Dave Airlie <airlied@redhat.com>
      Signed-off-by: Marek Olšák <marek.olsak@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    • drm/syncobj: extract two helpers from drm_syncobj_create · 1321fd2c
      Marek Olšák authored
      For amdgpu.
      
      drm_syncobj_create is renamed to drm_syncobj_create_as_handle, and new
      helpers drm_syncobj_create and drm_syncobj_get_handle are added.
      Reviewed-by: Dave Airlie <airlied@redhat.com>
      Signed-off-by: Marek Olšák <marek.olsak@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
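The shape of the refactor in 1321fd2c above (one "create and return a handle" entry point split into a "create" helper and a "get handle" helper, with the old behaviour kept under a new name) can be sketched in plain C. This is a toy under stated assumptions: `toy_obj`, `toy_file` and every `toy_*` function are hypothetical and only mirror the naming pattern of the commit; they are not the DRM structures or signatures.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical toy types; not the DRM structures. */
struct toy_obj {
    int refcount;
};

#define TOY_MAX_HANDLES 8
struct toy_file {
    struct toy_obj *table[TOY_MAX_HANDLES]; /* per-file handle table */
};

/* Helper 1: allocate an object holding one reference; no handle involved. */
static int toy_obj_create(struct toy_obj **out)
{
    struct toy_obj *obj = calloc(1, sizeof(*obj));
    if (!obj)
        return -1;
    obj->refcount = 1;
    *out = obj;
    return 0;
}

/* Drop one reference and free the object when the last one goes away. */
static void toy_obj_put(struct toy_obj *obj)
{
    assert(obj->refcount > 0);
    if (--obj->refcount == 0)
        free(obj);
}

/* Helper 2: publish an existing object in the file's table. The table
 * takes its own reference. Handles start at 1; 0 is never returned. */
static int toy_obj_get_handle(struct toy_file *f, struct toy_obj *obj,
                              uint32_t *handle)
{
    for (uint32_t i = 0; i < TOY_MAX_HANDLES; i++) {
        if (!f->table[i]) {
            f->table[i] = obj;
            obj->refcount++;
            *handle = i + 1;
            return 0;
        }
    }
    return -1; /* table full */
}

/* The original one-step entry point, now a composition of the helpers. */
static int toy_obj_create_as_handle(struct toy_file *f, uint32_t *handle)
{
    struct toy_obj *obj;
    int ret = toy_obj_create(&obj);
    if (ret)
        return ret;
    ret = toy_obj_get_handle(f, obj, handle);
    toy_obj_put(obj); /* drop the creation reference; on success the
                         table still holds its own */
    return ret;
}
```

Splitting the steps lets a caller that already has an object (for example, one built around an existing fence) obtain a handle for it without going through the combined path.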
    • drm/amd/powerplay: delete flag PP_VALID · e1827a30
      Rex Zhu authored
      There is no need to check pp_valid: all pp export functions
      have moved to ip_funcs and pp_funcs, so it is enough to check
      the function pointer.
      Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    • drm/amd/powerplay: tidy up ret checks in amd_powerplay.c (v3) · ae97988f
      Rex Zhu authored
      v2: squash in regression fix (Rex)
      v3: Squash in regression fix (Rex)
      Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
  2. 01 Oct, 2017 1 commit
  3. 29 Sep, 2017 7 commits
  4. 28 Sep, 2017 24 commits