- 04 Apr, 2019 1 commit
-
-
Alex Deucher authored
Picasso is a new raven variant. Reviewed-by:
Kent Russell <kent.russell@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
- 10 Dec, 2018 2 commits
-
-
Alex Deucher authored
New vega20 id. Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Alex Deucher authored
New vega10 ids. Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
-
- 19 Nov, 2018 1 commit
-
-
Gang Ba authored
Add Vega12 and Polaris12 device info and device IDs to KFD. Signed-off-by:
Gang Ba <gaba@amd.com> Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by:
Alex Deucher <alexander.deucher@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
- 05 Nov, 2018 3 commits
-
-
Christian König authored
Vega10 has multiple interrupt rings, so this can be called from multiple callers at the same time, resulting in:
[ 71.779334] ================================
[ 71.779406] WARNING: inconsistent lock state
[ 71.779478] 4.19.0-rc1+ #44 Tainted: G W
[ 71.779565] --------------------------------
[ 71.779637] inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
[ 71.779740] kworker/6:1/120 [HC0[0]:SC0[0]:HE1:SE1] takes:
[ 71.779832] 00000000ad761971 (&(&kfd->interrupt_lock)->rlock){?...}, at: kgd2kfd_interrupt+0x75/0x100 [amdgpu]
[ 71.780058] {IN-HARDIRQ-W} state was registered at:
[ 71.780115] _raw_spin_lock+0x2c/0x40
[ 71.780180] kgd2kfd_interrupt+0x75/0x100 [amdgpu]
[ 71.780248] amdgpu_irq_callback+0x6c/0x150 [amdgpu]
[ 71.780315] amdgpu_ih_process+0x88/0x100 [amdgpu]
[ 71.780380] amdgpu_irq_handler+0x20/0x40 [amdgpu]
[ 71.780409] __handle_irq_event_percpu+0x49/0x2a0
[ 71.780436] handle_irq_event_percpu+0x30/0x70
[ 71.780461] handle_irq_event+0x37/0x60
[ 71.780484] handle_edge_irq+0x83/0x1b0
[ 71.780506] handle_irq+0x1f/0x30
[ 71.780526] do_IRQ+0x53/0x110
[ 71.780544] ret_from_intr+0x0/0x22
[ 71.780566] cpuidle_enter_state+0xaa/0x330
[ 71.780591] do_idle+0x203/0x280
[ 71.780610] cpu_startup_entry+0x6f/0x80
[ 71.780634] start_secondary+0x1b0/0x200
[ 71.780657] secondary_startup_64+0xa4/0xb0
Fix this by always using irq save spin locks. Signed-off-by:
Christian König <christian.koenig@amd.com> Acked-by:
Alex Deucher <alexander.deucher@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
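The idea behind "irq save" locking can be sketched in userspace. This is only a model, not kernel code: `irqs_enabled` stands in for the CPU interrupt flag, and `model_lock_irqsave` / `model_unlock_irqrestore` are hypothetical stand-ins for the kernel's `spin_lock_irqsave()` / `spin_unlock_irqrestore()`. No real locking is performed. The sketch shows that the saved state is restored on unlock, so the same code path behaves correctly whether it is entered with interrupts on or off.

```c
#include <stdbool.h>

/* Stand-in for the CPU interrupt-enable flag (model only, not kernel code). */
bool irqs_enabled = true;

/* Model of spin_lock_irqsave(): remember the current state, then disable. */
unsigned long model_lock_irqsave(void)
{
    unsigned long flags = irqs_enabled ? 1UL : 0UL;

    irqs_enabled = false;
    return flags;
}

/* Model of spin_unlock_irqrestore(): put back whatever state was saved. */
void model_unlock_irqrestore(unsigned long flags)
{
    irqs_enabled = (flags != 0);
}

/*
 * Hypothetical handler body that may be entered with interrupts on
 * (process context) or off (hard IRQ context). Returns true when
 * interrupts were off inside the critical section and the entry state
 * was restored afterwards.
 */
bool handler_keeps_irq_state(bool irqs_on_entry)
{
    irqs_enabled = irqs_on_entry;

    unsigned long flags = model_lock_irqsave();
    bool disabled_inside = !irqs_enabled; /* critical section */
    model_unlock_irqrestore(flags);

    return disabled_inside && irqs_enabled == irqs_on_entry;
}
```

Calling `handler_keeps_irq_state()` with either `true` or `false` should report that the entry state was preserved, which is the property a plain lock/unlock pair cannot guarantee when the lock is shared between contexts.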
-
Amber Lin authored
Add the amdgpu_amdkfd_ prefix to amdgpu functions that serve amdkfd. v2: fix indentation Signed-off-by:
Amber Lin <Amber.Lin@amd.com> Acked-by:
Christian König <christian.koenig@amd.com> Reviewed-by:
Alex Deucher <alexander.deucher@amd.com> Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Amber Lin authored
After amdkfd module is merged into amdgpu, KFD can call amdgpu directly and no longer needs to use the function pointer. Replace those function pointers with functions if they are not ASIC dependent. Signed-off-by:
Amber Lin <Amber.Lin@amd.com> Acked-by:
Christian König <christian.koenig@amd.com> Reviewed-by:
Alex Deucher <alexander.deucher@amd.com> Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
- 27 Sep, 2018 6 commits
-
-
Shaoyun Liu authored
Firmware has a workaround that replaces atomic ops with read-modify-write on the CP side. Users should not expect atomic ops on system memory to work normally if the system does not support them. Signed-off-by:
Shaoyun Liu <Shaoyun.Liu@amd.com> Acked-by:
Alex Deucher <alexander.deucher@amd.com> Reviewed-By:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Shaoyun Liu authored
Add Vega20 device IDs, device info and enable it in KFD. Signed-off-by:
Shaoyun Liu <Shaoyun.Liu@amd.com> Acked-by:
Alex Deucher <alexander.deucher@amd.com> Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com> Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com>
-
Shaoyun Liu authored
Vega20 supports 8 SDMA queues per engine. Signed-off-by:
Shaoyun Liu <Shaoyun.Liu@amd.com> Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com> Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Acked-by:
Alex Deucher <alexander.deucher@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Felix Kuehling authored
Also save the version in struct kfd_dev so we only need to query it once. Signed-off-by:
Philip Yang <Philip.Yang@amd.com> Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Acked-by:
Alex Deucher <alexander.deucher@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Emily Deng authored
The KFD module doesn't support TONGA SRIOV. Initializing the KFD module in a TONGA SRIOV environment causes the compute ring IB test to fail. Signed-off-by:
Emily Deng <Emily.Deng@amd.com> Reviewed-by:
Shaoyun.liu <Shaoyun.liu@amd.com> Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Acked-by:
Alex Deucher <alexander.deucher@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Eric Huang authored
Add property flags according to ASIC type and PCIe capabilities. Signed-off-by:
Eric Huang <JinHuiEric.Huang@amd.com> Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Acked-by:
Alex Deucher <alexander.deucher@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
- 20 Sep, 2018 1 commit
-
-
Yong Zhao authored
CWSR fails on Raven if the control stack is MTYPE_UC, which is used for regular GART mappings. As a workaround we map it using MTYPE_NC. The MEC firmware expects the control stack at one page offset from the start of the MQD so it is part of the MQD allocation on GFXv9. AMDGPU added a memory allocation flag just for this purpose. Acked-by:
Alex Deucher <alexander.deucher@amd.com> Signed-off-by:
Yong Zhao <yong.zhao@amd.com> Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
- 11 Sep, 2018 1 commit
-
-
Shaoyun Liu authored
The Thunk will generate the XGMI topology information when necessary, using the hive_id of each specified device. Signed-off-by:
Shaoyun Liu <Shaoyun.Liu@amd.com> Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Acked-by:
Christian König <christian.koenig@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
- 13 Jul, 2018 2 commits
-
-
Yong Zhao authored
Add DID and kfd_device_info for Raven. Signed-off-by:
Yong Zhao <Yong.Zhao@amd.com> Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Acked-by:
Alex Deucher <alexander.deucher@amd.com> Signed-off-by:
Oded Gabbay <oded.gabbay@gmail.com>
-
Yong Zhao authored
On Raven there is only one SDMA engine instead of previously assumed two, so we need to adapt our code to this new scenario. Signed-off-by:
Yong Zhao <yong.zhao@amd.com> Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Acked-by:
Alex Deucher <alexander.deucher@amd.com> Signed-off-by:
Oded Gabbay <oded.gabbay@gmail.com>
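A minimal sketch of what "adapting to one SDMA engine" can look like: the engine count becomes per-ASIC data instead of a hard-coded two. The struct, its field names, and the queues-per-engine value used in the usage note are hypothetical and chosen for illustration only; the actual driver structures may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-ASIC description; field names are illustrative. */
struct sdma_info {
    unsigned int num_sdma_engines;  /* 1 on Raven per the commit above */
    unsigned int queues_per_engine; /* assumed value, not from the source */
};

/* Total SDMA queues is derived from the per-ASIC engine count. */
unsigned int total_sdma_queues(const struct sdma_info *info)
{
    assert(info != NULL);
    return info->num_sdma_engines * info->queues_per_engine;
}
```

With an assumed two queues per engine, a one-engine device yields two SDMA queues and a two-engine device yields four, without any code assuming a fixed engine count.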
-
- 12 Jul, 2018 5 commits
-
-
Shaoyun Liu authored
Signed-off-by:
Shaoyun Liu <Shaoyun.Liu@amd.com> Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Acked-by:
Christian König <christian.koenig@amd.com> Signed-off-by:
Oded Gabbay <oded.gabbay@gmail.com>
-
Shaoyun Liu authored
Lock KFD and evict existing queues on reset. Notify user mode by signaling hw_exception events. Signed-off-by:
Shaoyun Liu <Shaoyun.Liu@amd.com> Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Acked-by:
Christian König <christian.koenig@amd.com> Signed-off-by:
Oded Gabbay <oded.gabbay@gmail.com>
-
Shaoyun Liu authored
Signed-off-by:
Shaoyun Liu <Shaoyun.Liu@amd.com> Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Acked-by:
Christian König <christian.koenig@amd.com> Signed-off-by:
Oded Gabbay <oded.gabbay@gmail.com>
-
Lan Xiao authored
Upon a VM fault, the VMID and PASID written by HW are zeros on Hawaii. Instead of reading them from ih_ring_entry, read them directly from the registers. This workaround fixes the soft hang issues caused by mishandled VM faults on Hawaii. Signed-off-by:
Lan Xiao <Lan.Xiao@amd.com> Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Acked-by:
Christian König <christian.koenig@amd.com> Signed-off-by:
Oded Gabbay <oded.gabbay@gmail.com>
-
Felix Kuehling authored
This is no longer needed with the memalloc_nofs_save/restore in dqm_lock/unlock. Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Acked-by:
Christian König <christian.koenig@amd.com> Signed-off-by:
Oded Gabbay <oded.gabbay@gmail.com>
-
- 01 May, 2018 2 commits
-
-
Yong Zhao authored
Since the assembly code is inside "#if 0", it is ineffective. Despite that, during debugging, we need to change the assembly code, extract it into a separate file and compile the new file into hex values using sp3. That process also requires us to remove "#if 0" and modify lines starting with "#", so that sp3 can successfully compile the new file. With this change, none of these chores are needed any longer, and cwsr_trap_handler_gfx*.asm can be used directly by sp3 to generate its hex values. Signed-off-by:
Yong Zhao <yong.zhao@amd.com> Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Oded Gabbay <oded.gabbay@gmail.com>
-
Felix Kuehling authored
Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by:
Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by:
Oded Gabbay <oded.gabbay@gmail.com>
-
- 10 Apr, 2018 5 commits
-
-
Felix Kuehling authored
* Report 64-bit doorbells as HSA_CAP_DOORBELL_TYPE_2_0 in topology
* Report cache information in topology (duplicates GFXv8 info for now)
* Add device info for Vega10 support in KFD
Raven is not enabled at this time as it needs additional changes in DQM to work with a single SDMA engine. Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by:
Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by:
Oded Gabbay <oded.gabbay@gmail.com>
-
welu authored
Report failure to enable atomics only on GPUs that require them. This allows GPUs that don't require atomics, but can benefit from them when available, to function. This is the case for Vega10, which doesn't use atomics for basic functioning of the MEC, AQL and HWS microcode, so it can work without atomics. Shader programs can still use atomic instructions on systems that support PCIe atomics. Signed-off-by:
welu <Wei.Lu2@amd.com> Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by:
Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by:
Oded Gabbay <oded.gabbay@gmail.com>
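The rule described in this commit can be sketched as a small predicate. The function and parameter names are hypothetical; the sketch only captures the decision, assuming the driver knows per ASIC whether atomics are required and whether the platform managed to enable them.

```c
#include <stdbool.h>

/*
 * Returns true if device initialization may proceed.
 * Only a device that depends on PCIe atomics fails when the platform
 * cannot provide them; other devices continue without them.
 */
bool atomics_check_ok(bool device_requires_atomics, bool platform_has_atomics)
{
    return platform_has_atomics || !device_requires_atomics;
}
```

Under these assumptions, a device that does not require atomics passes the check on any platform, while a device that requires them is rejected only when the platform lacks support.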
-
Felix Kuehling authored
Signed-off-by:
Shaoyun Liu <Shaoyun.Liu@amd.com> Signed-off-by:
Jay Cornwall <Jay.Cornwall@amd.com> Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by:
Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by:
Oded Gabbay <oded.gabbay@gmail.com>
-
Felix Kuehling authored
Signed-off-by:
John Bridgman <john.bridgman@amd.com> Signed-off-by:
Jay Cornwall <Jay.Cornwall@amd.com> Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by:
Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by:
Oded Gabbay <oded.gabbay@gmail.com>
-
Felix Kuehling authored
This prepares for GFXv9 (Vega10), which has 64-bit doorbells. Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by:
Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by:
Oded Gabbay <oded.gabbay@gmail.com>
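One consequence of 64-bit doorbells is that the doorbell size can no longer be treated as a constant 4 bytes. The sketch below shows the arithmetic with the size as a parameter. The function names are hypothetical, and the page size in the usage note is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Byte offset of doorbell `index` when each doorbell is 4 or 8 bytes wide. */
size_t doorbell_byte_offset(unsigned int index, size_t doorbell_size)
{
    assert(doorbell_size == 4 || doorbell_size == 8);
    return (size_t)index * doorbell_size;
}

/* Number of doorbells that fit in one doorbell page of `page_size` bytes. */
size_t doorbells_per_page(size_t page_size, size_t doorbell_size)
{
    assert(doorbell_size == 4 || doorbell_size == 8);
    return page_size / doorbell_size;
}
```

For an assumed 4096-byte page, 32-bit doorbells give 1024 slots per page and 64-bit doorbells give 512, so any code that sizes or indexes the doorbell aperture has to take the per-device doorbell size into account.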
-
- 23 Mar, 2018 2 commits
-
-
Felix Kuehling authored
These interfaces allow KGD to stop and resume all GPU user mode queue access to a process address space. This is needed for handling MMU notifiers of userptrs mapped for GPU access in KFD VMs. Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by:
Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by:
Oded Gabbay <oded.gabbay@gmail.com>
-
Felix Kuehling authored
When an MMU notifier runs in memory reclaim context, it can deadlock trying to take locks that are already held in the thread causing the memory reclaim. The solution is to avoid memory reclaim while holding locks that are taken in MMU notifiers by using GFP_NOIO. This commit fixes memory allocations done while holding the dqm->lock which is needed in the MMU notifier (dqm->ops.evict_process_queues). Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by:
Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by:
Oded Gabbay <oded.gabbay@gmail.com>
-
- 07 Feb, 2018 1 commit
-
-
Felix Kuehling authored
When the TTM memory manager in KGD evicts BOs, all user mode queues potentially accessing these BOs must be evicted temporarily. Once user mode queues are evicted, the eviction fence is signaled, allowing the migration of the BO to proceed. A delayed worker is scheduled to restore all the BOs belonging to the evicted process and restart its queues. During suspend/resume of the GPU we also evict all processes to allow KGD to save BOs in system memory, since VRAM will be lost.
v2:
* Account for eviction when updating q->is_active in MQD manager
Signed-off-by:
Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Acked-by:
Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by:
Oded Gabbay <oded.gabbay@gmail.com>
-
- 09 Dec, 2017 1 commit
-
-
Felix Kuehling authored
dGPUs work without IOMMUv2. Make IOMMUv2 initialization dependent on ASIC information. Also allow building KFD without IOMMUv2 support. This is still useful for dGPUs and prepares for enabling KFD on architectures that don't support AMD IOMMUv2.
v2:
* Centralize IOMMUv2 code to avoid #ifdefs in too many places
v3:
* Imply AMD_IOMMU_V2 in Kconfig
Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Acked-by:
Christian Konig <christian.koenig@amd.com> Reviewed-by:
Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by:
Oded Gabbay <oded.gabbay@gmail.com>
-
- 04 Jan, 2018 3 commits
-
-
Felix Kuehling authored
v2: remove needs_iommu field as it doesn't exist CC: linux-pci@vger.kernel.org Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by:
Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by:
Oded Gabbay <oded.gabbay@gmail.com>
-
Felix Kuehling authored
Some dGPUs don't support HWS. Allow them to use a per-device sched_policy that may be different from the global default. Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by:
Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by:
Oded Gabbay <oded.gabbay@gmail.com>
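A per-device scheduling policy can be sketched as a fallback rule: a device that cannot use HWS gets a non-HWS policy regardless of the global default. The enum and function names below are hypothetical and not taken from the driver.

```c
#include <stdbool.h>

/* Hypothetical policy values for illustration only. */
enum model_sched_policy {
    MODEL_POLICY_HWS,
    MODEL_POLICY_NO_HWS
};

/*
 * Pick the policy for one device: keep the global default when the
 * device supports HWS, otherwise fall back to the non-HWS policy.
 */
enum model_sched_policy device_sched_policy(enum model_sched_policy global_default,
                                            bool device_supports_hws)
{
    if (!device_supports_hws)
        return MODEL_POLICY_NO_HWS;
    return global_default;
}
```

Storing the result per device, rather than consulting the global setting at every use, lets devices with and without HWS support coexist in one system.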
-
Felix Kuehling authored
This will be needed for most dGPUs. CC: linux-pci@vger.kernel.org Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by:
Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by:
Oded Gabbay <oded.gabbay@gmail.com>
-
- 27 Nov, 2017 1 commit
-
-
Felix Kuehling authored
Allow HWS to execute multiple processes on the hardware concurrently. The number of concurrent processes is limited by the number of VMIDs allocated to the HWS. A module parameter can be used to limit this further or to turn it off altogether (mainly for debugging purposes). Signed-off-by:
Yong Zhao <yong.zhao@amd.com> Signed-off-by:
Jay Cornwall <Jay.Cornwall@amd.com> Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Acked-by:
Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by:
Oded Gabbay <oded.gabbay@gmail.com>
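The limit described above can be sketched as a clamp. The function name and the exact parameter semantics are assumptions made for this example: a negative `limit` means "no extra limit", zero means concurrency is turned off (one process at a time), and a value above the VMID count is ignored.

```c
/*
 * Number of processes the scheduler may run concurrently.
 * hws_vmids: VMIDs allocated to the HWS (the hard upper bound).
 * limit:     hypothetical module-parameter value, see assumptions above.
 */
unsigned int max_concurrent_processes(unsigned int hws_vmids, int limit)
{
    if (limit < 0 || (unsigned int)limit > hws_vmids)
        return hws_vmids;   /* no tighter limit requested */
    if (limit == 0)
        return 1;           /* concurrency turned off */
    return (unsigned int)limit;
}
```

With 8 VMIDs, for example, a limit of 4 yields 4 concurrent processes, while a limit of 16 is clamped back to 8 because the VMID count remains the hard bound.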
-
- 14 Nov, 2017 1 commit
-
-
Felix Kuehling authored
This hardware feature allows the GPU to preempt shader execution in the middle of a compute wave, save the state and restore it later to resume execution. Memory for saving the state is allocated per queue in user mode, and its address and size are passed to the create_queue ioctl. The size depends on the number of waves that can be in flight simultaneously on a given ASIC. Signed-off-by:
Shaoyun.liu <shaoyun.liu@amd.com> Signed-off-by:
Yong Zhao <yong.zhao@amd.com> Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Oded Gabbay <oded.gabbay@gmail.com>
-
- 27 Oct, 2017 1 commit
-
-
Andres Rodriguez authored
In systems under heavy load the IH work may experience significant scheduling delays.
Under load + system workqueue:
Max Latency: 7.023695 ms
Avg Latency: 0.263994 ms
Under load + high priority workqueue:
Max Latency: 1.162568 ms
Avg Latency: 0.163213 ms
Further work is required to measure the impact of per-cpu settings on IH performance. Signed-off-by:
Andres Rodriguez <andres.rodriguez@amd.com> Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Acked-by:
Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by:
Oded Gabbay <oded.gabbay@gmail.com>
-
- 26 Sep, 2017 1 commit
-
-
Felix Kuehling authored
PASID management is moving into KGD. Limiting the PASID range to the number of doorbell pages is no longer practical. Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Acked-by:
Alex Deucher <alexander.deucher@amd.com> Reviewed-by:
Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-