- 31 May, 2017 40 commits
-
-
Andres Rodriguez authored
Tonga based asics may experience hangs when an HQD's EOP parameters are modified. Workaround this HW issue by avoiding writes to these registers for tonga asics. Based on the following ROCm commit: 2a0fb8 - drm/amdgpu: Synchronize KFD HQD load protocol with CP scheduler From the ROCm git repository: https://github.com/RadeonOpenCompute/ROCK-Kernel-Driver.git CC: Jay Cornwall <Jay.Cornwall@amd.com> Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Andres Rodriguez authored
The MQD structure matches the reg layout. Take advantage of this to simplify HQD programming. Note that the ACTIVE field still needs to be programmed last. Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Andres Rodriguez authored
Instead of taking the first pipe and giving the rest to kfd, take the first 2 queues of each pipe. Effectively, amdgpu and amdkfd own the same number of queues. But because the queues are spread over multiple pipes the hardware will be able to better handle concurrent compute workloads. amdgpu goes from 1 pipe to 4 pipes, i.e. from 1 compute threads to 4 amdkfd goes from 3 pipe to 4 pipes, i.e. from 3 compute threads to 4 v2: fix policy comment Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Andres Rodriguez authored
Instead of picking an arbitrary queue for KIQ, search for one according to policy. The queue must be unused. Also report the KIQ as an unavailable resource to KFD. In testing I ran into KCQ initialization issues when using pipes 2/3 of MEC2 for the KIQ. Therefore the policy disallows grabbing one of these. v2: fix (ring.me + 1) to (ring.me -1) in amdgpu_amdkfd_device_init Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Andres Rodriguez authored
The assumption that we are only using the first pipe no longer holds. Instead, calculate the queue_mask from the queue_bitmap. Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Andres Rodriguez authored
Pipes provide better concurrency than queues, therefore we want to make sure that apps use queues from different pipes whenever possible. Optimize for the trivial case where an app will consume rings in order, therefore we don't want adjacent rings to belong to the same pipe. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Andres Rodriguez authored
This information is already available in adev. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Andres Rodriguez authored
Update the KGD to KFD interface to allow sharing pipes with queue granularity instead of pipe granularity. This allows for more interesting pipe/queue splits. v2: fix overflow check for res.queue_mask v3: fix shift overflow when setting res.queue_mask v4: fix comment in is_pipeline_enabled() v5: clamp res.queue_mask to the first MEC only Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Andres Rodriguez authored
The current implementation is hardcoded to enable ME1/PIPE0 interrupts only. This patch allows amdgpu to enable interrupts for any pipe of ME1. v2: added gfx9 support v3: use soc15_grbm_select for gfx9 Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Andres Rodriguez authored
Previously the queue/pipe split with kfd operated with pipe granularity. This patch allows amdgpu to take ownership of an arbitrary set of queues. It also consolidates the last few magic numbers in the compute initialization process into mec_init. v2: support for gfx9 v3: renamed AMDGPU_MAX_QUEUES to AMDGPU_MAX_COMPUTE_QUEUES v4: fix off-by-one in num_mec checks in *_compute_queue_acquire Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Andres Rodriguez authored
Make amdgpu the owner of all per-pipe state of the HQDs. This change will allow us to split the queues between kfd and amdgpu with a queue granularity instead of pipe granularity. This patch fixes kfd allocating an HDP_EOP region for its 3 pipes which goes unused. v2: support for gfx9 v3: fix gfx7 HPD intitialization Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Andres Rodriguez authored
Take ownership of pipe initialization away from KFD. Note that hpd_eop_gpu_addr was already large enough to accomodate all pipes. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Andres Rodriguez authored
Rename straggler instances of r(adeon)dev to a(mdgpu)dev Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Andres Rodriguez authored
The return value from copy_form_user is 0 for the success case. Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Andres Rodriguez authored
Use the same gfx_*_mqd_commit function for kfd and amdgpu codepaths. This removes the last duplicates of this programming sequence. v2: fix cp_hqd_pq_wptr value Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Andres Rodriguez authored
The gfxv7 contains a slightly different version of cik_mqd called bonaire_mqd. This can introduce subtle bugs if fixes are not applied in both places. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Andres Rodriguez authored
Handle HQD deactivation timeouts instead of ignoring them. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Andres Rodriguez authored
The MQD programming sequence currently exists in 3 different places. Refactor it to absorb all the duplicates. The success path remains mostly identical except for a slightly different order in the non-kiq case. This shouldn't matter if the HQD is disabled. The error handling paths have been updated to deal with the new code structure. v2: the non-kiq path for gfxv8 was dropped in the rebase v3: split MEC_HPD_SIZE rename, dropped doorbell changes Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Andres Rodriguez authored
Rename MEC_HPD_SIZE to GFXN_MEC_HPD_SIZE to clarify it is specific to a gfx generation. Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Rex Zhu authored
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Rex Zhu authored
This reverts commit f8fdaa0e7b81698ba2ad8c2d20c7f9a44c75e0c6. firmware add support for this feature, so still ctrl by vbios. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Rex Zhu authored
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Rex Zhu authored
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Christian König authored
This isn't beneficial any more since VRAM allocations are now split so that they fits into a single page table. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Christian König authored
Makes it easier to update the PDE with huge pages. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Alex Xie authored
Make the code easier to understand. Signed-off-by: Alex Xie <AlexBin.Xie@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Alex Xie authored
Move several if statements and a loop statment from run time to initialization time. Signed-off-by: Alex Xie <AlexBin.Xie@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Leo Liu authored
We need program ring buffer on instance 1 register space domain, when only if instance 1 available, with two instances or instance 0, and we need only program instance 0 regsiter space domain for ring. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Harish Kasiviswanathan authored
This change is also useful for the upcoming changes where page tables can be updated by CPU. Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Christian König authored
If updating the PDs fails we now invalidate all entries to try again later. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Christian König authored
Rename adjust_mc_addr to get_vm_pde and check the address bits in one place. v2: handle vcn as well, keep setting the valid bit manually, add a BUG_ON() for GMC v6, v7 and v8 as well. v3: handle vcn_v1_0_enc_ring_emit_vm_flush as well. v4: fix the BUG_ON mask for GFX6-8 Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Hawking Zhang authored
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Hawking Zhang authored
Load Balancing Per Watt (LBPW) allows dynamically disable CUs when they are idle Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Hawking Zhang authored
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Hawking Zhang authored
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Hawking Zhang authored
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Hawking Zhang authored
remnants from bring-up. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Eric Huang authored
Tools fb address was failed to send to smu when smu was not running. Changing sequence will fix it. Signed-off-by: Eric Huang <JinHuiEric.Huang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Eric Huang authored
It is to fix bug of sysfs entry pp_table which had size 0 of output before. Signed-off-by: Eric Huang <JinHuiEric.Huang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Shirish S authored
amdgpu_device_resume() & amdgpu_device_init() have a high time consuming call of amdgpu_late_init() which sets the clock_gating state of all IP blocks and is blocking. This patch defers only this setting of clock gating state operation to post resume of amdgpu driver but ideally before the UI comes up or in some cases post ui as well. With this change the resume time of amdgpu_device comes down from 1.299s to 0.199s which further helps in reducing the overall system resume time. V1: made the optimization applicable during driver load as well. TEST:(For ChromiumOS on STONEY only) * UI comes up * amdgpu_late_init() call gets called consistently and no errors reported. Signed-off-by: Shirish S <shirish.s@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-