- 22 Apr, 2022 2 commits
-
-
Tao Zhou authored
Prepare for the implementation of poison consumption handler. v2: separate umc handler from poison creation. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Yang Wang authored
simplify programming with existing functions. Signed-off-by: Yang Wang <KevinYang.Wang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 21 Apr, 2022 12 commits
-
-
Bokun Zhang authored
- In the latest version of the header, there is a variable name change. This should not cause any backward compatibility since the variable is at the same offset in the struct. Signed-off-by: Bokun Zhang <Bokun.Zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Bokun Zhang authored
- Clean up the identation in the header file Signed-off-by: Bokun Zhang <Bokun.Zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Bokun Zhang authored
- Update MIT license header Signed-off-by: Bokun Zhang <Bokun.Zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Alex Deucher authored
It's not used outside of dcn31_hubp.c. Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Miaoqian Lin authored
When dcn20_clk_src_construct() fails, we need to release clk_src. Fixes: 6f4e6361 ("drm/amd/display: Add Renoir resource (v2)") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Haowen Bai authored
aux_rep only memset but no use at all, so we drop it. Signed-off-by: Haowen Bai <baihaowen@meizu.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Alex Deucher authored
We normally runtime suspend when there are displays attached if they are in the DPMS off state, however, if something wakes the GPU we send a hotplug event on resume (in case any displays were connected while the GPU was in suspend) which can cause userspace to light up the displays again soon after they were turned off. Prior to commit 087451f3 ("drm/amdgpu: use generic fb helpers instead of setting up AMD own's."), the driver took a runtime pm reference when the fbdev emulation was enabled because we didn't implement proper shadowing support for vram access when the device was off so the device never runtime suspended when there was a console bound. Once that commit landed, we now utilize the core fb helper implementation which properly handles the emulation, so runtime pm now suspends in cases where it did not before. Ultimately, we need to sort out why runtime suspend in not working in this case for some users, but this should restore similar behavior to before. v2: move check into runtime_suspend v3: wake ups -> wakeups in comment, retain pm_runtime behavior in runtime_idle callback Fixes: 087451f3 ("drm/amdgpu: use generic fb helpers instead of setting up AMD own's.") Link: https://lore.kernel.org/r/20220403132322.51c90903@darkstar.example.org/Tested-by: Michele Ballabio <ballabio.m@gmail.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Lang Yu authored
This reverts commit 36bf9321. It causes SVM regressions on Vega10 with XNACK-ON. Just revert it at the moment. ./kfdtest --gtest_filter=KFDSVMRangeTest.MigratePolicyTest Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Philip Yang<Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Candice Li authored
v1: Add debugfs support to load/unload/invoke TA in runtime. v2: 1. Update some variables to static. 2. Use PAGE_ALIGN to calculate shared buf size directly. 3. Remove fp check. 4. Update debugfs from read to write. Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Candice Li authored
The upcoming TA debugfs interface needs to use indirect buffer when performing TA invoke and check psp response status for TA load and invoke. Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
David Yat Sin authored
Add support to checkpoint/restore GWS (Global Wave Sync) queues. Signed-off-by: David Yat Sin <david.yatsin@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
David Yat Sin authored
dqm->gws_queue_count and pdd->qpd.mapped_gws_queue need to be updated each time the queue gets evicted. Fixes: b8020b03 ("drm/amdkfd: Enable over-subscription with >1 GWS queue") Signed-off-by: David Yat Sin <david.yatsin@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 19 Apr, 2022 10 commits
-
-
Tales Lelo da Aparecida authored
To make sure maintainers of amdgpu drivers are aware of any changes in their documentation, add its entry to MAINTAINERS. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Tales Lelo da Aparecida <tales.aparecida@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Tales Lelo da Aparecida authored
Add missing acronyms to the amdgppu glossary. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/1939Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Tales Lelo da Aparecida <tales.aparecida@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Tom Rix authored
evergreen_default_state and evergreen_default_size are only used in evergreen.c. Single file symbols should be static. So move their definitions to evergreen_blit_shaders.h and change their storage-class-specifier to static. Remove unneeded evergreen_blit_shader.c evergreen_ps/vs definitions were removed with commit 4f862967 ("drm/radeon/kms: remove r6xx+ blit copy routines") So their declarations in evergreen_blit_shader.h are not needed, so remove them. Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Tom Rix authored
Smatch reports this issue virtual_link_hwss.c:32:6: warning: symbol 'virtual_setup_stream_attribute' was not declared. Should it be static? virtual_setup_stream_attribute is only used in virtual_link_hwss.c, but the other functions in the file are declared in the header file and used elsewhere. For consistency, add the virtual_setup_stream_attribute decl to virtual_link_hwss.h. Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Keita Suzuki authored
In function si_parse_power_table(), array adev->pm.dpm.ps and its member is allocated. If the allocation of each member fails, the array itself is freed and returned with an error code. However, the array is later freed again in si_dpm_fini() function which is called when the function returns an error. This leads to potential double free of the array adev->pm.dpm.ps, as well as leak of its array members, since the members are not freed in the allocation function and the array is not nulled when freed. In addition adev->pm.dpm.num_ps, which keeps track of the allocated array member, is not updated until the member allocation is successfully finished, this could also lead to either use after free, or uninitialized variable access in si_dpm_fini(). Fix this by postponing the free of the array until si_dpm_fini() and increment adev->pm.dpm.num_ps everytime the array member is allocated. Signed-off-by: Keita Suzuki <keitasuzuki.park@sslab.ics.keio.ac.jp> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Tales Lelo da Aparecida authored
It's a local function, let's make it static. AGD: remove prototype in dcn10_hubp.h Signed-off-by: Tales Lelo da Aparecida <tales.aparecida@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Darren Powell authored
Clarify the smu_cmn_send_smc_msg_with_param documentation to mention two cases exist where messages are silently dropped with no error returned. These cases occur in unusual situations where either: 1. the message type is not allowed to a virtual GPU, or 2. a PCI recovery is underway and the HW is not yet in sync with the SW For more details see commit 4ea5081c ("drm/amd/powerplay: enable SMC message filter") commit bf36b52e ("drm/amdgpu: Avoid accessing HW when suspending SW state") (v2) Reworked with suggestions from Luben & Paul (v3) Updated wording as per Luben's feedback Corrected error stating all messages denied on virtual GPU (each GPU has mask of which messages are allowed) Signed-off-by: Darren Powell <darren.powell@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Huang Rui authored
It needs to check if the pp_funcs is initialized while release the context, otherwise it will trigger null pointer panic while the software smu is not enabled. [ 1109.404555] BUG: kernel NULL pointer dereference, address: 0000000000000078 [ 1109.404609] #PF: supervisor read access in kernel mode [ 1109.404638] #PF: error_code(0x0000) - not-present page [ 1109.404657] PGD 0 P4D 0 [ 1109.404672] Oops: 0000 [#1] PREEMPT SMP NOPTI [ 1109.404701] CPU: 7 PID: 9150 Comm: amdgpu_test Tainted: G OEL 5.16.0-custom #1 [ 1109.404732] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 1109.404765] RIP: 0010:amdgpu_dpm_force_performance_level+0x1d/0x170 [amdgpu] [ 1109.405109] Code: 5d c3 44 8b a3 f0 80 00 00 eb e5 66 90 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 83 ec 08 4c 8b b7 f0 7d 00 00 <49> 83 7e 78 00 0f 84 f2 00 00 00 80 bf 87 80 00 00 00 48 89 fb 0f [ 1109.405176] RSP: 0018:ffffaf3083ad7c20 EFLAGS: 00010282 [ 1109.405203] RAX: 0000000000000000 RBX: ffff9796b1c14600 RCX: 0000000002862007 [ 1109.405229] RDX: ffff97968591c8c0 RSI: 0000000000000001 RDI: ffff9796a3700000 [ 1109.405260] RBP: ffffaf3083ad7c50 R08: ffffffff9897de00 R09: ffff979688d9db60 [ 1109.405286] R10: 0000000000000000 R11: ffff979688d9db90 R12: 0000000000000001 [ 1109.405316] R13: ffff9796a3700000 R14: 0000000000000000 R15: ffff9796a3708fc0 [ 1109.405345] FS: 00007ff055cff180(0000) GS:ffff9796bfdc0000(0000) knlGS:0000000000000000 [ 1109.405378] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1109.405400] CR2: 0000000000000078 CR3: 000000000a394000 CR4: 00000000000506e0 [ 1109.405434] Call Trace: [ 1109.405445] <TASK> [ 1109.405456] ? delete_object_full+0x1d/0x20 [ 1109.405480] amdgpu_ctx_set_stable_pstate+0x7c/0xa0 [amdgpu] [ 1109.405698] amdgpu_ctx_fini.part.0+0xcb/0x100 [amdgpu] [ 1109.405911] amdgpu_ctx_do_release+0x71/0x80 [amdgpu] [ 1109.406121] amdgpu_ctx_ioctl+0x52d/0x550 [amdgpu] [ 1109.406327] ? _raw_spin_unlock+0x1a/0x30 [ 1109.406354] ? drm_gem_handle_delete+0x81/0xb0 [drm] [ 1109.406400] ? amdgpu_ctx_get_entity+0x2c0/0x2c0 [amdgpu] [ 1109.406609] drm_ioctl_kernel+0xb6/0x140 [drm] Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Aaron Liu <aaron.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Lang Yu authored
The idea is from commit a50fe707 ("drm/amdkfd: Only apply heavy-weight TLB flush on Aldebaran") and commit f61c40c0 ("drm/amdkfd: enable heavy-weight TLB flush on Arcturus"). At the moment, heavy-weight TLB could cause problems on ASICs except Aldebaran and Arcturus. A simple hipMallocManaged/hipFree program could trigger this issue. [ 97.787657] amdgpu 0000:01:00.0: amdgpu: wait for kiq fence error: 0. [ 106.868758] amdgpu: qcm fence wait loop timeout expired [ 106.868966] amdgpu: The cp might be in an unrecoverable state due to an unsuccessful queues preemption [ 106.869203] amdgpu: Failed to evict process queues [ 106.869261] amdgpu: Failed to quiesce KFD Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Lang Yu authored
To make kfd_flush_tlb_after_unmap visible in kfd_svm.c, move it into kfd_priv.h. And change it to an inline function. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 14 Apr, 2022 5 commits
-
-
Gavin Wan authored
[why] These static variables save the RLC Scratch registers address. When we install multiple GPUs (for example: XGMI setting) and multiple GPUs call the function at same time. The RLC Scratch registers address are changed each other. Then it caused reading/writing from/to wrong GPU. [how] Removed the static from the variables. The variables are on the stack. Fixes: 5d447e29 ("drm/amdgpu: add helper for rlcg indirect reg access") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Gavin Wan <Gavin.Wan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Felix Kuehling authored
Add the waiters to the wait queue during initialization, while holding the event spinlock. Otherwise the waiter will not get activated if the event signals before being added to the wait queue. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Philip Yang<Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Rodrigo Siqueira authored
This reverts commit 863fa85e. While we were testing DCN3.1 with a hub, we noticed that only one of 2 connected displays lights up when using some specific display resolution. In summary, this was the setup: 1. Displays: * Sharp LQ156M1JW26 (eDP): 1080@240 * BENQ SW320 (DP): 4k@60 * BENQ EX3203R (DP): 4k@60 2. Hub: Club3D CSV-7300 3. ASIC: DCN3.1 After bisecting this issue, we figured out the commit mentioned above introduced this issue. We are investigating why this patch introduced this regression, but we need to revert it for now. Cc: Harry Wentland <harry.wentland@amd.com> Cc: Mark Broadworth <Mark.Broadworth@amd.com> Cc: Michael Strauss <michael.strauss@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
xinhui pan authored
VM might already be freed when amdgpu_vm_tlb_seq_cb() is called. We see the calltrace below. Fix it by keeping the last flush fence around and wait for it to signal BUG kmalloc-4k (Not tainted): Poison overwritten 0xffff9c88630414e8-0xffff9c88630414e8 @offset=5352. First byte 0x6c instead of 0x6b Allocated in amdgpu_driver_open_kms+0x9d/0x360 [amdgpu] age=44 cpu=0 pid=2343 __slab_alloc.isra.0+0x4f/0x90 kmem_cache_alloc_trace+0x6b8/0x7a0 amdgpu_driver_open_kms+0x9d/0x360 [amdgpu] drm_file_alloc+0x222/0x3e0 [drm] drm_open+0x11d/0x410 [drm] Freed in amdgpu_driver_postclose_kms+0x3e9/0x550 [amdgpu] age=22 cpu=1 pid=2485 kfree+0x4a2/0x580 amdgpu_driver_postclose_kms+0x3e9/0x550 [amdgpu] drm_file_free+0x24e/0x3c0 [drm] drm_close_helper.isra.0+0x90/0xb0 [drm] drm_release+0x97/0x1a0 [drm] __fput+0xb6/0x280 ____fput+0xe/0x10 task_work_run+0x64/0xb0 Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Dan Carpenter authored
If lookup_event_by_id() returns a NULL "ev" pointer then the spin_lock(&ev->lock) will crash. This was detected by Smatch: drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_events.c:644 kfd_set_event() error: we previously assumed 'ev' could be null (see line 639) Fixes: 5273e82c ("drm/amdkfd: Improve concurrency of event handling") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 13 Apr, 2022 6 commits
-
-
Mukul Joshi authored
Currently, the IO-links to the device being removed from topology, are not cleared. As a result, there would be dangling links left in the KFD topology. This patch aims to fix the following: 1. Cleanup all IO links to the device being removed. 2. Ensure that node numbering in sysfs and nodes proximity domain values are consistent after the device is removed: a. Adding a device and removing a GPU device are made mutually exclusive. b. The global proximity domain counter is no longer required to be an atomic counter. A normal 32-bit counter can be used instead. 3. Update generation_count to let user-mode know that topology has changed due to device removal. CC: Shuotao Xu <shuotaoxu@microsoft.com> Reviewed-by: Shuotao Xu <shuotaoxu@microsoft.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Yongqiang Sun authored
MS_HYPERV with vega10 doesn't have the interface to process request init data msg. Check hypervisor type to not send the request for MS_HYPERV. Signed-off-by: Yongqiang Sun <yongqiang.sun@amd.com> Reviewed-by: Alice Wong <shiwei.wong@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Lang Yu authored
A MAX_GPU_INSTANCE bits bitmap will suffice. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Wenjing Liu authored
[why] Extract update stream allocation table into link hwss as part of the link hwss refactor work. Reviewed-by: George Shen <George.Shen@amd.com> Reviewed-by: Fangzhi Zuo <Jerry.Zuo@amd.com> Acked-by: Pavle Kotarac <Pavle.Kotarac@amd.com> Signed-off-by: Wenjing Liu <wenjing.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
David Zhang authored
[why] creating a generic helper for AMD specific PSR-SU sink validation. Moving the function to the power module to reference it across all OS. [how] - drop PSRSU specific sink validation helper and move to power module by reading PSR version and other PSR caps - call the new helper from linux DM (amdgpu_dm_psr) Acked-by: Pavle Kotarac <Pavle.Kotarac@amd.com> Acked-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: David Zhang <dingchen.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
David Zhang authored
[why & how] As per eDP 1.5 spec, add the below two DPCD bit fields for PSR-SU support and capability: 1. DP_PSR2_WITH_Y_COORD_ET_SUPPORTED 2. DP_PSR2_SU_AUX_FRAME_SYNC_NOT_NEEDED changes in v2 ------------------ * fixed the typo * explicitly list what DPCD bit fields are added Signed-off-by: David Zhang <dingchen.zhang@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 12 Apr, 2022 5 commits
-
-
Aric Cyr authored
Title: DC Patches Apri 6, 2022 This DC patchset brings improvements in multiple areas. In summary, we highlight: *Disabling Z10 on DCN31 *Fix issue breaking 32bit Linux build *Fix inconsistent timestamp type *Add DCN30 support FEC init *Fix crash on setting VRR with no display connected *Disable FEC if DSC not supported for EDP *Add odm seamless boot support *Select correct DTO source *Power down hardware if timer not trigger Acked-by: Pavle Kotarac <Pavle.Kotarac@amd.com> Signed-off-by: Aric Cyr <aric.cyr@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Dillon Varone authored
[WHY&HOW] Change criteria for setting DTO source value, and always set it regardless of the signal type. Reviewed-by: Ariel Bernstein <Eric.Bernstein@amd.com> Acked-by: Pavle Kotarac <Pavle.Kotarac@amd.com> Signed-off-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Felix Kuehling authored
The synchronize_rcu call in destroy_events can take several ms, which noticeably slows down applications destroying many events. Use kfree_rcu to free the event structure asynchronously and eliminate the synchronize_rcu call in the user thread. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
hersen wu authored
[Why] within dc link detecion, dp link training will be executed for external sst dp. for debug purpose, we may need skip dp link training. [How] expose dc debug option to skip_detection_link_training to debugfs Reviewed-by: Roman Li <Roman.Li@amd.com> Acked-by: Pavle Kotarac <Pavle.Kotarac@amd.com> Signed-off-by: hersen wu <hersenxs.wu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Dillon Varone authored
Reviewed-by: Ariel Bernstein <Eric.Bernstein@amd.com> Acked-by: Pavle Kotarac <Pavle.Kotarac@amd.com> Signed-off-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-