Commits · b179fc28d521379ba7e0a38eec1a4c722e7ea634 · Kirill Smelkov / linux

28 Apr, 2022 5 commits

drm/amdkfd: Fix circular lock dependency warning · b179fc28

Mukul Joshi authored Apr 22, 2022

[  168.544078] ======================================================
[  168.550309] WARNING: possible circular locking dependency detected
[  168.556523] 5.16.0-kfd-fkuehlin #148 Tainted: G            E
[  168.562558] ------------------------------------------------------
[  168.568764] kfdtest/3479 is trying to acquire lock:
[  168.573672] ffffffffc0927a70 (&topology_lock){++++}-{3:3}, at:
		kfd_topology_device_by_id+0x16/0x60 [amdgpu] [  168.583663]
                but task is already holding lock:
[  168.589529] ffff97d303dee668 (&mm->mmap_lock#2){++++}-{3:3}, at:
		vm_mmap_pgoff+0xa9/0x180 [  168.597755]
                which lock already depends on the new lock.

[  168.605970]
                the existing dependency chain (in reverse order) is:
[  168.613487]
                -> #3 (&mm->mmap_lock#2){++++}-{3:3}:
[  168.619700]        lock_acquire+0xca/0x2e0
[  168.623814]        down_read+0x3e/0x140
[  168.627676]        do_user_addr_fault+0x40d/0x690
[  168.632399]        exc_page_fault+0x6f/0x270
[  168.636692]        asm_exc_page_fault+0x1e/0x30
[  168.641249]        filldir64+0xc8/0x1e0
[  168.645115]        call_filldir+0x7c/0x110
[  168.649238]        ext4_readdir+0x58e/0x940
[  168.653442]        iterate_dir+0x16a/0x1b0
[  168.657558]        __x64_sys_getdents64+0x83/0x140
[  168.662375]        do_syscall_64+0x35/0x80
[  168.666492]        entry_SYSCALL_64_after_hwframe+0x44/0xae
[  168.672095]
                -> #2 (&type->i_mutex_dir_key#6){++++}-{3:3}:
[  168.679008]        lock_acquire+0xca/0x2e0
[  168.683122]        down_read+0x3e/0x140
[  168.686982]        path_openat+0x5b2/0xa50
[  168.691095]        do_file_open_root+0xfc/0x190
[  168.695652]        file_open_root+0xd8/0x1b0
[  168.702010]        kernel_read_file_from_path_initns+0xc4/0x140
[  168.709542]        _request_firmware+0x2e9/0x5e0
[  168.715741]        request_firmware+0x32/0x50
[  168.721667]        amdgpu_cgs_get_firmware_info+0x370/0xdd0 [amdgpu]
[  168.730060]        smu7_upload_smu_firmware_image+0x53/0x190 [amdgpu]
[  168.738414]        fiji_start_smu+0xcf/0x4e0 [amdgpu]
[  168.745539]        pp_dpm_load_fw+0x21/0x30 [amdgpu]
[  168.752503]        amdgpu_pm_load_smu_firmware+0x4b/0x80 [amdgpu]
[  168.760698]        amdgpu_device_fw_loading+0xb8/0x140 [amdgpu]
[  168.768412]        amdgpu_device_init.cold+0xdf6/0x1716 [amdgpu]
[  168.776285]        amdgpu_driver_load_kms+0x15/0x120 [amdgpu]
[  168.784034]        amdgpu_pci_probe+0x19b/0x3a0 [amdgpu]
[  168.791161]        local_pci_probe+0x40/0x80
[  168.797027]        work_for_cpu_fn+0x10/0x20
[  168.802839]        process_one_work+0x273/0x5b0
[  168.808903]        worker_thread+0x20f/0x3d0
[  168.814700]        kthread+0x176/0x1a0
[  168.819968]        ret_from_fork+0x1f/0x30
[  168.825563]
                -> #1 (&adev->pm.mutex){+.+.}-{3:3}:
[  168.834721]        lock_acquire+0xca/0x2e0
[  168.840364]        __mutex_lock+0xa2/0x930
[  168.846020]        amdgpu_dpm_get_mclk+0x37/0x60 [amdgpu]
[  168.853257]        amdgpu_amdkfd_get_local_mem_info+0xba/0xe0 [amdgpu]
[  168.861547]        kfd_create_vcrat_image_gpu+0x1b1/0xbb0 [amdgpu]
[  168.869478]        kfd_create_crat_image_virtual+0x447/0x510 [amdgpu]
[  168.877884]        kfd_topology_add_device+0x5c8/0x6f0 [amdgpu]
[  168.885556]        kgd2kfd_device_init.cold+0x385/0x4c5 [amdgpu]
[  168.893347]        amdgpu_amdkfd_device_init+0x138/0x180 [amdgpu]
[  168.901177]        amdgpu_device_init.cold+0x141b/0x1716 [amdgpu]
[  168.909025]        amdgpu_driver_load_kms+0x15/0x120 [amdgpu]
[  168.916458]        amdgpu_pci_probe+0x19b/0x3a0 [amdgpu]
[  168.923442]        local_pci_probe+0x40/0x80
[  168.929249]        work_for_cpu_fn+0x10/0x20
[  168.935008]        process_one_work+0x273/0x5b0
[  168.940944]        worker_thread+0x20f/0x3d0
[  168.946623]        kthread+0x176/0x1a0
[  168.951765]        ret_from_fork+0x1f/0x30
[  168.957277]
                -> #0 (&topology_lock){++++}-{3:3}:
[  168.965993]        check_prev_add+0x8f/0xbf0
[  168.971613]        __lock_acquire+0x1299/0x1ca0
[  168.977485]        lock_acquire+0xca/0x2e0
[  168.982877]        down_read+0x3e/0x140
[  168.987975]        kfd_topology_device_by_id+0x16/0x60 [amdgpu]
[  168.995583]        kfd_device_by_id+0xa/0x20 [amdgpu]
[  169.002180]        kfd_mmap+0x95/0x200 [amdgpu]
[  169.008293]        mmap_region+0x337/0x5a0
[  169.013679]        do_mmap+0x3aa/0x540
[  169.018678]        vm_mmap_pgoff+0xdc/0x180
[  169.024095]        ksys_mmap_pgoff+0x186/0x1f0
[  169.029734]        do_syscall_64+0x35/0x80
[  169.035005]        entry_SYSCALL_64_after_hwframe+0x44/0xae
[  169.041754]
                other info that might help us debug this:

[  169.053276] Chain exists of:
                  &topology_lock --> &type->i_mutex_dir_key#6 --> &mm->mmap_lock#2

[  169.068389]  Possible unsafe locking scenario:

[  169.076661]        CPU0                    CPU1
[  169.082383]        ----                    ----
[  169.088087]   lock(&mm->mmap_lock#2);
[  169.092922]                                lock(&type->i_mutex_dir_key#6);
[  169.100975]                                lock(&mm->mmap_lock#2);
[  169.108320]   lock(&topology_lock);
[  169.112957]
                 *** DEADLOCK ***

This commit fixes the deadlock warning by ensuring pm.mutex is not
held while holding the topology lock. For this, kfd_local_mem_info
is moved into the KFD dev struct and filled during device init.
This cached value can then be used instead of querying the value
again and again.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

b179fc28

drm/amdkfd: Fix updating IO links during device removal · 98447635

Mukul Joshi authored Apr 20, 2022

The logic to update the IO links when a KFD device
is removed was not correct as it would miss updating
the proximity domain values for some nodes where the
node_from and node_to both were greater values than the
proximity domain value of the KFD device being removed
from topology.

Fixes: 46d18d51 ("drm/amdkfd: Cleanup IO links during KFD device removal")
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

98447635

drm/amdkfd: Use non-atomic bitmap functions when possible · b8b9ba58

Christophe JAILLET authored Nov 28, 2021

All uses of the 'kfd->gtt_sa_bitmap' bitmap are protected with the
'kfd->gtt_sa_lock' mutex.

So:
   - prefer the non-atomic '__set_bit()' function
   - use the non-atomic 'bitmap_[set|clear]()' functions instead of
     equivalent 'for' loops. These functions can work on several bits at a
     time
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

b8b9ba58

drm/amdkfd: Use bitmap_zalloc() when applicable · f43a9f18

Christophe JAILLET authored Nov 28, 2021

'kfd->gtt_sa_bitmap' is a bitmap. So use 'bitmap_zalloc()' to simplify
code, improve the semantic and avoid some open-coded arithmetic in
allocator arguments.

Also change the corresponding 'kfree()' into 'bitmap_free()' to keep
consistency.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

f43a9f18

drm/amd/display: protect remaining FPU-code calls on dcn3.1.x · 7324d02a

Melissa Wen authored Mar 30, 2022

From [1], I realized two other calls to dcn30 code are associated with
FPU operations and are not protected by DC_FP_* macros:
* dcn30_populate_dml_writeback_from_context()
* dcn30_set_mcif_arb_params()

So, since FPU-associated code is not fully isolated in dcn30, and
dcn3.1.x reuses them, let's wrap their calls properly.

Note: this patch complements the fix from [1].

[1] https://lore.kernel.org/amd-gfx/20220329082957.1662655-1-chandan.vurdigerenataraj@amd.com/Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

7324d02a

26 Apr, 2022 17 commits

drm/amd: Fix spelling typo in comment · 322687d5

oushixiong authored Apr 26, 2022

Signed-off-by: oushixiong <oushixiong@kylinos.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

322687d5

drm/amdgpu: fix typo · 2530dc3c

Rongguang Wei authored Apr 26, 2022

Fix spelling mistake:
	"differnt" -> "different"
	"commond"  -> "common"
Signed-off-by: Rongguang Wei <weirongguang@kylinos.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

2530dc3c

drm/amdgpu: debugfs: fix NULL dereference in ta_if_invoke_debugfs_write() · 2f33a397

Dan Carpenter authored Apr 26, 2022

If the kzalloc() fails then this code will crash. Return -ENOMEM instead.

Fixes: e50d9ba0 ("drm/amdgpu: Add debugfs TA load/unload/invoke support")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

2f33a397

drm/amdgpu: debugfs: fix error codes in write functions · a52ad5b6

Dan Carpenter authored Apr 26, 2022

There are two error code bugs here.  The copy_to/from_user() functions
return the number of bytes remaining (a positive number).  We should
return -EFAULT if the copy fails.

Second if we fail because "context.resp_status" is non-zero then return
-EINVAL instead of zero.

Fixes: e50d9ba0 ("drm/amdgpu: Add debugfs TA load/unload/invoke support")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

a52ad5b6

gpu/drm/radeon: Fix typo in comments · a6f2e0d9

Zhenneng Li authored Apr 26, 2022

Signed-off-by: Zhenneng Li <lizhenneng@kylinos.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

a6f2e0d9

drm/amd: add dc feature mask flags for PSR allow smu and multi-display optimizations · 5533347d

David Zhang authored Apr 25, 2022

[Why]
Allow for PSR SMU optimization and PSR multiple display optimization.

[How]
Add feature flags of PSR smu optimization and PSR multiple display
optimiztaion, and set them during init sequence. By default, flags
are disabled.
Signed-off-by: David Zhang <dingchen.zhang@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Roman Li <Roman.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

5533347d

drm/amdgpu: keep mmhub clock gating being enabled during s2idle suspend · 3bbeaa30

Prike Liang authored Apr 19, 2022

Without MMHUB clock gating being enabled then MMHUB will not disconnect
from DF and will result in DF C-state entry can't be accessed during S2idle
suspend, and eventually s0ix entry will be blocked.
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

3bbeaa30

drm/amd/display: fix if == else warning · e6eb2c5f

Guo Zhengkui authored Apr 24, 2022

Fix the following coccicheck warning:

drivers/gpu/drm/amd/display/dc/dcn201/dcn201_hwseq.c:98:8-10:
WARNING: possible condition with no effect (if == else)
Reviewed-by: Zhan Liu <zhan.liu@amd.com>
Signed-off-by: Guo Zhengkui <guozhengkui@vivo.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

e6eb2c5f

drm/amdgpu/display: Make dcn31_set_low_power_state static · 0bed2ace

Alex Deucher authored Apr 25, 2022

It's not used outside of dcn31_clk_mgr.c.
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

0bed2ace

drm/amdgpu: Fix out-of-bound access for gfx_v10_0_ring_test_ib() · 428f273c

Haohui Mai authored Apr 25, 2022

The gfx_v10_0_ring_test_ib() function uses 20 bytes instead of 16
bytes during the test. The patch sets the size of the allocation to be
4-byte larger to match the actual usage.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Haohui Mai <ricetons@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

428f273c

drm/amdgpu/sdma: Remove redundant lower_32_bits() calls when settings SDMA doorbell · ca5d251b

Haohui Mai authored Apr 25, 2022

Updated the patch for the pre-vega hardware. I kept the clamping code
to be safe.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Haohui Mai <ricetons@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

ca5d251b

drm/amdgpu/sdma: Fix incorrect calculations of the wptr of the doorbells · 7dba6e83

Haohui Mai authored Apr 25, 2022

This patch fixes the issue where the driver miscomputes the 64-bit
values of the wptr of the SDMA doorbell when initializing the
hardware. SDMA engines v4 and later on have full 64-bit registers for
wptr thus they should be set properly.

Older generation hardwares like CIK / SI have only 16 / 20 / 24bits
for the WPTR, where the calls of lower_32_bits() will be removed in a
following patch.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Haohui Mai <ricetons@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

7dba6e83

drm/radeon: change cac_weights_* to static · 9714d357

Tom Rix authored Apr 23, 2022

Sparse reports these issues
si_dpm.c:332:26: warning: symbol 'cac_weights_pitcairn' was not declared. Should it be static?
si_dpm.c:1088:26: warning: symbol 'cac_weights_oland' was not declared. Should it be static?

Both of these variables are only used in si_dpm.c. Single file variables
should be static, so change their storage-class specifiers to static.
Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

9714d357

drm/radeon: change cik_default_state table from global to static · 790d8e8e

Tom Rix authored Apr 23, 2022

Sparse reports these issues
cik_blit_shaders.c:31:11: warning: symbol 'cik_default_state' was not declared. Should it be static?
cik_blit_shaders.c:246:11: warning: symbol 'cik_default_size' was not declared. Should it be static?

cik_default_state and cik_default_size are only used in cik.c. Single file symbols
should be static. So move their definitions to cik_blit_shaders.h and change their
storage-class-specifier to static.

Remove unneeded cik_blit_shader.c
Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

790d8e8e

drm/amd/display: fix non-kernel-doc comment warnings · 4ae182de

Randy Dunlap authored Apr 22, 2022

Fix kernel-doc warnings for a comment that should not use
kernel-doc notation:

dmub_psr.c:235: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
 * Set PSR power optimization flags.
dmub_psr.c:235: warning: missing initial short description on line:
 * Set PSR power optimization flags.

Fixes: e5dfcd27 ("drm/amd/display: dc_link_set_psr_allow_active refactoring")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Robin Chen <po-tchen@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Anthony Koo <Anthony.Koo@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Leo Li <sunpeng.li@amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

4ae182de

drm/amdkfd: Update mapping if range attributes changed · 601354f3

Philip Yang authored Apr 18, 2022

Change SVM range mapping flags or access attributes don't trigger
migration, if range is already mapped on GPUs we should update GPU
mapping and pass flush_tlb flag true to amdgpu vm.

Change SVM range preferred_loc or migration granularity don't need
update GPU mapping, skip the validate_and_map.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

601354f3

drm/amdkfd: Add SVM range mapped_to_gpu flag · 6b9c63a6

Philip Yang authored Apr 18, 2022

To avoid unnecessary unmap SVM range from GPUs if range is not mapped on
GPUs when migrating the range. This flag will also be used to flush TLB
when updating the existing mapping on GPUs.

It is protected by prange->migrate_mutex and mmap read lock in MMU
notifier callback.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

6b9c63a6

25 Apr, 2022 15 commits

drm/amd/display: 3.2.183 · 398bb283

Aric Cyr authored Apr 18, 2022

This version brings along following fixes:
- Keep tracking of DSC packed PPS for future use
- Maintain current link settings in link loss interrupt
- Remove DDC write and read size check
- Read PSR-SU cap DPCD for specific panel
- Don't pass HostVM by default on DCN3.1
- Reset cached PSR parameters after hibernate
- Add audio readback registers
- Update dcn315 clk table read
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

398bb283

drm/amd/display: Keep track of DSC packed PPS · 9844792e

Ilya Bakoulin authored Mar 14, 2022

[Why]
Store current packed PPS data in dc_stream_state for future use.
Reviewed-by: Aric Cyr <Aric.Cyr@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Ilya Bakoulin <Ilya.Bakoulin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

9844792e

drm/amd/display: Remove unused integer · 3c540745

Dillon Varone authored Apr 13, 2022

Integer no longer needed.
Reviewed-by: Martin Leung <Martin.Leung@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Dillon Varone <dillon.varone@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

3c540745

drm/amd/display: Maintain current link settings in link loss interrupt · 9fbfeaf1

Gary Li authored Apr 14, 2022

[Why]
DP compliance test case 400.3.2.3 is failed because in link loss interrupt
the current link settings is not used in the DP link training.

[How]
In link loss interrupt, use the current link settings in the following DP
link training.
Reviewed-by: Wenjing Liu <Wenjing.Liu@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Gary Li <garyli12@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

9fbfeaf1

drm/amd/display: Remove ddc write and read size checking · e953cd08

Leo Ma authored Mar 23, 2022

[Why]
Customer found I2C over AUX using ADL_Display_DDCBlockAccess_Get
will fail when sending more than 256 bytes of data;

[How]
Remove the write and read size checking to allow sending data more
than 256 bytes;
Reviewed-by: Martin Leung <Martin.Leung@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Leo Ma <hanghong.ma@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

e953cd08

drm/amd/display: read PSR-SU cap DPCD for specific panel · d9f442e9

David Zhang authored Apr 11, 2022

[why & how]
For some specific eDP panel, we'd check the PSR-SU cap during boot
by reading the vendor specific DPCD, otherwise it will cause to
false report the eDP panel which supports PSR-SU as an non-PSR-SU
panel.

- add the vendor specific DPCD address in ddc_service_types header
- if specific eDP panel detected, check vendor specific DPCD for
  PSR-SU cap
Reviewed-by: Aurabindo Jayamohanan Pillai <Aurabindo.Pillai@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: David Zhang <dingchen.zhang@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

d9f442e9

drm/amd/display: Don't pass HostVM by default on DCN3.1 · 4a0caac0

Michael Strauss authored Apr 05, 2022

[WHY]
Roll back previous change to stop passing this value by default, instead
add a debug flag to override to previous behaviour (or force HostVM calcs)
Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Michael Strauss <michael.strauss@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

4a0caac0

drm/amd/display: Reset cached PSR parameters after hibernate · d2069326

Evgenii Krasnikov authored Apr 05, 2022

[WHY]
After hibernate system might be using old invalid psr_power_opt and
psr_allow_active that never get reset

[HOW]
Reset cached Panel Self Refresh parameters when PSR is first configured
for eDP in dc_link_setup_psr.
Reviewed-by: Harry Vanzylldejong <harry.vanzylldejong@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Evgenii Krasnikov <Evgenii.Krasnikov@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

d2069326

drm/amd/display: Add Audio readback registers · e955b547

Ilya Bakoulin authored Mar 28, 2022

[Why]
Can be useful for verifying the correctness of audio output.
Reviewed-by: Aric Cyr <Aric.Cyr@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Ilya Bakoulin <Ilya.Bakoulin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

e955b547

drm/amd/display: update dcn315 clk table read · 89c342a9

Dmytro Laktyushkin authored Apr 08, 2022

Clean up the sequence by making sure clk_mgr always builds a
reasonable clock table regardless of what we read from smu
by moving all defaults from resource soc struct to clk_mgr.

Now the only thing resource soc update does is read
the clock table and apply any DC specific policy decisions
to how clocks are populated in dml soc.
Reviewed-by: Qingqing Zhuo <Qingqing.Zhuo@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

89c342a9

drm/amd/display: 3.2.182 · 259f249c

Aric Cyr authored Apr 10, 2022

This version brings along following improvements:
- Fix HDCP QUERY Error for eDP and Tiled
- Insert smu busy status before sending another request
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

259f249c

drm/amd/display: Fix HDCP QUERY Error for eDP and Tiled · 84ebd73e

Mustapha Ghaddar authored Apr 07, 2022

[WHY]
For dio_output_encoder ID we are relying on SW concept which is
invisible to HW

[HOW]
Needed to create separate cases for when DPIA and non DPIA for
dio link encoder ID
Reviewed-by: Wenjing Liu <Wenjing.Liu@amd.com>
Reviewed-by: James Zhang <james.zhang@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Mustapha Ghaddar <mghaddar@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

84ebd73e

drm/amd/display: Insert smu busy status before sending another request · 721af39f

Oliver Logush authored Apr 01, 2022

[why]
Need to check if result register is busy before sending another request

[how]
Call method to check if result register is busy
Reviewed-by: Charlene Liu <Charlene.Liu@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Oliver Logush <oliver.logush@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

721af39f

drm/amdkfd: Ignore bogus signals from MEC efficiently · c3eb12df

Felix Kuehling authored Apr 07, 2022

MEC firmware sometimes sends signal interrupts without a valid context ID
on end of pipe events that don't intend to signal any HSA signals.
This triggers the slow path in kfd_signal_event_interrupt that scans the
entire event page for signaled events. Detect these signals in the top
half interrupt handler to stop processing them as early as possible.

Because we now always treat event ID 0 as invalid, reserve that ID during
process initialization.

v2: Update firmware version checks to support more GPUs
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

c3eb12df

drm/amdgpu: Remove useless kfree · b3ef3205

Haowen Bai authored Apr 22, 2022

After alloc fail, we do not need to kfree.
Signed-off-by: Haowen Bai <baihaowen@meizu.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

b3ef3205

22 Apr, 2022 3 commits

drm/amdgpu: Ta fw needs to be loaded for SRIOV aldebaran · a2443ef0

David Yu authored Apr 22, 2022

Load ta fw during psp_init_sriov_microcode to enable XGMI.
It is required to be loaded by both guest and host starting
from Arcturus. Cap fw needs to be loaded first.
Signed-off-by: David Yu <David.Yu@amd.com>
Reviewed-by: Shaoyun.liu <Shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

a2443ef0

drm/amd/pm: fix the deadlock issue observed on SI · 114f0887

Evan Quan authored Apr 08, 2022

The adev->pm.mutx is already held at the beginning of
amdgpu_dpm_compute_clocks/amdgpu_dpm_enable_uvd/amdgpu_dpm_enable_vce.
But on their calling path, amdgpu_display_bandwidth_update will be
called and thus its sub functions amdgpu_dpm_get_sclk/mclk. They
will then try to acquire the same adev->pm.mutex and deadlock will
occur.

By placing amdgpu_display_bandwidth_update outside of adev->pm.mutex
protection(considering logically they do not need such protection) and
restructuring the call flow accordingly, we can eliminate the deadlock
issue. This comes with no real logics change.

Fixes: 3712e7a4 ("drm/amd/pm: unified lock protections in amdgpu_dpm.c")
Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
Reported-by: Arthur Marsh <arthur.marsh@internode.on.net>
Link: https://lore.kernel.org/all/9e689fea-6c69-f4b0-8dee-32c4cf7d8f9c@molgen.mpg.de/
BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1957Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

114f0887

drm/amdgpu: add RAS fatal error interrupt handler · b3c76814

Tao Zhou authored Apr 19, 2022

The fatal error handler is independent from general ras interrupt
handler since there is no related IH ring.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

b3c76814