Commits · 4fe3819443a13f8ecf11f53559ada5711dd8d4b1 · Kirill Smelkov / linux

13 Dec, 2021 40 commits

drm/amd: add some extra checks that is_dig_enabled is defined · 4fe38194

Mario Limonciello authored Dec 10, 2021

There are a few places that this isn't checked that could potentially
be a NULL pointer access.
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

4fe38194

drm/amdgpu: Reduce SG bo memory usage for mGPUs · 28fe4164

Philip Yang authored Dec 06, 2021

For userptr bo, if adev is not in IOMMU isolation mode, RAM direct map
to GPU, multiple GPUs use same system memory dma mapping address, they
can share the original mem->bo in attachment to reduce dma address array
memory usage.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

28fe4164

drm/amdgpu: Detect if amdgpu in IOMMU direct map mode · 4a74c38c

Philip Yang authored Dec 06, 2021

If host and amdgpu IOMMU is not enabled or IOMMU is pass through mode,
set adev->ram_is_direct_mapped flag which will be used to optimize
memory usage for multi GPU mappings.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

4a74c38c

Documentation/gpu: Add amdgpu and dc glossary · a723c6d0

Rodrigo Siqueira authored Nov 25, 2021

In the DC driver, we have multiple acronyms that are not obvious most of
the time; the same idea is valid for amdgpu. This commit introduces a DC
and amdgpu glossary in order to make it easier to navigate through our
driver.

Changes since V3:
 - Yann: Add new acronyms to amdgpu glossary
 - Daniel: Add link between dc and amdgpu glossary

Changes since V2:
 - Add MMHUB

Changes since V1:
 - Yann: Divide glossary based on driver context.
 - Alex: Make terms more consistent and update CPLIB
 - Add new acronyms to the glossary
Reviewed-by: Yann Dirson <ydirson@free.fr>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

a723c6d0

Documentation/gpu: Add basic overview of DC pipeline · 522968ae

Rodrigo Siqueira authored Nov 25, 2021

This commit describes how DCN works by providing high-level diagrams
with an explanation of each component. In particular, it details the
Global Sync signals.

Change since V2:
 - Add a comment about MMHUBBUB.
Reviewed-by: Yann Dirson <ydirson@free.fr>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

522968ae

Documentation/gpu: How to collect DTN log · 76659755

Rodrigo Siqueira authored Nov 25, 2021

Introduce how to collect DTN log from debugfs.
Reviewed-by: Yann Dirson <ydirson@free.fr>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

76659755

Documentation/gpu: Document pipe split visual confirmation · b2568d68

Rodrigo Siqueira authored Nov 25, 2021

Display core provides a feature that makes it easy for users to debug
Pipe Split. This commit introduces how to use such a debug option.
Reviewed-by: Yann Dirson <ydirson@free.fr>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

b2568d68

Documentation/gpu: Document amdgpu_dm_visual_confirm debugfs entry · 7971fb35

Rodrigo Siqueira authored Nov 25, 2021

Display core provides a feature that makes it easy for users to debug
Multiple planes by enabling a visual notification at the bottom of each
plane. This commit introduces how to use such a feature.
Reviewed-by: Yann Dirson <ydirson@free.fr>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

7971fb35

Documentation/gpu: Reorganize DC documentation · e91f8401

Rodrigo Siqueira authored Nov 25, 2021

Display core documentation is not well organized, and it is hard to find
information due to the lack of sections. This commit reorganizes the
documentation layout, and it is preparation work for future changes.

Changes since V1:
- Christian: Group amdgpu documentation together.
- Daniel: Drop redundant amdgpu prefix.
- Jani: Create index pages.
- Yann: Mirror display folder in the documentation.
Reviewed-by: Yann Dirson <ydirson@free.fr>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

e91f8401

drm/amdgpu: add support for SMU debug option · 6ff7fddb

Lang Yu authored Nov 30, 2021

SMU firmware expects the driver maintains error context
and doesn't interact with SMU any more when SMU errors
occurred. That will aid in debugging SMU firmware issues.

Add SMU debug option support for this request, it can be
enabled or disabled via amdgpu_smu_debug debugfs file.
Use a 32-bit mask to indicate corresponding debug modes.
Currently, only one mode(HALT_ON_ERROR) is supported.
When enabled, it brings hardware to a kind of halt state
so that no one can touch it any more in the envent of SMU
errors.

The dirver interacts with SMU via sending messages. And
threre are three ways to sending messages to SMU in current
implementation. Handle them respectively as following:

1, smu_cmn_send_smc_msg_with_param() for normal timeout cases

  Halt on any error.

2, smu_cmn_send_msg_without_waiting()/smu_cmn_wait_for_response()
for longer timeout cases

  Halt on errors apart from ETIME. Otherwise this way won't work.
  Let the user handle ETIME error in such a case.

3, smu_cmn_send_msg_without_waiting() for no waiting cases

  Halt on errors apart from ETIME. Otherwise second way won't work.

== Command Guide ==

1, enable HALT_ON_ERROR mode

 # echo 0x1 > /sys/kernel/debug/dri/0/amdgpu_smu_debug

2, disable HALT_ON_ERROR mode

 # echo 0x0 > /sys/kernel/debug/dri/0/amdgpu_smu_debug

v5:
 - Use bit mask to allow more debug features.(Evan)
 - Use WRAN() instead of BUG().(Evan)

v4:
 - Set to halt state instead of a simple hang.(Christian)

v3:
 - Use debugfs_create_bool().(Christian)
 - Put variable into smu_context struct.
 - Don't resend command when timeout.

v2:
 - Resend command when timeout.(Lijo)
 - Use debugfs file instead of module parameter.
Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

6ff7fddb

drm/amdgpu: introduce a kind of halt state for amdgpu device · 34f3a4a9

Lang Yu authored Dec 09, 2021

It is useful to maintain error context when debugging
SW/FW issues. Introduce amdgpu_device_halt() for this
purpose. It will bring hardware to a kind of halt state,
so that no one can touch it any more.

Compare to a simple hang, the system will keep stable
at least for SSH access. Then it should be trivial to
inspect the hardware state and see what's going on.

v2:
 - Set adev->no_hw_access earlier to avoid potential crashes.(Christian)
Suggested-by: Christian Koenig <christian.koenig@amd.com>
Suggested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: Christian Koenig <christian.koenig@amd.co>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

34f3a4a9

drm/amdgpu: check df_funcs and its callback pointers · cace4bff

Hawking Zhang authored Nov 25, 2021

in case they are not avaiable in early phase
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

cace4bff

drm/amdgpu: don't override default ECO_BITs setting · 4ac955ba

Hawking Zhang authored Dec 04, 2021

Leave this bit as hardware default setting
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

4ac955ba

drm/amdgpu: correct register access for RLC_JUMP_TABLE_RESTORE · 2c113b99

Le Ma authored Dec 04, 2021

should count on GC IP base address
Signed-off-by: Le Ma <le.ma@amd.com>
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

2c113b99

drm/amdgpu: read and authenticate ip discovery binary · 2cb6577a

Hawking Zhang authored Nov 22, 2021

read and authenticate ip discovery binary getting from
vram first, if it is not valid, read and authenticate
the one getting from file
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

2cb6577a

drm/amdgpu: add helper to verify ip discovery binary signature · 32f0e1a3

Hawking Zhang authored Nov 22, 2021

To be used to check ip discovery binary signature
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

32f0e1a3

drm/amdgpu: rename discovery_read_binary helper · f6dcaf0c

Hawking Zhang authored Nov 22, 2021

add _from_vram in the funciton name to diffrentiate
the one used to read from file
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

f6dcaf0c

drm/amdgpu: add helper to load ip_discovery binary from file · 43a80bd5

Hawking Zhang authored Nov 23, 2021

To be used when ip_discovery binary is not carried by vbios
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

43a80bd5

drm/amdgpu: fix incorrect VCN revision in SRIOV · c40bdfb2

Leslie Shi authored Dec 08, 2021

Guest OS will setup VCN instance 1 which is disabled as an enabled instance and
execute initialization work on it, but this causes VCN ib ring test failure
on the disabled VCN instance during modprobe:

amdgpu 0000:00:08.0: amdgpu: ring vcn_enc_1.0 uses VM inv eng 5 on hub 1
amdgpu 0000:00:08.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on vcn_dec_0 (-110).
amdgpu 0000:00:08.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on vcn_enc_0.0 (-110).
[drm:amdgpu_device_delayed_init_work_handler [amdgpu]] *ERROR* ib ring test failed (-110).

v2: drop amdgpu_discovery_get_vcn_version and rename sriov_config to
vcn_config
v3: modify VCN's revision in SR-IOV and bare-metal

Fixes: baf3f8f3 ("drm/amdgpu: handle SRIOV VCN revision parsing")
Signed-off-by: Leslie Shi <Yuliang.Shi@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

c40bdfb2

drm/amdgpu: add modifiers in amdgpu_vkms_plane_init() · 4046afce

Leslie Shi authored Dec 06, 2021

Fix following warning in SRIOV during modprobe:

amdgpu 0000:00:08.0: GFX9+ requires FB check based on format modifier
WARNING: CPU: 0 PID: 1023 at drivers/gpu/drm/amd/amdgpu/amdgpu_display.c:1150 amdgpu_display_framebuffer_init+0x8e7/0xb40 [amdgpu]
Signed-off-by: Leslie Shi <Yuliang.Shi@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

4046afce

drm/amdgpu: disable default navi2x co-op kernel support · addaac0c

Jonathan Kim authored Dec 09, 2021

This patch reverts the following:
commit 48733b22 ("drm/amdkfd: add Navi2x to GWS init conditions")

Disable GWS usage in default settings for now due to FW bugs.
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

addaac0c

drm/amdkfd: add Navi2x to GWS init conditions · 48733b22

Graham Sider authored Dec 09, 2021

Initalize GWS on Navi2x with mec2_fw_version >= 0x42.
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-and-tested-by: Jonathan Kim <jonathan.kim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

48733b22

drm/amdgpu: only hw fini SMU fisrt for ASICs need that · 613aa3ea

Lang Yu authored Dec 03, 2021

We found some headaches on ASICs don't need that,
so remove that for them.
Suggested-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: Kevin Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

613aa3ea

drm/amdgpu: remove power on/off SDMA in SMU hw_init/fini() · a60831ea

Lang Yu authored Dec 03, 2021

Currently, we don't find some neccesities to power on/off
SDMA in SMU hw_init/fini(). It makes more sense in SDMA
hw_init/fini().
Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: Kevin Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

a60831ea

drm/amdkfd: Make KFD support on Hawaii experimental · 0f7ef0b9

Felix Kuehling authored Dec 07, 2021

Hawaii support is mostly untested these days. ROCm user mode also
depends on custom firmware for AQL packet processing, that was never
pushed upstream due to quality regressions in graphics driver testing.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

0f7ef0b9

drm/amdkfd: Don't split unchanged SVM ranges · 4853cbcd

Felix Kuehling authored Dec 07, 2021

If an existing SVM range overlaps an svm_range_set_attr call, we would
normally split it in order to update only the overlapping part.
However, if the attributes of the existing range would not be changed
splitting it is unnecessary.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

4853cbcd

drm/amdkfd: Fix svm_range_is_same_attrs · f864df76

Felix Kuehling authored Dec 07, 2021

The existing function doesn't compare the access bitmaps and flags.
This can result in failure to update those attributes in existing
ranges when all other attributes remained unchanged.

Because the access and flags attributes modify only some bits in the
respective bitmaps, we cannot compare them directly. Instead we need to
check whether applying the attributes to a particular range would
change the bitmaps.

A PREFETCH_LOC attribute must always trigger a migration, even if the
attribute value remains unchanged. E.g. if some pages were migrated due
to a CPU page fault, a prefetch must still be executed to migrate pages
back to VRAM.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

f864df76

drm/amdkfd: Fix error handling in svm_range_add · 726be406

Felix Kuehling authored Dec 07, 2021

Add null-pointer check after the last svm_range_new call. This was
originally reported by Zhou Qingyang <zhou1615@umn.edu> based on a
static analyzer.

To avoid duplicating the unwinding code from svm_range_handle_overlap,
I merged the two functions into one.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Zhou Qingyang <zhou1615@umn.edu>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

726be406

drm/amdgpu: Handle fault with same timestamp · 0771c805

Philip Yang authored Dec 08, 2021

Remove not unique timestamp WARNING as same timestamp interrupt happens
on some chips,

Drain fault need to wait for the processed_timestamp to be truly greater
than the checkpoint or the ring to be empty to be sure no stale faults
are handled.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1818Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

0771c805

drm/amdgpu: fix location of prototype for amdgpu_kms_compat_ioctl · e105b64a

Isabella Basso authored Dec 07, 2021

This fixes the warning below by changing the prototype to a location
that's actually included by the .c files that call
amdgpu_kms_compat_ioctl:

 warning: no previous prototype for ‘amdgpu_kms_compat_ioctl’
 [-Wmissing-prototypes]
 37 | long amdgpu_kms_compat_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
    |      ^~~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Isabella Basso <isabbasso@riseup.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

e105b64a

drm/amd: append missing includes · 64cf26f0

Isabella Basso authored Dec 07, 2021

This fixes warnings caused by global functions lacking prototypes:, such as:

 warning: no previous prototype for 'dcn303_hw_sequencer_construct'
 [-Wmissing-prototypes]
 12 | void dcn303_hw_sequencer_construct(struct dc *dc)
    |      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 ...
 warning: no previous prototype for ‘amdgpu_has_atpx’
 [-Wmissing-prototypes]
 76 | bool amdgpu_has_atpx(void) {
    |      ^~~~~~~~~~~~~~~
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Isabella Basso <isabbasso@riseup.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

64cf26f0

drm/amdkfd: fix function scopes · ded331a0

Isabella Basso authored Dec 07, 2021

 This turns previously global functions into static, thus removing
 compile-time warnings such as:

 warning: no previous prototype for 'pm_set_resources_vi' [-Wmissing-prototypes]
 113 | int pm_set_resources_vi(struct packet_manager *pm, uint32_t *buffer,
     |     ^~~~~~~~~~~~~~~~~~~
Signed-off-by: Isabella Basso <isabbasso@riseup.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

ded331a0

drm/amdgpu: fix function scopes · 2351b7d4

Isabella Basso authored Dec 07, 2021

This turns previously global functions into static, thus removing
compile-time warnings such as:

 warning: no previous prototype for 'amdgpu_vkms_output_init' [-Wmissing-prototypes]
 399 | int amdgpu_vkms_output_init(struct drm_device *dev,
     |     ^~~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Isabella Basso <isabbasso@riseup.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

2351b7d4

drm/amd: fix improper docstring syntax · bbe04dec

Isabella Basso authored Dec 07, 2021

This fixes various warnings relating to erroneous docstring syntax, of
which some are listed below:

 warning: Function parameter or member 'adev' not described in
 'amdgpu_atomfirmware_ras_rom_addr'
 ...
 warning: expecting prototype for amdgpu_atpx_validate_functions().
 Prototype was for amdgpu_atpx_validate() instead
 ...
 warning: Excess function parameter 'mem' description in 'amdgpu_preempt_mgr_new'
 ...
 warning: Cannot understand  * @kfd_get_cu_occupancy - Collect number of
 waves in-flight on this device
 ...
 warning: This comment starts with '/**', but isn't a kernel-doc
 comment. Refer Documentation/doc-guide/kernel-doc.rst
Signed-off-by: Isabella Basso <isabbasso@riseup.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

bbe04dec

drm/amd: Mark IP_BASE definition as __maybe_unused · 0e2a82a3

Isabella Basso authored Dec 07, 2021

Silences 166 compile-time warnings like:

 warning: 'UVD0_BASE' defined but not used [-Wunused-const-variable=]
 129 | static const struct IP_BASE UVD0_BASE ={ { { { 0x00007800, 0x00007E00, 0, 0, 0 } },
     |                             ^~~~~~~~~
 warning: 'UMC0_BASE' defined but not used [-Wunused-const-variable=]
 123 | static const struct IP_BASE UMC0_BASE ={ { { { 0x00014000, 0, 0, 0, 0 } },
     |                             ^~~~~~~~~
Signed-off-by: Isabella Basso <isabbasso@riseup.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

0e2a82a3

drm/amdgpu: extended waiting SRIOV VF reset completion timeout to 10s · 85a774d9

Zhigang Luo authored Dec 06, 2021

For the ASIC has big FB, it need more time to clear FB during reset.
This change extended SRIOV VF waiting reset completion timeout from 5s
to 10s.
Signed-off-by: Zhigang Luo <zhigang.luo@amd.com>
Acked-by: Shaoyun Liu <shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

85a774d9

drm/amdgpu: recover XGMI topology for SRIOV VF after reset · a5f67c93

Zhigang Luo authored Dec 06, 2021

For SRIOV VF, the XGMI topology was not recovered after reset. This
change added code to SRIOV VF reset function to update XGMI topology
for SRIOV VF after reset.
Signed-off-by: Zhigang Luo <zhigang.luo@amd.com>
Reviewed-by: Shaoyun Liu <shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

a5f67c93

drm/amdgpu: added PSP XGMI initialization for SRIOV VF during recover · dd26e018

Zhigang Luo authored Dec 06, 2021

For SRIOV VF, XGMI was not initialized in PSP during recover. This change
added PSP XGMI initialization for SRIOV VF during recover.
Signed-off-by: Zhigang Luo <zhigang.luo@amd.com>
Reviewed-by: Shaoyun Liu <shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

dd26e018

drm/amdgpu: skip reset other device in the same hive if it's SRIOV VF · 175ac6ec

Zhigang Luo authored Nov 26, 2021

On SRIOV, host driver can support FLR(function level reset) on individual VF
within the hive which might bring the individual device back to normal without
the necessary to execute the hive reset. If the FLR failed , host driver will
trigger the hive reset, each guest VF will get reset notification before the
real hive reset been executed. The VF device can handle the reset request
individually in it's reset work handler.

This change updated gpu recover sequence to skip reset other device in
the same hive for SRIOV VF.
Signed-off-by: Zhigang Luo <zhigang.luo@amd.com>
Reviewed-by: Shaoyun Liu <shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

175ac6ec

drm/amd/display: Add feature flags to disable LTTPR · 12320274

Aurabindo Pillai authored Dec 07, 2021

[Why]
Allow for disabling non transparent mode of LTTPR for running tests.

[How]
Add a feature flag and set them during init sequence. The flags are
already being used in DC.
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

12320274