Commits · 65ba96e91b689c23d6fa99c11cfd65965dcddc47 · Kirill Smelkov / linux

15 Mar, 2023 1 commit

drm/amdgpu: Move to common indirect reg access helper · 65ba96e9

Hawking Zhang authored 2 years ago

Replace the soc15-, nv-, and soc21-specific callbacks with a common
one, so we don't need to duplicate code when introducing
new ASICs.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

65ba96e9

13 Mar, 2023 1 commit

drm/amdgpu: drop pm_sysfs_en flag from amdgpu_device structure · 53e9d836

Guchun Chen authored 1 year ago

pm_sysfs_en overlaps with pm.sysfs_initialized, so drop it
to simplify the code (no functional change).
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

53e9d836

08 Mar, 2023 1 commit

drm/amdgpu: Drop redundant pci_enable_pcie_error_reporting() · 58265640

Bjorn Helgaas authored 2 years ago

pci_enable_pcie_error_reporting() enables the device to send ERR_*
Messages.  Since f26e58bf ("PCI/AER: Enable error reporting when AER is
native"), the PCI core does this for all devices during enumeration, so the
driver doesn't need to do it itself.

Remove the redundant pci_enable_pcie_error_reporting() call from the
driver.

Note that this only controls ERR_* Messages from the device.  An ERR_*
Message may cause the Root Port to generate an interrupt, depending on the
AER Root Error Command register managed by the AER service driver.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

58265640

09 Feb, 2023 1 commit

drm/amdgpu: add S/G display parameter · bf0207e1

Alex Deucher authored 2 years ago

Some users have reported flickering with S/G display.  We've
tried extensively to reproduce and debug the issue on a wide
variety of platform configurations (DRAM bandwidth, etc.) and
a variety of monitors, but so far have not been able to.  We
disabled S/G display on a number of platforms to address this,
but that leads to "failure to pin framebuffer" errors and
blank displays when there is memory pressure, or no displays
at all on systems with limited carveout (e.g., Chromebooks).
Add an option to disable S/G display so that users can turn it
off depending on their use case, and so that we can debug this
further.

v2: fix typo
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

bf0207e1

05 Jan, 2023 2 commits

drm/amdgpu: Retry DDC probing on DVI on failure if we got an HPD interrupt · 90f56611

xurui authored 2 years ago

HPD signals on DVI ports can be fired off before the pins required for
DDC probing actually make contact, due to the pins for HPD making
contact first. This results in a HPD signal being asserted but DDC
probing failing, resulting in hotplugging occasionally failing.

Rescheduling the hotplug work for a second when we run into an HPD
signal with a failing DDC probe usually gives enough time for the rest
of the connector's pins to make contact, and fixes this issue.
Signed-off-by: xurui <xurui@kylinos.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

90f56611
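
The retry rule described in the 90f56611 message can be sketched in plain C. This is only an illustration under stated assumptions: `hotplug_retry_delay_ms` and `HPD_RETRY_DELAY_MS` are hypothetical names, not taken from the driver, and the 1000 ms value simply mirrors the "for a second" wording above.

```c
#include <stdbool.h>

/* Hypothetical constant: the message only says the work is rescheduled "for a second". */
#define HPD_RETRY_DELAY_MS 1000u

/*
 * Decide whether hotplug handling should be retried later.
 * Returns the delay in milliseconds before the retry, or 0 when no
 * retry is needed. A retry is requested only for a DVI connector that
 * raised an HPD interrupt but whose DDC probe failed, i.e. the HPD pin
 * made contact before the DDC pins did.
 */
static unsigned int hotplug_retry_delay_ms(bool is_dvi, bool hpd_interrupt,
                                           bool ddc_probe_ok)
{
    if (is_dvi && hpd_interrupt && !ddc_probe_ok)
        return HPD_RETRY_DELAY_MS;
    return 0;
}
```

In a real driver the nonzero result would presumably feed a delayed work item rather than being returned to a caller; that wiring is omitted here.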

Revert "drm/amd/display: Enable Freesync Video Mode by default" · 6fe6ece3

Michel Dänzer authored 2 years ago

This reverts commit de05abe6.

The bug referenced below was bisected to this commit. There has been no
activity toward fixing it in 3 months, so let's revert for now.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2162
Signed-off-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

6fe6ece3

03 Jan, 2023 4 commits

Revert "drm/amd/display: Enable Freesync Video Mode by default" · 4243c84a

Michel Dänzer authored 2 years ago

This reverts commit de05abe6.

The bug referenced below was bisected to this commit. There has been no
activity toward fixing it in 3 months, so let's revert for now.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2162
Signed-off-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

4243c84a

drm/amdgpu: allow zero as vram limit · 0b04ea39

Christian König authored 3 years ago

This allows testing the driver without any VRAM.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

0b04ea39

drm/amdgpu: rename vram_scratch into mem_scratch · 7ccfd79f

Christian König authored 3 years ago

Rename vram_scratch into mem_scratch and allow allocating it into GTT as
well.

The only problem with that is that we won't have a default page for the
system aperture any more.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

7ccfd79f

drm/amdgpu: use VRAM|GTT for a bunch of kernel allocations · 58ab2c08

Christian König authored 3 years ago

Technically all of those can use GTT as well, no need to force things
into VRAM.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

58ab2c08

06 Dec, 2022 1 commit

drm/ttm: merge ttm_bo_api.h and ttm_bo_driver.h v2 · a3185f91

Christian König authored 2 years ago

Merge and cleanup the two headers into a single description of the
object API. Also move all the documentation to the implementation and
drop unnecessary includes from the header.

No functional change.

v2: minimal checkpatch.pl cleanup
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221125102137.1801-4-christian.koenig@amd.com

a3185f91

17 Nov, 2022 1 commit

drm/amdgpu: rename the files for HMM handling · d9483ecd

Christian König authored 2 years ago

Clean that up a bit, no functional change.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

d9483ecd

15 Nov, 2022 4 commits

drm/amdgpu: there is no vbios fb on devices with no display hw (v2) · 220c8cc8

Alex Deucher authored 2 years ago

If we enable virtual display functionality on parts with
no display hardware we can end up trying to check for and
reserve the vbios FB area on devices where it doesn't exist.
Check if display hardware is actually present on the hardware
before trying to reserve the memory.

v2: move the check into common code
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

220c8cc8

drm/amdgpu: there is no vbios fb on devices with no display hw (v2) · f8794f31

Alex Deucher authored 2 years ago

If we enable virtual display functionality on parts with
no display hardware we can end up trying to check for and
reserve the vbios FB area on devices where it doesn't exist.
Check if display hardware is actually present on the hardware
before trying to reserve the memory.

v2: move the check into common code
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

f8794f31

drm/amdgpu: clarify DC checks · d09ef243

Alex Deucher authored 2 years ago

There are several places where we don't want to check
if a particular asic could support DC, but rather, if
DC is enabled.  Set a flag if DC is enabled and check
for that rather than if a device supports DC or not.
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

d09ef243

drm/amdgpu: rework SR-IOV virtual display handling · 25263da3

Alex Deucher authored 2 years ago

Virtual display is enabled unconditionally in SR-IOV, but
without specifying the virtual_display module parameter, the
number of crtcs defaults to 0.  Set a single display by default
for SR-IOV if the virtual_display parameter is not set.
Only enable virtual display by default on SR-IOV on ASICs
which actually have display hardware.
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

25263da3

04 Nov, 2022 1 commit

drm/amdgpu: extend halt_if_hws_hang to MES · 9a1662f5

Graham Sider authored 2 years ago

Hang on MES timeout if halt_if_hws_hang is set to 1.
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

9a1662f5

19 Oct, 2022 1 commit

Revert "drm/amdgpu: add debugfs amdgpu_reset_level" · afbaa155

Victor Zhao authored 2 years ago

This reverts commit 5bd8d53f.

This commit breaks the reset logic for aldebaran, revert it for now.
Will move the mask inside the reset handler.

Fixes: 5bd8d53f ("drm/amdgpu: add debugfs amdgpu_reset_level")
Signed-off-by: Victor Zhao <Victor.Zhao@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

afbaa155

17 Oct, 2022 2 commits

Revert "drm/amdgpu: add debugfs amdgpu_reset_level" · 6aa58939

Victor Zhao authored 2 years ago

This reverts commit 5bd8d53f.

This commit breaks the reset logic for aldebaran, revert it for now.
Will move the mask inside the reset handler.

Fixes: 5bd8d53f ("drm/amdgpu: add debugfs amdgpu_reset_level")
Signed-off-by: Victor Zhao <Victor.Zhao@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

6aa58939

drm/amdgpu: extend HWIP_MAX_INSTANCE to 28 · 7a94c860

Hawking Zhang authored 2 years ago

more ip instances are available
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

7a94c860

20 Sep, 2022 1 commit

drm/amdgpu: add gang submit backend v2 · 68ce8b24

Christian König authored 3 years ago

Allows submitting jobs as a gang, which needs to run on multiple
engines at the same time.

The basic idea is that we have a global gang submit fence representing when the
gang leader is finally pushed to run on the hardware last.

Jobs submitted as a gang are never re-submitted in case of a GPU reset, since this
won't work and will just deadlock the hardware immediately again.

v2: fix logic inversion, improve documentation, fix rcu
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

68ce8b24

16 Aug, 2022 3 commits

drm/amdgpu: reduce reset time · 194eb174

Victor Zhao authored 2 years ago

In the multi-container use case, reset time is important, so skip ring
tests and the CP halt wait while suspending IPs for reset, as they are
going to fail and only add to the reset time.

v2: add a hang flag to indicate the reset comes from a job timeout,
skip ring test and cp halt wait in this case

v3: move hang flag to adev
Signed-off-by: Victor Zhao <Victor.Zhao@amd.com>
Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

194eb174

drm/amdgpu: add debugfs amdgpu_reset_level · 5bd8d53f

Victor Zhao authored 2 years ago

Introduce an amdgpu_reset_level debugfs entry to help debug and
test specific types of reset. It also helps block unwanted types of
resets.

By default, mode2 reset will not be enabled.

v2: make this debugfs in adev and use debugfs_create_u32
Signed-off-by: Victor Zhao <Victor.Zhao@amd.com>
Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

5bd8d53f

drm/amdgpu: Increase tlb flush timeout for sriov · 373008bf

Dusica Milinkovic authored 2 years ago

[Why]
A KIQ timeout error was observed while running a benchmark (Luxmark) on multiple VFs.
It happens because all of the VFs do the TLB invalidation at the same time.
Although each VF has its own invalidate register set, on the hardware side
the invalidate requests are queued for execution.

[How]
With 12 VFs, increase the timeout to 12*100ms.
Signed-off-by: Dusica Milinkovic <Dusica.Milinkovic@amd.com>
Acked-by: Shaoyun Liu <shaoyun.liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

373008bf
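
The "12*100ms" arithmetic in the 373008bf message can be illustrated with a small helper. This is a sketch under assumptions: `sriov_tlb_flush_timeout_ms` and `TLB_FLUSH_TIMEOUT_PER_VF_MS` are hypothetical names, and the actual patch may well use a fixed constant instead of scaling by the VF count.

```c
/* Assumed per-VF budget, taken from the "12*100ms" figure in the message. */
#define TLB_FLUSH_TIMEOUT_PER_VF_MS 100u

/*
 * Scale the TLB flush timeout with the number of VFs, on the reasoning
 * that the hardware queues invalidate requests, so in the worst case a
 * VF waits behind every other VF's request.
 */
static unsigned int sriov_tlb_flush_timeout_ms(unsigned int num_vfs)
{
    if (num_vfs == 0)
        num_vfs = 1; /* treat "no VF count" as a single function */
    return num_vfs * TLB_FLUSH_TIMEOUT_PER_VF_MS;
}
```

With 12 VFs this gives the 1200 ms budget quoted in the message.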

28 Jul, 2022 1 commit

drm/amdgpu: Fix the incomplete product number · 1f83db6b

Roy Sun authored 2 years ago

The comments say that the product number is a 16-digit HEX string, so the
buffer needs to be at least 17 characters to hold the NUL terminator. Expand
the buffer size to 20 to avoid alignment issues.

The comment: "Product number should only be 16 characters. Any
more, and something could be wrong. Cap it at 16 to be safe."
Signed-off-by: Roy Sun <Roy.Sun@amd.com>
Reviewed-by: André Almeida <andrealmeid@igalia.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

1f83db6b
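
The buffer sizing described in the 1f83db6b message can be sketched as follows. The names `PRODUCT_NUMBER_DIGITS`, `PRODUCT_NUMBER_BUF_LEN`, and `copy_product_number` are hypothetical and exist only for this illustration; the sizes 16 and 20 come from the message itself.

```c
#include <stddef.h>
#include <string.h>

#define PRODUCT_NUMBER_DIGITS  16 /* hex digits, per the quoted comment */
#define PRODUCT_NUMBER_BUF_LEN 20 /* >= 17 for the NUL, padded to 20 */

/*
 * Copy at most 16 characters of src into dst and always NUL-terminate.
 * Returns the number of characters copied (excluding the terminator).
 * A 16-byte buffer would have no room for the terminator, which is the
 * truncation problem the commit message describes.
 */
static size_t copy_product_number(char dst[PRODUCT_NUMBER_BUF_LEN],
                                  const char *src)
{
    size_t n = 0;

    /* Cap the length at 16 without reading past src's terminator. */
    while (n < PRODUCT_NUMBER_DIGITS && src[n] != '\0')
        n++;

    memcpy(dst, src, n);
    dst[n] = '\0';
    return n;
}
```

Capping the copy at 16 while sizing the buffer above 17 keeps the terminator write in bounds even for over-long input.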

25 Jul, 2022 1 commit

drm/amd/display: Add visualconfirm module parameter · 792a0cdd

Leo Li authored 2 years ago

[Why]

Being able to configure visual confirm at boot or in cmdline is helpful
when debugging.

[How]

Add a module parameter to configure DC visual confirm, which works the
same way as the equivalent debugfs entry.
Signed-off-by: Leo Li <sunpeng.li@amd.com>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

792a0cdd

18 Jul, 2022 1 commit

drm/amdgpu: drop runpm from amdgpu_device structure · 9c913f38

Guchun Chen authored 2 years ago

It's redundant now that rpm_mode is used to indicate the
runtime power management mode.
Suggested-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

9c913f38

13 Jul, 2022 1 commit

drm/amdgpu: support reset flag set for gpu reset · f1549c09

Likun Gao authored 2 years ago

Move reset_context out of the GPU recover function to make it configurable
for different reset purposes.
When a reset is triggered through the gpu_recovery sysfs, force the full reset
method. Otherwise, try a soft reset by default if the ASIC
supports it; if the soft reset fails, fall back to a full reset.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

f1549c09
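
The selection policy described in the f1549c09 message can be sketched as a small decision function. Everything here is hypothetical scaffolding for illustration: `struct reset_request`, `enum reset_method`, and `run_reset` are not the driver's types, and `soft_reset_ok` merely simulates whether a soft reset attempt would succeed.

```c
#include <stdbool.h>

enum reset_method {
    RESET_METHOD_SOFT,
    RESET_METHOD_FULL,
};

/* Hypothetical request descriptor, not the driver's reset context. */
struct reset_request {
    bool from_sysfs;           /* reset requested via gpu_recovery */
    bool soft_reset_supported; /* ASIC implements soft reset */
    bool soft_reset_ok;        /* simulated outcome of the soft reset */
};

/*
 * Return the reset method that ends up being used:
 * - requests from the gpu_recovery interface always use a full reset;
 * - otherwise a soft reset is tried first when supported;
 * - a failed or unsupported soft reset falls back to a full reset.
 */
static enum reset_method run_reset(const struct reset_request *req)
{
    if (!req->from_sysfs && req->soft_reset_supported && req->soft_reset_ok)
        return RESET_METHOD_SOFT;
    return RESET_METHOD_FULL;
}
```

Keeping the policy inputs in a caller-owned structure mirrors the stated goal of making the reset configurable for different purposes.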

10 Jun, 2022 2 commits

drm/amdgpu: Rename amdgpu_device_gpu_recover_imp back to amdgpu_device_gpu_recover · cf727044

Andrey Grodzovsky authored 2 years ago

We removed the wrapper that was queueing the recover function
into the reset domain queue and was using this name.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

cf727044

drm/amdgpu: Add work_struct for GPU reset from debugfs · 2f83658f

Andrey Grodzovsky authored 2 years ago

We need to have a work_struct to cancel this reset if another
is already in progress.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

2f83658f

08 Jun, 2022 2 commits

drm/amdgpu: enable ASPM support for PCIE 7.4.0/7.6.0 · 62f8f5c3

Evan Quan authored 2 years ago

Enable ASPM support for PCIE 7.4.0 and 7.6.0.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

62f8f5c3

drm/amdgpu: Add peer-to-peer support among PCIe connected AMD GPUs · 08a2fd23

Ramesh Errabolu authored 2 years ago

Add support for peer-to-peer communication among AMD GPUs over the PCIe
bus. This support requires enabling the HSA_AMD_P2P config option.
Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

08a2fd23

06 Jun, 2022 2 commits

drm/amdgpu: adding device coredump support · 3d8785f6

Somalapuram Amaranath authored 2 years ago

Added device coredump information:
- Kernel version
- Module
- Time
- VRAM status
- Guilty process name and PID
- GPU register dumps
v1 -> v2: Variable name change
v1 -> v2: NULL check
v1 -> v2: Code alignment
v1 -> v2: Adding dummy amdgpu_devcoredump_free
v1 -> v2: memset reset_task_info to zero
v2 -> v3: add CONFIG_DEV_COREDUMP for variables
v2 -> v3: remove NULL check on amdgpu_devcoredump_read
Signed-off-by: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com>
Reviewed-by: Shashank Sharma <Shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

3d8785f6

drm/amdgpu: save the reset dump register value for devcoredump · 651d7ee6

Somalapuram Amaranath authored 2 years ago

Allocate memory for register value and use the same values for devcoredump.
v1 -> v2: Change krealloc_array() to kmalloc_array()
v2 -> v3: Fix alignment
Signed-off-by: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com>
Reviewed-by: Shashank Sharma <Shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

651d7ee6

03 Jun, 2022 1 commit

drm/amd: Fix spelling typo in comments · faf26f2b

pengfuyuan authored 2 years ago

Fix spelling typo in comments.
Reported-by: k2ci <kernel-bot@kylinos.cn>
Signed-off-by: pengfuyuan <pengfuyuan@kylinos.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

faf26f2b

18 May, 2022 2 commits

drm/amd: Don't reset dGPUs if the system is going to s2idle · 7123d39d

Mario Limonciello authored 2 years ago

An A+A configuration on ASUS ROG Strix G513QY proves that the ASIC
reset for handling aborted suspend can't work with s2idle.

This functionality was introduced in commit daf8de08 ("drm/amdgpu:
always reset the asic in suspend (v2)"). A few other commits have
gone on top of the ASIC reset, but this still doesn't work on the A+A
configuration in s2idle.

Avoid doing the reset on dGPUs specifically when using s2idle.

Fixes: daf8de08 ("drm/amdgpu: always reset the asic in suspend (v2)")
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2008
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

7123d39d

drm/amd: Don't reset dGPUs if the system is going to s2idle · 0223e516

Mario Limonciello authored 2 years ago

An A+A configuration on ASUS ROG Strix G513QY proves that the ASIC
reset for handling aborted suspend can't work with s2idle.

Avoid doing the reset on dGPUs specifically when using s2idle.

0223e516

10 May, 2022 2 commits

drm/amdgpu: add lsdma block · 1b491330

Likun Gao authored 2 years ago

Add Light SDMA (LSDMA) block and related function. LSDMA
is a small instance of SDMA mainly for kernel driver use.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

1b491330

drm/amdgpu/psp: Add vbflash sysfs interface support · 8424f2cc

Likun Gao authored 3 years ago

Add sysfs interface to copy VBIOS.

v2: squash in fix for proper vmalloc API (Alex)
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

8424f2cc

05 May, 2022 1 commit

Revert "drm/amdgpu: disable runpm if we are the primary adapter" · 5a90c24a

Alex Deucher authored 2 years ago

This reverts commit b95dc06a.

This workaround is no longer necessary.  We have a better workaround
in commit f95af4a9 ("drm/amdgpu: don't runtime suspend if there are displays attached (v3)").
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

5a90c24a