Commits · 52a3a40ee4f89c89026837838f7df386d64c2892 · Kirill Smelkov / linux

11 Apr, 2023 31 commits

drm/amd/pm: correct SMU13.0.7 pstate profiling clock settings · 52a3a40e

Horatio Zhang authored Apr 06, 2023

Correct the pstate standard/peak profiling mode clock
settings for SMU13.0.7.
Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

52a3a40e

drm/amdgpu: refine get gpu clock counter method · 5591a051

Tong Liu01 authored Apr 06, 2023

[why]
regGOLDEN_TSC_COUNT_LOWER/regGOLDEN_TSC_COUNT_UPPER are protected and
unaccessible under sriov.
The clock counter high bit may update during reading process.

[How]
Replace regGOLDEN_TSC_COUNT_LOWER/regGOLDEN_TSC_COUNT_UPPER with
regCP_MES_MTIME_LO/regCP_MES_MTIME_HI to get gpu clock under sriov.
Refine get gpu clock counter method to make the result more precise.
Signed-off-by: Tong Liu01 <Tong.Liu01@amd.com>
Acked-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

5591a051

drm/amdgpu: optimize redundant code in umc_v6_7 · fc926fae

YiPeng Chai authored Mar 27, 2023

Optimize redundant code in umc_v6_7.
Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

fc926fae

drm/amdgpu: Fix sdma v4 sw fini error · 5e08e9c7

lyndonli authored Apr 06, 2023

Fix sdma v4 sw fini error for sdma 4.2.2 to
solve the following general protection fault

[  +0.108196] general protection fault, probably for non-canonical
address 0xd5e5a4ae79d24a32: 0000 [#1] PREEMPT SMP PTI
[  +0.000018] RIP: 0010:free_fw_priv+0xd/0x70
[  +0.000022] Call Trace:
[  +0.000012]  <TASK>
[  +0.000011]  release_firmware+0x55/0x80
[  +0.000021]  amdgpu_ucode_release+0x11/0x20 [amdgpu]
[  +0.000415]  amdgpu_sdma_destroy_inst_ctx+0x4f/0x90 [amdgpu]
[  +0.000360]  sdma_v4_0_sw_fini+0xce/0x110 [amdgpu]
Signed-off-by: lyndonli <Lyndon.Li@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

5e08e9c7

drm/amdgpu: DROP redundant drm_prime_sg_to_dma_addr_array · edd48e6d

Shane Xiao authored Apr 05, 2023

For DMA-MAP userptr on other GPUs, the dma address array
will be populated in amdgpu_ttm_backend_bind.

Remove the redundant call from the driver.

v2:
  update the comment
Signed-off-by: Shane Xiao <shane.xiao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

edd48e6d

drm/amdgpu: optimize redundant code in umc_v8_10 · e86bd8b2

YiPeng Chai authored Mar 27, 2023

Optimize redundant code in umc_v8_10
Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

e86bd8b2

amd/amdgpu: Inherit coherence flags base on original BO flags · af152c21

Shane Xiao authored Apr 06, 2023

For SG BO to DMA-map userptrs on other GPUs, the SG BO
need inherit MTYPEs in PTEs from original BO.

If we set the flags, the device can be coherent with
the CPUs and other GPUs.

v2:
  1. Drop unnecessary flags check
  2. Remove local variable align
Signed-off-by: Shane Xiao <shane.xiao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

af152c21

drm/amd/pm: Fix incorrect comment about Vangogh power cap support · 89317d42

Guilherme G. Piccoli authored Mar 30, 2023

The comment mentions that power1 cap attributes are not supported on
Vangogh, but the opposite is indeed valid: for APUs, only Vangogh is
supported. While at it, also fixed the Renoir comment below (thanks
Melissa for noticing that!).

Cc: Lijo Lazar <lijo.lazar@amd.com>
Cc: Melissa Wen <mwen@igalia.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

89317d42

drm/amdgpu: Add userptr bo support for mGPUs when iommu is on · 207bbfb6

Shane Xiao authored Apr 05, 2023

For userptr bo with iommu on, multiple GPUs use same system
memory dma mapping address when both adev and bo_adev are in
identity mode or in the same iommu group.

If RAM direct map to one GPU, other GPUs can share the original
BO in order to reduce dma address array usage when RAM can
direct map to these GPUs. However, we should explicit check
whether RAM can direct map to all these GPUs.

This patch fixes a potential issue that where RAM is
direct mapped on some but not all GPUs.

v2:
  1. Update comment
  2. Add helper function reuse_dmamap
Signed-off-by: Shane Xiao <shane.xiao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

207bbfb6

drm/amd/amdgpu: Drop the hang limit parameter · 11f25c84

Srinivasan Shanmugam authored Apr 05, 2023

The driver doesn't resubmit jobs on hangs any more, hence drop
the hang limit parameter - amdgpu_job_hang_limit, wherever it is used.
Suggested-by: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Kent Russell <kent.russell@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

11f25c84

drm/amd/display: Fix potential null dereference · 52f1783f

Igor Artemiev authored Apr 03, 2023

The adev->dm.dc pointer can be NULL and dereferenced in amdgpu_dm_fini()
without checking.

Add a NULL pointer check before calling dc_dmub_srv_destroy().

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: 9a71c7d3 ("drm/amd/display: Register DMUB service with DC")
Signed-off-by: Igor Artemiev <Igor.A.Artemiev@mcst.ru>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

52f1783f

drm/amd/display: 3.2.230 · e38dddca

Aric Cyr authored Mar 27, 2023

This DC version brings along:
- FW Release 0.0.161.0
- Improvements on FPO/FAMS
- Correction to DML calculation
- Fix to multiple clock related issues
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

e38dddca

drm/amd: Fix an out of bounds error in BIOS parser · d116db18

Mario Limonciello authored Mar 23, 2023

The array is hardcoded to 8 in atomfirmware.h, but firmware provides
a bigger one sometimes. Deferencing the larger array causes an out
of bounds error.

commit 4fc1ba4a ("drm/amd/display: fix array index out of bound error
in bios parser") fixed some of this, but there are two other cases
not covered by it.  Fix those as well.

Reported-by: erhard_f@mailbox.org
Link: https://bugzilla.kernel.org/show_bug.cgi?id=214853
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2473Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

d116db18

drm/amd/display: [FW Promotion] Release 0.0.161.0 · 9dce8c2a

Anthony Koo authored Mar 25, 2023

 - Add command to idle opt.
 - Rename d3 entry event and add idle trigger param on
   notify event.
 - Add bit to fw boot status to notify status when hardware
   is powered up.
Reviewed-by: Aric Cyr <Aric.Cyr@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Anthony Koo <Anthony.Koo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

9dce8c2a

drm/amd/display: Improve robustness of FIXED_VS link training at DP1 rates · 7727e7b6

Michael Strauss authored Mar 24, 2023

[WHY]
New sequence for transparent mode DP1.x link training was provided by LTTPR
vendor

[HOW]
Implement new FIXED_VS sequence, increase LT retry count to minimize
any potential intermittent lightup failures
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Michael Strauss <michael.strauss@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

7727e7b6

drm/amdgpu: Add MES KIQ clear to tell RLC that KIQ is dequeued · 554836cc

Yifan Zha authored Mar 29, 2023

[Why]
As MES KIQ is dequeued, tell RLC that KIQ is inactive

[How]
Clear the RLC_CP_SCHEDULERS Active bit which RLC checks KIQ status
In addition, driver can halt MES under SRIOV when unloading driver

v2:
Use scheduler0 mask to clear KIQ portion of RLC_CP_SCHEDULERS
Signed-off-by: Yifan Zha <Yifan.Zha@amd.com>
Reviewed-by: Horace Chen <horace.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

554836cc

drm/amd/display: add dscclk instance offset check · a2a0bdf1

Charlene Liu authored Mar 24, 2023

[why]
based on dscclk instance offset check conditiona program dscclk
Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Charlene Liu <Charlene.Liu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

a2a0bdf1

drm/amd/display: On clock init, maintain DISPCLK freq · d170e938

Alvin Lee authored Mar 24, 2023

[Description]
- On init if a display is connected, we need to maintain the DISPCLK
  frequency
- Even though DPG_EN=1, the display still requires the correct
  timing or it could cause audio corruption (if DISPCLK freq
  is reduced)
- Read the current DISPCLK freq and request the same value to ensure
  the timing is valid and unchanged
- However, add option to do a full pipe power down (including link)
  which will also avoid audio related issues
	- Disabled for the time being on dcn32
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Alvin Lee <Alvin.Lee2@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

d170e938

drm/amd/display: Add FPO + VActive support · 0289e0ed

Alvin Lee authored Mar 23, 2023

[Description]
- When determining FPO support, include FPO + VActive support
- Support FPO + VActive if one display meets regular requirements
  for FPO and the second display is able to switch in VACTIVE with
  a given amount of margin
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Alvin Lee <Alvin.Lee2@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

0289e0ed

drm/amd/display: Correct DML calculation to follow HW SPEC · 385c3e4c

Paul Hsieh authored Mar 22, 2023

[Why]
In 2560x1600@240p eDP panel, driver use lowest voltage level
to play 1080p video cause underflow. According to HW SPEC,
the senario should use high voltage level.

[How]
ChromaPre value is zero when bandwidth validation.
Correct ChromaPre calculation.
Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Paul Hsieh <Paul.Hsieh@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

385c3e4c

drm/amd/display: prep work for root clock optimization enablement for DCN314 · 6f6869dc

Hamza Mahfooz authored Mar 21, 2023

To enable root clock optimizations, we need a number of
register writes and need to account for the difference
in DPSTREAMCLK between DCN31 and DCN314. To prevent
issues, add a number of register writes to
DCCG_MASK_SH_LIST_DCN314_COMMON(), and define dccg314_init()
which is mostly in alignment with dccg31_init() but
accounts for the new DPSTREAMCLK sequence.
Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

6f6869dc

drm/amd/display: add scaler control for dcn32 · 0efa7035

Zhikai Zhai authored Mar 15, 2023

[WHY]
It will introduce the extra warnning log on some asic
that doesn't register

[HOW]
Add the register on dcn32
Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Zhikai Zhai <zhikai.zhai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

0efa7035

drm/amd/display: Clear FAMS flag if FAMS doesn't reduce vlevel · abaeafb1

Alvin Lee authored Mar 21, 2023

[Description]
- If we find that applying FAMS doesn't reduce the voltage level,
  we will not use it
- Ensure to clear the stream flags indicating FAMS if we hit this
  case
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Alvin Lee <Alvin.Lee2@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

abaeafb1

drm/amdgpu: Add MES KIQ dequeue in MES hw fini · 9eb28ac1

Yifan Zha authored Mar 29, 2023

[Why]
Need dequeue MES KIQ under SRIOV when unloading driver

[How]
Modify mes_v11_0_kiq_dequeue_sched which was used to dequeue MES SCHED
to support veriable pipe.
Add MES KIQ dequeue in hw fini
Signed-off-by: Yifan Zha <Yifan.Zha@amd.com>
Reviewed-by: Horace Chen <horace.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

9eb28ac1

drm/amd/display: remove unused average_render_time_in_us and i variables · 583da1b8

Tom Rix authored Mar 30, 2023

clang with W=1 reports
drivers/gpu/drm/amd/amdgpu/../display/modules/freesync/freesync.c:1132:15: error: variable
  'average_render_time_in_us' set but not used [-Werror,-Wunused-but-set-variable]
        unsigned int average_render_time_in_us = 0;
                     ^
This variable is not used so remove it, which caused i to be unused so remove that as well.
Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

583da1b8

drm/amdgpu: add new parameters in v11_struct · 63a4d258

Arvind Yadav authored Mar 31, 2023

Added some new parameters defined for the gfx usermode queues
use cases in the v11_mqd_struct.

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

63a4d258

drm/amd/amdgpu: introduce gc_*_mes_2.bin v2 · 97998b89

Jack Xiao authored Mar 24, 2023

To avoid new mes fw running with old driver, rename
mes schq fw to gc_*_mes_2.bin.

v2: add MODULE_FIRMWARE declaration
v3: squash in fixup patch
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

97998b89

drm/amdgpu: allow more APUs to do mode2 reset when go to S4 · 980d5bae

Tim Huang authored Mar 30, 2023

Skip mode2 reset only for IMU enabled APUs when do S4.

This patch is to fix the regression issue
https://gitlab.freedesktop.org/drm/amd/-/issues/2483
It is generated by commit b5896266 ("drm/amdgpu: skip ASIC reset
for APUs when go to S4").

Fixes: b5896266 ("drm/amdgpu: skip ASIC reset for APUs when go to S4")
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2483Tested-by: Yuan Perry <Perry.Yuan@amd.com>
Signed-off-by: Tim Huang <tim.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

980d5bae

Merge tag 'mediatek-drm-next-6.4' of... · 55bf1496

Daniel Vetter authored Apr 11, 2023

Merge tag 'mediatek-drm-next-6.4' of https://git.kernel.org/pub/scm/linux/kernel/git/chunkuang.hu/linux into drm-next

Mediatek DRM Next for Linux 6.4

1. Add support for 10-bit overlays
2. Add MediaTek SoC DRM (vdosys1) support for mt8195
3. Change mmsys compatible for mt8195 mediatek-drm
4. Only trigger DRM HPD events if bridge is attached
5. Change the aux retries times when receiving AUX_DEFER
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20230410233005.2572-1-chunkuang.hu@kernel.org

55bf1496

Merge tag 'drm-msm-next-2023-04-10' of https://gitlab.freedesktop.org/drm/msm into drm-next · b8d85bb5

Daniel Vetter authored Apr 11, 2023

main pull request for v6.4

Core Display:
============
* Bugfixes for error handling during probe
* rework UBWC decoder programming
* prepare_commit cleanup
* bindings for SM8550 (MDSS, DPU), SM8450 (DP)
* timeout calculation fixup
* atomic: use drm_crtc_next_vblank_start() instead of our own
  custom thing to calculate the start of next vblank

DP:
==
* interrupts cleanup

DPU:
===
* DSPP sub-block flush on sc7280
* support AR30 in addition to XR30 format
* Allow using REC_0 and REC_1 to handle wide (4k) RGB planes
* Split the HW catalog into individual per-SoC files

DSI:
===
* rework DSI instance ID detection on obscure platforms

GPU:
===
* uapi C++ compatibility fix
* a6xx: More robust gdsc reset
* a3xx and a4xx devfreq support
* update generated headers
* various cleanups and fixes
* GPU and GEM updates to avoid allocations which could trigger
  reclaim (shrinker) in fence signaling path
* dma-fence deadline hint support and wait-boost
* a640 speedbin support
* a650 speedbin support

Conflicts in drivers/gpu/drm/msm/adreno/adreno_gpu.c:

Conflict between the 7fa5047a ("drm: Use of_property_present() for
testing DT property presence") and 9f251f93 ("drm/msm/adreno: Use
OPP for every GPU generation"). The latter removed the of_ function
call outright, so I went with what's in the PR unchanged.

From: Rob Clark <robdclark@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGvwuj5tabyW910+N-B=5kFNAC7QNYoQ=0xi3roBjQvFFQ@mail.gmail.comSigned-off-by: Daniel Vetter <daniel.vetter@intel.com>

b8d85bb5

Merge tag 'drm-habanalabs-next-2023-04-10' of... · 838ac90d

Daniel Vetter authored Apr 11, 2023

Merge tag 'drm-habanalabs-next-2023-04-10' of https://git.kernel.org/pub/scm/linux/kernel/git/ogabbay/linux into drm-next

This tag contains additional habanalabs driver changes for v6.4:

- uAPI changes:
  - Add a definition of a new Gaudi2 server type. This is used by userspace
    to know what is the connectivity between the accelerators inside the
    server

- New features and improvements:
  - speedup h/w queues test in Gaudi2 to reduce device initialization times.

- Firmware related fixes:
  - Fixes to the handshake protocol during f/w initialization.
  - Sync f/w events interrupt in hard reset to avoid warning message.
  - Improvements to extraction of the firmware version.

- Misc bug fixes and code cleanups. Notable fixes are:
  - Multiple fixes for interrupt handling in Gaudi2.
  - Unmap mapped memory in case TLB invalidation fails.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Oded Gabbay <ogabbay@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20230410124637.GA2441888@ogabbay-vm-u20.habana-labs.com

838ac90d

08 Apr, 2023 9 commits

accel/habanalabs: add missing error flow in hl_sysfs_init() · 56499c46

Tomer Tayar authored Apr 02, 2023

hl_sysfs_fini() is called only if hl_sysfs_init() completes
successfully. Therefore if hl_sysfs_init() fails, need to remove any
sysfs group that was added until that point.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>

56499c46

accel/habanalabs: speedup h/w queues test in Gaudi2 · 31420f93

Moti Haimovski authored Mar 20, 2023

HW queues testing at driver load and after reset takes a substantial
amount of time.
This commit reduces the queues test time in Gaudi2 devices by running
all the tests in parallel instead of one after the other.
Time measurements on tests duration shows that the new method is almost
x100 faster than the serial approach.
Signed-off-by: Moti Haimovski <mhaimovski@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>

31420f93

accel/habanalabs: fix handling of arc farm sei event · 91204e47

Dani Liberman authored Mar 28, 2023

There is only single eq entry for arc farm sei event which aggregates
events from the four arc farms.
Fix the code to handle this event according to this behavior.
Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>

91204e47

accel/habanalabs: remove Gaudi1 multi MSI code · b207e166

Ofir Bitton authored Mar 27, 2023

Multi MSI interrupts aren't working in Gaudi1 and because of that,
we are only using a single MSI interrupt. Therefore, let's remove this
dead code in order to avoid confusion.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>

b207e166

accel/habanalabs/uapi: new Gaudi2 server type · a25c2f7a

Oded Gabbay authored Mar 30, 2023

Add definition of a new Gaudi2 server type. This represents
the connectivity between the cards in that server type.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>

a25c2f7a

accel/habanalabs: fixes for unexpected error interrupt · 38f3c732

Ofir Bitton authored Mar 28, 2023

Removing redundant asic prop variable as we don't need to expose this
to common code. In addition, fix some typos.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>

38f3c732

accel/habanalabs: don't wait for STS_OK after sending COMMS WFE · c19350ef

Koby Elbaz authored Mar 26, 2023

Sending COMMS_GOTO_WFE instructs the FW's CPU to halt (WFE state).
Once sent, FW's CPU isn't expected to continue communicating with LKD.
Therefore, the stage of waiting for COMMS_STS_OK should be skipped or
else waiting for COMMS_STS_OK will simply timeout, which will trigger
unexpected behavior.
Signed-off-by: Koby Elbaz <kelbaz@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>

c19350ef

accel/habanalabs: sync f/w events interrupt in hard reset · 802f25b6

Tal Cohen authored Mar 21, 2023

Receiving events from FW, while the device is in hard reset, causes
a warning message in Driver log. The message may point to a
problem in the Driver or FW. But It also can appear as a result
of events that have been sent from FW just before the hard reset.
In order to avoid receiving events from FW while the device is in reset
and is already in 'disabled' mode, sync the f/w events interrupt right
before setting the device to 'disabled'.
Signed-off-by: Tal Cohen <talcohen@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>

802f25b6

accel/habanalabs: fix wrong reset and event flags · 82a1b48a

Ofir Bitton authored Mar 26, 2023

During event handling, driver sets relevant reset and user event
notifier flags. Fix few wrong flags settings.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>

82a1b48a