- 13 Oct, 2020 1 commit
-
-
Dave Airlie authored
Merge tag 'drm-misc-next-fixes-2020-10-13' of git://anongit.freedesktop.org/drm/drm-misc into drm-next One fix for a bad revert in ingenic-drm, and one fix for panfrost to increase a timeout at power up. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20201013065709.lwjw3fthoxwsbqsl@gilmour.lan
-
- 12 Oct, 2020 2 commits
-
-
Paul Cercueil authored
Fix a badly reverted commit. The revert commit was cherry-picked from drm-misc-next to drm-misc-next-fixes, and in the process some unrelated code was added. Fixes: a3fb64c0 ("Revert "gpu/drm: ingenic: Add option to mmap GEM buffers cached"") Signed-off-by: Paul Cercueil <paul@crapouillou.net> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20201012102509.10690-1-paul@crapouillou.net
-
Dave Airlie authored
Merge tag 'amd-drm-fixes-5.10-2020-10-09' of git://people.freedesktop.org/~agd5f/linux into drm-next amd-drm-fixes-5.10-2020-10-09: amdgpu: - Clean up indirect register access - Navy Flounder fixes - SMU11 AC/DC interrupt fixes - GPUVM alignment fix - Display fixes - Misc other fixes Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201009222810.4030-1-alexander.deucher@amd.com
-
- 11 Oct, 2020 1 commit
-
-
Dave Airlie authored
Merge tag 'drm-intel-next-fixes-2020-10-02' of git://anongit.freedesktop.org/drm/drm-intel into drm-next Propagated from drm-intel-next-queued: - Fix CRTC state checker (Ville) Propated from drm-intel-gt-next: - Avoid implicit vmpa for highmem on 32b (Chris) - Prevent PAT attriutes for writecombine if CPU doesn't support PAT (Chris) - Clear the buffer pool age before use. (Chris) - Fix error code (Dan) - Break up error capture compression loops (Chris) - Fix uninitialized variable in context_create_request (Maarten) - Check for errors on i915_vm_alloc_pt_stash to avoid NULL dereference (Matt) - Serialize debugfs i915_gem_objects with ctx->mutex (Chris) - Fix a rebase mistake caused during drm-intel-gt-next creation (Chris) - Hold request reference for canceling an active context (Chris) - Heartbeats fixes (Chris) - Use usigned during batch copies (Chris) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201002182610.GA2204465@intel.com
-
- 09 Oct, 2020 11 commits
-
-
Dave Airlie authored
Merge tag 'drm-misc-next-fixes-2020-10-02' of git://anongit.freedesktop.org/drm/drm-misc into drm-next Three fixes for vc4 that addresses dual-display breakages Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20201002065243.ry7gp4or3ywhluer@gilmour.lan
-
Ye Bin authored
Fix follow warning: Checking drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c... [drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c:770]: (error) Invalid number of character '{' when these macros are defined: ''. Checking drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c: CONFIG_ACPI... [drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c:770]: (error) Invalid number of character '{' when these macros are defined: 'CONFIG_ACPI'. ...... Checking drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c: CONFIG_X86... [drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c:770]: (error) Invalid number of character '{' when these macros are defined: 'CONFIG_X86'. Checking drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c: _X86_... [drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c:770]: (error) Invalid number of character '{' when these macros are defined: '_X86_'. Checking drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c: __linux__... [drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c:770]: (error) Invalid number of character '{' when these macros are defined: '__linux__'. Fixes: 97d798b2 ("drm/amdgpu: simplify ATIF backlight handling") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Ye Bin <yebin10@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Emily.Deng authored
Remove the virtual_display warning in drm_crtc_vblank_off when dev->num_crtcs is null. Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Emily.Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
kernel test robot authored
Fixes: c7651b73 ("drm/amdgpu: Fix handling of KFD initialization failures") Signed-off-by: kernel test robot <lkp@intel.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Evan Quan authored
As the dpm clock table is needed during DC HW initialization. And that (DC HW initialization) comes before smu_late_init() where current APU dpm clock table setup is performed. So, NULL pointer dereference will be triggered. By moving APU dpm clock table setup to smu_hw_init(), this can be avoided. Fixes: 02cf91c1 ("drm/amd/powerplay: postpone operations not required for hw setup to late_init") Signed-off-by: Evan Quan <evan.quan@amd.com> Reported-by: Dirk Gouders <dirk@gouders.net> Acked-by: Nirmoy Das <nirmoy.das@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Alex Deucher authored
The default auto setting for kcq should not generate a warning. Fixes: a300de40 ("drm/amdgpu: introduce a new parameter to configure how many KCQ we want(v5)") Reviewed-by: Kent Russell <kent.russell@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Alex Deucher authored
We want to use the dev_* functions here rather than the pr_* variants. Switch to using dev_warn() which mirrors what we do on other asics. Fixes the following build errors on ARC: ../drivers/gpu/drm/amd/amdgpu/../powerplay/navi10_ppt.c: In function 'navi10_fill_i2c_req': ../arch/arc/include/asm/bug.h:24:2: error: implicit declaration of function 'pr_warn'; did you mean 'drm_warn'? [-Werror=implicit-function-declaration] ../drivers/gpu/drm/amd/amdgpu/../powerplay/sienna_cichlid_ppt.c: In function 'sienna_cichlid_fill_i2c_req': ../arch/arc/include/asm/bug.h:24:2: error: implicit declaration of function 'pr_warn'; did you mean 'drm_warn'? [-Werror=implicit-function-declaration] Reported-by: kernel test robot <lkp@intel.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Evan Quan <evan.quan@amd.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: linux-snps-arc@lists.infradead.org Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Dmytro Laktyushkin authored
This should be programmed with timing rather than with odm. Signed-off-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com> Acked-by: Eryk Brol <eryk.brol@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Alvin Lee authored
[Why] We will hang if we report switch in VACTIVE but not in VBLANK and DPG_EN = 1 [How] Block switch in ACTIVE if not supported in BLANK Signed-off-by: Alvin Lee <alvin.lee2@amd.com> Reviewed-by: Jun Lei <Jun.Lei@amd.com> Acked-by: Eryk Brol <eryk.brol@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Christian Hewitt authored
Amlogic SoC devices report the following errors frequently causing excessive dmesg log spam and early log rotataion, although the errors appear to be harmless as everything works fine: [ 7.202702] panfrost ffe40000.gpu: error powering up gpu L2 [ 7.203760] panfrost ffe40000.gpu: error powering up gpu shader ARM staff have advised increasing the timeout values to eliminate the errors in most normal scenarios, and testing with several different G31/G52 devices shows 20000 to be a reliable value. Fixes: f3ba9122 ("drm/panfrost: Add initial panfrost driver") Suggested-by: Steven Price <steven.price@arm.com> Signed-off-by: Christian Hewitt <christianshewitt@gmail.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201008141738.13560-1-christianshewitt@gmail.com
-
Ondrej Jirman authored
The driver was renamed, change the path in the MAINTAINERS file. Signed-off-by: Ondrej Jirman <megous@megous.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://lore.kernel.org/lkml/20200701184640.1674969-1-megous@megous.com/#t
-
- 07 Oct, 2020 1 commit
-
-
Paul Cercueil authored
This reverts commit 37054fc8 ("gpu/drm: ingenic: Add option to mmap GEM buffers cached") At the very moment this commit was created, the DMA API it relied on was modified in the DMA tree, which caused the driver to break in linux-next. Revert it for now, and it will be resubmitted later to work with the new DMA API. Signed-off-by: Paul Cercueil <paul@crapouillou.net> Acked-by: Sam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20201004141758.1013317-1-paul@crapouillou.net
-
- 05 Oct, 2020 5 commits
-
-
Fangzhi Zuo authored
[Why] Currently mode validation is bypassed if remote sink exists. That leads to mode set issue when a BW bottle neck exists in the link path, e.g., a DP-to-HDMI converter that only supports HDMI 1.4. Any invalid mode passed to Linux user space will cause the modeset failure due to limitation of Linux user space implementation. [How] Mode validation is skipped only if in edid override. For real remote sink, clock limit check should be done for HDMI remote sink. Have HDMI related remote sink going through mode validation to elimiate modes which pixel clock exceeds BW limitation. Signed-off-by: Fangzhi Zuo <Jerry.Zuo@amd.com> Reviewed-by: Hersen Wu <hersenxs.wu@amd.com> Acked-by: Eryk Brol <eryk.brol@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Chris Park authored
[Why] Formula uses kHz in their formula while our driver operates with Hz. [How] Divide audio rate by 1000 on the initial variable that is entered into formula. Signed-off-by: Chris Park <Chris.Park@amd.com> Reviewed-by: Charlene Liu <Charlene.Liu@amd.com> Acked-by: Eryk Brol <eryk.brol@amd.com> Acked-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Rodrigo Siqueira authored
[Why] Sometimes CRTCs can be disabled due to display unplugging or temporarily transition in the userspace; in these circumstances, DCE tries to set the minimum clock threshold. When we have this situation, the function bw_calcs is invoked with number_of_displays set to zero, making DCE set dispclk_khz and sclk_khz to zero. For these reasons, we have seen some ATOM bios errors that look like: [drm:atom_op_jump [amdgpu]] *ERROR* atombios stuck in loop for more than 5secs aborting [drm:amdgpu_atom_execute_table_locked [amdgpu]] *ERROR* atombios stuck executing EA8A (len 761, WS 0, PS 0) @ 0xEABA [How] This error happens due to an attempt to optimize the bandwidth using the sclk, and the dispclk clock set to zero. Technically we handle this in the function dce112_set_clock, but we are not considering the case that this value is set to zero. This commit fixes this issue by ensuring that we never set a minimum value below the minimum clock threshold. Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Acked-by: Eryk Brol <eryk.brol@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Alex Sierra authored
align frag_end to the next pd when there are no page table entries on the current pde. This fixes invalidation of larger address space areas where some page tables are allocated and other aren't. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Dirk Gouders authored
Commit c1cf79ca ("drm/amdgpu: use IP discovery table for renoir") introduced a NULL pointer dereference when booting with amdgpu.discovery=0, because it removed the call of vega10_reg_base_init() for that case. Fix this by calling that funcion if amdgpu_discovery == 0 in addition to the case that amdgpu_discovery_reg_base_init() failed. Fixes: c1cf79ca ("drm/amdgpu: use IP discovery table for renoir") Signed-off-by: Dirk Gouders <dirk@gouders.net> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Cc: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 01 Oct, 2020 8 commits
-
-
Dave Airlie authored
When I refactored this code with the new init paths, I failed to set the funcs back up properly, this caused a failure to bringup gdm properly. Fixes: 252f8d7b ("drm/vmwgfx/ttm: convert vram mm init to new code paths") Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201001042012.13114-1-airlied@gmail.com
-
Alex Deucher authored
We need to schedule the smu AC/DC interrupt ack to avoid potentially sleeping if the smu message mutex is contended. Fixes: e1188aac ("drm/amdgpu/smu11: add support for SMU AC/DC interrupts") Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Alex Deucher authored
So we can schedule work from interrupts. This might include long tasks or things that could sleep. Fixes: e1188aac ("drm/amdgpu/smu11: add support for SMU AC/DC interrupts") Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Hawking Zhang authored
add mp0 11_0_11 for navy_flounder to the mem training supported list, otherwise the modeprobe would fail on navy_flounder with latest vbios. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Hawking Zhang authored
support both direct and indirect accessor in unified helper functions. v2: Retire indirect mmio access via mm_index/data Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Kevin Wang <kevin1.wang@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Hawking Zhang authored
Switch WREG32/RREG32_PCIE to use indirect reg access helper for soc15 and onwards Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Kevin Wang <kevin1.wang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Hawking Zhang authored
Add helper function in order to remove RREG32/WREG32 in current pcie_rreg/wreg function for soc15 and onwards adapters. PCIE_INDEX/DATA pairs are used to access regsiters outside of mmio bar in the helper functions. The new helper functions help remove the recursion of amdgpu_mm_rreg/wreg from pcie_rreg/wreg and provide the oppotunity to centralize direct and indirect access in a single function. v2: Fixed typo and refine the comments v3: Remove unnecessary volatile local variable Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Kevin Wang <kevin1.wang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Tomi Valkeinen authored
On x64 we get: drivers/gpu/drm/bridge/cadence/cdns-mhdp8546-core.c:751:10: warning: conversion from 'long unsigned int' to 'unsigned int' changes value from '18446744073709551613' to '4294967293' [-Woverflow] The registers are 32 bit, so fix by casting to u32. Fixes: fb43aa0a ("drm: bridge: Add support for Cadence MHDP8546 DPI/DP bridge") Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Reviewed-by: Swapnil Jakhade <sjakhade@cadence.com> Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20200929091918.24813-1-tomi.valkeinen@ti.com
-
- 30 Sep, 2020 11 commits
-
-
Ramesh Errabolu authored
compute units that are in use. [Why] Allow user to know how many compute units (CU) are in use at any given moment. [How] Surface files in Sysfs that allow user to determine the number of compute units that are in use for a given process. One Sysfs file is used per device. Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com> Reviewed-By: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Ramesh Errabolu authored
waves that are in flight. [Why] Allow user to know how many compute units (CU) are in use at any given moment. [How] Read registers of SQ that give number of waves that are in flight of various queues. Use this information to determine number of CU's in use. Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com> Reviewed-By: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Chris Wilson authored
Be consistent and use unsigned long throughout the chunk copies to avoid the inherent clumsiness of mixing integer types of different widths and signs. Failing to take acount of a wider unsigned type when using min_t can lead to treating it as a negative, only for it flip back to a large unsigned value after passing a boundary check. Fixes: ed13033f ("drm/i915/cmdparser: Only cache the dst vmap") Testcase: igt/gen9_exec_parse/bb-large Reported-by: "Candelaria, Jared" <jared.candelaria@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: "Candelaria, Jared" <jared.candelaria@intel.com> Cc: "Bloomfield, Jon" <jon.bloomfield@intel.com> Cc: <stable@vger.kernel.org> # v4.9+ Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200928215942.31917-1-chris@chris-wilson.co.uk (cherry picked from commit b7eeb2b4) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Chris Wilson authored
Verify that if a context is active at the time it is closed, that it is either persistent and preemptible (with hangcheck running) or it shall be removed from execution. Fixes: 9a40bddd ("drm/i915/gt: Expose heartbeat interval via sysfs") Testcase: igt/gem_ctx_persistence/heartbeat-close Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: <stable@vger.kernel.org> # v5.7+ Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200928221510.26044-3-chris@chris-wilson.co.uk (cherry picked from commit d3bb2f9b) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Chris Wilson authored
Currently, we check we can send a pulse prior to disabling the heartbeat to verify that we can change the heartbeat, but since we may re-evaluate execution upon changing the heartbeat interval we need another pulse afterwards to refresh execution. v2: Tvrtko asked if we could reduce the double pulse to a single, which opened up a discussion of how we should handle the pulse-error after attempting to change the property, and the desire to serialise adjustment of the property with its validating pulse, and unwind upon failure. Fixes: 9a40bddd ("drm/i915/gt: Expose heartbeat interval via sysfs") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: <stable@vger.kernel.org> # v5.7+ Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200928221510.26044-2-chris@chris-wilson.co.uk (cherry picked from commit 3dd66a94) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Chris Wilson authored
We only allow persistent requests to remain on the GPU past the closure of their containing context (and process) so long as they are continuously checked for hangs or allow other requests to preempt them, as we need to ensure forward progress of the system. If we allow persistent contexts to remain on the system after the the hangcheck mechanism is disabled, the system may grind to a halt. On disabling the mechanism, we sent a pulse along the engine to remove all executing contexts from the engine which would check for hung contexts -- but we did not prevent those contexts from being resubmitted if they survived the final hangcheck. Fixes: 9a40bddd ("drm/i915/gt: Expose heartbeat interval via sysfs") Testcase: igt/gem_ctx_persistence/heartbeat-stop Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: <stable@vger.kernel.org> # v5.7+ Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200928221510.26044-1-chris@chris-wilson.co.uk (cherry picked from commit 7a991cd3) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Chris Wilson authored
We have to be very careful while walking the timeline->requests list under the RCU guard, as the requests (and so rq->link) use SLAB_TYPESAFE_BY_RCU and so the requests may be reallocated within an rcu grace period. As the requests are reallocated, they are removed from one list and placed on another, and if we are iterating over that request at that moment, the list iteration jumps from one list to the next and promptly gets confused. Verify we hold the request reference to ensure that the request is not added to a new list behind our backs. <4> [582.745252] general protection fault, probably for non-canonical address 0xcccccccccccccd5c: 0000 [#1] PREEMPT SMP PTI <4> [582.745297] CPU: 0 PID: 1475 Comm: gem_ctx_persist Not tainted 5.9.0-rc1-CI-CI_DRM_8908+ #1 <4> [582.745304] Hardware name: Intel Corporation NUC7CJYH/NUC7JYB, BIOS JYGLKCPX.86A.0027.2018.0125.1347 01/25/2018 <4> [582.745317] RIP: 0010:__lock_acquire+0x2c3/0x1f40 <4> [582.745323] Code: 00 65 8b 05 c7 8a ef 7e 85 c0 0f 85 b4 07 00 00 44 8b 9d c4 08 00 00 45 85 db 0f 84 0f 01 00 00 ba 05 00 00 00 e9 c8 06 00 00 <48> 81 3f c0 89 c7 82 b8 00 00 00 00 41 0f 45 c0 83 fe 01 41 89 c3 <4> [582.745334] RSP: 0018:ffffc9000461bc40 EFLAGS: 00010002 <4> [582.745340] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000 <4> [582.745345] RDX: 0000000000000000 RSI: 0000000000000000 RDI: cccccccccccccd5c <4> [582.745350] RBP: ffff8881ec4a2880 R08: 0000000000000001 R09: 0000000000000001 <4> [582.745356] R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000 <4> [582.745361] R13: 0000000000000000 R14: 0000000000000000 R15: cccccccccccccd5c <4> [582.745367] FS: 00007fb44da78e40(0000) GS:ffff888278000000(0000) knlGS:0000000000000000 <4> [582.745373] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 <4> [582.745378] CR2: 00007fb44daad040 CR3: 0000000268428000 CR4: 0000000000350ef0 <4> [582.745383] Call Trace: <4> [582.745390] ? __lock_acquire+0x913/0x1f40 <4> [582.745397] lock_acquire+0xb5/0x3c0 <4> [582.745526] ? kill_engines+0x19a/0x4b0 [i915] <4> [582.745533] ? find_held_lock+0x2d/0x90 <4> [582.745541] _raw_spin_lock_irq+0x30/0x40 <4> [582.745635] ? kill_engines+0x19a/0x4b0 [i915] <4> [582.745727] kill_engines+0x19a/0x4b0 [i915] <4> [582.745820] context_close+0x195/0x410 [i915] <4> [582.745912] i915_gem_context_close+0x5b/0x160 [i915] <4> [582.745994] i915_driver_postclose+0x14/0x40 [i915] <4> [582.746003] drm_file_free.part.13+0x240/0x290 <4> [582.746009] drm_release_noglobal+0x16/0x50 <4> [582.746016] __fput+0xa5/0x250 <4> [582.746021] task_work_run+0x6e/0xb0 <4> [582.746028] exit_to_user_mode_prepare+0x178/0x180 <4> [582.746034] syscall_exit_to_user_mode+0x36/0x220 <4> [582.746040] entry_SYSCALL_64_after_hwframe+0x44/0xa9 <4> [582.746045] RIP: 0033:0x7fb44d1dc421 <4> [582.746050] Code: f7 d8 64 89 02 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 8b 05 ea cf 20 00 85 c0 75 16 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 3f f3 c3 0f 1f 44 00 00 53 89 fb 48 83 ec 10 <4> [582.746062] RSP: 002b:00007ffed2e83818 EFLAGS: 00000246 ORIG_RAX: 0000000000000003 <4> [582.746069] RAX: 0000000000000000 RBX: 0000556410bfe840 RCX: 00007fb44d1dc421 <4> [582.746075] RDX: 000000000000000a RSI: 00000000c0406469 RDI: 0000000000000008 <4> [582.746080] RBP: 0000000000000008 R08: 00007fb44d1c51cc R09: 00007fb44d1c5240 <4> [582.746086] R10: 0000000000000001 R11: 0000000000000246 R12: 00000000fffffffb <4> [582.746091] R13: 0000000000000006 R14: 0000000000000000 R15: 000000000000000a <4> [582.746099] Modules linked in: vgem mei_hdcp snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio btusb btrtl btbcm btintel x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul bluetooth ghash_clmulni_intel ecdh_generic ecc i915 r8169 realtek mei_me mei snd_hda_intel i2c_hid snd_intel_dspcfg snd_hda_codec snd_hwdep snd_hda_core snd_pcm pinctrl_geminilake pinctrl_intel prime_numbers [last unloaded: test_drm_mm] Fixes: 736e785f ("drm/i915/gem: Reduce context termination list iteration guard to RCU") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200925101107.27869-2-chris@chris-wilson.co.uk (cherry picked from commit badef44d) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Chris Wilson authored
The reordering and rebasing of commit 2e4c6c1a ("drm/i915: Remove i915_request.lock requirement for execution callbacks") caused it to revert an earlier correction. Let us restore commit 99f0a640d464 ("drm/i915: Remove requirement for holding i915_request.lock for breadcrumbs") Fixes: 2e4c6c1a ("drm/i915: Remove i915_request.lock requirement for execution callbacks") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200925101107.27869-1-chris@chris-wilson.co.uk (cherry picked from commit 35faeb7d) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Chris Wilson authored
Since the debugfs may peek into the GEM contexts as the corresponding client/fd is being closed, we may try and follow a dangling pointer. However, the context closure itself is serialised with the ctx->mutex, so if we hold that mutex as we inspect the state coupled in the context, we know the pointers within the context are stable and will remain valid as we inspect their tables. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: CQ Tang <cq.tang@intel.com> Cc: Daniel Vetter <daniel.vetter@intel.com> Cc: stable@vger.kernel.org Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200723172119.17649-3-chris@chris-wilson.co.uk (cherry picked from commit 102f5aa4) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Matthew Auld authored
If we are really unlucky and encounter an error during i915_vm_alloc_pt_stash, we end up passing an empty pt/pd stash all the way down into the low-level ppgtt alloc code, leading to explosions, since it expects at least the required number of pt/pd for the va range. [ 211.981418] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 211.981421] #PF: supervisor read access in kernel mode [ 211.981422] #PF: error_code(0x0000) - not-present page [ 211.981424] PGD 80000008439cb067 P4D 80000008439cb067 PUD 84a37f067 PMD 0 [ 211.981427] Oops: 0000 [#1] SMP PTI [ 211.981428] CPU: 1 PID: 1301 Comm: i915_selftest Tainted: G U I 5.9.0-rc5+ #3 [ 211.981430] Hardware name: /NUC6i7KYB, BIOS KYSKLi70.86A.0050.2017.0831.1924 08/31/2017 [ 211.981521] RIP: 0010:__gen8_ppgtt_alloc+0x1ed/0x3c0 [i915] [ 211.981523] Code: c1 48 c7 c7 5d 5d fe c0 65 ff 0d ee 1d 03 3f e8 d9 91 1f e2 8b 55 c4 31 c0 48 8b 75 b8 85 d2 0f 95 c0 48 8b 1c c6 48 89 45 98 <48> 8b 03 48 8b 90 58 02 00 00 48 85 d2 0f 84 07 ea 15 00 48 81 fa [ 211.981526] RSP: 0018:ffffba2cc0eb3970 EFLAGS: 00010202 [ 211.981527] RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000004 [ 211.981529] RDX: 0000000000000002 RSI: ffff9be998bdb8c0 RDI: ffff9be99c844300 [ 211.981530] RBP: ffffba2cc0eb39d8 R08: 0000000000000640 R09: ffff9be97cdfd000 [ 211.981531] R10: ffff9be97cdfd614 R11: 0000000000000000 R12: 0000000000000000 [ 211.981532] R13: ffff9be98607ba20 R14: ffff9be995a0b400 R15: ffffba2cc0eb39e8 [ 211.981534] FS: 00007f0f10b31000(0000) GS:ffff9be99fc40000(0000) knlGS:0000000000000000 [ 211.981536] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 211.981538] CR2: 0000000000000000 CR3: 000000084d74e006 CR4: 00000000003706e0 [ 211.981539] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 211.981541] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 211.981542] Call Trace: [ 211.981609] gen8_ppgtt_alloc+0x79/0x90 [i915] [ 211.981678] ppgtt_bind_vma+0x36/0x80 [i915] [ 211.981756] __vma_bind+0x39/0x40 [i915] [ 211.981818] fence_work+0x21/0x98 [i915] [ 211.981879] fence_notify+0x8d/0x128 [i915] [ 211.981939] __i915_sw_fence_complete+0x62/0x240 [i915] [ 211.982018] i915_vma_pin_ww+0x1ee/0x9c0 [i915] Fixes: cd0452aa ("drm/i915: Preallocate stashes for vma page-directories") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200921160844.73186-1-matthew.auld@intel.com (cherry picked from commit 1604cb2a) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Maarten Lankhorst authored
In case backoff fails with an error, we return an undefined rq, assign err to rq correctly. Fixes: 8a929c9e ("drm/i915: Use ww pinning for intel_context_create_request()") Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200918111208.1392128-1-maarten.lankhorst@linux.intel.comReviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit 4316b19d) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
-