- 19 Dec, 2023 4 commits
-
-
Wayne Lin authored
[Why & how] Refactor dc_is_dmub_outbox_supported() a bit and add case for dcn35 to register dmub outbox notification irq to handle usb4 relevant hpd event. Reviewed-by: Roman Li <roman.li@amd.com> Reviewed-by: Jun Lei <jun.lei@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Hamza Mahfooz authored
There have recently been changes that break backwards compatibility, that were introduced into DMUB firmware (for DCN32x) concerning FPO and SubVP. So, since those are just power optimization features, we can just disable them unless the user is using a new enough version of DMUB firmware. Cc: stable@vger.kernel.org Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2870 Fixes: ed6e2782 ("drm/amd/display: For cursor P-State allow for SubVP") Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Closes: https://lore.kernel.org/r/CABXGCsNRb0QbF2pKLJMDhVOKxyGD6-E+8p-4QO6FOWa6zp22_A@mail.gmail.com/Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Philip Yang authored
On gfx943 APU there is no VRAM and page migration, queue CWSR area, svm range with always mapped flag, is not mapped to GPU correctly. This works fine if retry fault on CWSR area can be recovered, but could cause deadlock if there is another retry fault recover waiting for CWSR to finish. Fix this by mapping svm range with always mapped flag to GPU with ACCESS attribute if XNACK ON. There is side effect, because all GPUs have ACCESS attribute by default on new svm range with XNACK on, the CWSR area will be mapped to all GPUs after this change. This side effect will be fixed with Thunk change to set CWSR svm range with ACCESS_IN_PLACE attribute on the GPU that user queue is created. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
Alvin Lee authored
[Description] Revert commit fec05adc ("drm/amd/display: Use channel_width = 2 for vram table 3.0") Because the issue is being fixed from VBIOS side. Reviewed-by: Samson Tam <samson.tam@amd.com> Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Alvin Lee <alvin.lee2@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-
- 17 Dec, 2023 10 commits
-
-
Linus Torvalds authored
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull perf fix from Borislav Petkov: - Avoid iterating over newly created group leader event's siblings because there are none, and thus prevent a lockdep splat * tag 'perf_urgent_for_v6.7_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf: Fix perf_event_validate_size() lockdep splat
-
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linuxLinus Torvalds authored
Pull btrfs fix from David Sterba: "One more fix that verifies that the snapshot source is a root, same check is also done in user space but should be done by the ioctl as well" * tag 'for-6.7-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: do not allow non subvolume root targets for snapshot
-
git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwireLinus Torvalds authored
Pull soundwire fixes from Vinod Koul: - Null pointer dereference for mult link in core - AC timing fix in intel driver * tag 'soundwire-6.7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire: soundwire: intel_ace2x: fix AC timing setting for ACE2.x soundwire: stream: fix NULL pointer dereference for multi_link
-
git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phyLinus Torvalds authored
Pull phy fixes from Vinod Koul: - register offset fix for TI driver - mediatek driver minimal supported frequency fix - negative error code in probe fix for sunplus driver * tag 'phy-fixes-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy: phy: sunplus: return negative error code in sp_usb_phy_probe phy: mediatek: mipi: mt8183: fix minimal supported frequency phy: ti: gmii-sel: Fix register offset when parent is not a syscon node
-
git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengineLinus Torvalds authored
Pull dmaengine fixes from Vinod Koul: - SPI PDMA data fix for TI k3-psil drivers - suspend fix, pointer check, logic for arbitration fix and channel leak fix in fsl-edma driver - couple of fixes in idxd driver for GRPCFG descriptions and int_handle field handling - single fix for stm32 driver for bitfield overflow * tag 'dmaengine-fix-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: dmaengine: fsl-edma: fix DMA channel leak in eDMAv4 dmaengine: fsl-edma: fix wrong pointer check in fsl_edma3_attach_pd() dmaengine: idxd: Fix incorrect descriptions for GRPCFG register dmaengine: idxd: Protect int_handle field in hw descriptor dmaengine: stm32-dma: avoid bitfield overflow assertion dmaengine: fsl-edma: Add judgment on enabling round robin arbitration dmaengine: fsl-edma: Do not suspend and resume the masked dma channel when the system is sleeping dmaengine: ti: k3-psil-am62a: Fix SPI PDMA data dmaengine: ti: k3-psil-am62: Fix SPI PDMA data
-
git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxlLinus Torvalds authored
Pull CXL (Compute Express Link) fixes from Dan Williams: "A collection of CXL fixes. The touch outside of drivers/cxl/ is for a helper that allocates physical address space. Device hotplug tests showed that the driver failed to utilize (skipped over) valid capacity when allocating a new memory region. Outside of that, new tests uncovered a small crop of lockdep reports. There is also some miscellaneous error path and leak fixups that are not urgent, but useful to cleanup now. - Fix alloc_free_mem_region()'s scan for address space, prevent false negative out-of-space events - Fix sleeping lock acquisition from CXL trace event (atomic context) - Fix put_device() like for the new CXL PMU driver - Fix wrong pointer freed on error path - Fixup several lockdep reports (missing lock hold) from new assertion in cxl_num_decoders_committed() and new tests" * tag 'cxl-fixes-6.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: cxl/pmu: Ensure put_device on pmu devices cxl/cdat: Free correct buffer on checksum error cxl/hdm: Fix dpa translation locking kernel/resource: Increment by align value in get_free_mem_region() cxl: Add cxl_num_decoders_committed() usage to cxl_test cxl/memdev: Hold region_rwsem during inject and clear poison ops cxl/core: Always hold region_rwsem while reading poison lists cxl/hdm: Fix a benign lockdep splat
-
git://git.kernel.org/pub/scm/linux/kernel/git/ras/rasLinus Torvalds authored
Pull EDAC fix from Borislav Petkov: - A single fix for the EDAC Versal driver to read out register fields properly * tag 'edac_urgent_for_v6.7_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: EDAC/versal: Read num_csrows and num_chans using the correct bitfield macro
-
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linuxLinus Torvalds authored
Pull powerpc fixes from Michael Ellerman: - Fix a bug where heavy VAS (accelerator) usage could race with partition migration and prevent the migration from completing. - Update MAINTAINERS to add Aneesh & Naveen. Thanks to Haren Myneni. * tag 'powerpc-6.7-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: MAINTAINERS: powerpc: Add Aneesh & Naveen powerpc/pseries/vas: Migration suspend waits for no in-progress open windows
-
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linuxLinus Torvalds authored
Pull clk fixes from Stephen Boyd: "A handful of clk fixes, mostly in the rockchip clk driver: - Fix a clk name, clk parent, and a register for a clk gate in the Rockchip rk3128 clk driver - Add a PLL frequency on Rockchip rk3568 to fix some display artifacts - Fix a kbuild dependency for Qualcomm's SM_CAMCC_8550 symbol so that it isn't possible to select the associated GCC driver" * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: clk: rockchip: rk3128: Fix SCLK_SDMMC's clock name clk: rockchip: rk3128: Fix aclk_peri_src's parent clk: qcom: Fix SM_CAMCC_8550 dependencies clk: rockchip: rk3128: Fix HCLK_OTG gate register clk: rockchip: rk3568: Add PLL rate for 292.5MHz
-
- 16 Dec, 2023 3 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-traceLinus Torvalds authored
Pull tracing fixes from Steven Rostedt: - Fix eventfs to check creating new files for events with names greater than NAME_MAX. The eventfs lookup needs to check the return result of simple_lookup(). - Fix the ring buffer to check the proper max data size. Events must be able to fit on the ring buffer sub-buffer, if it cannot, then it fails to be written and the logic to add the event is avoided. The code to check if an event can fit failed to add the possible absolute timestamp which may make the event not be able to fit. This causes the ring buffer to go into an infinite loop trying to find a sub-buffer that would fit the event. Luckily, there's a check that will bail out if it looped over a 1000 times and it also warns. The real fix is not to add the absolute timestamp to an event that is starting at the beginning of a sub-buffer because it uses the sub-buffer timestamp. By avoiding the timestamp at the start of the sub-buffer allows events that pass the first check to always find a sub-buffer that it can fit on. - Have large events that do not fit on a trace_seq to print "LINE TOO BIG" like it does for the trace_pipe instead of what it does now which is to silently drop the output. - Fix a memory leak of forgetting to free the spare page that is saved by a trace instance. - Update the size of the snapshot buffer when the main buffer is updated if the snapshot buffer is allocated. - Fix ring buffer timestamp logic by removing all the places that tried to put the before_stamp back to the write stamp so that the next event doesn't add an absolute timestamp. But each of these updates added a race where by making the two timestamp equal, it was validating the write_stamp so that it can be incorrectly used for calculating the delta of an event. - There's a temp buffer used for printing the event that was using the event data size for allocation when it needed to use the size of the entire event (meta-data and payload data) - For hardening, use "%.*s" for printing the trace_marker output, to limit the amount that is printed by the size of the event. This was discovered by development that added a bug that truncated the '\0' and caused a crash. - Fix a use-after-free bug in the use of the histogram files when an instance is being removed. - Remove a useless update in the rb_try_to_discard of the write_stamp. The before_stamp was already changed to force the next event to add an absolute timestamp that the write_stamp is not used. But the write_stamp is modified again using an unneeded 64-bit cmpxchg. - Fix several races in the 32-bit implementation of the rb_time_cmpxchg() that does a 64-bit cmpxchg. - While looking at fixing the 64-bit cmpxchg, I noticed that because the ring buffer uses normal cmpxchg, and this can be done in NMI context, there's some architectures that do not have a working cmpxchg in NMI context. For these architectures, fail recording events that happen in NMI context. * tag 'trace-v6.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: ring-buffer: Do not record in NMI if the arch does not support cmpxchg in NMI ring-buffer: Have rb_time_cmpxchg() set the msb counter too ring-buffer: Fix 32-bit rb_time_read() race with rb_time_cmpxchg() ring-buffer: Fix a race in rb_time_cmpxchg() for 32 bit archs ring-buffer: Remove useless update to write_stamp in rb_try_to_discard() ring-buffer: Do not try to put back write_stamp tracing: Fix uaf issue when open the hist or hist_debug file tracing: Add size check when printing trace_marker output ring-buffer: Have saved event hold the entire event ring-buffer: Do not update before stamp when switching sub-buffers tracing: Update snapshot buffer on resize if it is allocated ring-buffer: Fix memory leak of free page eventfs: Fix events beyond NAME_MAX blocking tasks tracing: Have large events show up as '[LINE TOO BIG]' instead of nothing ring-buffer: Fix writing to the buffer with max_data_size
-
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linuxLinus Torvalds authored
Pull arm64 fixes from Catalin Marinas: - Arm CMN perf: fix the DTC allocation failure path which can end up erroneously clearing live counters - arm64/mm: fix hugetlb handling of the dirty page state leading to a continuous fault loop in user on hardware without dirty bit management (DBM). That's caused by the dirty+writeable information not being properly preserved across a series of mprotect(PROT_NONE), mprotect(PROT_READ|PROT_WRITE) * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: mm: Always make sw-dirty PTEs hw-dirty in pte_modify perf/arm-cmn: Fail DTC counter allocation correctly
-
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pciLinus Torvalds authored
Pull pci fixes from Bjorn Helgaas: - Limit Max_Read_Request_Size (MRRS) on some MIPS Loongson systems because they don't all support MRRS > 256, and firmware doesn't always initialize it correctly, which meant some PCIe devices didn't work (Jiaxun Yang) - Add and use pci_enable_link_state_locked() to prevent potential deadlocks in vmd and qcom drivers (Johan Hovold) - Revert recent (v6.5) acpiphp resource assignment changes that fixed issues with hot-adding devices on a root bus or with large BARs, but introduced new issues with GPU initialization and hot-adding SCSI disks in QEMU VMs and (Bjorn Helgaas) * tag 'pci-v6.7-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: Revert "PCI: acpiphp: Reassign resources on bridge if necessary" PCI/ASPM: Add pci_disable_link_state_locked() lockdep assert PCI/ASPM: Clean up __pci_disable_link_state() 'sem' parameter PCI: qcom: Clean up ASPM comment PCI: qcom: Fix potential deadlock when enabling ASPM PCI: vmd: Fix potential deadlock when enabling ASPM PCI/ASPM: Add pci_enable_link_state_locked() PCI: loongson: Limit MRRS to 256
-
- 15 Dec, 2023 23 commits
-
-
Josef Bacik authored
Our btrfs subvolume snapshot <source> <destination> utility enforces that <source> is the root of the subvolume, however this isn't enforced in the kernel. Update the kernel to also enforce this limitation to avoid problems with other users of this ioctl that don't have the appropriate checks in place. Reported-by: Martin Michaelis <code@mgjm.de> CC: stable@vger.kernel.org # 4.14+ Reviewed-by: Neal Gompa <neal@gompa.dev> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-
Jens Axboe authored
This code is rarely (never?) enabled by distros, and it hasn't caught anything in decades. Let's kill off this legacy debug code. Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Jens Axboe authored
There are multiple ways to grab references to credentials, and the only protection we have against overflowing it is the memory required to do so. With memory sizes only moving in one direction, let's bump the reference count to 64-bit and move it outside the realm of feasibly overflowing. Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Bjorn Helgaas authored
This reverts commit 40613da5 and the subsequent fix to it: cc22522f ("PCI: acpiphp: Use pci_assign_unassigned_bridge_resources() only for non-root bus") 40613da5 fixed a problem where hot-adding a device with large BARs failed if the bridge windows programmed by firmware were not large enough. cc22522f ("PCI: acpiphp: Use pci_assign_unassigned_bridge_resources() only for non-root bus") fixed a problem with 40613da5: an ACPI hot-add of a device on a PCI root bus (common in the virt world) or firmware sending ACPI Bus Check to non-existent Root Ports (e.g., on Dell Inspiron 7352/0W6WV0) caused a NULL pointer dereference and suspend/resume hangs. Unfortunately the combination of 40613da5 and cc22522f caused other problems: - Fiona reported that hot-add of SCSI disks in QEMU virtual machine fails sometimes. - Dongli reported a similar problem with hot-add of SCSI disks. - Jonathan reported a console freeze during boot on bare metal due to an error in radeon GPU initialization. Revert both patches to avoid adding these problems. This means we will again see the problems with hot-adding devices with large BARs and the NULL pointer dereferences and suspend/resume issues that 40613da5 and cc22522f were intended to fix. Fixes: 40613da5 ("PCI: acpiphp: Reassign resources on bridge if necessary") Fixes: cc22522f ("PCI: acpiphp: Use pci_assign_unassigned_bridge_resources() only for non-root bus") Reported-by: Fiona Ebner <f.ebner@proxmox.com> Closes: https://lore.kernel.org/r/9eb669c0-d8f2-431d-a700-6da13053ae54@proxmox.comReported-by: Dongli Zhang <dongli.zhang@oracle.com> Closes: https://lore.kernel.org/r/3c4a446a-b167-11b8-f36f-d3c1b49b42e9@oracle.comReported-by: Jonathan Woithe <jwoithe@just42.net> Closes: https://lore.kernel.org/r/ZXpaNCLiDM+Kv38H@marvin.atrad.com.auSigned-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Igor Mammedov <imammedo@redhat.com> Cc: <stable@vger.kernel.org>
-
git://git.kernel.dk/linuxLinus Torvalds authored
Pull io_uring fixes from Jens Axboe: "Just two minor fixes: - Fix for the io_uring socket option commands using the wrong value on some archs (Al) - Tweak to the poll lazy wake enable (me)" * tag 'io_uring-6.7-2023-12-15' of git://git.kernel.dk/linux: io_uring/cmd: fix breakage in SOCKET_URING_OP_SIOC* implementation io_uring/poll: don't enable lazy wake for POLLEXCLUSIVE
-
Linus Torvalds authored
Merge tag 'mm-hotfixes-stable-2023-12-15-07-11' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "17 hotfixes. 8 are cc:stable and the other 9 pertain to post-6.6 issues" * tag 'mm-hotfixes-stable-2023-12-15-07-11' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: mm/mglru: reclaim offlined memcgs harder mm/mglru: respect min_ttl_ms with memcgs mm/mglru: try to stop at high watermarks mm/mglru: fix underprotected page cache mm/shmem: fix race in shmem_undo_range w/THP Revert "selftests: error out if kernel header files are not yet built" crash_core: fix the check for whether crashkernel is from high memory x86, kexec: fix the wrong ifdeffery CONFIG_KEXEC sh, kexec: fix the incorrect ifdeffery and dependency of CONFIG_KEXEC mips, kexec: fix the incorrect ifdeffery and dependency of CONFIG_KEXEC m68k, kexec: fix the incorrect ifdeffery and build dependency of CONFIG_KEXEC loongarch, kexec: change dependency of object files mm/damon/core: make damon_start() waits until kdamond_fn() starts selftests/mm: cow: print ksft header before printing anything else mm: fix VMA heap bounds checking riscv: fix VMALLOC_START definition kexec: drop dependency on ARCH_SUPPORTS_KEXEC from CRASH_DUMP
-
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/soundLinus Torvalds authored
Pull sound fixes from Takashi Iwai: "A collection of HD-audio quirks for TAS2781 codec and device-specific workarounds" * tag 'sound-6.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: hda/tas2781: reset the amp before component_add ALSA: hda/tas2781: call cleanup functions only once ALSA: hda/tas2781: handle missing EFI calibration data ALSA: hda/tas2781: leave hda_component in usable state ALSA: hda/realtek: Apply mute LED quirk for HP15-db ALSA: hda/hdmi: add force-connect quirks for ASUSTeK Z170 variants ALSA: hda/hdmi: add force-connect quirk for NUC5CPYB
-
git://anongit.freedesktop.org/drm/drmLinus Torvalds authored
Pull drm fixes from Dave Airlie: "More regular fixes, amdgpu, i915, mediatek and nouveau are most of them this week. Nothing too major, then a few misc bits and pieces in core, panel and ivpu. drm: - fix uninit problems in crtc - fix fd ownership check - edid: add modes in fallback paths panel: - move LG panel into DSI yaml - ltk050h3146w: set burst mode mediatek: - mtk_disp_gamma: Fix breakage due to merge issue - fix kernel oops if no crtc is found - Add spinlock for setting vblank event in atomic_begin - Fix access violation in mtk_drm_crtc_dma_dev_get i915: - Fix selftest engine reset count storage for multi-tile - Fix out-of-bounds reads for engine reset counts - Fix ADL+ remapped stride with CCS - Fix intel_atomic_setup_scalers() plane_state handling - Fix ADL+ tiled plane stride when the POT stride is smaller than the original - Fix eDP 1.4 rate select method link configuration amdgpu: - Fix suspend fix that got accidently mangled last week - Fix OD regression - PSR fixes - OLED Backlight regression fix - JPEG 4.0.5 fix - Misc display fixes - SDMA 5.2 fix - SDMA 2.4 regression fix - GPUVM race fix nouveau: - fix gk20a instobj hierarchy - fix headless iors inheritance regression ivpu: - fix WA initialisation" * tag 'drm-fixes-2023-12-15' of git://anongit.freedesktop.org/drm/drm: (31 commits) drm/nouveau/kms/nv50-: Don't allow inheritance of headless iors drm/nouveau: Fixup gk20a instobj hierarchy drm/amdgpu: warn when there are still mappings when a BO is destroyed v2 drm/amdgpu: fix tear down order in amdgpu_vm_pt_free drm/amd: Fix a probing order problem on SDMA 2.4 drm/amdgpu/sdma5.2: add begin/end_use ring callbacks drm/panel: ltk050h3146w: Set burst mode for ltk050h3148w dt-bindings: panel-simple-dsi: move LG 5" HD TFT LCD panel into DSI yaml drm/amd/display: Disable PSR-SU on Parade 0803 TCON again drm/amd/display: Populate dtbclk from bounding box drm/amd/display: Revert "Fix conversions between bytes and KB" drm/amdgpu/jpeg: configure doorbell for each playback drm/amd/display: Restore guard against default backlight value < 1 nit drm/amd/display: fix hw rotated modes when PSR-SU is enabled drm/amd/pm: fix pp_*clk_od typo drm/amdgpu: fix buffer funcs setting order on suspend harder drm/mediatek: Fix access violation in mtk_drm_crtc_dma_dev_get drm/edid: also call add modes in EDID connector update fallback drm/i915/edp: don't write to DP_LINK_BW_SET when using rate select drm/i915: Fix ADL+ tiled plane stride when the POT stride is smaller than the original ...
-
Steven Rostedt (Google) authored
As the ring buffer recording requires cmpxchg() to work, if the architecture does not support cmpxchg in NMI, then do not do any recording within an NMI. Link: https://lore.kernel.org/linux-trace-kernel/20231213175403.6fc18540@gandalf.local.home Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
-
Steven Rostedt (Google) authored
The rb_time_cmpxchg() on 32-bit architectures requires setting three 32-bit words to represent the 64-bit timestamp, with some salt for synchronization. Those are: msb, top, and bottom The issue is, the rb_time_cmpxchg() did not properly salt the msb portion, and the msb that was written was stale. Link: https://lore.kernel.org/linux-trace-kernel/20231215084114.20899342@rorschach.local.home Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Fixes: f03f2abc ("ring-buffer: Have 32 bit time stamps use all 64 bits") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
-
Mathieu Desnoyers authored
The following race can cause rb_time_read() to observe a corrupted time stamp: rb_time_cmpxchg() [...] if (!rb_time_read_cmpxchg(&t->msb, msb, msb2)) return false; if (!rb_time_read_cmpxchg(&t->top, top, top2)) return false; <interrupted before updating bottom> __rb_time_read() [...] do { c = local_read(&t->cnt); top = local_read(&t->top); bottom = local_read(&t->bottom); msb = local_read(&t->msb); } while (c != local_read(&t->cnt)); *cnt = rb_time_cnt(top); /* If top and msb counts don't match, this interrupted a write */ if (*cnt != rb_time_cnt(msb)) return false; ^ this check fails to catch that "bottom" is still not updated. So the old "bottom" value is returned, which is wrong. Fix this by checking that all three of msb, top, and bottom 2-bit cnt values match. The reason to favor checking all three fields over requiring a specific update order for both rb_time_set() and rb_time_cmpxchg() is because checking all three fields is more robust to handle partial failures of rb_time_cmpxchg() when interrupted by nested rb_time_set(). Link: https://lore.kernel.org/lkml/20231211201324.652870-1-mathieu.desnoyers@efficios.com/ Link: https://lore.kernel.org/linux-trace-kernel/20231212193049.680122-1-mathieu.desnoyers@efficios.com Fixes: f458a145 ("ring-buffer: Test last update in 32bit version of __rb_time_read()") Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
-
Steven Rostedt (Google) authored
Mathieu Desnoyers pointed out an issue in the rb_time_cmpxchg() for 32 bit architectures. That is: static bool rb_time_cmpxchg(rb_time_t *t, u64 expect, u64 set) { unsigned long cnt, top, bottom, msb; unsigned long cnt2, top2, bottom2, msb2; u64 val; /* The cmpxchg always fails if it interrupted an update */ if (!__rb_time_read(t, &val, &cnt2)) return false; if (val != expect) return false; <<<< interrupted here! cnt = local_read(&t->cnt); The problem is that the synchronization counter in the rb_time_t is read *after* the value of the timestamp is read. That means if an interrupt were to come in between the value being read and the counter being read, it can change the value and the counter and the interrupted process would be clueless about it! The counter needs to be read first and then the value. That way it is easy to tell if the value is stale or not. If the counter hasn't been updated, then the value is still good. Link: https://lore.kernel.org/linux-trace-kernel/20231211201324.652870-1-mathieu.desnoyers@efficios.com/ Link: https://lore.kernel.org/linux-trace-kernel/20231212115301.7a9c9a64@gandalf.local.home Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Fixes: 10464b4a ("ring-buffer: Add rb_time_t 64 bit operations for speeding up 32 bit") Reported-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
-
Steven Rostedt (Google) authored
When filtering is enabled, a temporary buffer is created to place the content of the trace event output so that the filter logic can decide from the trace event output if the trace event should be filtered out or not. If it is to be filtered out, the content in the temporary buffer is simply discarded, otherwise it is written into the trace buffer. But if an interrupt were to come in while a previous event was using that temporary buffer, the event written by the interrupt would actually go into the ring buffer itself to prevent corrupting the data on the temporary buffer. If the event is to be filtered out, the event in the ring buffer is discarded, or if it fails to discard because another event were to have already come in, it is turned into padding. The update to the write_stamp in the rb_try_to_discard() happens after a fix was made to force the next event after the discard to use an absolute timestamp by setting the before_stamp to zero so it does not match the write_stamp (which causes an event to use the absolute timestamp). But there's an effort in rb_try_to_discard() to put back the write_stamp to what it was before the event was added. But this is useless and wasteful because nothing is going to be using that write_stamp for calculations as it still will not match the before_stamp. Remove this useless update, and in doing so, we remove another cmpxchg64()! Also update the comments to reflect this change as well as remove some extra white space in another comment. Link: https://lore.kernel.org/linux-trace-kernel/20231215081810.1f4f38fe@rorschach.local.home Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Vincent Donnefort <vdonnefort@google.com> Fixes: b2dd7975 ("ring-buffer: Force absolute timestamp on discard of event") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
-
Steven Rostedt (Google) authored
If an update to an event is interrupted by another event between the time the initial event allocated its buffer and where it wrote to the write_stamp, the code try to reset the write stamp back to the what it had just overwritten. It knows that it was overwritten via checking the before_stamp, and if it didn't match what it wrote to the before_stamp before it allocated its space, it knows it was overwritten. To put back the write_stamp, it uses the before_stamp it read. The problem here is that by writing the before_stamp to the write_stamp it makes the two equal again, which means that the write_stamp can be considered valid as the last timestamp written to the ring buffer. But this is not necessarily true. The event that interrupted the event could have been interrupted in a way that it was interrupted as well, and can end up leaving with an invalid write_stamp. But if this happens and returns to this context that uses the before_stamp to update the write_stamp again, it can possibly incorrectly make it valid, causing later events to have in correct time stamps. As it is OK to leave this function with an invalid write_stamp (one that doesn't match the before_stamp), there's no reason to try to make it valid again in this case. If this race happens, then just leave with the invalid write_stamp and the next event to come along will just add a absolute timestamp and validate everything again. Bonus points: This gets rid of another cmpxchg64! Link: https://lore.kernel.org/linux-trace-kernel/20231214222921.193037a7@gandalf.local.home Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Vincent Donnefort <vdonnefort@google.com> Fixes: a389d86f ("ring-buffer: Have nested events still record running time stamp") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
-
Shubhrajyoti Datta authored
Fix the extraction of num_csrows and num_chans. The extraction of the num_rows is wrong. Instead of extracting using the FIELD_GET it is calling FIELD_PREP. The issue was masked as the default design has the rows as 0. Fixes: 6f15b178 ("EDAC/versal: Add a Xilinx Versal memory controller driver") Closes: https://lore.kernel.org/all/60ca157e-6eff-d12c-9dc0-8aeab125edda@linux-m68k.org/Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Shubhrajyoti Datta <shubhrajyoti.datta@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20231215053352.8740-1-shubhrajyoti.datta@amd.com
-
Mark Rutland authored
When lockdep is enabled, the for_each_sibling_event(sibling, event) macro checks that event->ctx->mutex is held. When creating a new group leader event, we call perf_event_validate_size() on a partially initialized event where event->ctx is NULL, and so when for_each_sibling_event() attempts to check event->ctx->mutex, we get a splat, as reported by Lucas De Marchi: WARNING: CPU: 8 PID: 1471 at kernel/events/core.c:1950 __do_sys_perf_event_open+0xf37/0x1080 This only happens for a new event which is its own group_leader, and in this case there cannot be any sibling events. Thus it's safe to skip the check for siblings, which avoids having to make invasive and ugly changes to for_each_sibling_event(). Avoid the splat by bailing out early when the new event is its own group_leader. Fixes: 382c27f4 ("perf: Fix perf_event_validate_size()") Closes: https://lore.kernel.org/lkml/20231214000620.3081018-1-lucas.demarchi@intel.com/ Closes: https://lore.kernel.org/lkml/ZXpm6gQ%2Fd59jGsuW@xpf.sh.intel.com/Reported-by: Lucas De Marchi <lucas.demarchi@intel.com> Reported-by: Pengfei Xu <pengfei.xu@intel.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20231215112450.3972309-1-mark.rutland@arm.com
-
Ira Weiny authored
The following kmemleaks were detected when removing the cxl module stack: unreferenced object 0xffff88822616b800 (size 1024): ... backtrace: [<00000000bedc6f83>] kmalloc_trace+0x26/0x90 [<00000000448d1afc>] devm_cxl_pmu_add+0x3a/0x110 [cxl_core] [<00000000ca3bfe16>] 0xffffffffa105213b [<00000000ba7f78dc>] local_pci_probe+0x41/0x90 [<000000005bb027ac>] pci_device_probe+0xb0/0x1c0 ... unreferenced object 0xffff8882260abcc0 (size 16): ... hex dump (first 16 bytes): 70 6d 75 5f 6d 65 6d 30 2e 30 00 26 82 88 ff ff pmu_mem0.0.&.... backtrace: ... [<00000000152b5e98>] dev_set_name+0x43/0x50 [<00000000c228798b>] devm_cxl_pmu_add+0x102/0x110 [cxl_core] [<00000000ca3bfe16>] 0xffffffffa105213b [<00000000ba7f78dc>] local_pci_probe+0x41/0x90 [<000000005bb027ac>] pci_device_probe+0xb0/0x1c0 ... unreferenced object 0xffff8882272af200 (size 256): ... backtrace: [<00000000bedc6f83>] kmalloc_trace+0x26/0x90 [<00000000a14d1813>] device_add+0x4ea/0x890 [<00000000a3f07b47>] devm_cxl_pmu_add+0xbe/0x110 [cxl_core] [<00000000ca3bfe16>] 0xffffffffa105213b [<00000000ba7f78dc>] local_pci_probe+0x41/0x90 [<000000005bb027ac>] pci_device_probe+0xb0/0x1c0 ... devm_cxl_pmu_add() correctly registers a device remove function but it only calls device_del() which is only part of device unregistration. Properly call device_unregister() to free up the memory associated with the device. Fixes: 1ad3f701 ("cxl/pci: Find and register CXL PMU devices") Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20231016-pmu-unregister-fix-v1-1-1e2eb2fa3c69@intel.comSigned-off-by: Dan Williams <dan.j.williams@intel.com>
-
Lyude Paul authored
Turns out we made a silly mistake when coming up with OR inheritance on nouveau. On pre-DCB 4.1, iors are statically routed to output paths via the DCB. On later generations iors are only routed to an output path if they're actually being used. Unfortunately, it appears with NVIF_OUTP_INHERIT_V0 we make the mistake of assuming the later is true on all generations, which is currently leading us to return bogus ior -> head assignments through nvif, which causes WARN_ON(). So - fix this by verifying that we actually know that there's a head assigned to an ior before allowing it to be inherited through nvif. This -should- hopefully fix the WARN_ON on GT218 reported by Borislav. Signed-off-by: Lyude Paul <lyude@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Reported-by: Borislav Petkov (AMD) <bp@alien8.de> Tested-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231214004359.1028109-1-lyude@redhat.com
-
Thierry Reding authored
Commit 12c9b05d ("drm/nouveau/imem: support allocations not preserved across suspend") uses container_of() to cast from struct nvkm_memory to struct nvkm_instobj, assuming that all instance objects are derived from struct nvkm_instobj. For the gk20a family that's not the case and they are derived from struct nvkm_memory instead. This causes some subtle data corruption (nvkm_instobj.preserve ends up mapping to gk20a_instobj.vaddr) that causes a NULL pointer dereference in gk20a_instobj_acquire_iommu() (and possibly elsewhere) and also prevents suspend/resume from working. Fix this by making struct gk20a_instobj derive from struct nvkm_instobj instead. Fixes: 12c9b05d ("drm/nouveau/imem: support allocations not preserved across suspend") Reported-by: Jonathan Hunter <jonathanh@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Tested-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231208104653.1917055-1-thierry.reding@gmail.com
-
git://git.samba.org/sfrench/cifs-2.6Linus Torvalds authored
Pull smb client fixes from Steve French: "Address OOBs and NULL dereference found by Dr. Morris's recent analysis and fuzzing. All marked for stable as well" * tag '6.7-rc5-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: smb: client: fix OOB in smb2_query_reparse_point() smb: client: fix NULL deref in asn1_ber_decoder() smb: client: fix potential OOBs in smb2_parse_contexts() smb: client: fix OOB in receive_encrypted_standard()
-
git://anongit.freedesktop.org/drm/drm-miscDave Airlie authored
drm-misc-fixes for v6.7-rc6: - Fix regression for checking if FD is master capable. - Fix uninitialized variables in drm/crtc. - Fix ivpu w/a. - Refresh modes correctly when updating EDID. - Small panel fixes. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/2d46b68f-c5a4-45e5-beb4-411569f4aac8@linux.intel.com
-
Dave Airlie authored
Merge tag 'amd-drm-fixes-6.7-2023-12-13' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.7-2023-12-13: amdgpu: - Fix suspend fix that got accidently mangled last week - Fix OD regression - PSR fixes - OLED Backlight regression fix - JPEG 4.0.5 fix - Misc display fixes - SDMA 5.2 fix - SDMA 2.4 regression fix - GPUVM race fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231213221122.4937-1-alexander.deucher@amd.com
-
Linus Torvalds authored
Merge tag 'platform-drivers-x86-v6.7-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform driver fixes from Ilpo Järvinen: - tablet-mode-switch events fix - kernel-doc warning fixes * tag 'platform-drivers-x86-v6.7-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/x86: intel_ips: fix kernel-doc formatting platform/x86: thinkpad_acpi: fix kernel-doc warnings platform/x86: intel-vbtn: Fix missing tablet-mode-switch events
-