- 13 Feb, 2023 15 commits
-
-
Johan Hovold authored
Use the irq_domain_create_hierarchy() helper to create the hierarchical domain, which both serves as documentation and avoids poking at irqdomain internals. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Tested-by: Hsin-Yi Wang <hsinyi@chromium.org> Tested-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230213104302.17307-16-johan+linaro@kernel.org
-
Johan Hovold authored
Use the irq_domain_add_hierarchy() helper to create the hierarchical domain, which both serves as documentation and avoids poking at irqdomain internals. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Tested-by: Hsin-Yi Wang <hsinyi@chromium.org> Tested-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230213104302.17307-15-johan+linaro@kernel.org
-
Johan Hovold authored
Use the irq_domain_create_hierarchy() helper to create the hierarchical domain, which both serves as documentation and avoids poking at irqdomain internals. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Tested-by: Hsin-Yi Wang <hsinyi@chromium.org> Tested-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230213104302.17307-14-johan+linaro@kernel.org
-
Johan Hovold authored
Use the irq_domain_create_hierarchy() helper to create the hierarchical domain, which both serves as documentation and avoids poking at irqdomain internals. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Tested-by: Hsin-Yi Wang <hsinyi@chromium.org> Tested-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230213104302.17307-13-johan+linaro@kernel.org
-
Johan Hovold authored
The irq_domain_push_irq() interface is used to add a new (outmost) level to a hierarchical domain after IRQs have been allocated. Possibly due to differing mental images of hierarchical domains, the names used for the irq_data variables make these functions much harder to understand than what they need to be. Rename the struct irq_data pointer to the data embedded in the descriptor as simply 'irq_data' and refer to the data allocated by this interface as 'parent_irq_data' so that the names reflect how hierarchical domains are implemented. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Tested-by: Hsin-Yi Wang <hsinyi@chromium.org> Tested-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230213104302.17307-12-johan+linaro@kernel.org
-
Johan Hovold authored
Drop some unnecessary brackets that were left in place when the corresponding code was updated. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Tested-by: Hsin-Yi Wang <hsinyi@chromium.org> Tested-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230213104302.17307-11-johan+linaro@kernel.org
-
Johan Hovold authored
Since commit d59f6617 ("genirq: Allow fwnode to carry name information only") an IRQ domain is always given a name during allocation (e.g. used for the debugfs entry). Drop the leftover name assignment when allocating the first IRQ. Tested-by: Hsin-Yi Wang <hsinyi@chromium.org> Tested-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230213104302.17307-10-johan+linaro@kernel.org
-
Johan Hovold authored
The revmap mutex is essentially only used to maintain the integrity of the radix tree during updates (lookups use RCU). As the global irq_domain_mutex is now held in all paths that update the revmap structures there is strictly no longer any need for the dedicated mutex, which can be removed. Drop the revmap mutex and add lockdep assertions to the revmap helpers to make sure that the global lock is always held when updating the revmap. Tested-by: Hsin-Yi Wang <hsinyi@chromium.org> Tested-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230213104302.17307-9-johan+linaro@kernel.org
-
Marc Zyngier authored
Hierarchical domains created using irq_domain_create_hierarchy() are currently added to the domain list before having been fully initialised. This specifically means that a racing allocation request might fail to allocate irq data for the inner domains of a hierarchy in case the parent domain pointer has not yet been set up. Note that this is not really any issue for irqchip drivers that are registered early (e.g. via IRQCHIP_DECLARE() or IRQCHIP_ACPI_DECLARE()) but could potentially cause trouble with drivers that are registered later (e.g. modular drivers using IRQCHIP_PLATFORM_DRIVER_BEGIN(), gpiochip drivers, etc.). Fixes: afb7da83 ("irqdomain: Introduce helper function irq_domain_add_hierarchy()") Cc: stable@vger.kernel.org # 3.19 Signed-off-by: Marc Zyngier <maz@kernel.org> [ johan: add commit message ] Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230213104302.17307-8-johan+linaro@kernel.org
-
Johan Hovold authored
Parallel probing of devices that share interrupts (e.g. when a driver uses asynchronous probing) can currently result in two mappings for the same hardware interrupt to be created due to missing serialisation. Make sure to hold the irq_domain_mutex when creating mappings so that looking for an existing mapping before creating a new one is done atomically. Fixes: 765230b5 ("driver-core: add asynchronous probing support for drivers") Fixes: b62b2cf5 ("irqdomain: Fix handling of type settings for existing mappings") Link: https://lore.kernel.org/r/YuJXMHoT4ijUxnRb@hovoldconsulting.com Cc: stable@vger.kernel.org # 4.8 Cc: Dmitry Torokhov <dtor@chromium.org> Cc: Jon Hunter <jonathanh@nvidia.com> Tested-by: Hsin-Yi Wang <hsinyi@chromium.org> Tested-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230213104302.17307-7-johan+linaro@kernel.org
-
Johan Hovold authored
Refactor __irq_domain_alloc_irqs() so that it can be called internally while holding the irq_domain_mutex. This will be used to fix a shared-interrupt mapping race, hence the Fixes tag. Fixes: b62b2cf5 ("irqdomain: Fix handling of type settings for existing mappings") Cc: stable@vger.kernel.org # 4.8 Tested-by: Hsin-Yi Wang <hsinyi@chromium.org> Tested-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230213104302.17307-6-johan+linaro@kernel.org
-
Johan Hovold authored
Avoid looking for an existing mapping twice when creating a new mapping using irq_create_fwspec_mapping() by factoring out the actual allocation which is shared with irq_create_mapping_affinity(). The new helper function will also be used to fix a shared-interrupt mapping race, hence the Fixes tag. Fixes: b62b2cf5 ("irqdomain: Fix handling of type settings for existing mappings") Cc: stable@vger.kernel.org # 4.8 Tested-by: Hsin-Yi Wang <hsinyi@chromium.org> Tested-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230213104302.17307-5-johan+linaro@kernel.org
-
Johan Hovold authored
In case a newly allocated IRQ ever ends up not having any associated struct irq_data it would not even be possible to dispose the mapping. Replace the bogus disposal with a WARN_ON(). This will also be used to fix a shared-interrupt mapping race, hence the CC-stable tag. Fixes: 1e2a7d78 ("irqdomain: Don't set type when mapping an IRQ") Cc: stable@vger.kernel.org # 4.8 Tested-by: Hsin-Yi Wang <hsinyi@chromium.org> Tested-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230213104302.17307-4-johan+linaro@kernel.org
-
Johan Hovold authored
The global irq_domain_mutex is held when mapping interrupts from non-hierarchical domains but currently not when disposing them. This specifically means that updates of the domain mapcount is racy (currently only used for statistics in debugfs). Make sure to hold the global irq_domain_mutex also when disposing mappings from non-hierarchical domains. Fixes: 9dc6be3d ("genirq/irqdomain: Add map counter") Cc: stable@vger.kernel.org # 4.13 Tested-by: Hsin-Yi Wang <hsinyi@chromium.org> Tested-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230213104302.17307-3-johan+linaro@kernel.org
-
Johan Hovold authored
The sanity check for an already mapped virq is done outside of the irq_domain_mutex-protected section which means that an (unlikely) racing association may not be detected. Fix this by factoring out the association implementation, which will also be used in a follow-on change to fix a shared-interrupt mapping race. Fixes: ddaf144c ("irqdomain: Refactor irq_domain_associate_many()") Cc: stable@vger.kernel.org # 3.11 Tested-by: Hsin-Yi Wang <hsinyi@chromium.org> Tested-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230213104302.17307-2-johan+linaro@kernel.org
-
- 15 Jan, 2023 4 commits
-
-
Linus Torvalds authored
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull x86 fixes from Borislav Petkov: - Make sure the poking PGD is pinned for Xen PV as it requires it this way - Fixes for two resctrl races when moving a task or creating a new monitoring group - Fix SEV-SNP guests running under HyperV where MTRRs are disabled to not return a UC- type mapping type on memremap() and thus cause a serious slowdown - Fix insn mnemonics in bioscall.S now that binutils is starting to fix confusing insn suffixes * tag 'x86_urgent_for_v6.2_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm: fix poking_init() for Xen PV guests x86/resctrl: Fix event counts regression in reused RMIDs x86/resctrl: Fix task CLOSID/RMID update race x86/pat: Fix pat_x_mtrr_type() for MTRR disabled case x86/boot: Avoid using Intel mnemonics in AT&T syntax asm
-
git://git.kernel.org/pub/scm/linux/kernel/git/ras/rasLinus Torvalds authored
Pull EDAC fixes from Borislav Petkov: - Fix the EDAC device's confusion in the polling setting units - Fix a memory leak in highbank's probing function * tag 'edac_urgent_for_v6.2_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: EDAC/highbank: Fix memory leak in highbank_mc_probe() EDAC/device: Fix period calculation in edac_device_reset_delay_period()
-
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linuxLinus Torvalds authored
Pull powerpc fixes from Michael Ellerman: - Fix a build failure with some versions of ld that have an odd version string - Fix incorrect use of mutex in the IMC PMU driver Thanks to Kajol Jain, Michael Petlan, Ojaswin Mujoo, Peter Zijlstra, and Yang Yingliang. * tag 'powerpc-6.2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/64s/hash: Make stress_hpt_timer_fn() static powerpc/imc-pmu: Fix use of mutex in IRQs disabled section powerpc/boot: Fix incorrect version calculation issue in ld_version
-
- 14 Jan, 2023 7 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommuLinus Torvalds authored
Pull iommu fixes from Joerg Roedel: - Core: Fix an iommu-group refcount leak - Fix overflow issue in IOVA alloc path - ARM-SMMU fixes from Will: - Fix VFIO regression on NXP SoCs by reporting IOMMU_CAP_CACHE_COHERENCY - Fix SMMU shutdown paths to avoid device unregistration race - Error handling fix for Mediatek IOMMU driver * tag 'iommu-fixes-v6.2-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu/mediatek-v1: Fix an error handling path in mtk_iommu_v1_probe() iommu/iova: Fix alloc iova overflows issue iommu: Fix refcount leak in iommu_device_claim_dma_owner iommu/arm-smmu-v3: Don't unregister on shutdown iommu/arm-smmu: Don't unregister on shutdown iommu/arm-smmu: Report IOMMU_CAP_CACHE_COHERENCY even betterer
-
git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblockLinus Torvalds authored
Pull memblock fix from Mike Rapoport: "memblock: always release pages to the buddy allocator in memblock_free_late() If CONFIG_DEFERRED_STRUCT_PAGE_INIT is enabled, memblock_free_pages() only releases pages to the buddy allocator if they are not in the deferred range. This is correct for free pages (as defined by for_each_free_mem_pfn_range_in_zone()) because free pages in the deferred range will be initialized and released as part of the deferred init process. memblock_free_pages() is called by memblock_free_late(), which is used to free reserved ranges after memblock_free_all() has run. All pages in reserved ranges have been initialized at that point, and accordingly, those pages are not touched by the deferred init process. This means that currently, if the pages that memblock_free_late() intends to release are in the deferred range, they will never be released to the buddy allocator. They will forever be reserved. In addition, memblock_free_pages() calls kmsan_memblock_free_pages(), which is also correct for free pages but is not correct for reserved pages. KMSAN metadata for reserved pages is initialized by kmsan_init_shadow(), which runs shortly before memblock_free_all(). For both of these reasons, memblock_free_pages() should only be called for free pages, and memblock_free_late() should call __free_pages_core() directly instead. One case where this issue can occur in the wild is EFI boot on x86_64. The x86 EFI code reserves all EFI boot services memory ranges via memblock_reserve() and frees them later via memblock_free_late() (efi_reserve_boot_services() and efi_free_boot_services(), respectively). If any of those ranges happens to fall within the deferred init range, the pages will not be released and that memory will be unavailable. For example, on an Amazon EC2 t3.micro VM (1 GB) booting via EFI: v6.2-rc2: Node 0, zone DMA spanned 4095 present 3999 managed 3840 Node 0, zone DMA32 spanned 246652 present 245868 managed 178867 v6.2-rc2 + patch: Node 0, zone DMA spanned 4095 present 3999 managed 3840 Node 0, zone DMA32 spanned 246652 present 245868 managed 222816 # +43,949 pages" * tag 'fixes-2023-01-14' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock: mm: Always release pages to the buddy allocator in memblock_free_late().
-
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linuxLinus Torvalds authored
Pull kernel hardening fixes from Kees Cook: - Fix CFI hash randomization with KASAN (Sami Tolvanen) - Check size of coreboot table entry and use flex-array * tag 'hardening-v6.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: kbuild: Fix CFI hash randomization with KASAN firmware: coreboot: Check size of table entry and use flex-array
-
git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linuxLinus Torvalds authored
Pull module fix from Luis Chamberlain: "Just one fix for modules by Nick" * tag 'modules-6.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux: kallsyms: Fix scheduling with interrupts disabled in self-test
-
git://git.samba.org/sfrench/cifs-2.6Linus Torvalds authored
Pull cifs fixes from Steve French: - memory leak and double free fix - two symlink fixes - minor cleanup fix - two smb1 fixes * tag '6.2-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: Fix uninitialized memory read for smb311 posix symlink create cifs: fix potential memory leaks in session setup cifs: do not query ifaces on smb1 mounts cifs: fix double free on failed kerberos auth cifs: remove redundant assignment to the variable match cifs: fix file info setting in cifs_open_file() cifs: fix file info setting in cifs_query_path_info()
-
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds authored
Pull SCSI fixes from James Bottomley: "Two minor fixes in the hisi_sas driver which only impact enterprise style multi-expander and shared disk situations and no core changes" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: hisi_sas: Set a port invalid only if there are no devices attached when refreshing port id scsi: hisi_sas: Use abort task set to reset SAS disks when discovered
-
git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libataLinus Torvalds authored
Pull ATA fix from Damien Le Moal: "A single fix to prevent building the pata_cs5535 driver with user mode linux as it uses msr operations that are not defined with UML" * tag 'ata-6.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata: ata: pata_cs5535: Don't build on UML
-
- 13 Jan, 2023 14 commits
-
-
git://git.kernel.dk/linuxLinus Torvalds authored
Pull block fixes from Jens Axboe: "Nothing major in here, just a collection of NVMe fixes and dropping a wrong might_sleep() that static checkers tripped over but which isn't valid" * tag 'block-6.2-2023-01-13' of git://git.kernel.dk/linux: MAINTAINERS: stop nvme matching for nvmem files nvme: don't allow unprivileged passthrough on partitions nvme: replace the "bool vec" arguments with flags in the ioctl path nvme: remove __nvme_ioctl nvme-pci: fix error handling in nvme_pci_enable() nvme-pci: add NVME_QUIRK_IDENTIFY_CNS quirk to Apple T2 controllers nvme-apple: add NVME_QUIRK_IDENTIFY_CNS quirk to fix regression block: Drop spurious might_sleep() from blk_put_queue()
-
git://git.kernel.dk/linuxLinus Torvalds authored
Pull io_uring fixes from Jens Axboe: "A fix for a regression that happened last week, rest is fixes that will be headed to stable as well. In detail: - Fix for a regression added with the leak fix from last week (me) - In writing a test case for that leak, inadvertently discovered a case where we a poll request can race. So fix that up and mark it for stable, and also ensure that fdinfo covers both the poll tables that we have. The latter was an oversight when the split poll table were added (me) - Fix for a lockdep reported issue with IOPOLL (Pavel)" * tag 'io_uring-6.2-2023-01-13' of git://git.kernel.dk/linux: io_uring: lock overflowing for IOPOLL io_uring/poll: attempt request issue after racy poll wakeup io_uring/fdinfo: include locked hash table in fdinfo output io_uring/poll: add hash if ready poll request can't complete inline io_uring/io-wq: only free worker if it was allocated for creation
-
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pciLinus Torvalds authored
Pull pci fixes from Bjorn Helgaas: - Work around apparent firmware issue that made Linux reject MMCONFIG space, which broke PCI extended config space (Bjorn Helgaas) - Fix CONFIG_PCIE_BT1 dependency due to mid-air collision between a PCI_MSI_IRQ_DOMAIN -> PCI_MSI change and addition of PCIE_BT1 (Lukas Bulwahn) * tag 'pci-v6.2-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: x86/pci: Treat EfiMemoryMappedIO as reservation of ECAM space x86/pci: Simplify is_mmconf_reserved() messages PCI: dwc: Adjust to recent removal of PCI_MSI_IRQ_DOMAIN
-
Sami Tolvanen authored
Clang emits a asan.module_ctor constructor to each object file when KASAN is enabled, and these functions are indirectly called in do_ctors. With CONFIG_CFI_CLANG, the compiler also emits a CFI type hash before each address-taken global function so they can pass indirect call checks. However, in commit 0c3e806e ("x86/cfi: Add boot time hash randomization"), x86 implemented boot time hash randomization, which relies on the .cfi_sites section generated by objtool. As objtool is run against vmlinux.o instead of individual object files with X86_KERNEL_IBT (enabled by default), CFI types in object files that are not part of vmlinux.o end up not being included in .cfi_sites, and thus won't get randomized and trip CFI when called. Only .vmlinux.export.o and init/version-timestamp.o are linked into vmlinux separately from vmlinux.o. As these files don't contain any functions, disable KASAN for both of them to avoid breaking hash randomization. Link: https://github.com/ClangBuiltLinux/linux/issues/1742 Fixes: 0c3e806e ("x86/cfi: Add boot time hash randomization") Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230112224948.1479453-2-samitolvanen@google.com
-
Kees Cook authored
The memcpy() of the data following a coreboot_table_entry couldn't be evaluated by the compiler under CONFIG_FORTIFY_SOURCE. To make it easier to reason about, add an explicit flexible array member to struct coreboot_device so the entire entry can be copied at once. Additionally, validate the sizes before copying. Avoids this run-time false positive warning: memcpy: detected field-spanning write (size 168) of single field "&device->entry" at drivers/firmware/google/coreboot_table.c:103 (size 8) Reported-by: Paul Menzel <pmenzel@molgen.mpg.de> Link: https://lore.kernel.org/all/03ae2704-8c30-f9f0-215b-7cdf4ad35a9a@molgen.mpg.de/ Cc: Jack Rosenthal <jrosenth@chromium.org> Cc: Guenter Roeck <groeck@chromium.org> Cc: Julius Werner <jwerner@chromium.org> Cc: Brian Norris <briannorris@chromium.org> Cc: Stephen Boyd <swboyd@chromium.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Julius Werner <jwerner@chromium.org> Reviewed-by: Guenter Roeck <groeck@chromium.org> Link: https://lore.kernel.org/r/20230107031406.gonna.761-kees@kernel.orgReviewed-by: Stephen Boyd <swboyd@chromium.org> Reviewed-by: Jack Rosenthal <jrosenth@chromium.org> Link: https://lore.kernel.org/r/20230112230312.give.446-kees@kernel.org
-
Nicholas Piggin authored
kallsyms_on_each* may schedule so must not be called with interrupts disabled. The iteration function could disable interrupts, but this also changes lookup_symbol() to match the change to the other timing code. Reported-by: Erhard F. <erhard_f@mailbox.org> Link: https://lore.kernel.org/all/bug-216902-206035@https.bugzilla.kernel.org%2F/Reported-by: kernel test robot <oliver.sang@intel.com> Link: https://lore.kernel.org/oe-lkp/202212251728.8d0872ff-oliver.sang@intel.com Fixes: 30f3bb09 ("kallsyms: Add self-test facility") Tested-by: "Erhard F." <erhard_f@mailbox.org> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
-
Peter Foley authored
This driver uses MSR functions that aren't implemented under UML. Avoid building it to prevent tripping up allyesconfig. e.g. /usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../x86_64-pc-linux-gnu/bin/ld: pata_cs5535.c:(.text+0x3a3): undefined reference to `__tracepoint_read_msr' /usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../x86_64-pc-linux-gnu/bin/ld: pata_cs5535.c:(.text+0x3d2): undefined reference to `__tracepoint_write_msr' /usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../x86_64-pc-linux-gnu/bin/ld: pata_cs5535.c:(.text+0x457): undefined reference to `__tracepoint_write_msr' /usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../x86_64-pc-linux-gnu/bin/ld: pata_cs5535.c:(.text+0x481): undefined reference to `do_trace_write_msr' /usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../x86_64-pc-linux-gnu/bin/ld: pata_cs5535.c:(.text+0x4d5): undefined reference to `do_trace_write_msr' /usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../x86_64-pc-linux-gnu/bin/ld: pata_cs5535.c:(.text+0x4f5): undefined reference to `do_trace_read_msr' /usr/lib/gcc/x86_64-pc-linux-gnu/12/../../../../x86_64-pc-linux-gnu/bin/ld: pata_cs5535.c:(.text+0x51c): undefined reference to `do_trace_write_msr' Signed-off-by: Peter Foley <pefoley2@pefoley.com> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
-
git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds authored
Pull kvm fixes from Paolo Bonzini: "ARM: - Fix the PMCR_EL0 reset value after the PMU rework - Correctly handle S2 fault triggered by a S1 page table walk by not always classifying it as a write, as this breaks on R/O memslots - Document why we cannot exit with KVM_EXIT_MMIO when taking a write fault from a S1 PTW on a R/O memslot - Put the Apple M2 on the naughty list for not being able to correctly implement the vgic SEIS feature, just like the M1 before it - Reviewer updates: Alex is stepping down, replaced by Zenghui x86: - Fix various rare locking issues in Xen emulation and teach lockdep to detect them - Documentation improvements - Do not return host topology information from KVM_GET_SUPPORTED_CPUID" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86/xen: Avoid deadlock by adding kvm->arch.xen.xen_lock leaf node lock KVM: Ensure lockdep knows about kvm->lock vs. vcpu->mutex ordering rule KVM: x86/xen: Fix potential deadlock in kvm_xen_update_runstate_guest() KVM: x86/xen: Fix lockdep warning on "recursive" gpc locking Documentation: kvm: fix SRCU locking order docs KVM: x86: Do not return host topology information from KVM_GET_SUPPORTED_CPUID KVM: nSVM: clarify recalc_intercepts() wrt CR8 MAINTAINERS: Remove myself as a KVM/arm64 reviewer MAINTAINERS: Add Zenghui Yu as a KVM/arm64 reviewer KVM: arm64: vgic: Add Apple M2 cpus to the list of broken SEIS implementations KVM: arm64: Convert FSC_* over to ESR_ELx_FSC_* KVM: arm64: Document the behaviour of S1PTW faults on RO memslots KVM: arm64: Fix S1PTW handling on RO memslots KVM: arm64: PMU: Fix PMCR_EL0 reset value
-
Mateusz Guzik authored
On the x86-64 architecture even a failing cmpxchg grants exclusive access to the cacheline, making it preferable to retry the failed op immediately instead of stalling with the pause instruction. To illustrate the impact, below are benchmark results obtained by running various will-it-scale tests on top of the 6.2-rc3 kernel and Cascade Lake (2 sockets * 24 cores * 2 threads) CPU. All results in ops/s. Note there is some variance in re-runs, but the code is consistently faster when contention is present. open3 ("Same file open/close"): proc stock no-pause 1 805603 814942 (+%1) 2 1054980 1054781 (-0%) 8 1544802 1822858 (+18%) 24 1191064 2199665 (+84%) 48 851582 1469860 (+72%) 96 609481 1427170 (+134%) fstat2 ("Same file fstat"): proc stock no-pause 1 3013872 3047636 (+1%) 2 4284687 4400421 (+2%) 8 3257721 5530156 (+69%) 24 2239819 5466127 (+144%) 48 1701072 5256609 (+209%) 96 1269157 6649326 (+423%) Additionally, a kernel with a private patch to help access() scalability: access2 ("Same file access"): proc stock patched patched +nopause 24 2378041 2005501 5370335 (-15% / +125%) That is, fixing the problems in access itself *reduces* scalability after the cacheline ping-pong only happens in lockref with the pause instruction. Note that fstat and access benchmarks are not currently integrated into will-it-scale, but interested parties can find them in pull requests to said project. Code at hand has a rather tortured history. First modification showed up in commit d472d9d9 ("lockref: Relax in cmpxchg loop"), written with Itanium in mind. Later it got patched up to use an arch-dependent macro to stop doing it on s390 where it caused a significant regression. Said macro had undergone revisions and was ultimately eliminated later, going back to cpu_relax. While I intended to only remove cpu_relax for x86-64, I got the following comment from Linus: I would actually prefer just removing it entirely and see if somebody else hollers. You have the numbers to prove it hurts on real hardware, and I don't think we have any numbers to the contrary. So I think it's better to trust the numbers and remove it as a failure, than say "let's just remove it on x86-64 and leave everybody else with the potentially broken code" Additionally, Will Deacon (maintainer of the arm64 port, one of the architectures previously benchmarked): So, from the arm64 side of the fence, I'm perfectly happy just removing the cpu_relax() calls from lockref. As such, come back full circle in history and whack it altogether. Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Link: https://lore.kernel.org/all/CAGudoHHx0Nqg6DE70zAVA75eV-HXfWyhVMWZ-aSeOofkA_=WdA@mail.gmail.com/ Acked-by: Tony Luck <tony.luck@intel.com> # ia64 Acked-by: Nicholas Piggin <npiggin@gmail.com> # powerpc Acked-by: Will Deacon <will@kernel.org> # arm64 Acked-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Bjorn Helgaas authored
Normally we reject ECAM space unless it is reported as reserved in the E820 table or via a PNP0C02 _CRS method (PCI Firmware, r3.3, sec 4.1.2). 07eab090 ("efi/x86: Remove EfiMemoryMappedIO from E820 map"), removes E820 entries that correspond to EfiMemoryMappedIO regions because some other firmware uses EfiMemoryMappedIO for PCI host bridge windows, and the E820 entries prevent Linux from allocating BAR space for hot-added devices. Some firmware doesn't report ECAM space via PNP0C02 _CRS methods, but does mention it as an EfiMemoryMappedIO region via EFI GetMemoryMap(), which is normally converted to an E820 entry by a bootloader or EFI stub. After 07eab090, that E820 entry is removed, so we reject this ECAM space, which makes PCI extended config space (offsets 0x100-0xfff) inaccessible. The lack of extended config space breaks anything that relies on it, including perf, VSEC telemetry, EDAC, QAT, SR-IOV, etc. Allow use of ECAM for extended config space when the region is covered by an EfiMemoryMappedIO region, even if it's not included in E820 or PNP0C02 _CRS. Link: https://lore.kernel.org/r/ac2693d8-8ba3-72e0-5b66-b3ae008d539d@linux.intel.com Link: https://bugzilla.kernel.org/show_bug.cgi?id=216891 Fixes: 07eab090 ("efi/x86: Remove EfiMemoryMappedIO from E820 map") Link: https://lore.kernel.org/r/20230110180243.1590045-3-helgaas@kernel.orgReported-by: Kan Liang <kan.liang@linux.intel.com> Reported-by: Tony Luck <tony.luck@intel.com> Reported-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reported-by: Yunying Sun <yunying.sun@intel.com> Reported-by: Baowen Zheng <baowen.zheng@corigine.com> Reported-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reported-by: Yang Lixiao <lixiao.yang@intel.com> Tested-by: Tony Luck <tony.luck@intel.com> Tested-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Tested-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Yunying Sun <yunying.sun@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efiLinus Torvalds authored
Pull EFI fixes from Ard Biesheuvel: - avoid a potential crash on the efi_subsys_init() error path - use more appropriate error code for runtime services calls issued after a crash in the firmware occurred - avoid READ_ONCE() for accessing firmware tables that may appear misaligned in memory * tag 'efi-fixes-for-v6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: efi: tpm: Avoid READ_ONCE() for accessing the event log efi: rt-wrapper: Add missing include efi: fix userspace infinite retry read efivars after EFI runtime services page fault efi: fix NULL-deref in init error path
-
git://git.lwn.net/linuxLinus Torvalds authored
Pull documentation fixes from Jonathan Corbet: "Three documentation fixes (or rather two and one warning): - Sphinx 6.0 broke our configuration mechanism, so fix it - I broke our configuration for non-Alabaster themes; Akira fixed it - Deprecate Sphinx < 2.4 with an eye toward future removal" * tag 'docs-6.2-fixes' of git://git.lwn.net/linux: docs/conf.py: Use about.html only in sidebar of alabaster theme docs: Deprecate use of Sphinx < 2.4.x docs: Fix the docs build with Sphinx 6.0
-
Ard Biesheuvel authored
Nathan reports that recent kernels built with LTO will crash when doing EFI boot using Fedora's GRUB and SHIM. The culprit turns out to be a misaligned load from the TPM event log, which is annotated with READ_ONCE(), and under LTO, this gets translated into a LDAR instruction which does not tolerate misaligned accesses. Interestingly, this does not happen when booting the same kernel straight from the UEFI shell, and so the fact that the event log may appear misaligned in memory may be caused by a bug in GRUB or SHIM. However, using READ_ONCE() to access firmware tables is slightly unusual in any case, and here, we only need to ensure that 'event' is not dereferenced again after it gets unmapped, but this is already taken care of by the implicit barrier() semantics of the early_memunmap() call. Cc: <stable@vger.kernel.org> Cc: Peter Jones <pjones@redhat.com> Cc: Jarkko Sakkinen <jarkko@kernel.org> Cc: Matthew Garrett <mjg59@srcf.ucam.org> Reported-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Link: https://github.com/ClangBuiltLinux/linux/issues/1782Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
-
Pavel Begunkov authored
syzbot reports an issue with overflow filling for IOPOLL: WARNING: CPU: 0 PID: 28 at io_uring/io_uring.c:734 io_cqring_event_overflow+0x1c0/0x230 io_uring/io_uring.c:734 CPU: 0 PID: 28 Comm: kworker/u4:1 Not tainted 6.2.0-rc3-syzkaller-16369-g358a161a6a9e #0 Workqueue: events_unbound io_ring_exit_work Call trace: io_cqring_event_overflow+0x1c0/0x230 io_uring/io_uring.c:734 io_req_cqe_overflow+0x5c/0x70 io_uring/io_uring.c:773 io_fill_cqe_req io_uring/io_uring.h:168 [inline] io_do_iopoll+0x474/0x62c io_uring/rw.c:1065 io_iopoll_try_reap_events+0x6c/0x108 io_uring/io_uring.c:1513 io_uring_try_cancel_requests+0x13c/0x258 io_uring/io_uring.c:3056 io_ring_exit_work+0xec/0x390 io_uring/io_uring.c:2869 process_one_work+0x2d8/0x504 kernel/workqueue.c:2289 worker_thread+0x340/0x610 kernel/workqueue.c:2436 kthread+0x12c/0x158 kernel/kthread.c:376 ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:863 There is no real problem for normal IOPOLL as flush is also called with uring_lock taken, but it's getting more complicated for IOPOLL|SQPOLL, for which __io_cqring_overflow_flush() happens from the CQ waiting path. Reported-and-tested-by: syzbot+6805087452d72929404e@syzkaller.appspotmail.com Cc: stable@vger.kernel.org # 5.10+ Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
-