1. 27 Jul, 2024 23 commits
    • Merge tag 'io_uring-6.11-20240726' of git://git.kernel.dk/linux · 8c930747
      Linus Torvalds authored
      Pull io_uring fixes from Jens Axboe:
      
       - Fix a syzbot issue for the msg ring cache added in this release. No
         ill effects from this one, but it did make KMSAN unhappy (me)
      
       - Sanitize the NAPI timeout handling, by unifying the value handling
         into all ktime_t rather than converting back and forth (Pavel)
      
       - Fail NAPI registration for IOPOLL rings, it's not supported (Pavel)
      
       - Fix a theoretical issue with ring polling and cancelations (Pavel)
      
       - Various little cleanups and fixes (Pavel)
      
      * tag 'io_uring-6.11-20240726' of git://git.kernel.dk/linux:
        io_uring/napi: pass ktime to io_napi_adjust_timeout
        io_uring/napi: use ktime in busy polling
        io_uring/msg_ring: fix uninitialized use of target_req->flags
        io_uring: align iowq and task request error handling
        io_uring: kill REQ_F_CANCEL_SEQ
        io_uring: simplify io_uring_cmd return
        io_uring: fix io_match_task must_hold
        io_uring: don't allow netpolling with SETUP_IOPOLL
        io_uring: tighten task exit cancellations
    • Merge tag 'vfs-6.11-rc1.fixes.3' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · bc4eee85
      Linus Torvalds authored
      Pull vfs fixes from Christian Brauner:
       "This contains two fixes for this merge window:
      
        VFS:
      
         - I noticed that it is possible for a privileged user to mount most
           filesystems with a non-initial user namespace in sb->s_user_ns.
      
           When fsopen() is called in a non-init namespace the caller's
           namespace is recorded in fs_context->user_ns. If the returned file
           descriptor is then passed to a process privileged in init_user_ns,
           that process can call fsconfig(fd_fs, FSCONFIG_CMD_CREATE*),
           creating a new superblock with sb->s_user_ns set to the namespace
           of the process which called fsopen().
      
           This is problematic as only filesystems that raise FS_USERNS_MOUNT
           are known to be able to support a non-initial s_user_ns. Others may
           suffer security issues, on-disk corruption or outright crash the
           kernel. Prevent that by restricting such delegation to filesystems
           that allow FS_USERNS_MOUNT.
      
           Note, that this delegation requires a privileged process to
           actually create the superblock so either the privileged process is
           cooperating or someone must have tricked a privileged process into
           operating on a fscontext file descriptor whose origin it doesn't
           know (a stupid idea).
      
           The bug dates back to about 5 years afaict.
      
        Misc:
      
         - Fix hostfs parsing when the mount request comes in via the legacy
           mount api.
      
           In the legacy mount api hostfs allows the host directory to be
           specified without any key.
      
           Restore that behavior"
      
      * tag 'vfs-6.11-rc1.fixes.3' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
        hostfs: fix the host directory parse when mounting.
        fs: don't allow non-init s_user_ns for filesystems without FS_USERNS_MOUNT
    • Merge tag 'rust-6.11' of https://github.com/Rust-for-Linux/linux · 910bfc26
      Linus Torvalds authored
      Pull Rust updates from Miguel Ojeda:
       "The highlight is the establishment of a minimum version for the Rust
        toolchain, including 'rustc' (and bundled tools) and 'bindgen'.
      
        The initial minimum will be the pinned version we currently have, i.e.
        we are just widening the allowed versions. That covers three stable
        Rust releases: 1.78.0, 1.79.0, 1.80.0 (getting released tomorrow),
        plus beta, plus nightly.
      
        This should already be enough for kernel developers in distributions
        that provide recent Rust compiler versions routinely, such as Arch
        Linux, Debian Unstable (outside the freeze period), Fedora Linux,
        Gentoo Linux (especially the testing channel), Nix (unstable) and
        openSUSE Slowroll and Tumbleweed.
      
        In addition, the kernel is now being built-tested by Rust's pre-merge
        CI. That is, every change that is attempting to land into the Rust
        compiler is tested against the kernel, and it is merged only if it
        passes. Similarly, the bindgen tool has agreed to build the kernel in
        their CI too.
      
        Thus, with the pre-merge CI in place, both projects hope to avoid
        unintentional changes to Rust that break the kernel. This means that,
        in general, apart from intentional changes on their side (that we will
        need to workaround conditionally on our side), the upcoming Rust
        compiler versions should generally work.
      
        In addition, the Rust project has proposed getting the kernel into
        stable Rust (at least solving the main blockers) as one of its three
        flagship goals for 2024H2 [1].
      
        I would like to thank Niko, Sid, Emilio et al. for their help
        promoting the collaboration between Rust and the kernel.
      
        Toolchain and infrastructure:
      
         - Support several Rust toolchain versions.
      
         - Support several bindgen versions.
      
         - Remove 'cargo' requirement and simplify 'rusttest', thanks to
           'alloc' having been dropped last cycle.
      
         - Provide proper error reporting for the 'rust-analyzer' target.
      
        'kernel' crate:
      
         - Add 'uaccess' module with a safe userspace pointers abstraction.
      
         - Add 'page' module with a 'struct page' abstraction.
      
         - Support more complex generics in workqueue's 'impl_has_work!'
           macro.
      
        'macros' crate:
      
         - Add 'firmware' field support to the 'module!' macro.
      
         - Improve 'module!' macro documentation.
      
        Documentation:
      
         - Provide instructions on what packages should be installed to build
           the kernel in some popular Linux distributions.
      
         - Introduce the new kernel.org LLVM+Rust toolchains.
      
         - Explain '#[no_std]'.
      
        And a few other small bits"
      
      Link: https://rust-lang.github.io/rust-project-goals/2024h2/index.html#flagship-goals [1]
      
      * tag 'rust-6.11' of https://github.com/Rust-for-Linux/linux: (26 commits)
        docs: rust: quick-start: add section on Linux distributions
        rust: warn about `bindgen` versions 0.66.0 and 0.66.1
        rust: start supporting several `bindgen` versions
        rust: work around `bindgen` 0.69.0 issue
        rust: avoid assuming a particular `bindgen` build
        rust: start supporting several compiler versions
        rust: simplify Clippy warning flags set
        rust: relax most deny-level lints to warnings
        rust: allow `dead_code` for never constructed bindings
        rust: init: simplify from `map_err` to `inspect_err`
        rust: macros: indent list item in `paste!`'s docs
        rust: add abstraction for `struct page`
        rust: uaccess: add typed accessors for userspace pointers
        uaccess: always export _copy_[from|to]_user with CONFIG_RUST
        rust: uaccess: add userspace pointers
        kbuild: rust-analyzer: improve comment documentation
        kbuild: rust-analyzer: better error handling
        docs: rust: no_std is used
        rust: alloc: add __GFP_HIGHMEM flag
        rust: alloc: fix typo in docs for GFP_NOWAIT
        ...
    • Merge tag 'apparmor-pr-2024-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor · ff305644
      Linus Torvalds authored
      Pull apparmor updates from John Johansen:
       "Cleanups
         - optimization: try to avoid refing the label in apparmor_file_open
         - remove useless static inline function is_deleted
         - use kvfree_sensitive to free data->data
         - fix typo in kernel doc
      
        Bug fixes:
         - unpack transition table if dfa is not present
         - test: add MODULE_DESCRIPTION()
         - take nosymfollow flag into account
         - fix possible NULL pointer dereference
         - fix null pointer deref when receiving skb during sock creation"
      
      * tag 'apparmor-pr-2024-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor:
        apparmor: unpack transition table if dfa is not present
        apparmor: try to avoid refing the label in apparmor_file_open
        apparmor: test: add MODULE_DESCRIPTION()
        apparmor: take nosymfollow flag into account
        apparmor: fix possible NULL pointer dereference
        apparmor: fix typo in kernel doc
        apparmor: remove useless static inline function is_deleted
        apparmor: use kvfree_sensitive to free data->data
        apparmor: Fix null pointer deref when receiving skb during sock creation
    • Merge tag 'landlock-6.11-rc1-houdini-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux · 86b405ad
      Linus Torvalds authored
      Pull landlock fix from Mickaël Salaün:
       "Jann Horn reported a sandbox bypass for Landlock. This includes the
        fix and new tests. This should be backported"
      
      * tag 'landlock-6.11-rc1-houdini-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux:
        selftests/landlock: Add cred_transfer test
        landlock: Don't lose track of restrictions on cred_transfer
    • Merge tag 'gpio-fixes-for-v6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux · 8e333791
      Linus Torvalds authored
      Pull gpio fix from Bartosz Golaszewski:
      
       - don't use sprintf() with non-constant format string
      
      * tag 'gpio-fixes-for-v6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
        gpio: virtuser: avoid non-constant format string
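      The class of problem named in this fix can be sketched in plain
      userspace C. This is an illustration only, not the gpio-virtuser code:
      the helper name `copy_label` is hypothetical. When a caller-supplied
      string is passed as the format argument, any '%' in it is interpreted as
      a conversion; passing it as data through a constant "%s" format avoids
      that.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical helper for illustration: copy a caller-supplied label
 * into buf without ever treating the label as a format string.
 */
static int copy_label(char *buf, size_t size, const char *label)
{
	/*
	 * Risky form (not used here): snprintf(buf, size, label);
	 * a label such as "gpio%d" would then consume a missing argument.
	 */
	return snprintf(buf, size, "%s", label);
}
```

      With the constant format, a label containing '%' is copied through
      unchanged.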
    • Merge tag 'devicetree-fixes-for-6.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux · bf80f139
      Linus Torvalds authored
      Pull more devicetree updates from Rob Herring:
       "Most of this is a treewide change to of_property_for_each_u32() which
        was small enough to do in one go before rc1 and avoids the need to
        create of_property_for_each_u32_some_new_name().
      
         - Treewide conversion of of_property_for_each_u32() to drop internal
           arguments making struct property opaque
      
         - Add binding for Amlogic A4 SoC watchdog
      
         - Fix constraints for AD7192 'single-channel' property"
      
      * tag 'devicetree-fixes-for-6.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
        dt-bindings: iio: adc: ad7192: Fix 'single-channel' constraints
        of: remove internal arguments from of_property_for_each_u32()
        dt-bindings: watchdog: add support for Amlogic A4 SoCs
    • Merge tag 'iommu-fixes-v6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux · b465ed28
      Linus Torvalds authored
      Pull iommu fixes from Will Deacon:
       "We're still resolving a regression with the handling of unexpected
        page faults on SMMUv3, but we're not quite there with a fix yet.
      
         - Fix NULL dereference when freeing domain in Unisoc SPRD driver
      
         - Separate assignment statements with semicolons in AMD page-table
           code
      
         - Fix Tegra erratum workaround when the CPU is using 16KiB pages"
      
      * tag 'iommu-fixes-v6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux:
        iommu: arm-smmu: Fix Tegra workaround for PAGE_SIZE mappings
        iommu/amd: Convert comma to semicolon
        iommu: sprd: Avoid NULL deref in sprd_iommu_hw_en
    • Merge tag 'firewire-fixes-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394 · 04216211
      Linus Torvalds authored
      Pull firewire fixes from Takashi Sakamoto:
       "The recent integration of compiler collections introduced the
        technology to check flexible array length at runtime by providing
        proper annotations. In v6.10 kernel, a patch was merged into firewire
        subsystem to utilize it, however the annotation was inadequate.
      
        There is also the related change for the flexible array in sound
        subsystem, but it causes a regression where the data in the payload of
        isochronous packet is incorrect for some devices. These bugs are now
        fixed"
      
      * tag 'firewire-fixes-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394:
        ALSA: firewire-lib: fix wrong value as length of header for CIP_NO_HEADER case
        Revert "firewire: Annotate struct fw_iso_packet with __counted_by()"
    • Merge tag 'spi-fix-v6.11-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi · ab11658f
      Linus Torvalds authored
      Pull spi fixes from Mark Brown:
       "The bulk of this is a series of fixes for the microchip-core driver
        mostly originating from one of their customers, I also applied an
        additional patch adding support for controlling the word size which
        came along with it since it's still the merge window and clearly had a
        bunch of fairly thorough testing.
      
        We also have a fix for the compatible used to bind spidev to the
        BH2228FV"
      
      * tag 'spi-fix-v6.11-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
        spi: spidev: add correct compatible for Rohm BH2228FV
        dt-bindings: trivial-devices: fix Rohm BH2228FV compatible string
        spi: microchip-core: add support for word sizes of 1 to 32 bits
        spi: microchip-core: ensure TX and RX FIFOs are empty at start of a transfer
        spi: microchip-core: fix init function not setting the master and motorola modes
        spi: microchip-core: only disable SPI controller when register value change requires it
        spi: microchip-core: defer asserting chip select until just before write to TX FIFO
        spi: microchip-core: fix the issues in the isr
    • Merge tag 'regulator-fix-v6.11-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator · 560e8050
      Linus Torvalds authored
      Pull regulator fixes from Mark Brown:
       "These two commits clean up the excessively loose dependencies for the
        RZG2L USB VBCTRL regulator driver, ensuring it shouldn't prompt for
        people who can't use it"
      
      * tag 'regulator-fix-v6.11-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
        regulator: Further restrict RZG2L USB VBCTRL regulator dependencies
        regulator: renesas-usb-vbus-regulator: Update the default
    • Merge tag 'regmap-fix-v6.11-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap · 8f3f7598
      Linus Torvalds authored
      Pull regmap fix from Mark Brown:
       "Arnd sent a workaround for a false positive warning which was showing
        up with GCC 14.1"
      
      * tag 'regmap-fix-v6.11-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
        regmap: maple: work around gcc-14.1 false-positive warning
    • Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux · de5f4fbe
      Linus Torvalds authored
      Pull clk fixes from Stephen Boyd:
       "A few clk driver fixes for the merge window to fix the build and boot
        on some SoCs.
      
         - Initialize struct clk_init_data in the TI da8xx-cfgchip driver so
           that stack contents aren't used for things like clk flags leading
           to unexpected behavior
      
         - Don't leak stack contents in a debug print in the new Sophgo clk
           driver
      
         - Disable the new T-Head clk driver on 32-bit targets to fix the
           build due to a division
      
         - Fix Samsung Exynos4 fin_pll wreckage from the clkdev rework done
           last cycle by using a struct clk_hw directly instead of a struct
           clk consumer"
      
      * tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
        clk: samsung: fix getting Exynos4 fin_pll rate from external clocks
        clk: T-Head: Disable on 32-bit Targets
        clk: sophgo: clk-sg2042-pll: Fix uninitialized variable in debug output
        clk: davinci: da8xx-cfgchip: Initialize clk_init_data before use
    • Merge tag 'i3c/for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux · c85e1497
      Linus Torvalds authored
      Pull i3c updates from Alexandre Belloni:
       "This cycle, there are new features for the Designware controller and
        fixes for the other IPs:
      
         - dw: optional apb clock and power management support, IBI handling
           fixes
      
         - mipi-i3c-hci: IBI handling fixes
      
         - svc: a few fixes"
      
      * tag 'i3c/for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux:
        dt-bindings: i3c: add header for generic I3C flags
        i3c: master: svc: Fix error code in svc_i3c_master_do_daa_locked()
        i3c: master: Enhance i3c_bus_type visibility for device searching & event monitoring
        i3c: dw: Add power management support
        i3c: dw: Add some functions for reusability
        i3c: dw: Save timing registers and other values
        i3c: master: svc: Improve DAA STOP handle code logic
        i3c: dw: Add optional apb clock
        i3c: dw: Use new *_enabled clk API
        dt-bindings: i3c: dw: Add apb clock binding
        i3c: master: svc: Convert comma to semicolon
        i3c: mipi-i3c-hci: Round IBI data chunk size to HW supported value
        i3c: mipi-i3c-hci: Error out instead on BUG_ON() in IBI DMA setup
        i3c: mipi-i3c-hci: Set IBI Status and Data Ring base addresses
        i3c: mipi-i3c-hci: Switch to lower_32_bits()/upper_32_bits() helpers
        i3c: dw: Remove ibi_capable property
        i3c: dw: Fix IBI intr programming
        i3c: dw: Fix clearing queue thld
        i3c: mipi-i3c-hci: Fix number of DAT/DCT entries for HCI versions < 1.1
        i3c: master: svc: resend target address when get NACK
    • Merge tag 'thermal-6.11-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 1fcaa5db
      Linus Torvalds authored
      Pull thermal control fix from Rafael Wysocki:
       "Prevent the thermal core from flooding the kernel log with useless
        messages if thermal zone temperature can never be determined (or its
        sensor has failed permanently) and make it finally give up and disable
        defective thermal zones (Rafael Wysocki)"
      
      * tag 'thermal-6.11-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        thermal: core: Back off when polling thermal zones on errors
        thermal: trip: Split thermal_zone_device_set_mode()
    • Merge tag 'mm-hotfixes-stable-2024-07-26-14-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm · 7b0acd91
      Linus Torvalds authored
      Pull misc hotfixes from Andrew Morton:
       "11 hotfixes, 7 of which are cc:stable.  7 are MM, 4 are other"
      
      * tag 'mm-hotfixes-stable-2024-07-26-14-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
        nilfs2: handle inconsistent state in nilfs_btnode_create_block()
        selftests/mm: skip test for non-LPA2 and non-LVA systems
        mm/page_alloc: fix pcp->count race between drain_pages_zone() vs __rmqueue_pcplist()
        mm: memcg: add cacheline padding after lruvec in mem_cgroup_per_node
        alloc_tag: outline and export free_reserved_page()
        decompress_bunzip2: fix rare decompression failure
        mm/huge_memory: avoid PMD-size page cache if needed
        mm: huge_memory: use !CONFIG_64BIT to relax huge page alignment on 32 bit machines
        mm: fix old/young bit handling in the faulting path
        dt-bindings: arm: update James Clark's email address
        MAINTAINERS: mailmap: update James Clark's email address
    • Merge tag 'timers-urgent-2024-07-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 5256184b
      Linus Torvalds authored
      Pull timer migration updates from Thomas Gleixner:
       "Fixes and minor updates for the timer migration code:
      
         - Stop testing the group->parent pointer as it is not guaranteed to
           be stable over a chain of operations by design.
      
           This includes a warning which would be nice to have but it produces
           false positives due to the racy nature of the check.
      
         - Plug a race between CPUs going in and out of idle and a CPU hotplug
           operation. The latter can create and connect a new hierarchy level
           which is missed in the concurrent updates of CPUs which go into
           idle. As a result the events of such a CPU might not be processed
           and timers go stale.
      
           Cure it by splitting the hotplug operation into a prepare and
           online callback. The prepare callback is guaranteed to run on an
           online and therefore active CPU. This CPU updates the hierarchy and
           being online ensures that there is always at least one migrator
           active which handles the modified hierarchy correctly when going
           idle. The online callback which runs on the incoming CPU then just
           marks the CPU active and brings it into operation.
      
         - Improve tracing and polish the code further so it is more obvious
           what's going on"
      
      * tag 'timers-urgent-2024-07-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        timers/migration: Fix grammar in comment
        timers/migration: Spare write when nothing changed
        timers/migration: Rename childmask by groupmask to make naming more obvious
        timers/migration: Read childmask and parent pointer in a single place
        timers/migration: Use a single struct for hierarchy walk data
        timers/migration: Improve tracing
        timers/migration: Move hierarchy setup into cpuhotplug prepare callback
        timers/migration: Do not rely always on group->parent
    • Merge tag 'riscv-for-linus-6.11-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux · c9f33436
      Linus Torvalds authored
      Pull more RISC-V updates from Palmer Dabbelt:
      
       - Support for NUMA (via SRAT and SLIT), console output (via SPCR), and
         cache info (via PPTT) on ACPI-based systems.
      
       - The trap entry/exit code no longer breaks the return address stack
         predictor on many systems, which results in an improvement to trap
         latency.
      
       - Support for HAVE_ARCH_STACKLEAK.
      
       - The sv39 linear map has been extended to support 128GiB mappings.
      
       - The frequency of the mtime CSR is now visible via hwprobe.
      
      * tag 'riscv-for-linus-6.11-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (21 commits)
        RISC-V: Provide the frequency of time CSR via hwprobe
        riscv: Extend sv39 linear mapping max size to 128G
        riscv: enable HAVE_ARCH_STACKLEAK
        riscv: signal: Remove unlikely() from WARN_ON() condition
        riscv: Improve exception and system call latency
        RISC-V: Select ACPI PPTT drivers
        riscv: cacheinfo: initialize cacheinfo's level and type from ACPI PPTT
        riscv: cacheinfo: remove the useless input parameter (node) of ci_leaf_init()
        RISC-V: ACPI: Enable SPCR table for console output on RISC-V
        riscv: boot: remove duplicated targets line
        trace: riscv: Remove deprecated kprobe on ftrace support
        riscv: cpufeature: Extract common elements from extension checking
        riscv: Introduce vendor variants of extension helpers
        riscv: Add vendor extensions to /proc/cpuinfo
        riscv: Extend cpufeature.c to detect vendor extensions
        RISC-V: run savedefconfig for defconfig
        RISC-V: hwprobe: sort EXT_KEY()s in hwprobe_isa_ext0() alphabetically
        ACPI: NUMA: replace pr_info with pr_debug in arch_acpi_numa_init
        ACPI: NUMA: change the ACPI_NUMA to a hidden option
        ACPI: NUMA: Add handler for SRAT RINTC affinity structure
        ...
    • Merge tag 'for-linus-6.11-rc1a-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · c17f1224
      Linus Torvalds authored
      Pull xen fixes from Juergen Gross:
       "Two fixes for issues introduced in this merge window:
      
         - fix enhanced debugging in the Xen multicall handling
      
         - two patches fixing a boot failure when running as dom0 in PVH mode"
      
      * tag 'for-linus-6.11-rc1a-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
        x86/xen: fix memblock_reserve() usage on PVH
        x86/xen: move xen_reserve_extra_memory()
        xen: fix multicall debug data referencing
    • hostfs: fix the host directory parse when mounting. · ef9ca17c
      Hongbo Li authored
      hostfs does not keep the host directory when mounting. When the host
      directory is none (default), fc->source is used as the host root
      directory, and this is wrong. Here we use `parse_monolithic` to
      handle the old mount path for parsing the root directory. For the new
      mount path, `parse_param` is used to parse the host directory.
      Reported-and-tested-by: Maciej Żenczykowski <maze@google.com>
      Fixes: cd140ce9 ("hostfs: convert hostfs to use the new mount API")
      Link: https://lore.kernel.org/all/CANP3RGceNzwdb7w=vPf5=7BCid5HVQDmz1K5kC9JG42+HVAh_g@mail.gmail.com/
      Cc: Christian Brauner <brauner@kernel.org>
      Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
      Link: https://lore.kernel.org/r/20240725065130.1821964-1-lihongbo22@huawei.com
      [brauner: minor fixes]
      Signed-off-by: Christian Brauner <brauner@kernel.org>
    • fs: don't allow non-init s_user_ns for filesystems without FS_USERNS_MOUNT · e1c5ae59
      Seth Forshee (DigitalOcean) authored
      Christian noticed that it is possible for a privileged user to mount
      most filesystems with a non-initial user namespace in sb->s_user_ns.
      When fsopen() is called in a non-init namespace the caller's namespace
      is recorded in fs_context->user_ns. If the returned file descriptor is
      then passed to a process privileged in init_user_ns, that process can
      call fsconfig(fd_fs, FSCONFIG_CMD_CREATE), creating a new superblock
      with sb->s_user_ns set to the namespace of the process which called
      fsopen().
      
      This is problematic. We cannot assume that any filesystem which does not
      set FS_USERNS_MOUNT has been written with a non-initial s_user_ns in
      mind, increasing the risk for bugs and security issues.
      
      Prevent this by returning EPERM from sget_fc() when FS_USERNS_MOUNT is
      not set for the filesystem and a non-initial user namespace will be
      used. sget() does not need to be updated as it always uses the user
      namespace of the current context, or the initial user namespace if
      SB_SUBMOUNT is set.
      
      Fixes: cb50b348 ("convenience helpers: vfs_get_super() and sget_fc()")
      Reported-by: Christian Brauner <brauner@kernel.org>
      Signed-off-by: Seth Forshee (DigitalOcean) <sforshee@kernel.org>
      Link: https://lore.kernel.org/r/20240724-s_user_ns-fix-v1-1-895d07c94701@kernel.org
      Reviewed-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
      Signed-off-by: Christian Brauner <brauner@kernel.org>
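      The rule described in this commit message can be sketched as a small
      standalone check. This is a sketch under stated assumptions, not the
      code in sget_fc(): the function name, the boolean parameter and the
      `SKETCH_FS_USERNS_MOUNT` value are stand-ins chosen for illustration.

```c
#include <errno.h>
#include <stdbool.h>

/* Stand-in flag value for this sketch only; the kernel defines its own. */
#define SKETCH_FS_USERNS_MOUNT 0x8u

/*
 * Hypothetical check: refuse to create a superblock owned by a
 * non-initial user namespace unless the filesystem type opted in.
 * Returns 0 when allowed, -EPERM otherwise.
 */
static int sketch_check_user_ns(unsigned int fs_flags, bool ns_is_initial)
{
	if (!ns_is_initial && !(fs_flags & SKETCH_FS_USERNS_MOUNT))
		return -EPERM;
	return 0;
}
```

      Only the combination of a non-initial namespace and a filesystem
      without the flag is rejected; the initial namespace is always allowed.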
    • ALSA: firewire-lib: fix wrong value as length of header for CIP_NO_HEADER case · c1839501
      Takashi Sakamoto authored
      In commit 1d717123 ("ALSA: firewire-lib: Avoid
      -Wflex-array-member-not-at-end warning"), the DEFINE_FLEX() macro was used to
      handle the variable-length array for the header field in struct
      fw_iso_packet. Using the macro has a side effect: the designated
      initializer assigns the count of the array to the given field. Therefore
      CIP_HEADER_QUADLETS (=2) is assigned to struct fw_iso_packet.header,
      while the original designated initializer assigns zero to all fields.
      
      With the CIP_NO_HEADER flag, the change causes an invalid header length in
      isochronous packets for the 1394 OHCI IT context. This bug affects all
      devices supported by the ALSA fireface driver: RME Fireface 400, 800, UCX, UFX,
      and 802.
      
      This commit fixes the bug by replacing it with the alternative version of
      the macro, which corresponds to no initializer.
      
      Cc: stable@vger.kernel.org
      Fixes: 1d717123 ("ALSA: firewire-lib: Avoid -Wflex-array-member-not-at-end warning")
      Reported-by: Edmund Raile <edmund.raile@proton.me>
      Closes: https://lore.kernel.org/r/rrufondjeynlkx2lniot26ablsltnynfaq2gnqvbiso7ds32il@qk4r6xps7jh2/
      Reviewed-by: Takashi Iwai <tiwai@suse.de>
      Link: https://lore.kernel.org/r/20240725155640.128442-1-o-takashi@sakamocchi.jp
      Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
    • Revert "firewire: Annotate struct fw_iso_packet with __counted_by()" · 00e3913b
      Takashi Sakamoto authored
      This reverts commit d3155742.
      
      The header_length field is in bytes, so it cannot express the number of
      elements in the header field. It seems that the argument of the counted_by
      attribute cannot be an arithmetic expression, therefore this commit just
      reverts the commit in question.
      Suggested-by: Gustavo A. R. Silva <gustavoars@kernel.org>
      Link: https://lore.kernel.org/r/20240725161648.130404-1-o-takashi@sakamocchi.jp
      Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
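      The unit mismatch behind this revert can be illustrated with a
      simplified stand-in structure. This is a sketch, not the real struct
      fw_iso_packet: the struct and function names are hypothetical, and only
      the two fields discussed above are modelled. The length field counts
      bytes while the array holds 32-bit quadlets, so the element count
      requires a division that a plain field reference cannot express.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for illustration; not the real struct fw_iso_packet. */
struct sketch_iso_packet {
	uint8_t header_length;	/* length of header[] in bytes */
	uint32_t header[];	/* header quadlets (4 bytes each) */
};

/* Number of header[] elements implied by the byte-based length field. */
static size_t sketch_header_quadlets(const struct sketch_iso_packet *p)
{
	return p->header_length / sizeof(p->header[0]);
}
```

      For a header_length of 8 bytes the array holds 2 quadlets, so using
      the byte count directly as an element count would overstate the array
      size by a factor of four.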
  2. 26 Jul, 2024 17 commits
    • minmax: avoid overly complicated constant expressions in VM code · 3a7e02c0
      Linus Torvalds authored
      The minmax infrastructure is overkill for simple constants, and can
      cause huge expansions because those simple constants are then used by
      other things.
      
      For example, 'pageblock_order' is a core VM constant, but because it was
      implemented using 'min_t()' and all the type-checking that involves, it
      actually expanded to something like 2.5kB of preprocessor noise.
      
      And when that simple constant was then used inside other expansions:
      
        #define pageblock_nr_pages      (1UL << pageblock_order)
        #define pageblock_start_pfn(pfn)  ALIGN_DOWN((pfn), pageblock_nr_pages)
      
      and we then use that inside a 'max()' macro:
      
      	case ISOLATE_SUCCESS:
      		update_cached = false;
      		last_migrated_pfn = max(cc->zone->zone_start_pfn,
      			pageblock_start_pfn(cc->migrate_pfn - 1));
      
      the end result was that one statement expanded to 253kB in size.
      
      There are probably other cases of this, but this one case certainly
      stood out.
      
      I've added 'MIN_T()' and 'MAX_T()' macros for this kind of "core simple
      constant with specific type" use.  These macros skip the type checking,
      and as such need to be very sparingly used only for obvious cases that
      have active issues like this.
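
A minimal userspace sketch of the idea follows. The macro bodies and every *_SKETCH name and value are assumptions for illustration, not the kernel's definitions. The point is only that a plain cast-and-compare expands to a handful of tokens and remains an integer constant expression.

```c
/*
 * Hypothetical stand-ins for a "simple constant with a specific type"
 * min/max: cast both sides to the requested type and compare. There is
 * no type checking, and each argument is evaluated twice, so these are
 * only suitable for side-effect-free constants.
 */
#define MIN_T(type, a, b) ((type)(a) < (type)(b) ? (type)(a) : (type)(b))
#define MAX_T(type, a, b) ((type)(a) > (type)(b) ? (type)(a) : (type)(b))

/* Illustrative values only; real kernel configurations differ. */
#define HUGE_ORDER_SKETCH	9
#define MAX_ORDER_SKETCH	10

#define PAGEBLOCK_ORDER_SKETCH \
	MIN_T(unsigned int, HUGE_ORDER_SKETCH, MAX_ORDER_SKETCH)
#define PAGEBLOCK_NR_PAGES_SKETCH (1UL << PAGEBLOCK_ORDER_SKETCH)

/* The result is still usable where a compile-time constant is required. */
_Static_assert(PAGEBLOCK_NR_PAGES_SKETCH == 512,
	       "order 9 corresponds to 512 pages");
```

Because the expansion is tiny, nesting such a constant inside further macros does not multiply the preprocessed output the way a fully type-checked min() would.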
      Reported-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
      Link: https://lore.kernel.org/all/36aa2cad-1db1-4abf-8dd2-fb20484aabc3@lucifer.local/
      Cc: David Laight <David.Laight@aculab.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      3a7e02c0
    • Linus Torvalds's avatar
      minmax: avoid overly complex min()/max() macro arguments in xen · e8432ac8
      Linus Torvalds authored
      We have some very fancy min/max macros that have tons of sanity checking
      to warn about mixed signedness etc.
      
      This is all things that a sane compiler should warn about, but there are
      no sane compiler interfaces for this, and '-Wsign-compare' is broken [1]
      and not useful.
      
      So then we compensate (some would say over-compensate) by doing the
      checks manually with some truly horrid macro games.
      
      And no, we can't just use __builtin_types_compatible_p(), because the
      whole question of "does it make sense to compare these two values" is a
      lot more complicated than that.
      
      For example, it makes a ton of sense to compare unsigned values with
      simple constants like "5", even if that is indeed a signed type.  So we
      have these very strange macros to try to make sensible type checking
      decisions on the arguments to 'min()' and 'max()'.
      
      But that can cause enormous code expansion if the min()/max() macros are
      used with complicated expressions, and particularly if you nest these
      things so that you get the first big expansion then expanded again.
      
      The xen setup.c file ended up ballooning to over 50MB of preprocessed
      noise that takes 15s to compile (obviously depending on the build host),
      largely due to one single line.
      
      So let's split that one single line to just be simpler.  I think it ends
      up being more legible to humans too at the same time.  Now that single
      file compiles in under a second.
      Reported-and-reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
      Link: https://lore.kernel.org/all/c83c17bb-be75-4c67-979d-54eee38774c6@lucifer.local/
      Link: https://staticthinking.wordpress.com/2023/07/25/wsign-compare-is-garbage/ [1]
      Cc: David Laight <David.Laight@aculab.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e8432ac8
    • Ryusuke Konishi's avatar
      nilfs2: handle inconsistent state in nilfs_btnode_create_block() · 4811f7af
      Ryusuke Konishi authored
      Syzbot reported that a buffer state inconsistency was detected in
      nilfs_btnode_create_block(), triggering a kernel bug.
      
      It is not appropriate to treat this inconsistency as a bug; it can occur
      if the argument block address (the buffer index of the newly created
      block) is a virtual block number and has been reallocated due to
      corruption of the bitmap used to manage its allocation state.
      
      So, modify nilfs_btnode_create_block() and its callers to treat it as a
      possible filesystem error, rather than triggering a kernel bug.
      
      Link: https://lkml.kernel.org/r/20240725052007.4562-1-konishi.ryusuke@gmail.com
      Fixes: a60be987 ("nilfs2: B-tree node cache")
      Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
      Reported-by: syzbot+89cc4f2324ed37988b60@syzkaller.appspotmail.com
      Closes: https://syzkaller.appspot.com/bug?extid=89cc4f2324ed37988b60
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      4811f7af
    • Dev Jain's avatar
      selftests/mm: skip test for non-LPA2 and non-LVA systems · f556acc2
      Dev Jain authored
      Post my improvement of the test in e4a4ba41 ("selftests/mm:
      va_high_addr_switch: dynamically initialize testcases to enable LPA2
      testing"):
      
      The test begins to fail on 4k and 16k pages, on non-LPA2 systems.  To
      reduce noise in the CI systems, let us skip the test when higher address
      space is not implemented.
      
      Link: https://lkml.kernel.org/r/20240718052504.356517-1-dev.jain@arm.com
      Fixes: e4a4ba41 ("selftests/mm: va_high_addr_switch: dynamically initialize testcases to enable LPA2 testing")
      Signed-off-by: Dev Jain <dev.jain@arm.com>
      Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
      Cc: Anshuman Khandual <anshuman.khandual@arm.com>
      Cc: Mark Brown <broonie@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      f556acc2
    • Li Zhijian's avatar
      mm/page_alloc: fix pcp->count race between drain_pages_zone() vs __rmqueue_pcplist() · 66eca102
      Li Zhijian authored
      It's expected that no page should be left in pcp_list after calling
      zone_pcp_disable() in offline_pages().  Previously, it was observed that
      offline_pages() gets stuck [1] due to some pages remaining in pcp_list.
      
      Cause:
      There is a race condition between drain_pages_zone() and __rmqueue_pcplist()
      involving the pcp->count variable. See below scenario:
      
               CPU0                              CPU1
          ----------------                    ---------------
                                            spin_lock(&pcp->lock);
                                            __rmqueue_pcplist() {
      zone_pcp_disable() {
                                              /* list is empty */
                                              if (list_empty(list)) {
                                                /* add pages to pcp_list */
                                                alloced = rmqueue_bulk()
        mutex_lock(&pcp_batch_high_lock)
        ...
        __drain_all_pages() {
          drain_pages_zone() {
            /* read pcp->count, it's 0 here */
            count = READ_ONCE(pcp->count)
            /* 0 means nothing to drain */
                                                /* update pcp->count */
                                                pcp->count += alloced << order;
            ...
                                            ...
                                            spin_unlock(&pcp->lock);
      
      In this case, some pages remain in pcp_list even after zone_pcp_disable()
      has been called.  Because these pages are neither movable nor isolated,
      offline_pages() gets stuck as a result.
      
      Solution:
      Expand the scope of the pcp->lock to also protect pcp->count in
      drain_pages_zone(), to ensure no pages are left in the pcp list after
      zone_pcp_disable().
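
The locking change can be sketched in userspace as follows. This is an illustration under stated assumptions, not the kernel code: a pthread mutex stands in for the pcp spinlock, struct pcp_sketch and drain_all() are hypothetical names, and batch is assumed to be greater than zero. The key point is that the counter is sampled only while the lock is held, so a concurrent refill cannot slip in between the read and the drain decision.

```c
#include <pthread.h>

/* Hypothetical stand-in for the per-cpu page list state. */
struct pcp_sketch {
	pthread_mutex_t lock;
	int count;		/* pages currently on the list */
};

/*
 * Drain everything in chunks of at most 'batch' pages (batch > 0).
 * The count is read under the lock on every iteration; reading it
 * before taking the lock is the racy pattern described above.
 */
static int drain_all(struct pcp_sketch *pcp, int batch)
{
	int drained = 0;
	int count;

	do {
		pthread_mutex_lock(&pcp->lock);
		count = pcp->count;
		if (count) {
			int to_drain = count < batch ? count : batch;

			pcp->count -= to_drain;
			drained += to_drain;
			count -= to_drain;
		}
		pthread_mutex_unlock(&pcp->lock);
	} while (count);

	return drained;
}
```

Dropping the lock between chunks keeps hold times short while still guaranteeing that the loop only stops after observing an empty list under the lock.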
      
      [1] https://lore.kernel.org/linux-mm/6a07125f-e720-404c-b2f9-e55f3f166e85@fujitsu.com/
      
      Link: https://lkml.kernel.org/r/20240723064428.1179519-1-lizhijian@fujitsu.com
      Fixes: 4b23a68f ("mm/page_alloc: protect PCP lists with a spinlock")
      Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
      Reported-by: Yao Xingtao <yaoxt.fnst@fujitsu.com>
      Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      66eca102
    • Roman Gushchin's avatar
      mm: memcg: add cacheline padding after lruvec in mem_cgroup_per_node · f59adcf5
      Roman Gushchin authored
      Oliver Sang reported a performance regression caused by commit
      98c9daf5 ("mm: memcg: guard memcg1-specific members of struct
      mem_cgroup_per_node"), which puts some fields of the mem_cgroup_per_node
      structure under the CONFIG_MEMCG_V1 config option.  Apparently it causes
      false sharing of a cache line between the lruvec and lru_zone_size members
      of the structure.  Fix it by adding explicit padding after the lruvec member.
      
      Even though the padding is not required with CONFIG_MEMCG_V1 set, it seems
      like the introduced memory overhead is not significant enough to warrant
      another divergence in the mem_cgroup_per_node layout, so the padding is
      added unconditionally.
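
The effect of such padding can be sketched in plain C11. The struct below is hypothetical, the 64-byte line size is an assumption, and alignas is used here in place of whatever padding helper the kernel patch itself uses. It only shows how forcing the second member onto its own cache line separates it from the first.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

#define CACHELINE_SKETCH 64	/* assumed cache line size in bytes */

/* Hypothetical layout: two groups of fields updated by different CPUs. */
struct per_node_sketch {
	long lruvec_hot[4];	/* stands in for the lruvec member */
	/* alignas pushes the next member to the start of a new line. */
	alignas(CACHELINE_SKETCH) long lru_zone_size[8];
};

/* Compile-time check that the two members cannot share a cache line. */
static_assert(offsetof(struct per_node_sketch, lru_zone_size)
	      % CACHELINE_SKETCH == 0,
	      "lru_zone_size must start on a cache line boundary");
```

The cost is the unused bytes between the two members, which is the memory overhead the commit message judges to be insignificant.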
      
      Link: https://lkml.kernel.org/r/20240723171244.747521-1-roman.gushchin@linux.dev
      Fixes: 98c9daf5 ("mm: memcg: guard memcg1-specific members of struct mem_cgroup_per_node")
      Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
      Reported-by: kernel test robot <oliver.sang@intel.com>
      Closes: https://lore.kernel.org/oe-lkp/202407121335.31a10cb6-oliver.sang@intel.com
      Tested-by: Oliver Sang <oliver.sang@intel.com>
      Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Muchun Song <muchun.song@linux.dev>
      Cc: Roman Gushchin <roman.gushchin@linux.dev>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      f59adcf5
    • Suren Baghdasaryan's avatar
      alloc_tag: outline and export free_reserved_page() · b3bebe44
      Suren Baghdasaryan authored
      Outline and export free_reserved_page() because modules use it and it in
      turn uses page_ext_{get|put} which should not be exported.  The same
      result could be obtained by outlining {get|put}_page_tag_ref() but that
      would have higher performance impact as these functions are used in more
      performance critical paths.
      
      Link: https://lkml.kernel.org/r/20240717212844.2749975-1-surenb@google.com
      Fixes: dcfe378c ("lib: introduce support for page allocation tagging")
      Signed-off-by: Suren Baghdasaryan <surenb@google.com>
      Reported-by: kernel test robot <lkp@intel.com>
      Closes: https://lore.kernel.org/oe-kbuild-all/202407080044.DWMC9N9I-lkp@intel.com/
      Suggested-by: Christoph Hellwig <hch@infradead.org>
      Suggested-by: Vlastimil Babka <vbabka@suse.cz>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Kent Overstreet <kent.overstreet@linux.dev>
      Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
      Cc: Sourav Panda <souravpanda@google.com>
      Cc: <stable@vger.kernel.org>	[6.10]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      b3bebe44
    • Ross Lagerwall's avatar
      decompress_bunzip2: fix rare decompression failure · bf6acd5d
      Ross Lagerwall authored
      The decompression code parses a huffman tree and counts the number of
      symbols for a given bit length.  In rare cases, there may be >= 256
      symbols with a given bit length, causing the unsigned char to overflow. 
      This causes a decompression failure later when the code tries and fails to
      find the bit length for a given symbol.
      
      Since the maximum number of symbols is 258, use unsigned short instead.
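
The wraparound is easy to reproduce in isolation. The two helpers below are hypothetical and are not taken from the decompressor; they simply count the same number of symbols once with an 8-bit counter and once with a 16-bit one.

```c
#include <stddef.h>

/* Count nsymbols increments in an 8-bit counter: wraps modulo 256. */
static unsigned int count_with_uchar(size_t nsymbols)
{
	unsigned char n = 0;

	for (size_t i = 0; i < nsymbols; i++)
		n++;
	return n;
}

/* Same loop with a 16-bit counter: large enough for up to 258 symbols. */
static unsigned int count_with_ushort(size_t nsymbols)
{
	unsigned short n = 0;

	for (size_t i = 0; i < nsymbols; i++)
		n++;
	return n;
}
```

With 256 symbols of one bit length the 8-bit counter reads 0, and with 258 it reads 2, while the 16-bit counter reports 256 and 258 respectively.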
      
      Link: https://lkml.kernel.org/r/20240717162016.1514077-1-ross.lagerwall@citrix.com
      Fixes: bc22c17e ("bzip2/lzma: library support for gzip, bzip2 and lzma decompression")
      Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
      Cc: Alain Knaff <alain@knaff.lu>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      bf6acd5d
    • Gavin Shan's avatar
      mm/huge_memory: avoid PMD-size page cache if needed · d659b715
      Gavin Shan authored
      xarray can't support arbitrary page cache sizes.  The largest supported
      page cache size is defined as MAX_PAGECACHE_ORDER by commit 099d9064
      ("mm/filemap: make MAX_PAGECACHE_ORDER acceptable to xarray").  However,
      it's possible to have a 512MB page cache in the huge memory collapsing
      path on ARM64 systems whose base page size is 64KB.  A 512MB page cache
      exceeds that limit, and a warning is raised when the xarray entry is
      split, as shown in the following example.
      
      [root@dhcp-10-26-1-207 ~]# cat /proc/1/smaps | grep KernelPageSize
      KernelPageSize:       64 kB
      [root@dhcp-10-26-1-207 ~]# cat /tmp/test.c
         :
      int main(int argc, char **argv)
      {
      	const char *filename = TEST_XFS_FILENAME;
      	int fd = 0;
      	void *buf = (void *)-1, *p;
      	int pgsize = getpagesize();
      	int ret = 0;
      
      	if (pgsize != 0x10000) {
      		fprintf(stdout, "System with 64KB base page size is required!\n");
      		return -EPERM;
      	}
      
      	system("echo 0 > /sys/devices/virtual/bdi/253:0/read_ahead_kb");
      	system("echo 1 > /proc/sys/vm/drop_caches");
      
      	/* Open the xfs file */
      	fd = open(filename, O_RDONLY);
      	assert(fd > 0);
      
      	/* Create VMA */
      	buf = mmap(NULL, TEST_MEM_SIZE, PROT_READ, MAP_SHARED, fd, 0);
      	assert(buf != (void *)-1);
      	fprintf(stdout, "mapped buffer at 0x%p\n", buf);
      
      	/* Populate VMA */
      	ret = madvise(buf, TEST_MEM_SIZE, MADV_NOHUGEPAGE);
      	assert(ret == 0);
      	ret = madvise(buf, TEST_MEM_SIZE, MADV_POPULATE_READ);
      	assert(ret == 0);
      
      	/* Collapse VMA */
      	ret = madvise(buf, TEST_MEM_SIZE, MADV_HUGEPAGE);
      	assert(ret == 0);
      	ret = madvise(buf, TEST_MEM_SIZE, MADV_COLLAPSE);
      	if (ret) {
      		fprintf(stdout, "Error %d to madvise(MADV_COLLAPSE)\n", errno);
      		goto out;
      	}
      
      	/* Split xarray entry. Write permission is needed */
      	munmap(buf, TEST_MEM_SIZE);
      	buf = (void *)-1;
      	close(fd);
      	fd = open(filename, O_RDWR);
      	assert(fd > 0);
      	fallocate(fd, FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE,
       		  TEST_MEM_SIZE - pgsize, pgsize);
      out:
      	if (buf != (void *)-1)
      		munmap(buf, TEST_MEM_SIZE);
      	if (fd > 0)
      		close(fd);
      
      	return ret;
      }
      
      [root@dhcp-10-26-1-207 ~]# gcc /tmp/test.c -o /tmp/test
      [root@dhcp-10-26-1-207 ~]# /tmp/test
       ------------[ cut here ]------------
       WARNING: CPU: 25 PID: 7560 at lib/xarray.c:1025 xas_split_alloc+0xf8/0x128
       Modules linked in: nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib    \
       nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct      \
       nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4      \
       ip_set rfkill nf_tables nfnetlink vfat fat virtio_balloon drm fuse   \
       xfs libcrc32c crct10dif_ce ghash_ce sha2_ce sha256_arm64 virtio_net  \
       sha1_ce net_failover virtio_blk virtio_console failover dimlib virtio_mmio
       CPU: 25 PID: 7560 Comm: test Kdump: loaded Not tainted 6.10.0-rc7-gavin+ #9
       Hardware name: QEMU KVM Virtual Machine, BIOS edk2-20240524-1.el9 05/24/2024
       pstate: 83400005 (Nzcv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
       pc : xas_split_alloc+0xf8/0x128
       lr : split_huge_page_to_list_to_order+0x1c4/0x780
       sp : ffff8000ac32f660
       x29: ffff8000ac32f660 x28: ffff0000e0969eb0 x27: ffff8000ac32f6c0
       x26: 0000000000000c40 x25: ffff0000e0969eb0 x24: 000000000000000d
       x23: ffff8000ac32f6c0 x22: ffffffdfc0700000 x21: 0000000000000000
       x20: 0000000000000000 x19: ffffffdfc0700000 x18: 0000000000000000
       x17: 0000000000000000 x16: ffffd5f3708ffc70 x15: 0000000000000000
       x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
       x11: ffffffffffffffc0 x10: 0000000000000040 x9 : ffffd5f3708e692c
       x8 : 0000000000000003 x7 : 0000000000000000 x6 : ffff0000e0969eb8
       x5 : ffffd5f37289e378 x4 : 0000000000000000 x3 : 0000000000000c40
       x2 : 000000000000000d x1 : 000000000000000c x0 : 0000000000000000
       Call trace:
        xas_split_alloc+0xf8/0x128
        split_huge_page_to_list_to_order+0x1c4/0x780
        truncate_inode_partial_folio+0xdc/0x160
        truncate_inode_pages_range+0x1b4/0x4a8
        truncate_pagecache_range+0x84/0xa0
        xfs_flush_unmap_range+0x70/0x90 [xfs]
        xfs_file_fallocate+0xfc/0x4d8 [xfs]
        vfs_fallocate+0x124/0x2f0
        ksys_fallocate+0x4c/0xa0
        __arm64_sys_fallocate+0x24/0x38
        invoke_syscall.constprop.0+0x7c/0xd8
        do_el0_svc+0xb4/0xd0
        el0_svc+0x44/0x1d8
        el0t_64_sync_handler+0x134/0x150
        el0t_64_sync+0x17c/0x180
      
      Fix it by correcting the supported page cache orders, different sets for
      DAX and other files.  With it corrected, 512MB page cache becomes
      disallowed on all non-DAX files on ARM64 system where the base page size
      is 64KB.  After this patch is applied, the test program fails with error
      -EINVAL returned from __thp_vma_allowable_orders() and the madvise()
      system call to collapse the page caches.
      
      Link: https://lkml.kernel.org/r/20240715000423.316491-1-gshan@redhat.com
      Fixes: 6b24ca4a ("mm: Use multi-index entries in the page cache")
      Signed-off-by: Gavin Shan <gshan@redhat.com>
      Acked-by: David Hildenbrand <david@redhat.com>
      Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
      Acked-by: Zi Yan <ziy@nvidia.com>
      Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
      Cc: Barry Song <baohua@kernel.org>
      Cc: Don Dutile <ddutile@redhat.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Peter Xu <peterx@redhat.com>
      Cc: Ryan Roberts <ryan.roberts@arm.com>
      Cc: William Kucharski <william.kucharski@oracle.com>
      Cc: <stable@vger.kernel.org>	[5.17+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      d659b715
    • Yang Shi's avatar
      mm: huge_memory: use !CONFIG_64BIT to relax huge page alignment on 32 bit machines · d9592025
      Yang Shi authored
      Yves-Alexis Perez reported commit 4ef9ad19 ("mm: huge_memory: don't
      force huge page alignment on 32 bit") didn't work for x86_32 [1].  It is
      because x86_32 uses CONFIG_X86_32 instead of CONFIG_32BIT.
      
      !CONFIG_64BIT should cover all 32 bit machines.
      
      [1] https://lore.kernel.org/linux-mm/CAHbLzkr1LwH3pcTgM+aGQ31ip2bKqiqEQ8=FQB+t2c3dhNKNHA@mail.gmail.com/
      
      Link: https://lkml.kernel.org/r/20240712155855.1130330-1-yang@os.amperecomputing.com
      Fixes: 4ef9ad19 ("mm: huge_memory: don't force huge page alignment on 32 bit")
      Signed-off-by: Yang Shi <yang@os.amperecomputing.com>
      Reported-by: Yves-Alexis Perez <corsac@debian.org>
      Tested-by: Yves-Alexis Perez <corsac@debian.org>
      Acked-by: David Hildenbrand <david@redhat.com>
      Cc: Ben Hutchings <ben@decadent.org.uk>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Jiri Slaby <jirislaby@kernel.org>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Salvatore Bonaccorso <carnil@debian.org>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: <stable@vger.kernel.org>	[6.8+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      d9592025
    • Ram Tummala's avatar
      mm: fix old/young bit handling in the faulting path · 4cd7ba16
      Ram Tummala authored
      Commit 3bd786f7 ("mm: convert do_set_pte() to set_pte_range()")
      replaced do_set_pte() with set_pte_range() and that introduced a
      regression in the following faulting path of non-anonymous vmas which
      caused the PTE for the faulting address to be marked as old instead of
      young.
      
      handle_pte_fault()
        do_pte_missing()
          do_fault()
            do_read_fault() || do_cow_fault() || do_shared_fault()
              finish_fault()
                set_pte_range()
      
      The polarity of prefault calculation is incorrect.  This leads to prefault
      being incorrectly set for the faulting address.  The following check will
      incorrectly mark the PTE old rather than young.  On some architectures
      this will cause a double fault to mark it young when the access is
      retried.
      
          if (prefault && arch_wants_old_prefaulted_pte())
              entry = pte_mkold(entry);
      
      On a subsequent fault on the same address, the faulting path will see a
      non NULL vmf->pte and instead of reaching the do_pte_missing() path, PTE
      will then be correctly marked young in handle_pte_fault() itself.
      
      Due to this bug, performance degradation in the fault handling path will
      be observed due to unnecessary double faulting.
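
The polarity problem can be sketched as follows, assuming the prefault flag is derived from whether the faulting address lies inside the range being mapped. That derivation, the helper in_range_sketch(), and both function names are assumptions for illustration and are not copied from the kernel source.

```c
#include <stdbool.h>

/* Hypothetical helper: true if start <= val < start + len. */
static bool in_range_sketch(unsigned long val, unsigned long start,
			    unsigned long len)
{
	return val - start < len;
}

/*
 * Inverted polarity: the faulting address itself is reported as a
 * prefault, so its PTE would be marked old.
 */
static bool prefault_buggy(unsigned long fault_addr, unsigned long addr,
			   unsigned long len)
{
	return in_range_sketch(fault_addr, addr, len);
}

/*
 * Intended polarity: only a range that does not contain the faulting
 * address counts as prefaulted.
 */
static bool prefault_fixed(unsigned long fault_addr, unsigned long addr,
			   unsigned long len)
{
	return !in_range_sketch(fault_addr, addr, len);
}
```

With the intended polarity, the range containing the faulting address is never treated as a prefault, so the pte_mkold() branch shown above is skipped for it and the retry fault disappears.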
      
      Link: https://lkml.kernel.org/r/20240710014539.746200-1-rtummala@nvidia.com
      Fixes: 3bd786f7 ("mm: convert do_set_pte() to set_pte_range()")
      Signed-off-by: Ram Tummala <rtummala@nvidia.com>
      Reviewed-by: Yin Fengwei <fengwei.yin@intel.com>
      Cc: Alistair Popple <apopple@nvidia.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Yin Fengwei <fengwei.yin@intel.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      4cd7ba16
    • James Clark's avatar
      dt-bindings: arm: update James Clark's email address · 34e526f6
      James Clark authored
      My new address is james.clark@linaro.org
      
      Link: https://lkml.kernel.org/r/20240709102512.31212-3-james.clark@linaro.org
      Signed-off-by: James Clark <james.clark@linaro.org>
      Cc: Bjorn Andersson <quic_bjorande@quicinc.com>
      Cc: Conor Dooley <conor+dt@kernel.org>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Geliang Tang <geliang@kernel.org>
      Cc: Hao Zhang <quic_hazha@quicinc.com>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Cc: Jiri Kosina <jikos@kernel.org>
      Cc: Kees Cook <kees@kernel.org>
      Cc: Krzysztof Kozlowski <krzk+dt@kernel.org>
      Cc: Mao Jinlong <quic_jinlmao@quicinc.com>
      Cc: Matthieu Baerts <matttbe@kernel.org>
      Cc: Matt Ranostay <matt@ranostay.sg>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Oleksij Rempel <o.rempel@pengutronix.de>
      Cc: Rob Herring (Arm) <robh@kernel.org>
      Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      34e526f6
    • James Clark's avatar
      MAINTAINERS: mailmap: update James Clark's email address · 5bf6f3c5
      James Clark authored
      My new address is james.clark@linaro.org
      
      Link: https://lkml.kernel.org/r/20240709102512.31212-2-james.clark@linaro.org
      Signed-off-by: James Clark <james.clark@linaro.org>
      Cc: Bjorn Andersson <quic_bjorande@quicinc.com>
      Cc: Conor Dooley <conor+dt@kernel.org>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Geliang Tang <geliang@kernel.org>
      Cc: Hao Zhang <quic_hazha@quicinc.com>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Cc: Jiri Kosina <jikos@kernel.org>
      Cc: Kees Cook <kees@kernel.org>
      Cc: Krzysztof Kozlowski <krzk+dt@kernel.org>
      Cc: Mao Jinlong <quic_jinlmao@quicinc.com>
      Cc: Matthieu Baerts <matttbe@kernel.org>
      Cc: Matt Ranostay <matt@ranostay.sg>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Oleksij Rempel <o.rempel@pengutronix.de>
      Cc: Rob Herring (Arm) <robh@kernel.org>
      Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      5bf6f3c5
    • Rob Herring (Arm)'s avatar
      dt-bindings: iio: adc: ad7192: Fix 'single-channel' constraints · 6dc55268
      Rob Herring (Arm) authored
      The 'single-channel' property is an uint32, not an array, so 'items' is
      an incorrect constraint. This didn't matter until dtschema recently
      changed how properties are decoded. This results in this warning:
      
      Documentation/devicetree/bindings/iio/adc/adi,ad7192.example.dtb: adc@0: \
        channel@1:single-channel: 1 is not of type 'array'
      
      Fixes: caf7b763 ("dt-bindings: iio: adc: ad7192: Add AD7194 support")
      Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
      Link: https://lore.kernel.org/r/20240723230904.1299744-1-robh@kernel.org
      Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
      6dc55268
    • Linus Torvalds's avatar
      Merge tag 'auxdisplay-for-v6.11-tag1' of... · 2f8c4f50
      Linus Torvalds authored
      Merge tag 'auxdisplay-for-v6.11-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k
      
      Pull auxdisplay updates from Geert Uytterhoeven:
      
        - add support for configuring the boot message on line displays
      
        - miscellaneous fixes and improvements
      
      * tag 'auxdisplay-for-v6.11-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
        auxdisplay: ht16k33: Drop reference after LED registration
        auxdisplay: Use sizeof(*pointer) instead of sizeof(type)
        auxdisplay: hd44780: add missing MODULE_DESCRIPTION() macro
        auxdisplay: linedisp: add missing MODULE_DESCRIPTION() macro
        auxdisplay: linedisp: Support configuring the boot message
        auxdisplay: charlcd: Provide a forward declaration
      2f8c4f50
    • Linus Torvalds's avatar
      Merge tag 'sound-fix-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · eb966e0c
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
       "A collection of fixes gathered since the previous pull.
      
        We see a bit large LOCs at a HD-audio quirk, but that's only bulk COEF
        data, hence it's safe to take. In addition to that, there were two
        minor fixes for MIDI 2.0 handling for ALSA core, and the rest are all
        rather random small and device-specific fixes"
      
      * tag 'sound-fix-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
        ASoC: fsl-asoc-card: Dynamically allocate memory for snd_soc_dai_link_components
        ASoC: amd: yc: Support mic on Lenovo Thinkpad E16 Gen 2
        ALSA: hda/realtek: Implement sound init sequence for Samsung Galaxy Book3 Pro 360
        ALSA: hda/realtek: cs35l41: Fixup remaining asus strix models
        ASoC: SOF: ipc4-topology: Preserve the DMA Link ID for ChainDMA on unprepare
        ASoC: SOF: ipc4-topology: Only handle dai_config with HW_PARAMS for ChainDMA
        ALSA: ump: Force 1 Group for MIDI1 FBs
        ALSA: ump: Don't update FB name for static blocks
        ALSA: usb-audio: Add a quirk for Sonix HD USB Camera
        ASoC: TAS2781: Fix tasdev_load_calibrated_data()
        ASoC: tegra: select CONFIG_SND_SIMPLE_CARD_UTILS
        ASoC: Intel: use soc_intel_is_byt_cr() only when IOSF_MBI is reachable
        ALSA: usb-audio: Move HD Webcam quirk to the right place
        ALSA: hda: tas2781: mark const variables as __maybe_unused
        ALSA: usb-audio: Fix microphone sound on HD webcam.
        ASoC: sof: amd: fix for firmware reload failure in Vangogh platform
        ASoC: Intel: Fix RT5650 SSP lookup
        ASOC: SOF: Intel: hda-loader: only wait for HDaudio IOC for IPC4 devices
        ASoC: SOF: imx8m: Fix DSP control regmap retrieval
      eb966e0c
    • Linus Torvalds's avatar
      Merge tag 'drm-next-2024-07-26' of https://gitlab.freedesktop.org/drm/kernel · 0ba9b155
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Fixes for rc1, mostly amdgpu, i915 and xe, with some other misc ones,
        doesn't seem to be anything too serious.
      
        amdgpu:
         - Bump driver version for GFX12 DCC
         - DC documentation warning fixes
         - VCN unified queue power fix
         - SMU fix
         - RAS fix
         - Display corruption fix
         - SDMA 5.2 workaround
         - GFX12 fixes
         - Uninitialized variable fix
         - VCN/JPEG 4.0.3 fixes
         - Misc display fixes
         - RAS fixes
         - VCN4/5 harvest fix
         - GPU reset fix
      
        i915:
         - Reset intel_dp->link_trained before retraining the link
         - Don't switch the LTTPR mode on an active link
         - Do not consider preemption during execlists_dequeue for gen8
         - Allow NULL memory region
      
        xe:
         - xe_exec ioctl minor fix on sync entry cleanup upon error
         - SRIOV: limit VF LMEM provisioning
         - Wedge mode fixes
      
        v3d:
         - fix indirect dispatch on newer v3d revs
      
        panel:
         - fix panel backlight bindings"
      
      * tag 'drm-next-2024-07-26' of https://gitlab.freedesktop.org/drm/kernel: (39 commits)
        drm/amdgpu: reset vm state machine after gpu reset(vram lost)
        drm/amdgpu: add missed harvest check for VCN IP v4/v5
        drm/amdgpu: Fix eeprom max record count
        drm/amdgpu: fix ras UE error injection failure issue
        drm/amd/display: Remove ASSERT if significance is zero in math_ceil2
        drm/amd/display: Check for NULL pointer
        drm/amdgpu/vcn: Use offsets local to VCN/JPEG in VF
        drm/amdgpu: Add empty HDP flush function to VCN v4.0.3
        drm/amdgpu: Add empty HDP flush function to JPEG v4.0.3
        drm/amd/amdgpu: Fix uninitialized variable warnings
        drm/amdgpu: Fix atomics on GFX12
        drm/amdgpu/sdma5.2: Update wptr registers as well as doorbell
        drm/i915: Allow NULL memory region
        drm/i915/gt: Do not consider preemption during execlists_dequeue for gen8
        dt-bindings: display: panel: samsung,atna33xc20: Document ATNA45AF01
        drm/xe: Don't suspend device upon wedge
        drm/xe: Wedge the entire device
        drm/xe/pf: Limit fair VF LMEM provisioning
        drm/xe/exec: Fix minor bug related to xe_sync_entry_cleanup
        drm/amd/display: fix corruption with high refresh rates on DCN 3.0
        ...
      0ba9b155