1. 29 Aug, 2023 3 commits
  2. 28 Aug, 2023 36 commits
    • Merge tag 'sched-core-2023-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 3ca9a836
      Linus Torvalds authored
      Pull scheduler updates from Ingo Molnar:
      
       - The biggest change is the introduction of a new iteration of the
         SCHED_FAIR interactivity code: the EEVDF ("Earliest Eligible Virtual
         Deadline First") scheduler
      
         EEVDF too is a virtual-time scheduler, with two parameters (weight
         and relative deadline), compared to CFS that had weight only. It
         completely reworks the base scheduler: placement, preemption, picking
         -- everything
      
         LWN.net, as usual, has a terrific writeup about EEVDF:
      
            https://lwn.net/Articles/925371/
      
         Preemption (both tick and wakeup) is driven by testing against a
         fresh pick. Because the tree is now effectively an interval tree, and
         the selection is no longer the 'leftmost' task, over-scheduling is
         less of a problem. A lot of the CFS heuristics are removed or
         replaced by more natural latency-space parameters & constructs
      
         In terms of expected performance regressions: we will and can fix
         everything where a 'good' workload misbehaves with the new scheduler,
         but EEVDF inevitably changes workload scheduling in a binary fashion,
         hopefully for the better in the overwhelming majority of cases, but
         in some cases it won't, especially in adversarial loads that got
         lucky with the previous code, such as some variants of hackbench. We
         are trying hard to err on the side of fixing all performance
         regressions, but we expect some inevitable post-release iterations of
         that process
      
       - Improve load-balancing on hybrid x86 systems: enable cluster
         scheduling (again)
      
       - Improve & fix bandwidth-scheduling on nohz systems
      
       - Improve bandwidth-throttling
      
       - Use lock guards to simplify and de-goto-ify control flow
      
       - Misc improvements, cleanups and fixes
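
The EEVDF description above (pick among eligible entities by earliest virtual deadline rather than taking the 'leftmost' one) can be illustrated with a minimal sketch. It assumes a flat array and a linear scan, whereas the kernel uses an augmented tree; `struct toy_entity` and `toy_eevdf_pick()` are hypothetical names, not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified entity; not the kernel's struct sched_entity. */
struct toy_entity {
    int64_t weight;    /* load weight, must be > 0 */
    int64_t vruntime;  /* virtual time this entity has received so far */
    int64_t deadline;  /* virtual deadline by which its next slice is due */
};

/*
 * Return the index of the eligible entity with the earliest virtual
 * deadline, or -1 if the queue is empty.
 *
 * An entity counts as eligible when its vruntime does not exceed the
 * weighted average V = sum(w_i * v_i) / sum(w_i). To avoid a division,
 * the test is written as v * sum(w_i) <= sum(w_i * v_i).
 */
static int toy_eevdf_pick(const struct toy_entity *rq, size_t n)
{
    int64_t wsum = 0, wvsum = 0;
    int best = -1;
    size_t i;

    for (i = 0; i < n; i++) {
        assert(rq[i].weight > 0);
        wsum += rq[i].weight;
        wvsum += rq[i].weight * rq[i].vruntime;
    }

    for (i = 0; i < n; i++) {
        if (rq[i].vruntime * wsum > wvsum)
            continue;  /* ran ahead of the average: not eligible */
        if (best < 0 || rq[i].deadline < rq[best].deadline)
            best = (int)i;
    }
    return best;
}
```

With entities {weight 1, vruntime 10, deadline 40}, {1, 20, 25} and {2, 30, 22}, the weighted average vruntime is 22.5, so the third entity is ineligible despite having the earliest deadline, and the sketch picks the second one rather than the entity with the smallest vruntime.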
      
      * tag 'sched-core-2023-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (43 commits)
        sched/eevdf/doc: Modify the documented knob to base_slice_ns as well
        sched/eevdf: Curb wakeup-preemption
        sched: Simplify sched_core_cpu_{starting,deactivate}()
        sched: Simplify try_steal_cookie()
        sched: Simplify sched_tick_remote()
        sched: Simplify sched_exec()
        sched: Simplify ttwu()
        sched: Simplify wake_up_if_idle()
        sched: Simplify: migrate_swap_stop()
        sched: Simplify sysctl_sched_uclamp_handler()
        sched: Simplify get_nohz_timer_target()
        sched/rt: sysctl_sched_rr_timeslice show default timeslice after reset
        sched/rt: Fix sysctl_sched_rr_timeslice intial value
        sched/fair: Block nohz tick_stop when cfs bandwidth in use
        sched, cgroup: Restore meaning to hierarchical_quota
        MAINTAINERS: Add Peter explicitly to the psi section
        sched/psi: Select KERNFS as needed
        sched/topology: Align group flags when removing degenerate domain
        sched/fair: remove util_est boosting
        sched/fair: Propagate enqueue flags into place_entity()
        ...
    • Merge tag 'perf-core-2023-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 1a7c6115
      Linus Torvalds authored
      Pull perf event updates from Ingo Molnar:
      
       - AMD IBS improvements
      
       - Intel PMU driver updates
      
       - Extend core perf facilities & the ARM PMU driver to better handle ARM big.LITTLE events
      
       - Micro-optimize software events and the ring-buffer code
      
       - Misc cleanups & fixes
      
      * tag 'perf-core-2023-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf/x86/uncore: Remove unnecessary ?: operator around pcibios_err_to_errno() call
        perf/x86/intel: Add Crestmont PMU
        x86/cpu: Update Hybrids
        x86/cpu: Fix Crestmont uarch
        x86/cpu: Fix Gracemont uarch
        perf: Remove unused extern declaration arch_perf_get_page_size()
        perf: Remove unused PERF_PMU_CAP_HETEROGENEOUS_CPUS capability
        arm_pmu: Remove unused PERF_PMU_CAP_HETEROGENEOUS_CPUS capability
        perf/x86: Remove unused PERF_PMU_CAP_HETEROGENEOUS_CPUS capability
        arm_pmu: Add PERF_PMU_CAP_EXTENDED_HW_TYPE capability
        perf/x86/ibs: Set mem_lvl_num, mem_remote and mem_hops for data_src
        perf/mem: Add PERF_MEM_LVLNUM_NA to PERF_MEM_NA
        perf/mem: Introduce PERF_MEM_LVLNUM_UNC
        perf/ring_buffer: Use local_try_cmpxchg in __perf_output_begin
        locking/arch: Avoid variable shadowing in local_try_cmpxchg()
        perf/core: Use local64_try_cmpxchg in perf_swevent_set_period
        perf/x86: Use local64_try_cmpxchg
        perf/amd: Prevent grouping of IBS events
    • Merge tag 'locking-core-2023-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · d637fce0
      Linus Torvalds authored
      Pull locking update from Ingo Molnar:
       "Simplify the locking self-tests via using the new <linux/cleanup.h>
        facilities for lock guards"
      
      * tag 'locking-core-2023-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        lockdep/selftests: Use SBRM APIs for wait context tests
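
As a rough illustration of what a lock guard buys, the sketch below uses the GCC/Clang `cleanup` variable attribute, the compiler feature that scope-based resource management in C builds on. The lock type, `TOY_GUARD()` and the helper functions are hypothetical stand-ins, not the kernel's `guard()` API.

```c
#include <assert.h>

/* Hypothetical toy lock that only records its state. */
struct toy_lock {
    int held;
    int releases;
};

static struct toy_lock *toy_lock_acquire(struct toy_lock *l)
{
    assert(!l->held);
    l->held = 1;
    return l;
}

/* Runs automatically when the guard variable goes out of scope. */
static void toy_guard_exit(struct toy_lock **lp)
{
    (*lp)->held = 0;
    (*lp)->releases++;
}

/* Declare a scoped guard: acquire now, release at end of scope. */
#define TOY_GUARD(name, lock) \
    struct toy_lock *name __attribute__((cleanup(toy_guard_exit))) = \
        toy_lock_acquire(lock)

static int toy_update(struct toy_lock *l, int *counter, int delta)
{
    TOY_GUARD(g, l);

    if (delta < 0)
        return -1;  /* early return: no goto/unlock label needed */

    *counter += delta;
    return 0;
}
```

Both the early-return path and the normal path of `toy_update()` release the lock exactly once, which is the control-flow simplification the guards are meant to provide.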
    • Merge tag 'efi-next-for-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi · d7dd9b44
      Linus Torvalds authored
      Pull EFI updates from Ard Biesheuvel:
       "This primarily covers some cleanup work on the EFI runtime wrappers,
        which are shared between all EFI architectures except Itanium, and
        which provide some level of isolation to prevent faults occurring in
        the firmware code (which runs at the same privilege level as the
        kernel) from bringing down the system.
      
        Beyond that, there is a fix that did not make it into v6.5, and some
        doc fixes and dead code cleanup.
      
         - one bugfix for x86 mixed mode that did not make it into v6.5
      
         - first pass of cleanup for the EFI runtime wrappers
      
         - some cosmetic touchups"
      
      * tag 'efi-next-for-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
        x86/efistub: Fix PCI ROM preservation in mixed mode
        efi/runtime-wrappers: Clean up white space and add __init annotation
        acpi/prmt: Use EFI runtime sandbox to invoke PRM handlers
        efi/runtime-wrappers: Don't duplicate setup/teardown code
        efi/runtime-wrappers: Remove duplicated macro for service returning void
        efi/runtime-wrapper: Move workqueue manipulation out of line
        efi/runtime-wrappers: Use type safe encapsulation of call arguments
        efi/riscv: Move EFI runtime call setup/teardown helpers out of line
        efi/arm64: Move EFI runtime call setup/teardown helpers out of line
        efi/riscv: libstub: Fix comment about absolute relocation
        efi: memmap: Remove kernel-doc warnings
        efi: Remove unused extern declaration efi_lookup_mapped_addr()
    • Merge tag 'x86_microcode_for_v6.6_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 42a7f6e3
      Linus Torvalds authored
      Pull x86 microcode loading updates from Borislav Petkov:
       "The first, cleanup part of the microcode loader reorg tglx has been
        working on. The other part wasn't fully ready in time so it will
        follow on later.
      
        This part makes the loader core code, as it is enabled on pretty
        much every baremetal machine, so there's no need to have the
        Kconfig items.
      
        In addition, there are cleanups which prepare for future feature
        enablement"
      
      * tag 'x86_microcode_for_v6.6_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/microcode: Remove remaining references to CONFIG_MICROCODE_AMD
        x86/microcode/intel: Remove pointless mutex
        x86/microcode/intel: Remove debug code
        x86/microcode: Move core specific defines to local header
        x86/microcode/intel: Rename get_datasize() since its used externally
        x86/microcode: Make reload_early_microcode() static
        x86/microcode: Include vendor headers into microcode.h
        x86/microcode/intel: Move microcode functions out of cpu/intel.c
        x86/microcode: Hide the config knob
        x86/mm: Remove unused microcode.h include
        x86/microcode: Remove microcode_mutex
        x86/microcode/AMD: Rip out static buffers
    • Merge tag 'x86_sev_for_v6.6_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · f31f663f
      Linus Torvalds authored
      Pull x86 SEV updates from Borislav Petkov:
      
       - Handle the case where the starting virtual address of the range
         whose SEV encryption status needs to change is not page aligned.
         Callers that round up the number of pages to be decrypted would
         otherwise mark a trailing page as decrypted and thus cause
         corruption during live migration.
      
       - Return an error from the #VC handler on AMD SEV-* guests when debug
         register swapping is enabled, as a DR7 access should not happen
         then: that register is guest/host switched.
      
      * tag 'x86_sev_for_v6.6_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/sev: Make enc_dec_hypercall() accept a size instead of npages
        x86/sev: Do not handle #VC for DR7 read/write
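
The page-alignment problem in the first item can be made concrete with a small arithmetic sketch. It assumes 4 KiB pages and a loop that steps to the next page boundary; the function names are hypothetical and this is not the kernel's `enc_dec_hypercall()`, only one way the described over-marking can arise.

```c
#include <assert.h>

#define TOY_PAGE_SIZE 4096UL

/* Count the pages visited when walking [vaddr, vaddr_end) page by page. */
static unsigned long toy_pages_touched(unsigned long vaddr,
                                       unsigned long vaddr_end)
{
    unsigned long n = 0;

    while (vaddr < vaddr_end) {
        n++;
        /* advance to the start of the next page */
        vaddr = (vaddr & ~(TOY_PAGE_SIZE - 1)) + TOY_PAGE_SIZE;
    }
    return n;
}

/* Interface taking a page count: the caller has to round the size up. */
static unsigned long toy_touched_by_npages(unsigned long vaddr,
                                           unsigned long size)
{
    unsigned long npages = (size + TOY_PAGE_SIZE - 1) / TOY_PAGE_SIZE;

    return toy_pages_touched(vaddr, vaddr + npages * TOY_PAGE_SIZE);
}

/* Interface taking a size: the end address is exact. */
static unsigned long toy_touched_by_size(unsigned long vaddr,
                                         unsigned long size)
{
    return toy_pages_touched(vaddr, vaddr + size);
}
```

For a range starting at 0x1800 with size 0x800 (entirely inside one page), the page-count variant walks up to 0x2800 and touches two pages, while the size variant stops at 0x2000 and touches one. For page-aligned starts the two agree.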
    • Merge tag 'ras_core_for_v6.6_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 28c59d94
      Linus Torvalds authored
      Pull x86 RAS updates from Borislav Petkov:
      
       - Add a quirk for AMD Zen machines where Instruction Fetch unit poison
         consumption MCEs are not delivered synchronously but still within the
         same context, which can lead to erroneously increased error severity
         and unneeded kernel panics
      
       - Do not log errors caught by polling shared MCA banks as they
         materialize as duplicated error records otherwise
      
      * tag 'ras_core_for_v6.6_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/MCE: Always save CS register on AMD Zen IF Poison errors
        x86/mce: Prevent duplicate error records
    • Merge tag 'x86_misc_for_v6.6_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 7e5e832c
      Linus Torvalds authored
      Pull misc x86 updates from Borislav Petkov:
      
       - Add PCI device IDs for new AMD family 0x1a CPUs and use them in the
         respective drivers
      
       - Update HPE Superdome Flex maintainers list
      
      * tag 'x86_misc_for_v6.6_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/uv: Update HPE Superdome Flex Maintainers
        EDAC/amd64: Add support for AMD family 1Ah models 00h-1Fh and 40h-4Fh
        hwmon: (k10temp) Add thermal support for AMD Family 1Ah-based models
        x86/amd_nb: Add PCI IDs for AMD Family 1Ah-based models
    • Merge tag 'x86_boot_for_v6.6_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · bd9e99f7
      Linus Torvalds authored
      Pull x86 boot updates from Borislav Petkov:
       "Avoid the baremetal decompressor code when booting on an EFI machine.
      
        This is mandated by the current tightening of EFI executables
        requirements when used in a secure boot scenario. More specifically,
        an EFI executable cannot have a single section with RWX permissions,
        which conflicts with the in-place kernel decompression that is done
        today.
      
        Instead, the things required by the booting kernel image are done in
        the EFI stub now.
      
        Work by Ard Biesheuvel"
      
      * tag 'x86_boot_for_v6.6_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
        x86/efistub: Avoid legacy decompressor when doing EFI boot
        x86/efistub: Perform SNP feature test while running in the firmware
        efi/libstub: Add limit argument to efi_random_alloc()
        x86/decompressor: Factor out kernel decompression and relocation
        x86/decompressor: Move global symbol references to C code
        decompress: Use 8 byte alignment
        x86/efistub: Prefer EFI memory attributes protocol over DXE services
        x86/efistub: Perform 4/5 level paging switch from the stub
        x86/decompressor: Merge trampoline cleanup with switching code
        x86/decompressor: Pass pgtable address to trampoline directly
        x86/decompressor: Only call the trampoline when changing paging levels
        x86/decompressor: Call trampoline directly from C code
        x86/decompressor: Avoid the need for a stack in the 32-bit trampoline
        x86/decompressor: Use standard calling convention for trampoline
        x86/decompressor: Call trampoline as a normal function
        x86/decompressor: Assign paging related global variables earlier
        x86/decompressor: Store boot_params pointer in callee save register
        x86/efistub: Clear BSS in EFI handover protocol entrypoint
        x86/decompressor: Avoid magic offsets for EFI handover entrypoint
        x86/efistub: Simplify and clean up handover entry code
        ...
    • Merge tag 'smp-core-2023-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 6f49693a
      Linus Torvalds authored
      Pull CPU hotplug updates from Thomas Gleixner:
       "Updates for the CPU hotplug core:
      
         - Support partial SMT enablement.
      
           So far the sysfs SMT control only allows toggling between SMT on
           and off. That's sufficient for x86, which usually has at most two
           threads per core, except for the Xeon PHI platform, which has four
           threads per core
      
           Though PowerPC has up to 16 threads per core and so far it's only
           possible to control the number of enabled threads per core via a
           command line option. There is some way to control this at runtime,
           but that lacks enforcement and the usability is awkward
      
           This update expands the sysfs interface and the core infrastructure
           to accept numerical values so PowerPC can build SMT runtime control
           for partial SMT enablement on top
      
           The core support has also been provided to the PowerPC maintainers
           who added the PowerPC related changes on top
      
         - Minor cleanups and documentation updates"
      
      * tag 'smp-core-2023-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        Documentation: core-api/cpuhotplug: Fix state names
        cpu/hotplug: Remove unused function declaration cpu_set_state_online()
        cpu/SMT: Fix cpu_smt_possible() comment
        cpu/SMT: Allow enabling partial SMT states via sysfs
        cpu/SMT: Create topology_smt_thread_allowed()
        cpu/SMT: Remove topology_smt_supported()
        cpu/SMT: Store the current/max number of threads
        cpu/SMT: Move smt/control simple exit cases earlier
        cpu/SMT: Move SMT prototypes into cpu_smt.h
        cpu/hotplug: Remove dependancy against cpu_primary_thread_mask
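
To illustrate what accepting numerical values alongside on/off could look like, here is a minimal user-space sketch. The function name, the mapping of "off" to one thread per core, and the range check are assumptions made for illustration; this is not the kernel's sysfs handler.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical parser: translate a control string into the number of
 * threads to enable per core, or -1 if the request is invalid.
 *   "on"  -> all threads the core supports
 *   "off" -> one thread per core
 *   "N"   -> N threads, accepted only if 1 <= N <= max_threads
 */
static int toy_parse_smt_request(const char *s, int max_threads)
{
    char *end;
    long v;

    if (strcmp(s, "on") == 0)
        return max_threads;
    if (strcmp(s, "off") == 0)
        return 1;

    v = strtol(s, &end, 10);
    if (end == s || *end != '\0')
        return -1;  /* empty or trailing garbage */
    if (v < 1 || v > max_threads)
        return -1;  /* outside what the core can provide */
    return (int)v;
}
```

On a core with eight hardware threads, "4" would request a partial SMT state that the plain on/off toggle cannot express.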
    • Merge tag 'irq-core-2023-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · dd3f0fe5
      Linus Torvalds authored
      Pull irq updates from Thomas Gleixner:
       "Boring updates for the interrupt subsystem:
      
        Core:
      
         - Prevent a deadlock of nested interrupt threads vs.
           synchronize_hardirq()
      
         - Removal of a stale extern declaration
      
        Drivers:
      
         - The first new driver since v6.2 for Amlogic-C3 SoCs
      
         - The usual small fixes, cleanups and improvements all over the
           place"
      
      * tag 'irq-core-2023-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        irqchip: Add support for Amlogic-C3 SoCs
        dt-bindings: interrupt-controller: Add support for Amlogic-C3 SoCs
        irqchip/irq-mvebu-sei: Use devm_platform_get_and_ioremap_resource()
        irqchip/ls-scfg-msi: Use devm_platform_get_and_ioremap_resource()
        irqchip: Explicitly include correct DT includes
        irqchip/orion: Use of_address_count() helper
        irqchip/irq-pruss-intc: Do not check for 0 return after calling platform_get_irq()
        irqchip/imx-mu-msi: Do not check for 0 return after calling platform_get_irq()
        irqchipr/i8259: Mark i8259_of_init() static
        irqchip/mips-gic: Mark gic_irq_domain_free() static
        irqchip/xtensa-pic: Include header for xtensa_pic_init_legacy()
        irqchip/loongson-eiointc: Fix return value checking of eiointc_index
        genirq: Remove unused extern declaration
        genirq: Prevent nested thread vs synchronize_hardirq() deadlock
    • Merge tag 'core-entry-2023-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 6bfce775
      Linus Torvalds authored
      Pull core entry code update from Thomas Gleixner:
       "A single update to the core entry code, which removes the empty user
        address limit check which is a leftover of the removed TIF_FSCHECK"
      
      * tag 'core-entry-2023-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        entry: Remove empty addr_limit_user_check()
    • Merge tag 'clocksource.2023.08.15a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu · b98af53c
      Linus Torvalds authored
      Pull clocksource watchdog updates from Paul McKenney:
      
       - Handle negative skews in "skew is too large" messages
      
       - Extend watchdog check exemption to 4-Socket platforms
      
      * tag 'clocksource.2023.08.15a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
        x86/tsc: Extend watchdog check exemption to 4-Sockets platform
        clocksource: Handle negative skews in "skew is too large" messages
    • Merge tag 'csd-lock.2023.07.15a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu · b324696d
      Linus Torvalds authored
      Pull CSD lock updates from Paul McKenney:
       "This series reduces the number of stack traces dumped during CSD-lock
        debugging. This helps to avoid console overrun on systems with large
        numbers of CPUs"
      
      * tag 'csd-lock.2023.07.15a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
        smp: Reduce NMI traffic from CSD waiters to CSD destination
        smp: Reduce logging due to dump_stack of CSD waiters
    • Merge tag 'scftorture.2023.08.15a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu · 6ae0c157
      Linus Torvalds authored
      Pull smp_call_function torture-test updates from Paul McKenney:
       "This prevents some memory-exhaustion false-positive failures in
        scftorture testing"
      
      * tag 'scftorture.2023.08.15a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
        scftorture: Add CONFIG_PREEMPT_DYNAMIC=n to NOPREEMPT scenario
        scftorture: Pause testing after memory-allocation failure
        scftorture: Forgive memory-allocation failure if KASAN
        torture: Scale scftorture memory based on number of CPUs
    • Merge tag 'rcu.2023.08.21a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu · 68cadad1
      Linus Torvalds authored
      Pull RCU updates from Paul McKenney:
      
       - Documentation updates
      
       - Miscellaneous fixes, perhaps most notably simplifying
         SRCU_NOTIFIER_INIT() as suggested
      
       - RCU Tasks updates, most notably treating Tasks RCU callbacks as lazy
         while still treating synchronous grace periods as urgent. Also fixes
         one bug that restores the ability to apply debug-objects to RCU Tasks
         and another that fixes a race condition that could result in
         false-positive failures of the boot-time self-test code
      
       - RCU-scalability performance-test updates, most notably adding the
         ability to measure the RCU-Tasks's grace-period kthread's CPU
         consumption. This proved quite useful for the RCU Tasks work
      
       - Reference-acquisition/release performance-test updates, including a
         fix for an uninitialized wait_queue_head_t
      
       - Miscellaneous torture-test updates
      
       - Torture-test scripting updates, including removal of the
         no-longer-functional formal-verification scripts, test builds of
         individual RCU Tasks flavors, better diagnostics for loss of
         connectivity for distributed rcutorture tests, disabling of reboot
         loops in qemu/KVM-based rcutorture testing, and passing of init
         parameters to rcutorture's init program
      
      * tag 'rcu.2023.08.21a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (64 commits)
        rcu: Use WRITE_ONCE() for assignments to ->next for rculist_nulls
        rcu: Make the rcu_nocb_poll boot parameter usable via boot config
        rcu: Mark __rcu_irq_enter_check_tick() ->rcu_urgent_qs load
        srcu,notifier: Remove #ifdefs in favor of SRCU Tiny srcu_usage
        rcutorture: Stop right-shifting torture_random() return values
        torture: Stop right-shifting torture_random() return values
        torture: Move stutter_wait() timeouts to hrtimers
        torture: Move torture_shuffle() timeouts to hrtimers
        torture: Move torture_onoff() timeouts to hrtimers
        torture: Make torture_hrtimeout_*() use TASK_IDLE
        torture: Add lock_torture writer_fifo module parameter
        torture: Add a kthread-creation callback to _torture_create_kthread()
        rcu-tasks: Fix boot-time RCU tasks debug-only deadlock
        rcu-tasks: Permit use of debug-objects with RCU Tasks flavors
        checkpatch: Complain about unexpected uses of RCU Tasks Trace
        torture: Cause mkinitrd.sh to indicate failure on compile errors
        torture: Make init program dump command-line arguments
        torture: Switch qemu from -nographic to -display none
        torture: Add init-program support for loongarch
        torture: Avoid torture-test reboot loops
        ...
    • Merge tag 'hardening-v6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · 727dbda1
      Linus Torvalds authored
      Pull hardening updates from Kees Cook:
       "As has become normal, changes are scattered around the tree (either
        explicitly maintainer Acked or for trivial stuff that went ignored):
      
         - Carve out the new CONFIG_LIST_HARDENED as a more focused subset of
           CONFIG_DEBUG_LIST (Marco Elver)
      
         - Fix kallsyms lookup failure under Clang LTO (Yonghong Song)
      
         - Clarify documentation for CONFIG_UBSAN_TRAP (Jann Horn)
      
         - Flexible array member conversion not carried in other tree (Gustavo
           A. R. Silva)
      
         - Various strlcpy() and strncpy() removals not carried in other trees
           (Azeem Shaikh, Justin Stitt)
      
         - Convert nsproxy.count to refcount_t (Elena Reshetova)
      
         - Add a handful of __counted_by annotations not carried in other trees,
           as well as an LKDTM test
      
         - Fix build failure with gcc-plugins on GCC 14+
      
         - Fix selftests to respect SKIP for signal-delivery tests
      
         - Fix CFI warning for paravirt callback prototype
      
         - Clarify documentation for seq_show_option_n() usage"
      
      * tag 'hardening-v6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (23 commits)
        LoadPin: Annotate struct dm_verity_loadpin_trusted_root_digest with __counted_by
        kallsyms: Change func signature for cleanup_symbol_name()
        kallsyms: Fix kallsyms_selftest failure
        nsproxy: Convert nsproxy.count to refcount_t
        integrity: Annotate struct ima_rule_opt_list with __counted_by
        lkdtm: Add FAM_BOUNDS test for __counted_by
        Compiler Attributes: counted_by: Adjust name and identifier expansion
        um: refactor deprecated strncpy to memcpy
        um: vector: refactor deprecated strncpy
        alpha: Replace one-element array with flexible-array member
        hardening: Move BUG_ON_DATA_CORRUPTION to hardening options
        list: Introduce CONFIG_LIST_HARDENED
        list_debug: Introduce inline wrappers for debug checks
        compiler_types: Introduce the Clang __preserve_most function attribute
        gcc-plugins: Rename last_stmt() for GCC 14+
        selftests/harness: Actually report SKIP for signal tests
        x86/paravirt: Fix tlb_remove_table function callback prototype warning
        EISA: Replace all non-returning strlcpy with strscpy
        perf: Replace strlcpy with strscpy
        um: Remove strlcpy declaration
        ...
    • Merge tag 'seccomp-v6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · b03a4342
      Linus Torvalds authored
      Pull seccomp updates from Kees Cook:
      
       - Provide USER_NOTIFY flag for synchronous mode (Andrei Vagin, Peter
         Oskolkov). This touches the scheduler and perf but has been Acked by
         Peter Zijlstra.
      
       - Fix regression in syscall skipping and restart tracing on arm32. This
         touches arch/arm/ but has been Acked by Arnd Bergmann.
      
      * tag 'seccomp-v6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
        seccomp: Add missing kerndoc notations
        ARM: ptrace: Restore syscall skipping for tracers
        ARM: ptrace: Restore syscall restart tracing
        selftests/seccomp: Handle arm32 corner cases better
        perf/benchmark: add a new benchmark for seccom_unotify
        selftest/seccomp: add a new test for the sync mode of seccomp_user_notify
        seccomp: add the synchronous mode for seccomp_unotify
        sched: add a few helpers to wake up tasks on the current cpu
        sched: add WF_CURRENT_CPU and externise ttwu
        seccomp: don't use semaphore and wait_queue together
    • Merge tag 'pstore-v6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · 5b07aaca
      Linus Torvalds authored
      Pull pstore updates from Kees Cook:
      
       - Greatly simplify compression support (Ard Biesheuvel)
      
       - Avoid crashes for corrupted offsets when prz size is 0 (Enlin Mu)
      
       - Expand range of usable record sizes (Yuxiao Zhang)
      
       - Fix kernel-doc warning (Matthew Wilcox)
      
      * tag 'pstore-v6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
        pstore: Fix kernel-doc warning
        pstore: Support record sizes larger than kmalloc() limit
        pstore/ram: Check start of empty przs during init
        pstore: Replace crypto API compression with zlib_deflate library calls
        pstore: Remove worst-case compression size logic
    • Merge tag 'for-6.6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · 547635c6
      Linus Torvalds authored
      Pull btrfs updates from David Sterba:
       "No new features, the bulk of the changes are fixes, refactoring and
        cleanups. The notable fix is the scrub performance restoration after
        rewrite in 6.4, though still only partial.
      
        Fixes:
      
         - scrub performance drop due to rewrite in 6.4 partially restored:
            - do IO grouping by blk_plug/blk_unplug again
            - avoid unnecessary tree searches when processing stripes, in
              extent and checksum trees
            - the drop is noticeable on fast PCIe devices, -66% and restored
              to -33% of the original
            - backports to 6.4 planned
      
         - handle more corner cases of transaction commit during orphan
           cleanup or delayed ref processing
      
         - use correct fsid/metadata_uuid when validating super block
      
         - copy directory permissions and time when creating a stub subvolume
      
        Core:
      
         - debugging feature integrity checker deprecated, to be removed in
           6.7
      
         - in zoned mode, zones are activated just before the write, making
           error handling easier, now the overcommit mechanism can be enabled
           again which improves performance by avoiding more frequent flushing
      
         - v0 extent handling completely removed, deprecated a long time ago
      
         - error handling improvements
      
         - tests:
            - extent buffer bitmap tests
            - pinned extent splitting tests
      
         - cleanups and refactoring:
            - compression writeback
            - extent buffer bitmap
            - space flushing, ENOSPC handling"
      
      * tag 'for-6.6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (110 commits)
        btrfs: zoned: skip splitting and logical rewriting on pre-alloc write
        btrfs: tests: test invalid splitting when skipping pinned drop extent_map
        btrfs: tests: add a test for btrfs_add_extent_mapping
        btrfs: tests: add extent_map tests for dropping with odd layouts
        btrfs: scrub: move write back of repaired sectors to scrub_stripe_read_repair_worker()
        btrfs: scrub: don't go ordered workqueue for dev-replace
        btrfs: scrub: fix grouping of read IO
        btrfs: scrub: avoid unnecessary csum tree search preparing stripes
        btrfs: scrub: avoid unnecessary extent tree search preparing stripes
        btrfs: copy dir permission and time when creating a stub subvolume
        btrfs: remove pointless empty list check when reading delayed dir indexes
        btrfs: drop redundant check to use fs_devices::metadata_uuid
        btrfs: compare the correct fsid/metadata_uuid in btrfs_validate_super
        btrfs: use the correct superblock to compare fsid in btrfs_validate_super
        btrfs: simplify memcpy either of metadata_uuid or fsid
        btrfs: add a helper to read the superblock metadata_uuid
        btrfs: remove v0 extent handling
        btrfs: output extra debug info if we failed to find an inline backref
        btrfs: move the !zoned assert into run_delalloc_cow
        btrfs: consolidate the error handling in run_delalloc_nocow
        ...
    • Merge tag 'affs-for-6.6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · f678c890
      Linus Torvalds authored
      Pull affs updates from David Sterba:
       "Two minor updates for AFFS:
      
         - reimplement writepage() address space callback on top of
           migrate_folio()
      
         - fix a build warning, the local parameter 'toupper' collides with the
           standard ctype.h name"
      
      * tag 'affs-for-6.6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
        affs: rename local toupper() to fn() to avoid confusion
        affs: remove writepage implementation
      f678c890
    • Linus Torvalds's avatar
      Merge tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux · 3bb156a5
      Linus Torvalds authored
      Pull fsverity updates from Eric Biggers:
       "Several cleanups for fs/verity/, including two commits that make the
        builtin signature support more cleanly separated from the base
        feature"
      
      * tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux:
        fsverity: skip PKCS#7 parser when keyring is empty
        fsverity: move sysctl registration out of signature.c
        fsverity: simplify handling of errors during initcall
        fsverity: explicitly check that there is no algorithm 0
      3bb156a5
    • Linus Torvalds's avatar
      Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/linux · cc0a38d0
      Linus Torvalds authored
      Pull fscrypt update from Eric Biggers:
       "Just a small documentation improvement"
      
      * tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/linux:
        fscrypt: improve the "Encryption modes and usage" section
      cc0a38d0
    • Linus Torvalds's avatar
      Merge tag 'iomap-6.6-merge-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · 6016fc91
      Linus Torvalds authored
      Pull iomap updates from Darrick Wong:
       "We've got some big changes for this release -- I'm very happy to be
        landing willy's work to enable large folios for the page cache for
        general read and write IOs when the fs can make contiguous space
        allocations, and Ritesh's work to track sub-folio dirty state to
        eliminate the write amplification problems inherent in using large
        folios.
      
        As a bonus, io_uring can now process write completions in the caller's
        context instead of bouncing through a workqueue, which should reduce
        io latency dramatically. IOWs, XFS should see a nice performance bump
        for both IO paths.
      
        Summary:
      
         - Make large writes to the page cache fill sparse parts of the cache
           with large folios, then use large memcpy calls for the large folio.
      
         - Track the per-block dirty state of each large folio so that a
           buffered write to a single byte on a large folio does not result in
           a (potentially) multi-megabyte writeback IO.
      
         - Allow some directio completions to be performed in the initiating
           task's context instead of punting through a workqueue. This will
           reduce latency for some io_uring requests"
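
The per-block dirty tracking described in the summary above can be illustrated with a small userspace model. This is a sketch only, not the iomap implementation: `struct folio_state`, `mark_dirty()` and `writeback_bytes()` are hypothetical names, and the model assumes a folio of at most 64 blocks so that one 64-bit word can serve as the bitmap.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: one large folio covering up to 64 filesystem blocks. */
struct folio_state {
    size_t block_size;   /* bytes per filesystem block */
    unsigned nr_blocks;  /* blocks covered by the folio (assumed <= 64) */
    uint64_t dirty;      /* bit i set => block i needs writeback */
};

/* Mark every block touched by the byte range [off, off + len) as dirty. */
static void mark_dirty(struct folio_state *fs, size_t off, size_t len)
{
    if (len == 0)
        return;
    size_t first = off / fs->block_size;
    size_t last = (off + len - 1) / fs->block_size;
    for (size_t i = first; i <= last && i < fs->nr_blocks; i++)
        fs->dirty |= UINT64_C(1) << i;
}

/* Bytes that writeback has to submit: the dirty blocks only, not the whole folio. */
static size_t writeback_bytes(const struct folio_state *fs)
{
    size_t n = 0;
    for (unsigned i = 0; i < fs->nr_blocks; i++)
        if (fs->dirty & (UINT64_C(1) << i))
            n += fs->block_size;
    return n;
}
```

In this model a one-byte write dirties a single block, so writeback covers one block instead of the full folio. Without the bitmap, the only safe choice would be to write back every block of the folio.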
      
      * tag 'iomap-6.6-merge-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (26 commits)
        iomap: support IOCB_DIO_CALLER_COMP
        io_uring/rw: add write support for IOCB_DIO_CALLER_COMP
        fs: add IOCB flags related to passing back dio completions
        iomap: add IOMAP_DIO_INLINE_COMP
        iomap: only set iocb->private for polled bio
        iomap: treat a write through cache the same as FUA
        iomap: use an unsigned type for IOMAP_DIO_* defines
        iomap: cleanup up iomap_dio_bio_end_io()
        iomap: Add per-block dirty state tracking to improve performance
        iomap: Allocate ifs in ->write_begin() early
        iomap: Refactor iomap_write_delalloc_punch() function out
        iomap: Use iomap_punch_t typedef
        iomap: Fix possible overflow condition in iomap_write_delalloc_scan
        iomap: Add some uptodate state handling helpers for ifs state bitmap
        iomap: Drop ifs argument from iomap_set_range_uptodate()
        iomap: Rename iomap_page to iomap_folio_state and others
        iomap: Copy larger chunks from userspace
        iomap: Create large folios in the buffered write path
        filemap: Allow __filemap_get_folio to allocate large folios
        filemap: Add fgf_t typedef
        ...
      6016fc91
    • Linus Torvalds's avatar
      Merge tag 'erofs-for-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs · dd2c0198
      Linus Torvalds authored
      Pull erofs updates from Gao Xiang:
       "In this cycle, a xattr bloom filter feature is introduced to speed up
        negative xattr lookups, which was originally suggested by Alexander
        for Composefs use cases.
      
        Additionally, the DEFLATE algorithm is now supported, which can be
        used together with hardware accelerators for our cloud workloads. Each
        supported compression algorithm can be selected on a per-file basis
        for specific access patterns too.
      
        There are also some random fixes and cleanups as usual:
      
         - Support xattr bloom filter to optimize negative xattr lookups
      
         - Support DEFLATE compression algorithm as an alternative
      
         - Fix a regression where ztailpacking pclusters aren't released
           properly

         - No longer warn about the dedupe and fragments features
      
         - Some folio conversions and cleanups"
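
Why a bloom filter helps negative xattr lookups can be shown with a minimal sketch. It does not reproduce the erofs on-disk format: the 32-bit filter width, the FNV-1a hash and the function names are assumptions chosen for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* FNV-1a, used here only as a stand-in hash function (an assumption). */
static uint32_t name_hash(const char *name)
{
    uint32_t h = 2166136261u;
    for (; *name; name++) {
        h ^= (unsigned char)*name;
        h *= 16777619u;
    }
    return h;
}

/* Record an xattr name in a hypothetical 32-bit per-inode filter. */
static uint32_t filter_add(uint32_t filter, const char *name)
{
    return filter | (UINT32_C(1) << (name_hash(name) % 32));
}

/* false => the name is definitely absent; true => do the real lookup. */
static bool filter_may_contain(uint32_t filter, const char *name)
{
    return (filter >> (name_hash(name) % 32)) & 1u;
}
```

A filter of this kind never reports a stored name as absent, so a `false` result lets the lookup return early without walking the xattr entries. A `true` result can be a false positive and still needs the full lookup.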
      
      * tag 'erofs-for-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
        erofs: release ztailpacking pclusters properly
        erofs: don't warn dedupe and fragments features anymore
        erofs: adapt folios for z_erofs_read_folio()
        erofs: adapt folios for z_erofs_readahead()
        erofs: get rid of fe->backmost for cache decompression
        erofs: drop z_erofs_page_mark_eio()
        erofs: tidy up z_erofs_do_read_page()
        erofs: move preparation logic into z_erofs_pcluster_begin()
        erofs: avoid obsolete {collector,collection} terms
        erofs: simplify z_erofs_read_fragment()
        erofs: remove redundant erofs_fs_type declaration in super.c
        erofs: add necessary kmem_cache_create flags for erofs inode cache
        erofs: clean up redundant comment and adjust code alignment
        erofs: refine warning messages for zdata I/Os
        erofs: boost negative xattr lookup with bloom filter
        erofs: update on-disk format for xattr name filter
        erofs: DEFLATE compression support
      dd2c0198
    • Linus Torvalds's avatar
      Merge tag 'filelock-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux · f20ae9cf
      Linus Torvalds authored
      Pull file locking updates from Jeff Layton:
      
       - new functionality for F_OFD_GETLK: requesting a type of F_UNLCK will
         find info about whatever lock happens to be first in the given range,
         regardless of type.
      
       - an OFD lock selftest
      
       - bugfix involving a UAF in a tracepoint
      
       - comment typo fix
      
      * tag 'filelock-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux:
        locks: fix KASAN: use-after-free in trace_event_raw_event_filelock_lock
        fs/locks: Fix typo
        selftests: add OFD lock tests
        fs/locks: F_UNLCK extension for F_OFD_GETLK
      f20ae9cf
    • Linus Torvalds's avatar
      Merge tag 'v6.6-fs.proc.uapi' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · b4a04f92
      Linus Torvalds authored
      Pull procfs fixes from Christian Brauner:
       "Mode changes to files under /proc/<pid>/ aren't supported ever since
        commit 6d76fa58 ("Don't allow chmod() on the /proc/<pid>/ files").
      
        Due to an oversight in commit 1b3044e3 ("procfs: fix pthread
        cross-thread naming if !PR_DUMPABLE") in switching from REG to NOD,
        mode changes on /proc/thread-self/comm were accidentally allowed.

        Similarly, mode changes for all files beneath /proc/<pid>/net/ are
        blocked but mode changes on /proc/<pid>/net itself were accidentally
        allowed.
      
        Both issues come down to not using the generic proc_setattr() helper
        which blocks all mode changes. This is rectified with this pull
        request.
      
        This also removes a strange nolibc test that abused /proc/<pid>/net
        for testing mode changes. Using procfs for this test never made a lot
        of sense given procfs has special semantics for almost everything
        anyway.
      
        Both changes are minor user-visible changes. It is however very
        unlikely that mode changes on /proc/<pid>/net and
        /proc/thread-self/comm are something that userspace relies on"
      
      * tag 'v6.6-fs.proc.uapi' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
        procfs: block chmod on /proc/thread-self/comm
        proc: use generic setattr() for /proc/$PID/net
        selftests/nolibc: drop test chmod_net
      b4a04f92
    • Linus Torvalds's avatar
      Merge tag 'v6.6-vfs.autofs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · 2e0afa7e
      Linus Torvalds authored
      Pull autofs fixes from Christian Brauner:
       "This fixes a memory leak in autofs reported by syzkaller and a missing
        conversion from uninterruptible to interruptible wake up when autofs
        is in catatonic mode"
      
      * tag 'v6.6-vfs.autofs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
        autofs: use wake_up() instead of wake_up_interruptible(()
        autofs: fix memory leak of waitqueues in autofs_catatonic_mode
      2e0afa7e
    • Linus Torvalds's avatar
      Merge tag 'v6.6-vfs.fchmodat2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · 475d4df8
      Linus Torvalds authored
      Pull fchmodat2 system call from Christian Brauner:
       "This adds the fchmodat2() system call. It is a revised version of the
        fchmodat() system call, adding a missing flag argument. Support for
        both AT_SYMLINK_NOFOLLOW and AT_EMPTY_PATH is included.
      
        Adding this system call revision has been a longstanding request but
        so far has always fallen through the cracks. While the kernel
        implementation of fchmodat() does not have a flag argument the libc
        provided POSIX-compliant fchmodat(3) version does. Both glibc and musl
        have to implement a workaround in order to support AT_SYMLINK_NOFOLLOW
        (see [1] and [2]).
      
        The workaround is brittle because it relies not just on O_PATH and
        O_NOFOLLOW semantics and procfs magic links but also on our rather
        inconsistent symlink semantics.
      
        This gives userspace a proper fchmodat2() system call that libcs can
        use to properly implement fchmodat(3) and allows them to get rid of
        their hacks. In this case it will immediately benefit them as the
        current workaround is already defunct because of aforementioned
        inconsistencies.
      
        In addition to AT_SYMLINK_NOFOLLOW, give userspace the ability to use
        AT_EMPTY_PATH with fchmodat2(). This is already possible with
        fchownat() so there's no reason to not also support it for
        fchmodat2().
      
        The implementation is simple and comes with selftests. Implementation
        of the system call and wiring up the system call are done as separate
        patches even though they could arguably be one patch. But in case
        there are merge conflicts from other system call additions it can be
        beneficial to have separate patches"
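
As a rough illustration of how userspace might call the new system call before libc wrappers exist, the sketch below invokes it through syscall(2). The wrapper name `chmod_nofollow()` is hypothetical, and the fallback number 452 is taken from the commit list below ("usually as syscall 452"); it is an assumption for headers that do not yet define `SYS_fchmodat2` and does not hold on every architecture.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef SYS_fchmodat2
#define SYS_fchmodat2 452 /* assumption: the number used on most architectures */
#endif

/*
 * Change the mode of path without following a trailing symlink.
 * Returns 0 on success or a negative errno value.
 */
static int chmod_nofollow(int dirfd, const char *path, mode_t mode)
{
    if (syscall(SYS_fchmodat2, dirfd, path, mode, AT_SYMLINK_NOFOLLOW) == 0)
        return 0;
    return -errno;
}
```

On kernels that predate this merge the call is expected to fail with ENOSYS, so a caller would still need a fallback path there.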
      
      Link: https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/fchmodat.c;h=17eca54051ee28ba1ec3f9aed170a62630959143;hb=a492b1e5ef7ab50c6fdd4e4e9879ea5569ab0a6c#l35 [1]
      Link: https://git.musl-libc.org/cgit/musl/tree/src/stat/fchmodat.c?id=718f363bc2067b6487900eddc9180c84e7739f80#n28 [2]
      
      * tag 'v6.6-vfs.fchmodat2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
        selftests: fchmodat2: remove duplicate unneeded defines
        fchmodat2: add support for AT_EMPTY_PATH
        selftests: Add fchmodat2 selftest
        arch: Register fchmodat2, usually as syscall 452
        fs: Add fchmodat2()
        Non-functional cleanup of a "__user * filename"
      475d4df8
    • Linus Torvalds's avatar
      Merge tag 'v6.6-vfs.super' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · 511fb5ba
      Linus Torvalds authored
      Pull superblock updates from Christian Brauner:
       "This contains the super rework that was ready for this cycle. The
        first part changes the order of how we open block devices and allocate
        superblocks, contains various cleanups, simplifications, and a new
        mechanism to wait on superblock state changes.
      
        This unblocks work to ultimately limit the number of writers to a
        block device. Jan has already scheduled follow-up work that will be
        ready for v6.7 and allows us to restrict the number of writers to a
        given block device. That series builds on this work right here.
      
        The second part contains filesystem freezing updates.
      
        Overview:
      
        The generic superblock changes are roughly organized as follows
        (ignoring additional minor cleanups):
      
         (1) Removal of the bd_super member from struct block_device.
      
             This was a very odd back pointer to struct super_block with
             unclear rules. For all relevant places we have other means to get
             the same information so just get rid of this.
      
         (2) Simplify rules for superblock cleanup.
      
             Roughly, everything that is allocated during fs_context
             initialization and that's stored in fs_context->s_fs_info needs
             to be cleaned up by the fs_context->free() implementation before
             the superblock allocation function has been called successfully.
      
             After sget_fc() returned fs_context->s_fs_info has been
             transferred to sb->s_fs_info at which point sb->kill_sb() is
             fully responsible for cleanup. Adhering to these rules means that
             cleanup of sb->s_fs_info in fill_super() is to be avoided as it's
             brittle and inconsistent.
      
             Cleanup shouldn't be duplicated between sb->put_super() as
             sb->put_super() is only called if sb->s_root has been set aka
             when the filesystem has been successfully born (SB_BORN). That
             complexity should be avoided.
      
             This also means that block devices are to be closed in
             sb->kill_sb() instead of sb->put_super(). More details in the
             lower section.
      
         (3) Make it possible to lookup or create a superblock before opening
             block devices
      
             There's a subtle dependency on (2) as some filesystems did rely
             on fill_super() to be called in order to correctly clean up
             sb->s_fs_info. All these filesystems have been fixed.
      
         (4) Switch most filesystems to follow the same logic as the generic
             mount code now does as outlined in (3).
      
         (5) Use the superblock as the holder of the block device. We can now
             easily go back from block device to owning superblock.
      
         (6) Export and extend the generic fs_holder_ops and use them as
             holder ops everywhere and remove the filesystem specific holder
             ops.
      
         (7) Call from the block layer up into the filesystem layer when the
             block device is removed, allowing the filesystem to be shut
             down without risk of deadlocks.
      
         (8) Get rid of get_super().
      
             We can now easily go back from the block device to owning
             superblock and can call up from the block layer into the
             filesystem layer when the device is removed. So no need to wade
             through all registered superblocks to find the owning superblock
             anymore"
      
      Link: https://lore.kernel.org/lkml/20230824-prall-intakt-95dbffdee4a0@brauner/
      
      * tag 'v6.6-vfs.super' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (47 commits)
        super: use higher-level helper for {freeze,thaw}
        super: wait until we passed kill super
        super: wait for nascent superblocks
        super: make locking naming consistent
        super: use locking helpers
        fs: simplify invalidate_inodes
        fs: remove get_super
        block: call into the file system for ioctl BLKFLSBUF
        block: call into the file system for bdev_mark_dead
        block: consolidate __invalidate_device and fsync_bdev
        block: drop the "busy inodes on changed media" log message
        dasd: also call __invalidate_device when setting the device offline
        amiflop: don't call fsync_bdev in FDFMTBEG
        floppy: call disk_force_media_change when changing the format
        block: simplify the disk_force_media_change interface
        nbd: call blk_mark_disk_dead in nbd_clear_sock_ioctl
        xfs use fs_holder_ops for the log and RT devices
        xfs: drop s_umount over opening the log and RT devices
        ext4: use fs_holder_ops for the log device
        ext4: drop s_umount over opening the log device
        ...
      511fb5ba
    • Linus Torvalds's avatar
      Merge tag 'v6.6-vfs.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · de16588a
      Linus Torvalds authored
      Pull misc vfs updates from Christian Brauner:
       "This contains the usual miscellaneous features, cleanups, and fixes
        for vfs and individual filesystems.
      
        Features:
      
         - Block mode changes on symlinks and rectify our broken semantics
      
         - Report file modifications via fsnotify() for splice
      
         - Allow specifying an explicit timeout for the "rootwait" kernel
           command line option. This makes it possible to time out and
           reboot instead of always waiting indefinitely for the root device
           to show up
      
         - Use synchronous fput for the close system call
      
        Cleanups:
      
         - Get rid of open-coded lockdep workarounds for async io submitters
           and replace it all with a single consolidated helper
      
         - Simplify epoll allocation helper
      
         - Convert simple_write_begin and simple_write_end to use a folio
      
         - Convert page_cache_pipe_buf_confirm() to use a folio
      
         - Simplify __range_close to avoid pointless locking
      
         - Disable per-cpu buffer head cache for isolated cpus
      
         - Port ecryptfs to kmap_local_page() api
      
         - Remove redundant initialization of pointer buf in pipe code
      
         - Unexport the d_genocide() function which is only used within core
           vfs
      
         - Replace printk(KERN_ERR) and WARN_ON() with WARN()
      
        Fixes:
      
         - Fix various kernel-doc issues
      
         - Fix refcount underflow for eventfds when used as EFD_SEMAPHORE
      
         - Fix a mainly theoretical issue in devpts
      
         - Check the return value of __getblk() in reiserfs
      
         - Fix a racy assert in i_readcount_dec
      
         - Fix integer conversion issues in various functions
      
         - Fix LSM security context handling during automounts that prevented
           NFS superblock sharing"
      
      * tag 'v6.6-vfs.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (39 commits)
        cachefiles: use kiocb_{start,end}_write() helpers
        ovl: use kiocb_{start,end}_write() helpers
        aio: use kiocb_{start,end}_write() helpers
        io_uring: use kiocb_{start,end}_write() helpers
        fs: create kiocb_{start,end}_write() helpers
        fs: add kerneldoc to file_{start,end}_write() helpers
        io_uring: rename kiocb_end_write() local helper
        splice: Convert page_cache_pipe_buf_confirm() to use a folio
        libfs: Convert simple_write_begin and simple_write_end to use a folio
        fs/dcache: Replace printk and WARN_ON by WARN
        fs/pipe: remove redundant initialization of pointer buf
        fs: Fix kernel-doc warnings
        devpts: Fix kernel-doc warnings
        doc: idmappings: fix an error and rephrase a paragraph
        init: Add support for rootwait timeout parameter
        vfs: fix up the assert in i_readcount_dec
        fs: Fix one kernel-doc comment
        docs: filesystems: idmappings: clarify from where idmappings are taken
        fs/buffer.c: disable per-CPU buffer_head cache for isolated CPUs
        vfs, security: Fix automount superblock LSM init problem, preventing NFS sb sharing
        ...
      de16588a
    • Linus Torvalds's avatar
      Merge tag 'v6.6-vfs.tmpfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · ecd7db20
      Linus Torvalds authored
      Pull libfs and tmpfs updates from Christian Brauner:
       "This cycle saw a lot of work for tmpfs that required changes to the
        vfs layer. Andrew, Hugh, and I decided to take tmpfs through vfs this
        cycle. Things will go back to mm next cycle.
      
        Features
        ========
      
         - By far the biggest work is the quota support for tmpfs. New tmpfs
           quota infrastructure is added to support it and a new QFMT_SHMEM
           uapi option is exposed.
      
           This offers user and group quotas to tmpfs (project quotas will be
           added later). Similar to other filesystems tmpfs quotas are not
           supported within user namespaces yet.
      
         - Add support for user xattrs. While tmpfs already supports security
           xattrs (security.*) and POSIX ACLs for a long time it lacked
           support for user xattrs (user.*). With this pull request tmpfs will
           be able to support a limited number of user xattrs.
      
           This is accompanied by a fix (see below) to limit persistent simple
           xattr allocations.
      
         - Add support for stable directory offsets. Currently tmpfs relies on
           the libfs provided cursor-based mechanism for readdir. This causes
           issues when a tmpfs filesystem is exported via NFS.
      
           NFS clients do not open directories. Instead, each server-side
           readdir operation opens the directory, reads it, and then closes
           it. Since the cursor state for that directory is associated with
           the opened file it is discarded after each readdir operation. Such
           directory offsets are not just cached by NFS clients but also
           various userspace libraries based on these clients.
      
           As it stands there is no way to invalidate the caches when
           directory offsets have changed and the whole application depends on
           unchanging directory offsets.
      
           At LSFMM we discussed how to solve this problem and decided to
           support stable directory offsets. libfs now allows filesystems like
           tmpfs to use an xarray to map a directory offset to a dentry. This
           mechanism is currently only used by tmpfs but can be supported by
           others as well.
      
        Fixes
        =====
      
         - Change persistent simple xattrs allocations in libfs from
           GFP_KERNEL to GFP_KERNEL_ACCOUNT so they're subject to memory
           cgroup limits. Since this is a change to libfs it affects both
           tmpfs and kernfs.
      
         - Correctly verify {g,u}id mount options.
      
           A new filesystem context is created via fsopen() which records the
           namespace that becomes the owning namespace of the superblock when
           fsconfig(FSCONFIG_CMD_CREATE) is called for filesystems that are
           mountable in namespaces. However, fsconfig() calls can occur in a
           namespace different from the namespace where fsopen() has been
           called.
      
           Currently, when fsconfig() is called to set {g,u}id mount options
           the requested {g,u}id is mapped into a k{g,u}id according to the
           namespace where fsconfig() was called from. The resulting k{g,u}id
           is not guaranteed to be resolvable in the namespace of the
           filesystem (the one that fsopen() was called in).
      
           This means it's possible for an unprivileged user to create files
           owned by any group in a tmpfs mount since it's possible to set the
           setid bits on the tmpfs directory.
      
           The contract for {g,u}id mount options and {g,u}id values in
           general set from userspace has always been that they are translated
           according to the caller's idmapping. In so far, tmpfs has been
           doing the correct thing. But since tmpfs is mountable in
           unprivileged contexts it is also necessary to verify that the
           resulting k{g,u}id is representable in the namespace of the
           superblock to avoid such bugs.
      
           The new mount api's cross-namespace delegation abilities are
           already widely used. Having talked to a bunch of userspace this is
           the most faithful solution with minimal regression risks"
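
The two-step rule for {g,u}id mount options described above (translate according to the caller's idmapping, then verify against the superblock's namespace) can be modelled in a few lines. This is a simplified sketch, not kernel code: `struct idmap` holds a single mapping extent, and all names here are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define INVALID_ID UINT32_MAX

/* Hypothetical single-extent idmapping: namespace ids [ns_first, ns_first + count)
 * map to kernel ids [k_first, k_first + count). */
struct idmap {
    uint32_t ns_first;
    uint32_t k_first;
    uint32_t count;
};

/* Translate a namespace-local id to a kernel id, or INVALID_ID if unmapped. */
static uint32_t to_kernel_id(const struct idmap *m, uint32_t id)
{
    if (id < m->ns_first || id - m->ns_first >= m->count)
        return INVALID_ID;
    return m->k_first + (id - m->ns_first);
}

/* True if the kernel id can be expressed in the given namespace. */
static bool representable(const struct idmap *m, uint32_t kid)
{
    return kid != INVALID_ID && kid >= m->k_first && kid - m->k_first < m->count;
}

/* Resolve a uid= or gid= mount option: map it through the caller's namespace,
 * then verify it against the namespace that owns the superblock. */
static bool resolve_mount_id(const struct idmap *caller, const struct idmap *fs_ns,
                             uint32_t id, uint32_t *kid_out)
{
    uint32_t kid = to_kernel_id(caller, id);
    if (!representable(fs_ns, kid))
        return false;
    *kid_out = kid;
    return true;
}
```

With a caller in a namespace that maps ids one-to-one and a filesystem namespace that only covers kernel ids 100000 to 165535, a request for id 0 translates to kernel id 0 and is rejected by the second step. Skipping that check is what would let the mount carry an id the filesystem's namespace cannot represent.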
      
      * tag 'v6.6-vfs.tmpfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
        tmpfs,xattr: GFP_KERNEL_ACCOUNT for simple xattrs
        mm: invalidation check mapping before folio_contains
        tmpfs: trivial support for direct IO
        tmpfs,xattr: enable limited user extended attributes
        tmpfs: track free_ispace instead of free_inodes
        xattr: simple_xattr_set() return old_xattr to be freed
        tmpfs: verify {g,u}id mount options correctly
        shmem: move spinlock into shmem_recalc_inode() to fix quota support
        libfs: Remove parent dentry locking in offset_iterate_dir()
        libfs: Add a lock class for the offset map's xa_lock
        shmem: stable directory offsets
        shmem: Refactor shmem_symlink()
        libfs: Add directory operations for stable offsets
        shmem: fix quota lock nesting in huge hole handling
        shmem: Add default quota limit mount options
        shmem: quota support
        shmem: prepare shmem quota infrastructure
        quota: Check presence of quota operation structures instead of ->quota_read and ->quota_write callbacks
        shmem: make shmem_get_inode() return ERR_PTR instead of NULL
        shmem: make shmem_inode_acct_block() return error
      ecd7db20
    • Linus Torvalds's avatar
      Merge tag 'v6.6-vfs.ctime' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · 615e9583
      Linus Torvalds authored
      Pull vfs timestamp updates from Christian Brauner:
       "This adds VFS support for multi-grain timestamps and converts tmpfs,
        xfs, ext4, and btrfs to use them. This carries acks from all relevant
        filesystems.
      
        The VFS always uses coarse-grained timestamps when updating the ctime
        and mtime after a change. This has the benefit of allowing filesystems
        to optimize away a lot of metadata updates, down to around 1 per
        jiffy, even when a file is under heavy writes.
      
        Unfortunately, this has always been an issue when we're exporting via
        NFSv3, which relies on timestamps to validate caches. A lot of changes
        can happen in a jiffy, so timestamps aren't sufficient to help the
        client decide to invalidate the cache.
      
        Even with NFSv4, a lot of exported filesystems don't properly support
        a change attribute and are subject to the same problems with timestamp
        granularity. Other applications have similar issues with timestamps
        (e.g., backup applications).
      
        If we were to always use fine-grained timestamps, that would improve
        the situation, but that becomes rather expensive, as the underlying
        filesystem would have to log a lot more metadata updates.
      
        This introduces fine-grained timestamps that are used when they are
        actively queried.
      
        This uses the 31st bit of the ctime tv_nsec field to indicate that
        something has queried the inode for the mtime or ctime. When this flag
        is set, on the next mtime or ctime update, the kernel will fetch a
        fine-grained timestamp instead of the usual coarse-grained one.
      
        As POSIX generally mandates that when the mtime changes, the ctime
        must also change, the kernel always stores normalized ctime values, so
        only the first 30 bits of the tv_nsec field are ever used.
      
        Filesystems can opt into this behavior by setting the FS_MGTIME flag in
        the fstype. Filesystems that don't set this flag will continue to use
        coarse-grained timestamps.
      
        Various preparatory changes, fixes and cleanups are included:
      
         - Fixup all relevant places where POSIX requires updating ctime
           together with mtime. This is a wide-range of places and all
           maintainers provided necessary Acks.
      
         - Add new accessors for inode->i_ctime directly and change all
           callers to rely on them. Plain accesses to inode->i_ctime are now
           gone and it is accordingly renamed to inode->__i_ctime and commented
           as requiring accessors.
      
         - Extend generic_fillattr() to pass in a request mask mirroring in a
           sense the statx() uapi. This allows callers to pass in a request
           mask to only get a subset of attributes filled in.
      
         - Rework timestamp updates so it's possible to drop the @now
           parameter from the update_time() inode operation and associated
           helpers.
      
         - Add inode_update_timestamps() and convert all filesystems to it
           removing a bunch of open-coding"
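
The "queried" flag mechanism described above can be sketched as a small userspace model. It is not the kernel implementation: the type and function names are hypothetical, and placing the flag in the top bit of a 32-bit word is an assumption about how "the 31st bit" is numbered. The scheme relies on a normalized nanosecond value staying below 10^9, which fits in 30 bits.

```c
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000u
#define CTIME_QUERIED (UINT32_C(1) << 31) /* hypothetical name for the flag bit */

/* Normalized nanoseconds are below 10^9, so they never reach the flag bit. */
_Static_assert(NSEC_PER_SEC <= (UINT32_C(1) << 30), "nsec must fit in 30 bits");

/* Hypothetical stand-in for the nanoseconds word of an inode's ctime. */
struct ctime_nsec {
    uint32_t raw;
};

/* A reader (for example stat) fetches the value and leaves the "queried" mark. */
static uint32_t query_nsec(struct ctime_nsec *c)
{
    uint32_t nsec = c->raw & ~CTIME_QUERIED;
    c->raw |= CTIME_QUERIED;
    return nsec;
}

/* On update, use the fine-grained value only if someone looked since the
 * previous update; storing a normalized value clears the flag again. */
static void update_nsec(struct ctime_nsec *c, uint32_t coarse, uint32_t fine)
{
    bool queried = (c->raw & CTIME_QUERIED) != 0;
    c->raw = queried ? fine : coarse;
}
```

In this model an update that follows a query pays for the fine-grained clock, while back-to-back updates with no reader in between keep using the cheap coarse value.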
      
      * tag 'v6.6-vfs.ctime' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (107 commits)
        btrfs: convert to multigrain timestamps
        ext4: switch to multigrain timestamps
        xfs: switch to multigrain timestamps
        tmpfs: add support for multigrain timestamps
        fs: add infrastructure for multigrain timestamps
        fs: drop the timespec64 argument from update_time
        xfs: have xfs_vn_update_time gets its own timestamp
        fat: make fat_update_time get its own timestamp
        fat: remove i_version handling from fat_update_time
        ubifs: have ubifs_update_time use inode_update_timestamps
        btrfs: have it use inode_update_timestamps
        fs: drop the timespec64 arg from generic_update_time
        fs: pass the request_mask to generic_fillattr
        fs: remove silly warning from current_time
        gfs2: fix timestamp handling on quota inodes
        fs: rename i_ctime field to __i_ctime
        selinux: convert to ctime accessor functions
        security: convert to ctime accessor functions
        apparmor: convert to ctime accessor functions
        sunrpc: convert to ctime accessor functions
        ...
      615e9583
    • Linus Torvalds's avatar
      Merge tag 'v6.6-vfs.fs_context' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · 84ab1277
      Linus Torvalds authored
      Pull mount API updates from Christian Brauner:
       "This introduces FSCONFIG_CMD_CREATE_EXCL which allows userspace to
        implement something like
      
            $ mount -t ext4 --exclusive /dev/sda /B
      
        which fails if a superblock for the requested filesystem does already
        exist instead of silently reusing an existing superblock.
      
        Without it, in the sequence
      
            $ move-mount -f xfs -o       source=/dev/sda4 /A
            $ move-mount -f xfs -o noacl,source=/dev/sda4 /B
      
        the initial mounter will create a superblock. The second mounter will
        reuse the existing superblock, creating a bind-mount (see [1] for the
        source of the move-mount binary).
      
        The problem is that reusing an existing superblock means all mount
        options other than read-only and read-write will be silently ignored,
        even if they are incompatible requests. For example, the second mount
        has requested no POSIX ACL support, but since the existing superblock
        is reused, POSIX ACL support will remain enabled.
      
        Such silent superblock reuse can easily become a security issue.
      
        After adding support for FSCONFIG_CMD_CREATE_EXCL to mount(8) in
        util-linux this can be fixed:
      
            $ move-mount -f xfs --exclusive -o       source=/dev/sda4 /A
            $ move-mount -f xfs --exclusive -o noacl,source=/dev/sda4 /B
            Device or resource busy | move-mount.c: 300: do_fsconfig: i xfs: reusing existing filesystem not allowed
      
        This requires the new mount API. With the old mount API it would be
        necessary to plumb this through every legacy filesystem's
        file_system_type->mount() method. If they want this feature they are
        most welcome to switch to the new mount API"
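
The exclusive-create request described above can be sketched from userspace roughly as follows. This is a minimal illustration, not the move-mount implementation: the `demo_` function names and `DEMO_` constants are hypothetical, the constants mirror the values of `enum fsconfig_command` in `<linux/mount.h>` (redefined here so the sketch builds against older headers), and the fallback syscall numbers assume an architecture that uses the common numbering. It only creates the superblock; attaching it to a mount point would still need `fsmount()` and `move_mount()`.

```c
/* Sketch: ask the new mount API for a brand-new superblock. */
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Fallback syscall numbers, assumed valid on most architectures. */
#ifndef SYS_fsopen
#define SYS_fsopen 430
#endif
#ifndef SYS_fsconfig
#define SYS_fsconfig 431
#endif

/* Hypothetical local copies of enum fsconfig_command values. */
#define DEMO_FSCONFIG_SET_STRING      1
#define DEMO_FSCONFIG_CMD_CREATE      6
#define DEMO_FSCONFIG_CMD_CREATE_EXCL 8 /* added in Linux 6.6 */

/* Pick the create command: exclusive refuses to reuse a superblock. */
int demo_create_cmd(int exclusive)
{
    return exclusive ? DEMO_FSCONFIG_CMD_CREATE_EXCL
                     : DEMO_FSCONFIG_CMD_CREATE;
}

/*
 * Open a filesystem context, set its source, and create the superblock.
 * Returns the context fd on success, or -1 with errno set on failure.
 */
int demo_open_superblock(const char *fs_type, const char *source,
                         int exclusive)
{
    assert(fs_type != NULL && source != NULL);

    int fd = (int)syscall(SYS_fsopen, fs_type, 0u);
    if (fd < 0)
        return -1;

    if (syscall(SYS_fsconfig, fd, DEMO_FSCONFIG_SET_STRING,
                "source", source, 0) < 0 ||
        syscall(SYS_fsconfig, fd, demo_create_cmd(exclusive),
                NULL, NULL, 0) < 0) {
        int saved = errno; /* keep the original error across close() */
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}
```

Going by the output quoted above, a call such as `demo_open_superblock("xfs", "/dev/sda4", 1)` would be expected to fail with "Device or resource busy" when a superblock for that device already exists, whereas passing `0` would silently reuse it. The call needs mount privileges and a kernel that knows the exclusive command.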
      
      Link: https://github.com/brauner/move-mount-beneath [1]
      Link: https://lore.kernel.org/linux-block/20230704-fasching-wertarbeit-7c6ffb01c83d@brauner
      Link: https://lore.kernel.org/linux-block/20230705-pumpwerk-vielversprechend-a4b1fd947b65@brauner
      Link: https://lore.kernel.org/linux-fsdevel/20230725-einnahmen-warnschilder-17779aec0a97@brauner
      Link: https://lore.kernel.org/lkml/20230824-anzog-allheilmittel-e8c63e429a79@brauner/
      
      * tag 'v6.6-vfs.fs_context' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
        fs: add FSCONFIG_CMD_CREATE_EXCL
        fs: add vfs_cmd_reconfigure()
        fs: add vfs_cmd_create()
        super: remove get_tree_single_reconf()
      84ab1277
    • Thomas Gleixner's avatar
      Merge tag 'irqchip-6.6' of... · 02362c9a
      Thomas Gleixner authored
      Merge tag 'irqchip-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core
      
      Pull irqchip updates from Marc Zyngier:
      
        - Fix for Loongson eiointc init error handling
      
        - Fix a bunch of warnings showing up when -Wmissing-prototypes is set
      
        - A set of fixes for drivers checking for 0 as a potential return
          value from platform_get_irq()
      
        - Another set of patches converting existing code to the use of helpers
          such as of_address_count() and devm_platform_get_and_ioremap_resource()
      
        - A tree-wide cleanup of drivers including of_*.h without discrimination
      
        - Added support for the Amlogic C3 SoCs
      
      Link: https://lore.kernel.org/lkml/20230828091543.4001857-1-maz@kernel.org
      02362c9a
  3. 27 Aug, 2023 1 commit