1. 03 Aug, 2022 1 commit
    • sched, cpuset: Fix dl_cpu_busy() panic due to empty cs->cpus_allowed · b6e8d40d
      Waiman Long authored
      With cgroup v2, the cpuset's cpus_allowed mask can be empty indicating
      that the cpuset will just use the effective CPUs of its parent. So
      cpuset_can_attach() can call task_can_attach() with an empty mask.
      This can lead to cpumask_any_and() returning nr_cpu_ids, causing the call
      to dl_bw_of() to crash due to a percpu value access with an out-of-bounds
      CPU value. For example:
      
      	[80468.182258] BUG: unable to handle page fault for address: ffffffff8b6648b0
      	  :
      	[80468.191019] RIP: 0010:dl_cpu_busy+0x30/0x2b0
      	  :
      	[80468.207946] Call Trace:
      	[80468.208947]  cpuset_can_attach+0xa0/0x140
      	[80468.209953]  cgroup_migrate_execute+0x8c/0x490
      	[80468.210931]  cgroup_update_dfl_csses+0x254/0x270
      	[80468.211898]  cgroup_subtree_control_write+0x322/0x400
      	[80468.212854]  kernfs_fop_write_iter+0x11c/0x1b0
      	[80468.213777]  new_sync_write+0x11f/0x1b0
      	[80468.214689]  vfs_write+0x1eb/0x280
      	[80468.215592]  ksys_write+0x5f/0xe0
      	[80468.216463]  do_syscall_64+0x5c/0x80
      	[80468.224287]  entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      Fix that by using effective_cpus instead. For cgroup v1, effective_cpus
      is the same as cpus_allowed. For v2, effective_cpus is the real cpumask
      to be used by tasks within the cpuset anyway.
      
      Also update task_can_attach()'s 2nd argument name to cs_effective_cpus to
      reflect the change. In addition, a check is added to task_can_attach()
      to guard against the possibility that cpumask_any_and() may return a
      value >= nr_cpu_ids.
      
      Fixes: 7f51412a ("sched/deadline: Fix bandwidth check/update when migrating tasks between exclusive cpusets")
      Signed-off-by: Waiman Long <longman@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Acked-by: Juri Lelli <juri.lelli@redhat.com>
      Link: https://lore.kernel.org/r/20220803015451.2219567-1-longman@redhat.com
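      The failure mode and the added guard can be illustrated with a small
      user-space sketch. It is not the kernel code: the demo_* names, the
      64-bit mask representation and DEMO_NR_CPUS are assumptions made for
      illustration. It only shows why an empty mask yields an index equal to
      the CPU count, and why that index must be rejected before it is used
      to index per-CPU data.

```c
#include <stdint.h>

#define DEMO_NR_CPUS 8u /* hypothetical stand-in for nr_cpu_ids */

/* Hypothetical per-CPU data; valid indices are 0 .. DEMO_NR_CPUS - 1. */
static int demo_percpu_bw[DEMO_NR_CPUS] = {10, 11, 12, 13, 14, 15, 16, 17};

/*
 * Return the first CPU set in both masks, or DEMO_NR_CPUS when the
 * intersection is empty (the analogue of returning nr_cpu_ids).
 */
static unsigned int demo_cpumask_any_and(uint64_t a, uint64_t b)
{
    uint64_t both = a & b;

    for (unsigned int cpu = 0; cpu < DEMO_NR_CPUS; cpu++)
        if (both & (UINT64_C(1) << cpu))
            return cpu;
    return DEMO_NR_CPUS;
}

/*
 * Look up the per-CPU value for any CPU in both masks.
 * Returns 0 on success and -1 when no such CPU exists, instead of
 * indexing demo_percpu_bw[] out of bounds.
 */
static int demo_lookup_bw(uint64_t active_mask, uint64_t cs_effective_cpus,
                          int *bw)
{
    unsigned int cpu = demo_cpumask_any_and(active_mask, cs_effective_cpus);

    if (cpu >= DEMO_NR_CPUS) /* guard: empty intersection */
        return -1;
    *bw = demo_percpu_bw[cpu];
    return 0;
}
```

      Calling demo_lookup_bw() with an empty second mask returns -1 and
      leaves *bw untouched; without the guard, the same call would read one
      element past the end of the array.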
  2. 01 Aug, 2022 30 commits
    • Merge tag 'irq-core-2022-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 9de1f9c8
      Linus Torvalds authored
      Pull irq updates from Thomas Gleixner:
       "Updates for interrupt core and drivers:
      
        Core:
      
         - Fix a few inconsistencies between UP and SMP vs interrupt
           affinities
      
         - Small updates and cleanups all over the place
      
        New drivers:
      
         - LoongArch interrupt controller
      
         - Renesas RZ/G2L interrupt controller
      
        Updates:
      
         - Hotpath optimization for SiFive PLIC
      
         - Workaround for broken PLIC edge triggered interrupts
      
         - Small cleanups and improvements as usual"
      
      * tag 'irq-core-2022-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (52 commits)
        irqchip/mmp: Declare init functions in common header file
        irqchip/mips-gic: Check the return value of ioremap() in gic_of_init()
        genirq: Use for_each_action_of_desc in actions_show()
        irqchip / ACPI: Introduce ACPI_IRQ_MODEL_LPIC for LoongArch
        irqchip: Add LoongArch CPU interrupt controller support
        irqchip: Add Loongson Extended I/O interrupt controller support
        irqchip/loongson-liointc: Add ACPI init support
        irqchip/loongson-pch-msi: Add ACPI init support
        irqchip/loongson-pch-pic: Add ACPI init support
        irqchip: Add Loongson PCH LPC controller support
        LoongArch: Prepare to support multiple pch-pic and pch-msi irqdomain
        LoongArch: Use ACPI_GENERIC_GSI for gsi handling
        genirq/generic_chip: Export irq_unmap_generic_chip
        ACPI: irq: Allow acpi_gsi_to_irq() to have an arch-specific fallback
        APCI: irq: Add support for multiple GSI domains
        LoongArch: Provisionally add ACPICA data structures
        irqdomain: Use hwirq_max instead of revmap_size for NOMAP domains
        irqdomain: Report irq number for NOMAP domains
        irqchip/gic-v3: Fix comment typo
        dt-bindings: interrupt-controller: renesas,rzg2l-irqc: Document RZ/V2L SoC
        ...
    • Merge tag 'timers-core-2022-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · dfea8482
      Linus Torvalds authored
      Pull timer updates from Thomas Gleixner:
       "Timers, timekeeping and related drivers update:
      
        Core:
      
         - Make wait_event_hrtimeout() aware of RT/DL tasks
      
        New drivers:
      
         - R-Car Gen4 timer
      
         - Tegra186 timer
      
         - Mediatek MT6795 CPUXGPT timer
      
        Updates:
      
         - Rework suspend/resume handling in timer drivers so it
           takes inactive clocks into account.
      
         - The usual device tree compatible add ons
      
         - Small fixes and cleanups all over the place"
      
      * tag 'timers-core-2022-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
        wait: Fix __wait_event_hrtimeout for RT/DL tasks
        clocksource/drivers/sun5i: Remove unnecessary (void*) conversions
        dt-bindings: timer: allwinner,sun4i-a10-timer: Add D1 compatible
        dt-bindings: timer: ingenic,tcu: use absolute path to other schema
        clocksource/drivers/sun4i: Remove unnecessary (void*) conversions
        dt-bindings: timer: renesas,cmt: Fix R-Car Gen4 fall-out
        clocksource/drivers/tegra186: Put Kconfig option 'tristate' to 'bool'
        clocksource/drivers/timer-ti-dm: Make driver selection bool for TI K3
        clocksource/drivers/timer-ti-dm: Add compatible for am6 SoCs
        clocksource/drivers/timer-ti-dm: Make timer selectable for ARCH_K3
        clocksource/drivers/timer-ti-dm: Move inline functions to driver for am6
        clocksource/drivers/sh_cmt: Add R-Car Gen4 support
        dt-bindings: timer: renesas,cmt: R-Car V3U is R-Car Gen4
        dt-bindings: timer: renesas,cmt: Add r8a779f0 and generic Gen4 CMT support
        clocksource/drivers/timer-microchip-pit64b: Fix compilation warnings
        clocksource/drivers/timer-microchip-pit64b: Use mchp_pit64b_{suspend, resume}
        clocksource/drivers/timer-microchip-pit64b: Remove suspend/resume ops for ce
        thermal/drivers/rcar_gen3_thermal: Add r8a779f0 support
        clocksource/drivers/timer-mediatek: Implement CPUXGPT timers
        dt-bindings: timer: mediatek: Add CPUX System Timer and MT6795 compatible
        ...
    • Merge tag 'perf-core-2022-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 63e6053a
      Linus Torvalds authored
      Pull perf events updates from Ingo Molnar:
      
       - Fix Intel Alder Lake PEBS memory access latency & data source
         profiling info bugs.
      
       - Use Intel large-PEBS hardware feature in more circumstances, to
         reduce PMI overhead & reduce sampling data.
      
       - Extend the lost-sample profiling output with the PERF_FORMAT_LOST ABI
         variant, which tells tooling the exact number of samples lost.
      
       - Add new IBS register bits definitions.
      
       - AMD uncore events: Add PerfMonV2 DF (Data Fabric) enhancements.
      
      * tag 'perf-core-2022-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf/x86/ibs: Add new IBS register bits into header
        perf/x86/intel: Fix PEBS data source encoding for ADL
        perf/x86/intel: Fix PEBS memory access info encoding for ADL
        perf/core: Add a new read format to get a number of lost samples
        perf/x86/amd/uncore: Add PerfMonV2 RDPMC assignments
        perf/x86/amd/uncore: Add PerfMonV2 DF event format
        perf/x86/amd/uncore: Detect available DF counters
        perf/x86/amd/uncore: Use attr_update for format attributes
        perf/x86/amd/uncore: Use dynamic events array
        x86/events/intel/ds: Enable large PEBS for PERF_SAMPLE_WEIGHT_TYPE
    • Merge tag 'locking-core-2022-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 22a39c3d
      Linus Torvalds authored
      Pull locking updates from Ingo Molnar:
       "This was a fairly quiet cycle for the locking subsystem:
      
         - lockdep: Fix a handful of the more complex lockdep_init_map_*()
           primitives that can lose the lock_type & cause false reports. No
           such mishap was observed in the wild.
      
         - jump_label improvements: simplify the cross-arch support of initial
           NOP patching by making it arch-specific code (used on MIPS only),
           and remove the s390 initial NOP patching that was superfluous"
      
      * tag 'locking-core-2022-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        locking/lockdep: Fix lockdep_init_map_*() confusion
        jump_label: make initial NOP patching the special case
        jump_label: mips: move module NOP patching into arch code
        jump_label: s390: avoid pointless initial NOP patching
    • Merge tag 'sched-core-2022-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · b167fdff
      Linus Torvalds authored
      Pull scheduler updates from Ingo Molnar:
      "Load-balancing improvements:
      
         - Improve NUMA balancing on AMD Zen systems for affine workloads.
      
         - Improve the handling of reduced-capacity CPUs in load-balancing.
      
         - Energy Model improvements: fix & refine all the energy fairness
           metrics (PELT), and remove the conservative threshold requiring 6%
           energy savings to migrate a task. Doing this improves power
           efficiency for most workloads, and also increases the reliability
           of energy-efficiency scheduling.
      
         - Optimize/tweak select_idle_cpu() to spend (much) less time
           searching for an idle CPU on overloaded systems. There are reports of
           several milliseconds spent there on large systems with large
           workloads ...
      
           [ Since the search logic changed, there might be behavioral side
             effects. ]
      
         - Improve NUMA imbalance behavior. On certain systems with spare
           capacity, initial placement of tasks is non-deterministic, and such
           an artificial placement imbalance can persist for a long time,
           hurting (and sometimes helping) performance.
      
           The fix is to make fork-time task placement consistent with runtime
           NUMA balancing placement.
      
           Note that some performance regressions were reported against this,
           caused by workloads that are not memory bandwidth limited, which
           benefit from the artificial locality of the placement bug(s). Mel
           Gorman's conclusion, with which we concur, was that consistency is
           better than random workload benefits from non-deterministic bugs:
      
              "Given there is no crystal ball and it's a tradeoff, I think
               it's better to be consistent and use similar logic at both fork
               time and runtime even if it doesn't have universal benefit."
      
         - Improve core scheduling by fixing a bug in
           sched_core_update_cookie() that caused unnecessary forced idling.
      
         - Improve wakeup-balancing by allowing same-LLC wakeup of idle CPUs
           for newly woken tasks.
      
         - Fix a newidle balancing bug that introduced unnecessary wakeup
           latencies.
      
        ABI improvements/fixes:
      
         - Do not check capabilities and do not issue capability check denial
           messages when a scheduler syscall doesn't require privileges. (Such
           as increasing niceness.)
      
         - Add forced-idle accounting to cgroups too.
      
         - Fix/improve the RSEQ ABI to not just silently accept unknown flags.
           (No existing tooling is known to have learned to rely on the
           previous behavior.)
      
         - Deprecate the (unused) RSEQ_CS_FLAG_NO_RESTART_ON_* flags.
      
        Optimizations:
      
         - Optimize & simplify leaf_cfs_rq_list()
      
         - Micro-optimize set_nr_{and_not,if}_polling() via try_cmpxchg().
      
        Misc fixes & cleanups:
      
         - Fix the RSEQ self-tests on RISC-V and Glibc 2.35 systems.
      
         - Fix a full-NOHZ bug that can in some cases result in the tick not
           being re-enabled when the last SCHED_RT task is gone from a
           runqueue but there are still SCHED_OTHER tasks around.
      
         - Various PREEMPT_RT related fixes.
      
         - Misc cleanups & smaller fixes"
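      The try_cmpxchg() micro-optimization mentioned above relies on the
      compare-and-exchange primitive writing the current value back into
      the caller's "expected" variable on failure, so the retry loop needs
      no separate reload. C11's atomic_compare_exchange_weak() behaves the
      same way, which allows a user-space sketch of the pattern. The flag
      names and the function below are hypothetical and only loosely
      modelled on the "set need-resched if the CPU is polling" idea; this
      is not the kernel implementation.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define DEMO_POLLING      (1u << 0) /* hypothetical: CPU is polling its flags */
#define DEMO_NEED_RESCHED (1u << 1) /* hypothetical: reschedule requested */

/*
 * Set DEMO_NEED_RESCHED only while DEMO_POLLING is set.
 * Returns true if the flag is (now) set on a polling CPU,
 * false if the CPU is not polling and needs another kind of kick.
 */
static bool demo_set_nr_if_polling(atomic_uint *flags)
{
    unsigned int val = atomic_load(flags);

    for (;;) {
        if (!(val & DEMO_POLLING))
            return false;
        if (val & DEMO_NEED_RESCHED)
            return true;
        /*
         * On failure, val is overwritten with the current contents of
         * *flags, so the loop re-checks fresh state without reloading.
         */
        if (atomic_compare_exchange_weak(flags, &val,
                                         val | DEMO_NEED_RESCHED))
            return true;
    }
}
```

      Compared with a loop that calls a value-returning cmpxchg and then
      compares the result against the old value by hand, this form drops
      one comparison and one reload per iteration.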
      
      * tag 'sched-core-2022-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (32 commits)
        rseq: Kill process when unknown flags are encountered in ABI structures
        rseq: Deprecate RSEQ_CS_FLAG_NO_RESTART_ON_* flags
        sched/core: Fix the bug that task won't enqueue into core tree when update cookie
        nohz/full, sched/rt: Fix missed tick-reenabling bug in dequeue_task_rt()
        sched/core: Always flush pending blk_plug
        sched/fair: fix case with reduced capacity CPU
        sched/core: Use try_cmpxchg in set_nr_{and_not,if}_polling
        sched/core: add forced idle accounting for cgroups
        sched/fair: Remove the energy margin in feec()
        sched/fair: Remove task_util from effective utilization in feec()
        sched/fair: Use the same cpumask per-PD throughout find_energy_efficient_cpu()
        sched/fair: Rename select_idle_mask to select_rq_mask
        sched, drivers: Remove max param from effective_cpu_util()/sched_cpu_util()
        sched/fair: Decay task PELT values during wakeup migration
        sched/fair: Provide u64 read for 32-bits arch helper
        sched/fair: Introduce SIS_UTIL to search idle CPU based on sum of util_avg
        sched: only perform capability check on privileged operation
        sched: Remove unused function group_first_cpu()
        sched/fair: Remove redundant word " *"
        selftests/rseq: check if libc rseq support is registered
        ...
    • Merge tag 'slab-for-5.20_or_6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab · 0dd1cabe
      Linus Torvalds authored
      Pull slab updates from Vlastimil Babka:
      
       - An addition of 'accounted' flag to slab allocation tracepoints to
         indicate memcg_kmem accounting, by Vasily
      
       - An optimization of memcg handling in freeing paths, by Muchun
      
       - Various smaller fixes and cleanups
      
      * tag 'slab-for-5.20_or_6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
        mm/slab_common: move generic bulk alloc/free functions to SLOB
        mm/sl[au]b: use own bulk free function when bulk alloc failed
        mm: slab: optimize memcg_slab_free_hook()
        mm/tracing: add 'accounted' entry into output of allocation tracepoints
        tools/vm/slabinfo: Handle files in debugfs
        mm/slub: Simplify __kmem_cache_alias()
        mm, slab: fix bad alignments
    • Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · 0cec3f24
      Linus Torvalds authored
      Pull arm64 updates from Will Deacon:
       "Highlights include a major rework of our kPTI page-table rewriting
        code (which makes it both more maintainable and considerably faster in
        the cases where it is required) as well as significant changes to our
        early boot code to reduce the need for data cache maintenance and
        greatly simplify the KASLR relocation dance.
      
        Summary:
      
         - Remove unused generic cpuidle support (replaced by PSCI version)
      
         - Fix documentation describing the kernel virtual address space
      
         - Handling of some new CPU errata in Arm implementations
      
         - Rework of our exception table code in preparation for handling
           machine checks (i.e. RAS errors) more gracefully
      
         - Switch over to the generic implementation of ioremap()
      
         - Fix lockdep tracking in NMI context
      
         - Instrument our memory barrier macros for KCSAN
      
         - Rework of the kPTI G->nG page-table repainting so that the MMU
           remains enabled and the boot time is no longer slowed to a crawl
           for systems which require the late remapping
      
         - Enable support for direct swapping of 2MiB transparent huge-pages
           on systems without MTE
      
         - Fix handling of MTE tags when allocating new pages with HW KASAN
      
         - Expose the SMIDR register to userspace via sysfs
      
         - Continued rework of the stack unwinder, particularly improving the
           behaviour under KASAN
      
         - More repainting of our system register definitions to match the
           architectural terminology
      
         - Improvements to the layout of the vDSO objects
      
         - Support for allocating additional bits of HWCAP2 and exposing
           FEAT_EBF16 to userspace on CPUs that support it
      
         - Considerable rework and optimisation of our early boot code to
           reduce the need for cache maintenance and avoid jumping in and out
           of the kernel when handling relocation under KASLR
      
         - Support for disabling SVE and SME support on the kernel
           command-line
      
         - Support for the Hisilicon HNS3 PMU
      
         - Miscellaneous cleanups, trivial updates and minor fixes"
      
      * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (136 commits)
        arm64: Delay initialisation of cpuinfo_arm64::reg_{zcr,smcr}
        arm64: fix KASAN_INLINE
        arm64/hwcap: Support FEAT_EBF16
        arm64/cpufeature: Store elf_hwcaps as a bitmap rather than unsigned long
        arm64/hwcap: Document allocation of upper bits of AT_HWCAP
        arm64: enable THP_SWAP for arm64
        arm64/mm: use GENMASK_ULL for TTBR_BADDR_MASK_52
        arm64: errata: Remove AES hwcap for COMPAT tasks
        arm64: numa: Don't check node against MAX_NUMNODES
        drivers/perf: arm_spe: Fix consistency of SYS_PMSCR_EL1.CX
        perf: RISC-V: Add of_node_put() when breaking out of for_each_of_cpu_node()
        docs: perf: Include hns3-pmu.rst in toctree to fix 'htmldocs' WARNING
        arm64: kasan: Revert "arm64: mte: reset the page tag in page->flags"
        mm: kasan: Skip page unpoisoning only if __GFP_SKIP_KASAN_UNPOISON
        mm: kasan: Skip unpoisoning of user pages
        mm: kasan: Ensure the tags are visible before the tag in page->flags
        drivers/perf: hisi: add driver for HNS3 PMU
        drivers/perf: hisi: Add description for HNS3 PMU driver
        drivers/perf: riscv_pmu_sbi: perf format
        perf/arm-cci: Use the bitmap API to allocate bitmaps
        ...
    • Merge tag 'm68k-for-v5.20-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k · a82c58cf
      Linus Torvalds authored
      Pull m68k updates from Geert Uytterhoeven:
      
       - Use RNG seed from bootinfo block on virt platform
      
       - defconfig updates
      
       - Minor fixes and improvements
      
      * tag 'm68k-for-v5.20-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
        m68k: defconfig: Update defconfigs for v5.19-rc1
        m68k: Add common forward declaration for show_registers()
        m68k: mac: Remove forward declaration for mac_nmi_handler()
        m68k: virt: Fix missing platform_device_unregister() on error in virt_platform_init()
        m68k: virt: Use RNG seed from bootinfo block
        m68k: bitops: Change __fls to return and accept unsigned long
        m68k: Kconfig.machine: Add endif comment
        m68k: Kconfig.debug: Replace single quotes
        m68k: Kconfig.cpu: Fix indentation and add endif comments
        m68k: q40: Align '*' in comments
        m68k: sun3: Use __func__ to get function's name in an output message
        m68k: mac: Fix typos in comments
        m68k: virt: Kconfig minor fixes
    • Merge tag 'x86_kdump_for_v6.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 60ee49fa
      Linus Torvalds authored
      Pull x86 kdump updates from Borislav Petkov:
      
       - Add the ability to pass an RNG seed to the kernel early from the boot
         loader
      
       - Add the ability to pass the IMA measurement of kernel and bootloader
         to the kexec-ed kernel
      
      * tag 'x86_kdump_for_v6.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/setup: Use rng seeds from setup_data
        x86/kexec: Carry forward IMA measurement log on kexec
    • Merge tag 'x86_build_for_v6.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 8b705452
      Linus Torvalds authored
      Pull x86 build updates from Borislav Petkov:
      
       - Fix stack protector builds when cross compiling with Clang
      
       - Other Kbuild improvements and fixes
      
      * tag 'x86_build_for_v6.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/purgatory: Omit use of bin2c
        x86/purgatory: Hard-code obj-y in Makefile
        x86/build: Remove unused OBJECT_FILES_NON_STANDARD_test_nx.o
        x86/Kconfig: Fix CONFIG_CC_HAS_SANE_STACKPROTECTOR when cross compiling with clang
    • Merge tag 'x86_core_for_v6.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · ecf9b7bf
      Linus Torvalds authored
      Pull x86 core updates from Borislav Petkov:
      
       - Have invalid MSR accesses warnings appear only once after a
         pr_warn_once() change broke that
      
       - Simplify {JMP,CALL}_NOSPEC and let the objtool retpoline patching
         infra take care of them instead of having unreadable alternative
         macros there
      
      * tag 'x86_core_for_v6.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/extable: Fix ex_handler_msr() print condition
        x86,nospec: Simplify {JMP,CALL}_NOSPEC
    • Merge tag 'x86_misc_for_v6.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 98b1783d
      Linus Torvalds authored
      Pull misc x86 updates from Borislav Petkov:
      
       - Add a bunch of PCI IDs for new AMD CPUs and use them in k10temp
      
       - Free the pmem platform device on the registration error path
      
      * tag 'x86_misc_for_v6.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        hwmon: (k10temp): Add support for new family 17h and 19h models
        x86/amd_nb: Add AMD PCI IDs for SMN communication
        x86/pmem: Fix platform-device leak in error path
    • Merge tag 'x86_cpu_for_v6.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 42efa5e3
      Linus Torvalds authored
      Pull x86 cpu updates from Borislav Petkov:
      
       - Remove the vendor check when selecting MWAIT as the default idle
         state
      
       - Respect idle=nomwait when supplied on the kernel cmdline
      
       - Two small cleanups
      
      * tag 'x86_cpu_for_v6.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/cpu: Use MSR_IA32_MISC_ENABLE constants
        x86: Fix comment for X86_FEATURE_ZEN
        x86: Remove vendor checks from prefer_mwait_c1_over_halt
        x86: Handle idle=nomwait cmdline properly for x86_idle
    • Merge tag 'x86_fpu_for_v6.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 650ea1f6
      Linus Torvalds authored
      Pull x86 fpu update from Borislav Petkov:
      
       - Add machinery to initialize AMX register state in order for
         AMX-capable CPUs to be able to enter deeper low-power state
      
      * tag 'x86_fpu_for_v6.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        intel_idle: Add a new flag to initialize the AMX state
        x86/fpu: Add a helper to prepare AMX state for low-power CPU idle
    • Merge tag 'x86_mm_for_v6.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 92598ae2
      Linus Torvalds authored
      Pull x86 mm updates from Borislav Petkov:
      
       - Rename a PKRU macro to make more sense when reading the code
      
       - Update pkeys documentation
      
       - Avoid reading contended mm's TLB generation var if not absolutely
         necessary along with fixing a case where arch_tlbbatch_flush()
         doesn't adhere to the generation scheme and thus violates the
         conditions for the above avoidance.
      
      * tag 'x86_mm_for_v6.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/mm/tlb: Ignore f->new_tlb_gen when zero
        x86/pkeys: Clarify PKRU_AD_KEY macro
        Documentation/protection-keys: Clean up documentation for User Space pkeys
        x86/mm/tlb: Avoid reading mm_tlb_gen when possible
    • Merge tag 'x86_cleanups_for_v6.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 94e37e84
      Linus Torvalds authored
      Pull x86 cleanup from Borislav Petkov:
      
       - A single CONFIG_ symbol correction in a comment
      
      * tag 'x86_cleanups_for_v6.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/mm: Refer to the intended config STRICT_DEVMEM in a comment
    • Merge tag 'x86_vmware_for_v6.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · dbc1f5a9
      Linus Torvalds authored
      Pull x86 vmware cleanup from Borislav Petkov:
      
       - A single statement simplification by using the BIT() macro
      
      * tag 'x86_vmware_for_v6.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/vmware: Use BIT() macro for shifting
    • Merge tag 'ras_core_for_v6.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 296d3b3e
      Linus Torvalds authored
      Pull RAS update from Borislav Petkov:
       "A single RAS change:
      
         - Probe whether hardware error injection (direct MSR writes) is
           possible when injecting errors on AMD platforms. In some cases, the
           platform could prohibit those"
      
      * tag 'ras_core_for_v6.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/mce: Check whether writes to MCA_STATUS are getting ignored
    • Merge tag 'fs.idmapped.overlay.acl.v5.20' of... · 0fac198d
      Linus Torvalds authored
      Merge tag 'fs.idmapped.overlay.acl.v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux
      
      Pull acl updates from Christian Brauner:
       "Last cycle we introduced support for mounting overlayfs on top of
        idmapped mounts. While looking into additional testing we realized
        that posix acls don't really work correctly with stacking filesystems
        on top of idmapped layers.
      
        We already knew what the fix was but it would require work that is
        more suitable for the merge window so we turned off posix acls for
        v5.19 for overlayfs on top of idmapped layers with Miklos routing my
        patch upstream in 72a8e05d ("Merge tag 'ovl-fixes-5.19-rc7' [..]").
      
        This contains the work to support posix acls for overlayfs on top of
        idmapped layers. Since the posix acl fixes should use the new
        vfs{g,u}id_t work the associated branch has been merged in. (We sent a
        pull request for this earlier.)
      
        We've also pulled in Miklos' pull request containing my patch to turn
        off posix acls on top of idmapped layers. This allowed us to avoid
        rebasing the branch which we didn't like because we were already at
        rc7 by then. Merging it in allows this branch to first fix posix acls
        and then to cleanly revert the temporary fix it brought in by commit
        4a47c638 ("ovl: turn of SB_POSIXACL with idmapped layers
        temporarily").
      
        The last patch in this series adds Seth Forshee as a co-maintainer for
        idmapped mounts. Seth has been integral to all of this work and is
        also the main architect behind the filesystem idmapping work which
        ultimately made filesystems such as FUSE and overlayfs available in
        containers. He continues to be active in both development and review.
        I'm very happy he decided to help and he has my full trust. This
        increases the bus factor which is always great for work like this. I'm
        honestly very excited about this because I think in general we don't
        do great in the bringing on new maintainers department"
      
      For more explanations of the ACL issues, see
      
        https://lore.kernel.org/all/20220801145520.1532837-1-brauner@kernel.org/
      
      * tag 'fs.idmapped.overlay.acl.v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
        Add Seth Forshee as co-maintainer for idmapped mounts
        Revert "ovl: turn of SB_POSIXACL with idmapped layers temporarily"
        ovl: handle idmappings in ovl_get_acl()
        acl: make posix_acl_clone() available to overlayfs
        acl: port to vfs{g,u}id_t
        acl: move idmapped mount fixup into vfs_{g,s}etxattr()
        mnt_idmapping: add vfs[g,u]id_into_k[g,u]id()
    • Merge tag 'fs.idmapped.vfsuid.v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux · bdfae5ce
      Linus Torvalds authored
      Pull fs idmapping updates from Christian Brauner:
       "This introduces the new vfs{g,u}id_t types we agreed on. Similar to
        k{g,u}id_t the new types are just simple wrapper structs around
        regular {g,u}id_t types.
      
        They establish a type safety boundary in the VFS for idmapped
        mounts, preventing confusion between {g,u}ids mapped into an idmapped
        mount and {g,u}ids mapped into the caller's or the filesystem's
        idmapping.
      
        An initial set of helpers is introduced that allows operating on
        vfs{g,u}id_t types. We will remove all references to non-type safe
        idmapped mounts helpers in the very near future. The patches do
        already exist.
      
        This converts the core attribute changing codepaths which become
        significantly easier to reason about because of this change.
      
        Just a few highlights here as the patches give detailed overviews of
        what is happening in the commit messages:
      
         - The kernel internal struct iattr contains type safe vfs{g,u}id_t
           values clearly communicating that these values have to take a given
           mount's idmapping into account.
      
         - The ownership values placed in struct iattr to change ownership are
           identical for idmapped and non-idmapped mounts going forward. This
           also simplifies stacking filesystems such as overlayfs that
           change attributes. In other words, they always represent the values.
      
         - Instead of open coding checks for whether ownership changes have
           been requested and an actual update of the inode is required we now
           have small static inline wrappers that abstract this logic away
           removing a lot of code duplication from individual filesystems that
           all open-coded the same checks"
      
      * tag 'fs.idmapped.vfsuid.v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
        mnt_idmapping: align kernel doc and parameter order
        mnt_idmapping: use new helpers in mapped_fs{g,u}id()
        fs: port HAS_UNMAPPED_ID() to vfs{g,u}id_t
        mnt_idmapping: return false when comparing two invalid ids
        attr: fix kernel doc
        attr: port attribute changes to new types
        security: pass down mount idmapping to setattr hook
        quota: port quota helpers mount ids
        fs: port to iattr ownership update helpers
        fs: introduce tiny iattr ownership update helpers
        fs: use mount types in iattr
        fs: add two type safe mapping helpers
        mnt_idmapping: add vfs{g,u}id_t
      bdfae5ce
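The wrapper-struct idea behind vfs{g,u}id_t in the merge above can be illustrated with a minimal userspace sketch. Every name ending in `_sketch` is hypothetical, and the fixed-offset idmapping is an assumption made only for illustration; the kernel's real types and helpers differ. Because the two id kinds are distinct struct types, passing one where the other is expected fails to compile.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins; not the kernel's definitions. */
typedef uint32_t raw_uid_sketch;

/* id expressed in the caller's or the filesystem's idmapping */
typedef struct { raw_uid_sketch val; } kuid_sketch_t;
/* id as seen through an idmapped mount */
typedef struct { raw_uid_sketch val; } vfsuid_sketch_t;

#define INVALID_ID_SKETCH ((raw_uid_sketch)-1)

/* Assumed toy idmapping: the mount shifts every id by a fixed offset. */
struct mnt_idmap_sketch { raw_uid_sketch offset; };

/* Map a kuid into the mount; invalid ids stay invalid. */
static inline vfsuid_sketch_t make_vfsuid_sketch(const struct mnt_idmap_sketch *map,
                                                 kuid_sketch_t kuid)
{
    vfsuid_sketch_t v = {
        kuid.val == INVALID_ID_SKETCH ? INVALID_ID_SKETCH : kuid.val + map->offset
    };
    return v;
}

static inline bool vfsuid_valid_sketch(vfsuid_sketch_t v)
{
    return v.val != INVALID_ID_SKETCH;
}

/* Two invalid ids never compare equal, in the spirit of
 * "mnt_idmapping: return false when comparing two invalid ids". */
static inline bool vfsuid_eq_sketch(vfsuid_sketch_t a, vfsuid_sketch_t b)
{
    return vfsuid_valid_sketch(a) && a.val == b.val;
}
```

Calling `vfsuid_eq_sketch()` with a `kuid_sketch_t` argument is rejected by the compiler, which is the kind of confusion the new types are meant to prevent.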
    • Linus Torvalds's avatar
      Merge tag 'filelock-v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux · e6a7cf70
      Linus Torvalds authored
      Pull file locking updates from Jeff Layton:
       "Just a couple of flock() patches from Kuniyuki Iwashima.
      
        The main change is that this moves a file_lock allocation from the
        slab to the stack"
      
      * tag 'filelock-v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux:
        fs/lock: Rearrange ops in flock syscall.
        fs/lock: Don't allocate file_lock in flock_make_lock().
      e6a7cf70
    • Linus Torvalds's avatar
      Merge tag 'erofs-for-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs · e88745dc
      Linus Torvalds authored
      Pull erofs updates from Gao Xiang:
       "First of all, we'd like to add Yue Hu and Jeffle Xu as two new
        reviewers. Thank them for spending time working on EROFS!
      
        There is no major feature outstanding in this cycle, mainly a patchset
        I worked on to prepare for rolling hash deduplication and folios for
        compressed data as the next big features. It kills the unneeded
        PG_error flag dependency as well.
      
        Apart from that, there are bugfixes and cleanups as always. Details
        are listed below:
      
         - Add Yue Hu and Jeffle Xu as reviewers
      
         - Add the missing wake_up when updating lzma streams
      
         - Avoid consecutive detection for Highmem memory
      
         - Prepare for multi-reference pclusters and get rid of PG_error
      
         - Fix ctx->pos update for NFS export
      
         - minor cleanups"
      
      * tag 'erofs-for-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs: (23 commits)
        erofs: update ctx->pos for every emitted dirent
        erofs: get rid of the leftover PAGE_SIZE in dir.c
        erofs: get rid of erofs_prepare_dio() helper
        erofs: introduce multi-reference pclusters (fully-referenced)
        erofs: record the longest decompressed size in this round
        erofs: introduce z_erofs_do_decompressed_bvec()
        erofs: try to leave (de)compressed_pages on stack if possible
        erofs: introduce struct z_erofs_decompress_backend
        erofs: get rid of `z_pagemap_global'
        erofs: clean up `enum z_erofs_collectmode'
        erofs: get rid of `enum z_erofs_page_type'
        erofs: rework online page handling
        erofs: switch compressed_pages[] to bufvec
        erofs: introduce `z_erofs_parse_in_bvecs'
        erofs: drop the old pagevec approach
        erofs: introduce bufvec to store decompressed buffers
        erofs: introduce `z_erofs_parse_out_bvecs()'
        erofs: clean up z_erofs_collector_begin()
        erofs: get rid of unneeded `inode', `map' and `sb'
        erofs: avoid consecutive detection for Highmem memory
        ...
      e88745dc
    • Linus Torvalds's avatar
      Merge tag 'fsnotify_for_v5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs · bec14d79
      Linus Torvalds authored
      Pull fsnotify updates from Jan Kara:
      
       - support for FAN_MARK_IGNORE which untangles some of the not well
         defined corner cases with fanotify ignore masks
      
       - small cleanups
      
      * tag 'fsnotify_for_v5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
        fsnotify: Fix comment typo
        fanotify: introduce FAN_MARK_IGNORE
        fanotify: cleanups for fanotify_mark() input validations
        fanotify: prepare for setting event flags in ignore mask
        fs: inotify: Fix typo in inotify comment
      bec14d79
    • Linus Torvalds's avatar
      Merge tag 'fs_for_v5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs · af07685b
      Linus Torvalds authored
      Pull ext2 and reiserfs updates from Jan Kara:
       "A fix for ext2 handling of a corrupted fs image and cleanups in ext2
        and reiserfs"
      
      * tag 'fs_for_v5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
        ext2: Add more validity checks for inode counts
        fs/reiserfs/inode: remove dead code in _get_block_create_0()
        fs/ext2: replace ternary operator with min_t()
      af07685b
    • Linus Torvalds's avatar
      Merge tag 'dlm-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm · eb43bbac
      Linus Torvalds authored
      Pull dlm updates from David Teigland:
      
       - Delay the cleanup of interrupted posix lock requests until the user
         space result arrives. Previously, the immediate cleanup would lead to
         extraneous warnings when the result arrived.
      
       - Tracepoint improvements, e.g. adding the lock resource name.
      
       - Delay the completion of lockspace creation until one full recovery
         cycle has completed. This allows more error cases to be returned to
         the caller.
      
       - Remove warnings from the locking layer about delayed network replies.
         The recently added midcomms warnings are much more useful.
      
       - Begin the process of deprecating two unused lock-timeout-related
         features. These features now require enabling via a Kconfig option,
         and enabling them triggers deprecation warnings. We expect to remove
         the code in v6.2.
      
      * tag 'dlm-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm:
        fs: dlm: move kref_put assert for lkb structs
        fs: dlm: don't use deprecated timeout features by default
        fs: dlm: add deprecation Kconfig and warnings for timeouts
        fs: dlm: remove timeout from dlm_user_adopt_orphan
        fs: dlm: remove waiter warnings
        fs: dlm: fix grammar in lowcomms output
        fs: dlm: add comment about lkb IFL flags
        fs: dlm: handle recovery result outside of ls_recover
        fs: dlm: make new_lockspace() wait until recovery completes
        fs: dlm: call dlm_lsop_recover_prep once
        fs: dlm: update comments about recovery and membership handling
        fs: dlm: add resource name to tracepoints
        fs: dlm: remove additional dereference of lksb
        fs: dlm: change ast and bast trace order
        fs: dlm: change posix lock sigint handling
        fs: dlm: use dlm_plock_info for do_unlock_close
        fs: dlm: change plock interrupted message to debug again
        fs: dlm: add pid to debug log
        fs: dlm: plock use list_first_entry
      eb43bbac
    • Alexander Aring's avatar
      fs: dlm: move kref_put assert for lkb structs · 95858989
      Alexander Aring authored
      The unhold_lkb() function decrements the lock's kref, and
      asserts that the ref count was not the final one.  Use the
      kref_put release function (which should not be called) to
      call the assert, rather than doing the assert based on the
      kref_put return value.  Using kill_lkb() as the release
      function doesn't make sense if we only want to assert.
      Signed-off-by: Alexander Aring <aahringo@redhat.com>
      Signed-off-by: David Teigland <teigland@redhat.com>
      95858989
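The pattern described in the commit above, where the release callback itself flags an unexpected final put, can be sketched in userspace C. The names below (`ref_sketch`, `ref_put_sketch`, `unhold_sketch`) are hypothetical stand-ins rather than the kernel's kref or DLM API, and the callback here only counts violations where kernel code would assert.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a kref-style counter. */
struct ref_sketch { int count; };

/* Drop one reference; run release() if it was the last one.
 * Returns true when the release callback was invoked. */
static bool ref_put_sketch(struct ref_sketch *r,
                           void (*release)(struct ref_sketch *))
{
    if (--r->count == 0) {
        release(r);
        return true;
    }
    return false;
}

/* Counts how often the "should never happen" release path ran. */
static int unexpected_final_puts;

/* Release callback that only records the violation; it frees nothing. */
static void release_must_not_run(struct ref_sketch *r)
{
    (void)r;
    unexpected_final_puts++;
}

/* "unhold" path: callers guarantee that another reference still exists,
 * so the check lives in the release callback, not in the return value. */
static void unhold_sketch(struct ref_sketch *r)
{
    ref_put_sketch(r, release_must_not_run);
}
```

Putting the check in the callback keeps the call site free of return-value handling, and avoids passing a real teardown function to a path that is only supposed to assert.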
    • Alexander Aring's avatar
      fs: dlm: don't use deprecated timeout features by default · 6b0afc0c
      Alexander Aring authored
      This patch disables use of the deprecated timeout features unless
      CONFIG_DLM_DEPRECATED_API is set.  The deprecated features
      will be removed in the upcoming kernel release v6.2.
      Signed-off-by: Alexander Aring <aahringo@redhat.com>
      Signed-off-by: David Teigland <teigland@redhat.com>
      6b0afc0c
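Gating a deprecated feature behind a build-time option, as the commit above does with CONFIG_DLM_DEPRECATED_API, might look roughly like the following sketch. The flag value and the function name are hypothetical. The sketch also assumes that the flag is silently dropped when the option is off, which may not match what the kernel actually does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag value for illustration; not taken from kernel headers. */
#define LKF_TIMEOUT_SKETCH 0x00040000u

/* Filter lock-request flags depending on whether the deprecated API
 * was enabled at build time. */
static uint32_t sanitize_lock_flags_sketch(uint32_t flags)
{
#ifdef CONFIG_DLM_DEPRECATED_API
    /* Deprecated API enabled: pass the timeout flag through unchanged. */
    return flags;
#else
    /* Default build: assume the timeout request is simply dropped. */
    return flags & ~LKF_TIMEOUT_SKETCH;
#endif
}
```

Built without `-DCONFIG_DLM_DEPRECATED_API`, the function clears the timeout bit and leaves all other bits untouched.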
    • Alexander Aring's avatar
      fs: dlm: add deprecation Kconfig and warnings for timeouts · 81eeb82f
      Alexander Aring authored
      This patch adds a CONFIG_DLM_DEPRECATED_API Kconfig option
      that must be enabled to use two timeout-related features
      that we intend to remove in kernel v6.2.  Warnings are
      printed if either is enabled and used.  Neither has ever
      been used as far as we know.
      
      . The DLM_LSFL_TIMEWARN lockspace creation flag will be
        removed, along with the associated configfs entry for
        setting the timeout.  Setting the flag and configfs file
        would cause dlm to track how long locks were waiting
        for reply messages.  After a timeout, a kernel message
        would be logged, and a netlink message would be sent
        to userspace.  Recently, midcomms messages have been
        added that produce much better logging about actual
        problems with messages.  No use has ever been found
        for the netlink messages.
      
      . The userspace libdlm API has allowed the DLM_LKF_TIMEOUT
        flag with a timeout value to be set in lock requests.
        The lock request would be cancelled after the timeout.
      Signed-off-by: Alexander Aring <aahringo@redhat.com>
      Signed-off-by: David Teigland <teigland@redhat.com>
      81eeb82f
    • Mathieu Desnoyers's avatar
      rseq: Kill process when unknown flags are encountered in ABI structures · c17a6ff9
      Mathieu Desnoyers authored
      rseq_abi()->flags and rseq_abi()->rseq_cs->flags 29 upper bits are
      currently unused.
      
      The current behavior when those bits are set is to ignore them. This is
      not ideal: once future features start using those flags, user-space
      that fails to validate that the kernel supports them (e.g. with a new
      sys_rseq flags bit) before using them may incorrectly assume that the
      kernel will handle those flags, when in fact they will be silently
      ignored on older kernels.
      
      Validating that unused flags bits are cleared will allow a smoother
      transition when those flags start to be used, by letting applications
      fail early, and obviously, when they attempt to use the new flags on
      an older kernel that does not support them.
      Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Link: https://lkml.kernel.org/r/20220622194617.1155957-2-mathieu.desnoyers@efficios.com
      c17a6ff9
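The validation described in the commit above amounts to rejecting any flags word that has a bit set outside the known set. A minimal sketch follows, assuming that the three low bits are the only defined flags (consistent with "29 upper bits" of a 32-bit word being unused). The macro and function names are hypothetical and are not the kernel's identifiers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: only the three lowest bits carry defined flags. */
#define FLAG_BIT0_SKETCH (1u << 0)
#define FLAG_BIT1_SKETCH (1u << 1)
#define FLAG_BIT2_SKETCH (1u << 2)
#define KNOWN_FLAGS_SKETCH \
    (FLAG_BIT0_SKETCH | FLAG_BIT1_SKETCH | FLAG_BIT2_SKETCH)

/* True when any of the 29 upper bits is set, i.e. the caller relies on
 * a flag that this (older) implementation does not know about. */
static bool has_unknown_flags_sketch(uint32_t flags)
{
    return (flags & ~KNOWN_FLAGS_SKETCH) != 0;
}
```

A caller that sets, say, bit 3 gets an immediate failure on an implementation that predates that bit, instead of having the request silently ignored.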
    • Mathieu Desnoyers's avatar
      rseq: Deprecate RSEQ_CS_FLAG_NO_RESTART_ON_* flags · 0190e419
      Mathieu Desnoyers authored
      The pretty much unused RSEQ_CS_FLAG_NO_RESTART_ON_* flags introduce
      complexity in rseq, and are subtly buggy [1]. Solving those issues
      requires introducing additional complexity in the rseq implementation
      for each supported architecture.
      
      Considering that it complexifies the rseq ABI, I am proposing that we
      deprecate those flags. [2]
      
      So far there appears to be consensus from maintainers of user-space
      projects impacted by this feature that its removal would be a welcome
      simplification. [3]
      
      The deprecation approach proposed here is to issue WARN_ON_ONCE() when
      encountering those flags and kill the offending process with SIGSEGV.
      This should allow us to quickly identify whether anyone yells at us for
      removing this.
      
      Link: https://lore.kernel.org/lkml/20220618182515.95831-1-minhquangbui99@gmail.com/ [1]
      Link: https://lore.kernel.org/lkml/258546133.12151.1655739550814.JavaMail.zimbra@efficios.com/ [2]
      Link: https://lore.kernel.org/lkml/87pmj1enjh.fsf@email.froward.int.ebiederm.org/ [3]
      Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Link: https://lore.kernel.org/lkml/20220622194617.1155957-1-mathieu.desnoyers@efficios.com
      0190e419
  3. 31 Jul, 2022 7 commits
  4. 30 Jul, 2022 2 commits