1. 25 Apr, 2023 17 commits
    • Merge tag 'timers-core-2023-04-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e7989789
      Linus Torvalds authored
      Pull timers and timekeeping updates from Thomas Gleixner:
      
       - Improve the VDSO build time checks to cover all dynamic relocations
      
         VDSO does not allow dynamic relocations, but the build time check is
         incomplete and fragile.
      
         It's based on architectures specifying the relocation types to search
         for and does not handle R_*_NONE relocation entries correctly.
         R_*_NONE relocations are injected by some GNU ld variants if they
         fail to determine the exact .rel[a]/dyn_size to cover trailing zeros.
         R_*_NONE relocations must be ignored by dynamic loaders, so they
         should be ignored in the build time check too.
      
         Remove the architecture specific relocation types to check for and
         validate strictly that no other relocations than R_*_NONE end up in
         the VDSO .so file.
      
       - Prefer signal delivery to the current thread for
         CLOCK_PROCESS_CPUTIME_ID based posix-timers
      
         Such timers prefer to deliver the signal to the main thread of a
         process even if the context in which the timer expires is the current
         task. This has the downside that it might wake up an idle thread.
      
         As there is no requirement or guarantee that the signal has to be
         delivered to the main thread, avoid this by preferring the current
         task if it is part of the thread group which shares sighand.
      
         This not only avoids waking idle threads, it also distributes the
         signal delivery in case of multiple timers firing in the context of
         different threads close to each other better.
      
       - Align the tick period properly (again)
      
         For a long time the tick was starting at CLOCK_MONOTONIC zero, which
         allowed user space applications to either align with the tick or to
         place a periodic computation so that it does not interfere with the
         tick. The alignment of the tick period was more by chance than by
         intention as the tick is set up before a high resolution clocksource
         is installed, i.e. timekeeping is still tick based and the tick
         period advances from there.
      
         The early enablement of sched_clock() broke this alignment as the
         time accumulated by sched_clock() is taken into account when
         timekeeping is initialized. So the base value now(CLOCK_MONOTONIC) is
         no longer a multiple of tick periods, which breaks applications
         which relied on that behaviour.
      
         Cure this by aligning the tick starting point to the next multiple of
         tick periods, i.e. 1000ms/CONFIG_HZ.
      
       - A set of NOHZ fixes and enhancements:
      
           * Cure the concurrent writer race for idle and IO sleeptime
             statistics
      
             The statistic values which are exposed via /proc/stat are updated
             from the CPU local idle exit and remotely by cpufreq, but that
             happens without any form of serialization. As a consequence
             sleeptimes can be accounted twice or worse.
      
             Prevent this by restricting the accumulation writeback to the CPU
             local idle exit and let the remote access compute the accumulated
             value.
      
           * Protect idle/iowait sleep time with a sequence count
      
             Reading idle/iowait sleep time, e.g. from /proc/stat, can race
             with idle exit updates. As a consequence the readout may return
             random values which can even go backwards.
      
             Protect this by a sequence count, which fixes the idle time
             statistics issue, but cannot fix the iowait time problem because
             iowait time accounting races with remote wake ups decrementing
             the remote runqueues nr_iowait counter. The latter is impossible
             to fix, so the only way to deal with that is to document it
             properly and to remove the assertion in the selftest which
             triggers occasionally due to that.
      
           * Restructure struct tick_sched for better cache layout
      
           * Some small cleanups and a better cache layout for struct
             tick_sched
      
       - Implement the missing timer_wait_running() callback for POSIX CPU
         timers
      
         For unknown reasons the introduction of the timer_wait_running()
         callback failed to fix up posix CPU timers, which went unnoticed for
         almost four years.
      
         While initially only targeted to prevent livelocks between a timer
         deletion and the timer expiry function on PREEMPT_RT enabled kernels,
         it turned out that fixing this for mainline is not as trivial as just
         implementing a stub similar to the hrtimer/timer callbacks.
      
         The reason is that for CONFIG_POSIX_CPU_TIMERS_TASK_WORK enabled
         systems there is a livelock issue independent of RT.
      
         CONFIG_POSIX_CPU_TIMERS_TASK_WORK=y moves the expiry of POSIX CPU
         timers out from hard interrupt context to task work, which is handled
         before returning to user space or to a VM. The expiry mechanism moves
         the expired timers to a stack local list head with sighand lock held.
         Once sighand is dropped the task can be preempted and a task which
         wants to delete a timer will spin-wait until the expiry task is
         scheduled back in. In the worst case this will end up in a livelock
         when the preempting task and the expiry task are pinned on the same
         CPU.
      
         The timer wheel has a timer_wait_running() mechanism for RT, which
         uses a per CPU timer-base expiry lock which is held by the expiry
         code and the task waiting for the timer function to complete blocks
         on that lock.
      
         This does not work in the same way for posix CPU timers as there is
         no timer base and expiry for process wide timers can run on any task
         belonging to that process, but the concept of waiting on an expiry
         lock can be used too in a slightly different way.
      
         Add a per task mutex to struct posix_cputimers_work, let the expiry
         task hold it across the expiry function and let the deleting task
         which waits for the expiry to complete block on the mutex.
      
         In the non-contended case this results in an extra
         mutex_lock()/unlock() pair on both sides.
      
         This avoids spin-waiting on a task which is scheduled out, prevents
         the livelock and cures the problem for RT and !RT systems.
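
The tick alignment described above ("the next multiple of tick periods, i.e. 1000ms/CONFIG_HZ") comes down to a round-up. The following is only an arithmetic sketch in Python, not the kernel code; the function names are made up for illustration and the tick period is assumed to be exactly one second divided by HZ.

```python
NSEC_PER_SEC = 1_000_000_000


def tick_period_ns(hz: int) -> int:
    """Length of one tick in nanoseconds for a given CONFIG_HZ value (assumed exact)."""
    return NSEC_PER_SEC // hz


def align_to_next_tick(now_ns: int, hz: int) -> int:
    """Round now_ns up to the next multiple of the tick period.

    A value that already sits on a tick boundary is returned unchanged.
    """
    period = tick_period_ns(hz)
    remainder = now_ns % period
    if remainder == 0:
        return now_ns
    return now_ns + (period - remainder)


if __name__ == "__main__":
    # With HZ=250 the period is 4 ms, so 1.234567 ms is moved up to 4 ms.
    print(align_to_next_tick(1_234_567, 250))
```

With the starting point on such a boundary, every later tick is also a multiple of the period, which is the property the affected applications relied on.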
      
      * tag 'timers-core-2023-04-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        posix-cpu-timers: Implement the missing timer_wait_running callback
        selftests/proc: Assert clock_gettime(CLOCK_BOOTTIME) VS /proc/uptime monotonicity
        selftests/proc: Remove idle time monotonicity assertions
        MAINTAINERS: Remove stale email address
        timers/nohz: Remove middle-function __tick_nohz_idle_stop_tick()
        timers/nohz: Add a comment about broken iowait counter update race
        timers/nohz: Protect idle/iowait sleep time under seqcount
        timers/nohz: Only ever update sleeptime from idle exit
        timers/nohz: Restructure and reshuffle struct tick_sched
        tick/common: Align tick period with the HZ tick.
        selftests/timers/posix_timers: Test delivery of signals across threads
        posix-timers: Prefer delivery of signals to the current thread
        vdso: Improve cmd_vdso_check to check all dynamic relocations
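
The sequence count protection of the idle sleep time mentioned in the NOHZ items can be modelled in a few lines. This is a toy Python model of the reader retry logic only: the class and method names are hypothetical, there is a single writer standing in for the CPU-local idle exit path, and the memory barriers that a real seqcount needs are not represented.

```python
class SleepTimeStats:
    """Toy model of counters guarded by a sequence count."""

    def __init__(self):
        self._seq = 0        # even: stable, odd: a write is in progress
        self._idle_ns = 0
        self._iowait_ns = 0

    def write(self, idle_delta_ns, iowait_delta_ns):
        """Single-writer update, bracketed by two sequence increments."""
        self._seq += 1       # odd: readers must not trust what they see
        self._idle_ns += idle_delta_ns
        self._iowait_ns += iowait_delta_ns
        self._seq += 1       # even again: the update is complete

    def read(self):
        """Return a consistent (idle, iowait) snapshot, retrying on interference."""
        while True:
            start = self._seq
            if start & 1:
                continue     # writer active, try again
            snapshot = (self._idle_ns, self._iowait_ns)
            if self._seq == start:
                return snapshot


if __name__ == "__main__":
    stats = SleepTimeStats()
    stats.write(100, 5)
    stats.write(50, 0)
    print(stats.read())
```

As the pull message notes, this kind of protection only helps where there is a single, well-defined writer; it does not address the iowait accounting race with remote wake-ups.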
    • Merge tag 'irq-core-2023-04-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 3f614ab5
      Linus Torvalds authored
      Pull interrupt updates from Thomas Gleixner:
       "Core:
      
         - Add tracepoints for tasklet callbacks which makes it possible to
           analyze individual tasklet functions instead of guess working from
           the overall duration of tasklet processing
      
         - Ensure that secondary interrupt threads have their affinity
           adjusted correctly
      
        Drivers:
      
         - A large rework of the RISC-V IPI management to prepare for a new
           RISC-V interrupt architecture
      
         - Small fixes and enhancements all over the place
      
         - Removal of support for various obsolete hardware platforms and the
           related code"
      
      * tag 'irq-core-2023-04-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits)
        irqchip/st: Remove stih415/stih416 and stid127 platforms support
        irqchip/gic-v3: Add Rockchip 3588001 erratum workaround
        genirq: Update affinity of secondary threads
        softirq: Add trace points for tasklet entry/exit
        irqchip/loongson-pch-pic: Fix pch_pic_acpi_init calling
        irqchip/loongson-pch-pic: Fix registration of syscore_ops
        irqchip/loongson-eiointc: Fix registration of syscore_ops
        irqchip/loongson-eiointc: Fix incorrect use of acpi_get_vec_parent
        irqchip/loongson-eiointc: Fix returned value on parsing MADT
        irqchip/riscv-intc: Add empty irq_eoi() for chained irq handlers
        RISC-V: Use IPIs for remote icache flush when possible
        RISC-V: Use IPIs for remote TLB flush when possible
        RISC-V: Allow marking IPIs as suitable for remote FENCEs
        RISC-V: Treat IPIs as normal Linux IRQs
        irqchip/riscv-intc: Allow drivers to directly discover INTC hwnode
        RISC-V: Clear SIP bit only when using SBI IPI operations
        irqchip/irq-sifive-plic: Add syscore callbacks for hibernation
        irqchip: Use of_property_read_bool() for boolean properties
        irqchip/bcm-6345-l1: Request memory region
        irqchip/gicv3: Workaround for NVIDIA erratum T241-FABRIC-4
        ...
    • Merge tag 'core-entry-2023-04-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 15bbeec0
      Linus Torvalds authored
      Pull core entry/ptrace update from Thomas Gleixner:
       "Provide a ptrace set/get interface for syscall user dispatch. The main
        purpose is to enable checkpoint/restore (CRIU) to handle processes
        which utilize syscall user dispatch correctly"
      
      * tag 'core-entry-2023-04-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        selftest, ptrace: Add selftest for syscall user dispatch config api
        ptrace: Provide set/get interface for syscall user dispatch
        syscall_user_dispatch: Untag selector address before access_ok()
        syscall_user_dispatch: Split up set_syscall_user_dispatch()
    • Merge tag 'core-debugobjects-2023-04-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 29e95a4b
      Linus Torvalds authored
      Pull core debugobjects update from Thomas Gleixner:
       "A single update to debugobjects:
      
        Prevent a race vs statically initialized objects. Such objects are
        usually not initialized via an init() function. They are special cased
        and detected on first use under the assumption that they are already
        correctly initialized via the static initializer.
      
        This works correctly unless there are two concurrent debug object
        operations on such an object.
      
        The first one detects that the object is not yet tracked and tries to
        establish a tracking object after dropping the debug objects hash
        bucket lock. The concurrent operation does the same. The one which
        wins the race ends up modifying the state of the object which makes
        the other one fail resulting in a bogus debug objects warning.
      
        Prevent this by making the detection of a static object and the
        allocation of a tracking object atomic under the hash bucket lock. So
        the first one to acquire the hash bucket lock will succeed and the
        second one will observe the correct tracking state.
      
        This race existed forever but was only exposed when the timer wheel
        code added a debug_object_assert_init() call outside of the timer base
        locked region. This replaced the previous warning about
        timer::function being NULL which had to be removed when the
        timer_shutdown() mechanics were added"
      
      * tag 'core-debugobjects-2023-04-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        debugobject: Prevent init race with static objects
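
The fix described above, doing the "is this object tracked yet?" check and the allocation of the tracking object under one lock, is the classic cure for a check-then-act race. The sketch below is a Python analogy with hypothetical names; a plain `threading.Lock` stands in for the hash bucket lock and a dictionary for the bucket.

```python
import threading


class StaticObjectTracker:
    """Toy model: detection and tracking of a static object happen atomically."""

    def __init__(self):
        self._lock = threading.Lock()   # stands in for the hash bucket lock
        self._tracked = {}              # object id -> tracking state
        self.allocations = 0            # how many tracking objects were created

    def lookup_or_track_static(self, obj_id):
        # Lookup and allocation share one critical section, so a concurrent
        # caller either creates the tracking state or observes the finished one.
        with self._lock:
            state = self._tracked.get(obj_id)
            if state is None:
                self.allocations += 1
                state = {"state": "initialized", "static": True}
                self._tracked[obj_id] = state
            return state


if __name__ == "__main__":
    tracker = StaticObjectTracker()
    workers = [
        threading.Thread(target=tracker.lookup_or_track_static, args=("timer0",))
        for _ in range(8)
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    print(tracker.allocations)
```

If the lock were dropped between the lookup and the insertion, two callers could both conclude that the object is untracked, which corresponds to the bogus warning described in the pull message.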
    • Merge tag 'x86_sev_for_v6.4_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · bc1bb2a4
      Linus Torvalds authored
      Pull x86 SEV updates from Borislav Petkov:
      
       - Add the necessary glue so that the kernel can run as a confidential
         SEV-SNP vTOM guest on Hyper-V. A vTOM guest basically splits the
         address space in two parts: encrypted and unencrypted. The use case
         being running unmodified guests on the Hyper-V confidential computing
         hypervisor
      
       - Double-buffer messages between the guest and the hardware PSP device
         so that no partial buffers are copied back'n'forth, which prevents
         potential message integrity and leak attacks
      
       - Name the return value the sev-guest driver returns when the hw PSP
         device hasn't been called, explicitly
      
       - Cleanups
      
      * tag 'x86_sev_for_v6.4_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/hyperv: Change vTOM handling to use standard coco mechanisms
        init: Call mem_encrypt_init() after Hyper-V hypercall init is done
        x86/mm: Handle decryption/re-encryption of bss_decrypted consistently
        Drivers: hv: Explicitly request decrypted in vmap_pfn() calls
        x86/hyperv: Reorder code to facilitate future work
        x86/ioremap: Add hypervisor callback for private MMIO mapping in coco VM
        x86/sev: Change snp_guest_issue_request()'s fw_err argument
        virt/coco/sev-guest: Double-buffer messages
        crypto: ccp: Get rid of __sev_platform_init_locked()'s local function pointer
        crypto: ccp - Name -1 return value as SEV_RET_NO_FW_CALL
    • Merge tag 'x86_paravirt_for_v6.4_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · c42b59bf
      Linus Torvalds authored
      Pull x86 paravirt updates from Borislav Petkov:
      
       - Convert a couple of paravirt callbacks to asm to prevent
         '-fzero-call-used-regs' builds from zeroing live registers because
         paravirt hides the CALLs from the compiler so the latter doesn't know
         there's a CALL in the first place
      
       - Merge two paravirt callbacks into one, as their functionality is
         identical
      
      * tag 'x86_paravirt_for_v6.4_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/paravirt: Convert simple paravirt functions to asm
        x86/paravirt: Merge activate_mm() and dup_mmap() callbacks
    • Merge tag 'x86_misc_for_v6.4_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 4a4a28fc
      Linus Torvalds authored
      Pull misc x86 updates from Borislav Petkov:
      
       - Add an x86 hw vulnerabilities section to MAINTAINERS so that the folks
         involved in it can get CCed on patches
      
       - Add some more CPUID leafs to the kcpuid tool and extend its
         functionality to be more useful when grepping for CPUID bits
      
      * tag 'x86_misc_for_v6.4_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        MAINTAINERS: Add x86 hardware vulnerabilities section
        tools/x86/kcpuid: Dump the CPUID function in detailed view
        tools/x86/kcpuid: Update AMD leaf Fn80000001
        tools/x86/kcpuid: Fix avx512bw and avx512lvl fields in Fn00000007
    • Merge tag 'x86_cpu_for_v6.4_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e3420f98
      Linus Torvalds authored
      Pull x86 cpu model updates from Borislav Petkov:
      
       - Add Emerald Rapids to the list of Intel models supporting PPIN
      
       - Finally use a CPUID bit for split lock detection instead of
         enumerating every model
      
       - Make sure automatic IBRS is set on AMD, even though the AP bringup
         code does that now by replicating the MSR which contains the switch
      
      * tag 'x86_cpu_for_v6.4_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/cpu: Add Xeon Emerald Rapids to list of CPUs that support PPIN
        x86/split_lock: Enumerate architectural split lock disable bit
        x86/CPU/AMD: Make sure EFER[AIBRSE] is set
    • Merge tag 'x86_acpi_for_v6.4_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 1699dbeb
      Linus Torvalds authored
      Pull x86 ACPI update from Borislav Petkov:
      
       - Improve code generation in ACPI's global lock's acquisition function
      
      * tag 'x86_acpi_for_v6.4_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/ACPI/boot: Improve __acpi_acquire_global_lock
    • Merge tag 'ras_core_for_v6.4_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · d3464152
      Linus Torvalds authored
      Pull RAS updates from Borislav Petkov:
      
       - Just cleanups and fixes this time around: make threshold_ktype const,
         an objtool fix and use proper size for a bitmap
      
      * tag 'ras_core_for_v6.4_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/MCE/AMD: Use an u64 for bank_map
        x86/mce: Always inline old MCA stubs
        x86/MCE/AMD: Make kobj_type structure constant
    • Merge tag 'edac_updates_for_v6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras · e94ee641
      Linus Torvalds authored
      Pull EDAC updates from Borislav Petkov:
      
       - skx_edac: Fix overflow when decoding 32G DIMM ranks
      
       - i10nm_edac: Add Sierra Forest support
      
       - amd64_edac: Split driver code between legacy and SMCA systems. The
         final goal is adding support for more hw, like GPUs
      
       - The usual minor cleanups and fixes
      
      * tag 'edac_updates_for_v6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: (25 commits)
        EDAC/i10nm: Add Intel Sierra Forest server support
        EDAC/amd64: Fix indentation in umc_determine_edac_cap()
        EDAC/altera: Remove MODULE_LICENSE in non-module
        EDAC: Sanitize MODULE_AUTHOR strings
        EDAC/amd81[13]1: Remove trailing newline from MODULE_AUTHOR
        EDAC/amd64: Add get_err_info() to pvt->ops
        EDAC/amd64: Split dump_misc_regs() into dct/umc functions
        EDAC/amd64: Split init_csrows() into dct/umc functions
        EDAC/amd64: Split determine_edac_cap() into dct/umc functions
        EDAC/amd64: Rename f17h_determine_edac_ctl_cap()
        EDAC/amd64: Split setup_mci_misc_attrs() into dct/umc functions
        EDAC/amd64: Split ecc_enabled() into dct/umc functions
        EDAC/amd64: Split read_mc_regs() into dct/umc functions
        EDAC/amd64: Split determine_memory_type() into dct/umc functions
        EDAC/amd64: Split read_base_mask() into dct/umc functions
        EDAC/amd64: Split prep_chip_selects() into dct/umc functions
        EDAC/amd64: Rework hw_info_{get,put}
        EDAC/amd64: Merge struct amd64_family_type into struct amd64_pvt
        EDAC/amd64: Do not discover ECC symbol size for Family 17h and later
        EDAC/amd64: Drop dbam_to_cs() for Family 17h and later
        ...
    • Merge tag 'm68k-for-v6.4-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k · f7301270
      Linus Torvalds authored
      Pull m68k updates from Geert Uytterhoeven:
      
       - defconfig updates
      
       - miscellaneous fixes and improvements
      
      * tag 'm68k-for-v6.4-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
        m68k: kexec: Include <linux/reboot.h>
        m68k: defconfig: Update defconfigs for v6.3-rc1
        m68k: Remove obsolete config NO_KERNEL_MSG
        nubus: Drop noop match function
    • Merge tag 'pull-nios2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 173ea743
      Linus Torvalds authored
      Pull trivial nios2 cleanup from Al Viro.
      
      * tag 'pull-nios2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        nios2: _TIF_ALLWORK_MASK is unused
    • Merge tag 'pull-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 181b69dd
      Linus Torvalds authored
      Pull misc vfs pile from Al Viro.
      
      Random minor cleanups.
      
      * tag 'pull-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        fs: Fix description of vfs_tmpfile()
        sysv: switch to put_and_unmap_page()
        fs/sysv: Don't round down address for kunmap_flush_on_unmap()
    • Merge tag 'pull-old-dio' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 11b32219
      Linus Torvalds authored
      Pull legacy dio cleanup from Al Viro.
      
      * tag 'pull-old-dio' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        __blockdev_direct_IO(): get rid of submit_io callback
    • Merge tag 'pull-write-one-page' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 0e497ad5
      Linus Torvalds authored
      Pull vfs write_one_page removal from Al Viro:
       "write_one_page series"
      
      * tag 'pull-write-one-page' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        mm,jfs: move write_one_page/folio_write_one to jfs
        ocfs2: don't use write_one_page in ocfs2_duplicate_clusters_by_page
        ufs: don't flush page immediately for DIRSYNC directories
    • Merge tag 'pull-fd' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · ef36b9af
      Linus Torvalds authored
      Pull vfs fget updates from Al Viro:
       "fget() to fdget() conversions"
      
      * tag 'pull-fd' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        fuse_dev_ioctl(): switch to fdget()
        cgroup_get_from_fd(): switch to fdget_raw()
        bpf: switch to fdget_raw()
        build_mount_idmapped(): switch to fdget()
        kill the last remaining user of proc_ns_fget()
        SVM-SEV: convert the rest of fget() uses to fdget() in there
        convert sgx_set_attribute() to fdget()/fdput()
        convert setns(2) to fdget()/fdput()
  2. 24 Apr, 2023 23 commits
    • Merge tag 'erofs-for-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs · 61d325dc
      Linus Torvalds authored
      Pull erofs updates from Gao Xiang:
       "In this cycle, sub-page block support for uncompressed files is
        available. It's mainly used to enable original signing ('golden')
        4k-block images on arm64 with 16/64k pages. In addition, end users
        could also use this feature to build a manifest to directly refer to
        golden tar data.
      
        Besides, long xattr name prefix support is also introduced in this
        cycle to avoid too many xattrs with the same prefix (e.g. overlayfs
        xattrs). It's useful for erofs + overlayfs combination (like Composefs
        model): the image size is reduced by ~14% and runtime performance is
        also slightly improved.
      
        Others are random fixes and cleanups as usual.
      
        Summary:
      
         - Add sub-page block size support for uncompressed files
      
         - Support flattened block device for multi-blob images to be attached
           into virtual machines (including cloud servers) and bare metals
      
         - Support long xattr name prefixes to optimize images with common
           xattr namespaces (e.g. files with overlayfs xattrs)
      
         - Various minor cleanups & fixes"
      
      * tag 'erofs-for-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
        erofs: cleanup i_format-related stuffs
        erofs: sunset erofs_dbg()
        erofs: fix potential overflow calculating xattr_isize
        erofs: get rid of z_erofs_fill_inode()
        erofs: enable long extended attribute name prefixes
        erofs: handle long xattr name prefixes properly
        erofs: add helpers to load long xattr name prefixes
        erofs: introduce on-disk format for long xattr name prefixes
        erofs: move packed inode out of the compression part
        erofs: keep meta inode into erofs_buf
        erofs: initialize packed inode after root inode is assigned
        erofs: stop parsing non-compact HEAD index if clusterofs is invalid
        erofs: don't warn ztailpacking feature anymore
        erofs: simplify erofs_xattr_generic_get()
        erofs: rename init_inode_xattrs with erofs_ prefix
        erofs: move several xattr helpers into xattr.c
        erofs: tidy up EROFS on-disk naming
        erofs: support flattened block device for multi-blob images
        erofs: set block size to the on-disk block size
        erofs: avoid hardcoded blocksize for subpage block support
    • Merge tag 'v6.4/vfs.open' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · 97adb49f
      Linus Torvalds authored
      Pull vfs open fixlet from Christian Brauner:
       "EINVAL ist keinmal: This contains the changes to make O_DIRECTORY when
        specified together with O_CREAT an invalid request.
      
        The wider background is that a regression report was sent to fsdevel
        about the behavior of O_DIRECTORY | O_CREAT, which had been changed
        multiple years and LTS releases earlier during v5.7 development.
      
        This has also been covered in
      
              https://lwn.net/Articles/926782/
      
        which provides an excellent summary of the discussion"
      
      * tag 'v6.4/vfs.open' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
        open: return EINVAL for O_DIRECTORY | O_CREAT
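
To see how a given kernel treats the O_DIRECTORY | O_CREAT combination, a small user-space probe is enough. This is a sketch in Python with a hypothetical function name; it only reports what the running kernel does and does not assume a particular result. On a kernel that contains this change the expected report is EINVAL, while older kernels may return a different error or even succeed.

```python
import errno
import os
import tempfile


def probe_o_directory_o_creat() -> str:
    """Call open() with O_DIRECTORY | O_CREAT on a fresh path and report the outcome.

    Returns the symbolic errno name (e.g. "EINVAL") or "success".
    """
    # The temporary directory is removed afterwards, together with any file
    # an older kernel may have created as a side effect of the call.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "probe")
        flags = os.O_RDONLY | os.O_DIRECTORY | os.O_CREAT
        try:
            fd = os.open(path, flags, 0o600)
        except OSError as exc:
            return errno.errorcode.get(exc.errno, str(exc.errno))
        os.close(fd)
        return "success"


if __name__ == "__main__":
    print(probe_o_directory_o_creat())
```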
    • Merge tag 'v6.4/vfs.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · e2eff52c
      Linus Torvalds authored
      Pull misc vfs updates from Christian Brauner:
       "This contains a pile of various smaller fixes. Most of them aren't
        very interesting so this just highlights things worth mentioning:
      
         - Various filesystems contained the same little helper to convert
           from the mode of a dentry to the DT_* type of that dentry.
      
           They have now all been switched to rely on the generic
           fs_umode_to_dtype() helper. All custom helpers are removed (Jeff)
      
         - Fsnotify now reports ACCESS and MODIFY events for splice
           (Chung-Chiang Cheng)
      
         - After converting timerfd a long time ago to rely on
           wait_event_interruptible_*() apis, convert eventfd as well. This
           removes the complex open-coded wait code (Wen Yang)
      
         - Simplify sysctl registration for devpts, avoiding the declaration
           of two tables. Instead, just use a prefixed path with
           register_sysctl() (Luis)
      
         - The setattr_should_drop_sgid() helper is now exported so NFS can
           use it. By switching NFS to this helper an NFS setgid inheritance
           bug is fixed (me)"
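
The duplicated "mode to DT_* type" helpers mentioned in the first item all compute the same mapping. The sketch below shows that mapping from user space in Python. It is not the kernel's fs_umode_to_dtype() implementation; it relies on the fact that on Linux the DT_* values in <dirent.h> equal the file type bits of the mode shifted right by 12, and the function name is made up.

```python
import stat

# DT_* values as defined in <dirent.h> on Linux
DT_UNKNOWN = 0
DT_FIFO = 1
DT_CHR = 2
DT_DIR = 4
DT_BLK = 6
DT_REG = 8
DT_LNK = 10
DT_SOCK = 12


def umode_to_dtype(mode: int) -> int:
    """Map an inode mode to its directory entry type.

    Only the file type bits (S_IFMT) matter; permission bits are ignored.
    """
    return stat.S_IFMT(mode) >> 12


if __name__ == "__main__":
    print(umode_to_dtype(stat.S_IFDIR | 0o755) == DT_DIR)
    print(umode_to_dtype(stat.S_IFREG | 0o644) == DT_REG)
```

Because every filesystem needs exactly this mapping, a single generic helper is enough, which is the point of the consolidation.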
      
      * tag 'v6.4/vfs.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
        fs: hfsplus: remove WARN_ON() from hfsplus_cat_{read,write}_inode()
        pnode: pass mountpoint directly
        eventfd: use wait_event_interruptible_locked_irq() helper
        splice: report related fsnotify events
        fs: consolidate duplicate dt_type helpers
        nfs: use vfs setgid helper
        Update relatime comments to include equality
        fs/buffer: Remove redundant assignment to err
        fs_context: drop the unused lsm_flags member
        fs/namespace: fnic: Switch to use %ptTd
        Documentation: update idmappings.rst
        devpts: simplify two-level sysctl registration for pty_kern_table
        eventpoll: align comment with nested epoll limitation
    • Merge tag 'v6.4/vfs.acl' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · 7bcff5a3
      Linus Torvalds authored
      Pull acl updates from Christian Brauner:
       "After finishing the introduction of the new posix acl api last cycle
        the generic POSIX ACL xattr handlers are still around in the
        filesystems xattr handlers for two reasons:
      
         (1) Because a few filesystems rely on the ->list() method of the
             generic POSIX ACL xattr handlers in their ->listxattr() inode
             operation.
      
         (2) POSIX ACLs are only available if IOP_XATTR is raised. The
             IOP_XATTR flag is raised in inode_init_always() based on whether
             the sb->s_xattr pointer is non-NULL. IOW, the registered xattr
             handlers of the filesystem are used to raise IOP_XATTR. Removing
             the generic POSIX ACL xattr handlers from all filesystems would
             risk regressing filesystems that only implement POSIX ACL support
             and no other xattrs (nfs3 comes to mind).
      
        This contains the work to decouple POSIX ACLs from the IOP_XATTR flag
        as they don't depend on xattr handlers anymore. So it's now possible
        to remove the generic POSIX ACL xattr handlers from the sb->s_xattr
        list of all filesystems. This is a crucial step as the generic POSIX
        ACL xattr handlers aren't used for POSIX ACLs anymore and POSIX ACLs
        don't depend on the xattr infrastructure anymore.
      
        Addressing problem (1) will require more long-term work. It would be
        best to get rid of the ->list() method of xattr handlers completely at
        some point.
      
        For erofs, ext{2,4}, f2fs, jffs2, ocfs2, and reiserfs the nop POSIX
        ACL xattr handler is kept around so they can continue to use
        array-based xattr handler indexing.
      
        This update does simplify the ->listxattr() implementation of all
        these filesystems however"
      
      * tag 'v6.4/vfs.acl' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
        acl: don't depend on IOP_XATTR
        ovl: check for ->listxattr() support
        reiserfs: rework priv inode handling
        fs: rename generic posix acl handlers
        reiserfs: rework ->listxattr() implementation
        fs: simplify ->listxattr() implementation
        fs: drop unused posix acl handlers
        xattr: remove unused argument
        xattr: add listxattr helper
        xattr: simplify listxattr helpers
    • Merge tag 'v6.4/pidfd.file' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux · ec40758b
      Linus Torvalds authored
      Pull pidfd updates from Christian Brauner:
       "This adds a new pidfd_prepare() helper which allows the caller to
        reserve a pidfd number and allocates a new pidfd file that stashes the
        provided struct pid.
      
        Installing a file descriptor into a task's file descriptor table just
        to close it again via close_fd() when an error occurs should be
        avoided: by then the fd has been visible to userspace and might
        already be in use. Instead, a file descriptor should be reserved but
        not installed into the caller's file descriptor table.
      
        If another failure path is hit then the reserved file descriptor and
        file can just be put without any userspace visible side-effects. And
        if all failure paths are cleared the file descriptor and file can be
        installed into the task's file descriptor table.
      
        This helper is now used in all places that open coded this
        functionality before. For example, this is currently done during
        copy_process() and fanotify used pidfd_create(), which returns a pidfd
        that has already been made visible in the caller's file descriptor
        table, but then closed it using close_fd().
      
        In one of the next merge windows there is also new functionality
        coming to unix domain sockets that will have to rely on
        pidfd_prepare()"
      
      * tag 'v6.4/pidfd.file' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
        fanotify: use pidfd_prepare()
        fork: use pidfd_prepare()
        pid: add pidfd_prepare()
      ec40758b
    • Linus Torvalds's avatar
      Merge tag 'v6.4/kernel.user_worker' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux · 3323ddce
      Linus Torvalds authored
      Pull user work thread updates from Christian Brauner:
       "This contains the work generalizing the ability to create a kernel
        worker from a userspace process.
      
        Such user workers will run with the same credentials as the userspace
        process they were created from, providing stronger security and
        accounting guarantees than the traditional override_creds() approach
        ever could've hoped for.
      
        The original work was heavily based on and optimized for the needs of
        io_uring which was the first user. However, as it quickly turned out
        the ability to create user workers inheriting properties from a
        userspace process is generally useful.
      
        The vhost subsystem currently creates workers using the kthread api.
        The consequences of using the kthread api are that RLIMITs don't work
        correctly as they are inherited from kthreadd. This leads to bugs
        where more workers are created than would be allowed by the RLIMITs of
        the userspace process on behalf of which workers are created.
      
        Problems like this disappear with user workers created from the
        userspace processes for which they perform the work. In addition,
        providing this api allows vhost to remove additional complexity. For
        example, cgroup and mm sharing will just work out of the box with user
        workers based on the relevant userspace process instead of manually
        ensuring the correct cgroup and mm contexts are used.
      
        So the vhost subsystem should simply be made to use the same mechanism
        as io_uring. To this end the original mechanism used for
        create_io_thread() is generalized into user workers:
      
         - Introduce PF_USER_WORKER as a generic indicator that a given task
           is a user worker, i.e., a kernel task that was created from a
           userspace process. Now a PF_IO_WORKER thread is just a specialized
           version of PF_USER_WORKER. So io_uring io workers raise both flags.
      
         - Make copy_process() available to core kernel code
      
         - Extend struct kernel_clone_args with the following bitfields
           which instruct copy_process():
             - to create a user worker (raise PF_USER_WORKER)
             - to not inherit any files from the userspace process
             - to ignore signals
      
        After all generic changes are in place the vhost subsystem implements
        a new dedicated vhost api based on user workers. Finally, vhost is
        switched to rely on the new api moving it off of kthreads.
      
        Thanks to Mike for sticking it out and making it through this rather
        arduous journey"
      
      * tag 'v6.4/kernel.user_worker' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
        vhost: use vhost_tasks for worker threads
        vhost: move worker thread fields to new struct
        vhost_task: Allow vhost layer to use copy_process
        fork: allow kernel code to call copy_process
        fork: Add kernel_clone_args flag to ignore signals
        fork: add kernel_clone_args flag to not dup/clone files
        fork/vm: Move common PF_IO_WORKER behavior to new flag
        kernel: Make io_thread and kthread bit fields
        kthread: Pass in the thread's name during creation
        kernel: Allow a kernel thread's name to be set in copy_process
        csky: Remove kernel_thread declaration
      3323ddce
    • Linus Torvalds's avatar
      Merge tag 'v6.4/kernel.clone3.tests' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux · a632b76b
      Linus Torvalds authored
      Pull clone3 selftest fix from Christian Brauner:
       "This is a single fix to the clone3() selftests.
      
        It fell through the selftest tree cracks a few times so I'll provide
        it here. It has low urgency but we should still correctly report the
        number of tests"
      
      * tag 'v6.4/kernel.clone3.tests' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
        selftests/clone3: fix number of tests in ksft_set_plan
      a632b76b
    • Linus Torvalds's avatar
      Merge tag 'docs-6.4' of git://git.lwn.net/linux · c23f2897
      Linus Torvalds authored
      Pull documentation updates from Jonathan Corbet:
       "Commit volume in documentation is relatively low this time, but there
        is still a fair amount going on, including:
      
         - Reorganize the architecture-specific documentation under
           Documentation/arch
      
           This makes the structure match the source directory and helps to
           clean up the mess that is the top-level Documentation directory a
           bit. This work creates the new directory and moves x86 and most of
           the less-active architectures there.
      
           The current plan is to move the rest of the architectures in 6.5,
           with the patches going through the appropriate subsystem trees.
      
         - Some more Spanish translations and maintenance of the Italian
           translation
      
         - A new "Kernel contribution maturity model" document from Ted
      
         - A new tutorial on quickly building a trimmed kernel from Thorsten
      
        Plus the usual set of updates and fixes"
      
      * tag 'docs-6.4' of git://git.lwn.net/linux: (47 commits)
        media: Adjust column width for pdfdocs
        media: Fix building pdfdocs
        docs: clk: add documentation to log which clocks have been disabled
        docs: trace: Fix typo in ftrace.rst
        Documentation/process: always CC responsible lists
        docs: kmemleak: adjust to config renaming
        ELF: document some de-facto PT_* ABI quirks
        Documentation: arm: remove stih415/stih416 related entries
        docs: turn off "smart quotes" in the HTML build
        Documentation: firmware: Clarify firmware path usage
        docs/mm: Physical Memory: Fix grammar
        Documentation: Add document for false sharing
        dma-api-howto: typo fix
        docs: move m68k architecture documentation under Documentation/arch/
        docs: move parisc documentation under Documentation/arch/
        docs: move ia64 architecture docs under Documentation/arch/
        docs: Move arc architecture docs under Documentation/arch/
        docs: move nios2 documentation under Documentation/arch/
        docs: move openrisc documentation under Documentation/arch/
        docs: move superh documentation under Documentation/arch/
        ...
      c23f2897
    • Linus Torvalds's avatar
      Merge tag 'linux-kselftest-kunit-6.4-rc1' of... · 1be89faa
      Linus Torvalds authored
      Merge tag 'linux-kselftest-kunit-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
      
      Pull KUnit updates from Shuah Khan:
      
       - several fixes to kunit tool
      
       - new klist structure test
      
       - support for m68k under QEMU
      
       - support for overriding the QEMU serial port
      
       - support for SH under QEMU
      
      * tag 'linux-kselftest-kunit-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
        kunit: add tests for using current KUnit test field
        kunit: tool: Add support for SH under QEMU
        kunit: tool: Add support for overriding the QEMU serial port
        .gitignore: Unignore .kunitconfig
        list: test: Test the klist structure
        kunit: increase KUNIT_LOG_SIZE to 2048 bytes
        kunit: Use gfp in kunit_alloc_resource() kernel-doc
        kunit: tool: fix pre-existing `mypy --strict` errors and update run_checks.py
        kunit: tool: remove unused imports and variables
        kunit: tool: add subscripts for type annotations where appropriate
        kunit: fix bug of extra newline characters in debugfs logs
        kunit: fix bug in the order of lines in debugfs logs
        kunit: fix bug in debugfs logs of parameterized tests
        kunit: tool: Add support for m68k under QEMU
      1be89faa
    • Linus Torvalds's avatar
      Merge tag 'linux-kselftest-next-6.4-rc1' of... · 0f50767d
      Linus Torvalds authored
      Merge tag 'linux-kselftest-next-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
      
      Pull Kselftest updates from Shuah Khan:
      
       - several patches to enhance and fix resctrl test
      
       - nolibc support for kselftest with an addition to vprintf() to
         tools/nolibc/stdio and related test changes
      
       - Refactor 'peeksiginfo' ptrace test part
      
       - add 'malloc' failures checks in cgroup test_memcontrol
      
       - a new prctl test
      
       - enhancements to sched test with additional core schedule prctl calls
      
      * tag 'linux-kselftest-next-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: (25 commits)
        selftests/resctrl: Fix incorrect error return on test complete
        selftests/resctrl: Remove duplicate codes that clear each test result file
        selftests/resctrl: Commonize the signal handler register/unregister for all tests
        selftests/resctrl: Cleanup properly when an error occurs in CAT test
        selftests/resctrl: Flush stdout file buffer before executing fork()
        selftests/resctrl: Return MBA check result and make it to output message
        selftests/resctrl: Fix set up schemata with 100% allocation on first run in MBM test
        selftests/resctrl: Use correct exit code when tests fail
        kselftest/arm64: Convert za-fork to use kselftest.h
        kselftest: Support nolibc
        tools/nolibc/stdio: Implement vprintf()
        selftests/resctrl: Correct get_llc_perf() param in function comment
        selftests/resctrl: Use remount_resctrlfs() consistently with boolean
        selftests/resctrl: Change name from CBM_MASK_PATH to INFO_PATH
        selftests/resctrl: Change initialize_llc_perf() return type to void
        selftests/resctrl: Replace obsolete memalign() with posix_memalign()
        selftests/resctrl: Check for return value after write_schemata()
        selftests/resctrl: Allow ->setup() to return errors
        selftests/resctrl: Move ->setup() call outside of test specific branches
        selftests/resctrl: Return NULL if malloc_and_init_memory() did not alloc mem
        ...
      0f50767d
    • Linus Torvalds's avatar
      Merge tag 'rcu.6.4.april5.2023.3' of git://git.kernel.org/pub/scm/linux/kernel/git/jfern/linux · 5dfb75e8
      Linus Torvalds authored
      Pull RCU updates from Joel Fernandes:
      
       - Updates and additions to MAINTAINERS files, with Boqun being added to
         the RCU entry and Zqiang being added as an RCU reviewer.
      
         I have also transitioned from reviewer to maintainer; however, Paul
         will be taking over sending RCU pull-requests for the next merge
         window.
      
       - Resolution of hotplug warning in nohz code, achieved by fixing
         cpu_is_hotpluggable() through interaction with the nohz subsystem.
      
         Tick dependency modifications by Zqiang, focusing on fixing usage of
         the TICK_DEP_BIT_RCU_EXP bitmask.
      
       - Avoid needless calls to the rcu-lazy shrinker for CONFIG_RCU_LAZY=n
         kernels, fixed by Zqiang.
      
       - Improvements to rcu-tasks stall reporting by Neeraj.
      
       - Initial renaming of k[v]free_rcu() to k[v]free_rcu_mightsleep() for
         increased robustness, affecting several components like mac802154,
         drbd, vmw_vmci, tracing, and more.
      
         A report by Eric Dumazet showed that the API could be unknowingly
         used in an atomic context, so we'd rather make sure they know what
         they're asking for by being explicit:
      
            https://lore.kernel.org/all/20221202052847.2623997-1-edumazet@google.com/
      
       - Documentation updates, including corrections to spelling,
         clarifications in comments, and improvements to the srcu_size_state
         comments.
      
       - Better srcu_struct cache locality for readers, by adjusting the size
         of srcu_struct in support of SRCU usage by Christoph Hellwig.
      
       - Teach lockdep to detect deadlocks between srcu_read_lock() vs
         synchronize_srcu() contributed by Boqun.
      
         Previously lockdep could not detect such deadlocks, now it can.
      
       - Integration of rcutorture and rcu-related tools, targeted for v6.4
         from Boqun's tree, featuring new SRCU deadlock scenarios, test_nmis
         module parameter, and more
      
       - Miscellaneous changes, various code cleanups and comment improvements
      
      * tag 'rcu.6.4.april5.2023.3' of git://git.kernel.org/pub/scm/linux/kernel/git/jfern/linux: (71 commits)
        checkpatch: Error out if deprecated RCU API used
        mac802154: Rename kfree_rcu() to kvfree_rcu_mightsleep()
        rcuscale: Rename kfree_rcu() to kfree_rcu_mightsleep()
        ext4/super: Rename kfree_rcu() to kfree_rcu_mightsleep()
        net/mlx5: Rename kfree_rcu() to kfree_rcu_mightsleep()
        net/sysctl: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
        lib/test_vmalloc.c: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
        tracing: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
        misc: vmw_vmci: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
        drbd: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
        rcu: Protect rcu_print_task_exp_stall() ->exp_tasks access
        rcu: Avoid stack overflow due to __rcu_irq_enter_check_tick() being kprobe-ed
        rcu-tasks: Report stalls during synchronize_srcu() in rcu_tasks_postscan()
        rcu: Permit start_poll_synchronize_rcu_expedited() to be invoked early
        rcu: Remove never-set needwake assignment from rcu_report_qs_rdp()
        rcu: Register rcu-lazy shrinker only for CONFIG_RCU_LAZY=y kernels
        rcu: Fix missing TICK_DEP_MASK_RCU_EXP dependency check
        rcu: Fix set/clear TICK_DEP_BIT_RCU_EXP bitmask race
        rcu/trace: use strscpy() to instead of strncpy()
        tick/nohz: Fix cpu_is_hotpluggable() by checking with nohz subsystem
        ...
      5dfb75e8
    • Linus Torvalds's avatar
      Merge tag 'nolibc.2023.04.04a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu · 5d77652f
      Linus Torvalds authored
      Pull nolibc updates from Paul McKenney:
      
       - Add support for loongarch
      
       - Fix stack-protector issues
      
       - Support additional integral types and signal-related macros
      
       - Add support for stdin, stdout, and stderr
      
       - Add getuid() and geteuid()
      
       - Allow S_I* macros to be overridden by program
      
       - Defer to linux/fcntl.h and linux/stat.h to avoid duplicate
         definitions
      
       - Many improvements to the selftests
      
      * tag 'nolibc.2023.04.04a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (22 commits)
        tools/nolibc: x86_64: add stackprotector support
        tools/nolibc: i386: add stackprotector support
        tools/nolibc: tests: add test for -fstack-protector
        tools/nolibc: tests: fold in no-stack-protector cflags
        tools/nolibc: add support for stack protector
        tools/nolibc: tests: constify test_names
        tools/nolibc: add helpers for wait() signal exits
        tools/nolibc: add definitions for standard fds
        selftests/nolibc: Adjust indentation for Makefile
        selftests/nolibc: Add support for LoongArch
        tools/nolibc: Add support for LoongArch
        tools/nolibc: Add statx() and make stat() rely on statx() if necessary
        tools/nolibc: Include linux/fcntl.h and remove duplicate code
        tools/nolibc: check for S_I* macros before defining them
        selftests/nolibc: skip the chroot_root and link_dir tests when not privileged
        tools/nolibc: add getuid() and geteuid()
        tools/nolibc: add tests for the integer limits in stdint.h
        tools/nolibc: enlarge column width of tests
        tools/nolibc: add integer types and integer limit macros
        tools/nolibc: add stdint.h
        ...
      5d77652f
    • Linus Torvalds's avatar
      Merge tag 'locktorture.2023.04.04a' of... · 4a4075ad
      Linus Torvalds authored
      Merge tag 'locktorture.2023.04.04a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
      
      Pull locktorture updates from Paul McKenney:
       "This adds tests for nested locking and also adds support for testing
        raw spinlocks in PREEMPT_RT kernels"
      
      * tag 'locktorture.2023.04.04a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
        locktorture: Add raw_spinlock* torture tests for PREEMPT_RT kernels
        locktorture: With nested locks, occasionally skip main lock
        locktorture: Add nested locking to rtmutex torture tests
        locktorture: Add nested locking to mutex torture tests
        locktorture: Add nested_[un]lock() hooks and nlocks parameter
      4a4075ad
    • Linus Torvalds's avatar
      Merge tag 'lkmm-scripting.2023.04.07a' of... · 60eb4507
      Linus Torvalds authored
      Merge tag 'lkmm-scripting.2023.04.07a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
      
      Pull Linux Kernel Memory Model scripting updates from Paul McKenney:
       "This improves litmus-test documentation and improves the ability to do
        before/after tests on the https://github.com/paulmckrcu/litmus repo"
      
      * tag 'lkmm-scripting.2023.04.07a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (32 commits)
        tools/memory-model: Remove out-of-date SRCU documentation
        tools/memory-model: Document LKMM test procedure
        tools/memory-model: Use "grep -E" instead of "egrep"
        tools/memory-model: Use "-unroll 0" to keep --hw runs finite
        tools/memory-model: Make judgelitmus.sh handle scripted Result: tag
        tools/memory-model: Add data-race capabilities to judgelitmus.sh
        tools/memory-model: Add checktheselitmus.sh to run specified litmus tests
        tools/memory-model: Repair parseargs.sh header comment
        tools/memory-model:  Add "--" to parseargs.sh for additional arguments
        tools/memory-model: Make history-check scripts use mselect7
        tools/memory-model: Make checkghlitmus.sh use mselect7
        tools/memory-model: Fix scripting --jobs argument
        tools/memory-model: Implement --hw support for checkghlitmus.sh
        tools/memory-model: Add -v flag to jingle7 runs
        tools/memory-model: Make runlitmus.sh check for jingle errors
        tools/memory-model: Allow herd to deduce CPU type
        tools/memory-model: Keep assembly-language litmus tests
        tools/memory-model: Move from .AArch64.litmus.out to .litmus.AArch.out
        tools/memory-model: Make runlitmus.sh generate .litmus.out for --hw
        tools/memory-model: Split runlitmus.sh out of checklitmus.sh
        ...
      60eb4507
    • Linus Torvalds's avatar
      Merge tag 'lkmm.2023.04.07a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu · 40603735
      Linus Torvalds authored
      Pull Linux Kernel Memory Model updates from Paul McKenney:
       "This improves LKMM diagnostic messages, unifies handling of the
        ordering produced by unlock/lock pairs, adds support for the
        smp_mb__after_srcu_read_unlock() macro, removes redundant members from
        the to-r relation, brings SRCU read-side semantics into alignment with
        Linux-kernel SRCU, makes ppo a subrelation of po, and improves
        documentation"
      
      * tag 'lkmm.2023.04.07a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
        Documentation: litmus-tests: Correct spelling
        tools/memory-model: Add documentation about SRCU read-side critical sections
        tools/memory-model: Make ppo a subrelation of po
        tools/memory-model: Provide exact SRCU semantics
        tools/memory-model: Restrict to-r to read-read address dependency
        tools/memory-model: Add smp_mb__after_srcu_read_unlock()
        tools/memory-model: Unify UNLOCK+LOCK pairings to po-unlock-lock-po
        tools/memory-model: Update some warning labels
      40603735
    • Linus Torvalds's avatar
      Merge tag 'kcsan.2023.04.04a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu · 022e3209
      Linus Torvalds authored
      Pull KCSAN updates from Paul McKenney:
       "Kernel concurrency sanitizer (KCSAN) updates for v6.4
      
        This fixes kernel-doc warnings and also updates instrumentation from
        READ_ONCE() to volatile in order to avoid unaligned load-acquire
        instructions on arm64 in kernels built with LTO"
      
      * tag 'kcsan.2023.04.04a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
        kcsan: Avoid READ_ONCE() in read_instrumented_memory()
        instrumented.h: Fix all kernel-doc format warnings
      022e3209
    • Linus Torvalds's avatar
      Merge tag 'tpmdd-v6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd · 1a0beef9
      Linus Torvalds authored
      Pull tpm updates from Jarkko Sakkinen:
      
       - The .machine keyring, used for Machine Owner Keys (MOK), acquired the
         ability to store only CA enforced keys, and put the rest on the
         .platform keyring, thus separating the code signing keys from the
         keys that are used to sign certificates.
      
         This essentially unlocks the use of the .machine keyring as a trust
         anchor for IMA. It is an opt-in feature, meaning that the additional
         constraints won't brick anyone who does not care about them.
      
       - Enable interrupt based transactions with discrete TPM chips (tpm_tis).
      
         Code for this already existed but it never really worked, so I
         consider this a new feature rather than a bug fix. Previously the
         driver just fell back to the polling mode.
      
      Link: https://lore.kernel.org/linux-integrity/a93b6222-edda-d43c-f010-a59701f2aeef@gmx.de/
      Link: https://lore.kernel.org/linux-integrity/20230302164652.83571-1-eric.snowberg@oracle.com/
      
      * tag 'tpmdd-v6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd: (29 commits)
        tpm: Add !tpm_amd_is_rng_defective() to the hwrng_unregister() call site
        tpm_tis: fix stall after iowrite*()s
        tpm/tpm_tis_synquacer: Convert to platform remove callback returning void
        tpm/tpm_tis: Convert to platform remove callback returning void
        tpm/tpm_ftpm_tee: Convert to platform remove callback returning void
        tpm: tpm_tis_spi: Mark ACPI and OF related data as maybe unused
        tpm: st33zp24: Mark ACPI and OF related data as maybe unused
        tpm, tpm_tis: Enable interrupt test
        tpm, tpm_tis: startup chip before testing for interrupts
        tpm, tpm_tis: Claim locality when interrupts are reenabled on resume
        tpm, tpm_tis: Claim locality in interrupt handler
        tpm, tpm_tis: Request threaded interrupt handler
        tpm, tpm: Implement usage counter for locality
        tpm, tpm_tis: do not check for the active locality in interrupt handler
        tpm, tpm_tis: Move interrupt mask checks into own function
        tpm, tpm_tis: Only handle supported interrupts
        tpm, tpm_tis: Claim locality before writing interrupt registers
        tpm, tpm_tis: Do not skip reset of original interrupt vector
        tpm, tpm_tis: Disable interrupts if tpm_tis_probe_irq() failed
        tpm, tpm_tis: Claim locality before writing TPM_INT_ENABLE register
        ...
      1a0beef9
    • Linus Torvalds's avatar
      Merge tag 'Smack-for-6.4' of https://github.com/cschaufler/smack-next · dc7e22a3
      Linus Torvalds authored
      Pull smack updates from Casey Schaufler:
       "There are two changes, one small and one more substantial:
      
         - Removal of an unnecessary cast
      
         - The mount option processing introduced with the mount rework makes
           copies of mount option values. There is no good reason to make
           copies of Smack labels, as they are maintained on a list and never
           removed.
      
           The code now uses pointers to entries on the list, reducing
           processing time and memory use"
      
      * tag 'Smack-for-6.4' of https://github.com/cschaufler/smack-next:
        Smack: Improve mount process memory use
        smack_lsm: remove unnecessary type casting
      dc7e22a3
    • Linus Torvalds's avatar
      Merge tag 'landlock-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux · 62443646
      Linus Torvalds authored
      Pull landlock update from Mickaël Salaün:
       "Improve user space documentation"
      
      * tag 'landlock-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux:
        landlock: Clarify documentation for the LANDLOCK_ACCESS_FS_REFER right
      62443646
    • Linus Torvalds's avatar
      Merge tag 'tomoyo-pr-20230424' of git://git.osdn.net/gitroot/tomoyo/tomoyo-test1 · 5af4b523
      Linus Torvalds authored
      Pull tomoyo update from Tetsuo Handa:
       "One cleanup patch from Vlastimil Babka"
      
      * tag 'tomoyo-pr-20230424' of git://git.osdn.net/gitroot/tomoyo/tomoyo-test1:
        tomoyo: replace tomoyo_round2() with kmalloc_size_roundup()
      5af4b523
    • Linus Torvalds's avatar
      Merge tag 'lsm-pr-20230420' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm · 08e30833
      Linus Torvalds authored
      Pull lsm updates from Paul Moore:
      
       - Move the LSM hook comment blocks into security/security.c
      
         For many years the LSM hook comment blocks were located in a very odd
         place, include/linux/lsm_hooks.h, where they lived on their own,
         disconnected from both the function prototypes and definitions.
      
         In keeping with current kernel conventions, this moves all of these
         comment blocks to the top of the function definitions, transforming
         them into the kdoc format in the process. This should make it much
         easier to maintain these comments, which are the main source of LSM
         hook documentation.
      
         For the most part the comment contents were left as-is, although some
         glaring errors were corrected. Expect additional edits in the future
         as we slowly update and correct the comment blocks.
      
         This is the bulk of the diffstat.
      
       - Introduce LSM_ORDER_LAST
      
         Similar to how LSM_ORDER_FIRST is used to specify LSMs which should
         be ordered before "normal" LSMs, the LSM_ORDER_LAST is used to
         specify LSMs which should be ordered after "normal" LSMs.
      
         This is one of the prerequisites for transitioning IMA/EVM to a
         proper LSM.
      
       - Remove the security_old_inode_init_security() hook
      
         The security_old_inode_init_security() LSM hook only allows for a
         single xattr which is problematic both for LSM stacking and the
         IMA/EVM-as-a-LSM effort. This finishes the conversion over to the
         security_inode_init_security() hook and removes the single-xattr LSM
         hook.
      
       - Fix a reiserfs problem with security xattrs
      
         During the security_old_inode_init_security() removal work it became
         clear that reiserfs wasn't handling security xattrs properly so we
         fixed it.
      
      * tag 'lsm-pr-20230420' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm: (32 commits)
        reiserfs: Add security prefix to xattr name in reiserfs_security_write()
        security: Remove security_old_inode_init_security()
        ocfs2: Switch to security_inode_init_security()
        reiserfs: Switch to security_inode_init_security()
        security: Remove integrity from the LSM list in Kconfig
        Revert "integrity: double check iint_cache was initialized"
        security: Introduce LSM_ORDER_LAST and set it for the integrity LSM
        device_cgroup: Fix typo in devcgroup_css_alloc description
        lsm: fix a badly named parameter in security_get_getsecurity()
        lsm: fix doc warnings in the LSM hook comments
        lsm: styling fixes to security/security.c
        lsm: move the remaining LSM hook comments to security/security.c
        lsm: move the io_uring hook comments to security/security.c
        lsm: move the perf hook comments to security/security.c
        lsm: move the bpf hook comments to security/security.c
        lsm: move the audit hook comments to security/security.c
        lsm: move the binder hook comments to security/security.c
        lsm: move the sysv hook comments to security/security.c
        lsm: move the key hook comments to security/security.c
        lsm: move the xfrm hook comments to security/security.c
        ...
      08e30833
    • Linus Torvalds's avatar
      Merge tag 'selinux-pr-20230420' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux · 72eaa096
      Linus Torvalds authored
      Pull selinux updates from Paul Moore:
      
       - Stop passing the 'selinux_state' pointers as function arguments
      
         As discussed during the end of the last development cycle, passing a
         selinux_state pointer through the SELinux code has a noticeable
         impact on performance, and with the current code it is not strictly
         necessary.
      
         This simplifies things by referring directly to the single
         selinux_state global variable which should help improve SELinux
         performance.
      
       - Uninline the unlikely portions of avc_has_perm_noaudit()
      
         This change was also based on a discussion from the last development
         cycle, and is heavily based on an initial proof of concept patch from
         you. The core issue was that avc_has_perm_noaudit() was not able to
         be inlined, as intended, due to its size. We solved this issue by
         extracting the less frequently hit portions of avc_has_perm_noaudit()
         into a separate function, reducing the size of avc_has_perm_noaudit()
         to the point where the compiler began inlining the function. We also
         took the opportunity to clean up some ugly RCU locking in the code
         that became uglier with the change.
      
       - Remove the runtime disable functionality
      
         After several years of work by the userspace and distro folks, we are
         finally in a place where we feel comfortable removing the runtime
         disable functionality which we initially deprecated at the start of
         2020.
      
         There is plenty of information in the kernel's deprecation (now
         removal) notice, but the main motivation was to be able to safely
         mark the LSM hook structures as '__ro_after_init'.
      
         LWN also wrote a good summary of the deprecation this morning which
         offers a more detailed history:
      
              https://lwn.net/SubscriberLink/927463/dcfa0d4ed2872f03
      
       - Remove the checkreqprot functionality
      
         The original checkreqprot deprecation notice stated that the removal
         would happen no sooner than June 2021, which means this falls hard
         into the "better late than never" bucket.
      
         The Kconfig and deprecation notice have more detail on this setting,
         but the basic idea is that we want to ensure that the SELinux policy
         allows for the memory protections actually applied by the kernel, and
         not those requested by the process.
      
         While we haven't found anyone running a supported distro that is
         affected by this deprecation/removal, anyone who is affected would
         only need to update their policy to reflect the reality of their
         applications' mapping protections.
      
       - Minor Makefile improvements
      
         Some minor Makefile improvements to correct some dependency issues
         likely only ever seen by SELinux developers. I expect we will have at
         least one more tweak to the Makefile during the next merge window,
         but it didn't quite make the cutoff this time around.
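
      As an illustration of the "uninline the unlikely portions" item above,
      here is a minimal sketch of the general hot/cold split. It is not the
      SELinux AVC code: every name in it (cache_entry, lookup_slow, has_perm,
      the 16-slot table) is hypothetical, and the "policy computation" is a
      stand-in. It assumes a GCC or Clang compatible compiler for the
      noinline attribute and __builtin_expect.

```c
#include <stddef.h>

/* Hypothetical cache of permission decisions; NOT the SELinux AVC. */
struct cache_entry {
	unsigned int key;
	unsigned int allowed;
	int valid;
};

static struct cache_entry cache[16];
static unsigned long slow_calls;	/* counts trips through the slow path */

/*
 * Slow path: deliberately kept out of line so that the fast path below
 * stays small enough for the compiler to inline at every call site.
 */
static __attribute__((noinline)) unsigned int lookup_slow(unsigned int key)
{
	struct cache_entry *e = &cache[key % 16];

	slow_calls++;
	e->key = key;
	e->allowed = key & 0x7;	/* stand-in for an expensive computation */
	e->valid = 1;
	return e->allowed;
}

/*
 * Fast path: a cache hit is the expected case, so only the hit test and
 * the final permission check live here. Returns 0 if every requested
 * bit is allowed, -1 otherwise.
 */
static inline int has_perm(unsigned int key, unsigned int requested)
{
	const struct cache_entry *e = &cache[key % 16];
	unsigned int allowed;

	if (__builtin_expect(e->valid && e->key == key, 1))
		allowed = e->allowed;
	else
		allowed = lookup_slow(key);

	return (requested & ~allowed) ? -1 : 0;
}
```

      The design point is only that the rarely taken branch pays for a
      function call while the common branch does not; whether a compiler
      actually inlines the fast path depends on its own size heuristics.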
      
      * tag 'selinux-pr-20230420' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
        selinux: ensure av_permissions.h is built when needed
        selinux: fix Makefile dependencies of flask.h
        selinux: stop returning node from avc_insert()
        selinux: clean up dead code after removing runtime disable
        selinux: update the file list in MAINTAINERS
        selinux: remove the runtime disable functionality
        selinux: remove the 'checkreqprot' functionality
        selinux: stop passing selinux_state pointers and their offspring
        selinux: uninline unlikely parts of avc_has_perm_noaudit()
      Merge branch 'x86-rep-insns': x86 user copy clarifications · a5624566
      Linus Torvalds authored
      Merge my x86 user copy updates branch.
      
      This cleans up a lot of our x86 memory copy code, particularly for user
      accesses.  I've been pushing for microarchitectural support for good
      memory copying and clearing for a long while, and it's been visible in
      how the kernel has aggressively used 'rep movs' and 'rep stos' whenever
      possible.
      
      And that micro-architectural support has been improving over the years,
      to the point where on modern CPUs the best option for a memory copy
      that would become a function call (as opposed to being something that
      can just be turned into individual 'mov' instructions) is now to inline
      the string instruction sequence instead.
      
      However, that only makes sense when we have the modern markers for this:
      the x86 FSRM and FSRS capabilities ("Fast Short REP MOVS/STOS").
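
      For readers unfamiliar with the string instructions, the following is a
      rough user-space sketch of what "inlining the string instruction
      sequence" looks like. It is not the kernel's implementation, which
      selects code through its instruction rewriting rather than a runtime
      branch. The function names copy_rep_movsb and has_fsrm are
      hypothetical, the code assumes GCC or Clang inline assembly, and it
      falls back to memcpy() on non-x86 targets. The FSRM check reads
      CPUID leaf 7, subleaf 0, EDX bit 4.

```c
#include <stddef.h>
#include <string.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Returns 1 if the CPU advertises Fast Short REP MOVS, 0 otherwise. */
static inline int has_fsrm(void)
{
#if defined(__x86_64__) || defined(__i386__)
	unsigned int eax, ebx, ecx, edx;

	/* __get_cpuid_count() returns 0 when the leaf is not supported. */
	if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
		return 0;
	return (edx >> 4) & 1;
#else
	return 0;
#endif
}

/*
 * Copy len bytes with a single inlined 'rep movsb'. The instruction
 * takes the destination in (r/e)di, the source in (r/e)si and the byte
 * count in (r/e)cx, and updates all three, hence the "+" constraints.
 * The direction flag is assumed clear, as the C ABI guarantees.
 */
static inline void *copy_rep_movsb(void *dst, const void *src, size_t len)
{
#if defined(__x86_64__) || defined(__i386__)
	void *ret = dst;

	__asm__ volatile("rep movsb"
			 : "+D"(dst), "+S"(src), "+c"(len)
			 :
			 : "memory");
	return ret;
#else
	return memcpy(dst, src, len);
#endif
}
```

      A caller could test has_fsrm() once and use copy_rep_movsb() for short
      copies only when it returns 1; on CPUs without the feature the
      instruction still works but may be slow for small sizes.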
      
      So this cleans up a lot of our historical code, gets rid of the legacy
      marker use ("REP_GOOD" and "ERMS") from the memcpy/memset cases, and
      replaces it with that modern reality.  Note that REP_GOOD and ERMS end
      up still being used by the known large cases (ie page copying and
      clearing).
      
      The reason much of this ends up being about user memory accesses is that
      the normal in-kernel cases are done by the compiler (__builtin_memcpy()
      and __builtin_memset()) and getting to the point where we can use our
      instruction rewriting to inline those to be string instructions will
      need some compiler support.
      
      In contrast, the user accessor functions are all entirely controlled by
      the kernel code, so we can change those arbitrarily.
      
      Thanks to Borislav Petkov for feedback on the series, and to Jens for
      testing some of this on micro-architectures I didn't personally have
      access to.
      
      * x86-rep-insns:
        x86: rewrite '__copy_user_nocache' function
        x86: remove 'zerorest' argument from __copy_user_nocache()
        x86: set FSRS automatically on AMD CPUs that have FSRM
        x86: improve on the non-rep 'copy_user' function
        x86: improve on the non-rep 'clear_user' function
        x86: inline the 'rep movs' in user copies for the FSRM case
        x86: move stac/clac from user copy routines into callers
        x86: don't use REP_GOOD or ERMS for user memory clearing
        x86: don't use REP_GOOD or ERMS for user memory copies
        x86: don't use REP_GOOD or ERMS for small memory clearing
        x86: don't use REP_GOOD or ERMS for small memory copies