1. 02 May, 2021 4 commits
    • Merge tag 'for-linus-5.13-ofs-1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux · 9ccce092
      Linus Torvalds authored
      Pull orangefs updates from Mike Marshall:
       "orangefs: implement orangefs_readahead
      
        mm/readahead.c/read_pages was quite a bit different back when I put my
        open-coded readahead logic into orangefs_readpage. That logic seemed
        to work as designed back then; it is a trainwreck now.
      
        This implements orangefs_readahead using the new xarray and
        readahead_expand features and removes all my open-coded readahead
        logic.
      
        This results in an extreme read performance improvement; these sample
        numbers are from my test VM:
      
        Here's an example of what's upstream in
        5.11.8-200.fc33.x86_64:
      
           30+0 records in
           30+0 records out
           125829120 bytes (126 MB, 120 MiB) copied, 5.77943 s, 21.8 MB/s
      
        And here's this version of orangefs_readahead on top of 5.12.0-rc4:
      
           30+0 records in
           30+0 records out
           125829120 bytes (126 MB, 120 MiB) copied, 0.325919 s, 386 MB/s
      
        There are four xfstest regressions with this patch. David Howells and
        Matthew Wilcox have been helping me work with this code"
      
      * tag 'for-linus-5.13-ofs-1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux:
        orangefs: leave files in the page cache for a few micro seconds at least
        Orangef: implement orangefs_readahead.
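
The two `dd` results quoted in the orangefs pull message can be cross-checked with a little arithmetic. The sketch below is not part of the pull request: `throughput_mb_per_s` is a made-up helper name, and it assumes dd's decimal megabytes (1 MB = 10^6 bytes). It recomputes both throughput figures, the per-record size, and the resulting speedup.

```python
# Illustrative cross-check of the dd figures quoted in the pull message.
# throughput_mb_per_s is a hypothetical helper, not kernel or dd code.

BYTES_COPIED = 125829120          # reported by both dd runs
RECORDS = 30                      # "30+0 records in/out"


def throughput_mb_per_s(num_bytes: int, seconds: float) -> float:
    """Throughput in decimal megabytes per second (1 MB = 10**6 bytes)."""
    return num_bytes / seconds / 1_000_000


record_size = BYTES_COPIED // RECORDS                    # bytes per record
before = throughput_mb_per_s(BYTES_COPIED, 5.77943)      # 5.11.8-200.fc33.x86_64
after = throughput_mb_per_s(BYTES_COPIED, 0.325919)      # 5.12.0-rc4 plus the patch
speedup = after / before

if __name__ == "__main__":
    print(f"record size: {record_size // (1024 * 1024)} MiB")
    print(f"before: {before:.1f} MB/s, after: {after:.0f} MB/s")
    print(f"speedup: {speedup:.1f}x")
```

Both recomputed rates agree with what dd printed, and the ratio between them is a little under 18x on this test VM.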
    • Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 27787ba3
      Linus Torvalds authored
      Pull misc vfs updates from Al Viro:
       "Assorted stuff all over the place"
      
      * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        useful constants: struct qstr for ".."
        hostfs_open(): don't open-code file_dentry()
        whack-a-mole: kill strlen_user() (again)
        autofs: should_expire() argument is guaranteed to be positive
        apparmor:match_mn() - constify devpath argument
        buffer: a small optimization in grow_buffers
        get rid of autofs_getpath()
        constify dentry argument of dentry_path()/dentry_path_raw()
    • Merge branch 'work.ecryptfs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · b28866f4
      Linus Torvalds authored
      Pull ecryptfs updates from Al Viro:
       "The interesting part here is (ecryptfs) lock_parent() fixes - its
        treatment of ->d_parent had been very wrong.
      
        The rest is trivial cleanups"
      
      * 'work.ecryptfs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        ecryptfs: ecryptfs_dentry_info->crypt_stat is never used
        ecryptfs: get rid of unused accessors
        ecryptfs: saner API for lock_parent()
        ecryptfs: get rid of pointless dget/dput in ->symlink() and ->link()
    • Merge tag 'landlock_v34' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security · 17ae69ab
      Linus Torvalds authored
      Pull Landlock LSM from James Morris:
       "Add Landlock, a new LSM from Mickaël Salaün.
      
        Briefly, Landlock provides for unprivileged application sandboxing.
      
        From Mickaël's cover letter:
          "The goal of Landlock is to enable to restrict ambient rights (e.g.
           global filesystem access) for a set of processes. Because Landlock
           is a stackable LSM [1], it makes possible to create safe security
           sandboxes as new security layers in addition to the existing
           system-wide access-controls. This kind of sandbox is expected to
           help mitigate the security impact of bugs or unexpected/malicious
           behaviors in user-space applications. Landlock empowers any
           process, including unprivileged ones, to securely restrict
           themselves.
      
           Landlock is inspired by seccomp-bpf but instead of filtering
           syscalls and their raw arguments, a Landlock rule can restrict the
           use of kernel objects like file hierarchies, according to the
           kernel semantic. Landlock also takes inspiration from other OS
           sandbox mechanisms: XNU Sandbox, FreeBSD Capsicum or OpenBSD
           Pledge/Unveil.
      
           In this current form, Landlock misses some access-control features.
           This enables to minimize this patch series and ease review. This
           series still addresses multiple use cases, especially with the
           combined use of seccomp-bpf: applications with built-in sandboxing,
           init systems, security sandbox tools and security-oriented APIs [2]"
      
        The cover letter and v34 posting is here:
      
            https://lore.kernel.org/linux-security-module/20210422154123.13086-1-mic@digikod.net/
      
        See also:
      
            https://landlock.io/
      
        This code has had extensive design discussion and review over several
        years"
      
      Link: https://lore.kernel.org/lkml/50db058a-7dde-441b-a7f9-f6837fe8b69f@schaufler-ca.com/ [1]
      Link: https://lore.kernel.org/lkml/f646e1c7-33cf-333f-070c-0a40ad0468cd@digikod.net/ [2]
      
      * tag 'landlock_v34' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
        landlock: Enable user space to infer supported features
        landlock: Add user and kernel documentation
        samples/landlock: Add a sandbox manager example
        selftests/landlock: Add user space tests
        landlock: Add syscall implementations
        arch: Wire up Landlock syscalls
        fs,security: Add sb_delete hook
        landlock: Support filesystem access-control
        LSM: Infrastructure management of the superblock
        landlock: Add ptrace restrictions
        landlock: Set up the security framework and manage credentials
        landlock: Add ruleset and domain management
        landlock: Add object management
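
The "Enable user space to infer supported features" commit in the Landlock series lets a process ask the kernel which Landlock ABI version it supports. The following is a minimal sketch of such a probe, not code from the series. It assumes Linux, assumes the syscall number 444 for `landlock_create_ruleset` (true for x86_64 and arm64 as far as I know, but not guaranteed for every architecture), and uses a hypothetical helper name `landlock_abi_version`. It treats any failure (old kernel, Landlock disabled, sandboxed syscall) as "not available".

```python
import ctypes
import sys

# Assumption: generic syscall number for landlock_create_ruleset.
SYS_LANDLOCK_CREATE_RULESET = 444
# Flag asking the kernel to return the highest supported ABI version.
LANDLOCK_CREATE_RULESET_VERSION = 1 << 0


def landlock_abi_version():
    """Return the Landlock ABI version as an int, or None if unavailable.

    Hypothetical helper for illustration; it only probes and never
    restricts the calling process.
    """
    if not sys.platform.startswith("linux"):
        return None
    try:
        libc = ctypes.CDLL(None, use_errno=True)
        libc.syscall.restype = ctypes.c_long
        ret = libc.syscall(
            ctypes.c_long(SYS_LANDLOCK_CREATE_RULESET),
            ctypes.c_void_p(None),   # attr must be NULL for a version query
            ctypes.c_size_t(0),      # size must be 0 for a version query
            ctypes.c_uint32(LANDLOCK_CREATE_RULESET_VERSION),
        )
    except (OSError, AttributeError):
        return None
    # A negative return means the syscall failed (e.g. ENOSYS, EOPNOTSUPP).
    return ret if ret >= 0 else None


if __name__ == "__main__":
    version = landlock_abi_version()
    if version is None:
        print("Landlock is not available here")
    else:
        print(f"Landlock ABI version: {version}")
```

A sandboxing tool would typically run a probe like this first and fall back to weaker restrictions, or none, when it reports that Landlock is unavailable.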
  2. 01 May, 2021 7 commits
    • Merge tag 'integrity-v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity · e6f0bf09
      Linus Torvalds authored
      Pull IMA updates from Mimi Zohar:
       "In addition to loading the kernel module signing key onto the builtin
        keyring, load it onto the IMA keyring as well.
      
        Also six trivial changes and bug fixes"
      
      * tag 'integrity-v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
        ima: ensure IMA_APPRAISE_MODSIG has necessary dependencies
        ima: Fix fall-through warnings for Clang
        integrity: Add declarations to init_once void arguments.
        ima: Fix function name error in comment.
        ima: enable loading of build time generated key on .ima keyring
        ima: enable signing of modules with build time generated key
        keys: cleanup build time module signing keys
        ima: Fix the error code for restoring the PCR value
        ima: without an IMA policy loaded, return quickly
    • Merge tag 'perf-tools-for-v5.13-2021-04-29' of... · 10a3efd0
      Linus Torvalds authored
      Merge tag 'perf-tools-for-v5.13-2021-04-29' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
      
      Pull perf tool updates from Arnaldo Carvalho de Melo:
       "perf stat:
      
         - Add support for hybrid PMUs to support systems such as Intel
           Alderlake and its BIG/little core/atom cpus.
      
         - Introduce 'bperf' to share hardware PMCs with BPF.
      
         - New --iostat option to collect and present IO stats on Intel
           hardware.
      
           This functionality is based on recently introduced sysfs attributes
           for Intel® Xeon® Scalable processor family (code name Skylake-SP)
           in commit bb42b3d3 ("perf/x86/intel/uncore: Expose an Uncore
           unit to IIO PMON mapping")
      
           It is intended to provide four I/O performance metrics in MB per
           each PCIe root port:
      
             - Inbound Read: I/O devices below root port read from the host memory
             - Inbound Write: I/O devices below root port write to the host memory
             - Outbound Read: CPU reads from I/O devices below root port
             - Outbound Write: CPU writes to I/O devices below root port
      
         - Align CSV output for summary.
      
         - Clarify --null use cases: Assess raw overhead of 'perf stat' or
           measure just wall clock time.
      
         - Improve readability of shadow stats.
      
        perf record:
      
         - Change the COMM when starting the workload so that --exclude-perf
           doesn't seem to be not honoured.
      
         - Improve 'Workload failed' message printing events + what was
           exec'ed.
      
         - Fix cross-arch support for TIME_CONV.
      
        perf report:
      
         - Add option to disable raw event ordering.
      
         - Dump the contents of PERF_RECORD_TIME_CONV in 'perf report -D'.
      
         - Improvements to --stat output, that shows information about
           PERF_RECORD_ events.
      
         - Preserve identifier id in OCaml demangler.
      
        perf annotate:
      
         - Show full source location with 'l' hotkey in the 'perf annotate'
           TUI.
      
         - Add line number like in TUI and source location at EOL to the 'perf
           annotate' --stdio mode.
      
         - Add --demangle and --demangle-kernel to 'perf annotate'.
      
         - Allow configuring annotate.demangle{,_kernel} in 'perf config'.
      
         - Fix sample events lost in stdio mode.
      
        perf data:
      
         - Allow converting a perf.data file to JSON.
      
        libperf:
      
         - Add support for user space counter access.
      
         - Update topdown documentation to permit rdpmc calls.
      
        perf test:
      
         - Add 'perf test' for 'perf stat' CSV output.
      
         - Add 'perf test' entries to test the hybrid PMU support.
      
         - Cleanup 'perf test daemon' if its 'perf test' is interrupted.
      
         - Handle metric reuse in pmu-events parsing 'perf test' entry.
      
         - Add test for PE executable support.
      
         - Add timeout for wait for daemon start in its 'perf test' entries.
      
        Build:
      
         - Enable libtraceevent dynamic linking.
      
         - Improve feature detection output.
      
         - Fix caching of feature checks.
      
         - First round of updates for tools copies of kernel headers.
      
         - Enable warnings when compiling BPF programs.
      
        Vendor specific events:
      
         - Intel:
            - Add missing skylake & icelake model numbers.
      
         - arm64:
            - Add Hisi hip08 L1, L2 and L3 metrics.
            - Add Fujitsu A64FX PMU events.
      
         - PowerPC:
            - Initial JSON/events list for power10 platform.
            - Remove unsupported power9 metrics.
      
         - AMD:
            - Add Zen3 events.
            - Fix broken L2 Cache Hits from L2 HWPF metric.
            - Use lowercases for all the eventcodes and umasks.
      
        Hardware tracing:
      
         - arm64:
            - Update CoreSight ETM metadata format.
            - Fix bitmap for CS-ETM option.
            - Support PID tracing in config.
            - Detect pid in VMID for kernel running at EL2.
      
        Arch specific updates:
      
         - MIPS:
            - Support MIPS unwinding and dwarf-regs.
            - Generate mips syscalls_n64.c syscall table.
      
         - PowerPC:
            - Add support for PERF_SAMPLE_WEIGHT_STRUCT on PowerPC.
            - Support pipeline stage cycles for powerpc.
      
        libbeauty:
      
         - Fix fsconfig generator"
      
      * tag 'perf-tools-for-v5.13-2021-04-29' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (132 commits)
        perf build: Defer printing detected features to the end of all feature checks
        tools build: Allow deferring printing the results of feature detection
        perf build: Regenerate the FEATURE_DUMP file after extra feature checks
        perf session: Dump PERF_RECORD_TIME_CONV event
        perf session: Add swap operation for event TIME_CONV
        perf jit: Let convert_timestamp() to be backwards-compatible
        perf tools: Change fields type in perf_record_time_conv
        perf tools: Enable libtraceevent dynamic linking
        perf Documentation: Document intel-hybrid support
        perf tests: Skip 'perf stat metrics (shadow stat) test' for hybrid
        perf tests: Support 'Convert perf time to TSC' test for hybrid
        perf tests: Support 'Session topology' test for hybrid
        perf tests: Support 'Parse and process metrics' test for hybrid
        perf tests: Support 'Track with sched_switch' test for hybrid
        perf tests: Skip 'Setup struct perf_event_attr' test for hybrid
        perf tests: Add hybrid cases for 'Roundtrip evsel->name' test
        perf tests: Add hybrid cases for 'Parse event definition strings' test
        perf record: Uniquify hybrid event name
        perf stat: Warn group events from different hybrid PMU
        perf stat: Filter out unmatched aggregation for hybrid event
        ...
    • afs: Fix speculative status fetches · 22650f14
      David Howells authored
      The generic/464 xfstest causes kAFS to emit occasional warnings of the
      form:
      
              kAFS: vnode modified {100055:8a} 30->31 YFS.StoreData64 (c=6015)
      
      This indicates that the data version received back from the server did not
      match the expected value (the DV should be incremented monotonically for
      each individual modification op committed to a vnode).
      
      What is happening is that a lookup call is doing a bulk status fetch
      speculatively on a bunch of vnodes in a directory besides getting the
      status of the vnode it's actually interested in.  This is racing with a
      StoreData operation (though it could also occur with, say, a MakeDir op).
      
      On the client, a modification operation locks the vnode, but the bulk
      status fetch only locks the parent directory, so no ordering is imposed
      there (thereby avoiding an avenue to deadlock).
      
      On the server, the StoreData op handler doesn't lock the vnode until it's
      received all the request data, and downgrades the lock after committing the
      data until it has finished sending change notifications to other clients -
      which allows the status fetch to occur before it has finished.
      
      This means that:
      
       - a status fetch can access the target vnode either side of the exclusive
         section of the modification
      
       - the status fetch could start before the modification, yet finish after,
         and vice-versa.
      
       - the status fetch and the modification RPCs can complete in either order.
      
       - the status fetch can return either the before or the after DV from the
         modification.
      
       - the status fetch might regress the locally cached DV.
      
      Some of these are handled by the previous fix[1], but that's not sufficient
      because it checks the DV it received against the DV it cached at the start
      of the op, but the DV might've been updated in the meantime by a locally
      generated modification op.
      
      Fix this by the following means:
      
       (1) Keep track of when we're performing a modification operation on a
           vnode.  This is done by marking vnode parameters with a 'modification'
           note that causes the AFS_VNODE_MODIFYING flag to be set on the vnode
           for the duration.
      
       (2) Alter the speculation race detection to ignore speculative status
           fetches if either the vnode is marked as being modified or the data
           version number is not what we expected.
      
      Note that whilst the "vnode modified" warning does get recovered from as it
      causes the client to refetch the status at the next opportunity, it will
      also invalidate the pagecache, so changes might get lost.
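
The two-part fix described in this commit message can be modelled in a few lines. This is a toy model in Python, not the kAFS code (which is C): the `Vnode` class, the `apply_fetched_status` function and the scenario are all hypothetical, and only the `modifying` field is meant to stand in for the AFS_VNODE_MODIFYING flag named above.

```python
from dataclasses import dataclass


@dataclass
class Vnode:
    """Toy stand-in for a cached vnode; not the kernel's struct afs_vnode."""
    data_version: int
    modifying: bool = False   # models the AFS_VNODE_MODIFYING flag


def apply_fetched_status(vnode: Vnode, dv_at_start: int,
                         fetched_dv: int, speculative: bool) -> bool:
    """Apply a fetched status to the cached vnode; return True if applied.

    A speculative fetch is ignored when a modification is in progress or
    when the cached DV changed since the fetch began (rule 2 above).
    """
    if speculative and (vnode.modifying or vnode.data_version != dv_at_start):
        return False
    vnode.data_version = fetched_dv
    return True


def demo():
    """Replay the race: a bulk status fetch overlaps a local StoreData."""
    vnode = Vnode(data_version=30)
    dv_seen_by_fetch = vnode.data_version        # speculative fetch starts

    vnode.modifying = True                       # StoreData begins (rule 1)
    during = apply_fetched_status(vnode, dv_seen_by_fetch, 30, speculative=True)

    apply_fetched_status(vnode, 30, 31, speculative=False)  # StoreData reply
    vnode.modifying = False                      # StoreData ends

    late = apply_fetched_status(vnode, dv_seen_by_fetch, 30, speculative=True)
    return during, late, vnode.data_version


if __name__ == "__main__":
    print(demo())
```

In this replay the stale speculative reply is dropped both while the modification is in flight and after it has completed, so the cached data version never regresses from 31 back to 30.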
      
      Fixes: a9e5c87c ("afs: Fix speculative status fetch going out of order wrt to modifications")
      Reported-by: Marc Dionne <marc.dionne@auristor.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Tested-and-reviewed-by: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
      Link: https://lore.kernel.org/r/160605082531.252452.14708077925602709042.stgit@warthog.procyon.org.uk/ [1]
      Link: https://lore.kernel.org/linux-fsdevel/161961335926.39335.2552653972195467566.stgit@warthog.procyon.org.uk/ # v1
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge tag 'for-5.13/dm-changes' of... · 7af81cd0
      Linus Torvalds authored
      Merge tag 'for-5.13/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
      
      Pull device mapper updates from Mike Snitzer:
      
       - Improve scalability of DM's device hash by switching to rbtree
      
       - Extend DM ioctl's DM_LIST_DEVICES_CMD handling to include UUID and
         allow filtering based on name or UUID prefix.
      
       - Various small fixes for typos, warnings, unused function, or
         needlessly exported interfaces.
      
       - Remove needless request_queue NULL pointer checks in DM thin and
         cache targets.
      
       - Remove unnecessary loop in DM core's __split_and_process_bio().
      
       - Remove DM core's dm_vcalloc() and just use kvcalloc or kvmalloc_array
         instead (depending whether zeroing is useful).
      
       - Fix request-based DM's double free of blk_mq_tag_set in device remove
         after table load fails.
      
       - Improve DM persistent data performance on non-x86 by fixing packed
         structs to have a stated alignment. Also remove needless extra work
         from redundant calls to sm_disk_get_nr_free() and a paranoid BUG_ON()
         that caused duplicate checksum calculation.
      
       - Fix missing goto in DM integrity's bitmap_flush_interval error
         handling.
      
       - Add "reset_recalculate" feature flag to DM integrity.
      
       - Improve DM integrity by leveraging discard support to avoid needless
         re-writing of metadata and also use discard support to improve hash
         recalculation.
      
       - Fix race with DM raid target's reshape and MD raid4/5/6 resync that
         resulted in inconsistent reshape state during table reloads.
      
       - Update DM raid target to remove unnecessary discard limits for raid0
         and raid10 now that MD has optimized discard handling for both raid
         levels.
      
      * tag 'for-5.13/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (26 commits)
        dm raid: remove unnecessary discard limits for raid0 and raid10
        dm rq: fix double free of blk_mq_tag_set in dev remove after table load fails
        dm integrity: use discard support when recalculating
        dm integrity: increase RECALC_SECTORS to improve recalculate speed
        dm integrity: don't re-write metadata if discarding same blocks
        dm raid: fix inconclusive reshape layout on fast raid4/5/6 table reload sequences
        dm raid: fix fall-through warning in rs_check_takeover() for Clang
        dm clone metadata: remove unused function
        dm integrity: fix missing goto in bitmap_flush_interval error handling
        dm: replace dm_vcalloc()
        dm space map common: fix division bug in sm_ll_find_free_block()
        dm persistent data: packed struct should have an aligned() attribute too
        dm btree spine: remove paranoid node_check call in node_prep_for_write()
        dm space map disk: remove redundant calls to sm_disk_get_nr_free()
        dm integrity: add the "reset_recalculate" feature flag
        dm persistent data: remove unused return from exit_shadow_spine()
        dm cache: remove needless request_queue NULL pointer checks
        dm thin: remove needless request_queue NULL pointer check
        dm: unexport dm_{get,put}_table_device
        dm ebs: fix a few typos
        ...
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 152d32aa
      Linus Torvalds authored
      Pull kvm updates from Paolo Bonzini:
       "This is a large update by KVM standards, including AMD PSP (Platform
        Security Processor, aka "AMD Secure Technology") and ARM CoreSight
        (debug and trace) changes.
      
        ARM:
      
         - CoreSight: Add support for ETE and TRBE
      
         - Stage-2 isolation for the host kernel when running in protected
           mode
      
         - Guest SVE support when running in nVHE mode
      
         - Force W^X hypervisor mappings in nVHE mode
      
         - ITS save/restore for guests using direct injection with GICv4.1
      
         - nVHE panics now produce readable backtraces
      
         - Guest support for PTP using the ptp_kvm driver
      
         - Performance improvements in the S2 fault handler
      
        x86:
      
         - AMD PSP driver changes
      
         - Optimizations and cleanup of nested SVM code
      
         - AMD: Support for virtual SPEC_CTRL
      
         - Optimizations of the new MMU code: fast invalidation, zap under
           read lock, enable/disable dirty page logging under read lock
      
         - /dev/kvm API for AMD SEV live migration (guest API coming soon)
      
         - support SEV virtual machines sharing the same encryption context
      
         - support SGX in virtual machines
      
         - add a few more statistics
      
         - improved directed yield heuristics
      
         - Lots and lots of cleanups
      
        Generic:
      
         - Rework of MMU notifier interface, simplifying and optimizing the
           architecture-specific code
      
         - a handful of "Get rid of oprofile leftovers" patches
      
         - Some selftests improvements"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (379 commits)
        KVM: selftests: Speed up set_memory_region_test
        selftests: kvm: Fix the check of return value
        KVM: x86: Take advantage of kvm_arch_dy_has_pending_interrupt()
        KVM: SVM: Skip SEV cache flush if no ASIDs have been used
        KVM: SVM: Remove an unnecessary prototype declaration of sev_flush_asids()
        KVM: SVM: Drop redundant svm_sev_enabled() helper
        KVM: SVM: Move SEV VMCB tracking allocation to sev.c
        KVM: SVM: Explicitly check max SEV ASID during sev_hardware_setup()
        KVM: SVM: Unconditionally invoke sev_hardware_teardown()
        KVM: SVM: Enable SEV/SEV-ES functionality by default (when supported)
        KVM: SVM: Condition sev_enabled and sev_es_enabled on CONFIG_KVM_AMD_SEV=y
        KVM: SVM: Append "_enabled" to module-scoped SEV/SEV-ES control variables
        KVM: SEV: Mask CPUID[0x8000001F].eax according to supported features
        KVM: SVM: Move SEV module params/variables to sev.c
        KVM: SVM: Disable SEV/SEV-ES if NPT is disabled
        KVM: SVM: Free sev_asid_bitmap during init if SEV setup fails
        KVM: SVM: Zero out the VMCB array used to track SEV ASID association
        x86/sev: Drop redundant and potentially misleading 'sev_enabled'
        KVM: x86: Move reverse CPUID helpers to separate header file
        KVM: x86: Rename GPR accessors to make mode-aware variants the defaults
        ...
    • Merge tag 'iommu-updates-v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu · 4f970105
      Linus Torvalds authored
      Pull iommu updates from Joerg Roedel:
      
       - Big cleanup of almost unused parts of the IOMMU API by Christoph
         Hellwig. This mostly affects the Freescale PAMU driver.
      
       - New IOMMU driver for Unisoc SOCs
      
       - ARM SMMU Updates from Will:
           - Drop vestigial PREFETCH_ADDR support (SMMUv3)
           - Elide TLB sync logic for empty gather (SMMUv3)
           - Fix "Service Failure Mode" handling (SMMUv3)
           - New Qualcomm compatible string (SMMUv2)
      
       - Removal of the AMD IOMMU performance counter writeable check on AMD.
         It caused long boot delays on some machines and is only needed to
         work around an erratum on some older (possibly pre-production) chips.
         If someone is still hit by this hardware issue anyway the performance
         counters will just return 0.
      
       - Support for targeted invalidations in the AMD IOMMU driver. Before
         that the driver only invalidated a single 4k page or the whole IO/TLB
         for an address space. This has been extended now and is mostly useful
         for emulated AMD IOMMUs.
      
       - Several fixes for the Shared Virtual Memory support in the Intel VT-d
         driver
      
       - Mediatek drivers can now be built as modules
      
       - Re-introduction of the forcedac boot option which got lost when
         converting the Intel VT-d driver to the common dma-iommu
         implementation.
      
       - Extension of the IOMMU device registration interface and support
         iommu_ops to be const again when drivers are built as modules.
      
      * tag 'iommu-updates-v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (84 commits)
        iommu: Streamline registration interface
        iommu: Statically set module owner
        iommu/mediatek-v1: Add error handle for mtk_iommu_probe
        iommu/mediatek-v1: Avoid build fail when build as module
        iommu/mediatek: Always enable the clk on resume
        iommu/fsl-pamu: Fix uninitialized variable warning
        iommu/vt-d: Force to flush iotlb before creating superpage
        iommu/amd: Put newline after closing bracket in warning
        iommu/vt-d: Fix an error handling path in 'intel_prepare_irq_remapping()'
        iommu/vt-d: Fix build error of pasid_enable_wpe() with !X86
        iommu/amd: Remove performance counter pre-initialization test
        Revert "iommu/amd: Fix performance counter initialization"
        iommu/amd: Remove duplicate check of devid
        iommu/exynos: Remove unneeded local variable initialization
        iommu/amd: Page-specific invalidations for more than one page
        iommu/arm-smmu-v3: Remove the unused fields for PREFETCH_CONFIG command
        iommu/vt-d: Avoid unnecessary cache flush in pasid entry teardown
        iommu/vt-d: Invalidate PASID cache when root/context entry changed
        iommu/vt-d: Remove WO permissions on second-level paging entries
        iommu/vt-d: Report the right page fault address
        ...
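
To make the "single 4k page or the whole IO/TLB" limitation from the AMD IOMMU item concrete: one general way to cover a multi-page range with a single invalidation is to round it out to a naturally aligned power-of-two region. The sketch below illustrates only that rounding. It is an assumption made for illustration, not code taken from the AMD IOMMU driver, and `covering_region` is a hypothetical name.

```python
PAGE_SHIFT = 12  # 4k pages, as in the pull message


def covering_region(address: int, size: int, page_shift: int = PAGE_SHIFT):
    """Return (base, length) of the smallest naturally aligned
    power-of-two region, at least one page long, that contains
    [address, address + size).

    Hypothetical helper for illustration only.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    end = address + size - 1
    # Page-index bits that differ between the first and last byte decide
    # how many low-order bits the region has to span.
    differing = (address ^ end) >> page_shift
    order = differing.bit_length()
    length = 1 << (page_shift + order)
    base = address & ~(length - 1)
    return base, length


if __name__ == "__main__":
    for addr, size in [(0x1000, 0x1000), (0x2000, 0x2000), (0x1000, 0x2000)]:
        base, length = covering_region(addr, size)
        print(f"range {addr:#x}+{size:#x} -> region {base:#x}+{length:#x}")
```

The last case shows the cost of this approach: two pages that straddle an 8k boundary need a 16k region, so some neighbouring entries are invalidated as well, which is still far cheaper than flushing the whole address space.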
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma · f34b2cf1
      Linus Torvalds authored
      Pull rdma updates from Jason Gunthorpe:
       "This is significantly bug fixes and general cleanups. The noteworthy
        new features are fairly small:
      
         - XRC support for HNS and improved RQ operations
      
         - Bug fixes and updates for hns, mlx5, bnxt_re, hfi1, i40iw, rxe, siw
           and qib
      
         - Quite a few general cleanups on spelling, error handling, static
           checker detections, etc
      
         - Increase the number of device ports supported beyond 255. High port
           count software switches now exist
      
         - Several bug fixes for rtrs
      
         - mlx5 Device Memory support for host controlled atomics
      
         - Report SRQ tables through to rdma-tool"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (145 commits)
        IB/qib: Remove redundant assignment to ret
        RDMA/nldev: Add copy-on-fork attribute to get sys command
        RDMA/bnxt_re: Fix a double free in bnxt_qplib_alloc_res
        RDMA/siw: Fix a use after free in siw_alloc_mr
        IB/hfi1: Remove redundant variable rcd
        RDMA/nldev: Add QP numbers to SRQ information
        RDMA/nldev: Return SRQ information
        RDMA/restrack: Add support to get resource tracking for SRQ
        RDMA/nldev: Return context information
        RDMA/core: Add CM to restrack after successful attachment to a device
        RDMA/cma: Skip device which doesn't support CM
        RDMA/rxe: Fix a bug in rxe_fill_ip_info()
        RDMA/mlx5: Expose private query port
        RDMA/mlx4: Remove an unused variable
        RDMA/mlx5: Fix type assignment for ICM DM
        IB/mlx5: Set right RoCE l3 type and roce version while deleting GID
        RDMA/i40iw: Fix error unwinding when i40iw_hmc_sd_one fails
        RDMA/cxgb4: add missing qpid increment
        IB/ipoib: Remove unnecessary struct declaration
        RDMA/bnxt_re: Get rid of custom module reference counting
        ...
  3. 30 Apr, 2021 29 commits
    • Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · 9f67672a
      Linus Torvalds authored
      Pull ext4 updates from Ted Ts'o:
       "New features for ext4 this cycle include support for encrypted
        casefold and ensuring that deleted file names are cleared in
        directory blocks by zeroing directory entries when they are unlinked
        or moved as part of a hash tree node split. We also improve the
        block allocator's performance on a freshly mounted file system by
        prefetching block bitmaps.
      
        There are also the usual cleanups and bug fixes, including fixing a
        page cache invalidation race when there is mixed buffered and direct
        I/O and the block size is less than page size, and allowing the dax
        flag to be set and cleared on inline directories"
      
      * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (32 commits)
        ext4: wipe ext4_dir_entry2 upon file deletion
        ext4: Fix occasional generic/418 failure
        fs: fix reporting supported extra file attributes for statx()
        ext4: allow the dax flag to be set and cleared on inline directories
        ext4: fix debug format string warning
        ext4: fix trailing whitespace
        ext4: fix various seppling typos
        ext4: fix error return code in ext4_fc_perform_commit()
        ext4: annotate data race in jbd2_journal_dirty_metadata()
        ext4: annotate data race in start_this_handle()
        ext4: fix ext4_error_err save negative errno into superblock
        ext4: fix error code in ext4_commit_super
        ext4: always panic when errors=panic is specified
        ext4: delete redundant uptodate check for buffer
        ext4: do not set SB_ACTIVE in ext4_orphan_cleanup()
        ext4: make prefetch_block_bitmaps default
        ext4: add proc files to monitor new structures
        ext4: improve cr 0 / cr 1 group scanning
        ext4: add MB_NUM_ORDERS macro
        ext4: add mballoc stats proc file
        ...
    • Merge tag 'dlm-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm · 6bab076a
      Linus Torvalds authored
      Pull dlm updates from David Teigland:
       "This includes more dlm networking cleanups and improvements for making
        dlm shutdowns more robust"
      
      * tag 'dlm-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm:
        fs: dlm: fix missing unlock on error in accept_from_sock()
        fs: dlm: add shutdown hook
        fs: dlm: flush swork on shutdown
        fs: dlm: remove unaligned memory access handling
        fs: dlm: check on minimum msglen size
        fs: dlm: simplify writequeue handling
        fs: dlm: use GFP_ZERO for page buffer
        fs: dlm: change allocation limits
        fs: dlm: add check if dlm is currently running
        fs: dlm: add errno handling to check callback
        fs: dlm: set subclass for othercon sock_mutex
        fs: dlm: set connected bit after accept
        fs: dlm: fix mark setting deadlock
        fs: dlm: fix debugfs dump
      6bab076a
    • Linus Torvalds's avatar
      Merge tag 'fuse-update-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse · 9ec1efbf
      Linus Torvalds authored
      Pull fuse updates from Miklos Szeredi:
      
       - Fix a page locking bug in write (introduced in 2.6.26)
      
       - Allow sgid bit to be killed in setacl()
      
       - Miscellaneous fixes and cleanups
      
      * tag 'fuse-update-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
        cuse: simplify refcount
        cuse: prevent clone
        virtiofs: fix userns
        virtiofs: remove useless function
        virtiofs: split requests that exceed virtqueue size
        virtiofs: fix memory leak in virtio_fs_probe()
        fuse: invalidate attrs when page writeback completes
        fuse: add a flag FUSE_SETXATTR_ACL_KILL_SGID to kill SGID
        fuse: extend FUSE_SETXATTR request
        fuse: fix matching of FUSE_DEV_IOC_CLONE command
        fuse: fix a typo
        fuse: don't zero pages twice
        fuse: fix typo for fuse_conn.max_pages comment
        fuse: fix write deadlock
      9ec1efbf
    • Linus Torvalds's avatar
      Merge tag 'ovl-update-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs · d652502e
      Linus Torvalds authored
      Pull overlayfs update from Miklos Szeredi:
      
       - Fix a regression introduced in 5.2 that resulted in valid overlayfs
         mounts being rejected with ELOOP (Too many levels of symbolic links)
      
       - Fix bugs found by various tools
      
       - Miscellaneous improvements and cleanups
      
      * tag 'ovl-update-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
        ovl: add debug print to ovl_do_getxattr()
        ovl: invalidate readdir cache on changes to dir with origin
        ovl: allow upperdir inside lowerdir
        ovl: show "userxattr" in the mount data
        ovl: trivial typo fixes in the file inode.c
        ovl: fix misspellings using codespell tool
        ovl: do not copy attr several times
        ovl: remove ovl_map_dev_ino() return value
        ovl: fix error for ovl_fill_super()
        ovl: fix missing revert_creds() on error path
        ovl: fix leaked dentry
        ovl: restrict lower null uuid for "xino=auto"
        ovl: check that upperdir path is not on a read-only mount
        ovl: plumb through flush method
      d652502e
    • Linus Torvalds's avatar
      Merge branch 'akpm' (patches from Andrew) · d42f323a
      Linus Torvalds authored
      Merge misc updates from Andrew Morton:
       "A few misc subsystems and some of MM.
      
        175 patches.
      
        Subsystems affected by this patch series: ia64, kbuild, scripts, sh,
        ocfs2, kfifo, vfs, kernel/watchdog, and mm (slab-generic, slub,
        kmemleak, debug, pagecache, msync, gup, memremap, memcg, pagemap,
        mremap, dma, sparsemem, vmalloc, documentation, kasan, initialization,
        pagealloc, and memory-failure)"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (175 commits)
        mm/memory-failure: unnecessary amount of unmapping
        mm/mmzone.h: fix existing kernel-doc comments and link them to core-api
        mm: page_alloc: ignore init_on_free=1 for debug_pagealloc=1
        net: page_pool: use alloc_pages_bulk in refill code path
        net: page_pool: refactor dma_map into own function page_pool_dma_map
        SUNRPC: refresh rq_pages using a bulk page allocator
        SUNRPC: set rq_page_end differently
        mm/page_alloc: inline __rmqueue_pcplist
        mm/page_alloc: optimize code layout for __alloc_pages_bulk
        mm/page_alloc: add an array-based interface to the bulk page allocator
        mm/page_alloc: add a bulk page allocator
        mm/page_alloc: rename alloced to allocated
        mm/page_alloc: duplicate include linux/vmalloc.h
        mm, page_alloc: avoid page_to_pfn() in move_freepages()
        mm/Kconfig: remove default DISCONTIGMEM_MANUAL
        mm: page_alloc: dump migrate-failed pages
        mm/mempolicy: fix mpol_misplaced kernel-doc
        mm/mempolicy: rewrite alloc_pages_vma documentation
        mm/mempolicy: rewrite alloc_pages documentation
        mm/mempolicy: rename alloc_pages_current to alloc_pages
        ...
      d42f323a
    • Linus Torvalds's avatar
      Merge tag 'pinctrl-v5.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl · 65ec0a7d
      Linus Torvalds authored
      Pull pin control updates from Linus Walleij:
       "There is a lot going on!
      
        Core changes:
      
         - A semantic change to handle pinmux and pinconf in explicit order
           while up until now we depended on the semantic order in the device
           tree. The device tree is a functional programming language and does
           not imply any order, so the right thing is for the pin control core
           to provide these semantics.
      
         - Add a new pinmux-select debugfs file which makes it possible to go
           in and select functions for a pin manually (interactively, at the
           prompt) for debugging purposes.
      
         - Fixes to gpio regmap handling for a new pin control driver making
           use of regmap-gpio.
      
         - Use octal permissions on debugfs files.
      
        New drivers:
      
         - A massive rewrite of the former custom pin control driver for MIPS
           Broadcom devices to instead use the pin control subsystem. New pin
           control drivers for BCM6345, BCM6328, BCM6358, BCM6362, BCM6368,
           BCM63268 and BCM6318 SoC variants are implemented.
      
         - Support for PM8350, PM8350B, PM8350C, PMK8350, PMR735A and PMR735B
           in the Qualcomm PMIC GPIO driver. Also the two GPIOs on PM8008 are
           supported.
      
         - Support for the Rockchip RK3568/RK3566 pin controller.
      
         - Support for Ingenic JZ4730, JZ4750, JZ4755, JZ4775 and X2000.
      
         - Support for Mediatek MT8195.
      
         - Add a new Xilinx ZynqMP pin control driver.
      
        Driver improvements and non-urgent fixes:
      
         - Modularization and improvements of the Rockchip drivers.
      
         - Some new pins added to the description of new Renesas SoCs.
      
         - Clarifications of the GPIO base calculation in the Intel driver.
      
         - Fix the function names for the MPP54 and MPP55 pins in the Armada
           CP110 pin controller.
      
         - GPIO wakeup interrupt map for Qualcomm SC7280 and SM8350.
      
         - Support for ACPI probing of the Qualcomm SC8180x.
      
         - Fix interrupt clear status on Rockchip.
      
         - Fix some missing pins on the Ingenic JZ4770, some semantic fixes
           for the behaviour of the Ingenic pin controller. Add DMIC pins for
           JZ4780, X1000, X1500 and X1830.
      
         - A slew of janitorial fixes, such as of_node_put() calls"
      
      * tag 'pinctrl-v5.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (99 commits)
        pinctrl: Add Xilinx ZynqMP pinctrl driver support
        firmware: xilinx: Add pinctrl support
        pinctrl: rockchip: do coding style for mux route struct
        pinctrl: Add PIN_CONFIG_MODE_PWM to enum pin_config_param
        pinctrl: Introduce MODE group in enum pin_config_param
        pinctrl: Keep enum pin_config_param ordered by name
        dt-bindings: pinctrl: Add binding for ZynqMP pinctrl driver
        pinctrl: core: Fix kernel doc string for pin_get_name()
        pinctrl: mediatek: use spin lock in mtk_rmw
        pinctrl: add drive for I2C related pins on MT8195
        pinctrl: add pinctrl driver on mt8195
        dt-bindings: pinctrl: mt8195: add pinctrl file and binding document
        pinctrl: Ingenic: Add pinctrl driver for X2000.
        pinctrl: Ingenic: Add pinctrl driver for JZ4775.
        pinctrl: Ingenic: Add pinctrl driver for JZ4755.
        pinctrl: Ingenic: Add pinctrl driver for JZ4750.
        pinctrl: Ingenic: Add pinctrl driver for JZ4730.
        dt-bindings: pinctrl: Add bindings for new Ingenic SoCs.
        pinctrl: Ingenic: Reformat the code.
        pinctrl: Ingenic: Add DMIC pins support for Ingenic SoCs.
        ...
      65ec0a7d
    • Linus Torvalds's avatar
      Merge branch 'i2c/for-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · 592fa953
      Linus Torvalds authored
      Pull i2c updates from Wolfram Sang:
      
       - new drivers for Silicon Labs CP2615 and the HiSilicon I2C unit
      
       - bigger refactoring for the MPC driver
      
       - support for full software nodes - no need to work around with only
         properties anymore
      
       - we now have 'devm_i2c_add_adapter', too
      
       - sub-system wide fixes for the RPM refcounting problem which often
         caused a leak when an error was encountered during probe
      
       - the rest is usual driver updates and improvements
      
      * 'i2c/for-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (77 commits)
        i2c: mediatek: Use scl_int_delay_ns to compensate clock-stretching
        i2c: mediatek: Fix wrong dma sync flag
        i2c: mediatek: Fix send master code at more than 1MHz
        i2c: sh7760: fix IRQ error path
        i2c: i801: Add support for Intel Alder Lake PCH-M
        i2c: core: Fix spacing error by checkpatch
        i2c: s3c2410: simplify getting of_device_id match data
        i2c: nomadik: Fix space errors
        i2c: iop3xx: Fix coding style issues
        i2c: amd8111: Fix coding style issues
        i2c: mpc: Drop duplicate message from devm_platform_ioremap_resource()
        i2c: mpc: Use device_get_match_data() helper
        i2c: mpc: Remove CONFIG_PM_SLEEP ifdeffery
        i2c: mpc: Use devm_clk_get_optional()
        i2c: mpc: Update license and copyright
        i2c: mpc: Interrupt driven transfer
        i2c: sh7760: add IRQ check
        i2c: rcar: add IRQ check
        i2c: mlxbf: add IRQ check
        i2c: jz4780: add IRQ check
        ...
      592fa953
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid · efd8929b
      Linus Torvalds authored
      Pull HID updates from Jiri Kosina:
      
       - Surface Aggregator Module support from Maximilian Luz
      
       - Apple Magic Mouse 2 support from John Chen
      
       - Support for newer Quad/BT 2.0 Logitech receivers in HID proxy mode
         from Hans de Goede
      
       - Thinkpad X1 Tablet keyboard support from Hans de Goede
      
       - Support for FTDI FT260 I2C host adapter from Michael Zaidman
      
       - other various small device-specific quirks, fixes and cleanups
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: (46 commits)
        HID: wacom: Setup pen input capabilities to the targeted tools
        HID: hid-sensor-hub: Move 'hsdev' description to correct struct definition
        HID: hid-sensor-hub: Remove unused struct member 'quirks'
        HID: wacom_sys: Demote kernel-doc abuse
        HID: hid-sensor-custom: Remove unused variable 'ret'
        HID: hid-uclogic-params: Ensure function names are present and correct in kernel-doc headers
        HID: hid-uclogic-rdesc: Kernel-doc is for functions and structs
        HID: hid-logitech-hidpp: Fix conformant kernel-doc header and demote abuses
        HID: hid-picolcd_core: Remove unused variable 'ret'
        HID: hid-kye: Fix incorrect function name for kye_tablet_enable()
        HID: hid-core: Fix incorrect function name in header
        HID: hid-alps: Correct struct misnaming
        HID: usbhid: hid-pidff: Demote a couple kernel-doc abuses
        HID: usbhid: Repair a formatting issue in a struct description
        HID: hid-thrustmaster: Demote a bunch of kernel-doc abuses
        HID: input: map battery capacity (00850065)
        HID: magicmouse: fix reconnection of Magic Mouse 2
        HID: magicmouse: fix 3 button emulation of Mouse 2
        HID: magicmouse: add Apple Magic Mouse 2 support
        HID: lenovo: Add support for Thinkpad X1 Tablet Thin keyboard
        ...
      efd8929b
    • Linus Torvalds's avatar
      Merge tag 'sound-5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · b71428d7
      Linus Torvalds authored
      Pull sound updates from Takashi Iwai:
       "No surprises in this development cycle: most of the work is fixes
        and improvements to the existing code, while a new LED control layer
        and a few new drivers have been introduced.
      
        Here are some highlights:
      
        Core:
         - A common mute-LED framework was introduced. It is used by HD-audio
           for now; more drivers will be adapted later. The former "Mic Mute-LED
           Mode" mixer control has been replaced with the corresponding sysfs
           entry.
         - User-control management was changed to count consumed bytes instead
           of capping by number of elements; this will allow more controls in
           the normal usage pattern while avoiding the possible memory
           exhaustion DoS
      
        ASoC:
         - Continued refactoring and cleanups in ASoC core and generic card
           drivers
         - Wide range of small cppcheck and warning fixes
         - New drivers for Freescale i.MX DMA over rpmsg, Mediatek MT6358
           accessory detection, and Realtek RT1019, RT1316, RT711 and RT715
      
        USB-audio:
         - Continued improvements and fixes of the implicit feedback mode,
           including better support for Pioneer and Roland/BOSS devices
      
        HD-audio:
         - Default back to non-buffer preallocation on x86
         - Cirrus codec improvements, more quirks for Realtek codecs
      
        Others:
         - New virtio sound driver
         - FireWire Bebob updates"
      
      * tag 'sound-5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (587 commits)
        ALSA: hda/conexant: Re-order CX5066 quirk table entries
        ALSA: hda/realtek: Remove redundant entry for ALC861 Haier/Uniwill devices
        ALSA: hda/realtek: Re-order ALC662 quirk table entries
        ALSA: hda/realtek: Re-order remaining ALC269 quirk table entries
        ALSA: hda/realtek: Re-order ALC269 Lenovo quirk table entries
        ALSA: hda/realtek: Re-order ALC269 Sony quirk table entries
        ALSA: hda/realtek: Re-order ALC269 ASUS quirk table entries
        ALSA: hda/realtek: Re-order ALC269 Dell quirk table entries
        ALSA: hda/realtek: Re-order ALC269 Acer quirk table entries
        ALSA: hda/realtek: Re-order ALC269 HP quirk table entries
        ALSA: hda/realtek: Re-order ALC882 Clevo quirk table entries
        ALSA: hda/realtek: Re-order ALC882 Sony quirk table entries
        ALSA: hda/realtek: Re-order ALC882 Acer quirk table entries
        ALSA: usb-audio: Remove redundant assignment to len
        ALSA: hda/realtek: Add quirk for Intel Clevo PCx0Dx
        ALSA: virtio: fix kernel-doc
        ALSA: hda/cirrus: Use CS8409 filter to fix abnormal sounds on Bullseye
        ALSA: hda/cirrus: Set Initial DMIC volume for Bullseye to -26 dB
        ALSA: sb: Fix two use after free in snd_sb_qsound_build
        ALSA: emu8000: Fix a use after free in snd_emu8000_create_mixer
        ...
      b71428d7
    • Linus Torvalds's avatar
      Merge tag 'drm-next-2021-04-30' of git://anongit.freedesktop.org/drm/drm · 95275402
      Linus Torvalds authored
      Pull more drm updates from Dave Airlie:
       "Looks like I missed a tegra feature request for next, but it should
        still be fine since it's pretty self-contained.
      
        Apart from that, there is a set of i915 and amdgpu fixes as usual,
        along with a few misc fixes.
      
        tegra:
         - Tegra186 hardware cursor support
         - better capability reporting for different SoC
         - better framebuffer modifier support
         - host1x fixes
      
        ttm:
         - fix unswappable BO handling
      
        efifb:
         - check for PCI before using it
      
        amdgpu:
         - Fixes for Aldebaran
         - Display LTTPR fixes
         - eDP fixes
         - Fixes for Vangogh
         - RAS fixes
         - ASPM support
         - Renoir SMU fixes
         - Modifier fixes
         - Misc code cleanups
         - Freesync fixes
      
        i915:
         - Several fixes to GLK handling in recent display refactoring
         - Rare watchdog timer race fix
         - Cppcheck redundant condition fix
         - Overlay error code propagation fix
         - Documentation fix
         - gvt: Remove one unused function warning
         - gvt: Fix intel_gvt_init_device() return type
         - gvt: Remove one duplicated register accessible check"
      
      * tag 'drm-next-2021-04-30' of git://anongit.freedesktop.org/drm/drm: (111 commits)
        efifb: Check efifb_pci_dev before using it
        drm/i915: Fix docbook descriptions for i915_gem_shrinker
        drm/i915: fix an error code in intel_overlay_do_put_image()
        drm/i915/display/psr: Fix cppcheck warnings
        drm/i915: Disable LTTPR detection on GLK once again
        drm/i915: Restore lost glk ccs w/a
        drm/i915: Restore lost glk FBC 16bpp w/a
        drm/i915: Take request reference before arming the watchdog timer
        drm/ttm: fix error handling if no BO can be swapped out v4
        drm/i915/gvt: Remove duplicated register accessible check
        drm/amdgpu/gmc9: remove dummy read workaround for newer chips
        drm/amdgpu: Add mem sync flag for IB allocated by SA
        drm/amdgpu: Fix SDMA RAS error reporting on Aldebaran
        drm/amdgpu: Reset RAS error count and status regs
        Revert "drm/amdgpu: workaround the TMR MC address issue (v2)"
        drm/amd/display: 3.2.132
        drm/amd/display: [FW Promotion] Release 0.0.62
        drm/amd/display: add helper for enabling mst stream features
        drm/amd/display: Report Proper Quantization Range in AVI Infoframe
        drm/amd/display: Fix call to pass bpp in 16ths of a bit
        ...
      95275402
    • Linus Torvalds's avatar
      Merge tag 'modules-for-v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux · 65c61de9
      Linus Torvalds authored
      Pull module updates from Jessica Yu:
       "Fix an age-old bug involving jump_labels and static_calls when
        CONFIG_MODULE_UNLOAD=n.
      
        When CONFIG_MODULE_UNLOAD=n, it means you can't unload modules, so
        normally the __exit sections of a module are not loaded at all.
        However, dynamic code patching (jump_label, static_call, alternatives)
        can have sites in __exit sections even if __exit is never executed.
      
        Reported by Peter Zijlstra:
           'Alternatives, jump_labels and static_call all can have relocations
            into __exit code. Not loading it at all would be BAD.'
      
        Therefore, load the __exit sections even when CONFIG_MODULE_UNLOAD=n,
        and discard them after init"
      
      * tag 'modules-for-v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
        module: treat exit sections the same as init sections when !CONFIG_MODULE_UNLOAD
      65c61de9
    • Linus Torvalds's avatar
      Merge tag 'powerpc-5.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · c70a4be1
      Linus Torvalds authored
      Pull powerpc updates from Michael Ellerman:
      
       - Enable KFENCE for 32-bit.
      
       - Implement EBPF for 32-bit.
      
       - Convert 32-bit to do interrupt entry/exit in C.
      
       - Convert 64-bit BookE to do interrupt entry/exit in C.
      
       - Changes to our signal handling code to use user_access_begin/end()
         more extensively.
      
       - Add support for time namespaces (CONFIG_TIME_NS).
      
       - A series of fixes that allow us to reenable STRICT_KERNEL_RWX.
      
       - Other smaller features, fixes & cleanups.
      
      Thanks to Alexey Kardashevskiy, Andreas Schwab, Andrew Donnellan, Aneesh
      Kumar K.V, Athira Rajeev, Bhaskar Chowdhury, Bixuan Cui, Cédric Le
      Goater, Chen Huang, Chris Packham, Christophe Leroy, Christopher M.
      Riedl, Colin Ian King, Dan Carpenter, Daniel Axtens, Daniel Henrique
      Barboza, David Gibson, Davidlohr Bueso, Denis Efremov, dingsenjie,
      Dmitry Safonov, Dominic DeMarco, Fabiano Rosas, Ganesh Goudar, Geert
      Uytterhoeven, Geetika Moolchandani, Greg Kurz, Guenter Roeck, Haren
      Myneni, He Ying, Jiapeng Chong, Jordan Niethe, Laurent Dufour, Lee
      Jones, Leonardo Bras, Li Huafei, Madhavan Srinivasan, Mahesh Salgaonkar,
      Masahiro Yamada, Nathan Chancellor, Nathan Lynch, Nicholas Piggin,
      Oliver O'Halloran, Paul Menzel, Pu Lehui, Randy Dunlap, Ravi Bangoria,
      Rosen Penev, Russell Currey, Santosh Sivaraj, Sebastian Andrzej Siewior,
      Segher Boessenkool, Shivaprasad G Bhat, Srikar Dronamraju, Stephen
      Rothwell, Thadeu Lima de Souza Cascardo, Thomas Gleixner, Tony Ambardar,
      Tyrel Datwyler, Vaibhav Jain, Vincenzo Frascino, Xiongwei Song, Yang Li,
      Yu Kuai, and Zhang Yunkai.
      
      * tag 'powerpc-5.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (302 commits)
        powerpc/signal32: Fix erroneous SIGSEGV on RT signal return
        powerpc: Avoid clang uninitialized warning in __get_user_size_allowed
        powerpc/papr_scm: Mark nvdimm as unarmed if needed during probe
        powerpc/kvm: Fix build error when PPC_MEM_KEYS/PPC_PSERIES=n
        powerpc/kasan: Fix shadow start address with modules
        powerpc/kernel/iommu: Use largepool as a last resort when !largealloc
        powerpc/kernel/iommu: Align size for IOMMU_PAGE_SIZE() to save TCEs
        powerpc/44x: fix spelling mistake in Kconfig "varients" -> "variants"
        powerpc/iommu: Annotate nested lock for lockdep
        powerpc/iommu: Do not immediately panic when failed IOMMU table allocation
        powerpc/iommu: Allocate it_map by vmalloc
        selftests/powerpc: remove unneeded semicolon
        powerpc/64s: remove unneeded semicolon
        powerpc/eeh: remove unneeded semicolon
        powerpc/selftests: Add selftest to test concurrent perf/ptrace events
        powerpc/selftests/perf-hwbreak: Add testcases for 2nd DAWR
        powerpc/selftests/perf-hwbreak: Coalesce event creation code
        powerpc/selftests/ptrace-hwbreak: Add testcases for 2nd DAWR
        powerpc/configs: Add IBMVNIC to some 64-bit configs
        selftests/powerpc: Add uaccess flush test
        ...
      c70a4be1
    • Linus Torvalds's avatar
      Merge tag 'xtensa-20210429' of git://github.com/jcmvbkbc/linux-xtensa · 437d1a5b
      Linus Torvalds authored
      Pull Xtensa updates from Max Filippov:
      
       - switch to generic syscall generation scripts
      
       - new GDBIO implementation for xtensa semihosting interface
      
       - various small code fixes and cleanups
      
       - a few typo fixes in comments and Kconfig help text
      
      * tag 'xtensa-20210429' of git://github.com/jcmvbkbc/linux-xtensa:
        xtensa: ISS: add GDBIO implementation to semihosting interface
        xtensa: ISS: split simcall implementation from semihosting interface
        xtensa: simcall.h: Change compitible to compatible
        xtensa: Couple of typo fixes
        xtensa: drop extraneous register load from initialize_mmu
        xtensa: fix pgprot_noncached assumptions
        xtensa: simplify coherent_kvaddr logic
        xtensa: syscalls: switch to generic syscallhdr.sh
        xtensa: syscalls: switch to generic syscalltbl.sh
        xtensa: stop filling syscall array with sys_ni_syscall
        xtensa: remove unneeded export in boot-elf/Makefile
        xtensa: move CONFIG_CPU_*_ENDIAN defines to Kconfig
        xtensa: fix warning comparing pointer to 0
        xtensa: fix spelling mistake in Kconfig "wont" -> "won't"
      437d1a5b
    • Mike Snitzer's avatar
      dm raid: remove unnecessary discard limits for raid0 and raid10 · ca4a4e9a
      Mike Snitzer authored
      Commit 29efc390 ("md/md0: optimize raid0 discard handling") and
      commit d30588b2 ("md/raid10: improve raid10 discard request")
      remove MD raid0's and raid10's inability to properly handle large
      discards. So eliminate associated constraints from dm-raid's support.
      
      Depends-on: 29efc390 ("md/md0: optimize raid0 discard handling")
      Depends-on: d30588b2 ("md/raid10: improve raid10 discard request")
      Reported-by: Matthew Ruffell <matthew.ruffell@canonical.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      ca4a4e9a
    • Jane Chu's avatar
      mm/memory-failure: unnecessary amount of unmapping · 4d75136b
      Jane Chu authored
      It appears that unmap_mapping_range() actually takes a 'size' as its third
      argument rather than a location, so the current way of calling it causes
      an unnecessary amount of unmapping to occur.
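
      The effect of passing an end location where a length is expected can be
      sketched in userspace. This is only a model, not the kernel code:
      model_unmap(), struct byte_range and range_bytes() are hypothetical names
      introduced for illustration, and the only fact taken from the commit
      message is that the third argument is a size.

```c
#include <assert.h>

/* Hypothetical userspace model of an affected byte range; not a kernel type. */
struct byte_range {
	unsigned long long first; /* first byte affected */
	unsigned long long last;  /* last byte affected (inclusive) */
};

/*
 * Model of a call whose second parameter is a length, which is how the
 * commit message describes the third argument of unmap_mapping_range().
 * A zero length is deliberately not modelled here.
 */
struct byte_range model_unmap(unsigned long long start, unsigned long long len)
{
	struct byte_range r;

	assert(len > 0);
	r.first = start;
	r.last = start + len - 1;
	return r;
}

/* Number of bytes covered by a range. */
unsigned long long range_bytes(struct byte_range r)
{
	return r.last - r.first + 1;
}
```

      With start = 0x200000 and size = 0x1000, model_unmap(start, size) covers
      0x1000 bytes, while the mistaken model_unmap(start, start + size) covers
      0x201000 bytes, which is the over-unmapping the commit message describes.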
      
      Link: https://lkml.kernel.org/r/20210420002821.2749748-1-jane.chu@oracle.com
      Fixes: 6100e34b ("mm, memory_failure: Teach memory_failure() about dev_pagemap pages")
      Signed-off-by: Jane Chu <jane.chu@oracle.com>
      Reviewed-by: Dan Williams <dan.j.williams@intel.com>
      Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
      Cc: Dave Jiang <dave.jiang@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      4d75136b
    • Mike Rapoport's avatar
      mm/mmzone.h: fix existing kernel-doc comments and link them to core-api · 198fba41
      Mike Rapoport authored
      There are a couple of kernel-doc comments in include/linux/mmzone.h but
      they have minor formatting issues that would cause kernel-doc warnings.
      
      Fix the formatting of those comments, add missing Return: descriptions and
      link include/linux/mmzone.h to Documentation/core-api/mm-api.rst
      
      Link: https://lkml.kernel.org/r/20210426141927.1314326-2-rppt@kernel.org
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      198fba41
    • Sergei Trofimovich's avatar
      mm: page_alloc: ignore init_on_free=1 for debug_pagealloc=1 · 9df65f52
      Sergei Trofimovich authored
      On !ARCH_SUPPORTS_DEBUG_PAGEALLOC (like ia64) debug_pagealloc=1 implies
      page_poison=on:
      
          if (page_poisoning_enabled() ||
               (!IS_ENABLED(CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC) &&
                debug_pagealloc_enabled()))
                  static_branch_enable(&_page_poisoning_enabled);
      
      page_poison=on needs to override init_on_free=1.
      
      Before the change it did not work as expected for the following case:
      - have PAGE_POISONING=y
      - have page_poison unset
      - have !ARCH_SUPPORTS_DEBUG_PAGEALLOC arch (like ia64)
      - have init_on_free=1
      - have debug_pagealloc=1
      
      That way we get both keys enabled:
      - static_branch_enable(&init_on_free);
      - static_branch_enable(&_page_poisoning_enabled);
      
      which leads to poisoned pages being returned for __GFP_ZERO allocations.
      
      After the change we execute only:
      - static_branch_enable(&_page_poisoning_enabled);
        and ignore init_on_free=1.
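
      The case above can be modelled in userspace as a small decision function.
      This is a sketch under stated assumptions, not the kernel implementation:
      struct boot_params, struct enabled_keys and the resolve_*() helpers are
      hypothetical names, CONFIG_PAGE_POISONING=y is assumed, and the "before"
      variant encodes only what the message implies (an explicit page_poison=on
      was the sole thing that suppressed init_on_free).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the settings listed in the commit message. */
struct boot_params {
	bool page_poison;                   /* page_poison=on given explicitly */
	bool debug_pagealloc;               /* debug_pagealloc=1 */
	bool init_on_free;                  /* init_on_free=1 */
	bool arch_supports_debug_pagealloc; /* false on ia64-like arches */
};

/* Which of the two static keys end up enabled. */
struct enabled_keys {
	bool page_poisoning;
	bool init_on_free;
};

/*
 * Poisoning is requested directly, or implied by debug_pagealloc on an
 * architecture without ARCH_SUPPORTS_DEBUG_PAGEALLOC.
 */
bool poisoning_requested(const struct boot_params *p)
{
	assert(p != NULL);
	return p->page_poison ||
	       (!p->arch_supports_debug_pagealloc && p->debug_pagealloc);
}

/* Assumed old behaviour: only explicit page_poison=on overrode init_on_free. */
struct enabled_keys resolve_before(const struct boot_params *p)
{
	struct enabled_keys k;

	k.page_poisoning = poisoning_requested(p);
	k.init_on_free = p->init_on_free && !p->page_poison;
	return k;
}

/* Behaviour described after the change: any poisoning request wins. */
struct enabled_keys resolve_after(const struct boot_params *p)
{
	struct enabled_keys k;

	k.page_poisoning = poisoning_requested(p);
	k.init_on_free = p->init_on_free && !k.page_poisoning;
	return k;
}
```

      For the listed configuration (page_poison unset, debug_pagealloc=1,
      init_on_free=1, no ARCH_SUPPORTS_DEBUG_PAGEALLOC), resolve_before()
      reports both keys enabled and resolve_after() reports only page poisoning.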
      
      Link: https://lkml.kernel.org/r/20210329222555.3077928-1-slyfox@gentoo.org
      Link: https://lkml.org/lkml/2021/3/26/443
      Fixes: 8db26a3d ("mm, page_poison: use static key more efficiently")
      Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Cc: Andrey Konovalov <andreyknvl@gmail.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      9df65f52
    • Jesper Dangaard Brouer's avatar
      net: page_pool: use alloc_pages_bulk in refill code path · be5dba25
      Jesper Dangaard Brouer authored
      There are cases where the page_pool needs to refill with pages from the
      page allocator.  Some workloads cause the page_pool to release pages
      instead of recycling them.
      
      For these workloads it can improve performance to bulk alloc pages from
      the page allocator to refill the alloc cache.
      
      One example is an XDP-redirect workload with the 100G mlx5 driver (which
      uses page_pool) redirecting xdp_frame packets into a veth that does
      XDP_PASS to create an SKB from the xdp_frame, which then cannot return the
      page to the page_pool.
      
      Performance results under GitHub xdp-project[1]:
       [1] https://github.com/xdp-project/xdp-project/blob/master/areas/mem/page_pool06_alloc_pages_bulk.org
      
      Mel: The patch "net: page_pool: convert to use alloc_pages_bulk_array
      variant" was squashed with this patch. From the test page, the array
      variant was superior with one of the test results as follows.
      
      	Kernel		XDP stats       CPU     pps           Delta
      	Baseline	XDP-RX CPU      total   3,771,046       n/a
      	List		XDP-RX CPU      total   3,940,242    +4.49%
      	Array		XDP-RX CPU      total   4,249,224   +12.68%
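
      The Delta column can be reproduced from the pps figures as the relative
      change against the baseline. The helper below is a hypothetical sketch for
      checking the arithmetic; delta_percent() is not part of the patch.

```c
#include <assert.h>

/* Relative change of `pps` against `baseline`, expressed in percent. */
double delta_percent(double baseline, double pps)
{
	assert(baseline > 0.0);
	return (pps - baseline) / baseline * 100.0;
}
```

      Calling delta_percent(3771046, 3940242) gives roughly 4.49 and
      delta_percent(3771046, 4249224) gives roughly 12.68, matching the List
      and Array rows.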
      
      Link: https://lkml.kernel.org/r/20210325114228.27719-10-mgorman@techsingularity.net
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
      Reviewed-by: Alexander Lobakin <alobakin@pm.me>
      Cc: Alexander Duyck <alexander.duyck@gmail.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Chuck Lever <chuck.lever@oracle.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: Ilias Apalodimas <ilias.apalodimas@linaro.org>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      be5dba25
    • Jesper Dangaard Brouer's avatar
      net: page_pool: refactor dma_map into own function page_pool_dma_map · dfa59717
      Jesper Dangaard Brouer authored
      In preparation for the next patch, move the DMA mapping into its own
      function, as this will make the changes easier to follow.
      
      [ilias.apalodimas: make page_pool_dma_map return boolean]
      
      Link: https://lkml.kernel.org/r/20210325114228.27719-9-mgorman@techsingularity.net
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
      Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
      Reviewed-by: Alexander Lobakin <alobakin@pm.me>
      Cc: Alexander Duyck <alexander.duyck@gmail.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Chuck Lever <chuck.lever@oracle.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      dfa59717
    • Chuck Lever's avatar
      SUNRPC: refresh rq_pages using a bulk page allocator · f6e70aab
      Chuck Lever authored
      Reduce the rate at which nfsd threads hammer on the page allocator.  This
      improves throughput scalability by enabling the threads to run more
      independently of each other.
      
      [mgorman: Update interpretation of alloc_pages_bulk return value]
      
      Link: https://lkml.kernel.org/r/20210325114228.27719-8-mgorman@techsingularity.net
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
      Reviewed-by: Alexander Lobakin <alobakin@pm.me>
      Cc: Alexander Duyck <alexander.duyck@gmail.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: David Miller <davem@davemloft.net>
      Cc: Ilias Apalodimas <ilias.apalodimas@linaro.org>
      Cc: Jesper Dangaard Brouer <brouer@redhat.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f6e70aab
    • Chuck Lever's avatar
      SUNRPC: set rq_page_end differently · ab836264
      Chuck Lever authored
      Patch series "SUNRPC consumer for the bulk page allocator"
      
      This patch set and the measurements below are based on yesterday's
      bulk allocator series:
      
        git://git.kernel.org/pub/scm/linux/kernel/git/mel/linux.git mm-bulk-rebase-v5r9
      
      The patches change SUNRPC to invoke the array-based bulk allocator
      instead of alloc_page().
      
      The micro-benchmark results are promising.  I ran a mixture of 256KB
      reads and writes over NFSv3.  The server's kernel is built with KASAN
      enabled, so the comparison is exaggerated but I believe it is still
      valid.
      
      I instrumented svc_recv() to measure the latency of each call to
      svc_alloc_arg() and report it via a trace point.  The following results
      are averages across the trace events.
      
        Single page: 25.007 us per call over 532,571 calls
        Bulk list:    6.258 us per call over 517,034 calls
        Bulk array:   4.590 us per call over 517,442 calls
      
      This patch (of 2)
      
      Refactor:
      
      I'm about to use the loop variable @i for something else.
      
      As far as the "i++" is concerned, that is a post-increment. The
      value of @i is not used subsequently, so the increment operator
      is unnecessary and can be removed.
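
      To illustrate why dropping the post-increment is safe, here is a minimal
      userspace sketch. The struct and function names (mock_rqst,
      set_end_postinc, set_end_plain) are hypothetical stand-ins, not the
      actual SUNRPC code:

```c
#include <stddef.h>

/* Hypothetical stand-in for the request structure; fields are illustrative. */
struct mock_rqst {
	void *pages[8];
	void **page_end;
};

/* Before: the slot is cleared through i++, but i is never read again. */
static void set_end_postinc(struct mock_rqst *r, size_t i)
{
	r->page_end = &r->pages[i];
	r->pages[i++] = NULL;	/* the increment has no observable effect */
}

/* After: the same two stores without the dead increment. */
static void set_end_plain(struct mock_rqst *r, size_t i)
{
	r->page_end = &r->pages[i];
	r->pages[i] = NULL;
}
```

      Both functions leave the structure in the same state, because a
      post-increment yields the old value of i and the new value is discarded.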
      
      Also note that nfsd_read_actor() was renamed nfsd_splice_actor()
      by commit cf8208d0 ("sendfile: convert nfsd to
      splice_direct_to_actor()").
      
      Link: https://lkml.kernel.org/r/20210325114228.27719-7-mgorman@techsingularity.net
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
      Reviewed-by: Alexander Lobakin <alobakin@pm.me>
      Cc: Alexander Duyck <alexander.duyck@gmail.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: David Miller <davem@davemloft.net>
      Cc: Ilias Apalodimas <ilias.apalodimas@linaro.org>
      Cc: Jesper Dangaard Brouer <brouer@redhat.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      ab836264
    • Jesper Dangaard Brouer's avatar
      mm/page_alloc: inline __rmqueue_pcplist · 3b822017
      Jesper Dangaard Brouer authored
      When __alloc_pages_bulk() was introduced, __rmqueue_pcplist gained a
      second caller, and the compiler now chooses not to inline this function.
      
        ./scripts/bloat-o-meter vmlinux-before vmlinux-inline__rmqueue_pcplist
        add/remove: 0/1 grow/shrink: 2/0 up/down: 164/-125 (39)
        Function                                     old     new   delta
        rmqueue                                     2197    2296     +99
        __alloc_pages_bulk                          1921    1986     +65
        __rmqueue_pcplist                            125       -    -125
        Total: Before=19374127, After=19374166, chg +0.00%
      
      modprobe page_bench04_bulk loops=$((10**7))
      
      Type:time_bulk_page_alloc_free_array
       -  Per elem: 106 cycles(tsc) 29.595 ns (step:64)
       - (measurement period time:0.295955434 sec time_interval:295955434)
       - (invoke count:10000000 tsc_interval:1065447105)
      
      Before:
       - Per elem: 110 cycles(tsc) 30.633 ns (step:64)
      
      Link: https://lkml.kernel.org/r/20210325114228.27719-6-mgorman@techsingularity.net
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
      Reviewed-by: Alexander Lobakin <alobakin@pm.me>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Alexander Duyck <alexander.duyck@gmail.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Chuck Lever <chuck.lever@oracle.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: Ilias Apalodimas <ilias.apalodimas@linaro.org>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      3b822017
    • Jesper Dangaard Brouer's avatar
      mm/page_alloc: optimize code layout for __alloc_pages_bulk · ce76f9a1
      Jesper Dangaard Brouer authored
      Looking at perf-report and the ASM code for __alloc_pages_bulk(), it is
      clear that the generated code layout is suboptimal.  The compiler guesses
      wrong and places unlikely code at the beginning.  Because of the
      WARN_ON_ONCE() macro, a UD2 asm instruction is added to the code, which
      confuses the I-cache prefetcher in the CPU.
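
      The commit text does not spell out the exact change. One common way to
      steer code layout is a branch hint that marks the rare path as cold. The
      sketch below assumes GCC or Clang (__builtin_expect) and uses a
      hypothetical function, sum_checked(), purely for illustration:

```c
#include <stddef.h>

/* Same idea as the kernel's unlikely(); defined locally for this sketch. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/*
 * Hypothetical example: argument checks sit on a hinted cold path so the
 * compiler can keep the hot loop on the straight-line path.
 */
static long sum_checked(const int *vals, size_t n)
{
	long total = 0;

	if (unlikely(!vals || n == 0))
		goto out;	/* rare case, hinted as not taken */

	for (size_t i = 0; i < n; i++)
		total += vals[i];
out:
	return total;
}
```

      The hint changes only where the compiler may place the cold branch; the
      function's results are the same with or without it.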
      
      [mgorman@techsingularity.net: minor changes and rebasing]
      
      Link: https://lkml.kernel.org/r/20210325114228.27719-5-mgorman@techsingularity.net
      Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
      Reviewed-by: Alexander Lobakin <alobakin@pm.me>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Alexander Duyck <alexander.duyck@gmail.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Chuck Lever <chuck.lever@oracle.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: Ilias Apalodimas <ilias.apalodimas@linaro.org>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      ce76f9a1
    • Mel Gorman's avatar
      mm/page_alloc: add an array-based interface to the bulk page allocator · 0f87d9d3
      Mel Gorman authored
      The proposed callers for the bulk allocator store pages from the bulk
      allocator in an array.  This patch adds an array-based interface to the
      API to avoid multiple list iterations.  The page list interface is
      preserved to avoid requiring all users of the bulk API to allocate and
      manage enough storage to store the pages.
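
      As a rough illustration of an array-based interface, here is a userspace
      mock. The name mock_alloc_bulk_array() and its semantics (fill only the
      NULL slots, return the number of populated slots) are assumptions made
      for this sketch, and malloc() stands in for a page allocation:

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical userspace mock of an array-based bulk interface.
 * Fills only the NULL entries of `slots` and returns how many entries
 * are populated afterwards. Not the kernel implementation.
 */
static size_t mock_alloc_bulk_array(size_t nr, void **slots)
{
	size_t populated = 0;

	for (size_t i = 0; i < nr; i++) {
		if (!slots[i])
			slots[i] = malloc(64);	/* stand-in for one page */
		if (slots[i])
			populated++;
	}
	return populated;
}
```

      With an array, the caller indexes the results directly, whereas a list
      has to be walked again to move each page into the caller's own storage.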
      
      [akpm@linux-foundation.org: remove now unused local `allocated']
      
      Link: https://lkml.kernel.org/r/20210325114228.27719-4-mgorman@techsingularity.net
      Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
      Reviewed-by: Alexander Lobakin <alobakin@pm.me>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Alexander Duyck <alexander.duyck@gmail.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Chuck Lever <chuck.lever@oracle.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: Ilias Apalodimas <ilias.apalodimas@linaro.org>
      Cc: Jesper Dangaard Brouer <brouer@redhat.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      0f87d9d3
    • Mel Gorman's avatar
      mm/page_alloc: add a bulk page allocator · 387ba26f
      Mel Gorman authored
      This patch adds a new page allocator interface via alloc_pages_bulk and
      __alloc_pages_bulk_nodemask.  A caller requests a number of pages to be
      allocated and added to a list.
      
      The API is not guaranteed to return the requested number of pages and
      may fail if the preferred allocation zone has limited free memory, the
      cpuset changes during the allocation or page debugging decides to fail
      an allocation.  It's up to the caller to request more pages in batch if
      necessary.
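
      The caller-side consequence of a partial return can be sketched in
      userspace as follows. Everything here is hypothetical: mock_page,
      mock_alloc_bulk(), fill_list() and the `budget` parameter (which imitates
      a call that returns fewer pages than requested) exist only for this
      illustration:

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a page; not the kernel's struct page. */
struct mock_page {
	struct mock_page *next;
};

/*
 * Hypothetical mock of a list-based bulk allocator: pushes up to `nr`
 * nodes onto *list but stops after `budget` nodes to imitate a partial
 * allocation. Returns how many nodes this call added.
 */
static size_t mock_alloc_bulk(size_t nr, size_t budget, struct mock_page **list)
{
	size_t added = 0;

	while (added < nr && added < budget) {
		struct mock_page *p = malloc(sizeof(*p));

		if (!p)
			break;
		p->next = *list;
		*list = p;
		added++;
	}
	return added;
}

/*
 * Caller-side loop: keep requesting the remainder until the request is
 * satisfied or a call makes no progress. Returns the total obtained.
 */
static size_t fill_list(size_t want, size_t budget, struct mock_page **list)
{
	size_t have = 0;

	while (have < want) {
		size_t got = mock_alloc_bulk(want - have, budget, list);

		if (got == 0)
			break;	/* no progress: leave the decision to the caller */
		have += got;
	}
	return have;
}

/* Release every node on the list. */
static void free_list(struct mock_page *list)
{
	while (list) {
		struct mock_page *next = list->next;

		free(list);
		list = next;
	}
}
```

      The loop stops when a call adds nothing, so a caller that cannot make
      progress does not spin; what to do next (sleep, fall back, fail) is left
      to the caller in this sketch.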
      
      Note that this implementation is not very efficient and could be
      improved but it would require refactoring.  The intent is to make it
      available early to determine what semantics are required by different
      callers.  Once the full semantics are nailed down, it can be refactored.
      
      [mgorman@techsingularity.net: fix alloc_pages_bulk() return type, per Matthew]
        Link: https://lkml.kernel.org/r/20210325123713.GQ3697@techsingularity.net
      [mgorman@techsingularity.net: fix uninit var warning]
        Link: https://lkml.kernel.org/r/20210330114847.GX3697@techsingularity.net
      [mgorman@techsingularity.net: fix comment, per Vlastimil]
        Link: https://lkml.kernel.org/r/20210412110255.GV3697@techsingularity.net
      
      Link: https://lkml.kernel.org/r/20210325114228.27719-3-mgorman@techsingularity.net
      Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Reviewed-by: Alexander Lobakin <alobakin@pm.me>
      Tested-by: Colin Ian King <colin.king@canonical.com>
      Cc: Alexander Duyck <alexander.duyck@gmail.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Chuck Lever <chuck.lever@oracle.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: Ilias Apalodimas <ilias.apalodimas@linaro.org>
      Cc: Jesper Dangaard Brouer <brouer@redhat.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      387ba26f
    • Mel Gorman's avatar
      mm/page_alloc: rename alloced to allocated · cb66bede
      Mel Gorman authored
      Patch series "Introduce a bulk order-0 page allocator with two in-tree users", v6.
      
      This series introduces a bulk order-0 page allocator with sunrpc and the
      network page pool being the first users.  The implementation is not
      efficient as semantics needed to be ironed out first.  If no other
      semantic changes are needed, it can be made more efficient.  Despite that,
      this is a performance-related improvement for users that require multiple
      pages for an operation without multiple round-trips to the page allocator.
      Quoting the last patch for the high-speed networking use-case:
      
                  Kernel          XDP stats       CPU     pps           Delta
                  Baseline        XDP-RX CPU      total   3,771,046       n/a
                  List            XDP-RX CPU      total   3,940,242    +4.49%
                  Array           XDP-RX CPU      total   4,249,224   +12.68%
      
      Via the SUNRPC traces of svc_alloc_arg()
      
      	Single page: 25.007 us per call over 532,571 calls
      	Bulk list:    6.258 us per call over 517,034 calls
      	Bulk array:   4.590 us per call over 517,442 calls
      
      Both potential users in this series are corner cases (NFS and high-speed
      networks) so it is unlikely that most users will see any benefit in the
      short term.  Other potential users are batch allocations for page
      cache readahead, fault around and SLUB allocations when high-order pages
      are unavailable.  It's unknown how much benefit would be seen by
      converting multiple page allocation calls to a single batch or what
      difference it may make to headline performance.
      
      Light testing of my own running dbench over NFS passed.  Chuck and Jesper
      conducted their own tests and details are included in the changelogs.
      
      Patch 1 renames a variable name that is particularly unpopular
      
      Patch 2 adds a bulk page allocator
      
      Patch 3 adds an array-based version of the bulk allocator
      
      Patches 4-5 add micro-optimisations to the implementation
      
      Patches 6-7 SUNRPC user
      
      Patches 8-9 Network page_pool user
      
      This patch (of 9):
      
      Review feedback of the bulk allocator twice found problems with "alloced"
      being a counter for pages allocated.  The naming was based on the API name
      "alloc" and was based on the idea that verbal communication about malloc
      tends to use the fake word "malloced" instead of the fake word "mallocated".
      To be consistent, this preparation patch renames alloced to allocated in
      rmqueue_bulk so the bulk allocator and per-cpu allocator use similar names
      when the bulk allocator is introduced.
      
      Link: https://lkml.kernel.org/r/20210325114228.27719-1-mgorman@techsingularity.net
      Link: https://lkml.kernel.org/r/20210325114228.27719-2-mgorman@techsingularity.net
      Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
      Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      Reviewed-by: Alexander Lobakin <alobakin@pm.me>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Chuck Lever <chuck.lever@oracle.com>
      Cc: Jesper Dangaard Brouer <brouer@redhat.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Alexander Duyck <alexander.duyck@gmail.com>
      Cc: Ilias Apalodimas <ilias.apalodimas@linaro.org>
      Cc: David Miller <davem@davemloft.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      cb66bede
    • zhouchuangao's avatar
    • Kefeng Wang's avatar
      mm, page_alloc: avoid page_to_pfn() in move_freepages() · 39ddb991
      Kefeng Wang authored
      The start_pfn and end_pfn are already available in move_freepages_block(),
      so there is no need to go back and forth between page and pfn in
      move_freepages() and move_freepages_block().  Also, pfn_valid_within()
      should validate the pfn before the page is touched.
      
      Link: https://lkml.kernel.org/r/20210323131215.934472-1-liushixin2@huawei.com
      Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
      Signed-off-by: Liu Shixin <liushixin2@huawei.com>
      Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      39ddb991
    • Geert Uytterhoeven's avatar
      mm/Kconfig: remove default DISCONTIGMEM_MANUAL · d68d015a
      Geert Uytterhoeven authored
      Commit 214496cb ("ia64: make SPARSEMEM default and disable
      DISCONTIGMEM") removed the last enabler of ARCH_DISCONTIGMEM_DEFAULT,
      hence the memory model can no longer default to DISCONTIGMEM_MANUAL.
      
      Link: https://lkml.kernel.org/r/20210312141208.3465520-1-geert@linux-m68k.org
      Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      d68d015a