1. 18 Oct, 2023 4 commits
    • hugetlbfs: close race between MADV_DONTNEED and page fault · 2820b0f0
      Rik van Riel authored
      Malloc libraries, like jemalloc and tcmalloc, take decisions on when to
      call madvise independently from the code in the main application.
      
      This sometimes results in the application page faulting on an address,
      right after the malloc library has shot down the backing memory with
      MADV_DONTNEED.
      
      Usually this is harmless, because we always have some 4kB pages sitting
      around to satisfy a page fault.  However, with hugetlbfs, systems often
      allocate only the exact number of huge pages that the application wants.
      
      Due to TLB batching, hugetlbfs MADV_DONTNEED will free pages outside of
      any lock taken on the page fault path, which can open up the following
      race condition:
      
             CPU 1                            CPU 2
      
             MADV_DONTNEED
             unmap page
             shoot down TLB entry
                                             page fault
                                             fail to allocate a huge page
                                             killed with SIGBUS
             free page
      
      Fix that race by pulling the locking from __unmap_hugepage_final_range
      into helper functions called from zap_page_range_single.  This ensures
      page faults stay locked out of the MADV_DONTNEED VMA until the huge pages
      have actually been freed.
      
      Link: https://lkml.kernel.org/r/20231006040020.3677377-4-riel@surriel.com
      Fixes: 04ada095 ("hugetlb: don't delete vma_lock in hugetlb MADV_DONTNEED processing")
      Signed-off-by: Rik van Riel <riel@surriel.com>
      Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Muchun Song <muchun.song@linux.dev>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
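The interleaving in the diagram above can be replayed in a small single-threaded userspace model. This is only a sketch under stated assumptions: `struct model`, `page_fault()`, `dontneed_unmap()`, `dontneed_free()` and `replay()` are hypothetical names, not kernel functions, and a boolean flag stands in for the hugetlb VMA lock. It shows why keeping faults locked out until the page has been freed avoids the SIGBUS when the pool holds exactly as many huge pages as the application uses.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model {
    int  free_huge_pages;  /* pages left in the preallocated pool */
    bool mapped;           /* is the huge page mapped in the VMA? */
    bool vma_locked;       /* stands in for the VMA lock held for write */
};

enum fault_result { FAULT_OK, FAULT_SIGBUS, FAULT_BLOCKED };

/* Page fault path: needs the VMA lock, then a free huge page. */
static enum fault_result page_fault(struct model *m)
{
    if (m->vma_locked)
        return FAULT_BLOCKED;   /* a real fault would sleep here */
    if (m->mapped)
        return FAULT_OK;
    if (m->free_huge_pages == 0)
        return FAULT_SIGBUS;    /* pool exhausted */
    m->free_huge_pages--;
    m->mapped = true;
    return FAULT_OK;
}

/* MADV_DONTNEED, first half: unmap the page and shoot down the TLB entry. */
static void dontneed_unmap(struct model *m, bool hold_lock)
{
    m->vma_locked = hold_lock;
    m->mapped = false;
}

/* MADV_DONTNEED, second half: the page returns to the pool, lock dropped. */
static void dontneed_free(struct model *m)
{
    m->free_huge_pages++;
    m->vma_locked = false;
}

/* Replays the order from the diagram: unmap, fault, free. */
static enum fault_result replay(bool hold_lock)
{
    struct model m = { .free_huge_pages = 0, .mapped = true, .vma_locked = false };
    enum fault_result r;

    dontneed_unmap(&m, hold_lock);
    r = page_fault(&m);         /* CPU 2 faults inside the window */
    dontneed_free(&m);
    if (r == FAULT_BLOCKED)
        r = page_fault(&m);     /* retried once the lock is dropped */
    return r;
}

/* Usage: without the lock the fault fails, with it the fault succeeds. */
void check_replay(void)
{
    assert(replay(false) == FAULT_SIGBUS);
    assert(replay(true) == FAULT_OK);
}
```

The model deliberately ignores TLB batching and real concurrency; it only fixes the order of events so that the effect of the lock is visible.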
    • hugetlbfs: extend hugetlb_vma_lock to private VMAs · bf491692
      Rik van Riel authored
      Extend the locking scheme used to protect shared hugetlb mappings from
      truncate vs page fault races, in order to protect private hugetlb mappings
      (with resv_map) against MADV_DONTNEED.
      
      Add a read-write semaphore to the resv_map data structure, and use that
      from the hugetlb_vma_(un)lock_* functions, in preparation for closing the
      race between MADV_DONTNEED and page faults.
      
      Link: https://lkml.kernel.org/r/20231006040020.3677377-3-riel@surriel.com
      Fixes: 04ada095 ("hugetlb: don't delete vma_lock in hugetlb MADV_DONTNEED processing")
      Signed-off-by: Rik van Riel <riel@surriel.com>
      Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Muchun Song <muchun.song@linux.dev>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • hugetlbfs: clear resv_map pointer if mmap fails · 92fe9dcb
      Rik van Riel authored
      Patch series "hugetlbfs: close race between MADV_DONTNEED and page fault", v7.
      
      Malloc libraries, like jemalloc and tcmalloc, take decisions on when to
      call madvise independently from the code in the main application.
      
      This sometimes results in the application page faulting on an address,
      right after the malloc library has shot down the backing memory with
      MADV_DONTNEED.
      
      Usually this is harmless, because we always have some 4kB pages sitting
      around to satisfy a page fault.  However, with hugetlbfs, systems often
      allocate only the exact number of huge pages that the application wants.
      
      Due to TLB batching, hugetlbfs MADV_DONTNEED will free pages outside of
      any lock taken on the page fault path, which can open up the following
      race condition:
      
             CPU 1                            CPU 2
      
             MADV_DONTNEED
             unmap page
             shoot down TLB entry
                                             page fault
                                             fail to allocate a huge page
                                             killed with SIGBUS
             free page
      
      Fix that race by extending the hugetlb_vma_lock locking scheme to also
      cover private hugetlb mappings (with resv_map), and pulling the locking
      from __unmap_hugepage_final_range into helper functions called from
      zap_page_range_single.  This ensures page faults stay locked out of the
      MADV_DONTNEED VMA until the huge pages have actually been freed.
      
      
      This patch (of 3):
      
      Hugetlbfs leaves a dangling pointer in the VMA if mmap fails.  This has
      not been a problem so far, but other code in this patch series tries to
      follow that pointer.
      
      Link: https://lkml.kernel.org/r/20231006040020.3677377-1-riel@surriel.com
      Link: https://lkml.kernel.org/r/20231006040020.3677377-2-riel@surriel.com
      Fixes: 04ada095 ("hugetlb: don't delete vma_lock in hugetlb MADV_DONTNEED processing")
      Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
      Signed-off-by: Rik van Riel <riel@surriel.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Muchun Song <muchun.song@linux.dev>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
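The dangling-pointer problem described in this patch can be illustrated with a minimal userspace sketch. Everything here is an assumption made for illustration: `struct vma_model`, `struct resv_map_model` and `setup_mapping()` are hypothetical stand-ins, not the kernel's data structures or its mmap path. The point is only the pattern: when setup fails after the reservation map was attached, the pointer is cleared so that later code that follows it sees NULL rather than freed memory.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel structures. */
struct resv_map_model { int regions; };
struct vma_model { struct resv_map_model *resv; };

/* Hypothetical mmap-time setup: returns 0 on success, -1 on failure. */
static int setup_mapping(struct vma_model *vma, int fail)
{
    vma->resv = malloc(sizeof(*vma->resv));
    if (!vma->resv)
        return -1;
    vma->resv->regions = 0;

    if (fail) {
        free(vma->resv);
        vma->resv = NULL;   /* clear the pointer instead of leaving it dangling */
        return -1;
    }
    return 0;
}

/* Later code that follows the pointer can now test it safely. */
static int resv_is_usable(const struct vma_model *vma)
{
    return vma->resv != NULL;
}

static void teardown_mapping(struct vma_model *vma)
{
    free(vma->resv);
    vma->resv = NULL;
}

/* Usage: a failed setup leaves no pointer behind, a successful one does. */
void check_setup(void)
{
    struct vma_model vma = { 0 };

    assert(setup_mapping(&vma, 1) == -1);
    assert(!resv_is_usable(&vma));
    assert(setup_mapping(&vma, 0) == 0);
    assert(resv_is_usable(&vma));
    teardown_mapping(&vma);
}
```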
    • mm: zswap: fix pool refcount bug around shrink_worker() · 969d63e1
      Johannes Weiner authored
      When a zswap store fails due to the limit, it acquires a pool reference
      and queues the shrinker.  When the shrinker runs, it drops the reference. 
      However, there can be multiple store attempts before the shrinker wakes up
      and runs once.  This results in reference leaks and eventual saturation
      warnings for the pool refcount.
      
      Fix this by dropping the reference again when the shrinker is already
      queued.  This ensures one reference per shrinker run.
      
      Link: https://lkml.kernel.org/r/20231006160024.170748-1-hannes@cmpxchg.org
      Fixes: 45190f01 ("mm/zswap.c: add allocation hysteresis if pool limit is hit")
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Reported-by: Chris Mason <clm@fb.com>
      Acked-by: Nhat Pham <nphamcs@gmail.com>
      Cc: Vitaly Wool <vitaly.wool@konsulko.com>
      Cc: Domenico Cerasuolo <cerasuolodomenico@gmail.com>
      Cc: <stable@vger.kernel.org>	[5.6+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
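The refcount leak described above can be reproduced in a small userspace model. This is a sketch under stated assumptions, not the zswap code: `struct pool_model`, `queue_shrink()`, `store_hits_limit()` and `shrink_worker_run()` are hypothetical names. The one kernel behaviour it relies on is that `queue_work()` returns false when the work item is already pending; `queue_shrink()` imitates that.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a pool with a reference count and one work item. */
struct pool_model {
    int  refcount;
    bool shrink_queued;
};

/* Imitates queue_work(): returns false if the work is already pending. */
static bool queue_shrink(struct pool_model *p)
{
    if (p->shrink_queued)
        return false;
    p->shrink_queued = true;
    return true;
}

/* A store that fails due to the limit: take a reference, queue the shrinker. */
static void store_hits_limit(struct pool_model *p, bool apply_fix)
{
    p->refcount++;
    if (!queue_shrink(p) && apply_fix)
        p->refcount--;      /* already queued: drop the extra reference */
}

/* The shrinker runs once and drops exactly one reference. */
static void shrink_worker_run(struct pool_model *p)
{
    p->shrink_queued = false;
    p->refcount--;
}

/* Counts references left over after several failed stores and one run. */
static int leaked_refs(int failed_stores, bool apply_fix)
{
    struct pool_model p = { .refcount = 1, .shrink_queued = false };
    int i;

    for (i = 0; i < failed_stores; i++)
        store_hits_limit(&p, apply_fix);
    if (p.shrink_queued)
        shrink_worker_run(&p);
    return p.refcount - 1;  /* anything above the base reference leaked */
}

/* Usage: three failed stores before one shrinker run. */
void check_leak(void)
{
    assert(leaked_refs(3, false) == 2);
    assert(leaked_refs(3, true) == 0);
}
```

With the fix applied, the invariant is one held reference per pending shrinker run, regardless of how many stores fail before the worker wakes up.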
  2. 06 Oct, 2023 8 commits
  3. 01 Oct, 2023 15 commits
    • Linux 6.6-rc4 · 8a749fd1
      Linus Torvalds authored
    • Merge tag 'kbuild-fixes-v6.6-2' of... · e81a2dab
      Linus Torvalds authored
      Merge tag 'kbuild-fixes-v6.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
      
      Pull Kbuild fixes from Masahiro Yamada:
      
       - Fix the module compression with xz so the in-kernel decompressor
         works
      
       - Document a kconfig idiom to express an optional dependency between
         modules
      
       - Make modpost, when W=1 is given, detect broken drivers that reference
         .exit.* sections
      
       - Remove unused code
      
      * tag 'kbuild-fixes-v6.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
        kbuild: remove stale code for 'source' symlink in packaging scripts
        modpost: Don't let "driver"s reference .exit.*
        vmlinux.lds.h: remove unused CPU_KEEP and CPU_DISCARD macros
        modpost: add missing else to the "of" check
        Documentation: kbuild: explain handling optional dependencies
        kbuild: Use CRC32 and a 1MiB dictionary for XZ compressed modules
    • Merge tag 'mm-hotfixes-stable-2023-10-01-08-34' of... · d2c52315
      Linus Torvalds authored
      Merge tag 'mm-hotfixes-stable-2023-10-01-08-34' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
      
      Pull misc fixes from Andrew Morton:
       "Fourteen hotfixes, eleven of which are cc:stable. The remainder
        pertain to issues which were introduced after 6.5"
      
      * tag 'mm-hotfixes-stable-2023-10-01-08-34' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
        Crash: add lock to serialize crash hotplug handling
        selftests/mm: fix awk usage in charge_reserved_hugetlb.sh and hugetlb_reparenting_test.sh that may cause error
        mm: mempolicy: keep VMA walk if both MPOL_MF_STRICT and MPOL_MF_MOVE are specified
        mm/damon/vaddr-test: fix memory leak in damon_do_test_apply_three_regions()
        mm, memcg: reconsider kmem.limit_in_bytes deprecation
        mm: zswap: fix potential memory corruption on duplicate store
        arm64: hugetlb: fix set_huge_pte_at() to work with all swap entries
        mm: hugetlb: add huge page size param to set_huge_pte_at()
        maple_tree: add MAS_UNDERFLOW and MAS_OVERFLOW states
        maple_tree: add mas_is_active() to detect in-tree walks
        nilfs2: fix potential use after free in nilfs_gccache_submit_read_data()
        mm: abstract moving to the next PFN
        mm: report success more often from filemap_map_folio_range()
        fs: binfmt_elf_efpic: fix personality for ELF-FDPIC
    • Merge tag 'char-misc-6.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc · 8f633369
      Linus Torvalds authored
      Pull misc driver fix from Greg KH:
       "Here is a single, much requested, fix for a set of misc drivers to
        resolve a much reported regression in the -rc series that has also
        propagated back to the stable releases. Sorry for the delay, lots of
        conference travel for a few weeks put me very far behind in patch
        wrangling.
      
        It has been reported by many to resolve the reported problem, and has
        been in linux-next with no reported issues"
      
      * tag 'char-misc-6.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
        misc: rtsx: Fix some platforms can not boot and move the l1ss judgment to probe
    • Merge tag 'tty-6.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty · 3abd15e2
      Linus Torvalds authored
      Pull tty / serial driver fixes from Greg KH:
       "Here are two tty/serial driver fixes for 6.6-rc4 that resolve some
        reported regressions:
      
         - revert a n_gsm change that ended up causing problems
      
         - 8250_port fix for irq data
      
        both have been in linux-next for over a week with no reported
        problems"
      
      * tag 'tty-6.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
        Revert "tty: n_gsm: fix UAF in gsm_cleanup_mux"
        serial: 8250_port: Check IRQ data before use
    • Merge tag 'x86-urgent-2023-10-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · ec8c2981
      Linus Torvalds authored
      Pull x86 fixes from Ingo Molnar:
       "Misc fixes: a kerneldoc build warning fix, add SRSO mitigation for
        AMD-derived Hygon processors, and fix a SGX kernel crash in the page
        fault handler that can trigger when ksgxd races to reclaim the SECS
        special page, by making the SECS page unswappable"
      
      * tag 'x86-urgent-2023-10-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/sgx: Resolves SECS reclaim vs. page fault for EAUG race
        x86/srso: Add SRSO mitigation for Hygon processors
        x86/kgdb: Fix a kerneldoc warning when build with W=1
    • Merge tag 'timers-urgent-2023-10-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 373ceff2
      Linus Torvalds authored
      Pull timer fix from Ingo Molnar:
       "Fix a spurious kernel warning during CPU hotplug events that may
        trigger when timer/hrtimer softirqs are pending, which are otherwise
        hotplug-safe and don't merit a warning"
      
      * tag 'timers-urgent-2023-10-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        timers: Tag (hr)timer softirq as hotplug safe
    • Merge tag 'sched-urgent-2023-10-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · c5ecffe6
      Linus Torvalds authored
      Pull scheduler fix from Ingo Molnar:
       "Fix an RT-task-related lockup/live-lock during CPU offlining"
      
      * tag 'sched-urgent-2023-10-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched/rt: Fix live lock between select_fallback_rq() and RT push
    • Merge tag 'perf-urgent-2023-10-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 3a38c57a
      Linus Torvalds authored
      Pull perf event fixes from Ingo Molnar:
       "Misc fixes: work around an AMD microcode bug on certain models, and
        fix kexec kernel PMI handlers on AMD systems that get loaded on older
        kernels that have an unexpected register state"
      
      * tag 'perf-urgent-2023-10-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf/x86/amd: Do not WARN() on every IRQ
        perf/x86/amd/core: Fix overflow reset on hotplug
    • kbuild: remove stale code for 'source' symlink in packaging scripts · 2d7d1bc1
      Masahiro Yamada authored
      Since commit d8131c29 ("kbuild: remove $(MODLIB)/source symlink"),
      modules_install does not create the 'source' symlink.
      
      Remove the stale code from builddeb and kernel.spec.
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
    • modpost: Don't let "driver"s reference .exit.* · f177cd0c
      Uwe Kleine-König authored
      Drivers must not reference functions marked with __exit as these likely
      are not available when the code is built-in.
      
      There are a few creative offenders uncovered, for example in ARCH=amd64
      allmodconfig builds. So only trigger the section mismatch warning for
      W=1 builds.
      
      The dual rule that drivers must not reference .init.* is implemented
      since commit 0db25245 ("modpost: don't allow *driver to reference
      .init.*") which however missed that .exit.* should be handled in the
      same way.
      
      Thanks to Masahiro Yamada and Arnd Bergmann who gave valuable hints to
      find this improvement.
      Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
    • vmlinux.lds.h: remove unused CPU_KEEP and CPU_DISCARD macros · 15e86643
      Masahiro Yamada authored
      Remove the left-over of commit e24f6628 ("modpost: remove all
      traces of cpuinit/cpuexit sections").
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
      Acked-by: Paul Gortmaker <paul.gortmaker@windriver.com>
    • modpost: add missing else to the "of" check · cbc3d00c
      Mauricio Faria de Oliveira authored
      Without this 'else' statement, a "usb" name goes into two handlers:
      the first/previous 'if' statement _AND_ the for-loop over 'devtable',
      but the latter is useless as it has no 'usb' device_id entry anyway.
      
      Tested with allmodconfig before/after patch; no changes to *.mod.c:
      
          git checkout v6.6-rc3
          make -j$(nproc) allmodconfig
          make -j$(nproc) olddefconfig
      
          make -j$(nproc)
          find . -name '*.mod.c' | cpio -pd /tmp/before
      
          # apply patch
      
          make -j$(nproc)
          find . -name '*.mod.c' | cpio -pd /tmp/after
      
          diff -r /tmp/before/ /tmp/after/
          # no difference
      
      Fixes: acbef7b7 ("modpost: fix module autoloading for OF devices with generic compatible property")
      Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
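The control-flow issue behind the missing 'else' can be shown with a simplified sketch. This is not the modpost source: `dispatch_before()`, `dispatch_after()` and `name_is()` are hypothetical, and each branch only counts that it was entered instead of calling a real handler. The final `else` stands in for the for-loop over 'devtable'.

```c
#include <assert.h>
#include <string.h>

static int name_is(const char *name, const char *want)
{
    return strcmp(name, want) == 0;
}

/* Without the 'else': a "usb" name also falls into the generic loop. */
static int dispatch_before(const char *name)
{
    int entered = 0;

    if (name_is(name, "usb"))
        entered++;          /* dedicated usb handler */
    if (name_is(name, "of"))
        entered++;          /* dedicated of handler */
    else
        entered++;          /* generic loop over the device table */
    return entered;
}

/* With the 'else': exactly one branch is taken for every name. */
static int dispatch_after(const char *name)
{
    int entered = 0;

    if (name_is(name, "usb"))
        entered++;
    else if (name_is(name, "of"))
        entered++;
    else
        entered++;
    return entered;
}

/* Usage: only the "usb" case changes between the two versions. */
void check_dispatch(void)
{
    assert(dispatch_before("usb") == 2);
    assert(dispatch_after("usb") == 1);
    assert(dispatch_before("of") == dispatch_after("of"));
    assert(dispatch_before("pci") == dispatch_after("pci"));
}
```

Since the generic table has no 'usb' entry, the extra pass did no harm, which matches the commit's observation that the generated *.mod.c files are unchanged.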
    • Merge tag 'soc-fixes-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc · e402b086
      Linus Torvalds authored
      Pull ARM SoC fixes from Arnd Bergmann:
       "These are the latest bug fixes that have come up in the soc tree. Most
        of these are fairly minor. Most notably, the majority of changes this
        time are not for dts files as usual.
      
         - Updates to the addresses of the broadcom and aspeed entries in the
           MAINTAINERS file.
      
         - Defconfig updates to address a regression on samsung and a build
           warning from an unknown Kconfig symbol
      
         - Build fixes for the StrongARM and Uniphier platforms
      
         - Code fixes for SCMI and FF-A firmware drivers, both of which had a
           simple bug that resulted in invalid data, and a lesser fix for the
           optee firmware driver
      
         - Multiple fixes for the recently added loongson/loongarch "guts" soc
           driver
      
         - Devicetree fixes for RISC-V on the starfive platform, addressing
           issues with NOR flash, usb and uart.
      
         - Multiple fixes for NXP i.MX8/i.MX9 dts files, fixing problems with
           clock, gpio, hdmi settings and the Makefile
      
         - Bug fixes for i.MX firmware code and the OCOTP soc driver
      
         - Multiple fixes for the TI sysc bus driver
      
         - Minor dts updates for TI omap dts files, to address boot time
           warnings and errors"
      
      * tag 'soc-fixes-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (35 commits)
        MAINTAINERS: Fix Florian Fainelli's email address
        arm64: defconfig: enable syscon-poweroff driver
        ARM: locomo: fix locomolcd_power declaration
        soc: loongson: loongson2_guts: Remove unneeded semicolon
        soc: loongson: loongson2_guts: Convert to devm_platform_ioremap_resource()
        soc: loongson: loongson_pm2: Populate children syscon nodes
        dt-bindings: soc: loongson,ls2k-pmc: Allow syscon-reboot/syscon-poweroff as child
        soc: loongson: loongson_pm2: Drop useless of_device_id compatible
        dt-bindings: soc: loongson,ls2k-pmc: Use fallbacks for ls2k-pmc compatible
        soc: loongson: loongson_pm2: Add dependency for INPUT
        arm64: defconfig: remove CONFIG_COMMON_CLK_NPCM8XX=y
        ARM: uniphier: fix cache kernel-doc warnings
        MAINTAINERS: aspeed: Update Andrew's email address
        MAINTAINERS: aspeed: Update git tree URL
        firmware: arm_ffa: Don't set the memory region attributes for MEM_LEND
        arm64: dts: imx: Add imx8mm-prt8mm.dtb to build
        arm64: dts: imx8mm-evk: Fix hdmi@3d node
        soc: imx8m: Enable OCOTP clock for imx8mm before reading registers
        arm64: dts: imx8mp-beacon-kit: Fix audio_pll2 clock
        arm64: dts: imx8mp: Fix SDMA2/3 clocks
        ...
    • Merge tag 'trace-v6.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace · 3b347e40
      Linus Torvalds authored
      Pull tracing fixes from Steven Rostedt:
      
       - Make sure 32-bit applications using user events have aligned access
         when running on a 64-bit kernel.
      
       - Add cond_resched in the loop that handles converting enums in
         print_fmt string in trace events.
      
       - Fix premature wake ups of polling processes in the tracing ring
         buffer. When a task polls waiting for a percentage of the ring buffer
         to be filled, the writer still will wake it up at every event. Add
         the polling's percentage to the "shortest_full" list to tell the
         writer when to wake it up.
      
       - For eventfs dir lookups on dynamic events, an event system's only
         event could be removed, leaving its dentry with no children. This is
         totally legitimate. But in eventfs_release() it must not access the
         children array, as it is only allocated when the dentry has children.
      
      * tag 'trace-v6.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
        eventfs: Test for dentries array allocated in eventfs_release()
        tracing/user_events: Align set_bit() address for all archs
        tracing: relax trace_event_eval_update() execution with cond_resched()
        ring-buffer: Update "shortest_full" in polling
  4. 30 Sep, 2023 13 commits