1. 25 Jun, 2024 13 commits
    • mm/memory: don't require head page for do_set_pmd() · ab1ffc86
      Andrew Bresticker authored
      The requirement that the head page be passed to do_set_pmd() was added in
      commit ef37b2ea ("mm/memory: page_add_file_rmap() ->
      folio_add_file_rmap_[pte|pmd]()") and prevents pmd-mapping in the
      finish_fault() and filemap_map_pages() paths if the page to be inserted is
      anything but the head page for an otherwise suitable vma and pmd-sized
      page.
      
      Matthew said:
      
      : We're going to stop using PMDs to map large folios unless the fault is
      : within the first 4KiB of the PMD.  No idea how many workloads that
      : affects, but it only needs to be backported as far as v6.8, so we may
      : as well backport it.
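      The shape of the fix can be sketched with a toy eligibility check (all type names and constants below are simplified stand-ins, not the kernel's actual code):

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the do_set_pmd() eligibility check. HPAGE_PMD_NR and
 * folio_model are simplified stand-ins for the kernel's types. */
#define HPAGE_PMD_NR 512   /* 2 MiB PMD / 4 KiB base pages on x86-64 */

struct folio_model { int nr_pages; };

/* Before the fix: only the head page (index 0) was accepted. */
static bool can_pmd_map_old(const struct folio_model *folio, int page_idx)
{
    return folio->nr_pages == HPAGE_PMD_NR && page_idx == 0;
}

/* After the fix: any page of a PMD-sized folio qualifies, because
 * do_set_pmd() can derive the head page from the folio itself. */
static bool can_pmd_map_new(const struct folio_model *folio, int page_idx)
{
    return folio->nr_pages == HPAGE_PMD_NR &&
           page_idx >= 0 && page_idx < folio->nr_pages;
}
```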
      
      Link: https://lkml.kernel.org/r/20240611153216.2794513-1-abrestic@rivosinc.com
      Fixes: ef37b2ea ("mm/memory: page_add_file_rmap() -> folio_add_file_rmap_[pte|pmd]()")
      Signed-off-by: Andrew Bresticker <abrestic@rivosinc.com>
      Acked-by: David Hildenbrand <david@redhat.com>
      Acked-by: Hugh Dickins <hughd@google.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm/page_alloc: Separate THP PCP into movable and non-movable categories · bf14ed81
      yangge authored
      Commit 5d0a661d ("mm/page_alloc: use only one PCP list for THP-sized
      allocations") no longer differentiates the migration type of pages in
      the THP-sized PCP list, so it is possible for non-movable allocation
      requests to get a CMA page from the list; in some cases this is not
      acceptable.
      
      If a large amount of CMA memory is configured in the system (for
      example, CMA memory accounts for 50% of system memory), starting a
      virtual machine with device passthrough will get stuck.  While starting
      the virtual machine, it calls pin_user_pages_remote(..., FOLL_LONGTERM,
      ...) to pin memory.  Normally, if a page is present and in the CMA
      area, pin_user_pages_remote() will migrate the page from the CMA area
      to a non-CMA area because of the FOLL_LONGTERM flag.  But if a
      non-movable allocation request returns CMA memory,
      migrate_longterm_unpinnable_pages() will migrate one CMA page to
      another CMA page, which fails the check in
      check_and_migrate_movable_pages() and causes an endless migration loop.
      
      Call trace:
      pin_user_pages_remote
      --__gup_longterm_locked // endless loops in this function
      ----_get_user_pages_locked
      ----check_and_migrate_movable_pages
      ------migrate_longterm_unpinnable_pages
      --------alloc_migration_target
      
      This problem also has a negative impact on CMA itself.  For example,
      when CMA is borrowed by THP and we need to reclaim it through
      cma_alloc() or dma_alloc_coherent(), we must move those pages out to
      ensure CMA's users can retrieve that contiguous memory.  If CMA's
      memory is occupied by non-movable pages, we cannot relocate them, and
      as a result cma_alloc() is more likely to fail.
      
      To fix the problem above, add one more PCP list for THP, which does not
      introduce a new cacheline in struct per_cpu_pages.  THP then has two
      PCP lists: one used by MOVABLE allocations and the other by UNMOVABLE
      allocations.  MOVABLE allocation covers GFP_MOVABLE, and UNMOVABLE
      allocation covers GFP_UNMOVABLE and GFP_RECLAIMABLE.
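      The resulting list selection can be sketched with a toy index function; the constants and names below are illustrative stand-ins, not the kernel's actual order_to_pindex():

```c
#include <assert.h>

/* Toy model of per-CPU pagelist (PCP) index selection after the fix:
 * low orders keep one list per migratetype, while the THP order gets
 * exactly two lists keyed on "movable or not".  All constants here are
 * illustrative, not the kernel's. */
enum mt_model { MT_UNMOVABLE, MT_MOVABLE, MT_RECLAIMABLE, NR_MT };

#define PCP_MAX_ORDER_MODEL 3
#define NR_LOWORDER_LISTS   (NR_MT * (PCP_MAX_ORDER_MODEL + 1))
#define THP_ORDER_MODEL     9   /* 2 MiB with 4 KiB base pages */

static int order_to_pindex_model(enum mt_model mt, int order)
{
    if (order == THP_ORDER_MODEL)
        /* two THP lists: non-movable share one, movable gets its own */
        return NR_LOWORDER_LISTS + (mt == MT_MOVABLE);
    return order * NR_MT + mt;   /* one list per type for low orders */
}
```

      Movable and non-movable THP allocations thus land on different lists, so a non-movable request can no longer pick up a CMA page freed by a movable one.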
      
      Link: https://lkml.kernel.org/r/1718845190-4456-1-git-send-email-yangge1116@126.com
      Fixes: 5d0a661d ("mm/page_alloc: use only one PCP list for THP-sized allocations")
      Signed-off-by: yangge <yangge1116@126.com>
      Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
      Cc: Barry Song <21cnbao@gmail.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • nfs: drop the incorrect assertion in nfs_swap_rw() · 54e7d598
      Christoph Hellwig authored
      Since commit 2282679f ("mm: submit multipage write for SWP_FS_OPS
      swap-space"), we can plug multiple pages and then unplug them all
      together.  That means iov_iter_count(iter) can be much bigger than
      PAGE_SIZE; it actually spans iov_iter_npages(iter, INT_MAX) pages.
      
      Note this issue has nothing to do with large folios as we don't support
      THP_SWPOUT to non-block devices.
      
      [v-songbaohua@oppo.com: figure out the cause and correct the commit message]
      Link: https://lkml.kernel.org/r/20240618065647.21791-1-21cnbao@gmail.com
      Fixes: 2282679f ("mm: submit multipage write for SWP_FS_OPS swap-space")
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Barry Song <v-songbaohua@oppo.com>
      Closes: https://lore.kernel.org/linux-mm/20240617053201.GA16852@lst.de/
      Reviewed-by: Martin Wege <martin.l.wege@gmail.com>
      Cc: NeilBrown <neilb@suse.de>
      Cc: Anna Schumaker <anna@kernel.org>
      Cc: Steve French <sfrench@samba.org>
      Cc: Trond Myklebust <trondmy@kernel.org>
      Cc: Chuanhua Han <hanchuanhua@oppo.com>
      Cc: Ryan Roberts <ryan.roberts@arm.com>
      Cc: Chris Li <chrisl@kernel.org>
      Cc: "Huang, Ying" <ying.huang@intel.com>
      Cc: Jeff Layton <jlayton@kernel.org>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm/migrate: make migrate_pages_batch() stats consistent · c6408250
      Zi Yan authored
      As Ying pointed out in [1], stats->nr_thp_failed needs to be updated to
      avoid stats inconsistency between MIGRATE_SYNC and MIGRATE_ASYNC when
      calling migrate_pages_batch().
      
      If not, when migrate_pages_batch() is called via
      migrate_pages(MIGRATE_ASYNC), nr_thp_failed will not be increased,
      whereas when migrate_pages_batch() is called via
      migrate_pages(MIGRATE_SYNC*), nr_thp_failed will be increased in
      migrate_pages_sync() by stats->nr_thp_failed += astats.nr_thp_split.
      
      [1] https://lore.kernel.org/linux-mm/87msnq7key.fsf@yhuang6-desk2.ccr.corp.intel.com/
      
      Link: https://lkml.kernel.org/r/20240620012712.19804-1-zi.yan@sent.com
      Link: https://lkml.kernel.org/r/20240618134151.29214-1-zi.yan@sent.com
      Fixes: 7262f208 ("mm/migrate: split source folio if it is on deferred split list")
      Signed-off-by: Zi Yan <ziy@nvidia.com>
      Suggested-by: "Huang, Ying" <ying.huang@intel.com>
      Reviewed-by: "Huang, Ying" <ying.huang@intel.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Yang Shi <shy828301@gmail.com>
      Cc: Yin Fengwei <fengwei.yin@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • MAINTAINERS: TPM DEVICE DRIVER: update the W-tag · f3228a2d
      Jarkko Sakkinen authored
      Git hosting for the test suite has been migrated from Gitlab to Codeberg,
      given the "less hostile environment".
      
      Link: https://lkml.kernel.org/r/20240618133556.105604-1-jarkko@kernel.org
      Link: https://codeberg.org/jarkko/linux-tpmdd-test
      Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • selftests/mm:fix test_prctl_fork_exec return failure · 8b8546d2
      aigourensheng authored
      After calling fork() in test_prctl_fork_exec(), the global variable
      ksm_full_scans_fd is initialized to 0 in the child process upon entering
      the main function of ./ksm_functional_tests.
      
      In the call chain test_child_ksm() -> __mmap_and_merge_range() ->
      ksm_merge() -> ksm_get_full_scans(), start_scans = ksm_get_full_scans()
      will return an error.  Therefore, the value of ksm_full_scans_fd needs
      to be initialized before calling test_child_ksm() in the child process.
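      The shape of the fix can be modeled in miniature (all names are stand-ins; the real test re-opens a sysfs file rather than calling a stub):

```c
#include <assert.h>

/* Toy model of the selftest fix.  In the real test the child's copy of
 * the global fd is zero after fork()+exec, so it must be re-initialized
 * before the child test uses it.  open() is replaced by a stub here. */
static int ksm_full_scans_fd_model;   /* 0 models "not yet opened" */

static int open_full_scans_model(void)
{
    /* stand-in for open("/sys/kernel/mm/ksm/full_scans", O_RDONLY) */
    return 42;
}

static int test_child_ksm_model(void)
{
    /* the fix: (re)initialize the global before the child test runs */
    if (ksm_full_scans_fd_model <= 0)
        ksm_full_scans_fd_model = open_full_scans_model();
    return ksm_full_scans_fd_model > 0 ? 0 : -1;
}
```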
      
      Link: https://lkml.kernel.org/r/20240617052934.5834-1-shechenglong001@gmail.com
      Signed-off-by: aigourensheng <shechenglong001@gmail.com>
      Acked-by: David Hildenbrand <david@redhat.com>
      Cc: Shuah Khan <shuah@kernel.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm: convert page type macros to enum · ff202303
      Stephen Brennan authored
      Changing PG_slab from a page flag to a page type in commit 46df8e73
      ("mm: free up PG_slab") has the unintended consequence of removing the
      PG_slab constant from kernel debuginfo.  The commit does add the value
      to the vmcoreinfo note, which allows debuggers to find the value
      without hardcoding it.  However, it is most flexible to continue
      representing the constant with an enum, so convert the page type fields
      into an enum.  Debuggers will now be able to detect that PG_slab's type
      has changed from enum pageflags to enum pagetype.
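      The macro-to-enum shape of the change can be sketched as follows (the values are illustrative, not the kernel's actual page-type constants):

```c
#include <assert.h>

/* Before the change, page types were preprocessor macros:
 *     #define PG_slab 0x00000002
 * Macros leave no trace in debuginfo.  After the change they are enum
 * members, which the compiler emits into DWARF so a debugger can look
 * the constant up by name.  The values below are illustrative only. */
enum pagetype_model {
    PGTY_buddy_model = 0x00000001,
    PGTY_slab_model  = 0x00000002,
};

static int is_slab_model(unsigned int page_type)
{
    return (page_type & PGTY_slab_model) != 0;
}
```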
      
      Link: https://lkml.kernel.org/r/20240607202954.1198180-1-stephen.s.brennan@oracle.com
      Fixes: 46df8e73 ("mm: free up PG_slab")
      Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Hao Ge <gehao@kylinos.cn>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Omar Sandoval <osandov@osandov.com>
      Cc: Vishal Moola (Oracle) <vishal.moola@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • ocfs2: fix DIO failure due to insufficient transaction credits · be346c1a
      Jan Kara authored
      The code in ocfs2_dio_end_io_write() estimates number of necessary
      transaction credits using ocfs2_calc_extend_credits().  This however does
      not take into account that the IO could be arbitrarily large and can
      contain arbitrary number of extents.
      
      Extent tree manipulations often extend the current transaction, but not
      in all cases.  For example, if the tree contains only single-block
      extents, ocfs2_mark_extent_written() ends up calling
      ocfs2_replace_extent_rec() every time; the current transaction is then
      never extended and eventually exhausts all of its credits when the IO
      contains many single-block extents.  Once that happens, the
      WARN_ON(jbd2_handle_buffer_credits(handle) <= 0) in
      jbd2_journal_dirty_metadata() triggers and OCFS2 aborts in response to
      this error.  This was actually triggered by one of our customers on a
      heavily fragmented OCFS2 filesystem.
      
      To fix the issue make sure the transaction always has enough credits for
      one extent insert before each call of ocfs2_mark_extent_written().
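      The credit-top-up idea can be modeled in miniature (the handle type, credit cost, and helper below are simplified stand-ins, not the ocfs2/jbd2 APIs):

```c
#include <assert.h>

struct handle_model { int credits; };

#define EXTENT_INSERT_CREDITS 4   /* illustrative per-insert cost */

/* Extend the handle if fewer than `needed` credits remain; mirrors the
 * jbd2_journal_extend()-style logic in spirit, not in API. */
static void ensure_credits(struct handle_model *h, int needed)
{
    if (h->credits < needed)
        h->credits = needed;      /* "extend" the transaction */
}

/* Processing many single-block extents: without the top-up before each
 * step, the credits would eventually be exhausted, which is the
 * condition the removed WARN_ON detected. */
static int process_extents(struct handle_model *h, int nr_extents)
{
    for (int i = 0; i < nr_extents; i++) {
        ensure_credits(h, EXTENT_INSERT_CREDITS);
        h->credits -= 1;          /* each step consumes one credit */
        if (h->credits < 0)
            return -1;            /* credits exhausted */
    }
    return 0;
}
```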
      
      Heming Zhao said:
      
      ------
      PANIC: "Kernel panic - not syncing: OCFS2: (device dm-1): panic forced after error"
      
      PID: xxx  TASK: xxxx  CPU: 5  COMMAND: "SubmitThread-CA"
        #0 machine_kexec at ffffffff8c069932
        #1 __crash_kexec at ffffffff8c1338fa
        #2 panic at ffffffff8c1d69b9
        #3 ocfs2_handle_error at ffffffffc0c86c0c [ocfs2]
        #4 __ocfs2_abort at ffffffffc0c88387 [ocfs2]
        #5 ocfs2_journal_dirty at ffffffffc0c51e98 [ocfs2]
        #6 ocfs2_split_extent at ffffffffc0c27ea3 [ocfs2]
        #7 ocfs2_change_extent_flag at ffffffffc0c28053 [ocfs2]
        #8 ocfs2_mark_extent_written at ffffffffc0c28347 [ocfs2]
        #9 ocfs2_dio_end_io_write at ffffffffc0c2bef9 [ocfs2]
      #10 ocfs2_dio_end_io at ffffffffc0c2c0f5 [ocfs2]
      #11 dio_complete at ffffffff8c2b9fa7
      #12 do_blockdev_direct_IO at ffffffff8c2bc09f
      #13 ocfs2_direct_IO at ffffffffc0c2b653 [ocfs2]
      #14 generic_file_direct_write at ffffffff8c1dcf14
      #15 __generic_file_write_iter at ffffffff8c1dd07b
      #16 ocfs2_file_write_iter at ffffffffc0c49f1f [ocfs2]
      #17 aio_write at ffffffff8c2cc72e
      #18 kmem_cache_alloc at ffffffff8c248dde
      #19 do_io_submit at ffffffff8c2ccada
      #20 do_syscall_64 at ffffffff8c004984
      #21 entry_SYSCALL_64_after_hwframe at ffffffff8c8000ba
      
      Link: https://lkml.kernel.org/r/20240617095543.6971-1-jack@suse.cz
      Link: https://lkml.kernel.org/r/20240614145243.8837-1-jack@suse.cz
      Fixes: c15471f7 ("ocfs2: fix sparse file & data ordering issue in direct io")
      Signed-off-by: Jan Kara <jack@suse.cz>
      Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
      Reviewed-by: Heming Zhao <heming.zhao@suse.com>
      Cc: Mark Fasheh <mark@fasheh.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Changwei Ge <gechangwei@live.cn>
      Cc: Gang He <ghe@suse.com>
      Cc: Jun Piao <piaojun@huawei.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • kasan: fix bad call to unpoison_slab_object · 1c61990d
      Andrey Konovalov authored
      Commit 29d7355a ("kasan: save alloc stack traces for mempool") messed
      up one of the calls to unpoison_slab_object: the last two arguments are
      supposed to be GFP flags and whether to init the object memory.
      
      Fix the call.
      
      Without this fix, __kasan_mempool_unpoison_object provides the object's
      size as GFP flags to unpoison_slab_object, which can cause LOCKDEP reports
      (and probably other issues).
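      The argument-order bug and its fix can be modeled in miniature (all names and flag values below are stand-ins for the KASAN internals):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int gfp_model_t;
#define GFP_MODEL_KERNEL 0x400u

/* Records what flags the callee actually received, so the test can
 * verify the call site passes flags where flags belong. */
static gfp_model_t last_flags_seen;

/* Model of unpoison_slab_object: the last two arguments are GFP flags
 * and whether to init the object memory. */
static void unpoison_slab_object_model(void *obj, size_t size,
                                       gfp_model_t flags, bool init)
{
    (void)obj; (void)size; (void)init;
    last_flags_seen = flags;
}

/* The buggy call site passed the object's size where the GFP flags
 * belong; the fixed call passes real flags. */
static void mempool_unpoison_fixed(void *obj, size_t size)
{
    unpoison_slab_object_model(obj, size, GFP_MODEL_KERNEL, false);
}
```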
      
      Link: https://lkml.kernel.org/r/20240614143238.60323-1-andrey.konovalov@linux.dev
      Fixes: 29d7355a ("kasan: save alloc stack traces for mempool")
      Signed-off-by: Andrey Konovalov <andreyknvl@gmail.com>
      Reported-by: Brad Spengler <spender@grsecurity.net>
      Acked-by: Marco Elver <elver@google.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm: handle profiling for fake memory allocations during compaction · 34a023dc
      Suren Baghdasaryan authored
      During compaction isolated free pages are marked allocated so that they
      can be split and/or freed.  For that, post_alloc_hook() is used inside
      split_map_pages() and release_free_list().  split_map_pages() marks free
      pages allocated, splits the pages and then lets
      alloc_contig_range_noprof() free those pages.  release_free_list()
      marks free pages allocated and immediately frees them.  This usage of
      post_alloc_hook() affects memory allocation profiling because these
      functions might not be called from an instrumented allocator;
      current->alloc_tag is then NULL, and when debugging is enabled
      (CONFIG_MEM_ALLOC_PROFILING_DEBUG=y) that causes warnings.  To avoid
      this, wrap such post_alloc_hook() calls in an instrumented function
      which acts as an allocator and is charged for these fake allocations.
      Note that these allocations are very short-lived until they are freed,
      therefore the associated counters should usually read 0.
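      The charge/uncharge pairing can be modeled in miniature (names and sizes below are stand-ins, not the kernel's profiling API):

```c
#include <assert.h>

/* Toy model of the idea: the "mark page allocated" hook is wrapped in
 * an instrumented allocator so the fake allocation is charged to a tag;
 * the matching free uncharges it, so the counter normally reads 0. */
static long fake_alloc_bytes;

static void post_alloc_hook_model(int nr_pages)
{
    fake_alloc_bytes += nr_pages * 4096L;   /* charge the tag */
}

static void free_pages_model(int nr_pages)
{
    fake_alloc_bytes -= nr_pages * 4096L;   /* uncharge on free */
}

/* Compaction marks isolated free pages allocated and then releases
 * them almost immediately, so the net charge is transient. */
static void split_map_pages_model(int nr_pages)
{
    post_alloc_hook_model(nr_pages);        /* short-lived charge */
    free_pages_model(nr_pages);             /* immediately released */
}
```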
      
      Link: https://lkml.kernel.org/r/20240614230504.3849136-1-surenb@google.com
      Signed-off-by: Suren Baghdasaryan <surenb@google.com>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Kent Overstreet <kent.overstreet@linux.dev>
      Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
      Cc: Sourav Panda <souravpanda@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm/slab: fix 'variable obj_exts set but not used' warning · b4601d09
      Suren Baghdasaryan authored
      slab_post_alloc_hook() uses prepare_slab_obj_exts_hook() to obtain
      slabobj_ext object.  Currently the only user of slabobj_ext object in this
      path is memory allocation profiling, therefore when it's not enabled this
      object is not needed.  This also generates a warning when compiling with
      CONFIG_MEM_ALLOC_PROFILING=n.  Move the code under this configuration to
      fix the warning.  If more slabobj_ext users appear in the future, the code
      will have to be changed back to call prepare_slab_obj_exts_hook().
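      The shape of the fix can be sketched as follows (CONFIG_MEM_ALLOC_PROFILING_MODEL stands in for the real Kconfig symbol; it is left undefined here, as in the =n build):

```c
#include <assert.h>
#include <stddef.h>

/* Counts how many times the obj_exts lookup actually ran, so the test
 * can check the =n configuration compiles it out entirely. */
static int obj_ext_lookups;

/* With the code moved under the config option, no local variable is
 * declared (and left unused) when profiling is compiled out. */
static void slab_post_alloc_hook_model(void *object)
{
#ifdef CONFIG_MEM_ALLOC_PROFILING_MODEL
    void *obj_exts = object;     /* only declared when it is used */
    obj_ext_lookups++;
    (void)obj_exts;
#else
    (void)object;                /* no dead local in the =n build */
#endif
}
```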
      
      Link: https://lkml.kernel.org/r/20240614225951.3845577-1-surenb@google.com
      Fixes: 4b873696 ("mm/slab: add allocation accounting into slab allocation and free paths")
      Signed-off-by: Suren Baghdasaryan <surenb@google.com>
      Reported-by: kernel test robot <lkp@intel.com>
      Closes: https://lore.kernel.org/oe-kbuild-all/202406150444.F6neSaiy-lkp@intel.com/
      Cc: Kent Overstreet <kent.overstreet@linux.dev>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • /proc/pid/smaps: add mseal info for vma · 399ab86e
      Jeff Xu authored
      Add the "sl" flag to /proc/pid/smaps to indicate that a vma is sealed.
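      The mnemonic mechanism can be sketched in miniature (the bit position and names below are illustrative stand-ins, not the kernel's):

```c
#include <assert.h>
#include <string.h>

/* Sketch of how smaps mnemonics work: the kernel keeps a two-letter
 * code per VM_* bit and prints each set one on the VmFlags line; the
 * patch adds "sl" for the sealed bit.  The bit position here is a
 * hypothetical stand-in. */
#define VM_SEALED_MODEL (1UL << 5)   /* hypothetical bit position */

static const char *seal_code_model(unsigned long vm_flags)
{
    return (vm_flags & VM_SEALED_MODEL) ? "sl" : "";
}
```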
      
      Link: https://lkml.kernel.org/r/20240614232014.806352-2-jeffxu@google.com
      Fixes: 8be7258a ("mseal: add mseal syscall")
      Signed-off-by: Jeff Xu <jeffxu@chromium.org>
      Acked-by: David Hildenbrand <david@redhat.com>
      Cc: Adhemerval Zanella <adhemerval.zanella@linaro.org>
      Cc: Jann Horn <jannh@google.com>
      Cc: Jorge Lucangeli Obes <jorgelo@chromium.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Stephen Röttger <sroettger@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm: fix incorrect vbq reference in purge_fragmented_block · 8c61291f
      Zhaoyang Huang authored
      xa_for_each() in _vm_unmap_aliases() loops through all vbs.  However,
      since commit 062eacf5 ("mm: vmalloc: remove a global vmap_blocks
      xarray") the vb from xarray may not be on the corresponding CPU
      vmap_block_queue.  Consequently, purge_fragmented_block() might use the
      wrong vbq->lock to protect the free list, leading to vbq->free breakage.
      
      Incorrect lock protection can exhaust all vmalloc space as follows:
      CPU0                                            CPU1
      +--------------------------------------------+
      |    +--------------------+     +-----+      |
      +--> |                    |---->|     |------+
           | CPU1:vbq free_list |     | vb1 |
      +--- |                    |<----|     |<-----+
      |    +--------------------+     +-----+      |
      +--------------------------------------------+
      
      _vm_unmap_aliases()                             vb_alloc()
                                                      new_vmap_block()
      xa_for_each(&vbq->vmap_blocks, idx, vb)
      --> vb in CPU1:vbq->freelist
      
      purge_fragmented_block(vb)
      spin_lock(&vbq->lock)                           spin_lock(&vbq->lock)
      --> use CPU0:vbq->lock                          --> use CPU1:vbq->lock
      
      list_del_rcu(&vb->free_list)                    list_add_tail_rcu(&vb->free_list, &vbq->free)
          __list_del(vb->prev, vb->next)
              next->prev = prev
          +--------------------+
          |                    |
          | CPU1:vbq free_list |
      +---|                    |<--+
      |   +--------------------+   |
      +----------------------------+
                                                      __list_add(new, head->prev, head)
      +--------------------------------------------+
      |    +--------------------+     +-----+      |
      +--> |                    |---->|     |------+
           | CPU1:vbq free_list |     | vb2 |
      +--- |                    |<----|     |<-----+
      |    +--------------------+     +-----+      |
      +--------------------------------------------+
      
              prev->next = next
      +--------------------------------------------+
      |----------------------------+               |
      |    +--------------------+  |  +-----+      |
      +--> |                    |--+  |     |------+
           | CPU1:vbq free_list |     | vb2 |
      +--- |                    |<----|     |<-----+
      |    +--------------------+     +-----+      |
      +--------------------------------------------+
      Here's the list breakdown: all vbs which were to be added to 'prev'
      can no longer be reached by list_for_each_entry_rcu(vb, &vbq->free,
      free_list) in vb_alloc().  Thus, vmalloc space is exhausted.
      
      This issue affects both erofs and f2fs; the stack traces are as follows:
      erofs:
      [<ffffffd4ffb93ad4>] __switch_to+0x174
      [<ffffffd4ffb942f0>] __schedule+0x624
      [<ffffffd4ffb946f4>] schedule+0x7c
      [<ffffffd4ffb947cc>] schedule_preempt_disabled+0x24
      [<ffffffd4ffb962ec>] __mutex_lock+0x374
      [<ffffffd4ffb95998>] __mutex_lock_slowpath+0x14
      [<ffffffd4ffb95954>] mutex_lock+0x24
      [<ffffffd4fef2900c>] reclaim_and_purge_vmap_areas+0x44
      [<ffffffd4fef25908>] alloc_vmap_area+0x2e0
      [<ffffffd4fef24ea0>] vm_map_ram+0x1b0
      [<ffffffd4ff1b46f4>] z_erofs_lz4_decompress+0x278
      [<ffffffd4ff1b8ac4>] z_erofs_decompress_queue+0x650
      [<ffffffd4ff1b8328>] z_erofs_runqueue+0x7f4
      [<ffffffd4ff1b66a8>] z_erofs_read_folio+0x104
      [<ffffffd4feeb6fec>] filemap_read_folio+0x6c
      [<ffffffd4feeb68c4>] filemap_fault+0x300
      [<ffffffd4fef0ecac>] __do_fault+0xc8
      [<ffffffd4fef0c908>] handle_mm_fault+0xb38
      [<ffffffd4ffb9f008>] do_page_fault+0x288
      [<ffffffd4ffb9ed64>] do_translation_fault[jt]+0x40
      [<ffffffd4fec39c78>] do_mem_abort+0x58
      [<ffffffd4ffb8c3e4>] el0_ia+0x70
      [<ffffffd4ffb8c260>] el0t_64_sync_handler[jt]+0xb0
      [<ffffffd4fec11588>] ret_to_user[jt]+0x0
      
      f2fs:
      [<ffffffd4ffb93ad4>] __switch_to+0x174
      [<ffffffd4ffb942f0>] __schedule+0x624
      [<ffffffd4ffb946f4>] schedule+0x7c
      [<ffffffd4ffb947cc>] schedule_preempt_disabled+0x24
      [<ffffffd4ffb962ec>] __mutex_lock+0x374
      [<ffffffd4ffb95998>] __mutex_lock_slowpath+0x14
      [<ffffffd4ffb95954>] mutex_lock+0x24
      [<ffffffd4fef2900c>] reclaim_and_purge_vmap_areas+0x44
      [<ffffffd4fef25908>] alloc_vmap_area+0x2e0
      [<ffffffd4fef24ea0>] vm_map_ram+0x1b0
      [<ffffffd4ff1a3b60>] f2fs_prepare_decomp_mem+0x144
      [<ffffffd4ff1a6c24>] f2fs_alloc_dic+0x264
      [<ffffffd4ff175468>] f2fs_read_multi_pages+0x428
      [<ffffffd4ff17b46c>] f2fs_mpage_readpages+0x314
      [<ffffffd4ff1785c4>] f2fs_readahead+0x50
      [<ffffffd4feec3384>] read_pages+0x80
      [<ffffffd4feec32c0>] page_cache_ra_unbounded+0x1a0
      [<ffffffd4feec39e8>] page_cache_ra_order+0x274
      [<ffffffd4feeb6cec>] do_sync_mmap_readahead+0x11c
      [<ffffffd4feeb6764>] filemap_fault+0x1a0
      [<ffffffd4ff1423bc>] f2fs_filemap_fault+0x28
      [<ffffffd4fef0ecac>] __do_fault+0xc8
      [<ffffffd4fef0c908>] handle_mm_fault+0xb38
      [<ffffffd4ffb9f008>] do_page_fault+0x288
      [<ffffffd4ffb9ed64>] do_translation_fault[jt]+0x40
      [<ffffffd4fec39c78>] do_mem_abort+0x58
      [<ffffffd4ffb8c3e4>] el0_ia+0x70
      [<ffffffd4ffb8c260>] el0t_64_sync_handler[jt]+0xb0
      [<ffffffd4fec11588>] ret_to_user[jt]+0x0
      
      To fix this, introduce a cpu field within vmap_block to record which
      CPU this vb belongs to.
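      The fix can be modeled in miniature (the structures below are simplified stand-ins for vmap_block and vmap_block_queue):

```c
#include <assert.h>

/* Toy model of the fix: each vmap_block records the CPU whose queue it
 * was placed on, so purging can take that queue's lock instead of the
 * current CPU's.  All structures are simplified stand-ins. */
#define NR_CPUS_MODEL 4

struct vbq_model { int dummy; };
struct vb_model  { int cpu; };   /* the new field the fix introduces */

static struct vbq_model queues_model[NR_CPUS_MODEL];

/* After the fix: derive the queue from vb->cpu, not from whichever CPU
 * happens to be running _vm_unmap_aliases(). */
static struct vbq_model *vb_to_vbq_model(const struct vb_model *vb)
{
    return &queues_model[vb->cpu];
}
```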
      
      Link: https://lkml.kernel.org/r/20240614021352.1822225-1-zhaoyang.huang@unisoc.com
      Link: https://lkml.kernel.org/r/20240607023116.1720640-1-zhaoyang.huang@unisoc.com
      Fixes: fc1e0d98 ("mm/vmalloc: prevent stale TLBs in fully utilized blocks")
      Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
      Suggested-by: Hailong.Liu <hailong.liu@oppo.com>
      Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Lorenzo Stoakes <lstoakes@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
  2. 23 Jun, 2024 8 commits
  3. 22 Jun, 2024 19 commits