1. 03 Jul, 2024 9 commits
    • Jingbo Xu's avatar
      cachefiles: add missing lock protection when polling · cf5bb09e
      Jingbo Xu authored
      Add missing lock protection in poll routine when iterating xarray,
      otherwise:
      
      Even with RCU read lock held, only the slot of the radix tree is
      ensured to be pinned there, while the data structure (e.g. struct
      cachefiles_req) stored in the slot has no such guarantee.  The poll
      routine will iterate the radix tree and dereference cachefiles_req
      accordingly.  Thus RCU read lock is not adequate in this case and
      spinlock is needed here.
      
      Fixes: b817e22b ("cachefiles: narrow the scope of triggering EPOLLIN events in ondemand mode")
      Signed-off-by: default avatarJingbo Xu <jefflexu@linux.alibaba.com>
      Signed-off-by: default avatarBaokun Li <libaokun1@huawei.com>
      Link: https://lore.kernel.org/r/20240628062930.2467993-10-libaokun@huaweicloud.comAcked-by: default avatarJeff Layton <jlayton@kernel.org>
      Reviewed-by: default avatarJia Zhu <zhujia.zj@bytedance.com>
      Reviewed-by: default avatarGao Xiang <hsiangkao@linux.alibaba.com>
      Signed-off-by: default avatarChristian Brauner <brauner@kernel.org>
      cf5bb09e
    • Baokun Li's avatar
      cachefiles: cyclic allocation of msg_id to avoid reuse · 19f4f399
      Baokun Li authored
      Reusing the msg_id after a maliciously completed reopen request may cause
      a read request to remain unprocessed and result in a hung, as shown below:
      
             t1       |      t2       |      t3
      -------------------------------------------------
      cachefiles_ondemand_select_req
       cachefiles_ondemand_object_is_close(A)
       cachefiles_ondemand_set_object_reopening(A)
       queue_work(fscache_object_wq, &info->work)
                      ondemand_object_worker
                       cachefiles_ondemand_init_object(A)
                        cachefiles_ondemand_send_req(OPEN)
                          // get msg_id 6
                          wait_for_completion(&req_A->done)
      cachefiles_ondemand_daemon_read
       // read msg_id 6 req_A
       cachefiles_ondemand_get_fd
       copy_to_user
                                      // Malicious completion msg_id 6
                                      copen 6,-1
                                      cachefiles_ondemand_copen
                                       complete(&req_A->done)
                                       // will not set the object to close
                                       // because ondemand_id && fd is valid.
      
                      // ondemand_object_worker() is done
                      // but the object is still reopening.
      
                                      // new open req_B
                                      cachefiles_ondemand_init_object(B)
                                       cachefiles_ondemand_send_req(OPEN)
                                       // reuse msg_id 6
      process_open_req
       copen 6,A.size
       // The expected failed copen was executed successfully
      
      Expect copen to fail, and when it does, it closes fd, which sets the
      object to close, and then close triggers reopen again. However, due to
      msg_id reuse resulting in a successful copen, the anonymous fd is not
      closed until the daemon exits. Therefore read requests waiting for reopen
      to complete may trigger hung task.
      
      To avoid this issue, allocate the msg_id cyclically to avoid reusing the
      msg_id for a very short duration of time.
      
      Fixes: c8383054 ("cachefiles: notify the user daemon when looking up cookie")
      Signed-off-by: default avatarBaokun Li <libaokun1@huawei.com>
      Link: https://lore.kernel.org/r/20240628062930.2467993-9-libaokun@huaweicloud.comAcked-by: default avatarJeff Layton <jlayton@kernel.org>
      Reviewed-by: default avatarGao Xiang <hsiangkao@linux.alibaba.com>
      Reviewed-by: default avatarJia Zhu <zhujia.zj@bytedance.com>
      Signed-off-by: default avatarChristian Brauner <brauner@kernel.org>
      19f4f399
    • Hou Tao's avatar
      cachefiles: wait for ondemand_object_worker to finish when dropping object · 12e009d6
      Hou Tao authored
      When queuing ondemand_object_worker() to re-open the object,
      cachefiles_object is not pinned. The cachefiles_object may be freed when
      the pending read request is completed intentionally and the related
      erofs is umounted. If ondemand_object_worker() runs after the object is
      freed, it will incur use-after-free problem as shown below.
      
      process A  processs B  process C  process D
      
      cachefiles_ondemand_send_req()
      // send a read req X
      // wait for its completion
      
                 // close ondemand fd
                 cachefiles_ondemand_fd_release()
                 // set object as CLOSE
      
                             cachefiles_ondemand_daemon_read()
                             // set object as REOPENING
                             queue_work(fscache_wq, &info->ondemand_work)
      
                                      // close /dev/cachefiles
                                      cachefiles_daemon_release
                                      cachefiles_flush_reqs
                                      complete(&req->done)
      
      // read req X is completed
      // umount the erofs fs
      cachefiles_put_object()
      // object will be freed
      cachefiles_ondemand_deinit_obj_info()
      kmem_cache_free(object)
                             // both info and object are freed
                             ondemand_object_worker()
      
      When dropping an object, it is no longer necessary to reopen the object,
      so use cancel_work_sync() to cancel or wait for ondemand_object_worker()
      to finish.
      
      Fixes: 0a7e54c1 ("cachefiles: resend an open request if the read request's object is closed")
      Signed-off-by: default avatarHou Tao <houtao1@huawei.com>
      Signed-off-by: default avatarBaokun Li <libaokun1@huawei.com>
      Link: https://lore.kernel.org/r/20240628062930.2467993-8-libaokun@huaweicloud.comAcked-by: default avatarJeff Layton <jlayton@kernel.org>
      Reviewed-by: default avatarJia Zhu <zhujia.zj@bytedance.com>
      Reviewed-by: default avatarGao Xiang <hsiangkao@linux.alibaba.com>
      Signed-off-by: default avatarChristian Brauner <brauner@kernel.org>
      12e009d6
    • Baokun Li's avatar
      cachefiles: cancel all requests for the object that is being dropped · 751f5246
      Baokun Li authored
      Because after an object is dropped, requests for that object are useless,
      cancel them to avoid causing other problems.
      
      This prepares for the later addition of cancel_work_sync(). After the
      reopen requests is generated, cancel it to avoid cancel_work_sync()
      blocking by waiting for daemon to complete the reopen requests.
      Signed-off-by: default avatarBaokun Li <libaokun1@huawei.com>
      Link: https://lore.kernel.org/r/20240628062930.2467993-7-libaokun@huaweicloud.comAcked-by: default avatarJeff Layton <jlayton@kernel.org>
      Reviewed-by: default avatarGao Xiang <hsiangkao@linux.alibaba.com>
      Reviewed-by: default avatarJia Zhu <zhujia.zj@bytedance.com>
      Signed-off-by: default avatarChristian Brauner <brauner@kernel.org>
      751f5246
    • Baokun Li's avatar
      cachefiles: stop sending new request when dropping object · b2415d1f
      Baokun Li authored
      Added CACHEFILES_ONDEMAND_OBJSTATE_DROPPING indicates that the cachefiles
      object is being dropped, and is set after the close request for the dropped
      object completes, and no new requests are allowed to be sent after this
      state.
      
      This prepares for the later addition of cancel_work_sync(). It prevents
      leftover reopen requests from being sent, to avoid processing unnecessary
      requests and to avoid cancel_work_sync() blocking by waiting for daemon to
      complete the reopen requests.
      Signed-off-by: default avatarBaokun Li <libaokun1@huawei.com>
      Link: https://lore.kernel.org/r/20240628062930.2467993-6-libaokun@huaweicloud.comAcked-by: default avatarJeff Layton <jlayton@kernel.org>
      Reviewed-by: default avatarGao Xiang <hsiangkao@linux.alibaba.com>
      Reviewed-by: default avatarJia Zhu <zhujia.zj@bytedance.com>
      Signed-off-by: default avatarChristian Brauner <brauner@kernel.org>
      b2415d1f
    • Baokun Li's avatar
      cachefiles: propagate errors from vfs_getxattr() to avoid infinite loop · 0ece614a
      Baokun Li authored
      In cachefiles_check_volume_xattr(), the error returned by vfs_getxattr()
      is not passed to ret, so it ends up returning -ESTALE, which leads to an
      endless loop as follows:
      
      cachefiles_acquire_volume
      retry:
        ret = cachefiles_check_volume_xattr
          ret = -ESTALE
          xlen = vfs_getxattr // return -EIO
          // The ret is not updated when xlen < 0, so -ESTALE is returned.
          return ret
        // Supposed to jump out of the loop at this judgement.
        if (ret != -ESTALE)
            goto error_dir;
        cachefiles_bury_object
          //  EIO causes rename failure
        goto retry;
      
      Hence propagate the error returned by vfs_getxattr() to avoid the above
      issue. Do the same in cachefiles_check_auxdata().
      
      Fixes: 32e15003 ("fscache, cachefiles: Store the volume coherency data")
      Fixes: 72b95785 ("cachefiles: Implement metadata/coherency data storage in xattrs")
      Signed-off-by: default avatarBaokun Li <libaokun1@huawei.com>
      Link: https://lore.kernel.org/r/20240628062930.2467993-5-libaokun@huaweicloud.comReviewed-by: default avatarGao Xiang <hsiangkao@linux.alibaba.com>
      Signed-off-by: default avatarChristian Brauner <brauner@kernel.org>
      0ece614a
    • Baokun Li's avatar
      cachefiles: fix slab-use-after-free in cachefiles_withdraw_cookie() · 5d8f8057
      Baokun Li authored
      We got the following issue in our fault injection stress test:
      
      ==================================================================
      BUG: KASAN: slab-use-after-free in cachefiles_withdraw_cookie+0x4d9/0x600
      Read of size 8 at addr ffff888118efc000 by task kworker/u78:0/109
      
      CPU: 13 PID: 109 Comm: kworker/u78:0 Not tainted 6.8.0-dirty #566
      Call Trace:
       <TASK>
       kasan_report+0x93/0xc0
       cachefiles_withdraw_cookie+0x4d9/0x600
       fscache_cookie_state_machine+0x5c8/0x1230
       fscache_cookie_worker+0x91/0x1c0
       process_one_work+0x7fa/0x1800
       [...]
      
      Allocated by task 117:
       kmalloc_trace+0x1b3/0x3c0
       cachefiles_acquire_volume+0xf3/0x9c0
       fscache_create_volume_work+0x97/0x150
       process_one_work+0x7fa/0x1800
       [...]
      
      Freed by task 120301:
       kfree+0xf1/0x2c0
       cachefiles_withdraw_cache+0x3fa/0x920
       cachefiles_put_unbind_pincount+0x1f6/0x250
       cachefiles_daemon_release+0x13b/0x290
       __fput+0x204/0xa00
       task_work_run+0x139/0x230
       do_exit+0x87a/0x29b0
       [...]
      ==================================================================
      
      Following is the process that triggers the issue:
      
                 p1                |             p2
      ------------------------------------------------------------
                                    fscache_begin_lookup
                                     fscache_begin_volume_access
                                      fscache_cache_is_live(fscache_cache)
      cachefiles_daemon_release
       cachefiles_put_unbind_pincount
        cachefiles_daemon_unbind
         cachefiles_withdraw_cache
          fscache_withdraw_cache
           fscache_set_cache_state(cache, FSCACHE_CACHE_IS_WITHDRAWN);
          cachefiles_withdraw_objects(cache)
          fscache_wait_for_objects(fscache)
            atomic_read(&fscache_cache->object_count) == 0
                                    fscache_perform_lookup
                                     cachefiles_lookup_cookie
                                      cachefiles_alloc_object
                                       refcount_set(&object->ref, 1);
                                       object->volume = volume
                                       fscache_count_object(vcookie->cache);
                                        atomic_inc(&fscache_cache->object_count)
          cachefiles_withdraw_volumes
           cachefiles_withdraw_volume
            fscache_withdraw_volume
            __cachefiles_free_volume
             kfree(cachefiles_volume)
                                    fscache_cookie_state_machine
                                     cachefiles_withdraw_cookie
                                      cache = object->volume->cache;
                                      // cachefiles_volume UAF !!!
      
      After setting FSCACHE_CACHE_IS_WITHDRAWN, wait for all the cookie lookups
      to complete first, and then wait for fscache_cache->object_count == 0 to
      avoid the cookie exiting after the volume has been freed and triggering
      the above issue. Therefore call fscache_withdraw_volume() before calling
      cachefiles_withdraw_objects().
      
      This way, after setting FSCACHE_CACHE_IS_WITHDRAWN, only the following two
      cases will occur:
      1) fscache_begin_lookup fails in fscache_begin_volume_access().
      2) fscache_withdraw_volume() will ensure that fscache_count_object() has
         been executed before calling fscache_wait_for_objects().
      
      Fixes: fe2140e2 ("cachefiles: Implement volume support")
      Suggested-by: default avatarHou Tao <houtao1@huawei.com>
      Signed-off-by: default avatarBaokun Li <libaokun1@huawei.com>
      Link: https://lore.kernel.org/r/20240628062930.2467993-4-libaokun@huaweicloud.comSigned-off-by: default avatarChristian Brauner <brauner@kernel.org>
      5d8f8057
    • Baokun Li's avatar
      cachefiles: fix slab-use-after-free in fscache_withdraw_volume() · 522018a0
      Baokun Li authored
      We got the following issue in our fault injection stress test:
      
      ==================================================================
      BUG: KASAN: slab-use-after-free in fscache_withdraw_volume+0x2e1/0x370
      Read of size 4 at addr ffff88810680be08 by task ondemand-04-dae/5798
      
      CPU: 0 PID: 5798 Comm: ondemand-04-dae Not tainted 6.8.0-dirty #565
      Call Trace:
       kasan_check_range+0xf6/0x1b0
       fscache_withdraw_volume+0x2e1/0x370
       cachefiles_withdraw_volume+0x31/0x50
       cachefiles_withdraw_cache+0x3ad/0x900
       cachefiles_put_unbind_pincount+0x1f6/0x250
       cachefiles_daemon_release+0x13b/0x290
       __fput+0x204/0xa00
       task_work_run+0x139/0x230
      
      Allocated by task 5820:
       __kmalloc+0x1df/0x4b0
       fscache_alloc_volume+0x70/0x600
       __fscache_acquire_volume+0x1c/0x610
       erofs_fscache_register_volume+0x96/0x1a0
       erofs_fscache_register_fs+0x49a/0x690
       erofs_fc_fill_super+0x6c0/0xcc0
       vfs_get_super+0xa9/0x140
       vfs_get_tree+0x8e/0x300
       do_new_mount+0x28c/0x580
       [...]
      
      Freed by task 5820:
       kfree+0xf1/0x2c0
       fscache_put_volume.part.0+0x5cb/0x9e0
       erofs_fscache_unregister_fs+0x157/0x1b0
       erofs_kill_sb+0xd9/0x1c0
       deactivate_locked_super+0xa3/0x100
       vfs_get_super+0x105/0x140
       vfs_get_tree+0x8e/0x300
       do_new_mount+0x28c/0x580
       [...]
      ==================================================================
      
      Following is the process that triggers the issue:
      
              mount failed         |         daemon exit
      ------------------------------------------------------------
       deactivate_locked_super        cachefiles_daemon_release
        erofs_kill_sb
         erofs_fscache_unregister_fs
          fscache_relinquish_volume
           __fscache_relinquish_volume
            fscache_put_volume(fscache_volume, fscache_volume_put_relinquish)
             zero = __refcount_dec_and_test(&fscache_volume->ref, &ref);
                                       cachefiles_put_unbind_pincount
                                        cachefiles_daemon_unbind
                                         cachefiles_withdraw_cache
                                          cachefiles_withdraw_volumes
                                           list_del_init(&volume->cache_link)
             fscache_free_volume(fscache_volume)
              cache->ops->free_volume
               cachefiles_free_volume
                list_del_init(&cachefiles_volume->cache_link);
              kfree(fscache_volume)
                                           cachefiles_withdraw_volume
                                            fscache_withdraw_volume
                                             fscache_volume->n_accesses
                                             // fscache_volume UAF !!!
      
      The fscache_volume in cache->volumes must not have been freed yet, but its
      reference count may be 0. So use the new fscache_try_get_volume() helper
      function try to get its reference count.
      
      If the reference count of fscache_volume is 0, fscache_put_volume() is
      freeing it, so wait for it to be removed from cache->volumes.
      
      If its reference count is not 0, call cachefiles_withdraw_volume() with
      reference count protection to avoid the above issue.
      
      Fixes: fe2140e2 ("cachefiles: Implement volume support")
      Signed-off-by: default avatarBaokun Li <libaokun1@huawei.com>
      Link: https://lore.kernel.org/r/20240628062930.2467993-3-libaokun@huaweicloud.comSigned-off-by: default avatarChristian Brauner <brauner@kernel.org>
      522018a0
    • Baokun Li's avatar
      netfs, fscache: export fscache_put_volume() and add fscache_try_get_volume() · 85b08b31
      Baokun Li authored
      Export fscache_put_volume() and add fscache_try_get_volume()
      helper function to allow cachefiles to get/put fscache_volume
      via linux/fscache-cache.h.
      Signed-off-by: default avatarBaokun Li <libaokun1@huawei.com>
      Link: https://lore.kernel.org/r/20240628062930.2467993-2-libaokun@huaweicloud.comSigned-off-by: default avatarChristian Brauner <brauner@kernel.org>
      85b08b31
  2. 12 Jun, 2024 1 commit
  3. 26 May, 2024 5 commits
  4. 25 May, 2024 12 commits
    • Linus Torvalds's avatar
      Merge tag 'mm-hotfixes-stable-2024-05-25-09-13' of... · 9b62e02e
      Linus Torvalds authored
      Merge tag 'mm-hotfixes-stable-2024-05-25-09-13' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
      
      Pull misc fixes from Andrew Morton:
       "16 hotfixes, 11 of which are cc:stable.
      
        A few nilfs2 fixes, the remainder are for MM: a couple of selftests
        fixes, various singletons fixing various issues in various parts"
      
      * tag 'mm-hotfixes-stable-2024-05-25-09-13' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
        mm/ksm: fix possible UAF of stable_node
        mm/memory-failure: fix handling of dissolved but not taken off from buddy pages
        mm: /proc/pid/smaps_rollup: avoid skipping vma after getting mmap_lock again
        nilfs2: fix potential hang in nilfs_detach_log_writer()
        nilfs2: fix unexpected freezing of nilfs_segctor_sync()
        nilfs2: fix use-after-free of timer for log writer thread
        selftests/mm: fix build warnings on ppc64
        arm64: patching: fix handling of execmem addresses
        selftests/mm: compaction_test: fix bogus test success and reduce probability of OOM-killer invocation
        selftests/mm: compaction_test: fix incorrect write of zero to nr_hugepages
        selftests/mm: compaction_test: fix bogus test success on Aarch64
        mailmap: update email address for Satya Priya
        mm/huge_memory: don't unpoison huge_zero_folio
        kasan, fortify: properly rename memintrinsics
        lib: add version into /proc/allocinfo output
        mm/vmalloc: fix vmalloc which may return null if called with __GFP_NOFAIL
      9b62e02e
    • Linus Torvalds's avatar
      Merge tag 'irq-urgent-2024-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · a0db36ed
      Linus Torvalds authored
      Pull irq fixes from Ingo Molnar:
      
       - Fix x86 IRQ vector leak caused by a CPU offlining race
      
       - Fix build failure in the riscv-imsic irqchip driver
         caused by an API-change semantic conflict
      
       - Fix use-after-free in irq_find_at_or_after()
      
      * tag 'irq-urgent-2024-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        genirq/irqdesc: Prevent use-after-free in irq_find_at_or_after()
        genirq/cpuhotplug, x86/vector: Prevent vector leak during CPU offline
        irqchip/riscv-imsic: Fixup riscv_ipi_set_virq_range() conflict
      a0db36ed
    • Linus Torvalds's avatar
      Merge tag 'x86-urgent-2024-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 3a390f24
      Linus Torvalds authored
      Pull x86 fixes from Ingo Molnar:
      
       - Fix regressions of the new x86 CPU VFM (vendor/family/model)
         enumeration/matching code
      
       - Fix crash kernel detection on buggy firmware with
         non-compliant ACPI MADT tables
      
       - Address Kconfig warning
      
      * tag 'x86-urgent-2024-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/cpu: Fix x86_match_cpu() to match just X86_VENDOR_INTEL
        crypto: x86/aes-xts - switch to new Intel CPU model defines
        x86/topology: Handle bogus ACPI tables correctly
        x86/kconfig: Select ARCH_WANT_FRAME_POINTERS again when UNWINDER_FRAME_POINTER=y
      3a390f24
    • Linus Torvalds's avatar
      Merge tag 'for-linus-6.10-1' of https://github.com/cminyard/linux-ipmi · 56676c4c
      Linus Torvalds authored
      Pull ipmi updates from Corey Minyard:
       "Mostly updates for deprecated interfaces, platform.remove and
        converting from a tasklet to a BH workqueue.
      
        Also use HAS_IOPORT for disabling inb()/outb()"
      
      * tag 'for-linus-6.10-1' of https://github.com/cminyard/linux-ipmi:
        ipmi: kcs_bmc_npcm7xx: Convert to platform remove callback returning void
        ipmi: kcs_bmc_aspeed: Convert to platform remove callback returning void
        ipmi: ipmi_ssif: Convert to platform remove callback returning void
        ipmi: ipmi_si_platform: Convert to platform remove callback returning void
        ipmi: ipmi_powernv: Convert to platform remove callback returning void
        ipmi: bt-bmc: Convert to platform remove callback returning void
        char: ipmi: handle HAS_IOPORT dependencies
        ipmi: Convert from tasklet to BH workqueue
      56676c4c
    • Linus Torvalds's avatar
      Merge tag 'ceph-for-6.10-rc1' of https://github.com/ceph/ceph-client · 74eca356
      Linus Torvalds authored
      Pull ceph updates from Ilya Dryomov:
       "A series from Xiubo that adds support for additional access checks
        based on MDS auth caps which were recently made available to clients.
      
        This is needed to prevent scenarios where the MDS quietly discards
        updates that a UID-restricted client previously (wrongfully) acked to
        the user.
      
        Other than that, just a documentation fixup"
      
      * tag 'ceph-for-6.10-rc1' of https://github.com/ceph/ceph-client:
        doc: ceph: update userspace command to get CephFS metadata
        ceph: add CEPHFS_FEATURE_MDS_AUTH_CAPS_CHECK feature bit
        ceph: check the cephx mds auth access for async dirop
        ceph: check the cephx mds auth access for open
        ceph: check the cephx mds auth access for setattr
        ceph: add ceph_mds_check_access() helper
        ceph: save cap_auths in MDS client when session is opened
      74eca356
    • Linus Torvalds's avatar
      Merge tag 'ntfs3_for_6.10' of https://github.com/Paragon-Software-Group/linux-ntfs3 · 89b61ca4
      Linus Torvalds authored
      Pull ntfs3 updates from Konstantin Komarov:
       "Fixes:
         - reusing of the file index (could cause the file to be trimmed)
         - infinite dir enumeration
         - taking DOS names into account during link counting
         - le32_to_cpu conversion, 32 bit overflow, NULL check
         - some code was refactored
      
        Changes:
         - removed max link count info display during driver init
      
        Remove:
         - atomic_open has been removed for lack of use"
      
      * tag 'ntfs3_for_6.10' of https://github.com/Paragon-Software-Group/linux-ntfs3:
        fs/ntfs3: Break dir enumeration if directory contents error
        fs/ntfs3: Fix case when index is reused during tree transformation
        fs/ntfs3: Mark volume as dirty if xattr is broken
        fs/ntfs3: Always make file nonresident on fallocate call
        fs/ntfs3: Redesign ntfs_create_inode to return error code instead of inode
        fs/ntfs3: Use variable length array instead of fixed size
        fs/ntfs3: Use 64 bit variable to avoid 32 bit overflow
        fs/ntfs3: Check 'folio' pointer for NULL
        fs/ntfs3: Missed le32_to_cpu conversion
        fs/ntfs3: Remove max link count info display during driver init
        fs/ntfs3: Taking DOS names into account during link counting
        fs/ntfs3: remove atomic_open
        fs/ntfs3: use kcalloc() instead of kzalloc()
      89b61ca4
    • Linus Torvalds's avatar
      Merge tag '6.10-rc-ksmbd-server-fixes' of git://git.samba.org/ksmbd · 6c8b1a2d
      Linus Torvalds authored
      Pull smb server fixes from Steve French:
       "Two ksmbd server fixes, both for stable"
      
      * tag '6.10-rc-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
        ksmbd: ignore trailing slashes in share paths
        ksmbd: avoid to send duplicate oplock break notifications
      6c8b1a2d
    • Linus Torvalds's avatar
      Merge tag 'rtc-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux · 54f71b03
      Linus Torvalds authored
      Pull RTC updates from Alexandre Belloni:
       "There is one new driver and then most of the changes are the device
        tree bindings conversions to yaml.
      
        New driver:
         - Epson RX8111
      
        Drivers:
         - Many Device Tree bindings conversions to dtschema
         - pcf8563: wakeup-source support"
      
      * tag 'rtc-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux:
        pcf8563: add wakeup-source support
        rtc: rx8111: handle VLOW flag
        rtc: rx8111: demote warnings to debug level
        rtc: rx6110: Constify struct regmap_config
        dt-bindings: rtc: convert trivial devices into dtschema
        dt-bindings: rtc: stmp3xxx-rtc: convert to dtschema
        dt-bindings: rtc: pxa-rtc: convert to dtschema
        rtc: Add driver for Epson RX8111
        dt-bindings: rtc: Add Epson RX8111
        rtc: mcp795: drop unneeded MODULE_ALIAS
        rtc: nuvoton: Modify part number value
        rtc: test: Split rtc unit test into slow and normal speed test
        dt-bindings: rtc: nxp,lpc1788-rtc: convert to dtschema
        dt-bindings: rtc: digicolor-rtc: move to trivial-rtc
        dt-bindings: rtc: alphascale,asm9260-rtc: convert to dtschema
        dt-bindings: rtc: armada-380-rtc: convert to dtschema
        rtc: cros-ec: provide ID table for avoiding fallback match
      54f71b03
    • Linus Torvalds's avatar
      Merge tag 'i3c/for-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux · 4286e1fc
      Linus Torvalds authored
      Pull i3c updates from Alexandre Belloni:
       "Runtime PM (power management) is improved and hot-join support has
        been added to the dw controller driver.
      
        Core:
         - Allow device driver to trigger controller runtime PM
      
        Drivers:
         - dw: hot-join support
         - svc: better IBI handling"
      
      * tag 'i3c/for-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux:
        i3c: dw: Add hot-join support.
        i3c: master: Enable runtime PM for master controller
        i3c: master: svc: fix invalidate IBI type and miss call client IBI handler
        i3c: master: svc: change ENXIO to EAGAIN when IBI occurs during start frame
        i3c: Add comment for -EAGAIN in i3c_device_do_priv_xfers()
      4286e1fc
    • Linus Torvalds's avatar
      Merge tag 'jffs2-for-linus-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs · 6951abe8
      Linus Torvalds authored
      Pull jffs2 updates from Richard Weinberger:
      
       - Fix illegal memory access in jffs2_free_inode()
      
       - Kernel-doc fixes
      
       - print symbolic error names
      
      * tag 'jffs2-for-linus-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs:
        jffs2: Fix potential illegal address access in jffs2_free_inode
        jffs2: Simplify the allocation of slab caches
        jffs2: nodemgmt: fix kernel-doc comments
        jffs2: print symbolic error name instead of error code
      6951abe8
    • Linus Torvalds's avatar
      Merge tag 'uml-for-linus-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux · 2313022e
      Linus Torvalds authored
      Pull UML updates from Richard Weinberger:
      
       - Fixes for -Wmissing-prototypes warnings and further cleanup
      
       - Remove callback returning void from rtc and virtio drivers
      
       - Fix bash location
      
      * tag 'uml-for-linus-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux: (26 commits)
        um: virtio_uml: Convert to platform remove callback returning void
        um: rtc: Convert to platform remove callback returning void
        um: Remove unused do_get_thread_area function
        um: Fix -Wmissing-prototypes warnings for __vdso_*
        um: Add an internal header shared among the user code
        um: Fix the declaration of kasan_map_memory
        um: Fix the -Wmissing-prototypes warning for get_thread_reg
        um: Fix the -Wmissing-prototypes warning for __switch_mm
        um: Fix -Wmissing-prototypes warnings for (rt_)sigreturn
        um: Stop tracking host PID in cpu_tasks
        um: process: remove unused 'n' variable
        um: vector: remove unused len variable/calculation
        um: vector: fix bpfflash parameter evaluation
        um: slirp: remove set but unused variable 'pid'
        um: signal: move pid variable where needed
        um: Makefile: use bash from the environment
        um: Add winch to winch_handlers before registering winch IRQ
        um: Fix -Wmissing-prototypes warnings for __warp_* and foo
        um: Fix -Wmissing-prototypes warnings for text_poke*
        um: Move declarations to proper headers
        ...
      2313022e
    • Linus Torvalds's avatar
      Merge tag 'drm-next-2024-05-25' of https://gitlab.freedesktop.org/drm/kernel · 56fb6f92
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Some fixes for the end of the merge window, mostly amdgpu and panthor,
        with one nouveau uAPI change that fixes a bad decision we made a few
        months back.
      
        nouveau:
         - fix bo metadata uAPI for vm bind
      
        panthor:
         - Fixes for panthor's heap logical block.
         - Reset on unrecoverable fault
         - Fix VM references.
         - Reset fix.
      
        xlnx:
         - xlnx compile and doc fixes.
      
        amdgpu:
         - Handle vbios table integrated info v2.3
      
        amdkfd:
         - Handle duplicate BOs in reserve_bo_and_cond_vms
         - Handle memory limitations on small APUs
      
        dp/mst:
         - MST null deref fix.
      
        bridge:
         - Don't let next bridge create connector in adv7511 to make probe
           work"
      
      * tag 'drm-next-2024-05-25' of https://gitlab.freedesktop.org/drm/kernel:
        drm/amdgpu/atomfirmware: add intergrated info v2.3 table
        drm/mst: Fix NULL pointer dereference at drm_dp_add_payload_part2
        drm/amdkfd: Let VRAM allocations go to GTT domain on small APUs
        drm/amdkfd: handle duplicate BOs in reserve_bo_and_cond_vms
        drm/bridge: adv7511: Attach next bridge without creating connector
        drm/buddy: Fix the warn on's during force merge
        drm/nouveau: use tile_mode and pte_kind for VM_BIND bo allocations
        drm/panthor: Call panthor_sched_post_reset() even if the reset failed
        drm/panthor: Reset the FW VM to NULL on unplug
        drm/panthor: Keep a ref to the VM at the panthor_kernel_bo level
        drm/panthor: Force an immediate reset on unrecoverable faults
        drm/panthor: Document drm_panthor_tiler_heap_destroy::handle validity constraints
        drm/panthor: Fix an off-by-one in the heap context retrieval logic
        drm/panthor: Relax the constraints on the tiler chunk size
        drm/panthor: Make sure the tiler initial/max chunks are consistent
        drm/panthor: Fix tiler OOM handling to allow incremental rendering
        drm: xlnx: zynqmp_dpsub: Fix compilation error
        drm: xlnx: zynqmp_dpsub: Fix few function comments
      56fb6f92
  5. 24 May, 2024 13 commits
    • David Howells's avatar
      cifs: Fix missing set of remote_i_size · 93a43155
      David Howells authored
      Occasionally, the generic/001 xfstest will fail indicating corruption in
      one of the copy chains when run on cifs against a server that supports
      FSCTL_DUPLICATE_EXTENTS_TO_FILE (eg. Samba with a share on btrfs).  The
      problem is that the remote_i_size value isn't updated by cifs_setsize()
      when called by smb2_duplicate_extents(), but i_size *is*.
      
      This may cause cifs_remap_file_range() to then skip the bit after calling
      ->duplicate_extents() that sets sizes.
      
      Fix this by calling netfs_resize_file() in smb2_duplicate_extents() before
      calling cifs_setsize() to set i_size.
      
      This means we don't then need to call netfs_resize_file() upon return from
      ->duplicate_extents(), but we also fix the test to compare against the pre-dup
      inode size.
      
      [Note that this goes back before the addition of remote_i_size with the
      netfs_inode struct.  It should probably have been setting cifsi->server_eof
      previously.]
      
      Fixes: cfc63fc8 ("smb3: fix cached file size problems in duplicate extents (reflink)")
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      cc: Steve French <sfrench@samba.org>
      cc: Paulo Alcantara <pc@manguebit.com>
      cc: Shyam Prasad N <nspmangalore@gmail.com>
      cc: Rohith Surabattula <rohiths.msft@gmail.com>
      cc: Jeff Layton <jlayton@kernel.org>
      cc: linux-cifs@vger.kernel.org
      cc: netfs@lists.linux.dev
      Signed-off-by: default avatarSteve French <stfrench@microsoft.com>
      93a43155
    • David Howells's avatar
      cifs: Fix smb3_insert_range() to move the zero_point · 8a160723
      David Howells authored
      Fix smb3_insert_range() to move the zero_point over to the new EOF.
      Without this, generic/147 fails as reads of data beyond the old EOF point
      return zeroes.
      
      Fixes: 3ee1a1fc ("cifs: Cut over to using netfslib")
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      cc: Shyam Prasad N <nspmangalore@gmail.com>
      cc: Rohith Surabattula <rohiths.msft@gmail.com>
      cc: Jeff Layton <jlayton@kernel.org>
      cc: linux-cifs@vger.kernel.org
      cc: netfs@lists.linux.dev
      Signed-off-by: default avatarSteve French <stfrench@microsoft.com>
      8a160723
    • Linus Torvalds's avatar
      Merge tag 'mm-stable-2024-05-24-11-49' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm · 0b32d436
      Linus Torvalds authored
      Pull more mm updates from Andrew Morton:
       "Jeff Xu's implementation of the mseal() syscall"
      
      * tag 'mm-stable-2024-05-24-11-49' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
        selftest mm/mseal read-only elf memory segment
        mseal: add documentation
        selftest mm/mseal memory sealing
        mseal: add mseal syscall
        mseal: wire up mseal syscall
      0b32d436
    • Chengming Zhou's avatar
      mm/ksm: fix possible UAF of stable_node · 90e82349
      Chengming Zhou authored
      The commit 2c653d0e ("ksm: introduce ksm_max_page_sharing per page
      deduplication limit") introduced a possible failure case in the
      stable_tree_insert(), where we may free the new allocated stable_node_dup
      if we fail to prepare the missing chain node.
      
      Then that kfolio return and unlock with a freed stable_node set...  And
      any MM activities can come in to access kfolio->mapping, so UAF.
      
      Fix it by moving folio_set_stable_node() to the end after stable_node
      is inserted successfully.
      
      Link: https://lkml.kernel.org/r/20240513-b4-ksm-stable-node-uaf-v1-1-f687de76f452@linux.dev
      Fixes: 2c653d0e ("ksm: introduce ksm_max_page_sharing per page deduplication limit")
      Signed-off-by: default avatarChengming Zhou <chengming.zhou@linux.dev>
      Acked-by: default avatarDavid Hildenbrand <david@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Stefan Roesch <shr@devkernel.io>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      90e82349
    • Miaohe Lin's avatar
      mm/memory-failure: fix handling of dissolved but not taken off from buddy pages · 8cf360b9
      Miaohe Lin authored
      When I did memory failure tests recently, below panic occurs:
      
      page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8cee00
      flags: 0x6fffe0000000000(node=1|zone=2|lastcpupid=0x7fff)
      raw: 06fffe0000000000 dead000000000100 dead000000000122 0000000000000000
      raw: 0000000000000000 0000000000000009 00000000ffffffff 0000000000000000
      page dumped because: VM_BUG_ON_PAGE(!PageBuddy(page))
      ------------[ cut here ]------------
      kernel BUG at include/linux/page-flags.h:1009!
      invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
      RIP: 0010:__del_page_from_free_list+0x151/0x180
      RSP: 0018:ffffa49c90437998 EFLAGS: 00000046
      RAX: 0000000000000035 RBX: 0000000000000009 RCX: ffff8dd8dfd1c9c8
      RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff8dd8dfd1c9c0
      RBP: ffffd901233b8000 R08: ffffffffab5511f8 R09: 0000000000008c69
      R10: 0000000000003c15 R11: ffffffffab5511f8 R12: ffff8dd8fffc0c80
      R13: 0000000000000001 R14: ffff8dd8fffc0c80 R15: 0000000000000009
      FS:  00007ff916304740(0000) GS:ffff8dd8dfd00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 000055eae50124c8 CR3: 00000008479e0000 CR4: 00000000000006f0
      Call Trace:
       <TASK>
       __rmqueue_pcplist+0x23b/0x520
       get_page_from_freelist+0x26b/0xe40
       __alloc_pages_noprof+0x113/0x1120
       __folio_alloc_noprof+0x11/0xb0
       alloc_buddy_hugetlb_folio.isra.0+0x5a/0x130
       __alloc_fresh_hugetlb_folio+0xe7/0x140
       alloc_pool_huge_folio+0x68/0x100
       set_max_huge_pages+0x13d/0x340
       hugetlb_sysctl_handler_common+0xe8/0x110
       proc_sys_call_handler+0x194/0x280
       vfs_write+0x387/0x550
       ksys_write+0x64/0xe0
       do_syscall_64+0xc2/0x1d0
       entry_SYSCALL_64_after_hwframe+0x77/0x7f
      RIP: 0033:0x7ff916114887
      RSP: 002b:00007ffec8a2fd78 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
      RAX: ffffffffffffffda RBX: 000055eae500e350 RCX: 00007ff916114887
      RDX: 0000000000000004 RSI: 000055eae500e390 RDI: 0000000000000003
      RBP: 000055eae50104c0 R08: 0000000000000000 R09: 000055eae50104c0
      R10: 0000000000000077 R11: 0000000000000246 R12: 0000000000000004
      R13: 0000000000000004 R14: 00007ff916216b80 R15: 00007ff916216a00
       </TASK>
      Modules linked in: mce_inject hwpoison_inject
      ---[ end trace 0000000000000000 ]---
      
      And before the panic, there had an warning about bad page state:
      
      BUG: Bad page state in process page-types  pfn:8cee00
      page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8cee00
      flags: 0x6fffe0000000000(node=1|zone=2|lastcpupid=0x7fff)
      page_type: 0xffffff7f(buddy)
      raw: 06fffe0000000000 ffffd901241c0008 ffffd901240f8008 0000000000000000
      raw: 0000000000000000 0000000000000009 00000000ffffff7f 0000000000000000
      page dumped because: nonzero mapcount
      Modules linked in: mce_inject hwpoison_inject
      CPU: 8 PID: 154211 Comm: page-types Not tainted 6.9.0-rc4-00499-g5544ec3178e2-dirty #22
      Call Trace:
       <TASK>
       dump_stack_lvl+0x83/0xa0
       bad_page+0x63/0xf0
       free_unref_page+0x36e/0x5c0
       unpoison_memory+0x50b/0x630
       simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110
       debugfs_attr_write+0x42/0x60
       full_proxy_write+0x5b/0x80
       vfs_write+0xcd/0x550
       ksys_write+0x64/0xe0
       do_syscall_64+0xc2/0x1d0
       entry_SYSCALL_64_after_hwframe+0x77/0x7f
      RIP: 0033:0x7f189a514887
      RSP: 002b:00007ffdcd899718 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
      RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f189a514887
      RDX: 0000000000000009 RSI: 00007ffdcd899730 RDI: 0000000000000003
      RBP: 00007ffdcd8997a0 R08: 0000000000000000 R09: 00007ffdcd8994b2
      R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffdcda199a8
      R13: 0000000000404af1 R14: 000000000040ad78 R15: 00007f189a7a5040
       </TASK>
      
      The root cause should be the below race:
      
       memory_failure
        try_memory_failure_hugetlb
         me_huge_page
          __page_handle_poison
           dissolve_free_hugetlb_folio
           drain_all_pages -- Buddy page can be isolated e.g. for compaction.
           take_page_off_buddy -- Failed as page is not in the buddy list.
      	     -- Page can be putback into buddy after compaction.
          page_ref_inc -- Leads to buddy page with refcnt = 1.
      
      Then unpoison_memory() can unpoison the page and send the buddy page back
      into buddy list again leading to the above bad page state warning.  And
      bad_page() will call page_mapcount_reset() to remove PageBuddy from buddy
      page leading to later VM_BUG_ON_PAGE(!PageBuddy(page)) when trying to
      allocate this page.
      
      Fix this issue by only treating __page_handle_poison() as successful when
      it returns 1.
      
      Link: https://lkml.kernel.org/r/20240523071217.1696196-1-linmiaohe@huawei.com
      Fixes: ceaf8fbe ("mm, hwpoison: skip raw hwpoison page in freeing 1GB hugepage")
      Signed-off-by: default avatarMiaohe Lin <linmiaohe@huawei.com>
      Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      8cf360b9
    • Yuanyuan Zhong's avatar
      mm: /proc/pid/smaps_rollup: avoid skipping vma after getting mmap_lock again · 6d065f50
      Yuanyuan Zhong authored
      After switching smaps_rollup to use VMA iterator, searching for next entry
      is part of the condition expression of the do-while loop.  So the current
      VMA needs to be addressed before the continue statement.
      
      Otherwise, with some VMAs skipped, userspace observed memory
      consumption from /proc/pid/smaps_rollup will be smaller than the sum of
      the corresponding fields from /proc/pid/smaps.
      
      Link: https://lkml.kernel.org/r/20240523183531.2535436-1-yzhong@purestorage.com
      Fixes: c4c84f06 ("fs/proc/task_mmu: stop using linked list and highest_vm_end")
      Signed-off-by: default avatarYuanyuan Zhong <yzhong@purestorage.com>
      Reviewed-by: default avatarMohamed Khalfella <mkhalfella@purestorage.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      6d065f50
    • Ryusuke Konishi's avatar
      nilfs2: fix potential hang in nilfs_detach_log_writer() · eb85dace
      Ryusuke Konishi authored
      Syzbot has reported a potential hang in nilfs_detach_log_writer() called
      during nilfs2 unmount.
      
      Analysis revealed that this is because nilfs_segctor_sync(), which
      synchronizes with the log writer thread, can be called after
      nilfs_segctor_destroy() terminates that thread, as shown in the call trace
      below:
      
      nilfs_detach_log_writer
        nilfs_segctor_destroy
          nilfs_segctor_kill_thread  --> Shut down log writer thread
          flush_work
            nilfs_iput_work_func
              nilfs_dispose_list
                iput
                  nilfs_evict_inode
                    nilfs_transaction_commit
                      nilfs_construct_segment (if inode needs sync)
                        nilfs_segctor_sync  --> Attempt to synchronize with
                                                log writer thread
                                 *** DEADLOCK ***
      
      Fix this issue by changing nilfs_segctor_sync() so that the log writer
      thread returns normally without synchronizing after it terminates, and by
      forcing tasks that are already waiting to complete once after the thread
      terminates.
      
      The skipped inode metadata flushout will then be processed together in the
      subsequent cleanup work in nilfs_segctor_destroy().
      
      Link: https://lkml.kernel.org/r/20240520132621.4054-4-konishi.ryusuke@gmail.comSigned-off-by: default avatarRyusuke Konishi <konishi.ryusuke@gmail.com>
      Reported-by: syzbot+e3973c409251e136fdd0@syzkaller.appspotmail.com
      Closes: https://syzkaller.appspot.com/bug?extid=e3973c409251e136fdd0Tested-by: default avatarRyusuke Konishi <konishi.ryusuke@gmail.com>
      Cc: <stable@vger.kernel.org>
      Cc: "Bai, Shuangpeng" <sjb7183@psu.edu>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      eb85dace
    • Ryusuke Konishi's avatar
      nilfs2: fix unexpected freezing of nilfs_segctor_sync() · 936184ea
      Ryusuke Konishi authored
      A potential and reproducible race issue has been identified where
      nilfs_segctor_sync() would block even after the log writer thread writes a
      checkpoint, unless there is an interrupt or other trigger to resume log
      writing.
      
      This turned out to be because, depending on the execution timing of the
      log writer thread running in parallel, the log writer thread may skip
      responding to nilfs_segctor_sync(), which causes a call to schedule()
      waiting for completion within nilfs_segctor_sync() to lose the opportunity
      to wake up.
      
      The reason why waking up the task waiting in nilfs_segctor_sync() may be
      skipped is that updating the request generation issued using a shared
      sequence counter and adding an wait queue entry to the request wait queue
      to the log writer, are not done atomically.  There is a possibility that
      log writing and request completion notification by nilfs_segctor_wakeup()
      may occur between the two operations, and in that case, the wait queue
      entry is not yet visible to nilfs_segctor_wakeup() and the wake-up of
      nilfs_segctor_sync() will be carried over until the next request occurs.
      
      Fix this issue by performing these two operations simultaneously within
      the lock section of sc_state_lock.  Also, following the memory barrier
      guidelines for event waiting loops, move the call to set_current_state()
      in the same location into the event waiting loop to ensure that a memory
      barrier is inserted just before the event condition determination.
      
      Link: https://lkml.kernel.org/r/20240520132621.4054-3-konishi.ryusuke@gmail.com
      Fixes: 9ff05123 ("nilfs2: segment constructor")
      Signed-off-by: default avatarRyusuke Konishi <konishi.ryusuke@gmail.com>
      Tested-by: default avatarRyusuke Konishi <konishi.ryusuke@gmail.com>
      Cc: <stable@vger.kernel.org>
      Cc: "Bai, Shuangpeng" <sjb7183@psu.edu>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      936184ea
    • Ryusuke Konishi's avatar
      nilfs2: fix use-after-free of timer for log writer thread · f5d4e046
      Ryusuke Konishi authored
      Patch series "nilfs2: fix log writer related issues".
      
      This bug fix series covers three nilfs2 log writer-related issues,
      including a timer use-after-free issue and potential deadlock issue on
      unmount, and a potential freeze issue in event synchronization found
      during their analysis.  Details are described in each commit log.
      
      
      This patch (of 3):
      
      A use-after-free issue has been reported regarding the timer sc_timer on
      the nilfs_sc_info structure.
      
      The problem is that even though it is used to wake up a sleeping log
      writer thread, sc_timer is not shut down until the nilfs_sc_info structure
      is about to be freed, and is used regardless of the thread's lifetime.
      
      Fix this issue by limiting the use of sc_timer only while the log writer
      thread is alive.
      
      Link: https://lkml.kernel.org/r/20240520132621.4054-1-konishi.ryusuke@gmail.com
      Link: https://lkml.kernel.org/r/20240520132621.4054-2-konishi.ryusuke@gmail.com
      Fixes: fdce895e ("nilfs2: change sc_timer from a pointer to an embedded one in struct nilfs_sc_info")
      Signed-off-by: default avatarRyusuke Konishi <konishi.ryusuke@gmail.com>
      Reported-by: default avatar"Bai, Shuangpeng" <sjb7183@psu.edu>
      Closes: https://groups.google.com/g/syzkaller/c/MK_LYqtt8ko/m/8rgdWeseAwAJTested-by: default avatarRyusuke Konishi <konishi.ryusuke@gmail.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      f5d4e046
    • Michael Ellerman's avatar
      selftests/mm: fix build warnings on ppc64 · 1901472f
      Michael Ellerman authored
      Fix warnings like:
      
        In file included from uffd-unit-tests.c:8:
        uffd-unit-tests.c: In function `uffd_poison_handle_fault':
        uffd-common.h:45:33: warning: format `%llu' expects argument of type
        `long long unsigned int', but argument 3 has type `__u64' {aka `long
        unsigned int'} [-Wformat=]
      
      By switching to unsigned long long for u64 for ppc64 builds.
      
      Link: https://lkml.kernel.org/r/20240521030219.57439-1-mpe@ellerman.id.auSigned-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Cc: Shuah Khan <skhan@linuxfoundation.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      1901472f
    • Will Deacon's avatar
      arm64: patching: fix handling of execmem addresses · b1480ed2
      Will Deacon authored
      Klara Modin reported warnings for a kernel configured with BPF_JIT but
      without MODULES:
      
      [   44.131296] Trying to vfree() bad address (000000004a17c299)
      [   44.138024] WARNING: CPU: 1 PID: 193 at mm/vmalloc.c:3189 remove_vm_area (mm/vmalloc.c:3189 (discriminator 1))
      [   44.146675] CPU: 1 PID: 193 Comm: kworker/1:2 Tainted: G      D W          6.9.0-01786-g2c9e5d4a #25
      [   44.158229] Hardware name: Raspberry Pi 3 Model B (DT)
      [   44.164433] Workqueue: events bpf_prog_free_deferred
      [   44.170492] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
      [   44.178601] pc : remove_vm_area (mm/vmalloc.c:3189 (discriminator 1))
      [   44.183705] lr : remove_vm_area (mm/vmalloc.c:3189 (discriminator 1))
      [   44.188772] sp : ffff800082a13c70
      [   44.193112] x29: ffff800082a13c70 x28: 0000000000000000 x27: 0000000000000000
      [   44.201384] x26: 0000000000000000 x25: ffff00003a44efa0 x24: 00000000d4202000
      [   44.209658] x23: ffff800081223dd0 x22: ffff00003a198a40 x21: ffff8000814dd880
      [   44.217924] x20: 00000000d4202000 x19: ffff8000814dd880 x18: 0000000000000006
      [   44.226206] x17: 0000000000000000 x16: 0000000000000020 x15: 0000000000000002
      [   44.234460] x14: ffff8000811a6370 x13: 0000000020000000 x12: 0000000000000000
      [   44.242710] x11: ffff8000811a6370 x10: 0000000000000144 x9 : ffff8000811fe370
      [   44.250959] x8 : 0000000000017fe8 x7 : 00000000fffff000 x6 : ffff8000811fe370
      [   44.259206] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000000
      [   44.267457] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000002203240
      [   44.275703] Call trace:
      [   44.279158] remove_vm_area (mm/vmalloc.c:3189 (discriminator 1))
      [   44.283858] vfree (mm/vmalloc.c:3322)
      [   44.287835] execmem_free (mm/execmem.c:70)
      [   44.292347] bpf_jit_free_exec+0x10/0x1c
      [   44.297283] bpf_prog_pack_free (kernel/bpf/core.c:1006)
      [   44.302457] bpf_jit_binary_pack_free (kernel/bpf/core.c:1195)
      [   44.307951] bpf_jit_free (include/linux/filter.h:1083 arch/arm64/net/bpf_jit_comp.c:2474)
      [   44.312342] bpf_prog_free_deferred (kernel/bpf/core.c:2785)
      [   44.317785] process_one_work (kernel/workqueue.c:3273)
      [   44.322684] worker_thread (kernel/workqueue.c:3342 (discriminator 2) kernel/workqueue.c:3429 (discriminator 2))
      [   44.327292] kthread (kernel/kthread.c:388)
      [   44.331342] ret_from_fork (arch/arm64/kernel/entry.S:861)
      
      The problem is because bpf_arch_text_copy() silently fails to write to the
      read-only area as a result of patch_map() faulting and the resulting
      -EFAULT being chucked away.
      
      Update patch_map() to use CONFIG_EXECMEM instead of
      CONFIG_STRICT_MODULE_RWX to check for vmalloc addresses.
      
      Link: https://lkml.kernel.org/r/20240521213813.703309-1-rppt@kernel.org
      Fixes: 2c9e5d4a ("bpf: remove CONFIG_BPF_JIT dependency on CONFIG_MODULES of")
      Signed-off-by: default avatarWill Deacon <will@kernel.org>
      Signed-off-by: default avatarMike Rapoport (IBM) <rppt@kernel.org>
      Reported-by: default avatarKlara Modin <klarasmodin@gmail.com>
      Closes: https://lore.kernel.org/all/7983fbbf-0127-457c-9394-8d6e4299c685@gmail.comTested-by: default avatarKlara Modin <klarasmodin@gmail.com>
      Cc: Björn Töpel <bjorn@kernel.org>
      Cc: Luis Chamberlain <mcgrof@kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      b1480ed2
    • Dev Jain's avatar
      selftests/mm: compaction_test: fix bogus test success and reduce probability... · fb9293b6
      Dev Jain authored
      selftests/mm: compaction_test: fix bogus test success and reduce probability of OOM-killer invocation
      
      Reset nr_hugepages to zero before the start of the test.
      
      If a non-zero number of hugepages is already set before the start of the
      test, the following problems arise:
      
       - The probability of the test getting OOM-killed increases.  Proof:
         The test wants to run on 80% of available memory to prevent OOM-killing
         (see original code comments).  Let the value of mem_free at the start
         of the test, when nr_hugepages = 0, be x.  In the other case, when
         nr_hugepages > 0, let the memory consumed by hugepages be y.  In the
         former case, the test operates on 0.8 * x of memory.  In the latter,
         the test operates on 0.8 * (x - y) of memory, with y already filled,
         hence, memory consumed is y + 0.8 * (x - y) = 0.8 * x + 0.2 * y > 0.8 *
         x.  Q.E.D
      
       - The probability of a bogus test success increases.  Proof: Let the
         memory consumed by hugepages be greater than 25% of x, with x and y
         defined as above.  The definition of compaction_index is c_index = (x -
         y)/z where z is the memory consumed by hugepages after trying to
         increase them again.  In check_compaction(), we set the number of
         hugepages to zero, and then increase them back; the probability that
         they will be set back to consume at least y amount of memory again is
         very high (since there is not much delay between the two attempts of
         changing nr_hugepages).  Hence, z >= y > (x/4) (by the 25% assumption).
         Therefore, c_index = (x - y)/z <= (x - y)/y = x/y - 1 < 4 - 1 = 3
         hence, c_index can always be forced to be less than 3, thereby the test
         succeeding always.  Q.E.D
      
      Link: https://lkml.kernel.org/r/20240521074358.675031-4-dev.jain@arm.com
      Fixes: bd67d5c1 ("Test compaction of mlocked memory")
      Signed-off-by: default avatarDev Jain <dev.jain@arm.com>
      Cc: <stable@vger.kernel.org>
      Cc: Anshuman Khandual <anshuman.khandual@arm.com>
      Cc: Shuah Khan <shuah@kernel.org>
      Cc: Sri Jayaramappa <sjayaram@akamai.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      fb9293b6
    • Dev Jain's avatar
      selftests/mm: compaction_test: fix incorrect write of zero to nr_hugepages · 9ad665ef
      Dev Jain authored
      Currently, the test tries to set nr_hugepages to zero, but that is not
      actually done because the file offset is not reset after read().  Fix that
      using lseek().
      
      Link: https://lkml.kernel.org/r/20240521074358.675031-3-dev.jain@arm.com
      Fixes: bd67d5c1 ("Test compaction of mlocked memory")
      Signed-off-by: default avatarDev Jain <dev.jain@arm.com>
      Cc: <stable@vger.kernel.org>
      Cc: Anshuman Khandual <anshuman.khandual@arm.com>
      Cc: Shuah Khan <shuah@kernel.org>
      Cc: Sri Jayaramappa <sjayaram@akamai.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      9ad665ef