• Baokun Li's avatar
    cachefiles: fix slab-use-after-free in fscache_withdraw_volume() · 522018a0
    Baokun Li authored
    We got the following issue in our fault injection stress test:
    
    ==================================================================
    BUG: KASAN: slab-use-after-free in fscache_withdraw_volume+0x2e1/0x370
    Read of size 4 at addr ffff88810680be08 by task ondemand-04-dae/5798
    
    CPU: 0 PID: 5798 Comm: ondemand-04-dae Not tainted 6.8.0-dirty #565
    Call Trace:
     kasan_check_range+0xf6/0x1b0
     fscache_withdraw_volume+0x2e1/0x370
     cachefiles_withdraw_volume+0x31/0x50
     cachefiles_withdraw_cache+0x3ad/0x900
     cachefiles_put_unbind_pincount+0x1f6/0x250
     cachefiles_daemon_release+0x13b/0x290
     __fput+0x204/0xa00
     task_work_run+0x139/0x230
    
    Allocated by task 5820:
     __kmalloc+0x1df/0x4b0
     fscache_alloc_volume+0x70/0x600
     __fscache_acquire_volume+0x1c/0x610
     erofs_fscache_register_volume+0x96/0x1a0
     erofs_fscache_register_fs+0x49a/0x690
     erofs_fc_fill_super+0x6c0/0xcc0
     vfs_get_super+0xa9/0x140
     vfs_get_tree+0x8e/0x300
     do_new_mount+0x28c/0x580
     [...]
    
    Freed by task 5820:
     kfree+0xf1/0x2c0
     fscache_put_volume.part.0+0x5cb/0x9e0
     erofs_fscache_unregister_fs+0x157/0x1b0
     erofs_kill_sb+0xd9/0x1c0
     deactivate_locked_super+0xa3/0x100
     vfs_get_super+0x105/0x140
     vfs_get_tree+0x8e/0x300
     do_new_mount+0x28c/0x580
     [...]
    ==================================================================
    
    Following is the process that triggers the issue:
    
            mount failed         |         daemon exit
    ------------------------------------------------------------
     deactivate_locked_super        cachefiles_daemon_release
      erofs_kill_sb
       erofs_fscache_unregister_fs
        fscache_relinquish_volume
         __fscache_relinquish_volume
          fscache_put_volume(fscache_volume, fscache_volume_put_relinquish)
           zero = __refcount_dec_and_test(&fscache_volume->ref, &ref);
                                     cachefiles_put_unbind_pincount
                                      cachefiles_daemon_unbind
                                       cachefiles_withdraw_cache
                                        cachefiles_withdraw_volumes
                                         list_del_init(&volume->cache_link)
           fscache_free_volume(fscache_volume)
            cache->ops->free_volume
             cachefiles_free_volume
              list_del_init(&cachefiles_volume->cache_link);
            kfree(fscache_volume)
                                         cachefiles_withdraw_volume
                                          fscache_withdraw_volume
                                           fscache_volume->n_accesses
                                           // fscache_volume UAF !!!
    
    The fscache_volume in cache->volumes must not have been freed yet, but its
    reference count may be 0. So use the new fscache_try_get_volume() helper
    function try to get its reference count.
    
    If the reference count of fscache_volume is 0, fscache_put_volume() is
    freeing it, so wait for it to be removed from cache->volumes.
    
    If its reference count is not 0, call cachefiles_withdraw_volume() with
    reference count protection to avoid the above issue.
    
    Fixes: fe2140e2 ("cachefiles: Implement volume support")
    Signed-off-by: default avatarBaokun Li <libaokun1@huawei.com>
    Link: https://lore.kernel.org/r/20240628062930.2467993-3-libaokun@huaweicloud.comSigned-off-by: default avatarChristian Brauner <brauner@kernel.org>
    522018a0
cache.c 9.97 KB