• Baokun Li's avatar
    cachefiles: never get a new anonymous fd if ondemand_id is valid · 4988e35e
    Baokun Li authored
    Now every time the daemon reads an open request, it gets a new anonymous fd
    and ondemand_id. With the introduction of "restore", it is possible to read
    the same open request more than once, and therefore an object can have more
    than one anonymous fd.
    
    If the anonymous fd is not unique, the following concurrencies will result
    in an fd leak:
    
         t1     |         t2         |          t3
    ------------------------------------------------------------
     cachefiles_ondemand_init_object
      cachefiles_ondemand_send_req
       REQ_A = kzalloc(sizeof(*req) + data_len)
       wait_for_completion(&REQ_A->done)
                cachefiles_daemon_read
                 cachefiles_ondemand_daemon_read
                  REQ_A = cachefiles_ondemand_select_req
                  cachefiles_ondemand_get_fd
                    load->fd = fd0
                    ondemand_id = object_id0
                                      ------ restore ------
                                      cachefiles_ondemand_restore
                                       // restore REQ_A
                                      cachefiles_daemon_read
                                       cachefiles_ondemand_daemon_read
                                        REQ_A = cachefiles_ondemand_select_req
                                          cachefiles_ondemand_get_fd
                                            load->fd = fd1
                                            ondemand_id = object_id1
                 process_open_req(REQ_A)
                 write(devfd, ("copen %u,%llu", msg->msg_id, size))
                 cachefiles_ondemand_copen
                  xa_erase(&cache->reqs, id)
                  complete(&REQ_A->done)
       kfree(REQ_A)
                                      process_open_req(REQ_A)
                                      // copen fails due to no req
                                      // daemon close(fd1)
                                      cachefiles_ondemand_fd_release
                                       // set object closed
     -- umount --
     cachefiles_withdraw_cookie
      cachefiles_ondemand_clean_object
       cachefiles_ondemand_init_close_req
        if (!cachefiles_ondemand_object_is_open(object))
          return -ENOENT;
        // The fd0 is not closed until the daemon exits.
    
    However, the anonymous fd holds the reference count of the object and the
    object holds the reference count of the cookie. So even though the cookie
    has been relinquished, it will not be unhashed and freed until the daemon
    exits.
    
    In fscache_hash_cookie(), when the same cookie is found in the hash list,
    if the cookie is set with the FSCACHE_COOKIE_RELINQUISHED bit, then the new
    cookie waits for the old cookie to be unhashed, while the old cookie is
    waiting for the leaked fd to be closed, if the daemon does not exit in time
    it will trigger a hung task.
    
    To avoid this, allocate a new anonymous fd only if no anonymous fd has
    been allocated (ondemand_id == 0) or if the previously allocated anonymous
    fd has been closed (ondemand_id == -1). Moreover, returns an error if
    ondemand_id is valid, letting the daemon know that the current userland
    restore logic is abnormal and needs to be checked.
    
    Fixes: c8383054 ("cachefiles: notify the user daemon when looking up cookie")
    Signed-off-by: default avatarBaokun Li <libaokun1@huawei.com>
    Link: https://lore.kernel.org/r/20240522114308.2402121-9-libaokun@huaweicloud.comAcked-by: default avatarJeff Layton <jlayton@kernel.org>
    Signed-off-by: default avatarChristian Brauner <brauner@kernel.org>
    4988e35e
ondemand.c 17.3 KB