• Tetsuo Handa's avatar
    loop: avoid loop_validate_mutex/lo_mutex in ->release · 158eaeba
    Tetsuo Handa authored
    Since ->release is called with disk->open_mutex held, and __loop_clr_fd()
     from lo_release() is called via ->release when disk_openers() == 0, we are
    guaranteed that "struct file" which will be passed to loop_validate_file()
    via fget() cannot be the loop device __loop_clr_fd(lo, true) will clear.
    Thus, there is no need to hold loop_validate_mutex from __loop_clr_fd()
    if release == true.
    
    When I made commit 3ce6e1f6 ("loop: reintroduce global lock for
    safe loop_validate_file() traversal"), I wrote "It is acceptable for
    loop_validate_file() to succeed, for actual clear operation has not started
    yet.". But now I came to feel why it is acceptable to succeed.
    
    It seems that the loop driver was added in Linux 1.3.68, and
    
      if (lo->lo_refcnt > 1)
        return -EBUSY;
    
    check in loop_clr_fd() was there from the beginning. The intent of this
    check was unclear. But now I think that current
    
      disk_openers(lo->lo_disk) > 1
    
    form is there for three reasons.
    
    (1) Avoid I/O errors when some process which opens and reads from this
        loop device in response to uevent notification (e.g. systemd-udevd),
        as described in commit a1ecac3b ("loop: Make explicit loop
        device destruction lazy"). This opener is short-lived because it is
        likely that the file descriptor used by that process is closed soon.
    
    (2) Avoid I/O errors caused by underlying layer of stacked loop devices
        (i.e. ioctl(some_loop_fd, LOOP_SET_FD, other_loop_fd)) being suddenly
        disappeared. This opener is long-lived because this reference is
        associated with not a file descriptor but lo->lo_backing_file.
    
    (3) Avoid I/O errors caused by underlying layer of mounted loop device
        (i.e. mount(some_loop_device, some_mount_point)) being suddenly
        disappeared. This opener is long-lived because this reference is
        associated with not a file descriptor but mount.
    
    While race in (1) might be acceptable, (2) and (3) should be checked
    racelessly. That is, make sure that __loop_clr_fd() will not run if
    loop_validate_file() succeeds, by doing refcount check with global lock
    held when explicit loop device destruction is requested.
    
    As a result of no longer waiting for lo->lo_mutex after setting Lo_rundown,
    we can remove pointless BUG_ON(lo->lo_state != Lo_rundown) check.
    Signed-off-by: default avatarTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
    Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
    Link: https://lore.kernel.org/r/20220330052917.2566582-14-hch@lst.deSigned-off-by: default avatarJens Axboe <axboe@kernel.dk>
    158eaeba
loop.c 57.7 KB