• Linus Torvalds's avatar
    proc: fix missing conversion to 'iterate_shared' · 0a2c2baa
    Linus Torvalds authored
    I'm looking at the directory handling due to the discussion about f_pos
    locking (see commit 79796425: "file: reinstate f_pos locking
    optimization for regular files"), and wanting to clean that up.
    
    And one source of ugliness is how we were supposed to move filesystems
    over to the '->iterate_shared()' function that only takes the inode lock
    for reading many many years ago, but several filesystems still use the
    bad old '->iterate()' that takes the inode lock for exclusive access.
    
    See commit 61922694 ("introduce a parallel variant of ->iterate()")
    that also added some documentation stating
    
          Old method is only used if the new one is absent; eventually it will
          be removed.  Switch while you still can; the old one won't stay.
    
    and that was back in April 2016.  Here we are, many years later, and the
    old version is still clearly sadly alive and well.
    
    Now, some of those old style iterators are probably just because the
    filesystem may end up having per-inode mutable data that it uses for
    iterating a directory, but at least one case is just a mistake.
    
    Al switched over most filesystems to use '->iterate_shared()' back when
    it was introduced.  In particular, the /proc filesystem was converted as
    one of the first ones in commit f50752ea ("switch all procfs
    directories ->iterate_shared()").
    
    But then later one new user of '->iterate()' was then re-introduced by
    commit 6d9c939d ("procfs: add smack subdir to attrs").
    
    And that's clearly not what we wanted, since that new case just uses the
    same 'proc_pident_readdir()' and 'proc_pident_lookup()' helper functions
    that other /proc pident directories use, and they are most definitely
    safe to use with the inode lock held shared.
    
    So just fix it.
    
    This still leaves a fair number of oddball filesystems using the
    old-style directory iterator (ceph, coda, exfat, jfs, ntfs, ocfs2,
    overlayfs, and vboxsf), but at least we don't have any remaining in the
    core filesystems.
    
    I'm going to add a wrapper function that just drops the read-lock and
    takes it as a write lock, so that we can clean up the core vfs layer and
    make all the ugly 'this filesystem needs exclusive inode locking' be
    just filesystem-internal warts.
    
    I just didn't want to make that conversion when we still had a core user
    left.
    Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: default avatarChristian Brauner <brauner@kernel.org>
    0a2c2baa
base.c 93.9 KB