    configfs: Silence lockdep on mkdir() and rmdir() · e74cc06d
    Louis Rilling authored
    When attaching default groups (subdirs) of a new group (in mkdir() or
    in configfs_register()), configfs recursively takes the inodes' mutexes
    along the path from the parent of the new group to the default
    subdirs. This is needed to ensure that the VFS will not race with
    operations on these sub-dirs. This is safe for the following reasons:
    
    - the VFS allows one to lock first an inode and second one of its
      children (The lock subclasses for this pattern are respectively
      I_MUTEX_PARENT and I_MUTEX_CHILD);
    - from this rule any inode path can be recursively locked in
      descending order as long as it stays under a single mountpoint and
      does not follow symlinks.
    
    Unfortunately lockdep does not know (yet?) how to handle such
    recursion.
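
    The sketch below is a toy userspace model, not lockdep itself, of why
    a descending path of more than two inodes trips the validator. It
    assumes only that one task may not hold two locks of the same
    (class, subclass) pair at once. All toy_* names are hypothetical, and
    the subclass values are arbitrary stand-ins for I_MUTEX_PARENT and
    I_MUTEX_CHILD.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the two subclasses named above (values are arbitrary). */
enum toy_subclass { TOY_PARENT = 1, TOY_CHILD = 2 };

#define TOY_MAX_HELD 8

/* Per-task record of the subclasses currently held (hypothetical model). */
struct toy_task {
    enum toy_subclass held[TOY_MAX_HELD];
    size_t nr_held;
};

/*
 * Record an acquisition. Returns false when the same subclass is already
 * held by this task (the toy equivalent of a recursive-locking report)
 * or when the table is full.
 */
static bool toy_acquire(struct toy_task *t, enum toy_subclass sc)
{
    assert(t != NULL);
    for (size_t i = 0; i < t->nr_held; i++)
        if (t->held[i] == sc)
            return false;
    if (t->nr_held == TOY_MAX_HELD)
        return false;
    t->held[t->nr_held++] = sc;
    return true;
}

/*
 * Lock a path top-down: the first inode as parent, every inode below it
 * as child. Returns how many inodes were locked before the toy validator
 * objected.
 */
static size_t toy_lock_path(struct toy_task *t, size_t depth)
{
    size_t locked = 0;

    for (size_t i = 0; i < depth; i++) {
        enum toy_subclass sc = (i == 0) ? TOY_PARENT : TOY_CHILD;

        if (!toy_acquire(t, sc))
            break;
        locked++;
    }
    return locked;
}
```

    With a fresh struct toy_task, toy_lock_path() succeeds for a
    parent/child pair, but stops at the third inode of a deeper path
    because the child subclass is already held.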
    
    I've tried to use Peter Zijlstra's lock_set_subclass() helper to
    upgrade i_mutexes from I_MUTEX_CHILD to I_MUTEX_PARENT when we know
    that we might recursively lock some of their descendants, but this
    usage does not seem to fit the purpose of lock_set_subclass() because
    it leads to several i_mutexes locked with subclass I_MUTEX_PARENT by
    the same task.
    
    From inside configfs it is not possible to serialize this recursive
    locking with a top-level lock, because mkdir() and rmdir() are already
    called with inodes locked by the VFS. So using some
    mutex_lock_nest_lock() is not an option.
    
    I am proposing two solutions:
    1) one that wraps recursive mutex_lock()s with
       lockdep_off()/lockdep_on().
    2) (as suggested earlier by Peter Zijlstra) one that puts the
       i_mutexes recursively locked in different classes based on their
       depth from the top-level config_group created. This
       induces an arbitrary limit (MAX_LOCK_DEPTH - 2 == 46) on the
       nesting of configfs default groups whenever lockdep is activated
       but this limit looks reasonably high. Unfortunately, this also
       isolates VFS operations on configfs default groups from the others
       and thus lowers the chances to detect locking issues.
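
    As a rough illustration of solution 2), the userspace sketch below
    maps a default group's depth to a per-depth lock class index. It is
    not the code of this patch. The sketch_* names are hypothetical, the
    value 48 is inferred from "MAX_LOCK_DEPTH - 2 == 46" above, and it is
    an assumption of the sketch that depth 0 (the top-level group) keeps
    its usual class.

```c
#include <assert.h>

/* Assumed value: "MAX_LOCK_DEPTH - 2 == 46" in the text implies 48. */
#define SKETCH_MAX_LOCK_DEPTH  48
#define SKETCH_MAX_GROUP_DEPTH (SKETCH_MAX_LOCK_DEPTH - 2)

static_assert(SKETCH_MAX_GROUP_DEPTH == 46, "limit quoted in the text");

#define SKETCH_NO_CLASS (-1)  /* not a default group: class left alone */
#define SKETCH_TOO_DEEP (-2)  /* nesting exceeds the per-depth classes */

/*
 * Map the depth of a default group, counted from the top-level
 * config_group being created (depth 0), to the index of its own lock
 * class. Depths 1..46 each get a distinct index 0..45.
 */
static int sketch_class_for_depth(int depth)
{
    if (depth <= 0)
        return SKETCH_NO_CLASS;
    if (depth > SKETCH_MAX_GROUP_DEPTH)
        return SKETCH_TOO_DEEP;
    return depth - 1;
}
```

    Because every depth gets its own index, no two mutexes on a
    descending path share a class, which is also why these locks end up
    isolated from the classes used by the rest of the VFS.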
    
    Nobody likes solution 1), which I can understand.
    
    This patch implements solution 2). However, lockdep is still not happy
    with configfs_depend_item(). The next patch reworks the locking of
    configfs_depend_item() and finally makes lockdep happy.
    
    [ Note: This hides a few locking interactions with the VFS from lockdep.
      That was my big concern, because we like lockdep's protection.  However,
      the current state always dumps a spurious warning.  The locking is
      correct, so I tell people to ignore the warning and that we'll keep
      our eyes on the locking to make sure it stays correct.  With this patch,
      we eliminate the warning.  We do lose some of the lockdep protections,
      but this only means that we still have to keep our eyes on the locking.
      We're going to do that anyway.  -- Joel ]
    Signed-off-by: Louis Rilling <louis.rilling@kerlabs.com>
    Signed-off-by: Joel Becker <joel.becker@oracle.com>
inode.c 7.77 KB