Commit 5db9a4d9 authored by Tejun Heo's avatar Tejun Heo

cgroup: fix cgroup hierarchy umount race

48ddbe19 "cgroup: make css->refcnt clearing on cgroup removal
optional" allowed a css to linger after the associated cgroup is
removed.  As a css holds a reference on the cgroup's dentry, it means
that cgroup dentries may linger for a while.

Destroying a superblock which has dentries with positive refcnts is a
critical bug and triggers BUG() in vfs code.  As each cgroup dentry
holds an s_active reference, any lingering cgroup has both its dentry
and the superblock pinned and thus preventing premature release of
superblock.

Unfortunately, after 48ddbe19, there's a small window while
releasing a cgroup which is directly under the root of the hierarchy.
When a cgroup directory is released, vfs layer first deletes the
corresponding dentry and then invokes dput() on the parent, which may
recurse further, so when a cgroup directly below root cgroup is
released, the cgroup is first destroyed - which releases the s_active
it was holding - and then the dentry for the root cgroup is dput().

This creates a window where the root dentry's refcnt isn't zero but
superblock's s_active is.  If umount happens before or during this
window, vfs will see the root dentry with non-zero refcnt and trigger
BUG().

Before 48ddbe19, this problem didn't exist because the last dentry
reference was guaranteed to be put synchronously from rmdir(2)
invocation which holds s_active around the whole process.

Fix it by holding an extra superblock->s_active reference across
dput() from css release, which is the dput() path added by 48ddbe19
and the only one which doesn't hold an extra s_active ref across the
final cgroup dput().
Signed-off-by: default avatarTejun Heo <tj@kernel.org>
LKML-Reference: <4FEEA5CB.8070809@huawei.com>
Reported-by: default avatarshyju pv <shyju.pv@huawei.com>
Tested-by: default avatarshyju pv <shyju.pv@huawei.com>
Cc: Sasha Levin <levinsasha928@gmail.com>
Acked-by: default avatarLi Zefan <lizefan@huawei.com>
parent 7db5b3ca
...@@ -3883,8 +3883,12 @@ static void css_dput_fn(struct work_struct *work) ...@@ -3883,8 +3883,12 @@ static void css_dput_fn(struct work_struct *work)
{ {
struct cgroup_subsys_state *css = struct cgroup_subsys_state *css =
container_of(work, struct cgroup_subsys_state, dput_work); container_of(work, struct cgroup_subsys_state, dput_work);
struct dentry *dentry = css->cgroup->dentry;
struct super_block *sb = dentry->d_sb;
dput(css->cgroup->dentry); atomic_inc(&sb->s_active);
dput(dentry);
deactivate_super(sb);
} }
static void init_cgroup_css(struct cgroup_subsys_state *css, static void init_cgroup_css(struct cgroup_subsys_state *css,
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment