Commit 9cd6357f authored by Linus Torvalds

Merge tag 'cgroup-for-6.4-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup

Pull cgroup fixes from Tejun Heo:

 - Fix css_set reference leaks on fork failures

 - Fix CPU hotplug locking in cgroup_transfer_tasks() which is used by
   cgroup1 cpuset

 - Doc update

* tag 'cgroup-for-6.4-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup: Documentation: Clarify usage of memory limits
  cgroup: always put cset in cgroup_css_set_put_fork
  cgroup: fix missing cpus_read_{lock,unlock}() in cgroup_transfer_tasks()
parents 8d15d5e1 5647e53f
Documentation/admin-guide/cgroup-v2.rst
@@ -1213,23 +1213,25 @@ PAGE_SIZE multiple when read back.
 	A read-write single value file which exists on non-root
 	cgroups.  The default is "max".
 
-	Memory usage throttle limit.  This is the main mechanism to
-	control memory usage of a cgroup.  If a cgroup's usage goes
+	Memory usage throttle limit.  If a cgroup's usage goes
 	over the high boundary, the processes of the cgroup are
 	throttled and put under heavy reclaim pressure.
 
 	Going over the high limit never invokes the OOM killer and
-	under extreme conditions the limit may be breached.
+	under extreme conditions the limit may be breached. The high
+	limit should be used in scenarios where an external process
+	monitors the limited cgroup to alleviate heavy reclaim
+	pressure.
 
   memory.max
 	A read-write single value file which exists on non-root
 	cgroups.  The default is "max".
 
-	Memory usage hard limit.  This is the final protection
-	mechanism.  If a cgroup's memory usage reaches this limit and
-	can't be reduced, the OOM killer is invoked in the cgroup.
-	Under certain circumstances, the usage may go over the limit
-	temporarily.
+	Memory usage hard limit.  This is the main mechanism to limit
+	memory usage of a cgroup.  If a cgroup's memory usage reaches
+	this limit and can't be reduced, the OOM killer is invoked in
+	the cgroup. Under certain circumstances, the usage may go
+	over the limit temporarily.
 
 	In default configuration regular 0-order allocations always
 	succeed unless OOM killer chooses current task as a victim.
@@ -1238,10 +1240,6 @@ PAGE_SIZE multiple when read back.
 	Caller could retry them differently, return into userspace
 	as -ENOMEM or silently ignore in cases like disk readahead.
 
-	This is the ultimate protection mechanism.  As long as the
-	high limit is used and monitored properly, this limit's
-	utility is limited to providing the final safety net.
-
   memory.reclaim
 	A write-only nested-keyed file which exists for all cgroups.
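Note (illustrative, not part of the patch): the documentation change above positions memory.high as the throttle/reclaim knob and memory.max as the main hard limit. A minimal user-space sketch of writing these cgroup v2 files follows; the mount point and group name ("/sys/fs/cgroup/example") are assumptions.

/* Illustrative only -- not from the patch.  Sets memory.high and
 * memory.max for a cgroup v2 group; the group path is a made-up example
 * and the group must already exist (mkdir(2) under the cgroup mount). */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static int write_cgroup_file(const char *path, const char *val)
{
	int fd = open(path, O_WRONLY);

	if (fd < 0 || write(fd, val, strlen(val)) < 0) {
		perror(path);
		if (fd >= 0)
			close(fd);
		return -1;
	}
	return close(fd);
}

int main(void)
{
	const char *grp = "/sys/fs/cgroup/example";	/* hypothetical group */
	char path[256];

	/* Throttle limit: heavy reclaim/throttling above this boundary. */
	snprintf(path, sizeof(path), "%s/memory.high", grp);
	if (write_cgroup_file(path, "1073741824"))	/* 1 GiB */
		return 1;

	/* Hard limit: the OOM killer is the last resort above this. */
	snprintf(path, sizeof(path), "%s/memory.max", grp);
	if (write_cgroup_file(path, "2147483648"))	/* 2 GiB */
		return 1;

	return 0;
}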
kernel/cgroup/cgroup-v1.c
@@ -108,7 +108,7 @@ int cgroup_transfer_tasks(struct cgroup *to, struct cgroup *from)
 
 	cgroup_lock();
 
-	percpu_down_write(&cgroup_threadgroup_rwsem);
+	cgroup_attach_lock(true);
 
 	/* all tasks in @from are being moved, all csets are source */
 	spin_lock_irq(&css_set_lock);
@@ -144,7 +144,7 @@ int cgroup_transfer_tasks(struct cgroup *to, struct cgroup *from)
 	} while (task && !ret);
 out_err:
 	cgroup_migrate_finish(&mgctx);
-	percpu_up_write(&cgroup_threadgroup_rwsem);
+	cgroup_attach_unlock(true);
 	cgroup_unlock();
 	return ret;
 }
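Note (illustrative, not part of the patch): cgroup_attach_lock()/cgroup_attach_unlock() are the existing attach-path helpers this hunk switches to. Paraphrased from memory of kernel/cgroup/cgroup.c in this series (a sketch, not verbatim source), they nest the threadgroup rwsem inside the CPU hotplug read lock, which supplies the cpus_read_{lock,unlock}() pair that cgroup_transfer_tasks() was missing:

/* Paraphrased sketch, not verbatim kernel code: the helpers take
 * cpus_read_lock() first, then optionally the threadgroup rwsem, and
 * release them in the reverse order. */
void cgroup_attach_lock(bool lock_threadgroup)
{
	cpus_read_lock();
	if (lock_threadgroup)
		percpu_down_write(&cgroup_threadgroup_rwsem);
}

void cgroup_attach_unlock(bool lock_threadgroup)
{
	if (lock_threadgroup)
		percpu_up_write(&cgroup_threadgroup_rwsem);
	cpus_read_unlock();
}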
kernel/cgroup/cgroup.c
@@ -6486,19 +6486,18 @@ static int cgroup_css_set_fork(struct kernel_clone_args *kargs)
 static void cgroup_css_set_put_fork(struct kernel_clone_args *kargs)
 	__releases(&cgroup_threadgroup_rwsem) __releases(&cgroup_mutex)
 {
-	cgroup_threadgroup_change_end(current);
-
-	if (kargs->flags & CLONE_INTO_CGROUP) {
-		struct cgroup *cgrp = kargs->cgrp;
-		struct css_set *cset = kargs->cset;
+	struct cgroup *cgrp = kargs->cgrp;
+	struct css_set *cset = kargs->cset;
 
-		cgroup_unlock();
+	cgroup_threadgroup_change_end(current);
 
-		if (cset) {
-			put_css_set(cset);
-			kargs->cset = NULL;
-		}
+	if (cset) {
+		put_css_set(cset);
+		kargs->cset = NULL;
+	}
 
+	if (kargs->flags & CLONE_INTO_CGROUP) {
+		cgroup_unlock();
 		if (cgrp) {
 			cgroup_put(cgrp);
 			kargs->cgrp = NULL;
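Note (illustrative, not part of the patch): the hunk above makes the fork error path drop the css_set reference whether or not CLONE_INTO_CGROUP was used. For reference, a minimal user-space sketch of the clone3() + CLONE_INTO_CGROUP interface involved; the cgroup directory path is a made-up example.

/* Illustrative only: clone3() with CLONE_INTO_CGROUP starts the child
 * directly in the cgroup referred to by a directory fd.  The cgroup
 * path is hypothetical. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/sched.h>	/* struct clone_args, CLONE_INTO_CGROUP */
#include <signal.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
	int cgfd = open("/sys/fs/cgroup/example", O_RDONLY | O_DIRECTORY);

	if (cgfd < 0) {
		perror("open cgroup dir");
		return 1;
	}

	struct clone_args args = {
		.flags		= CLONE_INTO_CGROUP,
		.exit_signal	= SIGCHLD,
		.cgroup		= (__u64)cgfd,
	};

	long pid = syscall(SYS_clone3, &args, sizeof(args));

	if (pid < 0) {
		/* Fork failed: the kernel error path cleans up cgroup refs. */
		perror("clone3");
		return 1;
	}
	if (pid == 0) {
		/* Child: already a member of the target cgroup. */
		execlp("cat", "cat", "/proc/self/cgroup", (char *)NULL);
		_exit(127);
	}
	waitpid((pid_t)pid, NULL, 0);
	return 0;
}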