Commit 69234ace authored by Linus Torvalds

Merge branch 'for-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup

Pull cgroup updates from Tejun Heo:
 "The cgroup core saw several significant updates this cycle:

   - percpu_rwsem for threadgroup locking is reinstated.  This was
     temporarily dropped due to down_write latency issues.  Oleg's
     rework of percpu_rwsem, which is scheduled to be merged in this
     merge window, resolves the issue.

   - On the v2 hierarchy, when controllers are enabled and disabled, all
     operations are atomic and can fail and revert cleanly.  This allows
     ->can_attach() failure which is necessary for cpu RT slices.

   - Tasks now stay associated with the original cgroups after exit
     until released.  This allows tracking resources held by zombies
     (e.g.  pids) and makes it easy to find out where zombies came from
     on the v2 hierarchy.  The pids controller was broken before these
     changes as zombies escaped the limits; unfortunately, updating this
     behavior required too many invasive changes and I don't think it's
     a good idea to backport them, so the pids controller on 4.3, the
     first version which included the pids controller, will stay broken
     at least until I'm sure about the cgroup core changes.

   - Optimization of a couple of common tests using static_key"

* 'for-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (38 commits)
  cgroup: fix race condition around termination check in css_task_iter_next()
  blkcg: don't create "io.stat" on the root cgroup
  cgroup: drop cgroup__DEVEL__legacy_files_on_dfl
  cgroup: replace error handling in cgroup_init() with WARN_ON()s
  cgroup: add cgroup_subsys->free() method and use it to fix pids controller
  cgroup: keep zombies associated with their original cgroups
  cgroup: make css_set_rwsem a spinlock and rename it to css_set_lock
  cgroup: don't hold css_set_rwsem across css task iteration
  cgroup: reorganize css_task_iter functions
  cgroup: factor out css_set_move_task()
  cgroup: keep css_set and task lists in chronological order
  cgroup: make cgroup_destroy_locked() test cgroup_is_populated()
  cgroup: make css_sets pin the associated cgroups
  cgroup: relocate cgroup_[try]get/put()
  cgroup: move check_for_release() invocation
  cgroup: replace cgroup_has_tasks() with cgroup_is_populated()
  cgroup: make cgroup->nr_populated count the number of populated css_sets
  cgroup: remove an unused parameter from cgroup_task_migrate()
  cgroup: fix too early usage of static_branch_disable()
  cgroup: make cgroup_update_dfl_csses() migrate all target processes atomically
  ...
parents 11eaaadb d5745675
@@ -637,6 +637,10 @@ void exit(struct task_struct *task)
 
     Called during task exit.
 
+void free(struct task_struct *task)
+
+    Called when the task_struct is freed.
+
 void bind(struct cgroup *root)
 (cgroup_mutex held by caller)
...
@@ -107,12 +107,6 @@ root of unified hierarchy can be bound to other hierarchies.  This
 allows mixing unified hierarchy with the traditional multiple
 hierarchies in a fully backward compatible way.
 
-For development purposes, the following boot parameter makes all
-controllers to appear on the unified hierarchy whether supported or
-not.
-
- cgroup__DEVEL__legacy_files_on_dfl
-
 A controller can be moved across hierarchies only after the controller
 is no longer referenced in its current hierarchy.  Because per-cgroup
 controller states are destroyed asynchronously and controllers may
@@ -341,11 +335,11 @@ is riddled with issues.
 unnecessarily complicated and probably done this way because event
 delivery itself was expensive.
 
-Unified hierarchy implements an interface file "cgroup.populated"
-which can be used to monitor whether the cgroup's subhierarchy has
-tasks in it or not.  Its value is 0 if there is no task in the cgroup
-and its descendants; otherwise, 1.  poll and [id]notify events are
-triggered when the value changes.
+Unified hierarchy implements "populated" field in "cgroup.events"
+interface file which can be used to monitor whether the cgroup's
+subhierarchy has tasks in it or not.  Its value is 0 if there is no
+task in the cgroup and its descendants; otherwise, 1.  poll and
+[id]notify events are triggered when the value changes.
 
 This is significantly lighter and simpler and trivially allows
 delegating management of subhierarchy - subhierarchy monitoring can
@@ -374,6 +368,10 @@ supported and the interface files "release_agent" and
 - The "cgroup.clone_children" file is removed.
 
+- /proc/PID/cgroup keeps reporting the cgroup that a zombie belonged
+  to before exiting.  If the cgroup is removed before the zombie is
+  reaped, " (deleted)" is appended to the path.
+
 5-3. Controller File Conventions
@@ -435,6 +433,11 @@ may be specified in any order and not all pairs have to be specified.
 the first entry in the file.  Specific entries can use "default" as
 its value to indicate inheritance of the default value.
 
+- For events which are not very high frequency, an interface file
+  "events" should be created which lists event key value pairs.
+  Whenever a notifiable event happens, file modified event should be
+  generated on the file.
+
 5-4. Per-Controller Changes
...
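The "populated" notification documented above is consumed from userspace by
re-reading "cgroup.events" after a poll wakeup.  A minimal illustrative sketch
follows; the mount point and cgroup path are assumptions for illustration, not
something defined by this series:

    /* Sketch: watch the "populated" field of a v2 cgroup.
     * The path is made up and error handling is minimal.
     */
    #include <fcntl.h>
    #include <poll.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        char buf[128];
        ssize_t n;
        int fd = open("/sys/fs/cgroup/test/cgroup.events", O_RDONLY);
        struct pollfd pfd = { .fd = fd, .events = POLLPRI };

        if (fd < 0)
            return 1;
        for (;;) {
            /* re-read from offset 0 each time the file changes */
            n = pread(fd, buf, sizeof(buf) - 1, 0);
            if (n <= 0)
                break;
            buf[n] = '\0';
            fputs(buf, stdout);         /* e.g. "populated 1" */
            /* kernfs_notify() wakes pollers with POLLPRI | POLLERR */
            poll(&pfd, 1, -1);
        }
        close(fd);
        return 0;
    }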
@@ -899,6 +899,7 @@ static int blkcg_print_stat(struct seq_file *sf, void *v)
 struct cftype blkcg_files[] = {
     {
         .name = "stat",
+        .flags = CFTYPE_NOT_ON_ROOT,
         .seq_show = blkcg_print_stat,
     },
     { } /* terminate */
...
@@ -369,7 +369,7 @@ static void throtl_pd_init(struct blkg_policy_data *pd)
      * regardless of the position of the group in the hierarchy.
      */
     sq->parent_sq = &td->service_queue;
-    if (cgroup_on_dfl(blkg->blkcg->css.cgroup) && blkg->parent)
+    if (cgroup_subsys_on_dfl(io_cgrp_subsys) && blkg->parent)
         sq->parent_sq = &blkg_to_tg(blkg->parent)->service_queue;
     tg->td = td;
 }
...
@@ -1581,7 +1581,7 @@ static struct blkcg_policy_data *cfq_cpd_alloc(gfp_t gfp)
 static void cfq_cpd_init(struct blkcg_policy_data *cpd)
 {
     struct cfq_group_data *cgd = cpd_to_cfqgd(cpd);
-    unsigned int weight = cgroup_on_dfl(blkcg_root.css.cgroup) ?
+    unsigned int weight = cgroup_subsys_on_dfl(io_cgrp_subsys) ?
                   CGROUP_WEIGHT_DFL : CFQ_WEIGHT_LEGACY_DFL;
     if (cpd_to_blkcg(cpd) == &blkcg_root)
@@ -1599,7 +1599,7 @@ static void cfq_cpd_free(struct blkcg_policy_data *cpd)
 static void cfq_cpd_bind(struct blkcg_policy_data *cpd)
 {
     struct blkcg *blkcg = cpd_to_blkcg(cpd);
-    bool on_dfl = cgroup_on_dfl(blkcg_root.css.cgroup);
+    bool on_dfl = cgroup_subsys_on_dfl(io_cgrp_subsys);
     unsigned int weight = on_dfl ? CGROUP_WEIGHT_DFL : CFQ_WEIGHT_LEGACY_DFL;
     if (blkcg == &blkcg_root)
...
@@ -13,7 +13,6 @@
 #include <linux/sched.h>
 #include <linux/blkdev.h>
 #include <linux/writeback.h>
-#include <linux/memcontrol.h>
 #include <linux/blk-cgroup.h>
 #include <linux/backing-dev-defs.h>
 #include <linux/slab.h>
@@ -267,8 +266,8 @@ static inline bool inode_cgwb_enabled(struct inode *inode)
 {
     struct backing_dev_info *bdi = inode_to_bdi(inode);
-    return cgroup_on_dfl(mem_cgroup_root_css->cgroup) &&
-           cgroup_on_dfl(blkcg_root_css->cgroup) &&
+    return cgroup_subsys_on_dfl(memory_cgrp_subsys) &&
+           cgroup_subsys_on_dfl(io_cgrp_subsys) &&
            bdi_cap_account_dirty(bdi) &&
            (bdi->capabilities & BDI_CAP_CGROUP_WRITEBACK) &&
            (inode->i_sb->s_iflags & SB_I_CGROUPWB);
...
@@ -76,12 +76,24 @@ enum {
     CFTYPE_ONLY_ON_ROOT = (1 << 0),  /* only create on root cgrp */
     CFTYPE_NOT_ON_ROOT  = (1 << 1),  /* don't create on root cgrp */
     CFTYPE_NO_PREFIX    = (1 << 3),  /* (DON'T USE FOR NEW FILES) no subsys prefix */
+    CFTYPE_WORLD_WRITABLE = (1 << 4), /* (DON'T USE FOR NEW FILES) S_IWUGO */
     /* internal flags, do not use outside cgroup core proper */
     __CFTYPE_ONLY_ON_DFL = (1 << 16), /* only on default hierarchy */
     __CFTYPE_NOT_ON_DFL  = (1 << 17), /* not on default hierarchy */
 };
+/*
+ * cgroup_file is the handle for a file instance created in a cgroup which
+ * is used, for example, to generate file changed notifications.  This can
+ * be obtained by setting cftype->file_offset.
+ */
+struct cgroup_file {
+    /* do not access any fields from outside cgroup core */
+    struct list_head node;      /* anchored at css->files */
+    struct kernfs_node *kn;
+};
 /*
  * Per-subsystem/per-cgroup state maintained by the system.  This is the
  * fundamental structural building block that controllers deal with.
@@ -122,6 +134,9 @@ struct cgroup_subsys_state {
      */
     u64 serial_nr;
+    /* all cgroup_files associated with this css */
+    struct list_head files;
+
     /* percpu_ref killing and RCU release */
     struct rcu_head rcu_head;
     struct work_struct destroy_work;
@@ -196,6 +211,9 @@ struct css_set {
      */
     struct list_head e_cset_node[CGROUP_SUBSYS_COUNT];
+    /* all css_task_iters currently walking this cset */
+    struct list_head task_iters;
+
     /* For RCU-protected deletion */
     struct rcu_head rcu_head;
 };
@@ -217,16 +235,16 @@ struct cgroup {
     int id;
     /*
-     * If this cgroup contains any tasks, it contributes one to
-     * populated_cnt.  All children with non-zero popuplated_cnt of
-     * their own contribute one.  The count is zero iff there's no task
-     * in this cgroup or its subtree.
+     * Each non-empty css_set associated with this cgroup contributes
+     * one to populated_cnt.  All children with non-zero populated_cnt
+     * of their own contribute one.  The count is zero iff there's no
+     * task in this cgroup or its subtree.
      */
     int populated_cnt;
     struct kernfs_node *kn;             /* cgroup kernfs entry */
-    struct kernfs_node *procs_kn;       /* kn for "cgroup.procs" */
-    struct kernfs_node *populated_kn;   /* kn for "cgroup.subtree_populated" */
+    struct cgroup_file procs_file;      /* handle for "cgroup.procs" */
+    struct cgroup_file events_file;     /* handle for "cgroup.events" */
     /*
      * The bitmask of subsystems enabled on the child cgroups.
@@ -324,11 +342,6 @@ struct cftype {
      */
     char name[MAX_CFTYPE_NAME];
     unsigned long private;
-    /*
-     * If not 0, file mode is set to this value, otherwise it will
-     * be figured out automatically
-     */
-    umode_t mode;
     /*
      * The maximum length of string, excluding trailing nul, that can
@@ -339,6 +352,14 @@ struct cftype {
     /* CFTYPE_* flags */
     unsigned int flags;
+    /*
+     * If non-zero, should contain the offset from the start of css to
+     * a struct cgroup_file field.  cgroup will record the handle of
+     * the created file into it.  The recorded handle can be used as
+     * long as the containing css remains accessible.
+     */
+    unsigned int file_offset;
+
     /*
      * Fields used for internal bookkeeping.  Initialized automatically
      * during registration.
@@ -414,12 +435,10 @@ struct cgroup_subsys {
     int (*can_fork)(struct task_struct *task, void **priv_p);
     void (*cancel_fork)(struct task_struct *task, void *priv);
     void (*fork)(struct task_struct *task, void *priv);
-    void (*exit)(struct cgroup_subsys_state *css,
-                 struct cgroup_subsys_state *old_css,
-                 struct task_struct *task);
+    void (*exit)(struct task_struct *task);
+    void (*free)(struct task_struct *task);
     void (*bind)(struct cgroup_subsys_state *root_css);
-    int disabled;
     int early_init;
     /*
@@ -473,8 +492,31 @@ struct cgroup_subsys {
     unsigned int depends_on;
 };
-void cgroup_threadgroup_change_begin(struct task_struct *tsk);
-void cgroup_threadgroup_change_end(struct task_struct *tsk);
+extern struct percpu_rw_semaphore cgroup_threadgroup_rwsem;
+
+/**
+ * cgroup_threadgroup_change_begin - threadgroup exclusion for cgroups
+ * @tsk: target task
+ *
+ * Called from threadgroup_change_begin() and allows cgroup operations to
+ * synchronize against threadgroup changes using a percpu_rw_semaphore.
+ */
+static inline void cgroup_threadgroup_change_begin(struct task_struct *tsk)
+{
+    percpu_down_read(&cgroup_threadgroup_rwsem);
+}
+
+/**
+ * cgroup_threadgroup_change_end - threadgroup exclusion for cgroups
+ * @tsk: target task
+ *
+ * Called from threadgroup_change_end().  Counterpart of
+ * cgroup_threadgroup_change_begin().
+ */
+static inline void cgroup_threadgroup_change_end(struct task_struct *tsk)
+{
+    percpu_up_read(&cgroup_threadgroup_rwsem);
+}
 #else   /* CONFIG_CGROUPS */
...
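For orientation, this is how the new cftype->file_offset / struct cgroup_file
pair introduced above is meant to be wired up by a controller; memcg does
exactly this for "memory.events" further down in this merge.  The "foo" names
here are invented for illustration only:

    /* Hypothetical controller-side usage of cftype->file_offset and
     * cgroup_file_notify(); all "foo" identifiers are made up.
     */
    #include <linux/cgroup.h>

    struct foo_cgroup {
        struct cgroup_subsys_state css;     /* first member, so the offset
                                               below is also the offset from
                                               the css as required */
        struct cgroup_file events_file;     /* handle recorded by cgroup core */
    };

    static int foo_events_show(struct seq_file *sf, void *v)
    {
        /* print "key value" pairs per the convention documented above */
        return 0;
    }

    static struct cftype foo_files[] = {
        {
            .name        = "events",
            .flags       = CFTYPE_NOT_ON_ROOT,
            .file_offset = offsetof(struct foo_cgroup, events_file),
            .seq_show    = foo_events_show,
        },
        { } /* terminate */
    };

    static void foo_event_occurred(struct foo_cgroup *foo)
    {
        /* wake up poll/[id]notify watchers of the "foo.events" file */
        cgroup_file_notify(&foo->events_file);
    }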
@@ -13,10 +13,10 @@
 #include <linux/nodemask.h>
 #include <linux/rculist.h>
 #include <linux/cgroupstats.h>
-#include <linux/rwsem.h>
 #include <linux/fs.h>
 #include <linux/seq_file.h>
 #include <linux/kernfs.h>
+#include <linux/jump_label.h>
 #include <linux/cgroup-defs.h>
@@ -41,6 +41,10 @@ struct css_task_iter {
     struct list_head *task_pos;
     struct list_head *tasks_head;
     struct list_head *mg_tasks_head;
+
+    struct css_set *cur_cset;
+    struct task_struct *cur_task;
+    struct list_head iters_node;    /* css_set->task_iters */
 };
 extern struct cgroup_root cgrp_dfl_root;
@@ -50,6 +54,26 @@ extern struct css_set init_css_set;
 #include <linux/cgroup_subsys.h>
 #undef SUBSYS
+#define SUBSYS(_x) \
+    extern struct static_key_true _x ## _cgrp_subsys_enabled_key; \
+    extern struct static_key_true _x ## _cgrp_subsys_on_dfl_key;
+#include <linux/cgroup_subsys.h>
+#undef SUBSYS
+
+/**
+ * cgroup_subsys_enabled - fast test on whether a subsys is enabled
+ * @ss: subsystem in question
+ */
+#define cgroup_subsys_enabled(ss) \
+    static_branch_likely(&ss ## _enabled_key)
+
+/**
+ * cgroup_subsys_on_dfl - fast test on whether a subsys is on default hierarchy
+ * @ss: subsystem in question
+ */
+#define cgroup_subsys_on_dfl(ss) \
+    static_branch_likely(&ss ## _on_dfl_key)
+
 bool css_has_online_children(struct cgroup_subsys_state *css);
 struct cgroup_subsys_state *css_from_id(int id, struct cgroup_subsys *ss);
 struct cgroup_subsys_state *cgroup_get_e_css(struct cgroup *cgroup,
@@ -78,6 +102,7 @@ extern void cgroup_cancel_fork(struct task_struct *p,
 extern void cgroup_post_fork(struct task_struct *p,
                              void *old_ss_priv[CGROUP_CANFORK_COUNT]);
 void cgroup_exit(struct task_struct *p);
+void cgroup_free(struct task_struct *p);
 int cgroup_init_early(void);
 int cgroup_init(void);
@@ -211,11 +236,33 @@ void css_task_iter_end(struct css_task_iter *it);
  * cgroup_taskset_for_each - iterate cgroup_taskset
  * @task: the loop cursor
  * @tset: taskset to iterate
+ *
+ * @tset may contain multiple tasks and they may belong to multiple
+ * processes.  When there are multiple tasks in @tset, if a task of a
+ * process is in @tset, all tasks of the process are in @tset.  Also, all
+ * are guaranteed to share the same source and destination csses.
+ *
+ * Iteration is not in any specific order.
  */
 #define cgroup_taskset_for_each(task, tset) \
     for ((task) = cgroup_taskset_first((tset)); (task); \
          (task) = cgroup_taskset_next((tset)))
+
+/**
+ * cgroup_taskset_for_each_leader - iterate group leaders in a cgroup_taskset
+ * @leader: the loop cursor
+ * @tset: taskset to iterate
+ *
+ * Iterate threadgroup leaders of @tset.  For single-task migrations, @tset
+ * may not contain any.
+ */
+#define cgroup_taskset_for_each_leader(leader, tset) \
+    for ((leader) = cgroup_taskset_first((tset)); (leader); \
+         (leader) = cgroup_taskset_next((tset))) \
+        if ((leader) != (leader)->group_leader) \
+            ; \
+        else
+
 /*
  * Inline functions.
  */
@@ -320,11 +367,11 @@ static inline void css_put_many(struct cgroup_subsys_state *css, unsigned int n)
  */
 #ifdef CONFIG_PROVE_RCU
 extern struct mutex cgroup_mutex;
-extern struct rw_semaphore css_set_rwsem;
+extern spinlock_t css_set_lock;
 #define task_css_set_check(task, __c) \
     rcu_dereference_check((task)->cgroups, \
         lockdep_is_held(&cgroup_mutex) || \
-        lockdep_is_held(&css_set_rwsem) || \
+        lockdep_is_held(&css_set_lock) || \
         ((task)->flags & PF_EXITING) || (__c))
 #else
 #define task_css_set_check(task, __c) \
@@ -412,68 +459,10 @@ static inline struct cgroup *task_cgroup(struct task_struct *task,
     return task_css(task, subsys_id)->cgroup;
 }
-/**
- * cgroup_on_dfl - test whether a cgroup is on the default hierarchy
- * @cgrp: the cgroup of interest
- *
- * The default hierarchy is the v2 interface of cgroup and this function
- * can be used to test whether a cgroup is on the default hierarchy for
- * cases where a subsystem should behave differnetly depending on the
- * interface version.
- *
- * The set of behaviors which change on the default hierarchy are still
- * being determined and the mount option is prefixed with __DEVEL__.
- *
- * List of changed behaviors:
- *
- * - Mount options "noprefix", "xattr", "clone_children", "release_agent"
- *   and "name" are disallowed.
- *
- * - When mounting an existing superblock, mount options should match.
- *
- * - Remount is disallowed.
- *
- * - rename(2) is disallowed.
- *
- * - "tasks" is removed.  Everything should be at process granularity.  Use
- *   "cgroup.procs" instead.
- *
- * - "cgroup.procs" is not sorted.  pids will be unique unless they got
- *   recycled inbetween reads.
- *
- * - "release_agent" and "notify_on_release" are removed.  Replacement
- *   notification mechanism will be implemented.
- *
- * - "cgroup.clone_children" is removed.
- *
- * - "cgroup.subtree_populated" is available.  Its value is 0 if the cgroup
- *   and its descendants contain no task; otherwise, 1.  The file also
- *   generates kernfs notification which can be monitored through poll and
- *   [di]notify when the value of the file changes.
- *
- * - cpuset: tasks will be kept in empty cpusets when hotplug happens and
- *   take masks of ancestors with non-empty cpus/mems, instead of being
- *   moved to an ancestor.
- *
- * - cpuset: a task can be moved into an empty cpuset, and again it takes
- *   masks of ancestors.
- *
- * - memcg: use_hierarchy is on by default and the cgroup file for the flag
- *   is not created.
- *
- * - blkcg: blk-throttle becomes properly hierarchical.
- *
- * - debug: disallowed on the default hierarchy.
- */
-static inline bool cgroup_on_dfl(const struct cgroup *cgrp)
-{
-    return cgrp->root == &cgrp_dfl_root;
-}
-
 /* no synchronization, the result can only be used as a hint */
-static inline bool cgroup_has_tasks(struct cgroup *cgrp)
+static inline bool cgroup_is_populated(struct cgroup *cgrp)
 {
-    return !list_empty(&cgrp->cset_links);
+    return cgrp->populated_cnt;
 }
 /* returns ino associated with a cgroup */
@@ -527,6 +516,19 @@ static inline void pr_cont_cgroup_path(struct cgroup *cgrp)
     pr_cont_kernfs_path(cgrp->kn);
 }
+/**
+ * cgroup_file_notify - generate a file modified event for a cgroup_file
+ * @cfile: target cgroup_file
+ *
+ * @cfile must have been obtained by setting cftype->file_offset.
+ */
+static inline void cgroup_file_notify(struct cgroup_file *cfile)
+{
+    /* might not have been created due to one of the CFTYPE selector flags */
+    if (cfile->kn)
+        kernfs_notify(cfile->kn);
+}
+
 #else /* !CONFIG_CGROUPS */
 struct cgroup_subsys_state;
@@ -546,6 +548,7 @@ static inline void cgroup_cancel_fork(struct task_struct *p,
 static inline void cgroup_post_fork(struct task_struct *p,
                                     void *ss_priv[CGROUP_CANFORK_COUNT]) {}
 static inline void cgroup_exit(struct task_struct *p) {}
+static inline void cgroup_free(struct task_struct *p) {}
 static inline int cgroup_init_early(void) { return 0; }
 static inline int cgroup_init(void) { return 0; }
...
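The controller-side conversions later in this merge (hugetlb, memcg, blkcg,
cpuset) all go through the two static-branch macros added above.  As a rough
illustration of what the preprocessor produces (not code from the patch):

    /* SUBSYS(memory) in cgroup_subsys.h makes the declarations above expand to:
     *
     *     extern struct static_key_true memory_cgrp_subsys_enabled_key;
     *     extern struct static_key_true memory_cgrp_subsys_on_dfl_key;
     *
     * so a controller-side test such as
     *
     *     if (!cgroup_subsys_enabled(memory_cgrp_subsys))
     *         return;
     *
     * becomes
     *
     *     if (!static_branch_likely(&memory_cgrp_subsys_enabled_key))
     *         return;
     *
     * i.e. a patchable jump label instead of loading the old
     * cgroup_subsys->disabled integer that this series removes.
     */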
@@ -48,9 +48,7 @@ int set_hugetlb_cgroup(struct page *page, struct hugetlb_cgroup *h_cg)
 static inline bool hugetlb_cgroup_disabled(void)
 {
-    if (hugetlb_cgrp_subsys.disabled)
-        return true;
-    return false;
+    return !cgroup_subsys_enabled(hugetlb_cgrp_subsys);
 }
 extern int hugetlb_cgroup_charge_cgroup(int idx, unsigned long nr_pages,
...
@@ -25,13 +25,6 @@
 extern struct files_struct init_files;
 extern struct fs_struct init_fs;
-#ifdef CONFIG_CGROUPS
-#define INIT_GROUP_RWSEM(sig) \
-    .group_rwsem = __RWSEM_INITIALIZER(sig.group_rwsem),
-#else
-#define INIT_GROUP_RWSEM(sig)
-#endif
-
 #ifdef CONFIG_CPUSETS
 #define INIT_CPUSET_SEQ(tsk) \
     .mems_allowed_seq = SEQCNT_ZERO(tsk.mems_allowed_seq),
@@ -65,7 +58,6 @@ extern struct fs_struct init_fs;
     INIT_PREV_CPUTIME(sig) \
     .cred_guard_mutex = \
         __MUTEX_INITIALIZER(sig.cred_guard_mutex), \
-    INIT_GROUP_RWSEM(sig) \
 }
 extern struct nsproxy init_nsproxy;
...
@@ -214,11 +214,6 @@ static inline int jump_label_apply_nops(struct module *mod)
 #define STATIC_KEY_INIT STATIC_KEY_INIT_FALSE
 #define jump_label_enabled static_key_enabled
-static inline bool static_key_enabled(struct static_key *key)
-{
-    return static_key_count(key) > 0;
-}
-
 static inline void static_key_enable(struct static_key *key)
 {
     int count = static_key_count(key);
@@ -265,6 +260,17 @@ struct static_key_false {
 #define DEFINE_STATIC_KEY_FALSE(name) \
     struct static_key_false name = STATIC_KEY_FALSE_INIT
+extern bool ____wrong_branch_error(void);
+
+#define static_key_enabled(x) \
+({ \
+    if (!__builtin_types_compatible_p(typeof(*x), struct static_key) && \
+        !__builtin_types_compatible_p(typeof(*x), struct static_key_true) &&\
+        !__builtin_types_compatible_p(typeof(*x), struct static_key_false)) \
+        ____wrong_branch_error(); \
+    static_key_count((struct static_key *)x) > 0; \
+})
+
 #ifdef HAVE_JUMP_LABEL
 /*
@@ -323,8 +329,6 @@ struct static_key_false {
  * See jump_label_type() / jump_label_init_type().
  */
-extern bool ____wrong_branch_error(void);
-
 #define static_branch_likely(x) \
 ({ \
     bool branch; \
...
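For context, a minimal sketch of how the static-branch API touched by this hunk
is used by callers such as the cgroup core; the key name below is invented:

    /* Illustrative only: a default-true key, a runtime toggle, and the
     * type-checking static_key_enabled() test.  "foo_key" is made up.
     */
    #include <linux/jump_label.h>
    #include <linux/printk.h>

    static DEFINE_STATIC_KEY_TRUE(foo_key);

    static void foo_turn_off(void)
    {
        static_branch_disable(&foo_key);    /* patches the branch sites */
    }

    static bool foo_active(void)
    {
        return static_branch_likely(&foo_key);  /* near-zero cost while true */
    }

    static void foo_report(void)
    {
        /* the relocated static_key_enabled() accepts all three key types */
        pr_info("foo: %d\n", static_key_enabled(&foo_key));
    }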
@@ -213,6 +213,9 @@ struct mem_cgroup {
     /* OOM-Killer disable */
     int oom_kill_disable;
+    /* handle for "memory.events" */
+    struct cgroup_file events_file;
+
     /* protect arrays of thresholds */
     struct mutex thresholds_lock;
@@ -285,6 +288,7 @@ static inline void mem_cgroup_events(struct mem_cgroup *memcg,
                                      unsigned int nr)
 {
     this_cpu_add(memcg->stat->events[idx], nr);
+    cgroup_file_notify(&memcg->events_file);
 }
 bool mem_cgroup_low(struct mem_cgroup *root, struct mem_cgroup *memcg);
@@ -346,9 +350,7 @@ ino_t page_cgroup_ino(struct page *page);
 static inline bool mem_cgroup_disabled(void)
 {
-    if (memory_cgrp_subsys.disabled)
-        return true;
-    return false;
+    return !cgroup_subsys_enabled(memory_cgrp_subsys);
 }
 /*
...
@@ -771,18 +771,6 @@ struct signal_struct {
     unsigned audit_tty_log_passwd;
     struct tty_audit_buf *tty_audit_buf;
 #endif
-#ifdef CONFIG_CGROUPS
-    /*
-     * group_rwsem prevents new tasks from entering the threadgroup and
-     * member tasks from exiting,a more specifically, setting of
-     * PF_EXITING.  fork and exit paths are protected with this rwsem
-     * using threadgroup_change_begin/end().  Users which require
-     * threadgroup to remain stable should use threadgroup_[un]lock()
-     * which also takes care of exec path.  Currently, cgroup is the
-     * only user.
-     */
-    struct rw_semaphore group_rwsem;
-#endif
     oom_flags_t oom_flags;
     short oom_score_adj;        /* OOM kill score adjustment */
...
...
@@ -266,11 +266,9 @@ static void pids_fork(struct task_struct *task, void *priv)
     css_put(old_css);
 }
-static void pids_exit(struct cgroup_subsys_state *css,
-                      struct cgroup_subsys_state *old_css,
-                      struct task_struct *task)
+static void pids_free(struct task_struct *task)
 {
-    struct pids_cgroup *pids = css_pids(old_css);
+    struct pids_cgroup *pids = css_pids(task_css(task, pids_cgrp_id));
     pids_uncharge(pids, 1);
 }
@@ -349,7 +347,7 @@ struct cgroup_subsys pids_cgrp_subsys = {
     .can_fork       = pids_can_fork,
     .cancel_fork    = pids_cancel_fork,
     .fork           = pids_fork,
-    .exit           = pids_exit,
+    .free           = pids_free,
     .legacy_cftypes = pids_files,
     .dfl_cftypes    = pids_files,
 };
@@ -473,7 +473,8 @@ static int validate_change(struct cpuset *cur, struct cpuset *trial)
     /* On legacy hiearchy, we must be a subset of our parent cpuset. */
     ret = -EACCES;
-    if (!cgroup_on_dfl(cur->css.cgroup) && !is_cpuset_subset(trial, par))
+    if (!cgroup_subsys_on_dfl(cpuset_cgrp_subsys) &&
+        !is_cpuset_subset(trial, par))
         goto out;
     /*
@@ -497,7 +498,7 @@ static int validate_change(struct cpuset *cur, struct cpuset *trial)
      * be changed to have empty cpus_allowed or mems_allowed.
      */
     ret = -ENOSPC;
-    if ((cgroup_has_tasks(cur->css.cgroup) || cur->attach_in_progress)) {
+    if ((cgroup_is_populated(cur->css.cgroup) || cur->attach_in_progress)) {
         if (!cpumask_empty(cur->cpus_allowed) &&
             cpumask_empty(trial->cpus_allowed))
             goto out;
@@ -879,7 +880,8 @@ static void update_cpumasks_hier(struct cpuset *cs, struct cpumask *new_cpus)
          * If it becomes empty, inherit the effective mask of the
          * parent, which is guaranteed to have some CPUs.
          */
-        if (cgroup_on_dfl(cp->css.cgroup) && cpumask_empty(new_cpus))
+        if (cgroup_subsys_on_dfl(cpuset_cgrp_subsys) &&
+            cpumask_empty(new_cpus))
             cpumask_copy(new_cpus, parent->effective_cpus);
         /* Skip the whole subtree if the cpumask remains the same. */
@@ -896,7 +898,7 @@ static void update_cpumasks_hier(struct cpuset *cs, struct cpumask *new_cpus)
         cpumask_copy(cp->effective_cpus, new_cpus);
         spin_unlock_irq(&callback_lock);
-        WARN_ON(!cgroup_on_dfl(cp->css.cgroup) &&
+        WARN_ON(!cgroup_subsys_on_dfl(cpuset_cgrp_subsys) &&
                 !cpumask_equal(cp->cpus_allowed, cp->effective_cpus));
         update_tasks_cpumask(cp);
@@ -1135,7 +1137,8 @@ static void update_nodemasks_hier(struct cpuset *cs, nodemask_t *new_mems)
          * If it becomes empty, inherit the effective mask of the
          * parent, which is guaranteed to have some MEMs.
          */
-        if (cgroup_on_dfl(cp->css.cgroup) && nodes_empty(*new_mems))
+        if (cgroup_subsys_on_dfl(cpuset_cgrp_subsys) &&
+            nodes_empty(*new_mems))
             *new_mems = parent->effective_mems;
         /* Skip the whole subtree if the nodemask remains the same. */
@@ -1152,7 +1155,7 @@ static void update_nodemasks_hier(struct cpuset *cs, nodemask_t *new_mems)
         cp->effective_mems = *new_mems;
         spin_unlock_irq(&callback_lock);
-        WARN_ON(!cgroup_on_dfl(cp->css.cgroup) &&
+        WARN_ON(!cgroup_subsys_on_dfl(cpuset_cgrp_subsys) &&
                 !nodes_equal(cp->mems_allowed, cp->effective_mems));
         update_tasks_nodemask(cp);
@@ -1440,7 +1443,7 @@ static int cpuset_can_attach(struct cgroup_subsys_state *css,
     /* allow moving tasks into an empty cpuset if on default hierarchy */
     ret = -ENOSPC;
-    if (!cgroup_on_dfl(css->cgroup) &&
+    if (!cgroup_subsys_on_dfl(cpuset_cgrp_subsys) &&
         (cpumask_empty(cs->cpus_allowed) || nodes_empty(cs->mems_allowed)))
         goto out_unlock;
@@ -1484,9 +1487,8 @@ static void cpuset_attach(struct cgroup_subsys_state *css,
 {
     /* static buf protected by cpuset_mutex */
     static nodemask_t cpuset_attach_nodemask_to;
-    struct mm_struct *mm;
     struct task_struct *task;
-    struct task_struct *leader = cgroup_taskset_first(tset);
+    struct task_struct *leader;
     struct cpuset *cs = css_cs(css);
     struct cpuset *oldcs = cpuset_attach_old_cs;
@@ -1512,20 +1514,23 @@ static void cpuset_attach(struct cgroup_subsys_state *css,
     }
     /*
-     * Change mm, possibly for multiple threads in a threadgroup. This is
-     * expensive and may sleep.
+     * Change mm for all threadgroup leaders. This is expensive and may
+     * sleep and should be moved outside migration path proper.
      */
     cpuset_attach_nodemask_to = cs->effective_mems;
-    mm = get_task_mm(leader);
+    cgroup_taskset_for_each_leader(leader, tset) {
+        struct mm_struct *mm = get_task_mm(leader);
+
         if (mm) {
             mpol_rebind_mm(mm, &cpuset_attach_nodemask_to);
             /*
-             * old_mems_allowed is the same with mems_allowed here, except
-             * if this task is being moved automatically due to hotplug.
-             * In that case @mems_allowed has been updated and is empty,
-             * so @old_mems_allowed is the right nodesets that we migrate
-             * mm from.
+             * old_mems_allowed is the same with mems_allowed
+             * here, except if this task is being moved
+             * automatically due to hotplug.  In that case
+             * @mems_allowed has been updated and is empty, so
+             * @old_mems_allowed is the right nodesets that we
+             * migrate mm from.
             */
            if (is_memory_migrate(cs)) {
                cpuset_migrate_mm(mm, &oldcs->old_mems_allowed,
@@ -1533,6 +1538,7 @@ static void cpuset_attach(struct cgroup_subsys_state *css,
             }
             mmput(mm);
         }
+    }
     cs->old_mems_allowed = cpuset_attach_nodemask_to;
@@ -1594,9 +1600,6 @@ static int cpuset_write_u64(struct cgroup_subsys_state *css, struct cftype *cft,
     case FILE_MEMORY_PRESSURE_ENABLED:
         cpuset_memory_pressure_enabled = !!val;
         break;
-    case FILE_MEMORY_PRESSURE:
-        retval = -EACCES;
-        break;
     case FILE_SPREAD_PAGE:
         retval = update_flag(CS_SPREAD_PAGE, cs, val);
         break;
@@ -1863,9 +1866,6 @@ static struct cftype files[] = {
     {
         .name = "memory_pressure",
         .read_u64 = cpuset_read_u64,
-        .write_u64 = cpuset_write_u64,
-        .private = FILE_MEMORY_PRESSURE,
-        .mode = S_IRUGO,
     },
     {
@@ -1952,7 +1952,7 @@ static int cpuset_css_online(struct cgroup_subsys_state *css)
     cpuset_inc();
     spin_lock_irq(&callback_lock);
-    if (cgroup_on_dfl(cs->css.cgroup)) {
+    if (cgroup_subsys_on_dfl(cpuset_cgrp_subsys)) {
         cpumask_copy(cs->effective_cpus, parent->effective_cpus);
         cs->effective_mems = parent->effective_mems;
     }
@@ -2029,7 +2029,7 @@ static void cpuset_bind(struct cgroup_subsys_state *root_css)
     mutex_lock(&cpuset_mutex);
     spin_lock_irq(&callback_lock);
-    if (cgroup_on_dfl(root_css->cgroup)) {
+    if (cgroup_subsys_on_dfl(cpuset_cgrp_subsys)) {
         cpumask_copy(top_cpuset.cpus_allowed, cpu_possible_mask);
         top_cpuset.mems_allowed = node_possible_map;
     } else {
@@ -2210,7 +2210,7 @@ static void cpuset_hotplug_update_tasks(struct cpuset *cs)
     cpus_updated = !cpumask_equal(&new_cpus, cs->effective_cpus);
     mems_updated = !nodes_equal(new_mems, cs->effective_mems);
-    if (cgroup_on_dfl(cs->css.cgroup))
+    if (cgroup_subsys_on_dfl(cpuset_cgrp_subsys))
         hotplug_update_tasks(cs, &new_cpus, &new_mems,
                              cpus_updated, mems_updated);
     else
@@ -2241,7 +2241,7 @@ static void cpuset_hotplug_workfn(struct work_struct *work)
     static cpumask_t new_cpus;
     static nodemask_t new_mems;
     bool cpus_updated, mems_updated;
-    bool on_dfl = cgroup_on_dfl(top_cpuset.css.cgroup);
+    bool on_dfl = cgroup_subsys_on_dfl(cpuset_cgrp_subsys);
     mutex_lock(&cpuset_mutex);
...
@@ -9460,17 +9460,9 @@ static void perf_cgroup_attach(struct cgroup_subsys_state *css,
         task_function_call(task, __perf_cgroup_move, task);
 }
-static void perf_cgroup_exit(struct cgroup_subsys_state *css,
-                             struct cgroup_subsys_state *old_css,
-                             struct task_struct *task)
-{
-    task_function_call(task, __perf_cgroup_move, task);
-}
-
 struct cgroup_subsys perf_event_cgrp_subsys = {
     .css_alloc  = perf_cgroup_css_alloc,
     .css_free   = perf_cgroup_css_free,
-    .exit       = perf_cgroup_exit,
     .attach     = perf_cgroup_attach,
 };
 #endif /* CONFIG_CGROUP_PERF */
@@ -251,6 +251,7 @@ void __put_task_struct(struct task_struct *tsk)
     WARN_ON(atomic_read(&tsk->usage));
     WARN_ON(tsk == current);
+    cgroup_free(tsk);
     task_numa_free(tsk);
     security_task_free(tsk);
     exit_creds(tsk);
@@ -1149,10 +1150,6 @@ static int copy_signal(unsigned long clone_flags, struct task_struct *tsk)
     tty_audit_fork(sig);
     sched_autogroup_fork(sig);
-#ifdef CONFIG_CGROUPS
-    init_rwsem(&sig->group_rwsem);
-#endif
-
     sig->oom_score_adj = current->signal->oom_score_adj;
     sig->oom_score_adj_min = current->signal->oom_score_adj_min;
...
@@ -8244,13 +8244,6 @@ static void cpu_cgroup_attach(struct cgroup_subsys_state *css,
         sched_move_task(task);
 }
-static void cpu_cgroup_exit(struct cgroup_subsys_state *css,
-                            struct cgroup_subsys_state *old_css,
-                            struct task_struct *task)
-{
-    sched_move_task(task);
-}
-
 #ifdef CONFIG_FAIR_GROUP_SCHED
 static int cpu_shares_write_u64(struct cgroup_subsys_state *css,
                                 struct cftype *cftype, u64 shareval)
@@ -8582,7 +8575,6 @@ struct cgroup_subsys cpu_cgrp_subsys = {
     .fork           = cpu_cgroup_fork,
     .can_attach     = cpu_cgroup_can_attach,
     .attach         = cpu_cgroup_attach,
-    .exit           = cpu_cgroup_exit,
     .legacy_cftypes = cpu_files,
     .early_init     = 1,
 };
...
@@ -434,7 +434,7 @@ struct cgroup_subsys_state *mem_cgroup_css_from_page(struct page *page)
     memcg = page->mem_cgroup;
-    if (!memcg || !cgroup_on_dfl(memcg->css.cgroup))
+    if (!memcg || !cgroup_subsys_on_dfl(memory_cgrp_subsys))
         memcg = root_mem_cgroup;
     rcu_read_unlock();
@@ -2926,7 +2926,7 @@ static int memcg_activate_kmem(struct mem_cgroup *memcg,
      * of course permitted.
      */
     mutex_lock(&memcg_create_mutex);
-    if (cgroup_has_tasks(memcg->css.cgroup) ||
+    if (cgroup_is_populated(memcg->css.cgroup) ||
         (memcg->use_hierarchy && memcg_has_children(memcg)))
         err = -EBUSY;
     mutex_unlock(&memcg_create_mutex);
@@ -4066,8 +4066,7 @@ static struct cftype mem_cgroup_legacy_files[] = {
     {
         .name = "cgroup.event_control",     /* XXX: for compat */
         .write = memcg_write_event_control,
-        .flags = CFTYPE_NO_PREFIX,
-        .mode = S_IWUGO,
+        .flags = CFTYPE_NO_PREFIX | CFTYPE_WORLD_WRITABLE,
     },
     {
         .name = "swappiness",
@@ -4834,7 +4833,7 @@ static int mem_cgroup_can_attach(struct cgroup_subsys_state *css,
 {
     struct mem_cgroup *memcg = mem_cgroup_from_css(css);
     struct mem_cgroup *from;
-    struct task_struct *p;
+    struct task_struct *leader, *p;
     struct mm_struct *mm;
     unsigned long move_flags;
     int ret = 0;
@@ -4848,7 +4847,20 @@ static int mem_cgroup_can_attach(struct cgroup_subsys_state *css,
     if (!move_flags)
         return 0;
-    p = cgroup_taskset_first(tset);
+    /*
+     * Multi-process migrations only happen on the default hierarchy
+     * where charge immigration is not used.  Perform charge
+     * immigration if @tset contains a leader and whine if there are
+     * multiple.
+     */
+    p = NULL;
+    cgroup_taskset_for_each_leader(leader, tset) {
+        WARN_ON_ONCE(p);
+        p = leader;
+    }
+    if (!p)
+        return 0;
+
     from = mem_cgroup_from_task(p);
     VM_BUG_ON(from == memcg);
@@ -5064,7 +5076,7 @@ static void mem_cgroup_bind(struct cgroup_subsys_state *root_css)
      * guarantees that @root doesn't have any children, so turning it
      * on for the root memcg is enough.
      */
-    if (cgroup_on_dfl(root_css->cgroup))
+    if (cgroup_subsys_on_dfl(memory_cgrp_subsys))
         root_mem_cgroup->use_hierarchy = true;
     else
         root_mem_cgroup->use_hierarchy = false;
@@ -5208,6 +5220,7 @@ static struct cftype memory_files[] = {
     {
         .name = "events",
         .flags = CFTYPE_NOT_ON_ROOT,
+        .file_offset = offsetof(struct mem_cgroup, events_file),
         .seq_show = memory_events_show,
     },
     { } /* terminate */
...
@@ -175,7 +175,7 @@ static bool sane_reclaim(struct scan_control *sc)
     if (!memcg)
         return true;
 #ifdef CONFIG_CGROUP_WRITEBACK
-    if (cgroup_on_dfl(memcg->css.cgroup))
+    if (cgroup_subsys_on_dfl(memory_cgrp_subsys))
         return true;
 #endif
     return false;
...