Commit c45a7795 authored by Elena Reshetova, committed by Ingo Molnar

sched/fair: Convert numa_group.refcount to refcount_t

atomic_t variables are currently used to implement reference
counters with the following properties:

 - the counter is initialized to 1 using atomic_set()
 - a resource is freed when the counter reaches zero
 - once the counter reaches zero, further increments
   are not allowed
 - the counter scheme uses basic atomic operations
   (set, inc, inc_not_zero, dec_and_test, etc.)
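
The four properties listed above can be sketched in userspace with C11 atomics. This is only an illustration of the pattern, not kernel code: `struct obj`, `obj_init()`, `obj_get()` and `obj_put()` are hypothetical names, and a `freed` flag stands in for actually releasing memory.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a reference-counted kernel object. */
struct obj {
	atomic_int refcount;
	bool freed;	/* set instead of calling free(), so the result is observable */
};

/* The counter starts at 1: the creator holds the first reference. */
static void obj_init(struct obj *o)
{
	atomic_init(&o->refcount, 1);
	o->freed = false;
}

/* inc_not_zero analogue: take a reference unless the counter already hit zero. */
static bool obj_get(struct obj *o)
{
	int old = atomic_load(&o->refcount);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&o->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* dec_and_test analogue: drop a reference, release on the last one. */
static void obj_put(struct obj *o)
{
	int old = atomic_fetch_sub(&o->refcount, 1);

	assert(old > 0);	/* dropping below zero would be a caller bug */
	if (old == 1)
		o->freed = true;
}
```

Once the last reference is dropped, `obj_get()` keeps returning false, which is the "no increments after zero" rule from the list.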

Such atomic variables should be converted to the newly provided
refcount_t type and API, which prevent accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to use-after-free situations and be exploitable.
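
To illustrate why an overflow is dangerous, the following userspace sketch contrasts a plain atomic increment, which wraps to zero, with one that saturates. It is an assumption-laden illustration, not the code in lib/refcount.c: `plain_inc()`, `saturating_inc()` and `SKETCH_SATURATED` are hypothetical names, and the saturation value is chosen only for the example.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

/* Hypothetical saturation value, picked for this sketch only. */
#define SKETCH_SATURATED UINT_MAX

/* Plain increment: silently wraps from UINT_MAX to 0. */
static unsigned int plain_inc(atomic_uint *c)
{
	return atomic_fetch_add(c, 1) + 1;
}

/* Saturating increment: once at SKETCH_SATURATED, the counter stays there. */
static unsigned int saturating_inc(atomic_uint *c)
{
	unsigned int old = atomic_load(c);

	while (old != SKETCH_SATURATED) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(c, &old, old + 1))
			return old + 1;
	}
	return SKETCH_SATURATED;
}
```

A wrapped counter reads as zero while references are still held, so the next put frees an object that is still in use. A saturated counter instead leaks the object, which is the safer failure.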

The variable numa_group.refcount is used as a pure reference counter.
Convert it to refcount_t and fix up the operations.

** Important note for maintainers:

Some functions from refcount_t API defined in lib/refcount.c
have different memory ordering guarantees than their atomic
counterparts.

The full comparison can be seen in
https://lkml.org/lkml/2017/11/15/57 and will hopefully soon be
ready to be merged into the documentation tree.

Normally the differences should not matter since refcount_t provides
enough guarantees to satisfy the refcounting use cases, but in
some rare cases it might matter.

Please double-check that you do not rely on undocumented
memory ordering guarantees for this variable.

For numa_group.refcount it might make a difference
in the following places:

 - get_numa_group(): the increment in refcount_inc_not_zero() only
   guarantees a control dependency on success, whereas the atomic
   counterpart is fully ordered
 - put_numa_group(): the decrement in refcount_dec_and_test() only
   provides RELEASE ordering and a control dependency on success,
   whereas the atomic counterpart is fully ordered
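
The weaker ordering described in the two items above can be approximated in userspace C11. This is a loose analogy, not the kernel implementation: `struct grp_sketch`, `sketch_get()` and `sketch_put()` are hypothetical names. C11 has no notion of a control dependency, so the get path uses relaxed ordering and the final put uses an acquire fence as a portable stand-in.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct numa_group; only the counter matters here. */
struct grp_sketch {
	atomic_uint refcount;
	bool freed;	/* set instead of freeing, so the result is observable */
};

/* Get path: the increment itself carries no ordering (relaxed). */
static bool sketch_get(struct grp_sketch *g)
{
	unsigned int old = atomic_load_explicit(&g->refcount,
						memory_order_relaxed);

	while (old != 0) {
		if (atomic_compare_exchange_weak_explicit(&g->refcount, &old,
							  old + 1,
							  memory_order_relaxed,
							  memory_order_relaxed))
			return true;
	}
	return false;
}

/* Put path: RELEASE on the decrement, so earlier writes are visible to the freer. */
static bool sketch_put(struct grp_sketch *g)
{
	if (atomic_fetch_sub_explicit(&g->refcount, 1,
				      memory_order_release) != 1)
		return false;

	/* Assumption: an acquire fence replaces the kernel's control dependency. */
	atomic_thread_fence(memory_order_acquire);
	g->freed = true;
	return true;
}
```

Code that relied on the fully ordered atomic_t operations for anything beyond the counter itself would need explicit barriers after such a conversion.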

Suggested-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: David Windsor <dwindsor@gmail.com>
Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com>
Reviewed-by: Andrea Parri <andrea.parri@amarulasolutions.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: akpm@linux-foundation.org
Cc: viro@zeniv.linux.org.uk
Link: https://lkml.kernel.org/r/1547814450-18902-4-git-send-email-elena.reshetova@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
parent 60d4de3f
@@ -1035,7 +1035,7 @@ unsigned int sysctl_numa_balancing_scan_size = 256;
 unsigned int sysctl_numa_balancing_scan_delay = 1000;
 struct numa_group {
-	atomic_t refcount;
+	refcount_t refcount;
 	spinlock_t lock; /* nr_tasks, tasks */
 	int nr_tasks;
@@ -1104,7 +1104,7 @@ static unsigned int task_scan_start(struct task_struct *p)
 		unsigned long shared = group_faults_shared(ng);
 		unsigned long private = group_faults_priv(ng);
-		period *= atomic_read(&ng->refcount);
+		period *= refcount_read(&ng->refcount);
 		period *= shared + 1;
 		period /= private + shared + 1;
 	}
@@ -1127,7 +1127,7 @@ static unsigned int task_scan_max(struct task_struct *p)
 		unsigned long private = group_faults_priv(ng);
 		unsigned long period = smax;
-		period *= atomic_read(&ng->refcount);
+		period *= refcount_read(&ng->refcount);
 		period *= shared + 1;
 		period /= private + shared + 1;
@@ -2203,12 +2203,12 @@ static void task_numa_placement(struct task_struct *p)
 static inline int get_numa_group(struct numa_group *grp)
 {
-	return atomic_inc_not_zero(&grp->refcount);
+	return refcount_inc_not_zero(&grp->refcount);
 }
 static inline void put_numa_group(struct numa_group *grp)
 {
-	if (atomic_dec_and_test(&grp->refcount))
+	if (refcount_dec_and_test(&grp->refcount))
 		kfree_rcu(grp, rcu);
 }
@@ -2229,7 +2229,7 @@ static void task_numa_group(struct task_struct *p, int cpupid, int flags,
 		if (!grp)
 			return;
-		atomic_set(&grp->refcount, 1);
+		refcount_set(&grp->refcount, 1);
 		grp->active_nodes = 1;
 		grp->max_faults_cpu = 0;
 		spin_lock_init(&grp->lock);