Commit dabb16f6 authored by Mandeep Singh Baines, committed by Linus Torvalds

oom: allow a non-CAP_SYS_RESOURCE process to oom_score_adj down

We'd like to be able to oom_score_adj a process up/down as it
enters/leaves the foreground.  Currently, it is not possible to oom_adj
down without CAP_SYS_RESOURCE.  This patch allows a task to decrease its
oom_score_adj back to the value that a CAP_SYS_RESOURCE thread set it to
or its inherited value at fork.  Assuming the thread that has forked it
has oom_score_adj of 0, each process could decrease it back from 0 upon
activation unless a CAP_SYS_RESOURCE thread elevated it to something
higher.
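
As a rough user-space illustration of the foreground/background use case described above, the following is a minimal sketch, not part of the patch. The helper names (`write_oom_score_adj`, `enter_background`, `enter_foreground`) and the value 500 are hypothetical; in real use `path` would be `/proc/self/oom_score_adj`, and it is a parameter here only so the helper can be tried against an ordinary file.

```c
#include <stdio.h>

/* Hypothetical helper: write `value` to an oom_score_adj file.
 * Returns 0 on success, -1 if the open, write, or close fails. */
static int write_oom_score_adj(const char *path, int value)
{
	FILE *f = fopen(path, "w");

	if (!f)
		return -1;
	if (fprintf(f, "%d\n", value) < 0) {
		fclose(f);
		return -1;
	}
	/* stdio buffers the write, so a rejection by the kernel (for
	 * example EACCES) may only surface when the stream is flushed
	 * at close time. Check the fclose() result as well. */
	return fclose(f) == 0 ? 0 : -1;
}

/* Hypothetical hooks: raise the score when leaving the foreground
 * (500 is an arbitrary illustrative value) ... */
static int enter_background(const char *path)
{
	return write_oom_score_adj(path, 500);
}

/* ... and lower it back to 0 on activation. With this patch the
 * second write is expected to succeed without CAP_SYS_RESOURCE as
 * long as 0 is not below the task's oom_score_adj_min. */
static int enter_foreground(const char *path)
{
	return write_oom_score_adj(path, 0);
}
```

Before this patch, the `enter_foreground()` write would fail with EACCES for a task without CAP_SYS_RESOURCE, because any decrease below the current value was refused.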

Alternatives considered:

* a setuid binary
* a daemon with CAP_SYS_RESOURCE

Since you don't want all processes to be able to reduce their oom_adj, a
setuid or daemon implementation would be complex.  The alternatives also
have much higher overhead.

This patch was updated from the original patch based on feedback from David
Rientjes.

Signed-off-by: Mandeep Singh Baines <msb@chromium.org>
Acked-by: David Rientjes <rientjes@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Ying Han <yinghan@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
parent d0a21265
@@ -1323,6 +1323,10 @@ scaled linearly with /proc/<pid>/oom_score_adj.
 Writing to /proc/<pid>/oom_score_adj or /proc/<pid>/oom_adj will change the
 other with its scaled value.
+The value of /proc/<pid>/oom_score_adj may be reduced no lower than the last
+value set by a CAP_SYS_RESOURCE process. To reduce the value any lower
+requires CAP_SYS_RESOURCE.
 NOTICE: /proc/<pid>/oom_adj is deprecated and will be removed, please see
 Documentation/feature-removal-schedule.txt.
......
@@ -1151,7 +1151,7 @@ static ssize_t oom_score_adj_write(struct file *file, const char __user *buf,
 		goto err_task_lock;
 	}
-	if (oom_score_adj < task->signal->oom_score_adj &&
+	if (oom_score_adj < task->signal->oom_score_adj_min &&
 			!capable(CAP_SYS_RESOURCE)) {
 		err = -EACCES;
 		goto err_sighand;
@@ -1164,6 +1164,8 @@ static ssize_t oom_score_adj_write(struct file *file, const char __user *buf,
 			atomic_dec(&task->mm->oom_disable_count);
 	}
 	task->signal->oom_score_adj = oom_score_adj;
+	if (has_capability_noaudit(current, CAP_SYS_RESOURCE))
+		task->signal->oom_score_adj_min = oom_score_adj;
 	/*
 	 * Scale /proc/pid/oom_adj appropriately ensuring that OOM_DISABLE is
 	 * always attainable.
......
@@ -634,6 +634,8 @@ struct signal_struct {
 	int oom_adj;		/* OOM kill score adjustment (bit shift) */
 	int oom_score_adj;	/* OOM kill score adjustment */
+	int oom_score_adj_min;	/* OOM kill score adjustment minimum value.
+				 * Only settable by CAP_SYS_RESOURCE. */
 	struct mutex cred_guard_mutex;	/* guard against foreign influences on
 					 * credential calculations
......
@@ -910,6 +910,7 @@ static int copy_signal(unsigned long clone_flags, struct task_struct *tsk)
 	sig->oom_adj = current->signal->oom_adj;
 	sig->oom_score_adj = current->signal->oom_score_adj;
+	sig->oom_score_adj_min = current->signal->oom_score_adj_min;
 	mutex_init(&sig->cred_guard_mutex);
......
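
The combined effect of the hunks above can be summarized with a small user-space model. This is a simplified sketch, not kernel code: `struct oom_model`, `model_write()`, and `model_fork()` are hypothetical names, the `privileged` flag stands in for both CAP_SYS_RESOURCE checks, and range validation, locking, and the oom_adj scaling are left out.

```c
#include <errno.h>
#include <stdbool.h>

/* Stand-in for the two signal_struct fields involved in the patch. */
struct oom_model {
	int oom_score_adj;	/* current adjustment */
	int oom_score_adj_min;	/* floor for writers without the capability */
};

/* Models the write path: a writer without the capability may move the
 * value freely at or above the floor; a writer with the capability may
 * set any value and resets the floor to it.
 * Returns 0 on success or -EACCES if the write is refused. */
static int model_write(struct oom_model *s, int value, bool privileged)
{
	if (value < s->oom_score_adj_min && !privileged)
		return -EACCES;

	s->oom_score_adj = value;
	if (privileged)
		s->oom_score_adj_min = value;
	return 0;
}

/* Models the fork path: the child inherits both the current value and
 * the floor from its parent. */
static struct oom_model model_fork(const struct oom_model *parent)
{
	return *parent;
}
```

In this model, a child forked from a parent whose value and floor are both 0 can raise itself to 500 and later return to 0 without the capability, but cannot go below 0. Once a privileged writer sets 300, that becomes the new floor.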