Commit daaf1e68 authored by KAMEZAWA Hiroyuki's avatar KAMEZAWA Hiroyuki Committed by Linus Torvalds

memcg: handle panic_on_oom=always case

Presently, if panic_on_oom=2, the whole system panics even if the oom
happend in some special situation (as cpuset, mempolicy....).  Then,
panic_on_oom=2 means painc_on_oom_always.

Now, memcg doesn't check panic_on_oom flag. This patch adds a check.

BTW, how it's useful ?

kdump+panic_on_oom=2 is the last tool to investigate what happens in
oom-ed system.  When a task is killed, the sysytem recovers and there will
be few hint to know what happnes.  In mission critical system, oom should
never happen.  Then, panic_on_oom=2+kdump is useful to avoid next OOM by
knowing precise information via snapshot.

TODO:
 - For memcg, it's for isolate system's memory usage, oom-notiifer and
   freeze_at_oom (or rest_at_oom) should be implemented. Then, management
   daemon can do similar jobs (as kdump) or taking snapshot per cgroup.
Signed-off-by: default avatarKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Nick Piggin <npiggin@suse.de>
Reviewed-by: default avatarDaisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
parent 1080d7a3
......@@ -182,6 +182,8 @@ list.
NOTE: Reclaim does not work for the root cgroup, since we cannot set any
limits on the root cgroup.
Note2: When panic_on_oom is set to "2", the whole system will panic.
2. Locking
The memory controller uses the following hierarchy
......@@ -379,7 +381,8 @@ The feature can be disabled by
NOTE1: Enabling/disabling will fail if the cgroup already has other
cgroups created below it.
NOTE2: This feature can be enabled/disabled per subtree.
NOTE2: When panic_on_oom is set to "2", the whole system will panic in
case of an oom event in any cgroup.
7. Soft limits
......
......@@ -573,11 +573,14 @@ Because other nodes' memory may be free. This means system total status
may be not fatal yet.
If this is set to 2, the kernel panics compulsorily even on the
above-mentioned.
above-mentioned. Even oom happens under memory cgroup, the whole
system panics.
The default value is 0.
1 and 2 are for failover of clustering. Please select either
according to your policy of failover.
panic_on_oom=2+kdump gives you very strong tool to investigate
why oom happens. You can get snapshot.
=============================================================
......
......@@ -473,6 +473,8 @@ void mem_cgroup_out_of_memory(struct mem_cgroup *mem, gfp_t gfp_mask)
unsigned long points = 0;
struct task_struct *p;
if (sysctl_panic_on_oom == 2)
panic("out of memory(memcg). panic_on_oom is selected.\n");
read_lock(&tasklist_lock);
retry:
p = select_bad_process(&points, mem);
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment