Commit ea86630e authored by Andries E. Brouwer, committed by Linus Torvalds

[PATCH] mm: overcommit updates

Alan made overcommit mode 2, and it doesn't work at all.  A process passing
the limit often does so at a moment of stack extension and is killed by a
segfault, which is no better than being OOM-killed.

Another problem is that close to the edge no other processes can be
started, so that a sysadmin has problems logging in and investigating.

Below is a patch that does three things:

(1) It reserves a reasonable amount of virtual stack space (amount
    randomly chosen, no guarantees given) when the process is started, so
    that the common utilities will not be killed by segfault on stack
    extension.

(2) It reserves a reasonable amount of virtual memory for root, so that
    root can do things when the system is out of memory.

(3) It limits a single process to 97% of what is left, so that an
    ordinary user is also able to use getty, login, bash, ps, kill and
    similar things when one of her processes gets out of control.

Since the current overcommit mode 2 is not really useful, I did not give
this a new number.

The patch is just for playing, not to be applied by Linus.  But, Andrew, I
hope that you would be willing to put this in -mm so that people can
experiment.  Of course it only does something if one sets overcommit mode
to 2.

Over the past month I have pressed people for feedback, and now have
about a dozen reports, mostly positive, one very positive.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
parent 182e0eba
@@ -341,6 +341,8 @@ void install_arg_page(struct vm_area_struct *vma,
 force_sig(SIGKILL, current);
 }
+#define EXTRA_STACK_VM_PAGES 20 /* random */
 int setup_arg_pages(struct linux_binprm *bprm, int executable_stack)
 {
 unsigned long stack_base;
@@ -378,15 +380,15 @@ int setup_arg_pages(struct linux_binprm *bprm, int executable_stack)
 memmove(to, to + offset, PAGE_SIZE - offset);
 kunmap(bprm->page[j - 1]);
-/* Adjust bprm->p to point to the end of the strings. */
-bprm->p = PAGE_SIZE * i - offset;
 /* Limit stack size to 1GB */
 stack_base = current->signal->rlim[RLIMIT_STACK].rlim_max;
 if (stack_base > (1 << 30))
 stack_base = 1 << 30;
 stack_base = PAGE_ALIGN(STACK_TOP - stack_base);
+/* Adjust bprm->p to point to the end of the strings. */
+bprm->p = stack_base + PAGE_SIZE * i - offset;
 mm->arg_start = stack_base;
 arg_size = i << PAGE_SHIFT;
@@ -395,11 +397,13 @@ int setup_arg_pages(struct linux_binprm *bprm, int executable_stack)
 bprm->page[i++] = NULL;
 #else
 stack_base = STACK_TOP - MAX_ARG_PAGES * PAGE_SIZE;
-mm->arg_start = bprm->p + stack_base;
+bprm->p += stack_base;
+mm->arg_start = bprm->p;
+arg_size = STACK_TOP - (PAGE_MASK & (unsigned long) mm->arg_start);
 #endif
-bprm->p += stack_base;
+arg_size += EXTRA_STACK_VM_PAGES * PAGE_SIZE;
 if (bprm->loader)
 bprm->loader += stack_base;
 bprm->exec += stack_base;
@@ -420,11 +424,10 @@ int setup_arg_pages(struct linux_binprm *bprm, int executable_stack)
 mpnt->vm_mm = mm;
 #ifdef CONFIG_STACK_GROWSUP
 mpnt->vm_start = stack_base;
-mpnt->vm_end = PAGE_MASK &
-(PAGE_SIZE - 1 + (unsigned long) bprm->p);
+mpnt->vm_end = stack_base + arg_size;
 #else
-mpnt->vm_start = PAGE_MASK & (unsigned long) bprm->p;
 mpnt->vm_end = STACK_TOP;
+mpnt->vm_start = mpnt->vm_end - arg_size;
 #endif
 /* Adjust stack execute permissions; explicitly enable
 * for EXSTACK_ENABLE_X, disable for EXSTACK_DISABLE_X
@@ -386,6 +386,14 @@ int cap_vm_enough_memory(long pages)
 allowed -= allowed / 32;
 allowed += total_swap_pages;
+/* Leave the last 3% for root */
+if (current->euid)
+allowed -= allowed / 32;
+/* Don't let a single process grow too big:
+leave 3% of the size of this process for other processes */
+allowed -= current->mm->total_vm / 32;
 if (atomic_read(&vm_committed_space) < allowed)
 return 0;
@@ -160,6 +160,14 @@ static int dummy_vm_enough_memory(long pages)
 * sysctl_overcommit_ratio / 100;
 allowed += total_swap_pages;
+/* Leave the last 3% for root */
+if (current->euid)
+allowed -= allowed / 32;
+/* Don't let a single process grow too big:
+leave 3% of the size of this process for other processes */
+allowed -= current->mm->total_vm / 32;
 if (atomic_read(&vm_committed_space) < allowed)
 return 0;