Commit a7031f14 authored by Mateusz Guzik's avatar Mateusz Guzik Committed by Andrew Morton

kernel/fork: stop playing lockless games for exe_file replacement

xchg originated in 6e399cd1 ("prctl: avoid using mmap_sem for exe_file
serialization").  While the commit message does not explain *why* the
change, I found the original submission [1] which ultimately claims it
cleans things up by removing dependency of exe_file on the semaphore.

However, fe69d560 ("kernel/fork: always deny write access to current
MM exe_file") added a semaphore up/down cycle to synchronize the state of
exe_file against fork, defeating the point of the original change.

This is on top of semaphore trips already present both in the replacing
function and prctl (the only consumer).

Normally replacing exe_file does not happen for busy processes, thus
write-locking is not an impediment to performance in the intended use
case.  If someone keeps invoking the routine for a busy processes they are
trying to play dirty and that's another reason to avoid any trickery.

As such I think the atomic here only adds complexity for no benefit.

Just write-lock around the replacement.

I also note that replacement races against the mapping check loop as
nothing synchronizes actual assignment with with said checks but I am not
addressing it in this patch.  (Is the loop of any use to begin with?)

Link: https://lore.kernel.org/linux-mm/1424979417.10344.14.camel@stgolabs.net/ [1]
Link: https://lkml.kernel.org/r/20230814172140.1777161-1-mjguzik@gmail.comSigned-off-by: default avatarMateusz Guzik <mjguzik@gmail.com>
Acked-by: default avatarOleg Nesterov <oleg@redhat.com>
Acked-by: default avatarDavid Hildenbrand <david@redhat.com>
Cc: "Christian Brauner (Microsoft)" <brauner@kernel.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
parent 8bd49ef2
...@@ -1276,8 +1276,8 @@ int begin_new_exec(struct linux_binprm * bprm) ...@@ -1276,8 +1276,8 @@ int begin_new_exec(struct linux_binprm * bprm)
/* /*
* Must be called _before_ exec_mmap() as bprm->mm is * Must be called _before_ exec_mmap() as bprm->mm is
* not visible until then. This also enables the update * not visible until then. Doing it here also ensures
* to be lockless. * we don't race against replace_mm_exe_file().
*/ */
retval = set_mm_exe_file(bprm->mm, bprm->file); retval = set_mm_exe_file(bprm->mm, bprm->file);
if (retval) if (retval)
......
...@@ -1396,8 +1396,8 @@ EXPORT_SYMBOL_GPL(mmput_async); ...@@ -1396,8 +1396,8 @@ EXPORT_SYMBOL_GPL(mmput_async);
* This changes mm's executable file (shown as symlink /proc/[pid]/exe). * This changes mm's executable file (shown as symlink /proc/[pid]/exe).
* *
* Main users are mmput() and sys_execve(). Callers prevent concurrent * Main users are mmput() and sys_execve(). Callers prevent concurrent
* invocations: in mmput() nobody alive left, in execve task is single * invocations: in mmput() nobody alive left, in execve it happens before
* threaded. * the new mm is made visible to anyone.
* *
* Can only fail if new_exe_file != NULL. * Can only fail if new_exe_file != NULL.
*/ */
...@@ -1432,9 +1432,7 @@ int set_mm_exe_file(struct mm_struct *mm, struct file *new_exe_file) ...@@ -1432,9 +1432,7 @@ int set_mm_exe_file(struct mm_struct *mm, struct file *new_exe_file)
/** /**
* replace_mm_exe_file - replace a reference to the mm's executable file * replace_mm_exe_file - replace a reference to the mm's executable file
* *
* This changes mm's executable file (shown as symlink /proc/[pid]/exe), * This changes mm's executable file (shown as symlink /proc/[pid]/exe).
* dealing with concurrent invocation and without grabbing the mmap lock in
* write mode.
* *
* Main user is sys_prctl(PR_SET_MM_MAP/EXE_FILE). * Main user is sys_prctl(PR_SET_MM_MAP/EXE_FILE).
*/ */
...@@ -1464,22 +1462,20 @@ int replace_mm_exe_file(struct mm_struct *mm, struct file *new_exe_file) ...@@ -1464,22 +1462,20 @@ int replace_mm_exe_file(struct mm_struct *mm, struct file *new_exe_file)
return ret; return ret;
} }
/* set the new file, lockless */
ret = deny_write_access(new_exe_file); ret = deny_write_access(new_exe_file);
if (ret) if (ret)
return -EACCES; return -EACCES;
get_file(new_exe_file); get_file(new_exe_file);
old_exe_file = xchg(&mm->exe_file, new_exe_file); /* set the new file */
mmap_write_lock(mm);
old_exe_file = rcu_dereference_raw(mm->exe_file);
rcu_assign_pointer(mm->exe_file, new_exe_file);
mmap_write_unlock(mm);
if (old_exe_file) { if (old_exe_file) {
/*
* Don't race with dup_mmap() getting the file and disallowing
* write access while someone might open the file writable.
*/
mmap_read_lock(mm);
allow_write_access(old_exe_file); allow_write_access(old_exe_file);
fput(old_exe_file); fput(old_exe_file);
mmap_read_unlock(mm);
} }
return 0; return 0;
} }
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment