1. 25 Feb, 2017 39 commits
  2. 24 Feb, 2017 1 commit
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace · f1ef09fd
      Linus Torvalds authored
      Pull namespace updates from Eric Biederman:
       "There is a lot here. A lot of these changes result in subtle user
        visible differences in kernel behavior. I don't expect anything will
        care but I will revert/fix things immediately if any regressions show
        up.
      
        From Seth Forshee there is a continuation of the work to make the vfs
        ready for unpriviled mounts. We had thought the previous changes
        prevented the creation of files outside of s_user_ns of a filesystem,
        but it turns we missed the O_CREAT path. Ooops.
      
        Pavel Tikhomirov and Oleg Nesterov worked together to fix a long
        standing bug in the implemenation of PR_SET_CHILD_SUBREAPER where only
        children that are forked after the prctl are considered and not
        children forked before the prctl. The only known user of this prctl
        systemd forks all children after the prctl. So no userspace
        regressions will occur. Holding earlier forked children to the same
        rules as later forked children creates a semantic that is sane enough
        to allow checkpoing of processes that use this feature.
      
        There is a long delayed change by Nikolay Borisov to limit inotify
        instances inside a user namespace.
      
        Michael Kerrisk extends the API for files used to maniuplate
        namespaces with two new trivial ioctls to allow discovery of the
        hierachy and properties of namespaces.
      
        Konstantin Khlebnikov with the help of Al Viro adds code that when a
        network namespace exits purges it's sysctl entries from the dcache. As
        in some circumstances this could use a lot of memory.
      
        Vivek Goyal fixed a bug with stacked filesystems where the permissions
        on the wrong inode were being checked.
      
        I continue previous work on ptracing across exec. Allowing a file to
        be setuid across exec while being ptraced if the tracer has enough
        credentials in the user namespace, and if the process has CAP_SETUID
        in it's own namespace. Proc files for setuid or otherwise undumpable
        executables are now owned by the root in the user namespace of their
        mm. Allowing debugging of setuid applications in containers to work
        better.
      
        A bug I introduced with permission checking and automount is now
        fixed. The big change is to mark the mounts that the kernel initiates
        as a result of an automount. This allows the permission checks in sget
        to be safely suppressed for this kind of mount. As the permission
        check happened when the original filesystem was mounted.
      
        Finally a special case in the mount namespace is removed preventing
        unbounded chains in the mount hash table, and making the semantics
        simpler which benefits CRIU.
      
        The vfs fix along with related work in ima and evm I believe makes us
        ready to finish developing and merge fully unprivileged mounts of the
        fuse filesystem. The cleanups of the mount namespace makes discussing
        how to fix the worst case complexity of umount. The stacked filesystem
        fixes pave the way for adding multiple mappings for the filesystem
        uids so that efficient and safer containers can be implemented"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
        proc/sysctl: Don't grab i_lock under sysctl_lock.
        vfs: Use upper filesystem inode in bprm_fill_uid()
        proc/sysctl: prune stale dentries during unregistering
        mnt: Tuck mounts under others instead of creating shadow/side mounts.
        prctl: propagate has_child_subreaper flag to every descendant
        introduce the walk_process_tree() helper
        nsfs: Add an ioctl() to return owner UID of a userns
        fs: Better permission checking for submounts
        exit: fix the setns() && PR_SET_CHILD_SUBREAPER interaction
        vfs: open() with O_CREAT should not create inodes with unknown ids
        nsfs: Add an ioctl() to return the namespace type
        proc: Better ownership of files for non-dumpable tasks in user namespaces
        exec: Remove LSM_UNSAFE_PTRACE_CAP
        exec: Test the ptracer's saved cred to see if the tracee can gain caps
        exec: Don't reset euid and egid when the tracee has CAP_SETUID
        inotify: Convert to using per-namespace limits
      f1ef09fd