1. 29 Jul, 2012 20 commits
    • Kees Cook's avatar
      fs: add link restrictions · 800179c9
      Kees Cook authored
      This adds symlink and hardlink restrictions to the Linux VFS.
      
      Symlinks:
      
      A long-standing class of security issues is the symlink-based
      time-of-check-time-of-use race, most commonly seen in world-writable
      directories like /tmp. The common method of exploitation of this flaw
      is to cross privilege boundaries when following a given symlink (i.e. a
      root process follows a symlink belonging to another user). For a likely
      incomplete list of hundreds of examples across the years, please see:
      http://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=/tmp
      
      The solution is to permit symlinks to only be followed when outside
      a sticky world-writable directory, or when the uid of the symlink and
      follower match, or when the directory owner matches the symlink's owner.
      
      Some pointers to the history of earlier discussion that I could find:
      
       1996 Aug, Zygo Blaxell
        http://marc.info/?l=bugtraq&m=87602167419830&w=2
       1996 Oct, Andrew Tridgell
        http://lkml.indiana.edu/hypermail/linux/kernel/9610.2/0086.html
       1997 Dec, Albert D Cahalan
        http://lkml.org/lkml/1997/12/16/4
       2005 Feb, Lorenzo Hernández García-Hierro
        http://lkml.indiana.edu/hypermail/linux/kernel/0502.0/1896.html
       2010 May, Kees Cook
        https://lkml.org/lkml/2010/5/30/144
      
      Past objections and rebuttals could be summarized as:
      
       - Violates POSIX.
         - POSIX didn't consider this situation and it's not useful to follow
           a broken specification at the cost of security.
       - Might break unknown applications that use this feature.
         - Applications that break because of the change are easy to spot and
           fix. Applications that are vulnerable to symlink ToCToU by not having
           the change aren't. Additionally, no applications have yet been found
           that rely on this behavior.
       - Applications should just use mkstemp() or O_CREATE|O_EXCL.
         - True, but applications are not perfect, and new software is written
           all the time that makes these mistakes; blocking this flaw at the
           kernel is a single solution to the entire class of vulnerability.
       - This should live in the core VFS.
         - This should live in an LSM. (https://lkml.org/lkml/2010/5/31/135)
       - This should live in an LSM.
         - This should live in the core VFS. (https://lkml.org/lkml/2010/8/2/188)
      
      Hardlinks:
      
      On systems that have user-writable directories on the same partition
      as system files, a long-standing class of security issues is the
      hardlink-based time-of-check-time-of-use race, most commonly seen in
      world-writable directories like /tmp. The common method of exploitation
      of this flaw is to cross privilege boundaries when following a given
      hardlink (i.e. a root process follows a hardlink created by another
      user). Additionally, an issue exists where users can "pin" a potentially
      vulnerable setuid/setgid file so that an administrator will not actually
      upgrade a system fully.
      
      The solution is to permit hardlinks to only be created when the user is
      already the existing file's owner, or if they already have read/write
      access to the existing file.
      
      Many Linux users are surprised when they learn they can link to files
      they have no access to, so this change appears to follow the doctrine
      of "least surprise". Additionally, this change does not violate POSIX,
      which states "the implementation may require that the calling process
      has permission to access the existing file"[1].
      
      This change is known to break some implementations of the "at" daemon,
      though the version used by Fedora and Ubuntu has been fixed[2] for
      a while. Otherwise, the change has been undisruptive while in use in
      Ubuntu for the last 1.5 years.
      
      [1] http://pubs.opengroup.org/onlinepubs/9699919799/functions/linkat.html
      [2] http://anonscm.debian.org/gitweb/?p=collab-maint/at.git;a=commitdiff;h=f4114656c3a6c6f6070e315ffdf940a49eda3279
      
      This patch is based on the patches in Openwall and grsecurity, along with
      suggestions from Al Viro. I have added a sysctl to enable the protected
      behavior, and documentation.
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Acked-by: default avatarIngo Molnar <mingo@elte.hu>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      800179c9
    • Jeff Layton's avatar
      vfs: don't let do_last pass negative dentry to audit_inode · 3134f37e
      Jeff Layton authored
      I can reliably reproduce the following panic by simply setting an audit
      rule on a recent 3.5.0+ kernel:
      
       BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
       IP: [<ffffffff810d1250>] audit_copy_inode+0x10/0x90
       PGD 7acd9067 PUD 7b8fb067 PMD 0
       Oops: 0000 [#86] SMP
       Modules linked in: nfs nfs_acl auth_rpcgss fscache lockd sunrpc tpm_bios btrfs zlib_deflate libcrc32c kvm_amd kvm joydev virtio_net pcspkr i2c_piix4 floppy virtio_balloon microcode virtio_blk cirrus drm_kms_helper ttm drm i2c_core [last unloaded: scsi_wait_scan]
       CPU 0
       Pid: 1286, comm: abrt-dump-oops Tainted: G      D      3.5.0+ #1 Bochs Bochs
       RIP: 0010:[<ffffffff810d1250>]  [<ffffffff810d1250>] audit_copy_inode+0x10/0x90
       RSP: 0018:ffff88007aebfc38  EFLAGS: 00010282
       RAX: 0000000000000000 RBX: ffff88003692d860 RCX: 00000000000038c4
       RDX: 0000000000000000 RSI: ffff88006baf5d80 RDI: ffff88003692d860
       RBP: ffff88007aebfc68 R08: 0000000000000000 R09: 0000000000000000
       R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
       R13: ffff880036d30f00 R14: ffff88006baf5d80 R15: ffff88003692d800
       FS:  00007f7562634740(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 0000000000000040 CR3: 000000003643d000 CR4: 00000000000006f0
       DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
       Process abrt-dump-oops (pid: 1286, threadinfo ffff88007aebe000, task ffff880079614530)
       Stack:
        ffff88007aebfdf8 ffff88007aebff28 ffff88007aebfc98 ffffffff81211358
        ffff88003692d860 0000000000000000 ffff88007aebfcc8 ffffffff810d4968
        ffff88007aebfcc8 ffff8800000038c4 0000000000000000 0000000000000000
       Call Trace:
        [<ffffffff81211358>] ? ext4_lookup+0xe8/0x160
        [<ffffffff810d4968>] __audit_inode+0x118/0x2d0
        [<ffffffff811955a9>] do_last+0x999/0xe80
        [<ffffffff81191fe8>] ? inode_permission+0x18/0x50
        [<ffffffff81171efa>] ? kmem_cache_alloc_trace+0x11a/0x130
        [<ffffffff81195b4a>] path_openat+0xba/0x420
        [<ffffffff81196111>] do_filp_open+0x41/0xa0
        [<ffffffff811a24bd>] ? alloc_fd+0x4d/0x120
        [<ffffffff811855cd>] do_sys_open+0xed/0x1c0
        [<ffffffff810d40cc>] ? __audit_syscall_entry+0xcc/0x300
        [<ffffffff811856c1>] sys_open+0x21/0x30
        [<ffffffff81611ca9>] system_call_fastpath+0x16/0x1b
        RSP <ffff88007aebfc38>
       CR2: 0000000000000040
      
      The problem is that do_last is passing a negative dentry to audit_inode.
      The comments on lookup_open note that it can pass back a negative dentry
      if O_CREAT is not set.
      
      This patch fixes the oops, but I'm not clear on whether there's a better
      approach.
      
      Cc: Miklos Szeredi <miklos@szeredi.hu>
      Signed-off-by: default avatarJeff Layton <jlayton@redhat.com>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      3134f37e
    • Al Viro's avatar
      brcm80211: pointless current->files passed to filp_close() · 0b5306b3
      Al Viro authored
      ... only needed if it's been in descriptor table
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      0b5306b3
    • Al Viro's avatar
      sound_firmware: don't pass crap to filp_close() · 58609306
      Al Viro authored
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      58609306
    • Al Viro's avatar
      gadgetfs: clean up · 20818a0c
      Al Viro authored
      sigh...
      * opened files have non-NULL dentries and non-NULL inodes
      * close_filp() needs current->files only if the file had been
      in descriptor table.
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      20818a0c
    • Al Viro's avatar
      slightly reduce lossage in gdm72xx · 09fada5b
      Al Viro authored
      * filp_close() needs non-NULL second argument only if it'd been in descriptor
      table
      * opened files have non-NULL dentries, TYVM
      * ... and those dentries are positive - it's kinda hard to open a file that
      doesn't exist.
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      09fada5b
    • Al Viro's avatar
      slightly reduce idiocy in drivers/staging/bcm/Misc.c · 32aecdd3
      Al Viro authored
      a) vfs_llseek() does *not* access userland pointers of any kind
      b) neither does filp_close(), for that matter
      c) ... nor filp_open()
      d) vfs_read() does, but we do have a wrapper for that (kernel_read()),
      so there's no need to reinvent it.
      e) passing current->files to filp_close() on something that never
      had been in descriptor table is pointless.
      
      ISAGN: voodoo dolls to be used on voodoo programmers...
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      32aecdd3
    • Al Viro's avatar
      consolidate pipe file creation · e4fad8e5
      Al Viro authored
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      e4fad8e5
    • Al Viro's avatar
      take grabbing f->f_path to do_dentry_open() · b5bcdda3
      Al Viro authored
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      b5bcdda3
    • Al Viro's avatar
      uninline file_free_rcu() · 5c33b183
      Al Viro authored
      What inline?  Its only use is passing its address to call_rcu(), for fuck sake!
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      5c33b183
    • Al Viro's avatar
      ecryptfs_lookup_interpose(): allocate dentry_info first · 0b1d9011
      Al Viro authored
      less work on failure that way
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      0b1d9011
    • Al Viro's avatar
      sanitize ecryptfs_lookup() · bc65a121
      Al Viro authored
      * ->lookup() never gets hit with . or ..
      * dentry it gets is unhashed, so unless we had gone and hashed it ourselves, there's
      no need to d_drop() the sucker.
      * wrong name printed in one of the printks (NULL, in fact)
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      bc65a121
    • Al Viro's avatar
      clean unix_bind() up a bit · faf02010
      Al Viro authored
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      faf02010
    • Al Viro's avatar
      pull mnt_want_write()/mnt_drop_write() into kern_path_create()/done_path_create() resp. · a8104a9f
      Al Viro authored
      One side effect - attempt to create a cross-device link on a read-only fs fails
      with EROFS instead of EXDEV now.  Makes more sense, POSIX allows, etc.
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      a8104a9f
    • Al Viro's avatar
      mknod: take sanity checks on mode into the very beginning · 8e4bfca1
      Al Viro authored
      Note that applying umask can't affect their results.  While
      that affects errno in cases like
      	mknod("/no_such_directory/a", 030000)
      yielding -EINVAL (due to impossible mode_t) instead of
      -ENOENT (due to inexistent directory), IMO that makes a lot
      more sense, POSIX allows to return either and any software
      that relies on getting -ENOENT instead of -EINVAL in that
      case deserves everything it gets.
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      8e4bfca1
    • Al Viro's avatar
      new helper: done_path_create() · 921a1650
      Al Viro authored
      releases what needs to be released after {kern,user}_path_create()
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      921a1650
    • Al Viro's avatar
      pull unlock+dput() out into do_spu_create() · 25b2692a
      Al Viro authored
      ... and cleaning spufs_create() a bit, while we are at it
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      25b2692a
    • Al Viro's avatar
      1ba44cc9
    • Al Viro's avatar
      spufs_create_context(): simplify failure exits · 66ec7b2c
      Al Viro authored
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      66ec7b2c
    • Al Viro's avatar
      move spu_forget() into spufs_rmdir() · 67cba9fd
      Al Viro authored
      now that __fput() is *not* done in any callchain containing mmput(),
      we can do that...
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      67cba9fd
  2. 22 Jul, 2012 20 commits