• Christian Brauner's avatar
    fork: do not release lock that wasn't taken · c3b7112d
    Christian Brauner authored
    Avoid calling cgroup_threadgroup_change_end() without having called
    cgroup_threadgroup_change_begin() first.
    
    During process creation we need to check whether the cgroup we are in
    allows us to fork. To perform this check the cgroup needs to guard itself
    against threadgroup changes and takes a lock.
    Prior to CLONE_PIDFD the cleanup target "bad_fork_free_pid" would also need
    to call cgroup_threadgroup_change_end() because said lock had already been
    taken.
    However, this is not the case anymore with the addition of CLONE_PIDFD. We
    are now allocating a pidfd before we check whether the cgroup we're in can
    fork and thus prior to taking the lock. So when copy_process() fails at the
    right step it would release a lock we haven't taken.
    This bug is not even very subtle to be honest. It's just not very clear
    from the naming of cgroup_threadgroup_change_{begin,end}() that a lock is
    taken.
    
    Here's the relevant splat:
    
    entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139
    RIP: 0023:0xf7fec849
    Code: 85 d2 74 02 89 0a 5b 5d c3 8b 04 24 c3 8b 14 24 c3 8b 3c 24 c3 90 90
    90 90 90 90 90 90 90 90 90 90 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90
    90 90 90 eb 0d 90 90 90 90 90 90 90 90 90 90 90 90
    RSP: 002b:00000000ffed5a8c EFLAGS: 00000246 ORIG_RAX: 0000000000000078
    RAX: ffffffffffffffda RBX: 0000000000003ffc RCX: 0000000000000000
    RDX: 00000000200005c0 RSI: 0000000000000000 RDI: 0000000000000000
    RBP: 0000000000000012 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
    R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
    ------------[ cut here ]------------
    DEBUG_LOCKS_WARN_ON(depth <= 0)
    WARNING: CPU: 1 PID: 7744 at kernel/locking/lockdep.c:4052 __lock_release
    kernel/locking/lockdep.c:4052 [inline]
    WARNING: CPU: 1 PID: 7744 at kernel/locking/lockdep.c:4052
    lock_release+0x667/0xa00 kernel/locking/lockdep.c:4321
    Kernel panic - not syncing: panic_on_warn set ...
    CPU: 1 PID: 7744 Comm: syz-executor007 Not tainted 5.1.0+ #4
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
    Google 01/01/2011
    Call Trace:
      __dump_stack lib/dump_stack.c:77 [inline]
      dump_stack+0x172/0x1f0 lib/dump_stack.c:113
      panic+0x2cb/0x65c kernel/panic.c:214
      __warn.cold+0x20/0x45 kernel/panic.c:566
      report_bug+0x263/0x2b0 lib/bug.c:186
      fixup_bug arch/x86/kernel/traps.c:179 [inline]
      fixup_bug arch/x86/kernel/traps.c:174 [inline]
      do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:272
      do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:291
      invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:972
    RIP: 0010:__lock_release kernel/locking/lockdep.c:4052 [inline]
    RIP: 0010:lock_release+0x667/0xa00 kernel/locking/lockdep.c:4321
    Code: 0f 85 a0 03 00 00 8b 35 77 66 08 08 85 f6 75 23 48 c7 c6 a0 55 6b 87
    48 c7 c7 40 25 6b 87 4c 89 85 70 ff ff ff e8 b7 a9 eb ff <0f> 0b 4c 8b 85
    70 ff ff ff 4c 89 ea 4c 89 e6 4c 89 c7 e8 52 63 ff
    RSP: 0018:ffff888094117b48 EFLAGS: 00010086
    RAX: 0000000000000000 RBX: 1ffff11012822f6f RCX: 0000000000000000
    RDX: 0000000000000000 RSI: ffffffff815af236 RDI: ffffed1012822f5b
    RBP: ffff888094117c00 R08: ffff888092bfc400 R09: fffffbfff113301d
    R10: fffffbfff113301c R11: ffffffff889980e3 R12: ffffffff8a451df8
    R13: ffffffff8142e71f R14: ffffffff8a44cc80 R15: ffff888094117bd8
      percpu_up_read.constprop.0+0xcb/0x110 include/linux/percpu-rwsem.h:92
      cgroup_threadgroup_change_end include/linux/cgroup-defs.h:712 [inline]
      copy_process.part.0+0x47ff/0x6710 kernel/fork.c:2222
      copy_process kernel/fork.c:1772 [inline]
      _do_fork+0x25d/0xfd0 kernel/fork.c:2338
      __do_compat_sys_x86_clone arch/x86/ia32/sys_ia32.c:240 [inline]
      __se_compat_sys_x86_clone arch/x86/ia32/sys_ia32.c:236 [inline]
      __ia32_compat_sys_x86_clone+0xbc/0x140 arch/x86/ia32/sys_ia32.c:236
      do_syscall_32_irqs_on arch/x86/entry/common.c:334 [inline]
      do_fast_syscall_32+0x281/0xd54 arch/x86/entry/common.c:405
      entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139
    RIP: 0023:0xf7fec849
    Code: 85 d2 74 02 89 0a 5b 5d c3 8b 04 24 c3 8b 14 24 c3 8b 3c 24 c3 90 90
    90 90 90 90 90 90 90 90 90 90 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90
    90 90 90 eb 0d 90 90 90 90 90 90 90 90 90 90 90 90
    RSP: 002b:00000000ffed5a8c EFLAGS: 00000246 ORIG_RAX: 0000000000000078
    RAX: ffffffffffffffda RBX: 0000000000003ffc RCX: 0000000000000000
    RDX: 00000000200005c0 RSI: 0000000000000000 RDI: 0000000000000000
    RBP: 0000000000000012 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
    R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
    Kernel Offset: disabled
    Rebooting in 86400 seconds..
    
    Reported-and-tested-by: syzbot+3286e58549edc479faae@syzkaller.appspotmail.com
    Fixes: b3e58382 ("clone: add CLONE_PIDFD")
    Signed-off-by: default avatarChristian Brauner <christian@brauner.io>
    c3b7112d
fork.c 66.2 KB