• Paul Gortmaker's avatar
    cgroup1: fix leaked context root causing sporadic NULL deref in LTP · 1e7107c5
    Paul Gortmaker authored
    Richard reported sporadic (roughly one in 10 or so) null dereferences and
    other strange behaviour for a set of automated LTP tests.  Things like:
    
       BUG: kernel NULL pointer dereference, address: 0000000000000008
       #PF: supervisor read access in kernel mode
       #PF: error_code(0x0000) - not-present page
       PGD 0 P4D 0
       Oops: 0000 [#1] PREEMPT SMP PTI
       CPU: 0 PID: 1516 Comm: umount Not tainted 5.10.0-yocto-standard #1
       Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-48-gd9c812dda519-prebuilt.qemu.org 04/01/2014
       RIP: 0010:kernfs_sop_show_path+0x1b/0x60
    
    ...or these others:
    
       RIP: 0010:do_mkdirat+0x6a/0xf0
       RIP: 0010:d_alloc_parallel+0x98/0x510
       RIP: 0010:do_readlinkat+0x86/0x120
    
    There were other less common instances of some kind of a general scribble
    but the common theme was mount and cgroup and a dubious dentry triggering
    the NULL dereference.  I was only able to reproduce it under qemu by
    replicating Richard's setup as closely as possible - I never did get it
    to happen on bare metal, even while keeping everything else the same.
    
    In commit 71d883c3 ("cgroup_do_mount(): massage calling conventions")
    we see this as a part of the overall change:
    
       --------------
               struct cgroup_subsys *ss;
       -       struct dentry *dentry;
    
       [...]
    
       -       dentry = cgroup_do_mount(&cgroup_fs_type, fc->sb_flags, root,
       -                                CGROUP_SUPER_MAGIC, ns);
    
       [...]
    
       -       if (percpu_ref_is_dying(&root->cgrp.self.refcnt)) {
       -               struct super_block *sb = dentry->d_sb;
       -               dput(dentry);
       +       ret = cgroup_do_mount(fc, CGROUP_SUPER_MAGIC, ns);
       +       if (!ret && percpu_ref_is_dying(&root->cgrp.self.refcnt)) {
       +               struct super_block *sb = fc->root->d_sb;
       +               dput(fc->root);
                       deactivate_locked_super(sb);
                       msleep(10);
                       return restart_syscall();
               }
       --------------
    
    In changing from the local "*dentry" variable to using fc->root, we now
    export/leave that dentry pointer in the file context after doing the dput()
    in the unlikely "is_dying" case.   With LTP doing a crazy amount of back to
    back mount/unmount [testcases/bin/cgroup_regression_5_1.sh] the unlikely
    becomes slightly likely and then bad things happen.
    
    A fix would be to not leave the stale reference in fc->root as follows:
    
       --------------
                      dput(fc->root);
      +               fc->root = NULL;
                      deactivate_locked_super(sb);
       --------------
    
    ...but then we are just open-coding a duplicate of fc_drop_locked() so we
    simply use that instead.
    
    Cc: Al Viro <viro@zeniv.linux.org.uk>
    Cc: Tejun Heo <tj@kernel.org>
    Cc: Zefan Li <lizefan.x@bytedance.com>
    Cc: Johannes Weiner <hannes@cmpxchg.org>
    Cc: stable@vger.kernel.org      # v5.1+
    Reported-by: default avatarRichard Purdie <richard.purdie@linuxfoundation.org>
    Fixes: 71d883c3 ("cgroup_do_mount(): massage calling conventions")
    Signed-off-by: default avatarPaul Gortmaker <paul.gortmaker@windriver.com>
    Signed-off-by: default avatarTejun Heo <tj@kernel.org>
    1e7107c5
cgroup-v1.c 33.1 KB