• Frederic Weisbecker's avatar
    cgroup: Walk task list under tasklist_lock in cgroup_enable_task_cg_list · 3ce3230a
    Frederic Weisbecker authored
    Walking through the tasklist in cgroup_enable_task_cg_list() inside
    an RCU read side critical section is not enough because:
    
    - RCU is not (yet) safe against while_each_thread()
    
    - If we use only RCU, a forking task that has passed cgroup_post_fork()
      without seeing use_task_css_set_links == 1 is not guaranteed to have
      its child immediately visible in the tasklist if we walk through it
      remotely with RCU. In this case it will be missing in its css_set's
      task list.
    
    Thus we need to traverse the list (unfortunately) under the
    tasklist_lock. It makes us safe against while_each_thread() and also
    make sure we see all forked task that have been added to the tasklist.
    
    As a secondary effect, reading and writing use_task_css_set_links are
    now well ordered against tasklist traversing and modification. The new
    layout is:
    
    CPU 0                                      CPU 1
    
    use_task_css_set_links = 1                write_lock(tasklist_lock)
    read_lock(tasklist_lock)                  add task to tasklist
    do_each_thread() {                        write_unlock(tasklist_lock)
    	add thread to css set links       if (use_task_css_set_links)
    } while_each_thread()                         add thread to css set links
    read_unlock(tasklist_lock)
    
    If CPU 0 traverse the list after the task has been added to the tasklist
    then it is correctly added to the css set links. OTOH if CPU 0 traverse
    the tasklist before the new task had the opportunity to be added to the
    tasklist because it was too early in the fork process, then CPU 1
    catches up and add the task to the css set links after it added the task
    to the tasklist. The right value of use_task_css_set_links is guaranteed
    to be visible from CPU 1 due to the LOCK/UNLOCK implicit barrier properties:
    the read_unlock on CPU 0 makes the write on use_task_css_set_links happening
    and the write_lock on CPU 1 make the read of use_task_css_set_links that comes
    afterward to return the correct value.
    Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
    Acked-by: default avatarLi Zefan <lizf@cn.fujitsu.com>
    Signed-off-by: default avatarTejun Heo <tj@kernel.org>
    Cc: Mandeep Singh Baines <msb@chromium.org>
    Cc: Oleg Nesterov <oleg@redhat.com>
    Cc: Andrew Morton <akpm@linux-foundation.org>
    Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    3ce3230a
cgroup.c 138 KB