    bpf: Make struct task_struct an RCU-safe type · d02c48fa
    David Vernet authored
    struct task_struct objects are a bit interesting in terms of how their
    lifetime is protected by refcounts. task structs have two refcount
    fields:
    
    1. refcount_t usage: Protects the memory backing the task struct. When
       this refcount drops to 0, the task is immediately freed, without
       waiting for an RCU grace period to elapse. This is the field that
       most callers in the kernel currently use to ensure that a task
       remains valid while it's being referenced, and is what's currently
       tracked with bpf_task_acquire() and bpf_task_release().
    
    2. refcount_t rcu_users: A refcount field which, when it drops to 0,
       schedules an RCU callback that drops a reference held on the 'usage'
       field above (which is acquired when the task is first created). This
       field therefore provides a form of RCU protection on the task by
       ensuring that at least one 'usage' refcount will be held until an RCU
       grace period has elapsed. The qualifier "a form of" is important
       here, as a task can remain valid after task->rcu_users has dropped to
       0 and the subsequent RCU gp has elapsed.
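
The interaction between the two counters can be sketched with a small userspace model. This is only an illustration under stated assumptions: struct model_task and every model_* helper below are hypothetical names, not kernel code, the initial counts are simplified, and an RCU grace period is stood in for by an explicit function call.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; NOT the kernel's struct task_struct. */
struct model_task {
	int usage;           /* protects the backing memory */
	int rcu_users;       /* reaching 0 schedules the RCU callback */
	bool rcu_cb_pending; /* callback queued, grace period not yet over */
	bool freed;          /* set once 'usage' reaches 0 */
};

static void model_task_init(struct model_task *t)
{
	t->usage = 1;     /* the 'usage' ref held on behalf of rcu_users */
	t->rcu_users = 1; /* simplified initial count (assumption) */
	t->rcu_cb_pending = false;
	t->freed = false;
}

/* Take a plain 'usage' reference. */
static void model_get_task(struct model_task *t)
{
	assert(!t->freed && t->usage > 0);
	t->usage++;
}

/* Drop a 'usage' reference; the task is freed immediately at 0. */
static void model_put_task(struct model_task *t)
{
	assert(t->usage > 0);
	if (--t->usage == 0)
		t->freed = true;
}

/* Drop an 'rcu_users' reference; at 0, only queue the callback. */
static void model_put_rcu_user(struct model_task *t)
{
	assert(t->rcu_users > 0);
	if (--t->rcu_users == 0)
		t->rcu_cb_pending = true;
}

/* Stand-in for a grace period ending: run the queued callback,
 * which drops the 'usage' reference taken at creation. */
static void model_grace_period_elapsed(struct model_task *t)
{
	if (t->rcu_cb_pending) {
		t->rcu_cb_pending = false;
		model_put_task(t);
	}
}
```

In this model, a task with one extra 'usage' reference survives both model_put_rcu_user() and model_grace_period_elapsed(), and is freed only when that last 'usage' reference is dropped. That is the "a form of" caveat: rcu_users guarantees a lower bound on the task's lifetime, not the moment it is freed.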
    
    In terms of BPF, we want to use task->rcu_users to protect tasks that
    function as referenced kptrs, and to allow tasks stored as referenced
    kptrs in maps to be accessed with RCU protection.
    
    Let's first determine whether we can safely use task->rcu_users to
    protect tasks stored in maps. All of the bpf_task* kfuncs can only be
    called from tracepoint, struct_ops, or BPF_PROG_TYPE_SCHED_CLS program
    types. For tracepoint and struct_ops programs, the struct task_struct
    passed to a program handler will always be trusted, so it will always be
    safe to call bpf_task_acquire() with any task passed to a program.
    Note, however, that we must update bpf_task_acquire() to be KF_RET_NULL,
    as it is possible that the task has exited by the time the program is
    invoked, even if the pointer is still currently valid because the main
    kernel holds a task->usage refcount. For BPF_PROG_TYPE_SCHED_CLS, tasks
    should never be passed as an argument to any program handler, so that
    case should not be relevant.
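
Why an acquire on rcu_users has to be able to return NULL can be sketched in userspace C11. This is a minimal sketch under assumptions, not the patch's code: struct sketch_task and sketch_task_acquire() are hypothetical names, and the increment-unless-zero loop is an assumed way of taking a reference on a counter that may already have dropped to 0.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a task; only the counter is modeled. */
struct sketch_task {
	atomic_int rcu_users;
};

/*
 * Take a reference only if the count is still nonzero. Returns the
 * task on success, or NULL if rcu_users already dropped to 0 (the
 * "task has exited" case), even though the pointer itself may still
 * be valid because someone else holds a 'usage' reference.
 */
static struct sketch_task *sketch_task_acquire(struct sketch_task *t)
{
	int old = atomic_load(&t->rcu_users);

	while (old != 0) {
		/* On failure, 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&t->rcu_users, &old, old + 1))
			return t;
	}
	return NULL;
}
```

A caller in this model must check the result before using it, which is the obligation that KF_RET_NULL places on BPF programs calling bpf_task_acquire().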
    
    The second question is whether it's safe to use RCU to access a task
    that was acquired with bpf_task_acquire(), and stored in a map. Because
    bpf_task_acquire() now uses task->rcu_users, it follows that if the task
    is present in the map, it must have had at least one task->rcu_users
    refcount when the current RCU critical section (cs) started.
    Therefore, it's safe to access that task until the end of the current
    RCU cs.
    
    With all that said, this patch makes struct task_struct an
    RCU-protected object. In doing so, we also change bpf_task_acquire() to
    be KF_ACQUIRE | KF_RCU | KF_RET_NULL, and adjust any selftests as
    necessary. A subsequent patch will remove bpf_task_kptr_get() and
    bpf_task_acquire_not_zero().

    Signed-off-by: David Vernet <void@manifault.com>
    Link: https://lore.kernel.org/r/20230331195733.699708-2-void@manifault.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
task_kfunc.c 2.07 KB