    bpf: Add bpf_sock_destroy kfunc · 4ddbcb88
    Aditi Ghag authored
    The socket destroy kfunc is used to forcefully terminate sockets from
    certain BPF contexts. We plan to use the capability in Cilium
    load-balancing to terminate client sockets that continue to connect to
    deleted backends. The other use case is on-the-fly policy enforcement,
    where existing socket connections that are disallowed by policies need to
    be forcefully terminated. The kfunc also allows terminating sockets whether
    or not they are actively sending traffic.
    
    The kfunc can currently be called only from BPF TCP and UDP iterators,
    where users can filter and terminate selected sockets. More
    specifically, it can only be called from BPF contexts that ensure
    socket locking in order to allow synchronous execution of
    protocol-specific `diag_destroy` handlers. The previous commit, which
    batches UDP sockets during iteration, facilitated a synchronous invocation
    of the UDP destroy callback from BPF context by skipping socket locks in
    `udp_abort`. The TCP iterator already supported batching of sockets being
    iterated. To that end, a `tracing_iter_filter` callback filter is added so
    that the verifier can restrict the kfunc to programs with the
    `BPF_TRACE_ITER` attach type, and reject other programs.
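
    The attach-type restriction described above can be illustrated with a
    small user-space model. This is a sketch only, not the kernel code: every
    `model_*` / `MODEL_*` name, the ID values, and the `-EACCES` return code
    are assumptions made for illustration.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel attach types (illustrative values). */
enum model_attach_type { MODEL_TRACE_FENTRY, MODEL_TRACE_ITER };

/* Hypothetical, heavily reduced model of a loaded BPF program. */
struct model_prog {
	enum model_attach_type expected_attach_type;
};

/* Hypothetical kfunc IDs; in the kernel these would be BTF IDs. */
enum { MODEL_KFUNC_SOCK_DESTROY = 1, MODEL_KFUNC_OTHER = 2 };

/* The set of kfuncs that should only be callable from iterator programs. */
static const unsigned int model_sk_iter_kfunc_ids[] = {
	MODEL_KFUNC_SOCK_DESTROY,
};

static bool model_id_set_contains(unsigned int id)
{
	size_t n = sizeof(model_sk_iter_kfunc_ids) /
		   sizeof(model_sk_iter_kfunc_ids[0]);

	for (size_t i = 0; i < n; i++)
		if (model_sk_iter_kfunc_ids[i] == id)
			return true;
	return false;
}

/* Reject a restricted kfunc unless the program is an iterator program.
 * The error code is an assumption of this sketch.
 */
static int model_tracing_iter_filter(const struct model_prog *prog,
				     unsigned int kfunc_id)
{
	if (model_id_set_contains(kfunc_id) &&
	    prog->expected_attach_type != MODEL_TRACE_ITER)
		return -EACCES;
	return 0;
}
```

    In this model, only a restricted kfunc called from a non-iterator program
    is rejected; kfuncs outside the set pass through the filter unchanged.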
    
    The kfunc takes a `sock_common` type argument, even though it expects a
    `sock` pointer and casts the argument to one. This enables the verifier to
    allow the sock_destroy kfunc to be called for TCP with `sock_common` and
    UDP with `sock` structs. Furthermore, as `sock_common` only has a subset
    of the fields of `sock`, casting a pointer to the latter type might not
    always be safe for certain sockets like request sockets, but these have
    special handling in the `diag_destroy` handlers.
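
    The argument handling above can be sketched as a user-space model: the
    function accepts the smaller common struct, casts it to the full socket
    type, and dispatches to a per-protocol destroy handler. All `model_*`
    names are hypothetical, and the TCP/UDP protocol check and the
    `ECONNABORTED` error value are assumptions of this sketch rather than
    something this message states.

```c
#include <errno.h>
#include <netinet/in.h>	/* IPPROTO_TCP, IPPROTO_UDP */
#include <stddef.h>

struct model_sock;

/* Hypothetical per-protocol operations table. */
struct model_proto {
	int (*diag_destroy)(struct model_sock *sk, int err);
};

/* Hypothetical subset of fields shared by all socket flavours. */
struct model_sock_common {
	unsigned short family;
	unsigned char state;
};

/* The full socket embeds the common part as its FIRST member, which is
 * what makes the pointer cast below well defined in C.
 */
struct model_sock {
	struct model_sock_common common;
	int protocol;
	const struct model_proto *prot;
	int err;	/* records the error passed to the handler */
};

/* Hypothetical destroy handler: just records the abort reason. */
static int model_abort(struct model_sock *sk, int err)
{
	sk->err = err;
	return 0;
}

/* Takes the common type, but operates on the full socket. */
static int model_sock_destroy(struct model_sock_common *sock)
{
	struct model_sock *sk = (struct model_sock *)sock;

	/* Only protocols with a destroy handler are supported (assumed). */
	if (!sk->prot || !sk->prot->diag_destroy ||
	    (sk->protocol != IPPROTO_TCP && sk->protocol != IPPROTO_UDP))
		return -EOPNOTSUPP;

	return sk->prot->diag_destroy(sk, ECONNABORTED);
}
```

    The model deliberately ignores request sockets, which do not embed a full
    socket; as noted above, the real handlers treat those as a special case.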
    
    Additionally, the kfunc is defined with the `KF_TRUSTED_ARGS` flag to avoid
    cases where a `PTR_TO_BTF_ID` sk is obtained by following another pointer,
    e.g. getting an sk pointer (possibly even NULL) by following another sk
    pointer. The socket pointer argument passed in TCP and UDP iterators is
    tagged as `PTR_TRUSTED` in {tcp,udp}_reg_info. The TRUSTED arg changes
    are contributed by Martin KaFai Lau <martin.lau@kernel.org>.
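
    The trusted-argument rule can be pictured with a toy model of the check.
    It is a sketch under stated assumptions, not verifier code: the `model_*`
    and `MODEL_*` names, the flag values, and the `-EINVAL` return code are
    all hypothetical.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical flag values; the real ones live in kernel headers. */
#define MODEL_PTR_TRUSTED	(1u << 0)
#define MODEL_KF_TRUSTED_ARGS	(1u << 0)

/* Hypothetical, reduced model of what is tracked for a pointer. */
struct model_reg {
	unsigned int type_flags;
	bool maybe_null;
};

/* The socket pointer handed to an iterator program is tagged trusted. */
static struct model_reg model_iter_ctx_sk(void)
{
	return (struct model_reg){ .type_flags = MODEL_PTR_TRUSTED,
				   .maybe_null = false };
}

/* A pointer obtained by following another pointer loses the trusted tag
 * in this model and may even be NULL.
 */
static struct model_reg model_follow_pointer(struct model_reg parent)
{
	(void)parent;
	return (struct model_reg){ .type_flags = 0, .maybe_null = true };
}

/* A kfunc flagged with trusted args only accepts trusted pointers. */
static int model_check_kfunc_arg(unsigned int kfunc_flags,
				 struct model_reg reg)
{
	if ((kfunc_flags & MODEL_KF_TRUSTED_ARGS) &&
	    !(reg.type_flags & MODEL_PTR_TRUSTED))
		return -EINVAL;
	return 0;
}
```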

    Signed-off-by: Aditi Ghag <aditi.ghag@isovalent.com>
    Link: https://lore.kernel.org/r/20230519225157.760788-8-aditi.ghag@isovalent.com
    Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
tcp.c 127 KB