    bpf: Add link-based BPF program attachment to network namespace · 7f045a49
    Jakub Sitnicki authored
    Extend bpf() syscall subcommands that operate on bpf_link, that is
    LINK_CREATE, LINK_UPDATE, OBJ_GET_INFO, to accept attach types tied to
    network namespaces (only flow dissector at the moment).
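
    From userspace, creating such a link could look roughly like the sketch
    below. It assumes UAPI headers that already define BPF_LINK_CREATE, and
    that the netns is passed as an FD in link_create.target_fd. The helper
    names netns_link_create() and netns_link_create_self() are made up for
    this illustration.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <linux/bpf.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Attach an already loaded flow dissector program to the network
 * namespace referred to by netns_fd, using a bpf_link.
 * Returns the new link FD, or -1 with errno set.
 * (Hypothetical helper, not part of any library.)
 */
static int netns_link_create(int prog_fd, int netns_fd)
{
	union bpf_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.link_create.prog_fd = prog_fd;
	attr.link_create.target_fd = netns_fd;	/* assumption: netns FD goes here */
	attr.link_create.attach_type = BPF_FLOW_DISSECTOR;
	attr.link_create.flags = 0;

	return (int)syscall(__NR_bpf, BPF_LINK_CREATE, &attr, sizeof(attr));
}

/* Same as above, but targets the caller's own network namespace. */
static int netns_link_create_self(int prog_fd)
{
	int netns_fd, link_fd, saved_errno;

	netns_fd = open("/proc/self/ns/net", O_RDONLY | O_CLOEXEC);
	if (netns_fd < 0)
		return -1;

	link_fd = netns_link_create(prog_fd, netns_fd);

	/* Close the netns FD without clobbering errno from the bpf() call. */
	saved_errno = errno;
	close(netns_fd);
	errno = saved_errno;

	return link_fd;
}
```

    Closing the returned link FD is expected to release the link, provided
    nothing else (another FD or a bpffs pin) still refers to it.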
    
    Link-based and prog-based attachment can be used interchangeably, but only
    one can exist at a time. Attempts to attach a link when a prog is already
    attached directly, and the other way around, will be met with -EEXIST.
    Attempts to detach a program when a link exists result in -EINVAL.
    
    Attaching multiple links of the same attach type to one netns is not
    supported for now; the restriction can be lifted when a use-case presents
    itself. Until then, link create returns -E2BIG when a netns link of that
    attach type already exists.
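
    The error codes described above can be summarized with a small userspace
    model. This is only a sketch of the rules as stated in this message, not
    the kernel code; struct attach_point and the model_*() functions are
    invented for illustration, and the -ENOENT case is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one attach point: one attach type in one netns. */
struct attach_point {
	bool has_prog;	/* program attached directly (prog-based) */
	bool has_link;	/* program attached through a bpf_link */
};

/* Prog-based attach. */
static int model_prog_attach(struct attach_point *ap)
{
	assert(ap != NULL);
	if (ap->has_link)
		return -EEXIST;	/* a link already occupies the attach point */
	/* Assumption: re-attaching a prog simply succeeds in this model. */
	ap->has_prog = true;
	return 0;
}

/* Link-based attach. */
static int model_link_create(struct attach_point *ap)
{
	assert(ap != NULL);
	if (ap->has_link)
		return -E2BIG;	/* only one link per attach type for now */
	if (ap->has_prog)
		return -EEXIST;	/* a directly attached prog is in the way */
	ap->has_link = true;
	return 0;
}

/* Prog-based detach. */
static int model_prog_detach(struct attach_point *ap)
{
	assert(ap != NULL);
	if (ap->has_link)
		return -EINVAL;	/* detach is refused while a link exists */
	if (!ap->has_prog)
		return -ENOENT;	/* assumption: not covered by this message */
	ap->has_prog = false;
	return 0;
}
```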
    
    Link-based attachments to netns don't keep a netns alive by holding a ref
    to it. Instead links get auto-detached from netns when the latter is being
    destroyed, using a pernet pre_exit callback.
    
    When auto-detached, a link lives on in a defunct state for as long as
    there are open FDs for it. -ENOLINK is returned if a user tries to update
    a defunct link.
    
    Because bpf_link to netns doesn't hold a ref to struct net, special care is
    taken when releasing, updating, or filling link info. The netns might be
    getting torn down when any of these link operations are in progress. That
    is why auto-detach and update/release/fill_info are synchronized by the
    same mutex. Also, link ops have to always check if auto-detach has not
    happened yet and if netns is still alive (refcnt > 0).
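
    The locking scheme can be illustrated with a userspace analogy that uses
    a pthread mutex in place of the kernel mutex. All names below
    (model_net, model_link, model_pre_exit, model_link_update) are
    hypothetical, and returning -ENOLINK for a dying netns is an assumption
    of this sketch; the message only states it for a defunct link.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct net and a netns bpf_link. */
struct model_net {
	int refcnt;		/* > 0 while the netns is alive */
};

struct model_link {
	struct model_net *net;	/* NULL once auto-detached (defunct) */
	int prog_id;		/* stands in for the attached program */
};

/* One mutex serializes auto-detach with the link operations. */
static pthread_mutex_t model_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Auto-detach, as a pre_exit style callback might do it. */
static void model_pre_exit(struct model_link *link)
{
	assert(link != NULL);
	pthread_mutex_lock(&model_mutex);
	link->net = NULL;	/* no netns ref was held, so just forget it */
	pthread_mutex_unlock(&model_mutex);
}

/* Update the program behind a link, unless the link is defunct. */
static int model_link_update(struct model_link *link, int new_prog_id)
{
	int ret = 0;

	assert(link != NULL);
	pthread_mutex_lock(&model_mutex);
	if (link->net == NULL || link->net->refcnt <= 0)
		ret = -ENOLINK;	/* auto-detached, or netns being torn down */
	else
		link->prog_id = new_prog_id;
	pthread_mutex_unlock(&model_mutex);

	return ret;
}
```

    Because both paths take the same mutex, an update either completes before
    auto-detach clears the pointer or observes the cleared pointer and bails
    out; it never dereferences a netns that is already gone.
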
    Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Link: https://lore.kernel.org/bpf/20200531082846.2117903-5-jakub@cloudflare.com
syscall.c 97.9 KB