• Andrey Ignatov's avatar
    bpf: Support replacing cgroup-bpf program in MULTI mode · 7dd68b32
    Andrey Ignatov authored
    The common use-case in production is to have multiple cgroup-bpf
    programs per attach type that cover multiple use-cases. Such programs
    are attached with BPF_F_ALLOW_MULTI and can be maintained by different
    people.
    
    Order of programs usually matters, for example imagine two egress
    programs: the first one drops packets and the second one counts packets.
    If they're swapped the result of counting program will be different.
    
    It brings operational challenges with updating cgroup-bpf program(s)
    attached with BPF_F_ALLOW_MULTI since there is no way to replace a
    program:
    
    * One way to update is to detach all programs first and then attach the
      new version(s) again in the right order. This introduces an
      interruption in the work a program is doing and may not be acceptable
      (e.g. if it's egress firewall);
    
    * Another way is attach the new version of a program first and only then
      detach the old version. This introduces the time interval when two
      versions of same program are working, what may not be acceptable if a
      program is not idempotent. It also imposes additional burden on
      program developers to make sure that two versions of their program can
      co-exist.
    
    Solve the problem by introducing a "replace" mode in BPF_PROG_ATTACH
    command for cgroup-bpf programs being attached with BPF_F_ALLOW_MULTI
    flag. This mode is enabled by newly introduced BPF_F_REPLACE attach flag
    and bpf_attr.replace_bpf_fd attribute to pass fd of the old program to
    replace
    
    That way user can replace any program among those attached with
    BPF_F_ALLOW_MULTI flag without the problems described above.
    
    Details of the new API:
    
    * If BPF_F_REPLACE is set but replace_bpf_fd doesn't have valid
      descriptor of BPF program, BPF_PROG_ATTACH will return corresponding
      error (EINVAL or EBADF).
    
    * If replace_bpf_fd has valid descriptor of BPF program but such a
      program is not attached to specified cgroup, BPF_PROG_ATTACH will
      return ENOENT.
    
    BPF_F_REPLACE is introduced to make the user intent clear, since
    replace_bpf_fd alone can't be used for this (its default value, 0, is a
    valid fd). BPF_F_REPLACE also makes it possible to extend the API in the
    future (e.g. add BPF_F_BEFORE and BPF_F_AFTER if needed).
    Signed-off-by: default avatarAndrey Ignatov <rdna@fb.com>
    Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
    Acked-by: default avatarMartin KaFai Lau <kafai@fb.com>
    Acked-by: default avatarAndrii Narkyiko <andriin@fb.com>
    Link: https://lore.kernel.org/bpf/30cd850044a0057bdfcaaf154b7d2f39850ba813.1576741281.git.rdna@fb.com
    7dd68b32
cgroup.c 167 KB