• John Fastabend's avatar
    bpf: sockmap, remove STRPARSER map_flags and add multi-map support · 2f857d04
    John Fastabend authored
    The addition of map_flags BPF_SOCKMAP_STRPARSER flags was to handle a
    specific use case where we want to have BPF parse program disabled on
    an entry in a sockmap.
    
    However, Alexei found the API a bit cumbersome and I agreed. Lets
    remove the STRPARSER flag and support the use case by allowing socks
    to be in multiple maps. This allows users to create two maps one with
    programs attached and one without. When socks are added to maps they
    now inherit any programs attached to the map. This is a nice
    generalization and IMO improves the API.
    
    The API rules are less ambiguous and do not need a flag:
    
      - When a sock is added to a sockmap we have two cases,
    
         i. The sock map does not have any attached programs so
            we can add sock to map without inheriting bpf programs.
            The sock may exist in 0 or more other maps.
    
        ii. The sock map has an attached BPF program. To avoid duplicate
            bpf programs we only add the sock entry if it does not have
            an existing strparser/verdict attached, returning -EBUSY if
            a program is already attached. Otherwise attach the program
            and inherit strparser/verdict programs from the sock map.
    
    This allows for socks to be in a multiple maps for redirects and
    inherit a BPF program from a single map.
    
    Also this patch simplifies the logic around BPF_{EXIST|NOEXIST|ANY}
    flags. In the original patch I tried to be extra clever and only
    update map entries when necessary. Now I've decided the complexity
    is not worth it. If users constantly update an entry with the same
    sock for no reason (i.e. update an entry without actually changing
    any parameters on map or sock) we still do an alloc/release. Using
    this and allowing multiple entries of a sock to exist in a map the
    logic becomes much simpler.
    
    Note: Now that multiple maps are supported the "maps" pointer called
    when a socket is closed becomes a list of maps to remove the sock from.
    To keep the map up to date when a sock is added to the sockmap we must
    add the map/elem in the list. Likewise when it is removed we must
    remove it from the list. This results in searching the per psock list
    on delete operation. On TCP_CLOSE events we walk the list and remove
    the psock from all map/entry locations. I don't see any perf
    implications in this because at most I have a psock in two maps. If
    a psock were to be in many maps its possibly this might be noticeable
    on delete but I can't think of a reason to dup a psock in many maps.
    The sk_callback_lock is used to protect read/writes to the list. This
    was convenient because in all locations we were taking the lock
    anyways just after working on the list. Also the lock is per sock so
    in normal cases we shouldn't see any contention.
    Suggested-by: default avatarAlexei Starovoitov <ast@kernel.org>
    Fixes: 174a79ff ("bpf: sockmap with sk redirect support")
    Signed-off-by: default avatarJohn Fastabend <john.fastabend@gmail.com>
    Acked-by: default avatarAlexei Starovoitov <ast@kernel.org>
    Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
    2f857d04
sockmap.c 22.4 KB