• Daniel Borkmann's avatar
    bpf: Allow for map-in-map with dynamic inner array map entries · 4a8f87e6
    Daniel Borkmann authored
    Recent work in f4d05259 ("bpf: Add map_meta_equal map ops") and 134fede4
    ("bpf: Relax max_entries check for most of the inner map types") added support
    for dynamic inner max elements for most map-in-map types. Exceptions were maps
    like array or prog array where the map_gen_lookup() callback uses the maps'
    max_entries field as a constant when emitting instructions.
    
    We recently implemented Maglev consistent hashing into Cilium's load balancer
    which uses map-in-map with an outer map being hash and inner being array holding
    the Maglev backend table for each service. This has been designed this way in
    order to reduce overall memory consumption given the outer hash map allows to
    avoid preallocating a large, flat memory area for all services. Also, the
    number of service mappings is not always known a-priori.
    
    The use case for dynamic inner array map entries is to further reduce memory
    overhead, for example, some services might just have a small number of back
    ends while others could have a large number. Right now the Maglev backend table
    for small and large number of backends would need to have the same inner array
    map entries which adds a lot of unneeded overhead.
    
    Dynamic inner array map entries can be realized by avoiding the inlined code
    generation for their lookup. The lookup will still be efficient since it will
    be calling into array_map_lookup_elem() directly and thus avoiding retpoline.
    The patch adds a BPF_F_INNER_MAP flag to map creation which therefore skips
    inline code generation and relaxes array_map_meta_equal() check to ignore both
    maps' max_entries. This also still allows to have faster lookups for map-in-map
    when BPF_F_INNER_MAP is not specified and hence dynamic max_entries not needed.
    
    Example code generation where inner map is dynamic sized array:
    
      # bpftool p d x i 125
      int handle__sys_enter(void * ctx):
      ; int handle__sys_enter(void *ctx)
         0: (b4) w1 = 0
      ; int key = 0;
         1: (63) *(u32 *)(r10 -4) = r1
         2: (bf) r2 = r10
      ;
         3: (07) r2 += -4
      ; inner_map = bpf_map_lookup_elem(&outer_arr_dyn, &key);
         4: (18) r1 = map[id:468]
         6: (07) r1 += 272
         7: (61) r0 = *(u32 *)(r2 +0)
         8: (35) if r0 >= 0x3 goto pc+5
         9: (67) r0 <<= 3
        10: (0f) r0 += r1
        11: (79) r0 = *(u64 *)(r0 +0)
        12: (15) if r0 == 0x0 goto pc+1
        13: (05) goto pc+1
        14: (b7) r0 = 0
        15: (b4) w6 = -1
      ; if (!inner_map)
        16: (15) if r0 == 0x0 goto pc+6
        17: (bf) r2 = r10
      ;
        18: (07) r2 += -4
      ; val = bpf_map_lookup_elem(inner_map, &key);
        19: (bf) r1 = r0                               | No inlining but instead
        20: (85) call array_map_lookup_elem#149280     | call to array_map_lookup_elem()
      ; return val ? *val : -1;                        | for inner array lookup.
        21: (15) if r0 == 0x0 goto pc+1
      ; return val ? *val : -1;
        22: (61) r6 = *(u32 *)(r0 +0)
      ; }
        23: (bc) w0 = w6
        24: (95) exit
    Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
    Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
    Acked-by: default avatarAndrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/20201010234006.7075-4-daniel@iogearbox.net
    4a8f87e6
xskmap.c 6.25 KB