1. 19 Jun, 2019 6 commits
    • bpf: fix callees pruning callers · eea1c227
      Alexei Starovoitov authored
      The commit 7640ead9 partially resolved the issue of callees
      incorrectly pruning the callers.
      With the introduction of bounded loops and the jmps_processed heuristic,
      a single verifier state may contain multiple branches and calls.
      It's possible that a new verifier state (for future pruning) will be
      allocated inside a callee. The callee will then exit (still within the same
      verifier state) and return to the caller, where registers R6-R9
      will be read and trigger mark_reg_read. But reg->live for all frames
      except the top frame is not set to LIVE_NONE. Hence mark_reg_read will fail
      to propagate liveness into the parent, and future walking will incorrectly
      conclude that the states are equivalent because LIVE_READ is not set.
      In other words, the rule for parent/live should be:
      whenever a register's parentage chain is set, reg->live should be set to LIVE_NONE.
      is_state_visited logic already follows this rule for spilled registers.
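
      The rule can be illustrated with a toy model. This is only a sketch: the
      toy_* names are hypothetical, and it assumes the stale flag is a write mark
      that stops the upward walk. It is not the verifier's actual code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy liveness flags, loosely modelled on the verifier's liveness marks. */
enum toy_live {
	TOY_LIVE_NONE    = 0,
	TOY_LIVE_READ    = 1,
	TOY_LIVE_WRITTEN = 2,
};

/* One register in one state, linked to the same register in the parent state. */
struct toy_reg {
	struct toy_reg *parent;
	unsigned int live;
};

/*
 * Propagate a read up the parentage chain. The walk stops at the first
 * register marked as written in its own state, because ancestors then
 * no longer need the value.
 */
static void toy_mark_reg_read(struct toy_reg *reg)
{
	while (reg->parent) {
		if (reg->live & TOY_LIVE_WRITTEN)
			break;
		reg->parent->live |= TOY_LIVE_READ;
		reg = reg->parent;
	}
}

/* Link a register to its parent; reset_live applies the rule from the message. */
static void toy_set_parent(struct toy_reg *reg, struct toy_reg *parent,
			   bool reset_live)
{
	reg->parent = parent;
	if (reset_live)
		reg->live = TOY_LIVE_NONE;
}

/* Returns true if a read in the child is recorded in the parent. */
static bool toy_read_reaches_parent(bool reset_live)
{
	struct toy_reg parent = { NULL, TOY_LIVE_NONE };
	struct toy_reg child = { NULL, TOY_LIVE_WRITTEN }; /* stale mark */

	toy_set_parent(&child, &parent, reset_live);
	toy_mark_reg_read(&child);
	return (parent.live & TOY_LIVE_READ) != 0;
}
```

      With reset_live set, the read reaches the parent; without it, the stale
      mark stops the walk and the parent looks as if the register was never read.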
      
      Fixes: 7640ead9 ("bpf: verifier: make sure callees don't prune with caller differences")
      Fixes: f4d7e40a ("bpf: introduce function calls (verification)")
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • bpf: introduce bounded loops · 2589726d
      Alexei Starovoitov authored
      Allow the verifier to validate loops by simulating their execution.
      Existing programs have used '#pragma unroll' to have the compiler
      unroll loops. Instead, let the verifier simulate all iterations
      of the loop.
      In order to do that, introduce a parentage chain of bpf_verifier_state and
      a 'branches' counter for the number of branches left to explore.
      See the more detailed algorithm description in bpf_verifier.h.
      
      This algorithm borrows the key idea from Edward Cree's approach:
      https://patchwork.ozlabs.org/patch/877222/
      Additional state pruning heuristics make such brute force loop walk
      practical even for large loops.
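
      As an illustration of the kind of loop this targets, here is a minimal
      sketch in plain C. The function name and the bound TOY_MAX_HDR_BYTES are
      hypothetical; the point is only that the trip count is limited by a
      constant, so every iteration can be simulated.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_HDR_BYTES 16 /* hypothetical compile-time bound */

/* Sum at most TOY_MAX_HDR_BYTES bytes; the trip count never exceeds the bound. */
static uint32_t toy_sum_bytes(const uint8_t *data, size_t len)
{
	uint32_t sum = 0;

	/* Previously a loop like this would carry '#pragma unroll'. */
	for (size_t i = 0; i < TOY_MAX_HDR_BYTES; i++) {
		if (i >= len)
			break;
		sum += data[i];
	}
	return sum;
}
```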
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • bpf: extend is_branch_taken to registers · fb8d251e
      Alexei Starovoitov authored
      This patch extends is_branch_taken() logic from JMP+K instructions
      to JMP+X instructions.
      Conditional branches are often executed when the src and dst registers
      contain known scalars. In that case the verifier can follow only
      the branch that will be taken when the program executes.
      That speeds up verification and is an essential feature for supporting
      bounded loops.
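
      The idea can be sketched as follows. This is a simplified model with
      hypothetical toy_* names and only four unsigned comparisons; the real
      is_branch_taken() handles more opcodes and value ranges.

```c
#include <stdbool.h>
#include <stdint.h>

enum toy_jmp_op { TOY_JEQ, TOY_JNE, TOY_JGT, TOY_JLT };

/* A scalar register: either a known constant or an unknown value. */
struct toy_scalar {
	bool known;
	uint64_t value;
};

/*
 * Returns 1 if the branch is always taken, 0 if it is never taken,
 * and -1 if both outcomes have to be explored.
 */
static int toy_is_branch_taken(struct toy_scalar dst, struct toy_scalar src,
			       enum toy_jmp_op op)
{
	if (!dst.known || !src.known)
		return -1;

	switch (op) {
	case TOY_JEQ:
		return dst.value == src.value;
	case TOY_JNE:
		return dst.value != src.value;
	case TOY_JGT:
		return dst.value > src.value;
	case TOY_JLT:
		return dst.value < src.value;
	}
	return -1;
}
```

      For a loop counter compared against a known limit held in a register,
      each iteration then resolves to a single path instead of two.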
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • selftests/bpf: fix tests due to const spill/fill · fc559a70
      Alexei Starovoitov authored
      Fix tests that incorrectly assumed that the verifier
      cannot track constants through the stack.
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • bpf: track spill/fill of constants · f7cf25b2
      Alexei Starovoitov authored
      Compilers often spill induction variables into the stack,
      hence it is necessary for the verifier to track scalar values
      of the registers through stack slots.
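
      A toy model of the idea, with hypothetical toy_* names: the stack slot
      remembers the full tracked state of the spilled register, so a later fill
      restores a known constant rather than an unknown scalar.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_STACK_SLOTS 8

/* A scalar register: either a known constant or an unknown value. */
struct toy_scalar {
	bool known;
	uint64_t value;
};

struct toy_frame {
	struct toy_scalar slots[TOY_STACK_SLOTS];
};

/* Spill: store the register's tracked state in the slot. */
static bool toy_spill(struct toy_frame *f, unsigned int slot,
		      struct toy_scalar reg)
{
	if (slot >= TOY_STACK_SLOTS)
		return false;
	f->slots[slot] = reg;
	return true;
}

/* Fill: restore whatever was tracked; out-of-range slots read as unknown. */
static struct toy_scalar toy_fill(const struct toy_frame *f, unsigned int slot)
{
	struct toy_scalar unknown = { false, 0 };

	return slot < TOY_STACK_SLOTS ? f->slots[slot] : unknown;
}

/* Round trip through a fresh frame, for demonstration. */
static struct toy_scalar toy_spill_then_fill(struct toy_scalar reg,
					     unsigned int slot)
{
	struct toy_frame frame = { 0 };

	toy_spill(&frame, slot, reg);
	return toy_fill(&frame, slot);
}
```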
      
      Also, a few BPF programs were incorrectly rejected in the past,
      since the verifier was not able to track such constants while
      they were used to compute offsets into packet headers.
      
      Tracking constants through the stack significantly decreases
      the chances of state pruning, since two different constants
      are considered to be different by state equivalency.
      The end result is that the cilium tests suffer a serious degradation in the number
      of states processed and a corresponding increase in verification time.
      
                           before  after
      bpf_lb-DLB_L3.o      1838    6441
      bpf_lb-DLB_L4.o      3218    5908
      bpf_lb-DUNKNOWN.o    1064    1064
      bpf_lxc-DDROP_ALL.o  26935   93790
      bpf_lxc-DUNKNOWN.o   34439   123886
      bpf_netdev.o         9721    31413
      bpf_overlay.o        6184    18561
      bpf_lxc_jit.o        39389   359445
      
      Further debugging showed that the cilium programs are also
      hurt by clang due to the same constant tracking issue.
      Newer clang generates better code by spilling less to the stack.
      Instead it keeps more constants in registers, which
      hurts state pruning since the verifier already tracks constants
      in registers (the numbers below are without the spill/fill tracking
      introduced by this patch):
                           old clang  new clang
      bpf_lb-DLB_L3.o      1838       1923
      bpf_lb-DLB_L4.o      3218       3077
      bpf_lb-DUNKNOWN.o    1064       1062
      bpf_lxc-DDROP_ALL.o  26935      166729
      bpf_lxc-DUNKNOWN.o   34439      174607
      bpf_netdev.o         9721       8407
      bpf_overlay.o        6184       5420
      bpf_lxc_jit.o        39389      39389
      
      The final table is depressing:
                           old clang  old clang         new clang  new clang
                                      const spill/fill             const spill/fill
      bpf_lb-DLB_L3.o      1838       6441              1923       8128
      bpf_lb-DLB_L4.o      3218       5908              3077       6707
      bpf_lb-DUNKNOWN.o    1064       1064              1062       1062
      bpf_lxc-DDROP_ALL.o  26935      93790             166729     380712
      bpf_lxc-DUNKNOWN.o   34439      123886            174607     440652
      bpf_netdev.o         9721       31413             8407       31904
      bpf_overlay.o        6184       18561             5420       23569
      bpf_lxc_jit.o        39389      359445            39389      359445
      
      Tracking constants in the registers hurts state pruning already.
      Adding tracking of constants through stack hurts pruning even more.
      A later patch addresses this general constant tracking issue
      with coarse/precise logic.
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • libbpf: constify getter APIs · a324aae3
      Andrii Nakryiko authored
      Add const qualifiers to bpf_object/bpf_program/bpf_map arguments for
      getter APIs. There is no need for them to not be const pointers.
      
      Verified that
      
      make -C tools/lib/bpf
      make -C tools/testing/selftests/bpf
      make -C tools/perf
      
      all build without warnings.
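
      The pattern looks roughly like this. The struct and function names are
      hypothetical stand-ins, not libbpf's real types or signatures.

```c
#include <stddef.h>

/* Hypothetical stand-in for an opaque library object. */
struct toy_object {
	const char *name;
	size_t nr_maps;
};

/* Getters only read, so they can take a pointer to const. */
static const char *toy_object_name(const struct toy_object *obj)
{
	return obj->name;
}

static size_t toy_object_nr_maps(const struct toy_object *obj)
{
	return obj->nr_maps;
}

/* A caller holding only a const object can use the getters without a cast. */
static const struct toy_object toy_example = { "example", 3 };
```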
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
  2. 17 Jun, 2019 13 commits
    • samples: bpf: refactor header include path · 4d18f6de
      Daniel T. Lee authored
      Currently, header inclusion is inconsistent across files.
      For example, the "libbpf.h" header is included in multiple ways:
      
          #include "bpf/libbpf.h"
          #include "libbpf.h"
      
      Since commit b552d33c ("samples/bpf: fix include path
      in Makefile") added the $(srctree)/tools/lib/bpf/ path to the
      build, the "bpf/" prefix in the include is no longer necessary.
      
      This commit removes the "bpf/" prefix from header inclusions.
      Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • samples: bpf: remove unnecessary include options in Makefile · fa206dcc
      Daniel T. Lee authored
      Due to the recent change of include path in commit b552d33c
      ("samples/bpf: fix include path in Makefile"), some of the
      previous include options became unnecessary.
      
      This commit removes the duplicated include options in the Makefile.
      Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • Merge branch 'bpf-libbpf-btf-defined-maps' · 32b88d37
      Daniel Borkmann authored
      Andrii Nakryiko says:
      
      ====================
      This patch set implements an initial version (as discussed at the LSF/MM2019
      conference) of a new way to specify BPF maps, relying on BTF type information,
      which allows for easy extensibility while preserving forward and backward
      compatibility. See details and examples in the description of patch #7.
      
      [0] contains an outline of follow-up extensions to be added after this basic
      set of features lands. They are useful by themselves, but also allow bringing
      libbpf to feature parity with the iproute2 BPF loader. That should open a path
      forward for unifying BPF loaders.
      
      Patch #1 centralizes commonly used min/max macro in libbpf_internal.h.
      Patch #2 extracts .BTF and .BTF.ext loading logic from elf_collect().
      Patch #3 simplifies elf_collect() error-handling logic.
      Patch #4 refactors map initialization logic into user-provided maps and global
      data maps, in preparation to adding another way (BTF-defined maps).
      Patch #5 adds support for map definitions in multiple ELF sections and
      deprecates bpf_object__find_map_by_offset() API which doesn't appear to be
      used anymore and makes assumption that all map definitions reside in single
      ELF section.
      Patch #6 splits BTF initialization from sanitization/loading into kernel to
      preserve original BTF at the time of map initialization.
      Patch #7 adds support for BTF-defined maps.
      Patch #8 adds new test for BTF-defined map definition.
      Patches #9-11 convert test BPF map definitions to use BTF way.
      
      [0] https://lore.kernel.org/bpf/CAEf4BzbfdG2ub7gCi0OYqBrUoChVHWsmOntWAkJt47=FE+km+A@mail.gmail.com/
      
      v1->v2:
      - more BTF-sanity checks in parsing map definitions (Song);
      - removed confusing usage of "attribute", switched to "field";
      - split off elf_collect() refactor from btf loading refactor (Song);
      - split selftests conversion into 3 patches (Stanislav):
        1. test already relying on BTF;
        2. tests w/ custom types as key/value (so benefiting from BTF);
        3. all the remaining tests (integers as key/value, special maps w/o BTF support).
      - smaller code improvements (Song);
      
      rfc->v1:
      - error out on unknown field by default (Stanislav, Jakub, Lorenz);
      ====================
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • selftests/bpf: convert tests w/ custom values to BTF-defined maps · df0b7792
      Andrii Nakryiko authored
      Convert a bulk of selftests that have maps with custom (not integer) key
      and/or value.
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Acked-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • selftests/bpf: switch BPF_ANNOTATE_KV_PAIR tests to BTF-defined maps · f6544074
      Andrii Nakryiko authored
      Switch tests that already rely on BTF to BTF-defined map definitions.
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Acked-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • selftests/bpf: add test for BTF-defined maps · 9e3d709c
      Andrii Nakryiko authored
      Add file test for BTF-defined map definition.
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Acked-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • libbpf: allow specifying map definitions using BTF · abd29c93
      Andrii Nakryiko authored
      This patch adds support for a new way to define BPF maps. It relies on
      BTF to describe mandatory and optional attributes of a map, and also
      captures type information of key and value naturally. This eliminates
      the need for the BPF_ANNOTATE_KV_PAIR hack and ensures key/value sizes are
      always in sync with the key/value type.
      
      Relying on BTF, this approach allows for both forward and backward
      compatibility w.r.t. extending supported map definition features. By
      default, any unrecognized attributes are treated as an error, but it's
      possible to relax this using the MAPS_RELAX_COMPAT flag. New attributes added
      in the future will need to be optional.
      
      The outline of the new map definition (short, BTF-defined maps) is as follows:
      1. All the maps should be defined in .maps ELF section. It's possible to
         have both "legacy" map definitions in `maps` sections and BTF-defined
         maps in .maps sections. Everything will still work transparently.
      2. The map declaration and initialization is done through
         a global/static variable of a struct type with a few mandatory and
         extra optional fields:
         - type field is mandatory and specifies the type of BPF map;
         - key/value fields are mandatory and capture key/value type/size information;
         - max_entries attribute is optional; if max_entries is not specified or
           initialized, it has to be provided at runtime through libbpf API
           before loading bpf_object;
         - map_flags is optional and if not defined, will be assumed to be 0.
      3. Key/value fields should be **a pointer** to a type describing
         key/value. The pointee type is assumed (and will be recorded as such
         and used for size determination) to be a type describing key/value of
         the map. This is done to save excessive amounts of space allocated in
         corresponding ELF sections for key/value of big size.
      4. As some maps disallow having BTF type ID associated with key/value,
         it's possible to specify key/value size explicitly without
         associating BTF type ID with it. Use key_size and value_size fields
         to do that (see example below).
      
      Here's an example of a simple ARRAY map definition:
      
      struct my_value { int x, y, z; };
      
      struct {
      	int type;
      	int max_entries;
      	int *key;
      	struct my_value *value;
      } btf_map SEC(".maps") = {
      	.type = BPF_MAP_TYPE_ARRAY,
      	.max_entries = 16,
      };
      
      This will define BPF ARRAY map 'btf_map' with 16 elements. The key will
      be of type int and thus key size will be 4 bytes. The value is struct
      my_value of size 12 bytes. This map can be used from C code exactly the
      same as with existing maps defined through struct bpf_map_def.
      
      Here's an example of STACKMAP definition (which currently disallows BTF type
      IDs for key/value):
      
      struct {
      	__u32 type;
      	__u32 max_entries;
      	__u32 map_flags;
      	__u32 key_size;
      	__u32 value_size;
      } stackmap SEC(".maps") = {
      	.type = BPF_MAP_TYPE_STACK_TRACE,
      	.max_entries = 128,
      	.map_flags = BPF_F_STACK_BUILD_ID,
      	.key_size = sizeof(__u32),
      	.value_size = PERF_MAX_STACK_DEPTH * sizeof(struct bpf_stack_build_id),
      };
      
      This approach is naturally extended to support map-in-map, by making the value
      field another struct that describes the inner map. This feature is not
      implemented yet. It's also possible to incrementally add features like pinning
      with full backwards and forward compatibility. Support for static
      initialization of BPF_MAP_TYPE_PROG_ARRAY using pointers to BPF programs
      is also on the roadmap.
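
      The space argument behind pointer-typed key/value fields (point 3 above)
      can be checked in plain C. This is only a sketch: the toy_* names are
      hypothetical, and sizeof on the pointee stands in for the size that libbpf
      would take from BTF.

```c
#include <stddef.h>

/* A deliberately large value type, to make the size difference visible. */
struct toy_big_value {
	char payload[4096];
};

/* Definition that embeds the value: the variable itself exceeds 4 KiB. */
struct toy_map_embedded {
	int type;
	int max_entries;
	int key;
	struct toy_big_value value;
};

/* Definition that only points at key/value types: a few words in size. */
struct toy_map_pointers {
	int type;
	int max_entries;
	int *key;
	struct toy_big_value *value;
};

/* The pointee type still yields the real key/value size. */
#define TOY_KEY_SIZE(def)   sizeof(*((def)->key))
#define TOY_VALUE_SIZE(def) sizeof(*((def)->value))

/* Never dereferenced; sizeof does not evaluate its operand. */
static struct toy_map_pointers toy_map;
```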
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Acked-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • libbpf: split initialization and loading of BTF · 063183bf
      Andrii Nakryiko authored
      Libbpf sanitizes BTF before loading it into the kernel if the kernel
      doesn't support some of the newer BTF features. This removes some
      important information from BTF (e.g., DATASEC and VAR descriptions),
      which will be used for map construction. This patch splits BTF
      processing into an initialization step, in which BTF is initialized from
      ELF and all the original data is still preserved, and a
      sanitization/loading step, which ensures that BTF is safe to load into
      the kernel. This allows using full BTF information to construct maps, while
      still loading valid BTF into older kernels.
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Acked-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • libbpf: identify maps by section index in addition to offset · db48814b
      Andrii Nakryiko authored
      To support maps defined in multiple sections, it's important to
      identify a map not just by offset within its section, but by section index as
      well. This patch adds tracking of section index.
      
      For global data, we record section index of corresponding
      .data/.bss/.rodata ELF section for uniformity, and thus don't need
      a special value of offset for those maps.
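
      A minimal sketch of why the offset alone is ambiguous once several
      sections hold maps. The struct, field and function names here are
      hypothetical, not libbpf's.

```c
#include <stddef.h>

/* Hypothetical record of where a map definition was found in the ELF file. */
struct toy_map_loc {
	int sec_idx;       /* index of the ELF section holding the definition */
	size_t sec_offset; /* byte offset within that section */
};

/* A map matches only if both the section index and the offset match. */
static int toy_find_map(const struct toy_map_loc *maps, size_t nr,
			int sec_idx, size_t sec_offset)
{
	for (size_t i = 0; i < nr; i++) {
		if (maps[i].sec_idx == sec_idx &&
		    maps[i].sec_offset == sec_offset)
			return (int)i;
	}
	return -1;
}

/* Offset 0 appears twice, once per section. */
static const struct toy_map_loc toy_maps[] = {
	{ 3, 0 },
	{ 3, 20 },
	{ 5, 0 },
};
```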
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Acked-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • libbpf: refactor map initialization · bf829271
      Andrii Nakryiko authored
      Initialization of user and global data maps has become
      unnecessarily convoluted. This patch splits out the logic for global
      data map and user-defined map initialization. It also removes the
      need to pre-calculate how many maps will be initialized,
      instead allowing new maps to be added as they are discovered, which
      will be used later for BTF-defined map definitions.
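
      Adding maps as they are discovered amounts to a growable array. The
      following is a generic sketch with hypothetical toy_* names and a
      doubling growth policy chosen for illustration; it is not libbpf's
      implementation.

```c
#include <stdlib.h>

struct toy_map {
	int id;
};

struct toy_maps {
	struct toy_map *items;
	size_t nr;
	size_t cap;
};

/* Append one zeroed slot, growing the array when full. NULL on failure. */
static struct toy_map *toy_add_map(struct toy_maps *maps)
{
	if (maps->nr == maps->cap) {
		size_t new_cap = maps->cap ? maps->cap * 2 : 4;
		struct toy_map *tmp = realloc(maps->items,
					      new_cap * sizeof(*tmp));

		if (!tmp)
			return NULL; /* the old array is still valid */
		maps->items = tmp;
		maps->cap = new_cap;
	}
	maps->items[maps->nr] = (struct toy_map){ 0 };
	return &maps->items[maps->nr++];
}

/* Demonstration helper: add n maps and report how many were stored. */
static size_t toy_add_many(size_t n)
{
	struct toy_maps maps = { NULL, 0, 0 };
	size_t stored;

	for (size_t i = 0; i < n; i++) {
		struct toy_map *m = toy_add_map(&maps);

		if (!m)
			break;
		m->id = (int)i;
	}
	stored = maps.nr;
	free(maps.items);
	return stored;
}
```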
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Acked-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • libbpf: streamline ELF parsing error-handling · 01b29d1d
      Andrii Nakryiko authored
      Simplify ELF parsing logic by exiting early, as there is no common clean
      up path to execute. That makes it unnecessary to track when err was set
      and when it was cleared. It also reduces nesting in some places.
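
      The early-exit style can be sketched like this. The section descriptor
      and the checks are hypothetical; only the control-flow shape is the point.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical section descriptor used only for this sketch. */
struct toy_section {
	const char *name;
	const void *data;
	size_t size;
};

/* Each check returns at once: no shared cleanup label, no 'err' to track. */
static int toy_check_section(const struct toy_section *sec)
{
	if (!sec)
		return -EINVAL;
	if (!sec->name)
		return -EINVAL;
	if (!sec->data || sec->size == 0)
		return -ENOENT;
	return 0;
}

static const struct toy_section toy_good = { ".maps", "x", 1 };
static const struct toy_section toy_empty = { ".maps", NULL, 0 };
```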
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Acked-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • libbpf: extract BTF loading logic · 9c6660d0
      Andrii Nakryiko authored
      As a preparation for adding BTF-based BPF map loading, extract .BTF and
      .BTF.ext loading logic.
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Acked-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • libbpf: add common min/max macro to libbpf_internal.h · d7fe74f9
      Andrii Nakryiko authored
      Multiple files in libbpf carry their own definitions of min/max.
      Let's define them in libbpf_internal.h and use those everywhere.
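
      A shared definition could look like the sketch below. It is written from
      the description above rather than copied from libbpf_internal.h, and
      toy_clamp() is a hypothetical user. Note that these simple macros evaluate
      their arguments twice.

```c
/* Guarded so that an earlier definition from another header wins. */
#ifndef min
#define min(x, y) ((x) < (y) ? (x) : (y))
#endif
#ifndef max
#define max(x, y) ((x) < (y) ? (y) : (x))
#endif

/* Clamp v into the range [lo, hi] using the shared macros. */
static int toy_clamp(int v, int lo, int hi)
{
	return min(max(v, lo), hi);
}
```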
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Acked-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
  3. 14 Jun, 2019 11 commits
  4. 12 Jun, 2019 1 commit
    • bpf: silence warning messages in core · aee450cb
      Valdis Klētnieks authored
      Compiling kernel/bpf/core.c with W=1 causes a flood of warnings:
      
      kernel/bpf/core.c:1198:65: warning: initialized field overwritten [-Woverride-init]
       1198 | #define BPF_INSN_3_TBL(x, y, z) [BPF_##x | BPF_##y | BPF_##z] = true
            |                                                                 ^~~~
      kernel/bpf/core.c:1087:2: note: in expansion of macro 'BPF_INSN_3_TBL'
       1087 |  INSN_3(ALU, ADD,  X),   \
            |  ^~~~~~
      kernel/bpf/core.c:1202:3: note: in expansion of macro 'BPF_INSN_MAP'
       1202 |   BPF_INSN_MAP(BPF_INSN_2_TBL, BPF_INSN_3_TBL),
            |   ^~~~~~~~~~~~
      kernel/bpf/core.c:1198:65: note: (near initialization for 'public_insntable[12]')
       1198 | #define BPF_INSN_3_TBL(x, y, z) [BPF_##x | BPF_##y | BPF_##z] = true
            |                                                                 ^~~~
      kernel/bpf/core.c:1087:2: note: in expansion of macro 'BPF_INSN_3_TBL'
       1087 |  INSN_3(ALU, ADD,  X),   \
            |  ^~~~~~
      kernel/bpf/core.c:1202:3: note: in expansion of macro 'BPF_INSN_MAP'
       1202 |   BPF_INSN_MAP(BPF_INSN_2_TBL, BPF_INSN_3_TBL),
            |   ^~~~~~~~~~~~
      
      98 copies of the above.
      
      The attached patch silences the warnings, because we *know* we're overwriting
      the default initializer. That leaves bpf/core.c with only 6 other warnings,
      which become more visible in comparison.
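
      The pattern that triggers the warning can be reproduced in a few lines.
      This sketch uses hypothetical toy_* names and assumes GCC-style range
      designators and diagnostic pragmas. The pragma is just one local way to
      silence the warning; this log does not show how the patch itself does it.

```c
#include <stdbool.h>

enum { TOY_OP_ADD = 0x04, TOY_OP_SUB = 0x14, TOY_NR_OPS = 256 };

/*
 * The range designator sets every slot to false first; the later
 * designators deliberately overwrite two of them, which is what
 * -Woverride-init reports.
 */
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Woverride-init"
static const bool toy_insntable[TOY_NR_OPS] = {
	[0 ... TOY_NR_OPS - 1] = false, /* default for all opcodes */
	[TOY_OP_ADD] = true,            /* intentional overwrite */
	[TOY_OP_SUB] = true,            /* intentional overwrite */
};
#pragma GCC diagnostic pop
```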
      Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
      Acked-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
  5. 11 Jun, 2019 9 commits