1. 06 Oct, 2021 14 commits
    • selftests/bpf: Refactor btf_write selftest to reuse BTF generation logic · c65eb808
      Andrii Nakryiko authored
      The next patch needs to reuse the BTF generation logic, which tests every
      supported BTF kind, for testing the btf__add_btf() API. So restructure
      the existing selftests into a single subtest that uses the bulk
      VALIDATE_RAW_BTF() macro for raw BTF dump checking.
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Song Liu <songliubraving@fb.com>
      Link: https://lore.kernel.org/bpf/20211006051107.17921-3-andrii@kernel.org
    • libbpf: Add API that copies all BTF types from one BTF object to another · 7ca61121
      Andrii Nakryiko authored
      Add a bulk copying API, btf__add_btf(), that speeds up and simplifies
      appending the entire contents of one BTF object to another one, taking care
      of copying BTF type data, adjusting resulting BTF type IDs according to
      their new locations in the destination BTF object, as well as copying
      and deduplicating all the referenced strings and updating all the string
      offsets in new BTF types as appropriate.
      
      This API is intended to be used from tools that are generating and
      otherwise manipulating BTFs generically, such as pahole. In pahole's
      case, this API is useful for speeding up parallelized BTF encoding, as
      it allows pahole to offload all the intricacies of BTF type copying to
      libbpf and handle the parallelization aspects of the process.
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Song Liu <songliubraving@fb.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Link: https://lore.kernel.org/bpf/20211006051107.17921-2-andrii@kernel.org
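
The bookkeeping described in this commit (shifting type IDs by the destination's type count and deduplicating strings) can be illustrated with a toy model. This is only a sketch under stated assumptions: `toy_btf`, `toy_type`, `toy_add_str`, `toy_add_type`, and `toy_add_btf` are hypothetical names invented here, each toy type references at most one other type, and none of this is libbpf code or the real BTF layout.

```c
#include <assert.h>
#include <string.h>

#define MAX_TYPES 16
#define MAX_STRS  16

/* Toy stand-in for a BTF type: a name plus one referenced type ID. */
struct toy_type {
	int name_off; /* index into the owning object's string table */
	int ref_id;   /* referenced type ID, 0 means "void" */
};

struct toy_btf {
	const char *strs[MAX_STRS];
	int nr_strs;
	struct toy_type types[MAX_TYPES]; /* types[i] has type ID i + 1 */
	int nr_types;
};

/* Return the index of s in the string table, adding it only if it is new. */
static int toy_add_str(struct toy_btf *btf, const char *s)
{
	for (int i = 0; i < btf->nr_strs; i++)
		if (strcmp(btf->strs[i], s) == 0)
			return i;
	assert(btf->nr_strs < MAX_STRS);
	btf->strs[btf->nr_strs] = s;
	return btf->nr_strs++;
}

/* Append one type and return its newly assigned type ID. */
static int toy_add_type(struct toy_btf *btf, const char *name, int ref_id)
{
	assert(btf->nr_types < MAX_TYPES);
	btf->types[btf->nr_types].name_off = toy_add_str(btf, name);
	btf->types[btf->nr_types].ref_id = ref_id;
	return ++btf->nr_types;
}

/* Append every type of src to dst; return the ID of the first new type. */
static int toy_add_btf(struct toy_btf *dst, const struct toy_btf *src)
{
	int shift = dst->nr_types; /* IDs of src move up by this amount */

	for (int i = 0; i < src->nr_types; i++) {
		const struct toy_type *t = &src->types[i];
		/* ID 0 (void) is shared by both objects and is never shifted. */
		int ref = t->ref_id ? t->ref_id + shift : 0;

		toy_add_type(dst, src->strs[t->name_off], ref);
	}
	return shift + 1;
}
```

Types themselves are copied without deduplication in this sketch; only strings are shared, which mirrors what the commit message describes.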
    • bpf, x64: Save bytes for DIV by reducing reg copies · 57a610f1
      Jie Meng authored
      Instead of unconditionally performing push/pop on %rax/%rdx in case of
      division/modulo, we can save a few bytes in case of destination register
      being either BPF r0 (%rax) or r3 (%rdx) since the result is written in
      there anyway.
      
      Also, we do not need to copy the source to %r11 unless the source is either
      %rax, %rdx or an immediate.
      
      For example, before the patch:
      
        22:   push   %rax
        23:   push   %rdx
        24:   mov    %rsi,%r11
        27:   xor    %edx,%edx
        29:   div    %r11
        2c:   mov    %rax,%r11
        2f:   pop    %rdx
        30:   pop    %rax
        31:   mov    %r11,%rax
      
      After:
      
        22:   push   %rdx
        23:   xor    %edx,%edx
        25:   div    %rsi
        28:   pop    %rdx
      Signed-off-by: Jie Meng <jmeng@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Tested-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20211002035626.2041910-1-jmeng@fb.com
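
The register-saving rules stated in this commit message can be written down as a small decision function. This is a sketch of the rules as described, not the JIT source: `div_plan` and `plan_div` are hypothetical names, and only the mapping of BPF r0 to %rax and r3 to %rdx is taken from the message.

```c
#include <stdbool.h>

/* BPF register numbers that the message maps to %rax and %rdx. */
enum { BPF_R0 = 0, BPF_R3 = 3 };

struct div_plan {
	bool save_rax; /* push/pop %rax around the division */
	bool save_rdx; /* push/pop %rdx around the division */
	bool copy_src; /* move the divisor into %r11 first */
};

static struct div_plan plan_div(int dst_reg, int src_reg, bool src_is_imm)
{
	struct div_plan p;

	/* The result is written to dst anyway, so dst needs no save. */
	p.save_rax = dst_reg != BPF_R0;
	p.save_rdx = dst_reg != BPF_R3;
	/*
	 * %rax and %rdx are overwritten before div executes, and div
	 * takes no immediate operand, so those sources go through %r11.
	 */
	p.copy_src = src_is_imm || src_reg == BPF_R0 || src_reg == BPF_R3;
	return p;
}
```

For the example in the message (destination r0, source r2, i.e. %rsi), the plan saves only %rdx and skips the copy, matching the four-instruction "After" sequence.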
    • bpf: Avoid retpoline for bpf_for_each_map_elem · 0640c77c
      Andrey Ignatov authored
      Similarly to 09772d92 ("bpf: avoid retpoline for
      lookup/update/delete calls on maps") and 84430d42 ("bpf, verifier:
      avoid retpoline for map push/pop/peek operation") avoid indirect call
      while calling bpf_for_each_map_elem.
      
      Before (a program fragment):
      
        ; if (rules_map) {
         142: (15) if r4 == 0x0 goto pc+8
         143: (bf) r3 = r10
        ; bpf_for_each_map_elem(rules_map, process_each_rule, &ctx, 0);
         144: (07) r3 += -24
         145: (bf) r1 = r4
         146: (18) r2 = subprog[+5]
         148: (b7) r4 = 0
         149: (85) call bpf_for_each_map_elem#143680  <-- indirect call via
                                                          helper
      
      After (same program fragment):
      
         ; if (rules_map) {
          142: (15) if r4 == 0x0 goto pc+8
          143: (bf) r3 = r10
         ; bpf_for_each_map_elem(rules_map, process_each_rule, &ctx, 0);
          144: (07) r3 += -24
          145: (bf) r1 = r4
          146: (18) r2 = subprog[+5]
          148: (b7) r4 = 0
          149: (85) call bpf_for_each_array_elem#170336  <-- direct call
      
      On a benchmark that calls bpf_for_each_map_elem() once and does many
      other things (mostly checking fields in skb), with CONFIG_RETPOLINE=y,
      the program takes roughly 9-13% less time per iteration.
      
      Before:
      
        ============================================================================
        Benchmark.cpp                                              time/iter iters/s
        ============================================================================
        IngressMatchByRemoteEndpoint                                80.78ns 12.38M
        IngressMatchByRemoteIP                                      80.66ns 12.40M
        IngressMatchByRemotePort                                    80.87ns 12.37M
      
      After:
      
        ============================================================================
        Benchmark.cpp                                              time/iter iters/s
        ============================================================================
        IngressMatchByRemoteEndpoint                                73.49ns 13.61M
        IngressMatchByRemoteIP                                      71.48ns 13.99M
        IngressMatchByRemotePort                                    70.39ns 14.21M
      Signed-off-by: Andrey Ignatov <rdna@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20211006001838.75607-1-rdna@fb.com
    • Merge branch 'Support kernel module function calls from eBPF' · 32a16f6b
      Alexei Starovoitov authored
      Kumar Kartikeya says:
      
      ====================
      
      This set enables kernel module function calls, and also modifies verifier logic
      to permit invalid kernel function calls as long as they are pruned as part of
      dead code elimination. This is done to provide better runtime portability for
      BPF objects, which can conditionally disable parts of code that are pruned later
      by the verifier (e.g. const volatile vars, kconfig options). libbpf
      modifications are made along with kernel changes to support module function
      calls.
      
      It also converts TCP congestion control objects to use the module kfunc support
      instead of relying on IS_BUILTIN ifdef.
      
      Changelog:
      ----------
      v6 -> v7
      v6: https://lore.kernel.org/bpf/20210930062948.1843919-1-memxor@gmail.com
      
       * Let __bpf_check_kfunc_call take kfunc_btf_id_list instead of generating
         callbacks (Andrii)
       * Rename it to bpf_check_mod_kfunc_call to reflect usage
       * Remove OOM checks (Alexei)
       * Remove resolve_btfids invocation for bpf_testmod (Andrii)
       * Move fd_array_cnt initialization near fd_array alloc (Andrii)
       * Rename helper to btf_find_by_name_kind and pass start_id (Andrii)
       * memset when data is NULL in add_data (Alexei)
       * Fix other nits
      
      v5 -> v6
      v5: https://lore.kernel.org/bpf/20210927145941.1383001-1-memxor@gmail.com
      
       * Rework gen_loader relocation emits
         * Only emit bpf_btf_find_by_name_kind call when required (Alexei)
         * Refactor code to emit ksym var and func relo into separate helpers, which
           makes it easier to add future weak/typeless ksym support (for my followup)
         * Count references for both ksym var and funcs, and avoid calling helpers
           unless required for both of them. This also means we share fds between
           ksym vars for the module BTFs. Also be careful with this when closing
           BTF fd so that we only close one instance of the fd for each ksym
      
      v4 -> v5
      v4: https://lore.kernel.org/bpf/20210920141526.3940002-1-memxor@gmail.com
      
       * Address comments from Alexei
         * Use reserved fd_array area in loader map instead of creating a new map
         * Drop selftest testing the 256 kfunc limit, however selftest testing reuse
           of BTF fd for same kfunc in gen_loader and libbpf is kept
       * Address comments from Andrii
         * Make --no-fail the default for resolve_btfids, i.e. only fail if we find
           BTF section and cannot process it
         * Use obj->btf_modules array to store index in the fd_array, so that we don't
           have to do any searching to reuse the index, instead only set it the first
           time a module BTF's fd is used
         * Make find_ksym_btf_id return struct module_btf * in last parameter
         * Improve logging when index becomes bigger than INT16_MAX
         * Add btf__find_by_name_kind_own internal helper to only start searching for
           kfunc ID in module BTF, since find_ksym_btf_id already checks vmlinux BTF
           before iterating over module BTFs.
         * Fix various other nits
       * Fixes for failing selftests on BPF CI
       * Rearrange/cleanup selftests
         * Avoid testing kfunc limit (Alexei)
         * Do test gen_loader and libbpf BTF fd index dedup with 256 calls
         * Move invalid kfunc failure test to verifier selftest
         * Minimize duplication
       * Use consistent bpf_<type>_check_kfunc_call naming for module kfunc callback
       * Since we try to add fd using add_data while we can, cherry pick Alexei's
         patch from CO-RE RFC series to align gen_loader data.
      
      v3 -> v4
      v3: https://lore.kernel.org/bpf/20210915050943.679062-1-memxor@gmail.com
      
       * Address comments from Alexei
         * Drop MAX_BPF_STACK change, instead move map_fd and BTF fd to BPF array map
           and pass fd_array using BPF_PSEUDO_MAP_IDX_VALUE
       * Address comments from Andrii
         * Fix selftest to store to variable for observing function call instead of
           printk and polluting CI logs
       * Drop use of raw_tp for testing, instead reuse classifier based prog_test_run
       * Drop index + 1 based insn->off convention for kfunc module calls
       * Expand selftests to cover more corner cases
       * Misc cleanups
      
      v2 -> v3
      v2: https://lore.kernel.org/bpf/20210914123750.460750-1-memxor@gmail.com
      
       * Fix issues pointed out by Kernel Test Robot
       * Fix find_kfunc_desc to also take offset into consideration when comparing
      
      RFC v1 -> v2
      v1: https://lore.kernel.org/bpf/20210830173424.1385796-1-memxor@gmail.com
      
       * Address comments from Alexei
         * Reuse fd_array instead of introducing kfunc_btf_fds array
         * Take btf and module reference as needed, instead of preloading
         * Add BTF_KIND_FUNC relocation support to gen_loader infrastructure
       * Address comments from Andrii
         * Drop hashmap in libbpf for finding index of existing BTF in fd_array
         * Preserve invalid kfunc calls only when the symbol is weak
       * Adjust verifier selftests
      ====================
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • bpf: selftests: Add selftests for module kfunc support · c48e51c8
      Kumar Kartikeya Dwivedi authored
      This adds selftests that test the success and failure paths for module
      kfuncs (in the presence of invalid kfunc calls) for both libbpf and
      gen_loader. It also adds a prog_test kfunc_btf_id_list so that we can
      add a module BTF ID set from bpf_testmod.
      
      This also introduces a couple of test cases to verifier selftests for
      validating whether we get an error or not depending on whether an invalid
      kfunc call remains after elimination of unreachable instructions.
      Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20211002011757.311265-10-memxor@gmail.com
    • libbpf: Update gen_loader to emit BTF_KIND_FUNC relocations · 18f4fccb
      Kumar Kartikeya Dwivedi authored
      This change updates the BPF syscall loader to relocate BTF_KIND_FUNC
      relocations, with support for weak kfunc relocations. The general idea
      is to move map_fds to loader map, and also use the data for storing
      kfunc BTF fds. Since both reuse the fd_array parameter, they need to be
      kept together.
      
      For map_fds, we reserve MAX_USED_MAPS slots in a region, and for kfunc,
      we reserve MAX_KFUNC_DESCS. This is done so that insn->off has more
      chances of being <= INT16_MAX than treating data map as a sparse array
      and adding fd as needed.
      
      When the MAX_KFUNC_DESCS limit is reached, we fall back to the sparse
      array model, so that as long as it does remain <= INT16_MAX, we pass an
      index relative to the start of fd_array.
      
      We store all ksyms in an array where we try to avoid calling the
      bpf_btf_find_by_name_kind helper, and also reuse the BTF fd that was
      already stored. This also speeds up the loading process compared to
      emitting calls in all cases, in later tests.
      Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20211002011757.311265-9-memxor@gmail.com
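
The slot layout described in this commit message can be modeled roughly as follows. This is a simplified sketch, not gen_loader code: `toy_gen`, `toy_add_map_fd`, and `toy_add_kfunc_btf_fd` are hypothetical names, the numeric values of `MAX_USED_MAPS` and `MAX_KFUNC_DESCS` are assumptions, and the sparse fallback is modeled as slots placed directly after the two reserved regions, which the message does not specify.

```c
#include <errno.h>
#include <stdint.h>

#define MAX_USED_MAPS   64  /* assumed value, for illustration only */
#define MAX_KFUNC_DESCS 256 /* assumed value, for illustration only */

struct toy_gen {
	int nr_maps;   /* map fd slots handed out so far */
	int nr_kfuncs; /* reserved kfunc BTF fd slots handed out so far */
	int nr_sparse; /* slots handed out after the reserved regions */
};

/* Map fds occupy indices [0, MAX_USED_MAPS). */
static int toy_add_map_fd(struct toy_gen *g)
{
	if (g->nr_maps == MAX_USED_MAPS)
		return -E2BIG;
	return g->nr_maps++;
}

/*
 * Kfunc BTF fds occupy the next MAX_KFUNC_DESCS indices. Once those
 * are used up, fall back to indices past both reserved regions. The
 * index must fit in insn->off, so anything above INT16_MAX is refused.
 */
static int toy_add_kfunc_btf_fd(struct toy_gen *g)
{
	int reserved = g->nr_kfuncs < MAX_KFUNC_DESCS;
	int idx;

	if (reserved)
		idx = MAX_USED_MAPS + g->nr_kfuncs;
	else
		idx = MAX_USED_MAPS + MAX_KFUNC_DESCS + g->nr_sparse;
	if (idx > INT16_MAX)
		return -E2BIG;
	if (reserved)
		g->nr_kfuncs++;
	else
		g->nr_sparse++;
	return idx;
}
```

Keeping both regions dense at the start of the array keeps the resulting indices small, which is the motivation the message gives for reserving slots up front.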
    • libbpf: Resolve invalid weak kfunc calls with imm = 0, off = 0 · 466b2e13
      Kumar Kartikeya Dwivedi authored
      Preserve these calls, as this allows the verifier to succeed in loading
      the program if they are determined to be unreachable after dead code
      elimination during program load. If they remain reachable, the verifier
      rejects the program at load time. This is done for ext->is_weak symbols,
      similar to the case for variable ksyms.
      Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/bpf/20211002011757.311265-8-memxor@gmail.com
    • libbpf: Support kernel module function calls · 9dbe6015
      Kumar Kartikeya Dwivedi authored
      This patch adds libbpf support for kernel module function calls.
      The fd_array parameter is used during BPF program load to pass module
      BTFs referenced by the program. insn->off is set to the index into this
      array, but starts from 1, because insn->off == 0 is reserved for
      btf_vmlinux.
      
      We try to use existing insn->off for a module, since the kernel limits
      the maximum distinct module BTFs for kfuncs to 256, and also because
      index must never exceed the maximum allowed value that can fit in
      insn->off (INT16_MAX). In the future, if kernel interprets signed offset
      as unsigned for kfunc calls, this limit can be increased to UINT16_MAX.
      
      Also introduce a btf__find_by_name_kind_own helper to start searching
      from module BTF's start id when we know that the BTF ID is not present
      in vmlinux BTF (in find_ksym_btf_id).
      Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20211002011757.311265-7-memxor@gmail.com
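
The index assignment described in this commit message (slot 0 reserved for vmlinux, one slot per module BTF, reuse on repeat, INT16_MAX as the ceiling) can be sketched as below. The names `toy_obj`, `toy_mod_btf`, `toy_obj_init`, and `toy_mod_btf_index`, as well as the fixed capacity, are hypothetical and chosen for illustration; this is not the libbpf implementation.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define FD_ARRAY_CAP 16 /* hypothetical capacity for the sketch */

struct toy_mod_btf {
	int fd;           /* module BTF fd */
	int fd_array_idx; /* 0 means "no slot yet"; 0 is vmlinux's slot */
};

struct toy_obj {
	int fd_array[FD_ARRAY_CAP];
	int fd_array_cnt; /* slots in use, including reserved slot 0 */
};

static void toy_obj_init(struct toy_obj *obj)
{
	memset(obj, 0, sizeof(*obj));
	obj->fd_array_cnt = 1; /* index 0 stays reserved for vmlinux BTF */
}

/* Return the value to store in insn->off for a call into this module. */
static int toy_mod_btf_index(struct toy_obj *obj, struct toy_mod_btf *mod)
{
	if (mod->fd_array_idx > 0)
		return mod->fd_array_idx; /* reuse the slot set earlier */
	if (obj->fd_array_cnt >= FD_ARRAY_CAP ||
	    obj->fd_array_cnt > INT16_MAX)
		return -E2BIG; /* index would not fit in insn->off */
	obj->fd_array[obj->fd_array_cnt] = mod->fd;
	mod->fd_array_idx = obj->fd_array_cnt++;
	return mod->fd_array_idx;
}
```

Storing the index on the module record means a repeated call into the same module needs no search, which is the reuse behavior the message asks for.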
    • bpf: Enable TCP congestion control kfunc from modules · 0e32dfc8
      Kumar Kartikeya Dwivedi authored
      This commit moves BTF ID lookup into the newly added registration
      helper, in a way that the bbr, cubic, and dctcp implementation set up
      their sets in the bpf_tcp_ca kfunc_btf_set list, while the ones not
      dependent on modules are looked up from the wrapper function.
      
      This lifts the restriction that they be compiled as built-in objects, so
      they can be loaded as modules if required. Also modify Makefile.modfinal
      to call resolve_btfids for each module.
      
      Note that since kernel kfunc_ids never overlap with module kfunc_ids, we
      only match the owner for module btf id sets.
      
      See following commits for background on use of:
      
       CONFIG_X86 ifdef:
       569c484f (bpf: Limit static tcp-cc functions in the .BTF_ids list to x86)
      
       CONFIG_DYNAMIC_FTRACE ifdef:
       7aae231a (bpf: tcp: Limit calling some tcp cc functions to CONFIG_DYNAMIC_FTRACE)
      Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20211002011757.311265-6-memxor@gmail.com
    • tools: Allow specifying base BTF file in resolve_btfids · f614f2c7
      Kumar Kartikeya Dwivedi authored
      This commit allows specifying the base BTF for resolving btf id
      lists/sets during link time in the resolve_btfids tool. The base BTF is
      set to NULL if no path is passed. This allows resolving BTF ids for
      module kernel objects.
      
      Also drop the --no-fail option, as it is only used when the .BTF_ids
      section is not present; instead, make no-fail the default mode. The long
      option name is the same as that of pahole.
      Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/bpf/20211002011757.311265-5-memxor@gmail.com
    • bpf: btf: Introduce helpers for dynamic BTF set registration · 14f267d9
      Kumar Kartikeya Dwivedi authored
      This adds helpers for registering btf_id_set from modules and the
      bpf_check_mod_kfunc_call callback that can be used to look them up.
      
      With in-kernel sets, the way this is supposed to work is that the
      in-kernel callback looks up within the in-kernel kfunc whitelist, and
      then defers to the dynamic BTF set lookup if it doesn't find the BTF id.
      If there is no in-kernel BTF id set, this callback can be used directly.
      
      Also fix includes for btf.h and bpfptr.h so that they can be included in
      isolation. This is in preparation for their usage in tcp_bbr, tcp_cubic
      and tcp_dctcp modules in the next patch.
      Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20211002011757.311265-4-memxor@gmail.com
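
The two-stage lookup described in this commit message (built-in set first, then the dynamically registered module sets, matched by owner) might look like the following in miniature. All names here (`toy_id_set`, `toy_mod_set`, `toy_set_contains`, `toy_check_mod_kfunc`, `toy_check_kfunc`) are hypothetical, and the owner is an opaque pointer standing in for struct module *; this is a sketch of the control flow, not the kernel helpers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct toy_id_set {
	const uint32_t *ids;
	size_t cnt;
};

struct toy_mod_set {
	const void *owner; /* stand-in for struct module * */
	struct toy_id_set set;
};

static bool toy_set_contains(const struct toy_id_set *s, uint32_t id)
{
	for (size_t i = 0; i < s->cnt; i++)
		if (s->ids[i] == id)
			return true;
	return false;
}

/* Module-side lookup: the ID must be in a set registered by that owner. */
static bool toy_check_mod_kfunc(const struct toy_mod_set *list, size_t n,
				uint32_t id, const void *owner)
{
	for (size_t i = 0; i < n; i++)
		if (list[i].owner == owner &&
		    toy_set_contains(&list[i].set, id))
			return true;
	return false;
}

/* Try the built-in set first (if any), then defer to the module sets. */
static bool toy_check_kfunc(const struct toy_id_set *builtin,
			    const struct toy_mod_set *list, size_t n,
			    uint32_t id, const void *owner)
{
	if (builtin && toy_set_contains(builtin, id))
		return true;
	return toy_check_mod_kfunc(list, n, id, owner);
}
```

Passing NULL for the built-in set corresponds to the case where there is no in-kernel BTF id set and the module lookup is used directly.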
    • bpf: Be conservative while processing invalid kfunc calls · a5d82727
      Kumar Kartikeya Dwivedi authored
      This patch modifies the BPF verifier to only return an error for
      invalid kfunc calls specially marked by userspace (with insn->imm == 0,
      insn->off == 0) after the verifier has eliminated dead instructions.
      This is handled in the fixup stage, and such calls are skipped during
      the add and check stages.
      
      If such an invalid call is dropped, the fixup stage will not encounter
      insn->imm as 0, otherwise it bails out and returns an error.
      
      This will be exposed as weak ksym support in libbpf in later patches.
      Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20211002011757.311265-3-memxor@gmail.com
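
The ordering described in this commit message (drop dead instructions first, then fail only if a marked call with imm == 0 survives) can be illustrated with a toy pass. The struct and function names (`toy_insn`, `toy_drop_dead`, `toy_fixup_kfunc_calls`) and the precomputed `reachable` flag are assumptions made for the sketch; the real verifier derives reachability itself.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_insn {
	bool is_kfunc_call;
	int imm;        /* target BTF ID; 0 marks an unresolved weak kfunc */
	int off;
	bool reachable; /* false once the instruction is proven dead */
};

/* Remove dead instructions in place and return the new count. */
static size_t toy_drop_dead(struct toy_insn *insns, size_t n)
{
	size_t kept = 0;

	for (size_t i = 0; i < n; i++)
		if (insns[i].reachable)
			insns[kept++] = insns[i];
	return kept;
}

/* Fixup stage: any kfunc call still carrying imm == 0 is an error. */
static int toy_fixup_kfunc_calls(const struct toy_insn *insns, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (insns[i].is_kfunc_call && insns[i].imm == 0)
			return -EINVAL;
	return 0;
}
```

Running the fixup check only after `toy_drop_dead()` is what lets a program with an unresolved but unreachable call load successfully.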
    • bpf: Introduce BPF support for kernel module function calls · 2357672c
      Kumar Kartikeya Dwivedi authored
      This change adds support on the kernel side to allow for BPF programs to
      call kernel module functions. Userspace will prepare an array of module
      BTF fds that is passed in during BPF_PROG_LOAD using fd_array parameter.
      In the kernel, the module BTFs are placed in the auxiliary struct for
      bpf_prog, and loaded as needed.
      
      The verifier then uses insn->off to index into the fd_array. insn->off
      0 is reserved for vmlinux BTF (for backwards compat), so userspace must
      use an fd_array index > 0 for module kfunc support. kfunc_btf_tab is
      sorted based on offset in an array, and each offset corresponds to one
      descriptor, with a max limit up to 256 such module BTFs.
      
      We also change existing kfunc_tab to distinguish each element based on
      imm, off pair as each such call will now be distinct.
      
      Another change is to the check_kfunc_call callback, which now includes a
      struct module * pointer. This is to be used in a later patch so that the
      kfunc_id and module pointer are matched for dynamically registered BTF
      sets from loadable modules, so that the same kfunc_id in two modules
      doesn't lead to check_kfunc_call succeeding. For the duration of
      check_kfunc_call, the reference to struct module exists, as it returns
      the pointer stored in kfunc_btf_tab.
      Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20211002011757.311265-2-memxor@gmail.com
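
Keying descriptors on the (imm, off) pair, as this commit message describes for kfunc_tab, amounts to a two-field comparison over a sorted array. The sketch below uses the C library's qsort() and bsearch(); `toy_kfunc_desc`, `toy_desc_cmp`, `toy_sort_descs`, and `toy_find_desc` are hypothetical names, and the field widths are assumptions rather than the kernel's definitions.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct toy_kfunc_desc {
	uint32_t func_id; /* from insn->imm: BTF ID of the function */
	uint16_t offset;  /* from insn->off: 0 for vmlinux, else module slot */
};

/* Order by func_id first, then by offset, so each pair is distinct. */
static int toy_desc_cmp(const void *a, const void *b)
{
	const struct toy_kfunc_desc *d0 = a, *d1 = b;

	if (d0->func_id != d1->func_id)
		return d0->func_id < d1->func_id ? -1 : 1;
	if (d0->offset != d1->offset)
		return d0->offset < d1->offset ? -1 : 1;
	return 0;
}

static void toy_sort_descs(struct toy_kfunc_desc *tab, size_t n)
{
	qsort(tab, n, sizeof(*tab), toy_desc_cmp);
}

/* Binary search a sorted table; returns NULL if the pair is absent. */
static const struct toy_kfunc_desc *
toy_find_desc(const struct toy_kfunc_desc *tab, size_t n,
	      uint32_t func_id, uint16_t offset)
{
	struct toy_kfunc_desc key = { .func_id = func_id, .offset = offset };

	return bsearch(&key, tab, n, sizeof(*tab), toy_desc_cmp);
}
```

With this ordering, the same BTF ID appearing in vmlinux (offset 0) and in a module (offset > 0) yields two separate entries rather than a collision.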
  2. 05 Oct, 2021 26 commits