1. 17 Sep, 2020 3 commits
    • bpf: rename poke descriptor's 'ip' member to 'tailcall_target' · cf71b174
      Maciej Fijalkowski authored
      Reflect the actual purpose of poke->ip and rename it to
      poke->tailcall_target so that it will not be confused with another
      poke target that will be introduced in the next commit.
      
      While at it, do the same thing with poke->ip_stable - rename it to
      poke->tailcall_target_stable.
      Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • bpf: propagate poke descriptors to subprograms · a748c697
      Maciej Fijalkowski authored
      Previously, there was no need for poke descriptors being present in
      subprogram's bpf_prog_aux struct since tailcalls were simply not allowed
      in them. Each subprog is JITed independently so in order to enable
      JITing subprograms that use tailcalls, do the following:
      
      - in fixup_bpf_calls() store the index of tailcall insn onto the generated
        poke descriptor,
      - in case when insn patching occurs, adjust the tailcall insn idx from
        bpf_patch_insn_data,
      - then in jit_subprogs() check whether the given poke descriptor belongs
        to the current subprog by checking if that previously stored absolute
        index of tail call insn is in the scope of the insns of given subprog,
      - update the insn->imm with new poke descriptor slot so that while JITing
        the proper poke descriptor will be grabbed
      
      This way each of the main program's poke descriptors is distributed
      across the subprograms' poke descriptor arrays, so the main program's
      descriptors can be untracked from the prog array map.
      
      Also add each subprog's aux struct to the BPF map's poke_progs list by
      calling map_poke_track() on it.
      
      In case of any error, call map_poke_untrack() on the subprog aux
      structs that have already been registered to the prog array map.
      Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
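The distribution step described in a748c697 can be sketched roughly as follows. This is a simplified user-space model, not the kernel code: the types and helpers (`poke_desc`, `subprog`, `adjust_poke_idx`, `distribute_pokes`) are hypothetical names chosen for illustration, and the handling of a patch that lands exactly on the tail call instruction is an assumption.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a poke descriptor. */
struct poke_desc {
    unsigned int insn_idx;  /* absolute index of the tail call insn */
};

/* Hypothetical description of one subprogram's instruction range. */
struct subprog {
    unsigned int start;     /* first insn index, inclusive */
    unsigned int end;       /* one past the last insn index, exclusive */
};

/*
 * When the insn at 'off' is replaced by 'len' insns, every later insn
 * moves by len - 1, so a stored tail call index past 'off' must follow.
 */
static void adjust_poke_idx(struct poke_desc *p, unsigned int off,
                            unsigned int len)
{
    if (p->insn_idx > off)
        p->insn_idx += len - 1;
}

/* A descriptor belongs to a subprog if its insn lies inside the range. */
static int poke_belongs_to(const struct poke_desc *p, const struct subprog *sp)
{
    return p->insn_idx >= sp->start && p->insn_idx < sp->end;
}

/*
 * Copy the descriptors that belong to 'sp' into out[]. The position in
 * out[] plays the role of the new slot that would be written to insn->imm.
 * Returns the number of descriptors copied.
 */
static size_t distribute_pokes(const struct poke_desc *all, size_t n,
                               const struct subprog *sp,
                               struct poke_desc *out, size_t out_cap)
{
    size_t cnt = 0;

    for (size_t i = 0; i < n; i++) {
        if (!poke_belongs_to(&all[i], sp))
            continue;
        if (cnt == out_cap)
            break;          /* no room left in the subprog's array */
        out[cnt++] = all[i];
    }
    return cnt;
}
```

Keeping an absolute instruction index on each descriptor is what makes the range check possible once each subprog is JITed on its own.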
    • bpf, x64: use %rcx instead of %rax for tail call retpolines · 0d4ddce3
      Maciej Fijalkowski authored
      Currently, %rax is used to store the jump target when a BPF program
      emits the retpoline instructions that handle the indirect tailcall.
      
      There is a plan to use %rax for a different purpose, which is storing the
      tail call counter. In order to preserve this value across tailcalls,
      adjust the BPF indirect tailcalls so that the target program resides
      in %rcx, and teach the retpoline instructions about the new location of
      the jump target.
      Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
  2. 16 Sep, 2020 7 commits
    • selftests/bpf: Merge most of test_btf into test_progs · c64779e2
      Andrii Nakryiko authored
      Merge 183 tests from test_btf into test_progs framework to be exercised
      regularly. All the test_btf tests that were moved are modeled as proper
      sub-tests in test_progs framework for ease of debugging and reporting.
      
      No functional or behavioral changes were intended; I tried to preserve
      the original behavior as much as possible. E.g., `test_progs -v` will
      activate the "always_log" flag to emit the BTF validation log.
      
      The only difference is in reducing the max_entries limit for pretty-printing
      tests from (128 * 1024) to just 128 to reduce tests running time without
      reducing the coverage.
      
      Example test run:
      
        $ sudo ./test_progs -n 8
        ...
        #8 btf:OK
        Summary: 1/183 PASSED, 0 SKIPPED, 0 FAILED
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20200916004819.3767489-1-andriin@fb.com
    • Merge branch 'bpf_metadata' · ffa915f4
      Alexei Starovoitov authored
      Stanislav Fomichev says:
      
      ====================
      Currently, if a user wants to store arbitrary metadata for an eBPF
      program, for example, the program build commit hash or version, they
      could store it in a map, and conveniently libbpf uses .data section to
      populate an internal map. However, if the program does not actually
      reference the map, then the map would be de-refcounted and freed.
      
      This patch set introduces a new syscall BPF_PROG_BIND_MAP to add a map
      to a program's used_maps, even if the program instructions do not
      reference the map.
      
      libbpf is extended to always bind the .rodata section with
      BPF_PROG_BIND_MAP so the metadata is kept in place.
      bpftool is also extended to print metadata in the 'bpftool prog' list.
      
      The variable is considered metadata if it starts with the
      magic 'bpf_metadata_' prefix; everything after the prefix is the
      metadata name.
      
      An example use of this would be BPF C file declaring:
      
        volatile const char bpf_metadata_commit_hash[] SEC(".rodata") = "abcdef123456";
      
      and bpftool would emit:
      
        $ bpftool prog
        [...]
              metadata:
                      commit_hash = "abcdef123456"
      
      v6 changes:
      * libbpf: drop FEAT_GLOBAL_DATA from probe_prog_bind_map (Andrii Nakryiko)
      * bpftool: combine find_metadata_map_id & find_metadata;
        drops extra bpf_map_get_fd_by_id and bpf_map_get_fd_by_id (Andrii Nakryiko)
      * bpftool: use strncmp instead of strstr (Andrii Nakryiko)
      * bpftool: memset(map_info) and extra empty line (Andrii Nakryiko)
      
      v5 changes:
      * selftest: verify that prog holds rodata (Andrii Nakryiko)
      * selftest: use volatile for metadata (Andrii Nakryiko)
      * bpftool: use sizeof in BPF_METADATA_PREFIX_LEN (Andrii Nakryiko)
      * bpftool: new find_metadata that does map lookup (Andrii Nakryiko)
      * libbpf: don't generalize probe_create_global_data (Andrii Nakryiko)
      * libbpf: use OPTS_VALID in bpf_prog_bind_map (Andrii Nakryiko)
      * libbpf: keep LIBBPF_0.2.0 sorted (Andrii Nakryiko)
      
      v4 changes:
      * Don't return EEXIST from syscall if already bound (Andrii Nakryiko)
      * Removed --metadata argument (Andrii Nakryiko)
      * Removed custom .metadata section (Alexei Starovoitov)
      * Addressed Andrii's suggestions about btf helpers and vsi (Andrii Nakryiko)
      * Moved bpf_prog_find_metadata into bpftool (Alexei Starovoitov)
      
      v3 changes:
      * API changes for bpf_prog_find_metadata (Toke Høiland-Jørgensen)
      
      v2 changes:
      * Made struct bpf_prog_bind_opts in libbpf so flags is optional.
      * Deduped probe_kern_global_data and probe_prog_bind_map into a common
        helper.
      * Added comment regarding why EEXIST is ignored in libbpf bind map.
      * Froze all LIBBPF_MAP_METADATA internal maps.
      * Moved bpf_prog_bind_map into new LIBBPF_0.1.1 in libbpf.map.
      * Added p_err() calls on error cases in bpftool show_prog_metadata.
      * Reverse christmas tree coding style in bpftool show_prog_metadata.
      * Made bpftool gen skeleton recognize .metadata as an internal map and
        generate datasec definition in skeleton.
      * Added C test using skeleton to assert that the metadata is what we
        expect and rebinding causes EEXIST.
      
      v1 changes:
      * Fixed a few missing unlocks, and missing close while iterating map fds.
      * Move mutex initialization to right after prog aux allocation, and mutex
        destroy to right after prog aux free.
      * s/ADD_MAP/BIND_MAP/
      * Use mutex only instead of RCU to protect the used_map array & count.
      
      Cc: YiFei Zhu <zhuyifei1999@gmail.com>
      ====================
      Acked-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
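The 'bpf_metadata_' naming convention from the bpf_metadata series can be illustrated with a small sketch. The changelog mentions `strncmp` and a `sizeof`-based `BPF_METADATA_PREFIX_LEN`; the helper name `metadata_name` and the exact shape of the check are assumptions made for illustration, not bpftool's actual code.

```c
#include <stddef.h>
#include <string.h>

#define BPF_METADATA_PREFIX "bpf_metadata_"
/* sizeof counts the trailing NUL, so subtract one for the string length. */
#define BPF_METADATA_PREFIX_LEN (sizeof(BPF_METADATA_PREFIX) - 1)

/*
 * Hypothetical helper: return the metadata name (the part after the
 * prefix) if 'var_name' starts with the magic prefix, or NULL otherwise.
 * strncmp anchors the match at the start of the name, which a substring
 * search such as strstr would not do.
 */
static const char *metadata_name(const char *var_name)
{
    if (strncmp(var_name, BPF_METADATA_PREFIX, BPF_METADATA_PREFIX_LEN) != 0)
        return NULL;
    return var_name + BPF_METADATA_PREFIX_LEN;
}
```

With this helper, a variable named `bpf_metadata_commit_hash` yields the name `commit_hash`, while a variable that merely contains the prefix somewhere in the middle is not treated as metadata.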
    • selftests/bpf: Test load and dump metadata with btftool and skel · d42d1cc4
      YiFei Zhu authored
      This is a simple test to check that loading and dumping metadata
      in bpftool works, whether or not metadata contents are used by the
      program.
      
      A C test is also added to make sure the skeleton code can read the
      metadata values.
      Signed-off-by: YiFei Zhu <zhuyifei@google.com>
      Signed-off-by: Stanislav Fomichev <sdf@google.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Andrii Nakryiko <andriin@fb.com>
      Cc: YiFei Zhu <zhuyifei1999@gmail.com>
      Link: https://lore.kernel.org/bpf/20200915234543.3220146-6-sdf@google.com
    • bpftool: Support dumping metadata · aff52e68
      YiFei Zhu authored
      Dump metadata in the 'bpftool prog' list if it's present.
      Some BTF formatting code is placed directly in the metadata dumping
      path. Sanity checks on the map and on the kind of the btf_type make
      sure we are actually dumping what we expect.
      
      A helper jsonw_reset is added to json writer so we can reuse the same
      json writer without having extraneous commas.
      
      Sample output:
      
        $ bpftool prog
        6: cgroup_skb  name prog  tag bcf7977d3b93787c  gpl
        [...]
        	btf_id 4
        	metadata:
        		a = "foo"
        		b = 1
      
        $ bpftool prog --json --pretty
        [{
                "id": 6,
        [...]
                "btf_id": 4,
                "metadata": {
                    "a": "foo",
                    "b": 1
                }
            }
        ]
      Signed-off-by: YiFei Zhu <zhuyifei@google.com>
      Signed-off-by: Stanislav Fomichev <sdf@google.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Cc: YiFei Zhu <zhuyifei1999@gmail.com>
      Link: https://lore.kernel.org/bpf/20200915234543.3220146-5-sdf@google.com
    • libbpf: Add BPF_PROG_BIND_MAP syscall and use it on .rodata section · 5d23328d
      YiFei Zhu authored
      The patch adds a simple wrapper, bpf_prog_bind_map, around the syscall.
      When libbpf loads a program, it probes the kernel for support of this
      syscall and, if available, unconditionally binds the .rodata section
      to the program.
      Signed-off-by: YiFei Zhu <zhuyifei@google.com>
      Signed-off-by: Stanislav Fomichev <sdf@google.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Cc: YiFei Zhu <zhuyifei1999@gmail.com>
      Link: https://lore.kernel.org/bpf/20200915234543.3220146-4-sdf@google.com
    • bpf: Add BPF_PROG_BIND_MAP syscall · ef15314a
      YiFei Zhu authored
      This syscall binds a map to a program. Returns success if the map is
      already bound to the program.
      Signed-off-by: YiFei Zhu <zhuyifei@google.com>
      Signed-off-by: Stanislav Fomichev <sdf@google.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Andrii Nakryiko <andriin@fb.com>
      Cc: YiFei Zhu <zhuyifei1999@gmail.com>
      Link: https://lore.kernel.org/bpf/20200915234543.3220146-3-sdf@google.com
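The "success if already bound" behaviour of ef15314a can be modelled in user space as below. This is only a sketch of the semantics stated in the commit message: `prog_model`, `prog_bind_map`, and `MAX_USED_MAPS` are hypothetical names, maps are reduced to integer ids, and the locking that protects used_maps in the kernel is left out.

```c
#include <stddef.h>

#define MAX_USED_MAPS 64    /* hypothetical capacity for this model */

/* Hypothetical model of a program's used_maps array and its count. */
struct prog_model {
    int used_maps[MAX_USED_MAPS];   /* map ids bound to the program */
    size_t used_map_cnt;
};

/*
 * Bind 'map_id' to the program. Binding a map that is already present
 * is treated as success and leaves the array untouched.
 * Returns 0 on success, -1 if the array is full.
 */
static int prog_bind_map(struct prog_model *prog, int map_id)
{
    for (size_t i = 0; i < prog->used_map_cnt; i++) {
        if (prog->used_maps[i] == map_id)
            return 0;   /* already bound: nothing to do */
    }
    if (prog->used_map_cnt == MAX_USED_MAPS)
        return -1;
    prog->used_maps[prog->used_map_cnt++] = map_id;
    return 0;
}
```

Making the operation idempotent means a loader can bind the same map on every load without having to special-case an "already exists" error.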
    • bpf: Mutex protect used_maps array and count · 984fe94f
      YiFei Zhu authored
      To support modifying the used_maps array, we use a mutex to protect
      the use of the counter and the array. The mutex is initialized right
      after the prog aux is allocated, and destroyed right before prog
      aux is freed. This way we guarantee it's initialized for both cBPF
      and eBPF.
      Signed-off-by: YiFei Zhu <zhuyifei@google.com>
      Signed-off-by: Stanislav Fomichev <sdf@google.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Andrii Nakryiko <andriin@fb.com>
      Cc: YiFei Zhu <zhuyifei1999@gmail.com>
      Link: https://lore.kernel.org/bpf/20200915234543.3220146-2-sdf@google.com
  3. 15 Sep, 2020 7 commits
    • libbpf: Fix a compilation error with xsk.c for ubuntu 16.04 · d317b0a8
      Yonghong Song authored
      When syncing latest libbpf repo to bcc, ubuntu 16.04 (4.4.0 LTS kernel)
      failed compilation for xsk.c:
        In file included from /tmp/debuild.0jkauG/bcc/src/cc/libbpf/src/xsk.c:23:0:
        /tmp/debuild.0jkauG/bcc/src/cc/libbpf/src/xsk.c: In function ‘xsk_get_ctx’:
        /tmp/debuild.0jkauG/bcc/src/cc/libbpf/include/linux/list.h:81:9: warning: implicit
        declaration of function ‘container_of’ [-Wimplicit-function-declaration]
                 container_of(ptr, type, member)
                 ^
        /tmp/debuild.0jkauG/bcc/src/cc/libbpf/include/linux/list.h:83:9: note: in expansion
        of macro ‘list_entry’
                 list_entry((ptr)->next, type, member)
        ...
        src/cc/CMakeFiles/bpf-static.dir/build.make:209: recipe for target
        'src/cc/CMakeFiles/bpf-static.dir/libbpf/src/xsk.c.o' failed
      
      Commit 2f6324a3 ("libbpf: Support shared umems between queues and devices")
      added include file <linux/list.h>, which uses macro "container_of".
      xsk.c file also includes <linux/ethtool.h> before <linux/list.h>.
      
      In a more recent distro kernel, <linux/ethtool.h> includes <linux/kernel.h>,
      which contains the macro definition for "container_of", so compilation succeeds.
      But in the ubuntu 16.04 kernel, <linux/ethtool.h> does not include <linux/kernel.h>,
      which caused the above compilation error.
      
      Let's explicitly include <linux/kernel.h> in xsk.c to avoid the compilation error
      on old distros.
      
      Fixes: 2f6324a3 ("libbpf: Support shared umems between queues and devices")
      Signed-off-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Andrii Nakryiko <andriin@fb.com>
      Acked-by: Song Liu <songliubraving@fb.com>
      Link: https://lore.kernel.org/bpf/20200914223210.1831262-1-yhs@fb.com
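For readers unfamiliar with the macro whose missing definition caused the warning in d317b0a8, here is a simplified, portable sketch of what "container_of" does. The version in the kernel headers adds extra type checking; the `list_node` and `item` types and the `item_from_node` helper are hypothetical and exist only for the example.

```c
#include <stddef.h>

/*
 * Simplified container_of: given a pointer to a member, step back by the
 * member's offset to recover a pointer to the enclosing structure.
 */
#ifndef container_of
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))
#endif

/* Hypothetical intrusive list node embedded in a larger structure. */
struct list_node {
    struct list_node *next;
};

struct item {
    int value;
    struct list_node node;
};

/* Recover the enclosing item from a pointer to its embedded node. */
static struct item *item_from_node(struct list_node *n)
{
    return container_of(n, struct item, node);
}
```

Without a visible definition, a compiler treats `container_of(...)` as a call to an undeclared function, which matches the implicit-declaration warning quoted in the commit message.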
    • bpftool: Fix build failure · 63bea244
      Yonghong Song authored
      When building bpf selftests like
        make -C tools/testing/selftests/bpf -j20
      I hit the following errors:
        ...
        GEN      /net-next/tools/testing/selftests/bpf/tools/build/bpftool/Documentation/bpftool-gen.8
        <stdin>:75: (WARNING/2) Block quote ends without a blank line; unexpected unindent.
        <stdin>:71: (WARNING/2) Literal block ends without a blank line; unexpected unindent.
        <stdin>:85: (WARNING/2) Literal block ends without a blank line; unexpected unindent.
        <stdin>:57: (WARNING/2) Block quote ends without a blank line; unexpected unindent.
        <stdin>:66: (WARNING/2) Literal block ends without a blank line; unexpected unindent.
        <stdin>:109: (WARNING/2) Literal block ends without a blank line; unexpected unindent.
        <stdin>:175: (WARNING/2) Literal block ends without a blank line; unexpected unindent.
        <stdin>:273: (WARNING/2) Literal block ends without a blank line; unexpected unindent.
        make[1]: *** [/net-next/tools/testing/selftests/bpf/tools/build/bpftool/Documentation/bpftool-perf.8] Error 12
        make[1]: *** Waiting for unfinished jobs....
        make[1]: *** [/net-next/tools/testing/selftests/bpf/tools/build/bpftool/Documentation/bpftool-iter.8] Error 12
        make[1]: *** [/net-next/tools/testing/selftests/bpf/tools/build/bpftool/Documentation/bpftool-struct_ops.8] Error 12
        ...
      
      I am using:
        -bash-4.4$ rst2man --version
        rst2man (Docutils 0.11 [repository], Python 2.7.5, on linux2)
        -bash-4.4$
      
      The Makefile generated final .rst file (e.g., bpftool-cgroup.rst) looks like
        ...
            ID       AttachType      AttachFlags     Name
        \n SEE ALSO\n========\n\t**bpf**\ (2),\n\t**bpf-helpers**\
        (7),\n\t**bpftool**\ (8),\n\t**bpftool-btf**\
        (8),\n\t**bpftool-feature**\ (8),\n\t**bpftool-gen**\
        (8),\n\t**bpftool-iter**\ (8),\n\t**bpftool-link**\
        (8),\n\t**bpftool-map**\ (8),\n\t**bpftool-net**\
        (8),\n\t**bpftool-perf**\ (8),\n\t**bpftool-prog**\
        (8),\n\t**bpftool-struct_ops**\ (8)\n
      
      The rst2man generated .8 file looks like
      Literal block ends without a blank line; unexpected unindent.
       .sp
       n SEEALSOn========nt**bpf**(2),nt**bpf\-helpers**(7),nt**bpftool**(8),nt**bpftool\-btf**(8),nt**
       bpftool\-feature**(8),nt**bpftool\-gen**(8),nt**bpftool\-iter**(8),nt**bpftool\-link**(8),nt**
       bpftool\-map**(8),nt**bpftool\-net**(8),nt**bpftool\-perf**(8),nt**bpftool\-prog**(8),nt**
       bpftool\-struct_ops**(8)n
      
      It looks like that particular version of rst2man expects an actual newline
      instead of \n.
      
      Since `echo -e` may not be available in some environments, let us use `printf`.
      Format string "%b" is used for `printf` to ensure all escape characters are
      interpreted properly.
      
      Fixes: 18841da9 ("tools: bpftool: Automate generation for "SEE ALSO" sections in man pages")
      Suggested-by: Andrii Nakryiko <andrii.nakryiko@gmail.com>
      Signed-off-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Song Liu <songliubraving@fb.com>
      Acked-by: Andrii Nakryiko <andriin@fb.com>
      Cc: Quentin Monnet <quentin@isovalent.com>
      Link: https://lore.kernel.org/bpf/20200914183110.999906-1-yhs@fb.com
    • xsk: Fix refcount warning in xp_dma_map · bf74a370
      Magnus Karlsson authored
      Fix a potential refcount warning that a zero value is increased to one
      in xp_dma_map, by initializing the refcount to one to start with,
      instead of zero plus a refcount_inc().
      
      Fixes: 921b6869 ("xsk: Enable sharing of dma mappings")
      Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Song Liu <songliubraving@fb.com>
      Link: https://lore.kernel.org/bpf/1600095036-23868-1-git-send-email-magnus.karlsson@gmail.com
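The pattern fixed in bf74a370 can be illustrated with a toy reference counter. This is a user-space model under stated assumptions, not the kernel's refcount_t: `refcount_model` and its helpers are hypothetical, and the model only records that an increment from zero was flagged; what the kernel does with the counter value afterwards is not modelled.

```c
#include <stdio.h>

/* Hypothetical toy counter that flags increments from zero. */
struct refcount_model {
    unsigned int refs;
    int warned;     /* set to 1 once an increment from zero is seen */
};

static void refcount_model_set(struct refcount_model *r, unsigned int n)
{
    r->refs = n;
    r->warned = 0;
}

static void refcount_model_inc(struct refcount_model *r)
{
    if (r->refs == 0) {
        /* Zero normally means "object gone", so this looks like a bug. */
        r->warned = 1;
        fprintf(stderr, "refcount model: increment from 0\n");
    }
    r->refs++;
}

/* Pattern before the fix: start at zero, then take the first reference. */
static void init_zero_then_inc(struct refcount_model *r)
{
    refcount_model_set(r, 0);
    refcount_model_inc(r);
}

/* Pattern after the fix: start with the first reference already held. */
static void init_to_one(struct refcount_model *r)
{
    refcount_model_set(r, 1);
}
```

Both patterns end with one reference held in this model, but only the second avoids passing through a state that a checker treats as suspicious.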
    • samples/bpf: Add quiet option to xdpsock · 74e00676
      Magnus Karlsson authored
      Add a quiet option (-Q) that disables the statistics printouts of
      xdpsock. This is good to have when measuring 0% loss rate performance,
      as the results will be quite poor if the application uses printfs.
      Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/1599726666-8431-4-git-send-email-magnus.karlsson@gmail.com
    • samples/bpf: Fix possible deadlock in xdpsock · 5a2a0dd8
      Magnus Karlsson authored
      Fix a possible deadlock in the l2fwd application in xdpsock that can
      occur when there is no space in the Tx ring. There are two ways to get
      the kernel to consume entries in the Tx ring: calling sendto() to make
      it send packets and freeing entries from the completion ring, as the
      kernel will not send a packet if there is no space for it to add a
      completion entry in the completion ring. The Tx loop in l2fwd only
      used to call sendto(). This patch adds cleaning of the completion ring
      in that loop.
      Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/1599726666-8431-3-git-send-email-magnus.karlsson@gmail.com
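The stall described in 5a2a0dd8 can be reproduced in a toy model of the two rings. Everything here is hypothetical and heavily simplified: `xsk_model` tracks only fill levels, `model_kick_tx` stands in for sendto(), and `model_drain_cq` stands in for freeing entries from the completion ring. It is not the xdpsock code.

```c
/* Hypothetical model: fill levels and capacities of the two rings. */
struct xsk_model {
    unsigned int tx_used, tx_cap;   /* Tx ring */
    unsigned int cq_used, cq_cap;   /* completion ring */
};

/*
 * Stand-in for sendto(): the kernel sends a Tx entry only if it has room
 * to post the matching completion entry.
 */
static void model_kick_tx(struct xsk_model *x)
{
    while (x->tx_used > 0 && x->cq_used < x->cq_cap) {
        x->tx_used--;
        x->cq_used++;
    }
}

/* Stand-in for the application freeing entries from the completion ring. */
static void model_drain_cq(struct xsk_model *x)
{
    x->cq_used = 0;
}

/*
 * Try to reserve one Tx slot. With drain_cq == 0 the loop only kicks the
 * kernel, as the old Tx loop did; with drain_cq != 0 it also cleans the
 * completion ring. Returns 1 on success, 0 if no slot became free within
 * max_tries iterations.
 */
static int model_reserve_tx(struct xsk_model *x, int drain_cq, int max_tries)
{
    for (int i = 0; i < max_tries; i++) {
        if (x->tx_used < x->tx_cap) {
            x->tx_used++;
            return 1;
        }
        if (drain_cq)
            model_drain_cq(x);
        model_kick_tx(x);
    }
    return 0;
}
```

Starting from a state where both rings are full, the kick-only loop never makes progress, whereas draining the completion ring inside the loop lets the kernel consume Tx entries again.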
    • samples/bpf: Fix one packet sending in xdpsock · 3131cf66
      Magnus Karlsson authored
      Fix the sending of a single packet (or small burst) in xdpsock when
      executing in copy mode. Currently, the l2fwd application in xdpsock
      only transmits the packets after a batch of them has been received,
      which might be confusing if you only send one packet and expect that
      it is returned pronto. Fix this by calling sendto() more often and add
      a comment in the code that states that this can be optimized if
      needed.
      Reported-by: Tirthendu Sarkar <tirthendu.sarkar@intel.com>
      Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/1599726666-8431-2-git-send-email-magnus.karlsson@gmail.com
    • s390/bpf: Fix multiple tail calls · d72714c1
      Ilya Leoshkevich authored
      In order to branch around tail calls (due to out-of-bounds index,
      exceeding tail call count or missing tail call target), JIT uses
      label[0] field, which contains the address of the instruction following
      the tail call. When there are multiple tail calls, label[0] value comes
      from handling of a previous tail call, which is incorrect.
      
      Fix by getting rid of label array and resolving the label address
      locally: for all 3 branches that jump to it, emit 0 offsets at the
      beginning, and then backpatch them with the correct value.
      
      Also, do not use the long jump infrastructure: the tail call sequence
      is known to be short, so make all 3 jumps short.
      
      Fixes: 6651ee07 ("s390/bpf: implement bpf_tail_call() helper")
      Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20200909232141.3099367-1-iii@linux.ibm.com
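The "emit zero offsets, then backpatch" approach from d72714c1 can be sketched with a toy code emitter. The instruction encoding below is invented for the example (it is not s390 machine code), and `emitter`, `emit_branch_placeholder`, and `emit_tail_call` are hypothetical names. The point is only that each tail call resolves its own label locally instead of reading a shared label slot.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_BRANCHES 3  /* out-of-bounds index, count exceeded, no target */

/* Hypothetical byte-buffer emitter. */
struct emitter {
    uint8_t buf[64];
    size_t len;
};

static void emit_byte(struct emitter *e, uint8_t b)
{
    e->buf[e->len++] = b;
}

/*
 * Emit a made-up 2-byte short branch with a zero offset and return the
 * position of the offset byte so that it can be patched later.
 */
static size_t emit_branch_placeholder(struct emitter *e)
{
    emit_byte(e, 0xB0);     /* made-up branch opcode */
    emit_byte(e, 0);        /* offset, filled in by the backpatch step */
    return e->len - 1;
}

/* Emit one tail call sequence and resolve its branches locally. */
static void emit_tail_call(struct emitter *e)
{
    size_t patch[NUM_BRANCHES];

    for (int i = 0; i < NUM_BRANCHES; i++) {
        patch[i] = emit_branch_placeholder(e);
        emit_byte(e, 0x90); /* made-up filler for the surrounding checks */
    }
    emit_byte(e, 0xC0);     /* made-up "jump into target program" */

    /*
     * e->len is now the address of the instruction following the tail
     * call. Patch each branch with its distance to that address, measured
     * from the start of the branch instruction.
     */
    for (int i = 0; i < NUM_BRANCHES; i++)
        e->buf[patch[i]] = (uint8_t)(e->len - (patch[i] - 1));
}
```

Emitting two tail calls back to back yields the same three relative offsets in each sequence, since neither depends on state left behind by the other.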
  4. 11 Sep, 2020 14 commits
  5. 10 Sep, 2020 8 commits
  6. 09 Sep, 2020 1 commit