1. 16 Sep, 2020 2 commits
  2. 15 Sep, 2020 7 commits
    • libbpf: Fix a compilation error with xsk.c for ubuntu 16.04 · d317b0a8
      Yonghong Song authored
      When syncing latest libbpf repo to bcc, ubuntu 16.04 (4.4.0 LTS kernel)
      failed compilation for xsk.c:
        In file included from /tmp/debuild.0jkauG/bcc/src/cc/libbpf/src/xsk.c:23:0:
        /tmp/debuild.0jkauG/bcc/src/cc/libbpf/src/xsk.c: In function ‘xsk_get_ctx’:
        /tmp/debuild.0jkauG/bcc/src/cc/libbpf/include/linux/list.h:81:9: warning: implicit
        declaration of function ‘container_of’ [-Wimplicit-function-declaration]
                 container_of(ptr, type, member)
                 ^
        /tmp/debuild.0jkauG/bcc/src/cc/libbpf/include/linux/list.h:83:9: note: in expansion
        of macro ‘list_entry’
                 list_entry((ptr)->next, type, member)
        ...
        src/cc/CMakeFiles/bpf-static.dir/build.make:209: recipe for target
        'src/cc/CMakeFiles/bpf-static.dir/libbpf/src/xsk.c.o' failed
      
      Commit 2f6324a3 ("libbpf: Support shared umems between queues and devices")
      added include file <linux/list.h>, which uses macro "container_of".
      xsk.c file also includes <linux/ethtool.h> before <linux/list.h>.
      
      In a more recent distro kernel, <linux/ethtool.h> includes <linux/kernel.h>
      which contains the macro definition for "container_of". So compilation is all fine.
      But in ubuntu 16.04 kernel, <linux/ethtool.h> does not include <linux/kernel.h>
      which caused the above compilation error.
      
      Let us explicitly include <linux/kernel.h> in xsk.c to avoid the compilation
      error on old distros.
      
      Fixes: 2f6324a3 ("libbpf: Support shared umems between queues and devices")
      Signed-off-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Andrii Nakryiko <andriin@fb.com>
      Acked-by: Song Liu <songliubraving@fb.com>
      Link: https://lore.kernel.org/bpf/20200914223210.1831262-1-yhs@fb.com
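The dependency described above can be illustrated with a minimal userspace sketch. It assumes a simplified fallback definition of container_of (the kernel's own macro adds type checking) and hypothetical types demo_ctx and list_node that are not taken from xsk.c. The point is only that code expanding container_of() needs a definition in scope, whichever header supplies it.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof */

/* Simplified fallback; only used when no earlier header defined the macro. */
#ifndef container_of
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))
#endif

/* Hypothetical types for illustration; not taken from xsk.c. */
struct list_node {
	struct list_node *next;
};

struct demo_ctx {
	int ifindex;
	struct list_node list;
};

/* Recover the enclosing demo_ctx from a pointer to its embedded list node. */
static struct demo_ctx *ctx_from_node(struct list_node *node)
{
	assert(node != NULL);
	return container_of(node, struct demo_ctx, list);
}
```

Without any definition in scope, the call in ctx_from_node() would be parsed as an implicit function declaration, which is the warning shown in the build log.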
    • bpftool: Fix build failure · 63bea244
      Yonghong Song authored
      When building bpf selftests like
        make -C tools/testing/selftests/bpf -j20
      I hit the following errors:
        ...
        GEN      /net-next/tools/testing/selftests/bpf/tools/build/bpftool/Documentation/bpftool-gen.8
        <stdin>:75: (WARNING/2) Block quote ends without a blank line; unexpected unindent.
        <stdin>:71: (WARNING/2) Literal block ends without a blank line; unexpected unindent.
        <stdin>:85: (WARNING/2) Literal block ends without a blank line; unexpected unindent.
        <stdin>:57: (WARNING/2) Block quote ends without a blank line; unexpected unindent.
        <stdin>:66: (WARNING/2) Literal block ends without a blank line; unexpected unindent.
        <stdin>:109: (WARNING/2) Literal block ends without a blank line; unexpected unindent.
        <stdin>:175: (WARNING/2) Literal block ends without a blank line; unexpected unindent.
        <stdin>:273: (WARNING/2) Literal block ends without a blank line; unexpected unindent.
        make[1]: *** [/net-next/tools/testing/selftests/bpf/tools/build/bpftool/Documentation/bpftool-perf.8] Error 12
        make[1]: *** Waiting for unfinished jobs....
        make[1]: *** [/net-next/tools/testing/selftests/bpf/tools/build/bpftool/Documentation/bpftool-iter.8] Error 12
        make[1]: *** [/net-next/tools/testing/selftests/bpf/tools/build/bpftool/Documentation/bpftool-struct_ops.8] Error 12
        ...
      
      I am using:
        -bash-4.4$ rst2man --version
        rst2man (Docutils 0.11 [repository], Python 2.7.5, on linux2)
        -bash-4.4$
      
      The Makefile generated final .rst file (e.g., bpftool-cgroup.rst) looks like
        ...
            ID       AttachType      AttachFlags     Name
        \n SEE ALSO\n========\n\t**bpf**\ (2),\n\t**bpf-helpers**\
        (7),\n\t**bpftool**\ (8),\n\t**bpftool-btf**\
        (8),\n\t**bpftool-feature**\ (8),\n\t**bpftool-gen**\
        (8),\n\t**bpftool-iter**\ (8),\n\t**bpftool-link**\
        (8),\n\t**bpftool-map**\ (8),\n\t**bpftool-net**\
        (8),\n\t**bpftool-perf**\ (8),\n\t**bpftool-prog**\
        (8),\n\t**bpftool-struct_ops**\ (8)\n
      
      The rst2man generated .8 file looks like
      Literal block ends without a blank line; unexpected unindent.
       .sp
       n SEEALSOn========nt**bpf**(2),nt**bpf\-helpers**(7),nt**bpftool**(8),nt**bpftool\-btf**(8),nt**
       bpftool\-feature**(8),nt**bpftool\-gen**(8),nt**bpftool\-iter**(8),nt**bpftool\-link**(8),nt**
       bpftool\-map**(8),nt**bpftool\-net**(8),nt**bpftool\-perf**(8),nt**bpftool\-prog**(8),nt**
       bpftool\-struct_ops**(8)n
      
      It looks like that particular version of rst2man requires actual newlines
      instead of literal \n sequences.
      
      Since `echo -e` may not be available in some environments, let us use `printf`.
      Format string "%b" is used for `printf` to ensure all escape characters are
      interpreted properly.
      
      Fixes: 18841da9 ("tools: bpftool: Automate generation for "SEE ALSO" sections in man pages")
      Suggested-by: Andrii Nakryiko <andrii.nakryiko@gmail.com>
      Signed-off-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Song Liu <songliubraving@fb.com>
      Acked-by: Andrii Nakryiko <andriin@fb.com>
      Cc: Quentin Monnet <quentin@isovalent.com>
      Link: https://lore.kernel.org/bpf/20200914183110.999906-1-yhs@fb.com
    • xsk: Fix refcount warning in xp_dma_map · bf74a370
      Magnus Karlsson authored
      Fix a potential refcount warning that a zero value is increased to one
      in xp_dma_map, by initializing the refcount to one to start with,
      instead of zero plus a refcount_inc().
      
      Fixes: 921b6869 ("xsk: Enable sharing of dma mappings")
      Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Song Liu <songliubraving@fb.com>
      Link: https://lore.kernel.org/bpf/1600095036-23868-1-git-send-email-magnus.karlsson@gmail.com
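The difference between the two initialization patterns can be sketched in userspace C. The demo_refcount type below is a hypothetical, simplified stand-in for the kernel's refcount_t, and the warned flag only models the fact that incrementing from zero is treated as suspicious. It is not the kernel implementation.

```c
#include <assert.h>

/* Hypothetical stand-in for refcount_t; "warned" models the kernel warning. */
struct demo_refcount {
	unsigned int refs;
	int warned;
};

static void demo_refcount_set(struct demo_refcount *r, unsigned int n)
{
	r->refs = n;
}

/* Incrementing from zero is flagged, since a zero count means "no owner". */
static void demo_refcount_inc(struct demo_refcount *r)
{
	if (r->refs == 0)
		r->warned = 1;
	r->refs++;
}

/* Old pattern: start at zero, then take the first reference with inc. */
static struct demo_refcount demo_create_old(void)
{
	struct demo_refcount r = { 0, 0 };

	demo_refcount_set(&r, 0);
	demo_refcount_inc(&r);
	return r;
}

/* New pattern: the creator's reference is set directly to one. */
static struct demo_refcount demo_create_new(void)
{
	struct demo_refcount r = { 0, 0 };

	demo_refcount_set(&r, 1);
	return r;
}
```

Both patterns end with a count of one, but only the first passes through the zero-to-one increment that triggers the warning.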
    • samples/bpf: Add quiet option to xdpsock · 74e00676
      Magnus Karlsson authored
      Add a quiet option (-Q) that disables the statistics printouts of
      xdpsock. This is good to have when measuring 0% loss rate performance,
      as the results will be quite poor if the application uses printfs.
      Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/1599726666-8431-4-git-send-email-magnus.karlsson@gmail.com
    • samples/bpf: Fix possible deadlock in xdpsock · 5a2a0dd8
      Magnus Karlsson authored
      Fix a possible deadlock in the l2fwd application in xdpsock that can
      occur when there is no space in the Tx ring. There are two ways to get
      the kernel to consume entries in the Tx ring: calling sendto() to make
      it send packets and freeing entries from the completion ring, as the
      kernel will not send a packet if there is no space for it to add a
      completion entry in the completion ring. The Tx loop in l2fwd used to
      call only sendto(). This patch adds cleaning of the completion ring
      to that loop.
      Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/1599726666-8431-3-git-send-email-magnus.karlsson@gmail.com
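The deadlock condition can be modelled with a toy simulation. Everything below is hypothetical: the ring size, the demo_rings counters and the helper names are assumptions made for illustration and do not use the real AF_XDP ring API. demo_kick_tx() stands in for sendto(), and the "kernel" only sends a Tx entry when the completion ring has room for it.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_RING_SIZE 4

struct demo_rings {
	unsigned int tx_pending;    /* descriptors waiting in the Tx ring */
	unsigned int comp_pending;  /* completions not yet reaped by the app */
};

/* Stand-in for sendto(): send only while a completion entry can be added. */
static void demo_kick_tx(struct demo_rings *r)
{
	while (r->tx_pending > 0 && r->comp_pending < DEMO_RING_SIZE) {
		r->tx_pending--;
		r->comp_pending++;
	}
}

/* Stand-in for cleaning the completion ring. */
static void demo_reap_completions(struct demo_rings *r)
{
	r->comp_pending = 0;
}

/* Try to reserve n Tx slots, giving up after max_spins attempts. */
static bool demo_reserve_tx(struct demo_rings *r, unsigned int n,
			    bool reap, unsigned int max_spins)
{
	assert(n <= DEMO_RING_SIZE);
	for (unsigned int spin = 0; spin < max_spins; spin++) {
		if (DEMO_RING_SIZE - r->tx_pending >= n) {
			r->tx_pending += n;
			return true;
		}
		if (reap)
			demo_reap_completions(r);
		demo_kick_tx(r);
	}
	return false;
}
```

With both rings full, calling demo_reserve_tx() with reap set to false never makes progress, while the same call with reap set to true succeeds on the second iteration.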
    • samples/bpf: Fix one packet sending in xdpsock · 3131cf66
      Magnus Karlsson authored
      Fix the sending of a single packet (or small burst) in xdpsock when
      executing in copy mode. Currently, the l2fwd application in xdpsock
      only transmits the packets after a batch of them has been received,
      which might be confusing if you only send one packet and expect that
      it is returned pronto. Fix this by calling sendto() more often and add
      a comment in the code that states that this can be optimized if
      needed.
      Reported-by: Tirthendu Sarkar <tirthendu.sarkar@intel.com>
      Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/1599726666-8431-2-git-send-email-magnus.karlsson@gmail.com
    • s390/bpf: Fix multiple tail calls · d72714c1
      Ilya Leoshkevich authored
      In order to branch around tail calls (due to out-of-bounds index,
      exceeding tail call count or missing tail call target), JIT uses
      label[0] field, which contains the address of the instruction following
      the tail call. When there are multiple tail calls, label[0] value comes
      from handling of a previous tail call, which is incorrect.
      
      Fix by getting rid of label array and resolving the label address
      locally: for all 3 branches that jump to it, emit 0 offsets at the
      beginning, and then backpatch them with the correct value.
      
      Also, do not use the long jump infrastructure: the tail call sequence
      is known to be short, so make all 3 jumps short.
      
      Fixes: 6651ee07 ("s390/bpf: implement bpf_tail_call() helper")
      Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20200909232141.3099367-1-iii@linux.ibm.com
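The backpatching approach can be sketched with a toy code buffer. The encoding below is hypothetical (one slot per "instruction", branches store a relative offset in slots) and has nothing to do with real s390 machine code. It only shows how resolving the label locally keeps each tail call sequence independent of the previous one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_INSNS 32

/* Hypothetical toy JIT buffer; a branch slot holds a relative offset. */
struct demo_jit {
	int16_t code[DEMO_MAX_INSNS];
	size_t len;
};

/* Emit a branch with a placeholder offset of 0 and return its slot index. */
static size_t demo_emit_branch(struct demo_jit *jit)
{
	assert(jit->len < DEMO_MAX_INSNS);
	jit->code[jit->len] = 0;
	return jit->len++;
}

/* Emit a non-branch instruction (encoded as -1 in this toy model). */
static void demo_emit_other(struct demo_jit *jit)
{
	assert(jit->len < DEMO_MAX_INSNS);
	jit->code[jit->len++] = -1;
}

/* Backpatch: make the branch at slot "at" jump to the current end of code. */
static void demo_patch_to_here(struct demo_jit *jit, size_t at)
{
	jit->code[at] = (int16_t)(jit->len - at);
}

/* Emit one tail call sequence with three early exits to a local label. */
static void demo_emit_tail_call(struct demo_jit *jit)
{
	size_t exits[3];

	exits[0] = demo_emit_branch(jit);  /* index out of bounds */
	demo_emit_other(jit);
	exits[1] = demo_emit_branch(jit);  /* tail call count exceeded */
	demo_emit_other(jit);
	exits[2] = demo_emit_branch(jit);  /* missing tail call target */
	demo_emit_other(jit);              /* jump into the target program */

	for (size_t i = 0; i < 3; i++)
		demo_patch_to_here(jit, exits[i]);
}
```

Because the exit address is computed at the end of each sequence, emitting two tail calls in a row gives each set of branches its own correct target.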
  3. 11 Sep, 2020 14 commits
  4. 10 Sep, 2020 8 commits
  5. 09 Sep, 2020 4 commits
    • perf: Stop using deprecated bpf_program__title() · 8081ede1
      Andrii Nakryiko authored
      Switch from deprecated bpf_program__title() API to
      bpf_program__section_name(). Also drop unnecessary error checks because
      neither bpf_program__title() nor bpf_program__section_name() can fail or
      return NULL.
      
      Fixes: 52109584 ("libbpf: Deprecate notion of BPF program "title" in favor of "section name"")
      Signed-off-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Reviewed-by: Tobias Klauser <tklauser@distanz.ch>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Link: https://lore.kernel.org/bpf/20200908180127.1249-1-andriin@fb.com
    • selftests/bpf: Fix test_sysctl_loop{1, 2} failure due to clang change · 7fb5eefd
      Yonghong Song authored
      Andrii reported that with the latest clang, building selftests fails with
      errors like:
        error: progs/test_sysctl_loop1.c:23:16: in function sysctl_tcp_mem i32 (%struct.bpf_sysctl*):
        Looks like the BPF stack limit of 512 bytes is exceeded.
        Please move large on stack variables into BPF per-cpu array map.
      
      The error is triggered by the following LLVM patch:
        https://reviews.llvm.org/D87134
      
      For example, the following code is from test_sysctl_loop1.c:
        static __always_inline int is_tcp_mem(struct bpf_sysctl *ctx)
        {
          volatile char tcp_mem_name[] = "net/ipv4/tcp_mem/very_very_very_very_long_pointless_string";
          ...
        }
      Without the above LLVM patch, the compiler optimized the string load
      (59 bytes long) into 7 64bit loads, 1 8bit load and 1 16bit load,
      occupying 64 bytes of stack.
      
      With the above LLVM patch, the compiler only uses 8bit loads, but each subregister
      is 32bit. So stack requirements become 4 * 59 = 236 bytes. Together with other stuff on
      the stack, total stack size exceeds 512 bytes, hence the compiler complains and quits.
      
      Removing the "volatile" keyword or changing "volatile" to "const"/"static const"
      does not fix the issue: the string is then put in the .rodata.str1.1 section,
      which libbpf does not process, so it errors out with
        libbpf: elf: skipping unrecognized data section(6) .rodata.str1.1
        libbpf: prog 'sysctl_tcp_mem': bad map relo against '.L__const.is_tcp_mem.tcp_mem_name'
                in section '.rodata.str1.1'
      
      Defining the string const as global variable can fix the issue as it puts the string constant
      in '.rodata' section which is recognized by libbpf. In the future, when libbpf can process
      '.rodata.str*.*' properly, the global definition can be changed back to local definition.
      
      Defining tcp_mem_name as a global, however, triggered a verifier failure.
         ./test_progs -n 7/21
        libbpf: load bpf program failed: Permission denied
        libbpf: -- BEGIN DUMP LOG ---
        libbpf:
        invalid stack off=0 size=1
        verification time 6975 usec
        stack depth 160+64
        processed 889 insns (limit 1000000) max_states_per_insn 4 total_states
        14 peak_states 14 mark_read 10
      
        libbpf: -- END LOG --
        libbpf: failed to load program 'sysctl_tcp_mem'
        libbpf: failed to load object 'test_sysctl_loop2.o'
        test_bpf_verif_scale:FAIL:114
        #7/21 test_sysctl_loop2.o:FAIL
      This actually exposed a bpf program bug. In test_sysctl_loop{1,2}, we have code
      like
        const char tcp_mem_name[] = "<...long string...>";
        ...
        char name[64];
        ...
        for (i = 0; i < sizeof(tcp_mem_name); ++i)
            if (name[i] != tcp_mem_name[i])
                return 0;
      In the above code, if sizeof(tcp_mem_name) > 64, name[i] access may be
      out of bound. The sizeof(tcp_mem_name) is 59 for test_sysctl_loop1.c and
      79 for test_sysctl_loop2.c.
      
      Without the promotion-to-global change, the old compiler generates code where
      the overflowed stack access happens to read a valid value, hiding
      the bpf program bug. With the promotion-to-global change, the generated code
      differs: the earlier loading of constants onto the stack is gone,
      "name" occupies stack[-64:0], and the overflowing access triggers a verifier error.
      To fix the issue, adjust the "name" buffer size properly.
      Reported-by: Andrii Nakryiko <andriin@fb.com>
      Signed-off-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Andrii Nakryiko <andriin@fb.com>
      Link: https://lore.kernel.org/bpf/20200909171542.3673449-1-yhs@fb.com
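The out-of-bounds pattern and one possible way to size the buffer can be sketched in plain userspace C. This is an assumption-laden illustration, not the selftest code: demo_is_tcp_mem() and demo_tcp_mem_name are hypothetical names, and tying the buffer size to sizeof() of the string is just one way to "adjust the buffer size properly". The stack arithmetic from the message also checks out: 7 * 8 + 1 + 2 = 59 bytes loaded, versus 4 * 59 = 236 bytes when every byte takes a 32-bit slot.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Global constant, mirroring the promotion-to-global change. */
static const char demo_tcp_mem_name[] =
	"net/ipv4/tcp_mem/very_very_very_very_long_pointless_string";

/* Return 1 when "value" matches the expected name exactly, else 0. */
static int demo_is_tcp_mem(const char *value)
{
	/* The buffer is at least as large as the string it is compared to,
	 * so the loop below can never index past the end of "name". */
	char name[sizeof(demo_tcp_mem_name)];
	size_t i;

	assert(value != NULL);
	if (strlen(value) >= sizeof(name))
		return 0;  /* too long to be a match */

	memset(name, 0, sizeof(name));
	strcpy(name, value);

	for (i = 0; i < sizeof(demo_tcp_mem_name); ++i)
		if (name[i] != demo_tcp_mem_name[i])
			return 0;
	return 1;
}
```

With a fixed char name[64] and a 79-byte string, the same loop would read name[64] through name[78], which is the access the verifier rejects.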
    • selftests/bpf: Add test for map_ptr arithmetic · e6054fc1
      Yonghong Song authored
      Change selftest map_ptr_kern.c to disable inlining for
      one of the subtests, which fails the test without the previous
      verifier change. Also add verifier tests for both
      "map_ptr += scalar" and "scalar += map_ptr" arithmetic.
      Signed-off-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Andrii Nakryiko <andriin@fb.com>
      Link: https://lore.kernel.org/bpf/20200908175703.2463721-1-yhs@fb.com
    • bpf: Permit map_ptr arithmetic with opcode add and offset 0 · 7c696732
      Yonghong Song authored
      Commit 41c48f3a ("bpf: Support access
      to bpf map fields") added support to access map fields
      with CORE support. For example,
      
                  struct bpf_map {
                          __u32 max_entries;
                  } __attribute__((preserve_access_index));
      
                  struct bpf_array {
                          struct bpf_map map;
                          __u32 elem_size;
                  } __attribute__((preserve_access_index));
      
                  struct {
                          __uint(type, BPF_MAP_TYPE_ARRAY);
                          __uint(max_entries, 4);
                          __type(key, __u32);
                          __type(value, __u32);
                  } m_array SEC(".maps");
      
                  SEC("cgroup_skb/egress")
                  int cg_skb(void *ctx)
                  {
                          struct bpf_array *array = (struct bpf_array *)&m_array;
      
                          /* .. array->map.max_entries .. */
                  }
      
      In kernel, bpf_htab has similar structure,
      
                  struct bpf_htab {
                          struct bpf_map map;
                          ...
                  }
      
      In the above cg_skb(), to access array->map.max_entries, with CORE, clang will
      generate two builtins.
                  base = &m_array;
                  /* access array.map */
                  map_addr = __builtin_preserve_struct_access_info(base, 0, 0);
                  /* access array.map.max_entries */
                  max_entries_addr = __builtin_preserve_struct_access_info(map_addr, 0, 0);
                  max_entries = *max_entries_addr;
      
      In the current llvm, if the two builtins are in the same function or
      in the same function after inlining, the compiler is smart enough to chain
      them together and generates code like below:
                  base = &m_array;
                  max_entries = *(base + reloc_offset); /* reloc_offset = 0 in this case */
      and we are fine.
      
      But if we force no inlining for one of the functions in the test_map_ptr() selftest, e.g.,
      check_default(), the above two __builtin_preserve_* will be in two different
      functions. In this case, we will have code like:
         func check_hash():
                  reloc_offset_map = 0;
                  base = &m_array;
                  map_base = base + reloc_offset_map;
                  check_default(map_base, ...)
         func check_default(map_base, ...):
                  max_entries = *(map_base + reloc_offset_max_entries);
      
      In kernel, map_ptr (CONST_PTR_TO_MAP) does not allow any arithmetic.
      The above "map_base = base + reloc_offset_map" will trigger a verifier failure.
        ; VERIFY(check_default(&hash->map, map));
        0: (18) r7 = 0xffffb4fe8018a004
        2: (b4) w1 = 110
        3: (63) *(u32 *)(r7 +0) = r1
         R1_w=invP110 R7_w=map_value(id=0,off=4,ks=4,vs=8,imm=0) R10=fp0
        ; VERIFY_TYPE(BPF_MAP_TYPE_HASH, check_hash);
        4: (18) r1 = 0xffffb4fe8018a000
        6: (b4) w2 = 1
        7: (63) *(u32 *)(r1 +0) = r2
         R1_w=map_value(id=0,off=0,ks=4,vs=8,imm=0) R2_w=invP1 R7_w=map_value(id=0,off=4,ks=4,vs=8,imm=0) R10=fp0
        8: (b7) r2 = 0
        9: (18) r8 = 0xffff90bcb500c000
        11: (18) r1 = 0xffff90bcb500c000
        13: (0f) r1 += r2
        R1 pointer arithmetic on map_ptr prohibited
      
      To fix the issue, let us permit map_ptr + 0 arithmetic which will
      result in exactly the same map_ptr.
      Signed-off-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Andrii Nakryiko <andriin@fb.com>
      Link: https://lore.kernel.org/bpf/20200908175702.2463625-1-yhs@fb.com
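The rule being introduced can be modelled with a heavily simplified decision function. The enums, the demo_scalar struct and demo_ptr_arith_allowed() are hypothetical and do not mirror the kernel verifier's data structures. The sketch only captures the policy stated above: for a map_ptr, the only permitted arithmetic is adding a known constant 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified register and operation kinds. */
enum demo_reg_type {
	DEMO_SCALAR,
	DEMO_CONST_PTR_TO_MAP,
	DEMO_PTR_TO_MAP_VALUE,
};

enum demo_alu_op { DEMO_ADD, DEMO_SUB };

struct demo_scalar {
	bool known;     /* the verifier knows the exact value */
	int64_t value;
};

/* Decide whether "ptr <op>= scalar" is acceptable for this pointer type. */
static bool demo_ptr_arith_allowed(enum demo_reg_type ptr_type,
				   enum demo_alu_op op,
				   struct demo_scalar off)
{
	if (ptr_type == DEMO_CONST_PTR_TO_MAP) {
		/* Only "map_ptr + 0" passes: the result is the same map_ptr. */
		return op == DEMO_ADD && off.known && off.value == 0;
	}
	/* Toy policy for the rest: map values allow arithmetic, scalars
	 * are not pointers at all. */
	return ptr_type == DEMO_PTR_TO_MAP_VALUE;
}
```

In this model the "r1 += r2" instruction from the log, with r2 known to be 0, is accepted, while any nonzero or unknown offset on a map_ptr is still rejected.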
  6. 07 Sep, 2020 3 commits
  7. 04 Sep, 2020 2 commits