1. 09 Dec, 2022 5 commits
  2. 08 Dec, 2022 8 commits
    • Stanislav Fomichev's avatar
      selftests/bpf: Bring test_offload.py back to life · e60db051
      Stanislav Fomichev authored
      Bpftool has new extra libbpf_det_bind probing map we need to exclude.
      Also skip trying to load netdevsim modules if it's already loaded (builtin).
      
      v2:
      - drop iproute2->bpftool changes (Toke)
      Signed-off-by: default avatarStanislav Fomichev <sdf@google.com>
      Signed-off-by: default avatarAndrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/bpf/20221206232739.2504890-1-sdf@google.com
      e60db051
    • Yang Jihong's avatar
      bpf: Fix comment error in fixup_kfunc_call function · c2cc0ce7
      Yang Jihong authored
      insn->imm for kfunc is the relative address of __bpf_call_base,
      instead of __bpf_base_call, Fix the comment error.
      Signed-off-by: default avatarYang Jihong <yangjihong1@huawei.com>
      Link: https://lore.kernel.org/r/20221208013724.257848-1-yangjihong1@huawei.comSigned-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      c2cc0ce7
    • Björn Töpel's avatar
      bpf: Do not zero-extend kfunc return values · d35af0a7
      Björn Töpel authored
      In BPF all global functions, and BPF helpers return a 64-bit
      value. For kfunc calls, this is not the case, and they can return
      e.g. 32-bit values.
      
      The return register R0 for kfuncs calls can therefore be marked as
      subreg_def != DEF_NOT_SUBREG. In general, if a register is marked with
      subreg_def != DEF_NOT_SUBREG, some archs (where bpf_jit_needs_zext()
      returns true) require the verifier to insert explicit zero-extension
      instructions.
      
      For kfuncs calls, however, the caller should do sign/zero extension
      for return values. In other words, the compiler is responsible to
      insert proper instructions, not the verifier.
      
      An example, provided by Yonghong Song:
      
      $ cat t.c
      extern unsigned foo(void);
      unsigned bar1(void) {
           return foo();
      }
      unsigned bar2(void) {
           if (foo()) return 10; else return 20;
      }
      
      $ clang -target bpf -mcpu=v3 -O2 -c t.c && llvm-objdump -d t.o
      t.o:    file format elf64-bpf
      
      Disassembly of section .text:
      
      0000000000000000 <bar1>:
      	0:       85 10 00 00 ff ff ff ff call -0x1
      	1:       95 00 00 00 00 00 00 00 exit
      
      0000000000000010 <bar2>:
      	2:       85 10 00 00 ff ff ff ff call -0x1
      	3:       bc 01 00 00 00 00 00 00 w1 = w0
      	4:       b4 00 00 00 14 00 00 00 w0 = 0x14
      	5:       16 01 01 00 00 00 00 00 if w1 == 0x0 goto +0x1 <LBB1_2>
      	6:       b4 00 00 00 0a 00 00 00 w0 = 0xa
      
      0000000000000038 <LBB1_2>:
      	7:       95 00 00 00 00 00 00 00 exit
      
      If the return value of 'foo()' is used in the BPF program, the proper
      zero-extension will be done.
      
      Currently, the verifier correctly marks, say, a 32-bit return value as
      subreg_def != DEF_NOT_SUBREG, but will fail performing the actual
      zero-extension, due to a verifier bug in
      opt_subreg_zext_lo32_rnd_hi32(). load_reg is not properly set to R0,
      and the following path will be taken:
      
      		if (WARN_ON(load_reg == -1)) {
      			verbose(env, "verifier bug. zext_dst is set, but no reg is defined\n");
      			return -EFAULT;
      		}
      
      A longer discussion from v1 can be found in the link below.
      
      Correct the verifier by avoiding doing explicit zero-extension of R0
      for kfunc calls. Note that R0 will still be marked as a sub-register
      for return values smaller than 64-bit.
      
      Fixes: 83a28819 ("bpf: Account for BPF_FETCH in insn_has_def32()")
      Link: https://lore.kernel.org/bpf/20221202103620.1915679-1-bjorn@kernel.org/Suggested-by: default avatarYonghong Song <yhs@meta.com>
      Signed-off-by: default avatarBjörn Töpel <bjorn@rivosinc.com>
      Acked-by: default avatarYonghong Song <yhs@fb.com>
      Link: https://lore.kernel.org/r/20221207103540.396496-1-bjorn@kernel.orgSigned-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      d35af0a7
    • Alexei Starovoitov's avatar
      Merge branch 'Document some recent core kfunc additions' · 2d141236
      Alexei Starovoitov authored
      David Vernet says:
      
      ====================
      
      A series of recent patch sets introduced kfuncs that allowed struct
      task_struct and struct cgroup objects to be used as kptrs. These were
      introduced in [0], [1], and [2].
      
      [0]: https://lore.kernel.org/lkml/20221120051004.3605026-1-void@manifault.com/
      [1]: https://lore.kernel.org/lkml/20221122145300.251210-2-void@manifault.com/T/
      [2]: https://lore.kernel.org/lkml/20221122055458.173143-1-void@manifault.com/
      
      These are "core" kfuncs, in that they may be used by a wide variety of
      possible BPF tracepoint or struct_ops programs, and are defined in
      kernel/bpf/helpers.c. Even though as kfuncs they have no ABI stability
      guarantees, they should still be properly documented. This patch set
      adds that documentation.
      
      Some other kfuncs were added recently as well, such as
      bpf_rcu_read_lock() and bpf_rcu_read_unlock(). Those could and should be
      added to this "Core kfuncs" section as well in subsequent patch sets.
      
      Note that this patch set does not contain documentation for
      bpf_task_acquire_not_zero(), or bpf_task_kptr_get(). As discussed in
      [3], those kfuncs currently always return NULL pending resolution on how
      to properly protect their arguments using RCU.
      
      [3]: https://lore.kernel.org/all/20221206210538.597606-1-void@manifault.com/
      ---
      Changelog:
      v2 -> v3:
      - Don't document bpf_task_kptr_get(), and instead provide a more
        substantive example for bpf_cgroup_kptr_get().
      - Further clarify expected behavior of bpf_task_from_pid() in comments
        (Alexei)
      
      v1 -> v2:
      - Expand comment to specify that a map holds a reference to a task kptr
        if we don't end up releasing it (Alexei)
      - Just read task->pid instead of using a probed read (Alexei)
      ====================
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      2d141236
    • David Vernet's avatar
      bpf/docs: Document struct cgroup * kfuncs · 36aa10ff
      David Vernet authored
      bpf_cgroup_acquire(), bpf_cgroup_release(), bpf_cgroup_kptr_get(), and
      bpf_cgroup_ancestor(), are kfuncs that were recently added to
      kernel/bpf/helpers.c. These are "core" kfuncs in that they're available
      for use in any tracepoint or struct_ops BPF program. Though they have no
      ABI stability guarantees, we should still document them. This patch adds
      a struct cgroup * subsection to the Core kfuncs section which describes
      each of these kfuncs.
      Signed-off-by: default avatarDavid Vernet <void@manifault.com>
      Link: https://lore.kernel.org/r/20221207204911.873646-3-void@manifault.comSigned-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      36aa10ff
    • David Vernet's avatar
      bpf/docs: Document struct task_struct * kfuncs · 25c5e92d
      David Vernet authored
      bpf_task_acquire(), bpf_task_release(), and bpf_task_from_pid() are
      kfuncs that were recently added to kernel/bpf/helpers.c. These are
      "core" kfuncs in that they're available for use for any tracepoint or
      struct_ops BPF program. Though they have no ABI stability guarantees, we
      should still document them. This patch adds a new Core kfuncs section to
      the BPF kfuncs doc, and adds entries for all of these task kfuncs.
      
      Note that bpf_task_kptr_get() is not documented, as it still returns
      NULL while we're working to resolve how it can use RCU to ensure struct
      task_struct * lifetime.
      Signed-off-by: default avatarDavid Vernet <void@manifault.com>
      Link: https://lore.kernel.org/r/20221207204911.873646-2-void@manifault.comSigned-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      25c5e92d
    • Andrii Nakryiko's avatar
      selftests/bpf: convert dynptr_fail and map_kptr_fail subtests to generic tester · 26c386ec
      Andrii Nakryiko authored
      Convert big chunks of dynptr and map_kptr subtests to use generic
      verification_tester. They are switched from using manually maintained
      tables of test cases, specifying program name and expected error
      verifier message, to btf_decl_tag-based annotations directly on
      corresponding BPF programs: __failure to specify that BPF program is
      expected to fail verification, and __msg() to specify expected log
      message.
      Acked-by: default avatarJohn Fastabend <john.fastabend@gmail.com>
      Signed-off-by: default avatarAndrii Nakryiko <andrii@kernel.org>
      Acked-by: default avatarKumar Kartikeya Dwivedi <memxor@gmail.com>
      Link: https://lore.kernel.org/r/20221207201648.2990661-2-andrii@kernel.orgSigned-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      26c386ec
    • Andrii Nakryiko's avatar
      selftests/bpf: add generic BPF program tester-loader · 537c3f66
      Andrii Nakryiko authored
      It's become a common pattern to have a collection of small BPF programs
      in one BPF object file, each representing one test case. On user-space
      side of such tests we maintain a table of program names and expected
      failure or success, along with optional expected verifier log message.
      
      This works, but each set of tests reimplement this mundane code over and
      over again, which is a waste of time for anyone trying to add a new set
      of tests. Furthermore, it's quite error prone as it's way too easy to miss
      some entries in these manually maintained test tables (as evidences by
      dynptr_fail tests, in which ringbuf_release_uninit_dynptr subtest was
      accidentally missed; this is fixed in next patch).
      
      So this patch implements generic test_loader, which accepts skeleton
      name and handles the rest of details: opens and loads BPF object file,
      making sure each program is tested in isolation. Optionally each test
      case can specify expected BPF verifier log message. In case of failure,
      tester makes sure to report verifier log, but it also reports verifier
      log in verbose mode unconditionally.
      
      Now, the interesting deviation from existing custom implementations is
      the use of btf_decl_tag attribute to specify expected-to-fail vs
      expected-to-succeed markers and, optionally, expected log message
      directly next to BPF program source code, eliminating the need to
      manually create and update table of tests.
      
      We define few macros wrapping btf_decl_tag with a convention that all
      values of btf_decl_tag start with "comment:" prefix, and then utilizing
      a very simple "just_some_text_tag" or "some_key_name=<value>" pattern to
      define things like expected success/failure, expected verifier message,
      extra verifier log level (if necessary). This approach is demonstrated
      by next patch in which two existing sets of failure tests are converted.
      
      Tester supports both expected-to-fail and expected-to-succeed programs,
      though this patch set didn't convert any existing expected-to-succeed
      programs yet, as existing tests couple BPF program loading with their
      further execution through attach or test_prog_run. One way to allow
      testing scenarios like this would be ability to specify custom callback,
      executed for each successfully loaded BPF program. This is left for
      follow up patches, after some more analysis of existing test cases.
      
      This test_loader is, hopefully, a start of a test_verifier-like runner,
      but integrated into test_progs infrastructure. It will allow much better
      "user experience" of defining low-level verification tests that can take
      advantage of all the libbpf-provided nicety features on BPF side: global
      variables, declarative maps, etc.  All while having a choice of defining
      it in C or as BPF assembly (through __attribute__((naked)) functions and
      using embedded asm), depending on what makes most sense in each
      particular case. This will be explored in follow up patches as well.
      Acked-by: default avatarJohn Fastabend <john.fastabend@gmail.com>
      Signed-off-by: default avatarAndrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/r/20221207201648.2990661-1-andrii@kernel.orgSigned-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      537c3f66
  3. 07 Dec, 2022 15 commits
  4. 06 Dec, 2022 8 commits
  5. 05 Dec, 2022 4 commits
    • James Hilliard's avatar
      selftests/bpf: Fix conflicts with built-in functions in bpf_iter_ksym · ab0350c7
      James Hilliard authored
      Both tolower and toupper are built in c functions, we should not
      redefine them as this can result in a build error.
      
      Fixes the following errors:
      progs/bpf_iter_ksym.c:10:20: error: conflicting types for built-in function 'tolower'; expected 'int(int)' [-Werror=builtin-declaration-mismatch]
         10 | static inline char tolower(char c)
            |                    ^~~~~~~
      progs/bpf_iter_ksym.c:5:1: note: 'tolower' is declared in header '<ctype.h>'
          4 | #include <bpf/bpf_helpers.h>
        +++ |+#include <ctype.h>
          5 |
      progs/bpf_iter_ksym.c:17:20: error: conflicting types for built-in function 'toupper'; expected 'int(int)' [-Werror=builtin-declaration-mismatch]
         17 | static inline char toupper(char c)
            |                    ^~~~~~~
      progs/bpf_iter_ksym.c:17:20: note: 'toupper' is declared in header '<ctype.h>'
      
      See background on this sort of issue:
      https://stackoverflow.com/a/20582607
      https://gcc.gnu.org/bugzilla/show_bug.cgi?id=12213
      
      (C99, 7.1.3p1) "All identifiers with external linkage in any of the
      following subclauses (including the future library directions) are
      always reserved for use as identifiers with external linkage."
      
      This is documented behavior in GCC:
      https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html#index-std-2Signed-off-by: default avatarJames Hilliard <james.hilliard1@gmail.com>
      Acked-by: default avatarAndrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/r/20221203010847.2191265-1-james.hilliard1@gmail.comSigned-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      ab0350c7
    • Eric Dumazet's avatar
      bpf, sockmap: fix race in sock_map_free() · 0a182f8d
      Eric Dumazet authored
      sock_map_free() calls release_sock(sk) without owning a reference
      on the socket. This can cause use-after-free as syzbot found [1]
      
      Jakub Sitnicki already took care of a similar issue
      in sock_hash_free() in commit 75e68e5b ("bpf, sockhash:
      Synchronize delete from bucket list on map free")
      
      [1]
      refcount_t: decrement hit 0; leaking memory.
      WARNING: CPU: 0 PID: 3785 at lib/refcount.c:31 refcount_warn_saturate+0x17c/0x1a0 lib/refcount.c:31
      Modules linked in:
      CPU: 0 PID: 3785 Comm: kworker/u4:6 Not tainted 6.1.0-rc7-syzkaller-00103-gef4d3ea4 #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022
      Workqueue: events_unbound bpf_map_free_deferred
      RIP: 0010:refcount_warn_saturate+0x17c/0x1a0 lib/refcount.c:31
      Code: 68 8b 31 c0 e8 75 71 15 fd 0f 0b e9 64 ff ff ff e8 d9 6e 4e fd c6 05 62 9c 3d 0a 01 48 c7 c7 80 bb 68 8b 31 c0 e8 54 71 15 fd <0f> 0b e9 43 ff ff ff 89 d9 80 e1 07 80 c1 03 38 c1 0f 8c a2 fe ff
      RSP: 0018:ffffc9000456fb60 EFLAGS: 00010246
      RAX: eae59bab72dcd700 RBX: 0000000000000004 RCX: ffff8880207057c0
      RDX: 0000000000000000 RSI: 0000000000000201 RDI: 0000000000000000
      RBP: 0000000000000004 R08: ffffffff816fdabd R09: fffff520008adee5
      R10: fffff520008adee5 R11: 1ffff920008adee4 R12: 0000000000000004
      R13: dffffc0000000000 R14: ffff88807b1c6c00 R15: 1ffff1100f638dcf
      FS: 0000000000000000(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
      CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000001b30c30000 CR3: 000000000d08e000 CR4: 00000000003506f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
      <TASK>
      __refcount_dec include/linux/refcount.h:344 [inline]
      refcount_dec include/linux/refcount.h:359 [inline]
      __sock_put include/net/sock.h:779 [inline]
      tcp_release_cb+0x2d0/0x360 net/ipv4/tcp_output.c:1092
      release_sock+0xaf/0x1c0 net/core/sock.c:3468
      sock_map_free+0x219/0x2c0 net/core/sock_map.c:356
      process_one_work+0x81c/0xd10 kernel/workqueue.c:2289
      worker_thread+0xb14/0x1330 kernel/workqueue.c:2436
      kthread+0x266/0x300 kernel/kthread.c:376
      ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:306
      </TASK>
      
      Fixes: 7e81a353 ("bpf: Sockmap, ensure sock lock held during tear down")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Cc: Jakub Sitnicki <jakub@cloudflare.com>
      Cc: John Fastabend <john.fastabend@gmail.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Song Liu <songliubraving@fb.com>
      Acked-by: default avatarJohn Fastabend <john.fastabend@gmail.com>
      Link: https://lore.kernel.org/r/20221202111640.2745533-1-edumazet@google.comSigned-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      0a182f8d
    • Toke Høiland-Jørgensen's avatar
      bpf: Add dummy type reference to nf_conn___init to fix type deduplication · 578ce69f
      Toke Høiland-Jørgensen authored
      The bpf_ct_set_nat_info() kfunc is defined in the nf_nat.ko module, and
      takes as a parameter the nf_conn___init struct, which is allocated through
      the bpf_xdp_ct_alloc() helper defined in the nf_conntrack.ko module.
      However, because kernel modules can't deduplicate BTF types between each
      other, and the nf_conn___init struct is not referenced anywhere in vmlinux
      BTF, this leads to two distinct BTF IDs for the same type (one in each
      module). This confuses the verifier, as described here:
      
      https://lore.kernel.org/all/87leoh372s.fsf@toke.dk/
      
      As a workaround, add an explicit BTF_TYPE_EMIT for the type in
      net/filter.c, so the type definition gets included in vmlinux BTF. This
      way, both modules can refer to the same type ID (as they both build on top
      of vmlinux BTF), and the verifier is no longer confused.
      
      v2:
      
      - Use BTF_TYPE_EMIT (which is a statement so it has to be inside a function
        definition; use xdp_func_proto() for this, since this is mostly
        xdp-related).
      
      Fixes: 820dc052 ("net: netfilter: move bpf_ct_set_nat_info kfunc in nf_nat_bpf.c")
      Signed-off-by: default avatarToke Høiland-Jørgensen <toke@redhat.com>
      Link: https://lore.kernel.org/r/20221201123939.696558-1-toke@redhat.comSigned-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      578ce69f
    • Yonghong Song's avatar
      bpf: Add sleepable prog tests for cgrp local storage · 41d76c72
      Yonghong Song authored
      Add three tests for cgrp local storage support for sleepable progs.
      Two tests can load and run properly, one for cgroup_iter, another
      for passing current->cgroups->dfl_cgrp to bpf_cgrp_storage_get()
      helper. One test has bpf_rcu_read_lock() and failed to load.
      Signed-off-by: default avatarYonghong Song <yhs@fb.com>
      Link: https://lore.kernel.org/r/20221201050449.2785613-1-yhs@fb.comSigned-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      41d76c72