Commits · cd984b2ed62423eb3daceacb21d651115a612af6 · Kirill Smelkov / linux

06 Jun, 2024 7 commits

selftests/bpf: Use start_test in test_dctcp in bpf_tcp_ca · cd984b2e

Geliang Tang authored May 30, 2024

The "if (sk_stg_map)" block in do_test() is only used by test_dctcp(),
it makes sense to move it from do_test() into test_dctcp(). Then
do_test() can be used by other tests except test_dctcp().
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/9938916627b9810c877e5c03a621bc0ba5acf5c5.1717054461.git.tanggeliang@kylinos.cn

cd984b2e

selftests/bpf: Use start_test in test_dctcp_fallback in bpf_tcp_ca · 224eeb55

Geliang Tang authored May 30, 2024

The newly added helper start_test() can be used in test_dctcp_fallback()
too, to replace start_server_str() and connect_to_fd_opts(). In that
way, two network_helper_opts srv_opts and cli_opts are used instead of
the previously shared opts.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/792ca3bb013fa06e618176da02d75e4f79a76733.1717054461.git.tanggeliang@kylinos.cn

224eeb55

selftests/bpf: Add start_test helper in bpf_tcp_ca · fee97d0c

Geliang Tang authored May 30, 2024

For moving the "if (sk_stg_map)" block out of do_test(), extract the
code before this block as a new function start_test(). It creates
server-side and client-side sockets and returns them to the caller.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/48f2921ff9be958f5d3d28fe6bb7269a61cafa9f.1717054461.git.tanggeliang@kylinos.cn

fee97d0c

selftests/bpf: Use connect_to_fd_opts in do_test in bpf_tcp_ca · 9abdfd8a

Geliang Tang authored May 30, 2024

This patch uses connect_to_fd_opts() instead of using connect_fd_to_fd()
and settcpca() in do_test() in prog_tests/bpf_tcp_ca.c to accept a struct
network_helper_opts argument.

Then define a dctcp dedicated post_socket_cb callback stg_post_socket_cb(),
invoking both settcpca() and bpf_map_update_elem() in it, and set it in
test_dctcp(). For passing map_fd into stg_post_socket_cb() callback, a new
member map_fd is added in struct cb_opts.

Add another "const struct network_helper_opts *cli_opts" to do_test() to
separate it from the server "opts".
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/876ec90430865bc468e3b7f6fb2648420b075548.1717054461.git.tanggeliang@kylinos.cn

9abdfd8a

libbpf: Auto-attach struct_ops BPF maps in BPF skeleton · 08ac454e

Mykyta Yatsenko authored Jun 05, 2024

Similarly to `bpf_program`, support `bpf_map` automatic attachment in
`bpf_object__attach_skeleton`. Currently only struct_ops maps could be
attached.

On bpftool side, code-generate links in skeleton struct for struct_ops maps.
Similarly to `bpf_program_skeleton`, set links in `bpf_map_skeleton`.

On libbpf side, extend `bpf_map` with new `autoattach` field to support
enabling or disabling autoattach functionality, introducing
getter/setter for this field.

`bpf_object__(attach|detach)_skeleton` is extended with
attaching/detaching struct_ops maps logic.
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240605175135.117127-1-yatsenko@meta.com

08ac454e

selftests/bpf: Add btf_field_iter selftests · b24862ba

Alan Maguire authored Jun 05, 2024

The added selftests verify that for every BTF kind we iterate correctly
over consituent strings and ids.
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240605153314.3727466-1-alan.maguire@oracle.com

b24862ba

selftests/bpf: Fix send_signal test with nested CONFIG_PARAVIRT · 7015843a

Yonghong Song authored Jun 05, 2024

Alexei reported that send_signal test may fail with nested CONFIG_PARAVIRT
configs. In this particular case, the base VM is AMD with 166 cpus, and I
run selftests with regular qemu on top of that and indeed send_signal test
failed. I also tried with an Intel box with 80 cpus and there is no issue.

The main qemu command line includes:

  -enable-kvm -smp 16 -cpu host

The failure log looks like:

  $ ./test_progs -t send_signal
  [   48.501588] watchdog: BUG: soft lockup - CPU#9 stuck for 26s! [test_progs:2225]
  [   48.503622] Modules linked in: bpf_testmod(O)
  [   48.503622] CPU: 9 PID: 2225 Comm: test_progs Tainted: G           O       6.9.0-08561-g2c1713a8-dirty #69
  [   48.507629] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014
  [   48.511635] RIP: 0010:handle_softirqs+0x71/0x290
  [   48.511635] Code: [...] 10 0a 00 00 00 31 c0 65 66 89 05 d5 f4 fa 7e fb bb ff ff ff ff <49> c7 c2 cb
  [   48.518527] RSP: 0018:ffffc90000310fa0 EFLAGS: 00000246
  [   48.519579] RAX: 0000000000000000 RBX: 00000000ffffffff RCX: 00000000000006e0
  [   48.522526] RDX: 0000000000000006 RSI: ffff88810791ae80 RDI: 0000000000000000
  [   48.523587] RBP: ffffc90000fabc88 R08: 00000005a0af4f7f R09: 0000000000000000
  [   48.525525] R10: 0000000561d2f29c R11: 0000000000006534 R12: 0000000000000280
  [   48.528525] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
  [   48.528525] FS:  00007f2f2885cd00(0000) GS:ffff888237c40000(0000) knlGS:0000000000000000
  [   48.531600] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [   48.535520] CR2: 00007f2f287059f0 CR3: 0000000106a28002 CR4: 00000000003706f0
  [   48.537538] Call Trace:
  [   48.537538]  <IRQ>
  [   48.537538]  ? watchdog_timer_fn+0x1cd/0x250
  [   48.539590]  ? lockup_detector_update_enable+0x50/0x50
  [   48.539590]  ? __hrtimer_run_queues+0xff/0x280
  [   48.542520]  ? hrtimer_interrupt+0x103/0x230
  [   48.544524]  ? __sysvec_apic_timer_interrupt+0x4f/0x140
  [   48.545522]  ? sysvec_apic_timer_interrupt+0x3a/0x90
  [   48.547612]  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
  [   48.547612]  ? handle_softirqs+0x71/0x290
  [   48.547612]  irq_exit_rcu+0x63/0x80
  [   48.551585]  sysvec_apic_timer_interrupt+0x75/0x90
  [   48.552521]  </IRQ>
  [   48.553529]  <TASK>
  [   48.553529]  asm_sysvec_apic_timer_interrupt+0x1a/0x20
  [   48.555609] RIP: 0010:finish_task_switch.isra.0+0x90/0x260
  [   48.556526] Code: [...] 9f 58 0a 00 00 48 85 db 0f 85 89 01 00 00 4c 89 ff e8 53 d9 bd 00 fb 66 90 <4d> 85 ed 74
  [   48.562524] RSP: 0018:ffffc90000fabd38 EFLAGS: 00000282
  [   48.563589] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff83385620
  [   48.563589] RDX: ffff888237c73ae4 RSI: 0000000000000000 RDI: ffff888237c6fd00
  [   48.568521] RBP: ffffc90000fabd68 R08: 0000000000000000 R09: 0000000000000000
  [   48.569528] R10: 0000000000000001 R11: 0000000000000000 R12: ffff8881009d0000
  [   48.573525] R13: ffff8881024e5400 R14: ffff88810791ae80 R15: ffff888237c6fd00
  [   48.575614]  ? finish_task_switch.isra.0+0x8d/0x260
  [   48.576523]  __schedule+0x364/0xac0
  [   48.577535]  schedule+0x2e/0x110
  [   48.578555]  pipe_read+0x301/0x400
  [   48.579589]  ? destroy_sched_domains_rcu+0x30/0x30
  [   48.579589]  vfs_read+0x2b3/0x2f0
  [   48.579589]  ksys_read+0x8b/0xc0
  [   48.583590]  do_syscall_64+0x3d/0xc0
  [   48.583590]  entry_SYSCALL_64_after_hwframe+0x4b/0x53
  [   48.586525] RIP: 0033:0x7f2f28703fa1
  [   48.587592] Code: [...] 00 00 00 0f 1f 44 00 00 f3 0f 1e fa 80 3d c5 23 14 00 00 74 13 31 c0 0f 05 <48> 3d 00 f0
  [   48.593534] RSP: 002b:00007ffd90f8cf88 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
  [   48.595589] RAX: ffffffffffffffda RBX: 00007ffd90f8d5e8 RCX: 00007f2f28703fa1
  [   48.595589] RDX: 0000000000000001 RSI: 00007ffd90f8cfb0 RDI: 0000000000000006
  [   48.599592] RBP: 00007ffd90f8d2f0 R08: 0000000000000064 R09: 0000000000000000
  [   48.602527] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
  [   48.603589] R13: 00007ffd90f8d608 R14: 00007f2f288d8000 R15: 0000000000f6bdb0
  [   48.605527]  </TASK>

In the test, two processes are communicating through pipe. Further debugging
with strace found that the above splat is triggered as read() syscall could
not receive the data even if the corresponding write() syscall in another
process successfully wrote data into the pipe.

The failed subtest is "send_signal_perf". The corresponding perf event has
sample_period 1 and config PERF_COUNT_SW_CPU_CLOCK. sample_period 1 means every
overflow event will trigger a call to the BPF program. So I suspect this may
overwhelm the system. So I increased the sample_period to 100,000 and the test
passed. The sample_period 10,000 still has the test failed.

In other parts of selftest, e.g., [1], sample_freq is used instead. So I
decided to use sample_freq = 1,000 since the test can pass as well.

  [1] https://lore.kernel.org/bpf/20240604070700.3032142-1-song@kernel.org/Reported-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240605201203.2603846-1-yonghong.song@linux.dev

7015843a

05 Jun, 2024 5 commits

libbpf: Remove callback-based type/string BTF field visitor helpers · 07208870

Andrii Nakryiko authored Jun 04, 2024

Now that all libbpf/bpftool code switched to btf_field_iter, remove
btf_type_visit_type_ids() and btf_type_visit_str_offs() callback-based
helpers as not needed anymore.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Alan Maguire <alan.maguire@oracle.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20240605001629.4061937-6-andrii@kernel.org

07208870

bpftool: Use BTF field iterator in btfgen · e1a86302

Andrii Nakryiko authored Jun 04, 2024

Switch bpftool's code which is using libbpf-internal
btf_type_visit_type_ids() helper to new btf_field_iter functionality.

This makes bpftool code simpler, but also unblocks removing libbpf's
btf_type_visit_type_ids() helper completely.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Alan Maguire <alan.maguire@oracle.com>
Reviewed-by: Quentin Monnet <qmo@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20240605001629.4061937-5-andrii@kernel.org

e1a86302

libbpf: Make use of BTF field iterator in BTF handling code · c2641123

Andrii Nakryiko authored Jun 04, 2024

Use new BTF field iterator logic to replace all the callback-based
visitor calls. There is still a .BTF.ext callback-based visitor APIs
that should be converted, which will happens as a follow up.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Alan Maguire <alan.maguire@oracle.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20240605001629.4061937-4-andrii@kernel.org

c2641123

libbpf: Make use of BTF field iterator in BPF linker code · 2bce2c1c

Andrii Nakryiko authored Jun 04, 2024

Switch all BPF linker code dealing with iterating BTF type ID and string
offset fields to new btf_field_iter facilities.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Alan Maguire <alan.maguire@oracle.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20240605001629.4061937-3-andrii@kernel.org

2bce2c1c

libbpf: Add BTF field iterator · 68153bb2

Andrii Nakryiko authored Jun 04, 2024

Implement iterator-based type ID and string offset BTF field iterator.
This is used extensively in BTF-handling code and BPF linker code for
various sanity checks, rewriting IDs/offsets, etc. Currently this is
implemented as visitor pattern calling custom callbacks, which makes the
logic (especially in simple cases) unnecessarily obscure and harder to
follow.

Having equivalent functionality using iterator pattern makes for simpler
to understand and maintain code. As we add more code for BTF processing
logic in libbpf, it's best to switch to iterator pattern before adding
more callback-based code.

The idea for iterator-based implementation is to record offsets of
necessary fields within fixed btf_type parts (which should be iterated
just once), and, for kinds that have multiple members (based on vlen
field), record where in each member necessary fields are located.

Generic iteration code then just keeps track of last offset that was
returned and handles N members correctly. Return type is just u32
pointer, where NULL is returned when all relevant fields were already
iterated.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Alan Maguire <alan.maguire@oracle.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20240605001629.4061937-2-andrii@kernel.org

68153bb2

04 Jun, 2024 12 commits

selftests/bpf: Ignore .llvm.<hash> suffix in kallsyms_find() · 898ac74c

Yonghong Song authored Jun 04, 2024

I hit the following failure when running selftests with
internal backported upstream kernel:
  test_ksyms:PASS:kallsyms_fopen 0 nsec
  test_ksyms:FAIL:ksym_find symbol 'bpf_link_fops' not found
  #123     ksyms:FAIL

In /proc/kallsyms, we have
  $ cat /proc/kallsyms | grep bpf_link_fops
  ffffffff829f0cb0 d bpf_link_fops.llvm.12608678492448798416
The CONFIG_LTO_CLANG_THIN is enabled in the kernel which is responsible
for bpf_link_fops.llvm.12608678492448798416 symbol name.

In prog_tests/ksyms.c we have
  kallsyms_find("bpf_link_fops", &link_fops_addr)
and kallsyms_find() compares "bpf_link_fops" with symbols
in /proc/kallsyms in order to find the entry. With
bpf_link_fops.llvm.<hash> in /proc/kallsyms, the kallsyms_find()
failed.

To fix the issue, in kallsyms_find(), if a symbol has suffix
.llvm.<hash>, that suffix will be ignored for comparison.
This fixed the test failure.
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240604180034.1356016-1-yonghong.song@linux.dev

898ac74c

selftests/bpf: Fix bpf_cookie and find_vma in nested VM · 61ce0ea7

Song Liu authored Jun 04, 2024

bpf_cookie and find_vma are flaky in nested VMs, which is used by some CI
systems. It turns out these failures are caused by unreliable perf event
in nested VM. Fix these by:

  1. Use PERF_COUNT_SW_CPU_CLOCK in find_vma;
  2. Increase sample_freq in bpf_cookie.
Signed-off-by: Song Liu <song@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240604070700.3032142-1-song@kernel.org

61ce0ea7

Merge branch 'enable-bpf-programs-to-declare-arrays-of-kptr-bpf_rb_root-and-bpf_list_head' · 49df0019

Alexei Starovoitov authored Jun 03, 2024

Kui-Feng Lee says:

====================
Enable BPF programs to declare arrays of kptr, bpf_rb_root, and bpf_list_head.

Some types, such as type kptr, bpf_rb_root, and bpf_list_head, are
treated in a special way. Previously, these types could not be the
type of a field in a struct type that is used as the type of a global
variable. They could not be the type of a field in a struct type that
is used as the type of a field in the value type of a map either. They
could not even be the type of array elements. This means that they can
only be the type of global variables or of direct fields in the value
type of a map.

The patch set aims to enable the use of these specific types in arrays
and struct fields, providing flexibility. It examines the types of
global variables or the value types of maps, such as arrays and struct
types, recursively to identify these special types and generate field
information for them.

For example,

  ...
  struct task_struct __kptr *ptr[3];
  ...

it will create 3 instances of "struct btf_field" in the "btf_record" of
the data section.

 [...,
  btf_field(offset=0x100, type=BPF_KPTR_REF),
  btf_field(offset=0x108, type=BPF_KPTR_REF),
  btf_field(offset=0x110, type=BPF_KPTR_REF),
  ...
 ]

It creates a record of each of three elements. These three records are
almost identical except their offsets.

Another example is

  ...
  struct A {
    ...
    struct task_struct __kptr *task;
    struct bpf_rb_root root;
    ...
  }

  struct A foo[2];

it will create 4 records.

 [...,
  btf_field(offset=0x7100, type=BPF_KPTR_REF),
  btf_field(offset=0x7108, type=BPF_RB_ROOT:),
  btf_field(offset=0x7200, type=BPF_KPTR_REF),
  btf_field(offset=0x7208, type=BPF_RB_ROOT:),
  ...
 ]

Assuming that the size of an element/struct A is 0x100 and "foo"
starts at 0x7000, it includes two kptr records at 0x7100 and 0x7200,
and two rbtree root records at 0x7108 and 0x7208.

All these field information will be flatten, for struct types, and
repeated, for arrays.
---
Changes from v6:

 - Return BPF_KPTR_REF from btf_get_field_type() only if var_type is a
   struct type.

   - Pass btf and type to btf_get_field_type().

Changes from v5:

 - Ensure field->offset values of kptrs are advanced correctly from
   one nested struct/or array to another.

Changes from v4:

 - Return -E2BIG for i == MAX_RESOLVE_DEPTH.

Changes from v3:

 - Refactor the common code of btf_find_struct_field() and
   btf_find_datasec_var().

 - Limit the number of levels looking into a struct types.

Changes from v2:

 - Support fields in nested struct type.

 - Remove nelems and duplicate field information with offset
   adjustments for arrays.

Changes from v1:

 - Move the check of element alignment out of btf_field_cmp() to
   btf_record_find().

 - Change the order of the previous patch 4 "bpf:
   check_map_kptr_access() compute the offset from the reg state" as
   the patch 7 now.

 - Reject BPF_RB_NODE and BPF_LIST_NODE with nelems > 1.

 - Rephrase the commit log of the patch "bpf: check_map_access() with
   the knowledge of arrays" to clarify the alignment on elements.

v6: https://lore.kernel.org/all/20240520204018.884515-1-thinker.li@gmail.com/
v5: https://lore.kernel.org/all/20240510011312.1488046-1-thinker.li@gmail.com/
v4: https://lore.kernel.org/all/20240508063218.2806447-1-thinker.li@gmail.com/
v3: https://lore.kernel.org/all/20240501204729.484085-1-thinker.li@gmail.com/
v2: https://lore.kernel.org/all/20240412210814.603377-1-thinker.li@gmail.com/
v1: https://lore.kernel.org/bpf/20240410004150.2917641-1-thinker.li@gmail.com/

Kui-Feng Lee (9):
  bpf: Remove unnecessary checks on the offset of btf_field.
  bpf: Remove unnecessary call to btf_field_type_size().
  bpf: refactor btf_find_struct_field() and btf_find_datasec_var().
  bpf: create repeated fields for arrays.
  bpf: look into the types of the fields of a struct type recursively.
  bpf: limit the number of levels of a nested struct type.
  selftests/bpf: Test kptr arrays and kptrs in nested struct fields.
  selftests/bpf: Test global bpf_rb_root arrays and fields in nested
    struct types.
  selftests/bpf: Test global bpf_list_head arrays.

 kernel/bpf/btf.c                              | 310 ++++++++++++------
 kernel/bpf/verifier.c                         |   4 +-
 .../selftests/bpf/prog_tests/cpumask.c        |   5 +
 .../selftests/bpf/prog_tests/linked_list.c    |  12 +
 .../testing/selftests/bpf/prog_tests/rbtree.c |  47 +++
 .../selftests/bpf/progs/cpumask_success.c     | 171 ++++++++++
 .../testing/selftests/bpf/progs/linked_list.c |  42 +++
 tools/testing/selftests/bpf/progs/rbtree.c    |  77 +++++
 8 files changed, 558 insertions(+), 110 deletions(-)
====================

Link: https://lore.kernel.org/r/20240523174202.461236-1-thinker.li@gmail.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

49df0019

selftests/bpf: Test global bpf_list_head arrays. · 43d50ffb

Kui-Feng Lee authored May 23, 2024

Make sure global arrays of bpf_list_heads and fields of bpf_list_heads in
nested struct types work correctly.
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
Link: https://lore.kernel.org/r/20240523174202.461236-10-thinker.li@gmail.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

43d50ffb

selftests/bpf: Test global bpf_rb_root arrays and fields in nested struct types. · d55c765a

Kui-Feng Lee authored May 23, 2024

Make sure global arrays of bpf_rb_root and fields of bpf_rb_root in nested
struct types work correctly.
Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
Link: https://lore.kernel.org/r/20240523174202.461236-9-thinker.li@gmail.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

d55c765a

selftests/bpf: Test kptr arrays and kptrs in nested struct fields. · c4c6c3b7

Kui-Feng Lee authored May 23, 2024

Make sure that BPF programs can declare global kptr arrays and kptr fields
in struct types that is the type of a global variable or the type of a
nested descendant field in a global variable.

An array with only one element is special case, that it treats the element
like a non-array kptr field. Nested arrays are also tested to ensure they
are handled properly.
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
Link: https://lore.kernel.org/r/20240523174202.461236-8-thinker.li@gmail.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

c4c6c3b7

bpf: limit the number of levels of a nested struct type. · f19caf57

Kui-Feng Lee authored May 23, 2024

Limit the number of levels looking into struct types to avoid running out
of stack space.
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
Link: https://lore.kernel.org/r/20240523174202.461236-7-thinker.li@gmail.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

f19caf57

bpf: look into the types of the fields of a struct type recursively. · 64e8ee81

Kui-Feng Lee authored May 23, 2024

The verifier has field information for specific special types, such as
kptr, rbtree root, and list head. These types are handled
differently. However, we did not previously examine the types of fields of
a struct type variable. Field information records were not generated for
the kptrs, rbtree roots, and linked_list heads that are not located at the
outermost struct type of a variable.

For example,

  struct A {
    struct task_struct __kptr * task;
  };

  struct B {
    struct A mem_a;
  }

  struct B var_b;

It did not examine "struct A" so as not to generate field information for
the kptr in "struct A" for "var_b".

This patch enables BPF programs to define fields of these special types in
a struct type other than the direct type of a variable or in a struct type
that is the type of a field in the value type of a map.
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
Link: https://lore.kernel.org/r/20240523174202.461236-6-thinker.li@gmail.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

64e8ee81

bpf: create repeated fields for arrays. · 994796c0

Kui-Feng Lee authored May 23, 2024

The verifier uses field information for certain special types, such as
kptr, rbtree root, and list head. These types are treated
differently. However, we did not previously support these types in
arrays. This update examines arrays and duplicates field information the
same number of times as the length of the array if the element type is one
of the special types.
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
Link: https://lore.kernel.org/r/20240523174202.461236-5-thinker.li@gmail.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

994796c0

bpf: refactor btf_find_struct_field() and btf_find_datasec_var(). · a7db0d4f

Kui-Feng Lee authored May 23, 2024

Move common code of the two functions to btf_find_field_one().
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
Link: https://lore.kernel.org/r/20240523174202.461236-4-thinker.li@gmail.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

a7db0d4f

bpf: Remove unnecessary call to btf_field_type_size(). · 482f7133

Kui-Feng Lee authored May 23, 2024

field->size has been initialized by bpf_parse_fields() with the value
returned by btf_field_type_size(). Use it instead of calling
btf_field_type_size() again.
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
Link: https://lore.kernel.org/r/20240523174202.461236-3-thinker.li@gmail.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

482f7133

bpf: Remove unnecessary checks on the offset of btf_field. · c95a3be4

Kui-Feng Lee authored May 23, 2024

reg_find_field_offset() always return a btf_field with a matching offset
value. Checking the offset of the returned btf_field is unnecessary.
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Kui-Feng Lee <thinker.li@gmail.com>
Link: https://lore.kernel.org/r/20240523174202.461236-2-thinker.li@gmail.comSigned-off-by: Alexei Starovoitov <ast@kernel.org>

c95a3be4

03 Jun, 2024 14 commits

selftests/bpf: Drop duplicate bpf_map_lookup_elem in test_sockmap · 49784c79

Geliang Tang authored May 23, 2024

bpf_map_lookup_elem is invoked in bpf_prog3() already, no need to invoke
it again. This patch drops it.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Jakub Sitnicki <jakub@cloudflare.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/ea8458462b876ee445173e3effb535fd126137ed.1716446893.git.tanggeliang@kylinos.cn

49784c79

selftests/bpf: Check length of recv in test_sockmap · de1b5ea7

Geliang Tang authored May 23, 2024

The value of recv in msg_loop may be negative, like EWOULDBLOCK, so it's
necessary to check if it is positive before accumulating it to bytes_recvd.

Fixes: 16962b24 ("bpf: sockmap, add selftests")
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Jakub Sitnicki <jakub@cloudflare.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/5172563f7c7b2a2e953cef02e89fc34664a7b190.1716446893.git.tanggeliang@kylinos.cn

de1b5ea7

selftests/bpf: Fix size of map_fd in test_sockmap · dcb681b6

Geliang Tang authored May 23, 2024

The array size of map_fd[] is 9, not 8. This patch changes it as a more
general form: ARRAY_SIZE(map_fd).
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Jakub Sitnicki <jakub@cloudflare.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/0972529ee01ebf8a8fd2b310bdec90831c94be77.1716446893.git.tanggeliang@kylinos.cn

dcb681b6

selftests/bpf: Drop prog_fd array in test_sockmap · 467a0c79

Geliang Tang authored May 23, 2024

The program fds can be got by using bpf_program__fd(progs[]), then
prog_fd becomes useless. This patch drops it.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Jakub Sitnicki <jakub@cloudflare.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/9a6335e4d8dbab23c0d8906074457ceddd61e74b.1716446893.git.tanggeliang@kylinos.cn

467a0c79

selftests/bpf: Replace tx_prog_fd with tx_prog in test_sockmap · 24bb90a4

Geliang Tang authored May 23, 2024

bpf_program__attach_sockmap() needs to take a parameter of type bpf_program
instead of an fd, so tx_prog_fd becomes useless. This patch uses a pointer
tx_prog to point to an item in progs[] array.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Jakub Sitnicki <jakub@cloudflare.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/23b37f932c547dd1ebfe154bbc0b0e957be21ee6.1716446893.git.tanggeliang@kylinos.cn

24bb90a4

selftests/bpf: Use bpf_link attachments in test_sockmap · 3f32a115

Geliang Tang authored May 23, 2024

Switch attachments to bpf_link using bpf_program__attach_sockmap() instead
of bpf_prog_attach().

This patch adds a new array progs[] to replace prog_fd[] array, set in
populate_progs() for each program in bpf object.

And another new array links[] to save the attached bpf_link. It is
initalized as NULL in populate_progs, set as the return valuses of
bpf_program__attach_sockmap(), and detached by bpf_link__detach().
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Jakub Sitnicki <jakub@cloudflare.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/32cf8376a810e2e9c719f8e4cfb97132ed2d1f9c.1716446893.git.tanggeliang@kylinos.cn

3f32a115

selftests/bpf: Drop duplicate definition of i in test_sockmap · a9f0ea17

Geliang Tang authored May 23, 2024

There's already a definition of i in run_options() at the beginning, no
need to define a new one in "if (tx_prog_fd > 0)" block.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Jakub Sitnicki <jakub@cloudflare.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/8d690682330a59361562bca75d6903253d16f312.1716446893.git.tanggeliang@kylinos.cn

a9f0ea17

selftests/bpf: Fix tx_prog_fd values in test_sockmap · d95ba15b

Geliang Tang authored May 23, 2024

The values of tx_prog_fd in run_options() should not be 0, so set it as -1
in else branch, and test it using "if (tx_prog_fd > 0)" condition, not
"if (tx_prog_fd)" or "if (tx_prog_fd >= 0)".
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Jakub Sitnicki <jakub@cloudflare.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/08b20ffc544324d40939efeae93800772a91a58e.1716446893.git.tanggeliang@kylinos.cn

d95ba15b

test_bpf: Add missing MODULE_DESCRIPTION() · ec1249d3

Jeff Johnson authored May 31, 2024

make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in lib/test_bpf.o

Add the missing invocation of the MODULE_DESCRIPTION() macro.
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240531-md-lib-test_bpf-v1-1-868e4bd2f9ed@quicinc.com

ec1249d3

bpftool: Fix typo in MAX_NUM_METRICS macro name · ce5249b9

Swan Beaujard authored Jun 03, 2024

Correct typo in bpftool profiler and change all instances of 'MATRICS' to
'METRICS' in the profiler.bpf.c file.
Signed-off-by: Swan Beaujard <beaujardswan@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Quentin Monnet <qmo@kernel.org>
Link: https://lore.kernel.org/bpf/20240602225812.81171-1-beaujardswan@gmail.com

ce5249b9

selftests/bpf: Remove unused struct 'libcap' · a450d36b

Dr. David Alan Gilbert authored Jun 03, 2024

'libcap' is unused since commit b1c2768a ("bpf: selftests: Remove libcap
usage from test_verifier"). Remove it.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240602234112.225107-4-linux@treblig.org

a450d36b

selftests/bpf: Remove unused 'key_t' structs · 3f67639d

Dr. David Alan Gilbert authored Jun 03, 2024

'key_t' is unused in a couple of files since the original commit 60dd49ea
("selftests/bpf: Add test for bpf array map iterators"). Remove it.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240602234112.225107-3-linux@treblig.org

3f67639d

selftests/bpf: Remove unused struct 'scale_test_def' · dfa7c9ff

Dr. David Alan Gilbert authored Jun 03, 2024

'scale_test_def' is unused since commit 3762a39c ("selftests/bpf: Split out
bpf_verif_scale selftests into multiple tests"). Remove it.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240602234112.225107-2-linux@treblig.org

dfa7c9ff

riscv, bpf: Introduce shift add helper with Zba optimization · 96a27ee7

Xiao Wang authored May 24, 2024

Zba extension is very useful for generating addresses that index into array
of basic data types. This patch introduces sh2add and sh3add helpers for
RV32 and RV64 respectively, to accelerate addressing for array of unsigned
long data.
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Björn Töpel <bjorn@kernel.org>
Link: https://lore.kernel.org/bpf/20240524075543.4050464-3-xiao.w.wang@intel.com

96a27ee7

01 Jun, 2024 1 commit

libbpf: keep FD_CLOEXEC flag when dup()'ing FD · 531876c8

Andrii Nakryiko authored May 29, 2024

Make sure to preserve and/or enforce FD_CLOEXEC flag on duped FDs.
Use dup3() with O_CLOEXEC flag for that.

Without this fix libbpf effectively clears FD_CLOEXEC flag on each of BPF
map/prog FD, which is definitely not the right or expected behavior.
Reported-by: Lennart Poettering <lennart@poettering.net>
Fixes: bc308d01 ("libbpf: call dup2() syscall directly")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20240529223239.504241-1-andrii@kernel.orgSigned-off-by: Alexei Starovoitov <ast@kernel.org>

531876c8

30 May, 2024 1 commit

Merge branch 'Notify user space when a struct_ops object is detached/unregistered' · 3f8fde31

Martin KaFai Lau authored May 30, 2024

Kui-Feng Lee says:

====================
The subsystems managing struct_ops objects may need to detach a
struct_ops object due to errors or other reasons. It would be useful
to notify user space programs so that error recovery or logging can be
carried out.

This patch set enables the detach feature for struct_ops links and
send an event to epoll when a link is detached.  Subsystems could call
link->ops->detach() to detach a link and notify user space programs
through epoll.

The signatures of callback functions in "struct bpf_struct_ops" have
been changed as well to pass an extra link argument to
subsystems. Subsystems could detach the links received from reg() and
update() callbacks if there is. This also provides a way that
subsystems can distinguish registrations for an object that has been
registered multiple times for several links.

However, bpf struct_ops maps without BPF_F_LINK have no any link.
Subsystems will receive NULL link pointer for this case.
---
Changes from v6:

 - Fix the missing header at patch 5.

 - Move RCU_INIT_POINTER() back to its original position.

Changes from v5:

 - Change the commit title of the patch for bpftool.

Changes from v4:

 - Change error code for bpf_struct_ops_map_link_update()

 - Always return 0 for bpf_struct_ops_map_link_detach()

 - Hold update_mutex in bpf_struct_ops_link_create()

 - Add a separated instance of file_operations for links supporting
    poll.

 - Fix bpftool for bpf_link_fops_poll.

Changes from v3:

 - Add a comment to explain why holding update_mutex is not necessary
    in bpf_struct_ops_link_create()

 - Use rcu_access_pointer() in bpf_struct_ops_map_link_poll().

Changes from v2:

 - Rephrased commit logs and comments.

 - Addressed some mistakes from patch splitting.

 - Replace mutex with spinlock in bpf_testmod.c to address lockdep
    Splat and simplify the implementation.

 - Fix an argument passing to rcu_dereference_protected().

Changes from v1:

 - Pass a link to reg, unreg, and update callbacks.

 - Provide a function to detach a link from underlying subsystems.

 - Add a kfunc to mimic detachments from subsystems, and provide a
    flexible way to control when to do detachments.

 - Add two tests to detach a link from the subsystem after the refcount
    of the link drops to zero.

v6: https://lore.kernel.org/bpf/20240524223036.318800-1-thinker.li@gmail.com/
v5: https://lore.kernel.org/all/20240523230848.2022072-1-thinker.li@gmail.com/
v4: https://lore.kernel.org/all/20240521225121.770930-1-thinker.li@gmail.com/
v3: https://lore.kernel.org/all/20240510002942.1253354-1-thinker.li@gmail.com/
v2: https://lore.kernel.org/all/20240507055600.2382627-1-thinker.li@gmail.com/
v1: https://lore.kernel.org/all/20240429213609.487820-1-thinker.li@gmail.com/
====================
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>

3f8fde31