- 24 Aug, 2021 9 commits
-
-
Alexei Starovoitov authored
Li Zhijian says: ==================== Fix a few issues reported by 0Day/LKP during runing selftests/bpf. Changelog: V2: - folded previous similar standalone patch to [1/5], and add acked tag from Song Liu - add acked tag to [2/5], [3/5] from Song Liu - [4/5]: move test_bpftool.py to TEST_PROGS_EXTENDED, files in TEST_GEN_PROGS_EXTENDED are generated by make. Otherwise, it will break out-of-tree install: 'make O=/kselftest-build SKIP_TARGETS= V=1 -C tools/testing/selftests install INSTALL_PATH=/kselftest-install' - [5/5]: new patch ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Li Zhijian authored
This would happend when we run the tests after install kselftests root@lkp-skl-d01 ~# /kselftests/run_kselftest.sh -t bpf:test_doc_build.sh TAP version 13 1..1 # selftests: bpf: test_doc_build.sh perl: warning: Setting locale failed. perl: warning: Please check that your locale settings: LANGUAGE = (unset), LC_ALL = (unset), LC_ADDRESS = "en_US.UTF-8", LC_NAME = "en_US.UTF-8", LC_MONETARY = "en_US.UTF-8", LC_PAPER = "en_US.UTF-8", LC_IDENTIFICATION = "en_US.UTF-8", LC_TELEPHONE = "en_US.UTF-8", LC_MEASUREMENT = "en_US.UTF-8", LC_TIME = "en_US.UTF-8", LC_NUMERIC = "en_US.UTF-8", LANG = "en_US.UTF-8" are supported and installed on your system. perl: warning: Falling back to the standard locale ("C"). # skip: bpftool files not found! # ok 1 selftests: bpf: test_doc_build.sh # SKIP Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210820025549.28325-1-lizhijian@cn.fujitsu.com
-
Li Zhijian authored
test_bpftool.sh relies on bpftool and test_bpftool.py. 'make install' will install bpftool to INSTALL_PATH/bpf/bpftool, and export it to PATH so that it can be used after installing. Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210820015556.23276-5-lizhijian@cn.fujitsu.com
-
Li Zhijian authored
For 'make run_tests': selftests will build bpftool into tools/testing/selftests/bpf/tools/sbin/bpftool by default. ================== root@lkp-skl-d01 /opt/rootfs/v5.14-rc4# make -C tools/testing/selftests/bpf run_tests make: Entering directory '/opt/rootfs/v5.14-rc4/tools/testing/selftests/bpf' MKDIR include MKDIR libbpf MKDIR bpftool [...] GEN /opt/rootfs/v5.14-rc4/tools/testing/selftests/bpf/tools/build/bpftool/profiler.skel.h CC /opt/rootfs/v5.14-rc4/tools/testing/selftests/bpf/tools/build/bpftool/prog.o GEN /opt/rootfs/v5.14-rc4/tools/testing/selftests/bpf/tools/build/bpftool/pid_iter.skel.h CC /opt/rootfs/v5.14-rc4/tools/testing/selftests/bpf/tools/build/bpftool/pids.o LINK /opt/rootfs/v5.14-rc4/tools/testing/selftests/bpf/tools/build/bpftool/bpftool INSTALL bpftool GEN vmlinux.h [...] # test_feature_dev_json (test_bpftool.TestBpftool) ... ERROR # test_feature_kernel (test_bpftool.TestBpftool) ... ERROR # test_feature_kernel_full (test_bpftool.TestBpftool) ... ERROR # test_feature_kernel_full_vs_not_full (test_bpftool.TestBpftool) ... ERROR # test_feature_macros (test_bpftool.TestBpftool) ... Error: bug: failed to retrieve CAP_BPF status: Invalid argument # ERROR # # ====================================================================== # ERROR: test_feature_dev_json (test_bpftool.TestBpftool) # ---------------------------------------------------------------------- # Traceback (most recent call last): # File "/opt/rootfs/v5.14-rc4/tools/testing/selftests/bpf/test_bpftool.py", line 57, in wrapper # return f(*args, iface, **kwargs) # File "/opt/rootfs/v5.14-rc4/tools/testing/selftests/bpf/test_bpftool.py", line 82, in test_feature_dev_json # res = bpftool_json(["feature", "probe", "dev", iface]) # File "/opt/rootfs/v5.14-rc4/tools/testing/selftests/bpf/test_bpftool.py", line 42, in bpftool_json # res = _bpftool(args) # File "/opt/rootfs/v5.14-rc4/tools/testing/selftests/bpf/test_bpftool.py", line 34, in _bpftool # return subprocess.check_output(_args) # File "/usr/lib/python3.7/subprocess.py", line 395, in check_output # **kwargs).stdout # File "/usr/lib/python3.7/subprocess.py", line 487, in run # output=stdout, stderr=stderr) # subprocess.CalledProcessError: Command '['bpftool', '-j', 'feature', 'probe', 'dev', 'dummy0']' returned non-zero exit status 255. # ================== Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20210820015556.23276-4-lizhijian@cn.fujitsu.com
-
Li Zhijian authored
Previously, it fails as below: ------------- root@lkp-skl-d01 /opt/rootfs/v5.14-rc4/tools/testing/selftests/bpf# ./test_doc_build.sh ++ realpath --relative-to=/opt/rootfs/v5.14-rc4/tools/testing/selftests/bpf ./test_doc_build.sh + SCRIPT_REL_PATH=test_doc_build.sh ++ dirname test_doc_build.sh + SCRIPT_REL_DIR=. ++ realpath /opt/rootfs/v5.14-rc4/tools/testing/selftests/bpf/./../../../../ + KDIR_ROOT_DIR=/opt/rootfs/v5.14-rc4 + cd /opt/rootfs/v5.14-rc4 + for tgt in docs docs-clean + make -s -C /opt/rootfs/v5.14-rc4/. docs make: *** No rule to make target 'docs'. Stop. + for tgt in docs docs-clean + make -s -C /opt/rootfs/v5.14-rc4/. docs-clean make: *** No rule to make target 'docs-clean'. Stop. ----------- Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20210820015556.23276-3-lizhijian@cn.fujitsu.com
-
Li Zhijian authored
0Day robot observed that it's easily timeout on a heavy load host. ------------------- # selftests: bpf: test_maps # Fork 1024 tasks to 'test_update_delete' # Fork 1024 tasks to 'test_update_delete' # Fork 100 tasks to 'test_hashmap' # Fork 100 tasks to 'test_hashmap_percpu' # Fork 100 tasks to 'test_hashmap_sizes' # Fork 100 tasks to 'test_hashmap_walk' # Fork 100 tasks to 'test_arraymap' # Fork 100 tasks to 'test_arraymap_percpu' # Failed sockmap unexpected timeout not ok 3 selftests: bpf: test_maps # exit=1 # selftests: bpf: test_lru_map # nr_cpus:8 ------------------- Since this test will be scheduled by 0Day to a random host that could have only a few cpus(2-8), enlarge the timeout to avoid a false NG report. In practice, i tried to pin it to only one cpu by 'taskset 0x01 ./test_maps', and knew 10S is likely enough, but i still perfer to a larger value 30. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20210820015556.23276-2-lizhijian@cn.fujitsu.com
-
Yucong Sun authored
This patch extends wait time in timer_mim. As observed in slow CI environment, it is possible to have interrupt/preemption long enough to cause the test to fail, almost 1 failure in 5 runs. Signed-off-by: Yucong Sun <fallentree@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210823213629.3519641-1-fallentree@fb.com
-
Alexei Starovoitov authored
Dave Marchevsky says: ==================== The cgroup_bpf struct has a few arrays (effective, progs, and flags) of size MAX_BPF_ATTACH_TYPE. These are meant to separate progs by their attach type, currently represented by the bpf_attach_type enum. There are some bpf_attach_type values which are not valid attach types for cgroup bpf programs. Programs with these attach types will never be handled by cgroup_bpf_{attach,detach} and thus will never be held in cgroup_bpf structs. Even if such programs did make it into their reserved slot in those arrays, they would never be executed. Accordingly we can migrate to a new internal cgroup_bpf-specific enum for these arrays, saving some bytes per cgroup and making it more obvious which BPF programs belong there. netns_bpf_attach_type is an existing example of this pattern, let's do similar for cgroup_bpf. v1->v2: Address Daniel's comments * Reverse xmas tree ordering for def changes * Helper macro to reduce to_cgroup_bpf_attach_type boilerplate * checkpatch.pl complains: "ERROR: Macros with complex values should be enclosed in parentheses". Found some existing macros (do 'git grep "define case"') which get same complaint. Think it's fine to keep as-is since it's immediately undef'd. * Remove CG_BPF_ prefix from cgroup_bpf_attach_type * Although I agree that the prefix is redundant, the de-prefixed names feel a bit too 'general' given the internal use of the enum. e.g. when someone sees CGROUP_INET6_BIND it's not obvious that it should only be used in certain ways internally. * Don't feel strongly about this, just my thoughts as a noob to the internals. * Rebase onto latest bpf-next/master * No significant conflicts, some small boilerplate adjustments needed to catch up to Andrii's "bpf: Refactor BPF_PROG_RUN_ARRAY family of macros into functions" change ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Dave Marchevsky authored
Add an enum (cgroup_bpf_attach_type) containing only valid cgroup_bpf attach types and a function to map bpf_attach_type values to the new enum. Inspired by netns_bpf_attach_type. Then, migrate cgroup_bpf to use cgroup_bpf_attach_type wherever possible. Functionality is unchanged as attach_type_to_prog_type switches in bpf/syscall.c were preventing non-cgroup programs from making use of the invalid cgroup_bpf array slots. As a result struct cgroup_bpf uses 504 fewer bytes relative to when its arrays were sized using MAX_BPF_ATTACH_TYPE. bpf_cgroup_storage is notably not migrated as struct bpf_cgroup_storage_key is part of uapi and contains a bpf_attach_type member which is not meant to be opaque. Similarly, bpf_cgroup_link continues to report its bpf_attach_type member to userspace via fdinfo and bpf_link_info. To ease disambiguation, bpf_attach_type variables are renamed from 'type' to 'atype' when changed to cgroup_bpf_attach_type. Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210819092420.1984861-2-davemarchevsky@fb.com
-
- 23 Aug, 2021 1 commit
-
-
Jiang Wang authored
Commit 94531cfc ("af_unix: Add unix_stream_proto for sockmap") introduced a bug for af_unix SEQPACKET type. In unix_shutdown, the unhash function will call prot->unhash(), which is NULL for SEQPACKET. And kernel will panic. On ARM32, it will show following messages: (it likely affects x86 too). Fix the bug by checking the prot->unhash is NULL or not first. Kernel log: <--- cut here --- Unable to handle kernel NULL pointer dereference at virtual address 00000000 pgd = 2fba1ffb *pgd=00000000 Internal error: Oops: 80000005 [#1] PREEMPT SMP THUMB2 Modules linked in: CPU: 1 PID: 1999 Comm: falkon Tainted: G W 5.14.0-rc5-01175-g94531cfc-dirty #9240 Hardware name: NVIDIA Tegra SoC (Flattened Device Tree) PC is at 0x0 LR is at unix_shutdown+0x81/0x1a8 pc : [<00000000>] lr : [<c08f3311>] psr: 600f0013 sp : e45aff70 ip : e463a3c0 fp : beb54f04 r10: 00000125 r9 : e45ae000 r8 : c4a56664 r7 : 00000001 r6 : c4a56464 r5 : 00000001 r4 : c4a56400 r3 : 00000000 r2 : c5a6b180 r1 : 00000000 r0 : c4a56400 Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none Control: 50c5387d Table: 05aa804a DAC: 00000051 Register r0 information: slab PING start c4a56400 pointer offset 0 Register r1 information: NULL pointer Register r2 information: slab task_struct start c5a6b180 pointer offset 0 Register r3 information: NULL pointer Register r4 information: slab PING start c4a56400 pointer offset 0 Register r5 information: non-paged memory Register r6 information: slab PING start c4a56400 pointer offset 100 Register r7 information: non-paged memory Register r8 information: slab PING start c4a56400 pointer offset 612 Register r9 information: non-slab/vmalloc memory Register r10 information: non-paged memory Register r11 information: non-paged memory Register r12 information: slab filp start e463a3c0 pointer offset 0 Process falkon (pid: 1999, stack limit = 0x9ec48895) Stack: (0xe45aff70 to 0xe45b0000) ff60: e45ae000 c5f26a00 00000000 00000125 ff80: c0100264 c07f7fa3 beb54f04 fffffff7 00000001 e6f3fc0e b5e5e9ec beb54ec4 ffa0: b5da0ccc c010024b b5e5e9ec beb54ec4 0000000f 00000000 00000000 beb54ebc ffc0: b5e5e9ec beb54ec4 b5da0ccc 00000125 beb54f58 00785238 beb5529c beb54f04 ffe0: b5da1e24 beb54eac b301385c b62b6ee8 600f0030 0000000f 00000000 00000000 [<c08f3311>] (unix_shutdown) from [<c07f7fa3>] (__sys_shutdown+0x2f/0x50) [<c07f7fa3>] (__sys_shutdown) from [<c010024b>] (__sys_trace_return+0x1/0x16) Exception stack(0xe45affa8 to 0xe45afff0) Fixes: 94531cfc ("af_unix: Add unix_stream_proto for sockmap") Reported-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Jiang Wang <jiang.wang@bytedance.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Dmitry Osipenko <digetx@gmail.com> Acked-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Link: https://lore.kernel.org/bpf/20210821180738.1151155-1-jiang.wang@bytedance.com
-
- 19 Aug, 2021 7 commits
-
-
Prankur Gupta authored
Adding selftests for the newly added functionality to call bpf_setsockopt() and bpf_getsockopt() from setsockopt BPF programs. Test Details: 1. BPF Program Checks for changes in IPV6_TCLASS(SOL_IPV6) via setsockopt If the cca for the socket is not cubic do nothing If the newly set value for IPV6_TCLASS is 45 (0x2d) (as per our use-case) then change the cc from cubic to reno 2. User Space Program Creates an AF_INET6 socket and set the cca for that to be "cubic" Attach the program and set the IPV6_TCLASS to 0x2d using setsockopt Verify the cca for the socket changed to reno Signed-off-by: Prankur Gupta <prankgup@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20210817224221.3257826-3-prankgup@fb.com
-
Prankur Gupta authored
Add logic to call bpf_setsockopt() and bpf_getsockopt() from setsockopt BPF programs. An example use case is when the user sets the IPV6_TCLASS socket option, we would also like to change the tcp-cc for that socket. We don't have any use case for calling bpf_setsockopt() from supposedly read- only sys_getsockopt(), so it is made available to BPF_CGROUP_SETSOCKOPT only at this point. Signed-off-by: Prankur Gupta <prankgup@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20210817224221.3257826-2-prankgup@fb.com
-
Stanislav Fomichev authored
Same as previous patch but for the keys. memdup_bpfptr is renamed to kvmemdup_bpfptr (and converted to kvmalloc). Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20210818235216.1159202-2-sdf@google.com
-
Stanislav Fomichev authored
Use kvmalloc/kvfree for temporary value when manipulating a map via syscall. kmalloc might not be sufficient for percpu maps where the value is big (and further multiplied by hundreds of CPUs). Can be reproduced with netcnt test on qemu with "-smp 255". Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20210818235216.1159202-1-sdf@google.com
-
Yucong Sun authored
This patch adds a 1ms delay to reduce flakyness of the test. Signed-off-by: Yucong Sun <fallentree@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210819163609.2583758-1-fallentree@fb.com
-
Yonghong Song authored
Andrii reported that libbpf CI hit the following oops when running selftest send_signal: [ 1243.160719] BUG: kernel NULL pointer dereference, address: 0000000000000030 [ 1243.161066] #PF: supervisor read access in kernel mode [ 1243.161066] #PF: error_code(0x0000) - not-present page [ 1243.161066] PGD 0 P4D 0 [ 1243.161066] Oops: 0000 [#1] PREEMPT SMP NOPTI [ 1243.161066] CPU: 1 PID: 882 Comm: new_name Tainted: G O 5.14.0-rc5 #1 [ 1243.161066] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 [ 1243.161066] RIP: 0010:bpf_overflow_handler+0x9a/0x1e0 [ 1243.161066] Code: 5a 84 c0 0f 84 06 01 00 00 be 66 02 00 00 48 c7 c7 6d 96 07 82 48 8b ab 18 05 00 00 e8 df 55 eb ff 66 90 48 8d 75 48 48 89 e7 <ff> 55 30 41 89 c4 e8 fb c1 f0 ff 84 c0 0f 84 94 00 00 00 e8 6e 0f [ 1243.161066] RSP: 0018:ffffc900000c0d80 EFLAGS: 00000046 [ 1243.161066] RAX: 0000000000000002 RBX: ffff8881002e0dd0 RCX: 00000000b4b47cf8 [ 1243.161066] RDX: ffffffff811dcb06 RSI: 0000000000000048 RDI: ffffc900000c0d80 [ 1243.161066] RBP: 0000000000000000 R08: 0000000000000000 R09: 1a9d56bb00000000 [ 1243.161066] R10: 0000000000000001 R11: 0000000000080000 R12: 0000000000000000 [ 1243.161066] R13: ffffc900000c0e00 R14: ffffc900001c3c68 R15: 0000000000000082 [ 1243.161066] FS: 00007fc0be2d3380(0000) GS:ffff88813bd00000(0000) knlGS:0000000000000000 [ 1243.161066] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1243.161066] CR2: 0000000000000030 CR3: 0000000104f8e000 CR4: 00000000000006e0 [ 1243.161066] Call Trace: [ 1243.161066] <IRQ> [ 1243.161066] __perf_event_overflow+0x4f/0xf0 [ 1243.161066] perf_swevent_hrtimer+0x116/0x130 [ 1243.161066] ? __lock_acquire+0x378/0x2730 [ 1243.161066] ? __lock_acquire+0x372/0x2730 [ 1243.161066] ? lock_is_held_type+0xd5/0x130 [ 1243.161066] ? find_held_lock+0x2b/0x80 [ 1243.161066] ? lock_is_held_type+0xd5/0x130 [ 1243.161066] ? perf_event_groups_first+0x80/0x80 [ 1243.161066] ? perf_event_groups_first+0x80/0x80 [ 1243.161066] __hrtimer_run_queues+0x1a3/0x460 [ 1243.161066] hrtimer_interrupt+0x110/0x220 [ 1243.161066] __sysvec_apic_timer_interrupt+0x8a/0x260 [ 1243.161066] sysvec_apic_timer_interrupt+0x89/0xc0 [ 1243.161066] </IRQ> [ 1243.161066] asm_sysvec_apic_timer_interrupt+0x12/0x20 [ 1243.161066] RIP: 0010:finish_task_switch+0xaf/0x250 [ 1243.161066] Code: 31 f6 68 90 2a 09 81 49 8d 7c 24 18 e8 aa d6 03 00 4c 89 e7 e8 12 ff ff ff 4c 89 e7 e8 ca 9c 80 00 e8 35 af 0d 00 fb 4d 85 f6 <58> 74 1d 65 48 8b 04 25 c0 6d 01 00 4c 3b b0 a0 04 00 00 74 37 f0 [ 1243.161066] RSP: 0018:ffffc900001c3d18 EFLAGS: 00000282 [ 1243.161066] RAX: 000000000000031f RBX: ffff888104cf4980 RCX: 0000000000000000 [ 1243.161066] RDX: 0000000000000000 RSI: ffffffff82095460 RDI: ffffffff820adc4e [ 1243.161066] RBP: ffffc900001c3d58 R08: 0000000000000001 R09: 0000000000000001 [ 1243.161066] R10: 0000000000000001 R11: 0000000000080000 R12: ffff88813bd2bc80 [ 1243.161066] R13: ffff8881002e8000 R14: ffff88810022ad80 R15: 0000000000000000 [ 1243.161066] ? finish_task_switch+0xab/0x250 [ 1243.161066] ? finish_task_switch+0x70/0x250 [ 1243.161066] __schedule+0x36b/0xbb0 [ 1243.161066] ? _raw_spin_unlock_irqrestore+0x2d/0x50 [ 1243.161066] ? lockdep_hardirqs_on+0x79/0x100 [ 1243.161066] schedule+0x43/0xe0 [ 1243.161066] pipe_read+0x30b/0x450 [ 1243.161066] ? wait_woken+0x80/0x80 [ 1243.161066] new_sync_read+0x164/0x170 [ 1243.161066] vfs_read+0x122/0x1b0 [ 1243.161066] ksys_read+0x93/0xd0 [ 1243.161066] do_syscall_64+0x35/0x80 [ 1243.161066] entry_SYSCALL_64_after_hwframe+0x44/0xae The oops can also be reproduced with the following steps: ./vmtest.sh -s # at qemu shell cd /root/bpf && while true; do ./test_progs -t send_signal Further analysis showed that the failure is introduced with commit b89fbfbb ("bpf: Implement minimal BPF perf link"). With the above commit, the following scenario becomes possible: cpu1 cpu2 hrtimer_interrupt -> bpf_overflow_handler (due to closing link_fd) bpf_perf_link_release -> perf_event_free_bpf_prog -> perf_event_free_bpf_handler -> WRITE_ONCE(event->overflow_handler, event->orig_overflow_handler) event->prog = NULL bpf_prog_run(event->prog, &ctx) In the above case, the event->prog is NULL for bpf_prog_run, hence causing oops. To fix the issue, check whether event->prog is NULL or not. If it is, do not call bpf_prog_run. This seems working as the above reproducible step runs more than one hour and I didn't see any failures. Fixes: b89fbfbb ("bpf: Implement minimal BPF perf link") Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210819155209.1927994-1-yhs@fb.com
-
Daniel Borkmann authored
The BPF interpreter as well as x86-64 BPF JIT were both in line by allowing up to 33 tail calls (however odd that number may be!). Recently, this was changed for the interpreter to reduce it down to 32 with the assumption that this should have been the actual limit "which is in line with the behavior of the x86 JITs" according to b61a28cf ("bpf: Fix off-by-one in tail call count limiting"). Paul recently reported: I'm a bit surprised by this because I had previously tested the tail call limit of several JIT compilers and found it to be 33 (i.e., allowing chains of up to 34 programs). I've just extended a test program I had to validate this again on the x86-64 JIT, and found a limit of 33 tail calls again [1]. Also note we had previously changed the RISC-V and MIPS JITs to allow up to 33 tail calls [2, 3], for consistency with other JITs and with the interpreter. We had decided to increase these two to 33 rather than decrease the other JITs to 32 for backward compatibility, though that probably doesn't matter much as I'd expect few people to actually use 33 tail calls. [1] https://github.com/pchaigno/tail-call-bench/commit/ae7887482985b4b1745c9b2ef7ff9ae506c82886 [2] 96bc4432 ("bpf, riscv: Limit to 33 tail calls") [3] e49e6f6d ("bpf, mips: Limit to 33 tail calls") Therefore, revert b61a28cf to re-align interpreter to limit a maximum of 33 tail calls. While it is unlikely to hit the limit for the vast majority, programs in the wild could one way or another depend on this, so lets rather be a bit more conservative, and lets align the small remainder of JITs to 33. If needed in future, this limit could be slightly increased, but not decreased. Fixes: b61a28cf ("bpf: Fix off-by-one in tail call count limiting") Reported-by: Paul Chaignon <paul@cilium.io> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Johan Almbladh <johan.almbladh@anyfinetworks.com> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/CAO5pjwTWrC0_dzTbTHFPSqDwA56aVH+4KFGVqdq8=ASs0MqZGQ@mail.gmail.com
-
- 18 Aug, 2021 3 commits
-
-
Xu Liu authored
Add test to use get_netns_cookie() from BPF_PROG_TYPE_SOCK_OPS. Signed-off-by: Xu Liu <liuxu623@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20210818105820.91894-3-liuxu623@gmail.com
-
Xu Liu authored
We'd like to be able to identify netns from sockops hooks to accelerate local process communication form different netns. Signed-off-by: Xu Liu <liuxu623@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20210818105820.91894-2-liuxu623@gmail.com
-
Grant Seltzer authored
This patch renames a documentation libbpf.rst to index.rst. In order for readthedocs.org to pick this file up and properly build the documentation site. It also changes the title type of the ABI subsection in the naming convention doc. This is so that readthedocs.org doesn't treat this section as a separate document. Signed-off-by: Grant Seltzer <grantseltzer@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210818151313.49992-1-grantseltzer@gmail.com
-
- 17 Aug, 2021 19 commits
-
-
Colin Ian King authored
The variable allow is being initialized with a value that is never read, it is being updated later on. The assignment is redundant and can be removed. Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210817170842.495440-1-colin.king@canonical.com
-
Andrii Nakryiko authored
Yonghong Song says: ==================== The bpf selftest send_signal() is flaky for its subtests trying to send signals in softirq/nmi context. To reduce flakiness, the signal-targetted process priority is boosted, which should minimize preemption of that process and improve the possibility that the underlying task in softirq/nmi context is the bpf_send_signal() wanted task. Patch #1 did a refactoring to use ASSERT_* instead of old CHECK macros. Patch #2 did actual change of boosting priority. Changelog: v1 -> v2: remove skip logic where the underlying task in interrupt context is not the intended one. ==================== Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
-
Yonghong Song authored
libbpf CI has reported send_signal test is flaky although I am not able to reproduce it in my local environment. But I am able to reproduce with on-demand libbpf CI ([1]). Through code analysis, the following is possible reason. The failed subtest runs bpf program in softirq environment. Since bpf_send_signal() only sends to a fork of "test_progs" process. If the underlying current task is not "test_progs", bpf_send_signal() will not be triggered and the subtest will fail. To reduce the chances where the underlying process is not the intended one, this patch boosted scheduling priority to -20 (highest allowed by setpriority() call). And I did 10 runs with on-demand libbpf CI with this patch and I didn't observe any failures. [1] https://github.com/libbpf/libbpf/actions/workflows/ondemand.ymlSigned-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210817190923.3186725-1-yhs@fb.com
-
Yonghong Song authored
Replace CHECK in send_signal.c with ASSERT_* macros as ASSERT_* macros are generally preferred. There is no funcitonality change. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210817190918.3186400-1-yhs@fb.com
-
Andrii Nakryiko authored
Yucong Sun says: ==================== This short series adds two new switches to test_progs, "-a" and "-d", adding support for both exact string matching, as well as '*' wildcards. It also cleans up the output to make it possible to generate allowlist/denylist using common cli tools. ==================== Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
-
Yucong Sun authored
This patch adds '-a' and '-d' arguments supporting both exact string match as well as using '*' wildcard in test/subtests selection. '-a' and '-t' can co-exists, same as '-d' and '-b', in which case they just add to the list of allowed or denied test selectors. Caveat: Same as the current substring matching mechanism, test and subtest selector applies independently, 'a*/b*' will execute all tests matching "a*", and with subtest name matching "b*", but tests matching "a*" that has no subtests will also be executed. Signed-off-by: Yucong Sun <fallentree@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210817044732.3263066-5-fallentree@fb.com
-
Yucong Sun authored
This patch add test name in subtest status message line, making it possible to grep ':OK' in the output to generate a list of passed test+subtest names, which can be processed to generate argument list to be used with "-a", "-d" exact string matching. Example: #1/1 align/mov:OK .. #1/12 align/pointer variable subtraction:OK #1 align:OK Signed-off-by: Yucong Sun <fallentree@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210817044732.3263066-4-fallentree@fb.com
-
Yucong Sun authored
In skip_account(), test->skip_cnt is set to 0 at the end, this makes next print statement never display SKIP status for the subtest. This patch moves the accounting logic after the print statement, fixing the issue. This patch also added SKIP status display for normal tests. Signed-off-by: Yucong Sun <fallentree@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210817044732.3263066-3-fallentree@fb.com
-
Yucong Sun authored
When using "-l", test_progs often is executed as non-root user, load_bpf_testmod() will fail and output errors. This patch skips loading bpf testmod when "-l" is specified, making output cleaner. Signed-off-by: Yucong Sun <fallentree@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210817044732.3263066-2-fallentree@fb.com
-
Yucong Sun authored
Using a fixed delay of 1 microsecond has proven flaky in slow CPU environment, e.g. Github Actions CI system. This patch adds exponential backoff with a cap of 50ms to reduce the flakiness of the test. Initial delay is chosen at random in the range [0ms, 5ms). Signed-off-by: Yucong Sun <fallentree@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210817045713.3307985-1-fallentree@fb.com
-
Yucong Sun authored
Using a fixed delay of 1 microsecond has proven flaky in slow CPU environment, e.g. Github Actions CI system. This patch adds exponential backoff with a cap of 50ms to reduce the flakiness of the test. Initial delay is chosen at random in the range [0ms, 5ms). Signed-off-by: Yucong Sun <fallentree@fb.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20210816175250.296110-1-fallentree@fb.com
-
Andrii Nakryiko authored
Jiang Wang says: ==================== This patch series add support for unix stream type for sockmap. Sockmap already supports TCP, UDP, unix dgram types. The unix stream support is similar to unix dgram. Also add selftests for unix stream type in sockmap tests. ==================== Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
-
Jiang Wang authored
Add two new test cases in sockmap tests, where unix stream is redirected to tcp and vice versa. Signed-off-by: Jiang Wang <jiang.wang@bytedance.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Cong Wang <cong.wang@bytedance.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/bpf/20210816190327.2739291-6-jiang.wang@bytedance.com
-
Jiang Wang authored
This is to prepare for adding new unix stream tests. Mostly renames, also pass the socket types as an argument. Signed-off-by: Jiang Wang <jiang.wang@bytedance.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Cong Wang <cong.wang@bytedance.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/bpf/20210816190327.2739291-5-jiang.wang@bytedance.com
-
Jiang Wang authored
Add two tests for unix stream to unix stream redirection in sockmap tests. Signed-off-by: Jiang Wang <jiang.wang@bytedance.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Cong Wang <cong.wang@bytedance.com> Acked-by: Jakub Sitnicki <jakub@cloudflare.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20210816190327.2739291-4-jiang.wang@bytedance.com
-
Jiang Wang authored
Previously, sockmap for AF_UNIX protocol only supports dgram type. This patch add unix stream type support, which is similar to unix_dgram_proto. To support sockmap, dgram and stream cannot share the same unix_proto anymore, because they have different implementations, such as unhash for stream type (which will remove closed or disconnected sockets from the map), so rename unix_proto to unix_dgram_proto and add a new unix_stream_proto. Also implement stream related sockmap functions. And add dgram key words to those dgram specific functions. Signed-off-by: Jiang Wang <jiang.wang@bytedance.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Cong Wang <cong.wang@bytedance.com> Acked-by: Jakub Sitnicki <jakub@cloudflare.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20210816190327.2739291-3-jiang.wang@bytedance.com
-
Jiang Wang authored
To support sockmap for af_unix stream type, implement read_sock, which is similar to the read_sock for unix dgram sockets. Signed-off-by: Jiang Wang <jiang.wang@bytedance.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Cong Wang <cong.wang@bytedance.com> Acked-by: Jakub Sitnicki <jakub@cloudflare.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20210816190327.2739291-2-jiang.wang@bytedance.com
-
Hengqi Chen authored
Add test for btf__load_vmlinux_btf/btf__load_module_btf APIs. The test loads bpf_testmod module BTF and check existence of a symbol which is known to exist. Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210815081035.205879-1-hengqi.chen@gmail.com
-
grantseltzer authored
This removes the libbpf_api.rst file from the kernel documentation. The intention for this file was to pull documentation from comments above API functions in libbpf. However, due to limitations of the kernel documentation system, this API documentation could not be versioned, which is counterintuative to how users expect to use it. There is also currently no doc comments, making this a blank page. Once the kernel comment documentation is actually contributed, it will still exist in the kernel repository, just in the code itself. A seperate site is being spun up to generate documentaiton from those comments in a way in which it can be versioned properly. This also reconfigures the bpf documentation index page to make it easier to sync to the previously mentioned documentaiton site. Signed-off-by: Grant Seltzer <grantseltzer@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210810020508.280639-1-grantseltzer@gmail.com
-
- 16 Aug, 2021 1 commit
-
-
Daniel Borkmann authored
Andrii Nakryiko says: ==================== This patch set implements an ability for users to specify custom black box u64 value for each BPF program attachment, bpf_cookie, which is available to BPF program at runtime. This is a feature that's critically missing for cases when some sort of generic processing needs to be done by the common BPF program logic (or even exactly the same BPF program) across multiple BPF hooks (e.g., many uniformly handled kprobes) and it's important to be able to distinguish between each BPF hook at runtime (e.g., for additional configuration lookup). The choice of restricting this to a fixed-size 8-byte u64 value is an explicit design decision. Making this configurable by users adds unnecessary complexity (extra memory allocations, extra complications on the verifier side to validate accesses to variable-sized data area) while not really opening up new possibilities. If user's use case requires storing more data per attachment, it's possible to use either global array, or ARRAY/HASHMAP BPF maps, where bpf_cookie would be used as an index into respective storage, populated by user-space code before creating BPF link. This gives user all the flexibility and control while keeping BPF verifier and BPF helper API simple. Currently, similar functionality can only be achieved through: - code-generation and BPF program cloning, which is very complicated and unmaintainable; - on-the-fly C code generation and further runtime compilation, which is what BCC uses and allows to do pretty simply. The big downside is a very heavy-weight Clang/LLVM dependency and inefficient memory usage (due to many BPF program clones and the compilation process itself); - in some cases (kprobes and sometimes uprobes) it's possible to do function IP lookup to get function-specific configuration. This doesn't work for all the cases (e.g., when attaching uprobes to shared libraries) and has higher runtime overhead and additional programming complexity due to BPF_MAP_TYPE_HASHMAP lookups. Up until recently, before bpf_get_func_ip() BPF helper was added, it was also very complicated and unstable (API-wise) to get traced function's IP from fentry/fexit and kretprobe. With libbpf and BPF CO-RE, runtime compilation is not an option, so to be able to build generic tracing tooling simply and efficiently, ability to provide additional bpf_cookie value for each *attachment* (as opposed to each BPF program) is extremely important. Two immediate users of this functionality are going to be libbpf-based USDT library (currently in development) and retsnoop ([0]), but I'm sure more applications will come once users get this feature in their kernels. To achieve above described, all perf_event-based BPF hooks are made available through a new BPF_LINK_TYPE_PERF_EVENT BPF link, which allows to use common LINK_CREATE command for program attachments and generally brings perf_event-based attachments into a common BPF link infrastructure. With that, LINK_CREATE gets ability to pass throught bpf_cookie value during link creation (BPF program attachment) time. bpf_get_attach_cookie() BPF helper is added to allow fetching this value at runtime from BPF program side. BPF cookie is stored either on struct perf_event itself and fetched from the BPF program context, or is passed through ambient BPF run context, added in c7603cfa ("bpf: Add ambient BPF runtime context stored in current"). On the libbpf side of things, BPF perf link is utilized whenever is supported by the kernel instead of using PERF_EVENT_IOC_SET_BPF ioctl on perf_event FD. All the tracing attach APIs are extended with OPTS and bpf_cookie is passed through corresponding opts structs. Last part of the patch set adds few self-tests utilizing new APIs. There are also a few refactorings along the way to make things cleaner and easier to work with, both in kernel (BPF_PROG_RUN and BPF_PROG_RUN_ARRAY), and throughout libbpf and selftests. Follow-up patches will extend bpf_cookie to fentry/fexit programs. While adding uprobe_opts, also extend it with ref_ctr_offset for specifying USDT semaphore (reference counter) offset. Update attach_probe selftests to validate its functionality. This is another feature (along with bpf_cookie) required for implementing libbpf-based USDT solution. [0] https://github.com/anakryiko/retsnoop v4->v5: - rebase on latest bpf-next to resolve merge conflict; - add ref_ctr_offset to uprobe_opts and corresponding selftest; v3->v4: - get rid of BPF_PROG_RUN macro in favor of bpf_prog_run() (Daniel); - move #ifdef CONFIG_BPF_SYSCALL check into bpf_set_run_ctx (Daniel); v2->v3: - user_ctx -> bpf_cookie, bpf_get_user_ctx -> bpf_get_attach_cookie (Peter); - fix BPF_LINK_TYPE_PERF_EVENT value fix (Jiri); - use bpf_prog_run() from bpf_prog_run_pin_on_cpu() (Yonghong); v1->v2: - fix build failures on non-x86 arches by gating on CONFIG_PERF_EVENTS. ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-